
Whole game
Our goal in this part of the book is to give you a rapid overview of the main tools of data science: importing, tidying, transforming, and visualizing data, as shown in Figure 1. We want to show you the “whole game” of data science giving you just enough of all the major pieces so that you can tackle real, if simple, datasets. The later parts of the book will hit each of these topics in more depth, increasing the range of data science challenges that you can tackle.
在本书这一部分,我们的目标是快速概览数据科学的主要工具:导入 (importing)、整洁化 (tidying)、转换 (transforming) 和 可视化 (visualizing),如 Figure 1 所示。 我们希望向你展示数据科学的 “完整全局 (whole game)”,让你对所有关键环节都有足够认识,以便能够处理真正但相对简单的数据集。 后续章节将更深入地探讨这些主题,逐步提升你可应对的数据科学挑战的广度和深度。

Four chapters focus on the tools of data science:
以下四个章节集中介绍数据科学的核心工具:
Visualization is a great place to start with R programming, because the payoff is so clear: you get to make elegant and informative plots that help you understand data. In 1 Data visualization you’ll dive into visualization, learning the basic structure of a ggplot2 plot, and powerful techniques for turning data into plots.
以可视化入门 R 编程是绝佳选择,因为它的回报十分直观:你可以绘制优雅而信息丰富的图形来理解数据。 在 1 Data visualization 中,你将深入学习可视化,掌握 ggplot2 图形的基本结构,以及将数据转化为图形的强大技巧。Visualization alone is typically not enough, so in 3 Data transformation, you’ll learn the key verbs that allow you to select important variables, filter out key observations, create new variables, and compute summaries.
单靠可视化通常不足以完成分析任务,因此在 3 Data transformation 中,你将学习一组关键 “动词 (verbs)”:选择重要变量、筛选关键观测、创建新变量并生成汇总结果。In 5 Data tidying, you’ll learn about tidy data, a consistent way of storing your data that makes transformation, visualization, and modelling easier. You’ll learn the underlying principles, and how to get your data into a tidy form.
在 5 Data tidying 中,你将了解 “整洁数据 (tidy data)” 的概念,这是一种一致的存储方式,可让后续转换、可视化与建模工作更轻松。 你将学习其核心原则,以及如何将数据整理成整洁格式。Before you can transform and visualize your data, you need to first get your data into R. In 7 Data import you’ll learn the basics of getting
.csv
files into R.
在转换和可视化数据之前,你需要先将数据导入 R。 7 Data import 将教你把.csv
文件读入 R 的基础方法。
Nestled among these chapters are four other chapters that focus on your R workflow. In 2 Workflow: basics, 4 Workflow: code style, and 6 Workflow: scripts and projects you’ll learn good workflow practices for writing and organizing your R code. These will set you up for success in the long run, as they’ll give you the tools to stay organized when you tackle real projects. Finally, 8 Workflow: getting help will teach you how to get help and keep learning.
除了上述内容,本部分还穿插了四个专注于 R 工作流的章节。 在 2 Workflow: basics、4 Workflow: code style 以及 6 Workflow: scripts and projects 中,你将学习撰写并组织 R 代码的良好工作流实践。 这些技能将在长期项目中助你保持条理与高效。 最后,8 Workflow: getting help 将指导你如何寻求帮助并持续学习。