Introduction to Data Science for Social and Policy Research
Collecting and Organizing Data with R and Python
This comprehensive guide provides a step-by-step approach to data collection, cleaning, formatting, and storage, using Python and R.
Jose Manuel Magallanes Reyes (Author)
9781107540255, Cambridge University Press
Paperback / softback, published 21 September 2017
314 pages
22.9 x 15.2 x 1.8 cm, 0.46 kg
'This book will be of great assistance to public policy and management scholars desiring a rigorous introduction to Data Science, particularly with regard to the intricacies of data management. The step-by-step approach will help teachers and students, in both undergraduate and graduate programs, become familiar with essential programming skills, particularly with respect to analyzing Big Data and making it available through Open Government initiatives. The author also provides a very helpful service in using both R and Python to show how to accomplish the same task, which allows readers to decide which of these languages will best serve their needs.' Craig W. Thomas, Evans School of Public Policy and Governance, University of Washington
Real-world data sets are messy and complicated. Written for students in social science and public management, this authoritative but approachable guide describes all the tools needed to collect data and prepare it for analysis. Offering detailed, step-by-step instructions, it covers collection of many different types of data including web files, APIs, and maps; data cleaning; data formatting; the integration of different sources into a comprehensive data set; and storage using third-party tools to facilitate access and shareability, from Google Docs to GitHub. Assuming no prior knowledge of R and Python, the author introduces programming concepts gradually, using real data sets that provide the reader with practical, functional experience.
Part I. Get Started
1. Introduction
2. Setting up the tools
3. Basics of R and Python
Part II. Collect and Clean
4. Collecting data
5. Cleaning data
Part III. Format and Storage
6. Formatting the 'clean' data
7. Integrating and storing
Subject Areas: Probability & statistics [PBT], Social research & statistics [JHBC], Research methods: general [GPS]
