Cover: Cleaning Data for Effective Data Science: Doing the Other 80% of the Work With Python, R, and Command-Line Tools (in English)
Type: Physical Book
Year: 2021
Language: English
Pages: 498
Format: Paperback
Dimensions: 23.5 x 19.1 x 2.5 cm
Weight: 0.85 kg
ISBN-13: 9781801071291

Cleaning Data for Effective Data Science: Doing the Other 80% of the Work With Python, R, and Command-Line Tools (in English)

David Mertz (Author) · Packt Publishing · Paperback

Price: $37.04
List price: $43.99
You save: $6.95 (16% discount)
  • Condition: New
It will be shipped from our warehouse between Monday, May 20 and Tuesday, May 21.
You will receive it anywhere in the United States within 1 to 3 business days after shipment.

Synopsis "Cleaning Data for Effective Data Science: Doing the Other 80% of the Work With Python, R, and Command-Line Tools (in English)"

A comprehensive guide for data scientists to master effective data cleaning tools and techniques

Key Features:
  • Think about your data intelligently and ask the right questions
  • Master data cleaning techniques using hands-on examples belonging to diverse domains
  • Work with detailed, commented, well-tested code samples in Python and R

Book Description:

In data science, data analysis, or machine learning, most of the effort needed to achieve your actual purpose lies in cleaning your data. Using Python, R, and command-line tools, you will learn the essential cleaning steps performed in every production data science or data analysis pipeline. This book not only teaches you data preparation but also what questions you should ask of your data.

The book dives into the practical application of tools and techniques needed for data ingestion, anomaly detection, value imputation, and feature engineering. It also offers long-form exercises at the end of each chapter to practice the skills acquired.

You will begin by looking at data ingestion of a range of data formats. Moving on, you will impute missing values, detect unreliable data and statistical anomalies, and generate synthetic features that are necessary for successful data analysis and visualization goals.

By the end of this book, you will have acquired a firm understanding of the data cleaning process necessary to perform real-world data science and machine learning tasks.

What You Will Learn:
  • Ingest and work with common tabular, hierarchical, and other data formats
  • Apply useful rules and heuristics for assessing data quality and detecting bias
  • Identify and handle unreliable data and outliers in their many forms
  • Impute sensible values into missing data and use sampling to fix imbalances
  • Generate synthetic features that help to draw out patterns in your data
  • Prepare data competently and correctly for analytic and machine learning tasks

Who this book is for:

This book is designed to benefit software developers, data scientists, aspiring data scientists, and students who are interested in data analysis or scientific computing. Basic familiarity with statistics, general concepts in machine learning, knowledge of a programming language (Python or R), and some exposure to data science are helpful. The text will also be helpful to intermediate and advanced data scientists who want to improve their rigor in data hygiene and wish for a refresher on data preparation issues.

Frequently Asked Questions about the Book

All books in our catalog are original.
The book is written in English.
The binding of this edition is Paperback.
