Data Wrangling and Cleaning

KMUTNB

About This Course

Data wrangling and data cleaning are key steps in the data science pipeline: they turn raw, messy data into a clean, structured form suitable for analysis. The two terms are often used interchangeably, but they differ in scope. Data wrangling is the broader process of collecting, organizing, and manipulating data to make it more accessible and usable; it includes merging data from different sources, reshaping data structures, and handling missing values. Data cleaning, a subset of wrangling, focuses on identifying and correcting errors, inconsistencies, and inaccuracies in a dataset, for example by removing duplicates, standardizing formats, and addressing outliers. Together they improve data quality and reliability, which supports more accurate insights and better-informed decisions. As data grows in volume and complexity, these skills become increasingly important for data scientists and analysts across industries.
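As a rough illustration of three of the cleaning tasks named above (standardizing formats, removing duplicates, and handling missing values), here is a minimal sketch using only the Python standard library. The records, the field names (id, city, temp_c), and the choice of median imputation are all assumptions made up for this example; they are not taken from the course materials. Outlier handling is not shown.

```python
from statistics import median

# Hypothetical raw records, invented purely for illustration.
raw_rows = [
    {"id": 1, "city": " Bangkok ", "temp_c": "31.5"},
    {"id": 2, "city": "bangkok", "temp_c": ""},      # missing value
    {"id": 2, "city": "bangkok", "temp_c": ""},      # exact duplicate
    {"id": 3, "city": "Chiang Mai", "temp_c": "29.0"},
    {"id": 4, "city": "CHIANG MAI", "temp_c": "30.0"},
]


def clean(rows):
    """Return de-duplicated rows with standardized cities and imputed temps."""
    seen = set()
    unique = []
    for row in rows:
        # Standardize formats: trim whitespace and unify letter case.
        city = row["city"].strip().title()
        # Parse numbers; treat empty strings as missing (None).
        temp = float(row["temp_c"]) if row["temp_c"].strip() else None
        key = (row["id"], city, temp)
        if key in seen:
            continue  # Remove duplicates: skip rows already seen.
        seen.add(key)
        unique.append({"id": row["id"], "city": city, "temp_c": temp})

    # Handle missing values: fill with the median of the observed values.
    # (Median imputation is one assumed strategy among several.)
    observed = [r["temp_c"] for r in unique if r["temp_c"] is not None]
    fill = median(observed) if observed else None
    for r in unique:
        if r["temp_c"] is None:
            r["temp_c"] = fill
    return unique


cleaned = clean(raw_rows)
for r in cleaned:
    print(r)
```

In practice, data analysis libraries such as pandas provide built-in equivalents of these steps; the plain-Python version is used here only to keep each step visible.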

Requirements

Prerequisites for this course are basic programming skills in Python or R, an understanding of data structures, and familiarity with database concepts such as SQL.

Frequently Asked Questions

What web browser should I use?

The Open edX platform works best with current versions of Chrome, Edge, Firefox, or Safari.

See our list of supported browsers for the most up-to-date information.

KMUTNB: ITD-DataWrangling Data Wrangling and Cleaning