Overview
This Kaggle notebook cleans the Netflix dataset and explores it through exploratory data analysis (EDA). It uses the Python libraries Pandas, NumPy, Matplotlib, and Seaborn to prepare the data and to surface insights through visualization.
Dataset
The dataset used in this notebook is sourced from Netflix and contains information about the movies and TV shows available on the platform. Its features include title, type (movie or TV show), director, cast, country, release year, rating, duration, and the genres each title is listed in.
Contents
- Data Loading: Loading the Netflix dataset into the notebook.
- Data Cleaning: Preprocessing the dataset to handle missing values, inconsistencies, and formatting issues.
- Exploratory Data Analysis (EDA): Conducting a thorough analysis of the dataset to uncover patterns, trends, and relationships between variables.
- Data Visualization: Using Matplotlib and Seaborn to create visualizations that make the data easier to understand.
- Insights: Summarizing key findings and insights derived from the analysis.
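The loading, cleaning, and EDA steps listed above could look roughly like the sketch below. It is not the notebook's actual code: the column names (`title`, `type`, `director`, `country`, `release_year`, `duration`, `listed_in`), the sample rows, and the `clean` helper are all assumptions made for illustration, and a small inline DataFrame stands in for the real CSV file.

```python
import numpy as np
import pandas as pd

# Stand-in for the real data. In the notebook this would come from
# pd.read_csv(...) on the dataset file; the rows and column names
# below are hypothetical.
raw = pd.DataFrame({
    "title": ["A", "B", "B", "C"],
    "type": ["Movie", "TV Show", "TV Show", "Movie"],
    "director": ["Jane Doe", None, None, "  John Roe "],
    "country": ["United States", np.nan, np.nan, "India"],
    "release_year": [2019, 2020, 2020, 2021],
    "duration": ["90 min", "2 Seasons", "2 Seasons", "125 min"],
    "listed_in": ["Dramas, Comedies", "Kids' TV", "Kids' TV", "Dramas"],
})


def clean(df):
    """Hypothetical cleaning helper: dedupe, fill gaps, fix formatting."""
    out = df.drop_duplicates().copy()
    # Missing values: label unknown text fields instead of dropping rows.
    for col in ["director", "country"]:
        out[col] = out[col].fillna("Unknown").str.strip()
    # Formatting: split "90 min" / "2 Seasons" into a number and a unit.
    out["duration_value"] = (
        out["duration"].str.extract(r"(\d+)", expand=False).astype(int)
    )
    out["duration_unit"] = np.where(out["type"] == "Movie", "min", "season")
    return out.reset_index(drop=True)


clean_df = clean(raw)

# Simple EDA: how often each genre appears, and how many titles per type.
genre_counts = clean_df["listed_in"].str.split(", ").explode().value_counts()
titles_per_type = clean_df.groupby("type")["title"].count()
```

Filling missing text fields with a placeholder keeps every title in the analysis; dropping those rows instead is an equally valid choice when the missing column is central to the question being asked.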
Libraries Used
- Pandas: For data manipulation and analysis.
- NumPy: For numerical operations and array handling.
- Matplotlib: For creating basic plots and visualizations.
- Seaborn: For statistical data visualization and more advanced plots.
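As a rough illustration of how the four libraries divide the work, the sketch below builds a tiny DataFrame with Pandas, computes a summary with NumPy, and draws one Seaborn plot and one Matplotlib plot. The sample rows, column names, and plot titles are assumptions for demonstration and are not taken from the notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical sample with two of the dataset's features.
df = pd.DataFrame({
    "type": ["Movie", "TV Show", "Movie", "Movie"],
    "release_year": [2019, 2020, 2021, 2021],
})

# Pandas: tabular summaries. NumPy: numerical operations on the values.
type_counts = df["type"].value_counts()
year_median = float(np.median(df["release_year"]))

fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# Seaborn: statistical plot of category frequencies.
sns.countplot(data=df, x="type", ax=axes[0])
axes[0].set_title("Titles by type")

# Matplotlib: basic histogram with one bin per release year.
axes[1].hist(df["release_year"], bins=np.arange(2018.5, 2022.5, 1.0))
axes[1].set_title("Titles by release year")

fig.tight_layout()
```

In a notebook the figure renders inline at the end of the cell; in a script, `plt.show()` or `fig.savefig(...)` would be needed to see it.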
Author
Omar Ayman (omaraymanata) - https://www.kaggle.com/code/omaraymanatia/netflix-data-cleaning-and-eda