Python — Analyze Your Own Netflix Data
Build Your Own Dataset Using Netflix Data & Python Pandas

Are you looking for datasets to start a new data science project? While it's not always easy to find relevant data, why not start with your own? Welcome to Python Data Science December #1.
Netflix lets you download your complete viewing history, and you can build cool data science projects on top of it. I’ll guide you through:
- getting the raw data from Netflix
- cleaning & transforming the data
- visualizing the data.
We will make use of Python Pandas and Matplotlib. At the end of this story, you will have finished a simple but full-blown data science project that you can add to your portfolio.
If you do not use Netflix, I will share my sampled & anonymized watching history with you. You can find it at the end of the story, in the chapter Summary & Resources.
📓 Getting The Raw Data
Netflix allows you to request your own data for download.
- Navigate to Netflix’s get my info page & request your data
- After sending the request, you get an email that you need to confirm
- After a while, you get a second email saying the download is ready (for me, it took less than a day)
- You will get a .zip file with the following folder structure
- Unzip it and navigate to CONTENT_INTERACTION to open the file ViewingActivity.csv which contains the full list of your viewing history.
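If you prefer to do the unzip step in Python, here is a minimal sketch using the standard library. The function name `extract_viewing_activity` and the default target folder are my own assumptions for illustration; the helper searches the archive for ViewingActivity.csv by file name, so it does not depend on the exact folder layout of your export.

```python
import zipfile
from pathlib import Path


def extract_viewing_activity(zip_path, target_dir="netflix_data"):
    """Extract ViewingActivity.csv from a Netflix export and return its path.

    Hypothetical helper: the name and the default target folder are
    illustrative choices, not something Netflix prescribes.
    """
    with zipfile.ZipFile(zip_path) as archive:
        # Find the file by name so the surrounding folder structure does not matter.
        matches = [
            name for name in archive.namelist()
            if name.endswith("ViewingActivity.csv")
        ]
        if not matches:
            raise FileNotFoundError(
                "ViewingActivity.csv not found in " + str(zip_path)
            )
        # extract() returns the path of the file it wrote to disk.
        extracted = archive.extract(matches[0], path=target_dir)
    return Path(extracted)
```

Call it with the path to your downloaded .zip file, for example `csv_path = extract_viewing_activity("my-netflix-export.zip")` (the file name here is a placeholder), and pass the returned path to Pandas later on.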
Alright, time to get our hands dirty.
🔍 Examine the data
I am using a Jupyter notebook to examine the data, but a regular Python script works just as well. To get a first impression of the data, let’s install & import Pandas and read the file into a Pandas DataFrame.
import pandas as pd
# Read the Netflix viewing history into a DataFrame
df = pd.read_csv('ViewingActivity.csv')
First, let us understand how many rows & columns we have using…
df.shape  # returns a tuple: (number of rows, number of columns)
> (9273, 10)
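Beyond the shape, it helps to glance at the column names, data types and missing values before any cleaning. The following is a minimal sketch; `first_look` is a hypothetical helper name of mine, not part of Pandas, and it makes no assumption about which columns your export contains.

```python
import pandas as pd


def first_look(df: pd.DataFrame, n: int = 5) -> dict:
    """Collect a few basic facts about a DataFrame for a quick sanity check."""
    n_rows, n_cols = df.shape  # shape is a (rows, columns) tuple
    return {
        "rows": n_rows,
        "n_columns": n_cols,
        "columns": list(df.columns),                 # names from the CSV header
        "dtypes": df.dtypes.astype(str).to_dict(),   # Pandas type per column
        "missing": df.isna().sum().to_dict(),        # missing values per column
        "preview": df.head(n),                       # first n rows
    }
```

In a notebook, `first_look(df)["preview"]` shows the first rows, and the `missing` entry tells you which columns will need attention during cleaning.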