Solving 100 Python Pandas Problems! (from easy to very difficult)

Keith Galli

In this tutorial, you'll get hands-on experience with the Python pandas library and build the data manipulation and analysis skills that matter for data science. You'll learn how to create, modify, and analyze DataFrames, handle missing data (NaNs), clean messy data, and generate some visualizations. By tackling a variety of problems, from basic data handling to advanced DataFrame techniques, you'll build a solid foundation in managing and interpreting real-world data sets using pandas.
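
To give a feel for the kind of thing covered, here is a minimal sketch of creating a DataFrame, finding missing values, filtering rows, and aggregating. The sample data and variable names are made up for illustration and are not taken from the video or the repo.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (an assumption, not the repo's dataset).
data = {
    "animal": ["cat", "cat", "snake", "dog"],
    "age": [2.5, 3.0, np.nan, 5.0],
    "visits": [1, 3, 2, 3],
}
df = pd.DataFrame(data, index=["a", "b", "c", "d"])

# Rows where age is missing (NaN).
missing_age = df[df["age"].isna()]

# Rows with more than two visits.
frequent = df[df["visits"] > 2]

# Mean age per animal; NaN values are skipped by default,
# so a group with only NaN ages yields NaN.
mean_age = df.groupby("animal")["age"].mean()

print(missing_age)
print(frequent)
print(mean_age)
```

Boolean masks like `df["age"].isna()` and `groupby(...).mean()` are standard pandas calls; the video may solve the equivalent problems differently.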

Repo we're working off of (credit to Alex Riley, who put the repo together):
https://github.com/ajcr/100pandaspu...

My code solutions (use repo above for blank starting template):
https://github.com/KeithGalli/100pan...

Hope that you enjoy this video. If you do, make sure to like it and subscribe to not miss future videos like this!

Video Timeline!
0:00 Intro & Setup
2:14 Problems (1-3) Initial pandas setup
4:42 Problems (4-10) DataFrame operations
4:52 4) Create a dataframe from dictionary
5:24 5) Display dataframe summary
5:41 6) First 3 rows of the dataframe
6:02 7) Select ‘animal’ and ‘age’ columns
7:42 8) Data in specific rows and columns
9:06 9) Rows with visits greater than 3
9:57 10) Rows with NaN in age
10:56 11) Cats younger than 3 years
11:35 12) Age between 2 and 4
12:45 13) Change age in row ‘f’
15:56 14) Sum of all visits
16:41 15) Average age by animal
20:21 16) Modify and revert rows
24:06 17) Count by animal type
25:28 Quick review
26:17 18) Sort by age and visits
28:07 19) Convert 'priority' to boolean
29:42 20) Replace 'snake' with 'python'
30:53 21) Mean age by animal and visits
33:49 Advanced DataFrame techniques
33:57 22) Filter duplicate integers
43:18 23) Subtract row mean
45:42 24) Column with smallest sum
50:39 25) Count unique rows
53:17 26) Column with third NaN
1:10:27 Solution review for 26
1:17:13 27) Sum of top three values
1:24:01 28) Sum by column condition
1:40:11 Recent problem review
1:42:53 29) Count differences since last zero
1:56:19 30) Locate largest values
2:08:38 31) Replace negatives with mean
2:17:43 32) Rolling mean over groups
2:23:10 Series and DatetimeIndex
2:23:12 33) DatetimeIndex for 2015
2:27:56 34) Sum values on Wednesdays
2:45:04 35) Monthly mean values
2:46:16 36) Best value in four-month groups
2:50:26 37) DatetimeIndex of third Thursdays
2:59:03 Cleaning Data
2:59:40 38) Fill missing FlightNumber
3:02:45 39) Split column by delimiter
3:06:47 40) Fix city name capitalization
3:08:30 41) Reattach columns
3:13:11 42) Fix airline name punctuation
3:17:45 43) Expand RecentDelays into columns
3:27:31 MultiIndexes in Pandas
3:27:34 44) Construct a MultiIndex
3:30:37 Solution review
3:32:44 45) Lexicographically sorted check
3:32:58 46) Select specific MultiIndex labels
3:34:23 47) Slice Series with MultiIndex
3:35:24 48) Sum by first level
3:37:47 49) Alternative sum method
3:40:08 Additional solution insights
3:41:22 50) Swap MultiIndex levels
3:45:27 Minesweeper problems
3:45:44 51) Generate coordinate grid
4:00:28 52) Add 'safe' or 'mine' column
4:03:04 53) Count adjacent mines
4:27:33 Review solution to 53
4:33:02 Skipped problems 54 & 55
4:33:11 Plotting
4:33:12 56) Scatter plot with black x markers
4:41:26 57) Plot four data types
4:52:50 58) Overlay multiple graphs
5:03:11 59) Hourly stock data summary
5:14:12 60) Candlestick plot
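
As a taste of the Series and DatetimeIndex section, here is a rough sketch of building a business-day index for 2015, summing the values that fall on Wednesdays, and taking a mean per month. The constant values and variable names are assumptions chosen to keep the example easy to check; this is not necessarily how the video solves these problems.

```python
import pandas as pd

# Business days (Mon-Fri) in 2015; public holidays are NOT excluded by freq="B".
dti = pd.date_range(start="2015-01-01", end="2015-12-31", freq="B")

# Hypothetical values: every entry is 1.0 so the sums are easy to verify.
s = pd.Series(1.0, index=dti)

# weekday is 0 for Monday, so Wednesday is 2.
wednesday_sum = s[s.index.weekday == 2].sum()

# Mean per calendar month, grouping on the month number of the index.
monthly_mean = s.groupby(s.index.month).mean()

print(wednesday_sum)
print(monthly_mean)
```

Grouping on `s.index.month` is used here instead of `resample` because the month-end frequency alias differs between pandas versions.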


Practice your Python Pandas data science skills with problems on StrataScratch!
https://stratascratch.com/?via=keith
