analyzing MLB data with
ggplot
Greg Lamp
ggplot
● What is it?
● Alternatives
● How it works
● Why should I use it?
● Brief case study
● Questions
Here I am on
the Internet.
Founder/CTO @ Yhat
Hi, I’m Greg!
What is
ggplot?
DSL for graphics
scatterplot
histogram
labels
color
shape
What about
matplotlib?
a quick example
matplotlib ggplot
it’s not all bad!
matplotlib
the bad: syntax, API, default themes, learning curve
the good: maturity, IPython, customization, community
What about
d3.js?
d3.js
ggplot
ggplot d3.js
How it works
Format
ggplot
data frame
“aesthetics”
Aesthetics
color
shape
size
...fill, alpha, slope,
intercept, ymin,
ymax, ...
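A rough sketch of the idea behind aesthetics: each one names the data frame column that drives a visual property. This is a toy model in plain Python, not the ggplot library's code; the `resolve` helper, the sample records, and the column names (`inning`, `speed_mph`, `pitch_type`) are all made up for illustration.

```python
# Toy illustration only: this is NOT how the ggplot library is implemented.

def aes(**mapping):
    """Record which data column drives each visual property."""
    return dict(mapping)


def resolve(mapping, rows):
    """For every row, look up the value each aesthetic should display."""
    return [
        {aesthetic: row[column] for aesthetic, column in mapping.items()}
        for row in rows
    ]


# Hypothetical pitch records; the column names are assumptions.
pitches = [
    {"inning": 1, "speed_mph": 96.1, "pitch_type": "FF"},
    {"inning": 7, "speed_mph": 84.3, "pitch_type": "CU"},
]

mapping = aes(x="inning", y="speed_mph", color="pitch_type")
points = resolve(mapping, pitches)
print(points[0])
```

Changing what the plot shows means changing the mapping, not the data.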
Geoms,
Stats, &
Scales
geom_point
geom_area
...there are many
stat_smooth
...there are a few
scale_color_brewer
scale_color_gradient
...there are many
Layers
ggplot() +
geom_point() +
stat_smooth()
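One way to picture how `+` stacks layers is operator overloading. The sketch below is a toy model, assuming a plot is just an ordered list of layers. The `Plot` and `Layer` classes and the `describe` method are hypothetical and only borrow the names `ggplot`, `geom_point`, and `stat_smooth` from the slides; the real library may work differently.

```python
# Toy illustration only: a plot modeled as an ordered tuple of layers.

class Layer:
    def __init__(self, name):
        self.name = name


class Plot:
    def __init__(self, layers=()):
        self.layers = tuple(layers)

    def __add__(self, layer):
        # Return a new Plot so the left-hand plot is left unchanged.
        return Plot(self.layers + (layer,))

    def describe(self):
        parts = ["ggplot()"] + [f"{layer.name}()" for layer in self.layers]
        return " + ".join(parts)


def ggplot():
    return Plot()


def geom_point():
    return Layer("geom_point")


def stat_smooth():
    return Layer("stat_smooth")


base = ggplot()
plot = base + geom_point() + stat_smooth()
print(plot.describe())
```

Because each `+` returns a new object, `base` can be reused to try a different set of layers.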
Why is this
good?
Makes “reasonable
assumptions”
not real colors
matplotlib freaks
still not real colors
...but i can guess
what you mean
Concise yet
expressive
Looks pretty good
(and is easy to customize)
Seaborn: github.com/mwaskom/seaborn
Case Study
pitch speed
103.4 mph
Load ggplot and pandas
Read in our PITCHf/x data
define the x-axis
pass in your data frame
add a histogram
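For intuition about the histogram step, here is a minimal sketch of the counting a histogram layer does: group values into equal-width bins and count each bin. It uses only the standard library, not ggplot or pandas. The `histogram_counts` helper and the speed values are hypothetical, not taken from the talk's data.

```python
from collections import Counter


def histogram_counts(values, bin_width):
    """Count values per equal-width bin, keyed by each bin's lower edge."""
    counts = Counter(int(v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical pitch speeds in mph, made up for the example.
speeds_mph = [78.5, 84.0, 91.2, 92.8, 93.1, 95.6, 97.4]
counts = histogram_counts(speeds_mph, bin_width=5)
print(counts)
```

A plotting library would then draw one bar per bin, with bar height equal to the count.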
How does fatigue
impact velocity?
...not helpful
What about at the
individual level?
Justin
Verlander
ggplot lets you
fail quicker
Finding Help
/tagged/python-ggplot
http://ggplot.yhathq.com
What’s next?
Thanks!
@theglamp
greg@yhathq.com
