Data Visualization
April 3, 2015
• When you should graph
• What you should graph
• Given some data, how would you graph it
When should you graph your data?
Always
Don’t just make graphs for client reports -- graph your data for
yourself, so you understand it.
If you use a table in a report, see if you can make it into a graph.
Why graphs?
Because of the environment that humans evolved in, we are much
better at getting info from color, size, shape, and position than from
reading text.
Find the dangerous creatures!
Why graphs work
• Color
• Size
• Shape
• Position
Why else do people like graphs?
People like cool-looking stuff.
[Figure panels: Not cool | Cool]
What are we currently doing?
• Making lots of tables
Group     Mean    25%    50%    75%
Bananas   11.3    2.7    4.6   23.1
Kittens    4.0    0.9    3.6    7.5
Phones    -3.1  -11.0   -2.9    2.2

Variable         Parameter Estimate
Cuteness           0.6***
Ability to Fly     1.4***
Deadliness        11.2***
Telepathy         -9.8***
Big Ears         -17.3***
What is wrong with tables?
Tables give only a partial picture – means only tell us so much.
Figuring out what’s bigger, and by how much, requires more work.
The information is not necessarily in any order, so we need to read
all the numbers.
What kinds of graphs should you make?
• The distribution of a variable, instead of just its mean, median, etc.
• The relationship between two variables (the conditional distribution)
• Estimation results: point estimates and confidence intervals
What to expect out of this presentation
1. Discussion of the type of graph (e.g. distributions)
2. How the type of graph applies to continuous vs. categorical data
3. Extensions (e.g. graphing more than one at a time)
What not to expect: how to do these in any particular software.
Distributions
Distributions – Continuous variables
Make density plots/histograms for continuous variables. These give
much more information than means, medians, etc.
Two distributions with the same mean, but which are dramatically different.
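A minimal sketch of this point in R and ggplot2, the tools these slides were made with. The data are simulated and the names `bell`, `humps`, and `df` are made up for illustration; this is not the data behind the original figure.

```r
library(ggplot2)

set.seed(1)  # make the simulated data reproducible
n <- 1000

# Hypothetical data: a bell-shaped variable and a two-humped variable.
bell <- rnorm(n, mean = 10, sd = 1)
humps <- c(rnorm(n / 2, mean = 6, sd = 1), rnorm(n / 2, mean = 14, sd = 1))
# Shift the second sample so both sample means are equal.
humps <- humps - mean(humps) + mean(bell)

df <- data.frame(
  value = c(bell, humps),
  group = rep(c("one hump", "two humps"), each = n)
)

# One density curve per group: the means match but the shapes do not.
p <- ggplot(df, aes(value, color = group)) +
  geom_density()
p
```

A table of means would report the same number for both groups; the plot shows at a glance that they are nothing alike.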
Density vs. histogram
A density plot is basically a smoothed histogram.
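To see the two side by side, one option is to draw the density curve on top of the histogram. The sketch below uses simulated data and assumes ggplot2 3.3.0 or later for `after_stat()`.

```r
library(ggplot2)

set.seed(2)
# Hypothetical right-skewed data.
df <- data.frame(value = rgamma(500, shape = 2, rate = 0.5))

# after_stat(density) rescales the bars to a total area of 1, which puts
# the histogram on the same vertical scale as the density curve.
p <- ggplot(df, aes(value)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30,
                 fill = "grey80", color = "white") +
  geom_density(color = "steelblue")
p
```

The number of bins (here 30) plays the same role for the histogram that the smoothing bandwidth plays for the density curve: both control how much detail is kept.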
Distributions – Categorical variables
Make bar charts for categorical variables.
Tip: if your categories don’t have any inherent order, order them
from largest to smallest.
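One way to apply the ordering tip in ggplot2 is to set the factor levels by frequency before plotting. The `pet` variable and its counts are invented for this sketch.

```r
library(ggplot2)

# Hypothetical survey answers with no inherent order.
df <- data.frame(
  pet = c(rep("cat", 12), rep("dog", 20), rep("fish", 5), rep("bird", 8))
)

# Order the factor levels by how often each category occurs, largest first.
counts <- sort(table(df$pet), decreasing = TRUE)
df$pet <- factor(df$pet, levels = names(counts))

# geom_bar() counts the rows in each category and draws one bar per level,
# in level order.
p <- ggplot(df, aes(pet)) +
  geom_bar()
p
```

Without the reordering step, the bars would appear in alphabetical order, which carries no information.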
Compare distributions using color
Suppose we want to compare the distribution of income among
different occupations. Plot all the distributions, distinguished by
color, and use transparency to make them all visible simultaneously.
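A rough sketch of overlapping, semi-transparent distributions in ggplot2. The occupations and incomes are simulated placeholders, not real figures.

```r
library(ggplot2)

set.seed(3)
# Hypothetical incomes (in thousands) for three made-up occupations.
df <- data.frame(
  occupation = rep(c("baker", "pilot", "tailor"), each = 300),
  income = c(rlnorm(300, meanlog = 3.5, sdlog = 0.4),
             rlnorm(300, meanlog = 4.5, sdlog = 0.3),
             rlnorm(300, meanlog = 3.8, sdlog = 0.5))
)

# fill maps each occupation to a color; alpha below 1 makes the filled
# areas see-through, so overlapping distributions stay visible.
p <- ggplot(df, aes(income, fill = occupation)) +
  geom_density(alpha = 0.4)
p
```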
Highlighting important facts
Add vertical lines to highlight the means.
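One possible way to add the mean lines in ggplot2 is to compute the group means first and pass them to `geom_vline()`. The data and the `means` helper table are assumptions made for this sketch.

```r
library(ggplot2)

set.seed(3)
# Hypothetical incomes (in thousands) for two made-up occupations.
df <- data.frame(
  occupation = rep(c("baker", "pilot"), each = 300),
  income = c(rlnorm(300, 3.5, 0.4), rlnorm(300, 4.5, 0.3))
)

# One row per occupation holding that group's mean income.
means <- aggregate(income ~ occupation, data = df, FUN = mean)

p <- ggplot(df, aes(income, fill = occupation)) +
  geom_density(alpha = 0.4) +
  # A dashed vertical line at each group's mean, colored like its group.
  geom_vline(data = means,
             aes(xintercept = income, color = occupation),
             linetype = "dashed")
p
```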
Relationships
Relationships between variables
If we’re asking, for example, what GDP growth looks like at different
levels of government spending, we can show this using a
scatterplot.
How to show trends
We can highlight the trend using scatterplot smoothing, which
adapts the shape of the trend line to the data.
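A sketch of a scatterplot with a smoothed trend line. The slides do not say which smoother was used; LOESS is assumed here as a common choice, and the `spending` and `growth` values are simulated rather than real GDP data.

```r
library(ggplot2)

set.seed(4)
# Hypothetical data with a curved relationship plus noise.
df <- data.frame(spending = runif(200, min = 10, max = 60))
df$growth <- 5 - 0.004 * (df$spending - 30)^2 + rnorm(200, sd = 0.7)

p <- ggplot(df, aes(spending, growth)) +
  geom_point() +
  # LOESS fits many small local regressions, so the line can bend
  # wherever the data bend; the grey band is a confidence interval.
  geom_smooth(method = "loess", formula = y ~ x)
p
```

A straight-line fit would miss the bend in these data, which is the reason to prefer a smoother when the shape of the relationship is unknown.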
How to show multiple groups
We can see if the relationship differs among groups by giving each
group a color.
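A minimal illustration with two simulated groups whose relationships run in opposite directions. The variable names and the per-group linear fit are choices made for this sketch.

```r
library(ggplot2)

set.seed(5)
# Hypothetical data: the slope of y on x differs between two groups.
df <- data.frame(
  x = runif(200, 0, 10),
  group = rep(c("A", "B"), each = 100)
)
df$y <- ifelse(df$group == "A", 1 + 0.8 * df$x, 6 - 0.3 * df$x) +
  rnorm(200, sd = 1)

# Mapping color in the top-level aes() applies it to both layers,
# so each group gets its own colored points and its own trend line.
p <- ggplot(df, aes(x, y, color = group)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
p
```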
Another use for colors
Suppose we want to come up with rules to identify people’s favorite
food based on population density and elevation (bear with me).
Can we see this on a graph?
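A sketch of what such a graph could look like. The data frame is simulated, and the threshold rules and food names are invented so that the colored regions are easy to spot; only the column names and the final `ggplot()` call follow the code shown near the end of the slides.

```r
library(ggplot2)

set.seed(9)
# Made-up data: favorite food is assigned by simple threshold rules.
df <- data.frame(
  population_density = runif(300, 0, 100),
  elevation = runif(300, 0, 3000)
)
df$favorite_food <- ifelse(df$population_density > 50, "pizza",
                    ifelse(df$elevation > 1500, "fondue", "barbecue"))

# x and y give each point its position; color shows the category.
p <- ggplot(df, aes(population_density, elevation, color = favorite_food)) +
  geom_point()
p
```

When the colors separate into clear regions like this, the boundaries between regions suggest the classification rules.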
Graphing relationships with categorical data
With categorical data, you typically can’t use scatterplots because
points fall right on top of each other (‘overplotting’).
However, we can use jittering, which adds a small random offset to each
plotted point so that overlapping points become visible.
[Figure panels: Without jittering | With jittering]
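A small sketch of jittering in ggplot2, using made-up survey answers on 1 to 5 scales. The `width` and `height` values are illustrative, not a recommendation.

```r
library(ggplot2)

set.seed(6)
# Hypothetical survey: two answers on 1-5 scales, so many respondents
# share exactly the same (x, y) pair.
df <- data.frame(
  satisfaction = sample(1:5, 300, replace = TRUE),
  recommend = sample(1:5, 300, replace = TRUE)
)

# Only 25 distinct positions exist, so geom_point() would hide most rows.
# geom_jitter() adds a small random offset to each point before drawing.
p <- ggplot(df, aes(satisfaction, recommend)) +
  geom_jitter(width = 0.2, height = 0.2, alpha = 0.5)
p
```

Swapping `geom_jitter(...)` for `geom_point()` reproduces the overplotted version for comparison.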
Graphing relationships with categorical data
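The next step beyond jittering is to use a boxplot, which shows
– The median (the line inside the box),
– 25th and 75th percentiles (the edges of the box),
– whiskers reaching to the most extreme points within 1.5 times the inter-quartile range (IQR) of the box edges,
– outliers beyond the whiskers (plotted as points)
[Figure labels: median; 75th percentile; 75th percentile + 1.5 * IQR; outlier]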
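The sketch below draws a boxplot with ggplot2 and also computes the numbers behind one box by hand. Note that `geom_boxplot()` draws the median as the center line and uses the quartiles plus or minus 1.5 times the IQR as the limits for the whiskers. The data are simulated and the names `q`, `iqr`, and `upper_fence` are made up for illustration.

```r
library(ggplot2)

set.seed(7)
# Hypothetical measurements for three groups.
df <- data.frame(
  group = rep(c("A", "B", "C"), each = 100),
  value = c(rnorm(100, 10, 2), rnorm(100, 12, 4), rexp(100, rate = 0.1))
)

# The numbers behind one box, computed by hand for group A.
a <- df$value[df$group == "A"]
q <- quantile(a, c(0.25, 0.5, 0.75))  # box edges and center line
iqr <- q[3] - q[1]
upper_fence <- q[3] + 1.5 * iqr       # points above this are drawn as outliers
lower_fence <- q[1] - 1.5 * iqr       # points below this are drawn as outliers

p <- ggplot(df, aes(group, value)) +
  geom_boxplot()
p
```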
Looping back
A boxplot isn’t, after all, all that different from the multi-colored
density plot we showed earlier. Which is better depends on what
you’re trying to show.
Use log scale if your data spans a wide range
Let’s say you have a large range of values, but most of your data is
concentrated in one part of the range.
It’s easier to see what’s going on when we use a log scale.
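A minimal sketch of switching both axes to a log scale in ggplot2. The data are simulated to span several orders of magnitude; log scales assume all values are positive.

```r
library(ggplot2)

set.seed(8)
# Hypothetical data spanning several orders of magnitude.
df <- data.frame(x = rlnorm(300, meanlog = 5, sdlog = 2))
df$y <- 2 * df$x^0.5 * rlnorm(300, sdlog = 0.3)

# On linear axes most points pile up near the origin. Log axes give
# each factor of ten the same amount of space.
p <- ggplot(df, aes(x, y)) +
  geom_point(alpha = 0.5) +
  scale_x_log10() +
  scale_y_log10()
p
```

Dropping the two `scale_*_log10()` lines shows the same data on linear axes for comparison.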
Estimation results
Graphing estimation results
We make a lot of regression tables, but we can make them easier to
understand by putting them into graphs.
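One common way to graph a regression table is a coefficient plot: a point for each estimate and a line for its confidence interval. The sketch below fits an arbitrary model to R's built-in `mtcars` data purely as an example; the model and the `est` helper table are assumptions, not the analysis behind the original slide.

```r
library(ggplot2)

# Fit a regression on R's built-in mtcars data, purely as an example.
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Collect point estimates and 95% confidence intervals, dropping the intercept.
ci <- confint(fit, level = 0.95)
est <- data.frame(
  term = names(coef(fit)),
  estimate = unname(coef(fit)),
  lower = ci[, 1],
  upper = ci[, 2]
)
est <- est[est$term != "(Intercept)", ]

p <- ggplot(est, aes(term, estimate)) +
  geom_hline(yintercept = 0, linetype = "dashed") +   # reference line at zero
  geom_pointrange(aes(ymin = lower, ymax = upper)) +  # estimate and interval
  coord_flip()                                        # terms down the side
p
```

An interval that crosses the dashed zero line corresponds to an estimate that would not earn stars in the table, and the relative sizes of the effects can be read off by position.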
ggplot(df, aes(population_density, elevation, color = favorite_food)) +
  geom_point()
– df: the dataset
– population_density: the x variable
– elevation: the y variable
– color = favorite_food: the color variable
– geom_point(): make scatterplot
All graphs made in R and ggplot2
Data Visualization Checklist
• Always graph
• Use color, size, shape, and position
• Three important types of graph:
– Distribution
– Relationship
– Estimation results
• Highlight important facts
• Make it cool-looking

Data Visualization by David Kretch
