Make sure you avoid these common mistakes when visualizing data, and pick up some best practices to follow.
As someone who’s been doing data visualization for over 10 years, I’ve come across so many mistakes people make. The most common errors include:
Using the wrong graphs/charts for their particular purpose
Not making the best use of colors
Creating misleading graphs/charts
Trying to incorporate too much information in one graph
Here are some examples of each so you can learn to avoid them.
Using the wrong graphs/charts:
“Which graphs should I be using?”
As a general rule of thumb:
Bar charts are for showing the relationship between 1 categorical variable (e.g. color, car model, gender) against 1 numerical variable (height, test scores, IQ and other measurements).
Pie charts are for the same thing, but aren’t very good for data that contains more than 2-3 categories. E.g. it’s fine for gender since it only has male, female and other, but it’s terrible for listing all car models.
Scatterplots are for finding correlations between 2 numerical variables.
Time series are for showing changes over time (time vs. numerical variable).
There are many more than this, but those are the main ones. To learn about this in greater detail, read our post about data visualization techniques.
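The rule of thumb above can be written down as a small lookup. This is only a sketch: the function name `suggest_chart`, the labels "categorical", "numerical" and "time", and the hard cut-off of 3 categories for pie charts are assumptions made for illustration, not a standard API.

```python
def suggest_chart(x_kind, y_kind, n_categories=None):
    """Suggest a chart type from the kinds of the two variables.

    x_kind and y_kind are "categorical", "numerical" or "time".
    The 3-category cut-off for pie charts mirrors the rule of thumb
    in the text and is an assumption, not a hard limit.
    """
    if x_kind == "time" and y_kind == "numerical":
        return "time series"
    if x_kind == "numerical" and y_kind == "numerical":
        return "scatterplot"
    if x_kind == "categorical" and y_kind == "numerical":
        if n_categories is not None and n_categories <= 3:
            return "pie chart or bar chart"
        return "bar chart"
    return "no rule of thumb applies"


print(suggest_chart("categorical", "numerical", n_categories=3))
print(suggest_chart("categorical", "numerical", n_categories=12))
print(suggest_chart("numerical", "numerical"))
```

Anything the rules don't cover falls through to the last line on purpose, since the list above only covers the main chart types.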
Bad example #1: Presenting qualitative data
Not all data fits into graphs or charts. Take employee details, for instance: first and last name, email address, ethnicity, job title, etc.
The biggest mistake would be to present the raw data like this:
Just because a dataset contains a bunch of qualitative data like "name" and "email address" doesn't mean it can't be visualized.
There are two ways to visualize it:
Card View
Gallery View
Card view is good for visualizing raw data:
Gallery view is good for visualizing data with images (for instance: employee headshot photos). An example of gallery view is FlixGem.
Both of these visualizations do more than make things "look nicer": they let you easily filter through the data with interactive tags. This is important for both data analysis and presentation.
How to create visualizations for qualitative data:
It might look complicated to create, but existing tools make your job dead simple:
Choose the desired layout: Grid view, card view or gallery view
Bad example #2: Pie chart with too many categories
Pie charts are best used when there are 2-3 items that make up a whole. Any more than that, and it’s difficult for the human eye to distinguish between the parts of a circle.
Notice how it’s hard to distinguish the size of these parts.
Is “China” bigger than “Other”?
It’s hard for our eyes to tell the difference. Instead, replace this with a bar chart:
Good example: Proper bar chart
Notice how “China” and “Other” are far apart, yet we can easily tell which one is larger? That’s because our eyes are more sensitive to the length of bars than to the parts of a circle.
Bar charts will be your go-to chart for data visualization.
Bad example #3: Multi-colored bar charts
It might look pretty, and you might be wondering “what’s wrong with it?”
The more colors you use, the less comprehensible the visualization will be. More colors = more categories the brain must process.
On top of that, there’s a better way to handle colors:
Good example: Proper color design
Colors allow us to highlight whatever information we want.
If we want to highlight the country with the biggest CO2 emissions, we can use red vs. grey:
Notice how China immediately sticks out and we get the point across.
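Here is a rough sketch of the red-vs-grey idea in Python. The helper names `highlight_colors` and `plot_highlighted` are made up for this example, matplotlib is assumed as the plotting library, and the labels and numbers are placeholders rather than real emissions data.

```python
def highlight_colors(values, highlight="red", muted="lightgrey"):
    """Return one color per bar: `highlight` for the largest value, `muted` for the rest.

    If several bars tie for the largest value, all of them are highlighted.
    """
    top = max(values)
    return [highlight if v == top else muted for v in values]


def plot_highlighted(labels, values, title):
    # matplotlib is imported here so highlight_colors works without it installed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.bar(labels, values, color=highlight_colors(values))
    ax.set_ylim(bottom=0)  # keep the bars anchored at zero
    ax.set_title(title)
    return fig


# Placeholder labels and numbers for illustration only, not real emissions figures
labels = ["A", "B", "C", "D"]
values = [10, 4, 3, 2]
print(highlight_colors(values))
```

Calling `plot_highlighted(labels, values, "Example")` should then draw one red bar among grey ones, so the eye lands on the bar you care about.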
Other times, it’s a good idea to use multiple shades of the same color.
Another good example: Proper pie chart
With only 2-3 items making up a whole, a pie chart works well because each slice is easy to tell apart.
Bad example #4: Horizontal bar charts
Horizontal bar charts suffer from the same issue as pie charts: once there are too many categories, you run out of space to include text and it becomes hard to digest:
Instead, it’s better to use vertical bar charts (by switching the axes around):
Good example: Vertical bar charts
This leaves far more room for text and is easier for the brain to digest.
Bad example #5: Too much information
Here’s an example of someone trying to include too much information on one chart:
Including too much information ruins the point of data visualization in the first place. The purpose of data visualization is to allow the audience to easily digest the information and this graph does the opposite of that.
Instead, take the time to rearrange your data and create multiple graphs to convey your point.
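One way to do the rearranging, sketched in Python: split the data by group first, then give each group its own small chart. The row layout `(group, x, y)`, the helper names and the sample rows are all assumptions for illustration, and matplotlib is assumed for the plotting step.

```python
from collections import defaultdict


def split_by_group(rows):
    """Group (group, x, y) rows into {group: ([x, ...], [y, ...])}, one entry per chart."""
    series = defaultdict(lambda: ([], []))
    for group, x, y in rows:
        xs, ys = series[group]
        xs.append(x)
        ys.append(y)
    return dict(series)


def plot_small_multiples(rows):
    # Imported lazily so split_by_group can be used without matplotlib
    import matplotlib.pyplot as plt

    series = split_by_group(rows)
    # One panel per group, sharing the y-axis so the panels stay comparable
    fig, axes = plt.subplots(1, len(series), sharey=True, squeeze=False)
    for ax, (name, (xs, ys)) in zip(axes[0], series.items()):
        ax.plot(xs, ys)
        ax.set_title(name)
    return fig


# Placeholder rows for illustration only
rows = [("north", 1, 5), ("south", 1, 7), ("north", 2, 6), ("south", 2, 4)]
print(split_by_group(rows))
```

Sharing the y-axis is a design choice: it keeps the separate charts honest when readers compare them side by side.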
Bad example #6: 3D graphs
Studies have shown that 3D rendering can negatively affect graph comprehension. It might be tempting to be creative and ‘3D’ your graphs, but there are better ways to get creative.
Bad example #7: Charts that don’t start at zero (misleading)
Sometimes it’s okay to break this rule, but in general:
Bar charts should always start at 0, because our eyes are very sensitive to the size of bars.
Scatterplots and time series should almost always start at 0.
Line graphs can sometimes break this rule.
Since the y-axis doesn’t start at 0, it’s easy to fool someone into thinking that product 2 is failing, but in actuality:
The same applies to other graphs like time series:
Since the y-axis doesn’t start at 0, it’s easy to fool someone into thinking that the price of something is rising exponentially when in actuality the increase is only about 10-20%.
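To see how much a truncated axis exaggerates a difference, compare the ratio of the real values with the ratio of the bar lengths that actually get drawn. The function `visual_ratio` and the numbers below are invented for this sketch.

```python
def visual_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar lengths for values a and b when the axis starts at `baseline`."""
    if baseline >= min(a, b):
        raise ValueError("baseline must be below both values")
    return (a - baseline) / (b - baseline)


# Placeholder numbers: two values that differ by 10%
a, b = 110, 100
print(visual_ratio(a, b))               # ratio with the axis starting at 0
print(visual_ratio(a, b, baseline=95))  # ratio the reader sees with the axis starting at 95
```

With the axis starting at 95, the first bar is drawn three times as long as the second even though the underlying values differ by only 10%.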
Bad example #8: Tables with no context
Spreadsheets and pivot tables with no context are meaningless.
Look at this pivot table:
It's a pivot table showing which product line and gender are generating the most income. Even though it's ordered from highest to lowest income generated, what exactly do these numbers mean? How high is $1580?
These numbers are meaningless without context.
Good example: Table with context
Instead of just giving a raw number, it's highly recommended to show its deviation from the mean, that is, how far the number is from the average:
Now we can look and go "Oh $1580 is 23% above the mean."
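The percentage itself is simple arithmetic: subtract the mean, divide by the mean, multiply by 100. Here is a minimal sketch. The function name `percent_from_mean` and the income figures are placeholders, not the values from the pivot table above, and the mean is assumed to be non-zero.

```python
def percent_from_mean(values):
    """Return each value's deviation from the mean as a percentage of the mean.

    Assumes the mean is not zero, otherwise the division fails.
    """
    mean = sum(values) / len(values)
    return [round((v - mean) / mean * 100, 1) for v in values]


# Placeholder incomes for illustration only
incomes = [1500, 1000, 500]
for income, pct in zip(incomes, percent_from_mean(incomes)):
    print(f"${income}: {pct:+.1f}% vs. mean")
```

Here the mean is 1000, so 1500 comes out as 50% above the mean and 500 as 50% below it.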
Creating these might be off-putting to some people since it takes more time and effort, but a tool like Polymer Search does all of this automatically for you - and creates pivot tables faster than Excel.
Overall
Once you learn the many data visualization techniques, know when to use each graph and become aware of all the good and bad practices, you’ll be a pro data analyst in no time!
There are many ways to enhance your visualization skills - with Polymer Search you’ll be able to generate interactive graphs/charts/pivot tables in a matter of seconds. You simply upload your data and the AI will automatically turn it into an interactive spreadsheet and provide quick and easy data visualization tools that are available in no other tool.
You’ll also be able to create your own web app in a couple of minutes with no coding experience required. Simply upload your dataset and Polymer will automatically transform it into a web application where you can share all your visualizations.