7A: Visualization

Minard Map of Napoleon's 1812 Campaign

This week we are going to explore some dos and don’ts of data visualization as you prepare for your final projects. Edward Tufte is widely considered one of the world’s leading data visualization gurus and has been called everything from the “Leonardo da Vinci of data” to the “Galileo of graphics.” Tufte will be our guide as we think through what good visualizations say and how bad data displays can lie, distort, or even undermine your intended argument.


The Minard Map


It may well be the best statistical graphic ever drawn.

—Edward Tufte, The Visual Display of Quantitative Information (1983)

Charles Minard’s Map of Napoleon’s Russian Campaign of 1812

The Classic Discussed

Static variants (Do they add anything?)

Interactive variants (Does it help to be able to manipulate the map?)



  • Why is this considered such a landmark visualization, if not the best ever?
  • What are the key features that make it stand out?
  • How would you improve on it, if you were to take a stab?


Keeping it Honest: How Not to Lie with Pictures


This may well be the worst graphic ever to find its way into print.

—Edward Tufte, The Visual Display of Quantitative Information (1983)

We’ve already discussed how not to lie with maps, but it is just as easy to lie with other visualizations. One of the biggest issues that Tufte stresses in his seminal work is how to stay honest with infographics. One of the easiest errors to make, for instance, is to scale the radius of a circle, or just one dimension of a two-dimensional shape that keeps its proportions, in direct proportion to your data. The area then grows with the square of the value, so the shape looks massively larger than your data actually warrants.

  • Explore this gallery of images illustrating “The Lie Factor”
    • Think about how the literal measurements of the images contradict the implicit argument the graphic is trying to make.
  • Explore some more of Michael Friendly’s gallery of the Best and the Worst of Statistical Graphics
    • What mistakes did you not think of before that you might want to avoid?
    • What examples might you like to emulate for your own projects?
    • Why?
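
To make the circle-scaling problem concrete, here is a minimal Python sketch of Tufte’s Lie Factor, which divides the size of the effect shown in the graphic by the size of the effect in the data. The numbers (a value that doubles from 10 to 20) and the function names are made up for illustration; they do not come from any of the galleries above.

```python
import math


def effect_size(first, second):
    """Relative change from the first value to the second."""
    return abs(second - first) / first


def lie_factor(data_first, data_second, shown_first, shown_second):
    """Effect shown in the graphic divided by the effect in the data."""
    return effect_size(shown_first, shown_second) / effect_size(data_first, data_second)


def circle_area(radius):
    return math.pi * radius ** 2


# Hypothetical data: a value that doubles from 10 to 20.
data_a, data_b = 10.0, 20.0

# Mistake: draw each circle with a radius proportional to the value.
wrong = lie_factor(data_a, data_b, circle_area(data_a), circle_area(data_b))

# Fix: make the radius proportional to the square root of the value,
# so that the area (what the eye compares) is proportional to the data.
right = lie_factor(
    data_a, data_b,
    circle_area(math.sqrt(data_a)), circle_area(math.sqrt(data_b)),
)

print("radius scaled to data:", round(wrong, 2))
print("area scaled to data:  ", round(right, 2))
```

With these assumed numbers the data doubles, but the first pair of circles quadruples in area, so the graphic overstates the change by a factor of about three. A Lie Factor close to 1 means the picture is telling the same story as the numbers.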



Google Motion Charts (Gapminder)

One of the most impressive data visualization breakthroughs of recent years was Hans Rosling’s invention of Gapminder: an application that really unleashed the “fourth dimension” of time and allowed data to be animated in an immediately understandable and powerful way.  His TED talk below illustrating global health data with the tool is legendary.

Google bought the technology and made it available for all to use as Motion Charts.



We’ve already explored some visualization environments, but here are two more very impressive tools to check out:

Choose one and check it out to see what people are creating.  Download or join a service and see if you can create something.

  • Upload some of your own data, or download data sets from around the web
  • One interesting source is JSTOR for Research, which aggregates all the scholarly literature housed by JSTOR