
Link: Data Stories Podcast 2014 Review

Thu, 01/22/2015 - 15:17



Episode 46 of the Data Stories podcast features Andy Kirk and yours truly in an epic battle for podcast dominance (well, a review of the year 2014). It complements my State of Information Visualization posting well, and of course there is a bit of overlap (I wrote that posting after we recorded the episode – Moritz and Enrico are so slow). There are lots of differences, though, and the podcast has the advantage of not just me talking. We covered a lot of ground, starting from a generally downbeat take on the year and ending up with quite a few things to talk about (just check out the long list of links in the show notes!).

Link: Data Viz Done Right

Wed, 01/21/2015 - 15:17



Andy Kriebel’s Data Viz Done Right is a remarkable little website. He collects good examples of data visualization and talks about what works and what doesn’t. He does have bits of criticism sometimes, but he always has more positive than negative things to say about his picks. Good stuff.

Why Is Paper-Writing Software So Awful?

Mon, 01/19/2015 - 03:58



The tools of the trade for academics and others who write research papers are among the worst software has to offer. Whether it’s writing or citation management, there are countless issues and annoyances. How is it possible that this fairly straightforward category of software is so outdated and awful?

Microsoft Word

The impetus for this posting came from yet another experience with one of the most widely used programs in the world. Among some other minor edits on the final version of a paper, I tried to get rid of the blank page after the last one. Easy, just delete the space that surely must be there, right? No, deleting the space does nothing. It doesn’t get deleted, or it comes back, or I don’t know what.

So I select the entire line after the last paragraph and delete that. Now the last page is gone, but the entire document was also just switched from a two-column layout to a single column. Great.

People on Twitter tell me that Word stores formatting information in invisible characters at the end of paragraphs. That may be the case; I really do not care. But the fact that it’s possible for me to delete something I can’t see and thereby wreck my entire document has to be some sort of cruel joke. Especially for a program that has been around for so long and is used by millions of people every day.

Word has a long history (it was first released in 1983, over 30 years ago), and carries an enormous amount of baggage. Even simple things like figure captions and references are broken in interesting ways. Placing and moving figures is problematic to say the least. Just how poorly integrated some of Word’s features are becomes apparent when you try to add comments to a figure inside a text box (you can’t) or replace the spaces before the square brackets inserted by a citation manager with non-breaking ones (Word replaces the entire citation rather than just the opening bracket, even though only the bracket matches the search).

In trying to be everything to everybody, Word does many things very, very poorly. I have tried alternatives, but they are universally worse. I generally like Pages, but its lack of integration with a citation manager (other than the godawful Endnote) makes it a no-go.


LaTeX

We all know that you write serious papers in LaTeX, right? Any self-respecting computer scientist composes his formula-laden treatises in the only program that can insert negative spaces exactly where you need them. LaTeX certainly doesn’t have the issues Word has, but it has its own set of problems that make it only marginally better (if at all).

It is also starting to seriously show its age. TeX, the typesetting system LaTeX is built on, was released in 1978, almost 40 years ago. LaTeX made its debut in 1984, over 30 years ago. These are some of the oldest programs still in widespread use, and LaTeX isn’t getting anywhere near the development resources Word does.

While a lot of work has been done to keep it from falling behind entirely (just be thankful that you can create PDFs directly without even having to know what a dvi file is, or how bad embedded bitmap fonts were), there are also tons of issues. Need a reference inside a figure caption? Better know what \protect does, or the 1970s-era parser will yell at you. Forgot a closing brace? Too bad, you’ll have to find it by scanning through the entire document manually, even though TeX’s parser could easily tell you if it had been updated in the last 20 years. Want to move a figure? Spend 15 minutes moving the figure block around in the text and hope there’s a place where it’ll fall where you want it. And the list goes on.
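For those who have been spared, the \protect incantation looks something like this (a minimal sketch; the citation key and file name are made up):

```latex
\begin{figure}
  \includegraphics{results-chart}
  % \cite is "fragile" inside a moving argument like \caption,
  % so it has to be shielded with \protect or LaTeX chokes:
  \caption{Response times, replotted from \protect\cite{smith2010}.}
\end{figure}
```

Forget the \protect and you get a page of inscrutable error messages rather than a hint about what actually went wrong.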

And then there are the errors you can’t even fix directly. The new packages that insert links into the bibliography are great, except when the link breaks over a column boundary, which causes an error that you can’t avoid. All you can do is add or remove text so the column boundary falls differently. Great fun when this happens right before a deadline.

Citation Managers

In the old days, putting your references together was a ton of work: you had to collect them in one place, keep the list updated when you wanted to add or remove one, then sort and format them, and maybe turn the placeholder references in the paper text into numbers. Any time you’d add or remove one, you had to do it over again.


Enter bibliography software. In the dinosaur corner, we have BibTeX. As the name suggests, it works with (La)TeX. And it’s almost as old, having been released in 1985. It stores all its data in a plain text file with a very simple (and brittle) format, and you have to run LaTeX three times to make sure all references are really correct. This puts even the old two-pass compilers to shame, but that’s how BibTeX works.
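For those who have never seen one, a BibTeX database is just a series of entries like this in a .bib text file (a made-up example; a missing comma or an unbalanced brace can quietly derail the whole build):

```bibtex
@article{smith2010,
  author  = {Jane Smith and John Doe},
  title   = {A Made-Up Paper on Uncertainty},
  journal = {Journal of Hypothetical Results},
  year    = {2010},
  volume  = {12},
  pages   = {34--56}
}
```

The multi-run ritual exists because the first LaTeX pass only collects the \cite keys, BibTeX then resolves them against the .bib file, and only the subsequent passes get the references and numbering right.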

There are programs that provide frontends for these text files, and they’re mostly ugly and terrible. A notable exception here is BibDesk, especially if you’re in the life sciences. It works really well and doesn’t get in the way. It’s an unassuming little program, and it gets updated pretty continuously. What it does, it does really quite well.

But the rest of the field is as horrifying a train wreck as the writing part.


Mendeley

I can’t quite share in the doomsday-is-here wailing that started when Elsevier bought Mendeley, and I haven’t seen any terrible decisions yet. What drives me up the wall are simply the bugs, the slowness, and the things you expect to work but don’t.

Why does All Documents not include all documents? Why do I have to drag a paper I imported into a group into All Documents so it shows up there? Why are papers in groups copies instead of references, so that when I update one, the other one doesn’t get updated? The most basic things are so incredibly frustrating.

To be fair, Mendeley is constantly improving and is nowhere near as terrible as it was a year or two ago. It still has a ways to go, though. And I really hope they get serious about that iPad app at some point.


Papers

I’m trying to love Papers. I really do. It’s a native Mac app (though there’s now also a Windows version). It looks good. But it manages to be buggy and annoying in many places where Mendeley works well.

For one, the search in Papers is broken. I cannot rely on it to find stuff. It’s an amazingly frustrating experience when you search for an author and can’t see a particular paper you’re sure is there, and then search for its title and there it is! The ‘All Fields’ setting in the search also doesn’t seem to include nearly all fields, like the author. And matching papers against the global database has its own set of pitfalls and annoyances (like being able to edit fields in a matched paper only to have your edits cheerfully thrown away when you’re not looking). The list goes on (don’t open the grid view if you have large PDFs in your collection, etc.).


EndNote

Listed only for completeness. Beyond terrible. Written by some sort of committee that understands neither paper writing nor software. I really can’t think of any non-academic commercial software that’s worse (within the category of software for academic users, it’s neck and neck with that nightmare that is Banner).

A Better Way?

How is it possible that the tools of the trade for academics are so outdated, insufficient, and just plain terrible? Is there really nothing smarter in writing tools than treating text as a collection of letters and spaces? Can’t we have a tool that combines reasonable layout (the stuff LaTeX is good at, without the parts it sucks at) with a decent reference manager?

This isn’t rocket surgery. All these things have well-known algorithms and approaches (partly due to the work that went into TeX and other systems). There have also been advances since the days when Donald Knuth wrote TeX. Having classics to look back at is great, but being stuck with them is not. And it’s particularly infuriating in what is supposed to be high technology.

What I understand even less is that there are no tools that treat text more semantically. Why can’t I treat paragraphs or sections as objects? Why doesn’t a section know that its title is part of it and thus needs to be included when I do something to it? Why don’t word processors let me fold a paragraph, section, or chapter, like some source code editors do? Why can’t figures float while I move them and anchor only to certain positions given the constraints in a template?

There are so many missed opportunities here, it’s breathtaking. There has to be a world beyond the dumb typewriters with fancy clipart we have today. Better, more structured writing tools (like Scrivener, but with a reference manager) have got to be possible and viable as products.

We can’t continue writing papers with technology that hasn’t had any meaningful updates in 30 years (LaTeX) or that tries to cover everything that contains text in some form (Word). There has got to be a better way.

Links: 2014 News Graphics Round-Ups

Wed, 01/14/2015 - 15:17



It used to be difficult to find news graphics from the main news organizations. In the last few years, they have started to post year-end lists of their work, which are always a treat to walk through. With the new year a few weeks behind us, this is a good time to look at these collections of news graphics.

Slightly different, but worth a special mention, is NZZ’s amazing visualization of all their articles from the year, Das Jahr 2014 in der «Neuen Zürcher Zeitung» (in German).

The State of Information Visualization, 2015

Mon, 01/12/2015 - 05:14



It seems to be a foregone conclusion that 2014 was not an exciting year in visualization. When we recorded the Data Stories episode looking back at 2014 last week (to be released soon), everybody started out with a bit of a downer. But plenty of things happened, and they point to even more new developments in 2015.

If this was such a boring year, how come Andy Kirk posted a round-up of the first six months and another posting for the second half of the year, both with many good examples? Or how about Nathan Yau’s list of the best data vis projects of the year? So yeah, things happened. New things, even.

Academic InfoVis

I’m still awed by the quality of InfoVis 2014. It wasn’t even just the content of the papers that was really good, it was the whole package: present interesting new findings, present them well, make your data and/or code available. This had never happened with such consistency and at that level of quality before.

The direction of much of the research is also different. There were barely any new technique papers, which is largely a good thing. For a while, there were lots of new techniques that didn’t actually solve any real problems, but were assumed to be the way forward. Now we’re seeing more of a theoretical bent (like the AlgebraicVis paper), more basic research that looks very promising (e.g., the Weber’s Law paper), and papers questioning long-held assumptions (the bar charts perception paper, the error bars paper, the paper on staged animation, etc.).

Thoughtfully replicating, critiquing, and improving upon oft-cited older papers should be a valid and common way of doing research in InfoVis. The only way forward in science is to keep questioning beliefs and ideas. It’s good to see more of this happening, and I hope that this trend continues.


Storytelling

I talked about storytelling at the beginning of last year, and 2014 was clearly a big year for it. Besides the Story Points feature in Tableau 8.2, there have been many interesting new approaches to building more compelling stories from data.

Some new formats are also emerging, like Bloomberg View’s Data View series (unfortunately, there doesn’t seem to be a way to list all of them). I’m not yet convinced by the ever more common “scrollytelling” format, and have seen some really annoying and distracting examples. I don’t entirely agree with Mike Bostock’s argument that scrolling is easier than clicking, but he at least has some good advice for people building these sorts of things.

There was also a bit of a discussion about stories between Moritz Stefaner and myself, with Moritz firing the first shot, my response plus a definition of story, and finally a Data Stories episode about data stories where we sorted it all out.

There is no doubt that we’ll see more of this in the coming years. The tools are improving and people are starting to experiment and learn what works and what doesn’t. I hope that we will also see more and deeper academic work in this area.

Non-Academic Conferences

Speaking of conferences, like InfoVis, only different: these may not be new, but they are continuing. Tapestry, OpenVis, Visualized, eyeo, etc. are all connecting people from different disciplines. People talking to each other is good. Conferences are good.

That all these conferences are viable (and eyeo is basically impossible to get into) is actually quite remarkable. There is an interest in learning more. The people speaking there are also interesting, because they are not all the usual suspects. Journalists in particular did not use to speak much outside of journalism conferences. They have interesting things to say. People want to hear it.

The Rise of Data Journalism

FiveThirtyEight. Vox. The Upshot. They all launched (or relaunched) last year. Has it all been good? No. Nate Silver’s vow to make the news nerdier is off to a good start, but there is still a long way to go. Vox has gotten too many things wrong and, quite frankly, needs to slow down and rethink its publish-first-check-later approach. There is also a bit of a cargo cult going on, where every story involving numbers is suddenly considered data journalism.

But even with some of the false starts and teething problems, it’s clear that data in journalism is happening, and it is becoming more visible.

What Else 2015 Will Bring

In addition to the above, I think it’s clear that the use of visualization for communication and explanation of data will continue outside of journalism as well. Analysis is not going away of course, but more of its results will be visual rather than turned into tables or similar. The value of visualization is hardly limited to a single person staring at a screen.

This is also being picked up on the academic side. I think we will see more research published in this direction, more focused on particular ideas and more useful than what has been done so far (which has been mostly analysis).

Finally, I’m looking forward to more good writing about visualization. Tamara Munzner’s book came out last year, but since I haven’t read it yet, I can’t say anything other than that I have very high expectations. Several other people are also working on books, including Cole Nussbaumer, Andy Kirk, and Alberto Cairo (the latter two are slated to come out in 2016, though).

I didn’t think that 2014 was a bad year for information visualization. And I think 2015 and beyond will be even better.

The Island of Knowledge and the Shoreline of Wonder

Mon, 01/05/2015 - 04:17



In his keynote at IEEE VIS in Paris two months ago, Alberto Cairo talked about journalism, visual explanations, and what makes a good news visualization. But mostly, he talked about curiosity.

When I wrote my IEEE VIS report for Tuesday of that week, I knew that I could either do a shoddy job of describing the keynote and get the posting done, or have to push the entire thing back by a few days. So I decided to turn this into a separate posting.

The goal of writing up the talk here is not to provide a full recap – even though I could probably give the talk for him now, having seen variations of it three times in as many months. Instead, I want to pick out a few topics I find particularly interesting and universally relevant.


He started the talk with questions his kids ask him, like one from his 7-year-old daughter: why don’t planets stop spinning? That’s an amazingly deep question when you think about it, and even more so for a 7-year-old.

Alberto then went through some explanations, at the end of which he drew an interesting comparison: he likened the momentum of a planet’s rotation to the way answers can set his daughter’s mind in motion to produce more questions. Both keep spinning unless there’s a force to slow them down.

I particularly like the succinct way he put it: Good answers lead to more good questions. That sounds a lot like data analysis to me. And also to science. It’s quite satisfying to see a unifying theme between explanation and analysis: curiosity.

More knowledge leading to more questions is a fascinating idea. Cairo uses a quote by Ralph W. Sockman (also the basis for a book by Marcelo Gleiser), The larger the island of knowledge, the longer the shoreline of wonder. The island of knowledge is surrounded by an infinite sea of mystery. As the island grows, so does its shoreline, which is where wonder and new ideas happen.

I love this because it describes exactly the way science works. More knowledge always leads to more questions. Curiosity feeds itself. And it goes contrary to the idea that science takes away the mystery or beauty of nature by explaining things.

It’s More Complicated Than That

Getting back to journalism, Alberto lists a series of principles for a good visualization. It has to be…

  • Truthful
  • Functional
  • Beautiful
  • Insightful
  • Enlightening

This set of criteria is strongly based on journalistic practice and principles, and I think it makes a great package for the evaluation of any kind of visualization. Some of the criteria will look odd to the typical visualization person, such as the inclusion of beauty. But this is also what makes Alberto’s book so useful in teaching visualization courses: it goes beyond the typical limited horizon of the technical and largely analytical (rather than communication-oriented) mindset that is still prevalent in visualization.

Another part of this section was my final take-away, another great little sentence that I think needs to be appreciated more when working with data: it’s more complicated than that. Many times, it’s hard to appreciate the complexity and complications in the data, especially when things look convincing and seem to all fit together. But simple explanations can often be misleading and hide a more complex truth. The curious mind keeps digging and asking more questions.

Images from Alberto Cairo’s slides, which he kindly allowed me to use.

eagereyes will be bloggier in 2015

Tue, 12/30/2014 - 04:17



I always mess with my site around the new year, and this year is no exception. In addition to a new theme, I’ve also been thinking about content. Here are some thoughts on what I want to do in 2015.

I don’t know what it is, but I always start hating my website theme after about a year. We’ll see if this one is any different. Either way, it’s new. If you’re curious, this is the new Twenty Fifteen Theme that’s part of WordPress 4.1, with some minor tweaks. It’s nice, simple, clean, and has a few subtle little features.

It’s also decidedly a blog theme, with a focus on images. I’ve been using teaser images for most postings for a while now, and will make a bigger effort to find good and fitting ones. These may not even show up in your newsreader, especially for link posts (though you will see them on Facebook and in the Twitter cards). But they make the site a lot nicer to look at and navigate.

As for content, there are mainly two things. One is that I want to make some more use of the post formats in WordPress, in particular links. These are different in that their title link goes to the page I want to link to, rather than a posting. The text that goes with each will also be short, so you’ll be able to see the entire thing on the front page. If you care to comment, you can click on the image to go to the posting page.

I already posted the first one recently, and have a few more scheduled for the coming weeks. The idea is to post a few of these a month, in addition to the regular content. If you’re following me on Twitter, it’s likely that you will have seen these links there before, but there will be a tad more context here, and there won’t be nearly as many.

As for the other content, my plan is to make a clearer distinction between blog postings and articles. I already have that in the way the categories are set up, but that isn’t very visible. I’m aiming for more consistent posting (i.e., one posting a week, every week), with the blog postings being shorter and more informal, while the articles will be longer and more organized.

Link titles will start with “Link:” from now on, but I don’t want to do that for blog postings or articles. I’m not sure yet how I will indicate the distinction, but it should at least be clear from the length and maybe the tone.

The goal is to make the content easier to consume, since I know that anything beyond a few paragraphs is much less likely to be read in its entirety (or at all). And perhaps I’ll even find a use for those other post types, like quote, image, and aside.

Review: Wainer, Picturing the Uncertain World

Tue, 12/23/2014 - 06:01



Picturing the Uncertain World by Howard Wainer is a book about statistics and statistical thinking, aided by visual depictions of data. Each article in the collection starts by stating a question or phenomenon, which is then investigated further using some clever statistics.

I bought the book after Scott Murray pointed me to it as the source of his assertion that in order to show uncertainty, the best way was to use blurry dots. I was surprised by that, since my own work had shown people to be pretty bad at judging blurriness, so that didn’t seem to be a particularly good choice (at least if you want people to be able to judge the amount of uncertainty).

The Author

I had never heard of Howard Wainer before reading this book. It turns out that he has been an outspoken critic of bad charts for a long time, much longer than blogs have been around to do that. In fact, Wainer wrote an article for American Statistician in 1984 that could have been the blueprint for blogs like junk charts.

And it turns out that there is even a connection between Wainer and Kaiser Fung, who runs junk charts.

@eagereyes Howard introduced me to Tufte principles in my first stats course almost 20 yr ago!

— Kaiser Fung (@junkcharts) December 9, 2014

This is also interesting because the book reminded me of Kaiser’s Numbers Rule Your World and Numbersense. It all makes sense.

The Book

After Scott pointed it out, the book immediately intrigued me: had somebody figured out how to show uncertainty well? How did I not know about this? Well, it turns out he hasn’t. But there is a lot of other good stuff in this book that makes it very worthwhile.

Wainer’s idea of uncertainty is much broader than the usual error metrics (though he addresses those as well). In fact, he describes statistics as the science of uncertainty. That makes a lot of sense, and he makes the case repeatedly about how statistics provides means of dealing with uncertainty about facts and observations.

As a consequence, the book is really about statistical thinking, aided by visual depictions of the data. In several chapters, Wainer takes data and either redraws an existing chart, or argues that by simply looking at the data the right way, it becomes much easier to understand what is going on.

The key chapter from my perspective was chapter 13, Depicting Error. Wainer shows a number of ways to depict error, from tables to a number of charts. Some of these are well-known, others not. They are all interesting, though there isn’t much that is surprising (especially after having seen the Error Bars Considered Harmful paper by Michael Correll and Michael Gleicher at InfoVis earlier this year).

There is a lot of other good stuff in the book too, though. Chapter 16, Galton’s Normal, talks about the way the normal distribution drops to very, very small probabilities in the tails. It’s a short chapter, but it really drove home a point for me about how hard it is to intuitively understand distributions, even the ubiquitous normal distribution.

The final chapter, The Remembrance of Things Past, is probably the best. It’s the deepest, most human, and I think it has the best writing. It describes the statistical graphics produced by the population of the Jewish ghetto in Kovno, Lithuania, during the Holocaust. It’s chilling and fascinating, and the charts they created are incredible. Wainer does an admirable job of framing the entire chapter and navigating between becoming overly sentimental and being too sterile in his descriptions.

The book is really a collection of articles Wainer wrote for Chance Magazine and American Statistician in the mid-2000s (with one exception from 1996). As a result, it isn’t more than the sum of its parts: there is no cohesion between the chapters. On the other hand, each chapter is a nicely self-contained piece, so it’s easy to pick the book up and read one or two at a time. Wainer also writes very well, and his explanations of statistical phenomena and procedures are easy to follow even if you don’t know much about statistics.

Ultimately, my question about the blurry dots was not answered, because Wainer points to Alan MacEachren’s book How Maps Work as the source of the blurriness argument. I can’t find my copy of that book at the moment though, so following this lead further will have to wait for another day.

VIS 2014 Observations and Thoughts

Tue, 11/18/2014 - 03:19



While I’ve covered individual talks and events at IEEE VIS 2014, there are also some overall observations – positive and negative – I thought would be interesting to write down to see what others were thinking.

I wrote summaries for every day I was actually at the conference: Monday, Tuesday, Wednesday, Thursday, and Friday. VIS actually now starts on Saturday with a few early things like the Doctoral Colloquium, and Sunday is a full day of workshops and tutorials.

Just to be clear: my daily summaries are by no means comprehensive. I did not go to a single VAST or SciVis session this year, only saw two out of five panels, did not go to a single one of the ten workshops, attended only one of the nine tutorials, and didn’t even see all the talks in some of the sessions I did go to. I also left out some of the papers I actually saw, because I didn’t find them relevant enough.

Things I Don’t Like

I’m starting with these, because I like a lot more things than I don’t, and listing the bad stuff at the end always makes these things sound like they are much more important and severe than they really are.

The best paper selection has been quite odd at InfoVis for a while. Some of the selections made a lot of sense, but some were just downright weird. This year’s best paper was not bad, but I don’t think it was the best one presented. What’s more, some of the really good ones didn’t even get honorable mentions.

While it’s easy to blame the best paper committee, I think we program committee members also need to get better at nominating the good ones so they can be considered. I know I didn’t nominate any of the ones I was primary reviewer on, and I really should have for one of them. We tend to be too obsessed with criticizing the problems and don’t spend enough time making sure the good stuff gets the recognition it deserves.

Another thing I find irritating is the new organization of the proceedings. I don’t get why TVCG papers need to be in a separate category entirely, that just makes finding them harder. It also only reinforces the mess that is the conference vs. journal paper distinction at VAST. Also, why are invited TVCG papers listed under conference rather than TVCG? How does that make any sense? There has to be a better way both for handling VAST papers (and ensuring the level of quality) and integrating all papers in the electronic proceedings. There is just too much structure and bureaucracy here that I have no interest in and that only gets in the way. Just let me get to the papers.

Speaking of TVCG, I don’t think that cramming presentations for journal papers into an already overfull schedule is a great idea. That just takes time away from other things that make more sense for a conference (like having a proper session for VisLies). While I appreciate the fact that VIS papers are journal papers (with some annoying exceptions), I think doing the opposite really doesn’t make sense. Also, none of the TVCG presentations I saw this year were remarkable (though I admittedly only saw a few).

The Good Stuff

On to the good stuff. This was the best InfoVis conference in a while. There were a few papers I didn’t like, but they were outweighed by a large number of very strong ones, and some really exceptional ones. I think this year’s crop of papers will have a lasting impact on the field.

In addition to the work being good, presentations are also getting much better. I only saw two or three bad or boring presentations, most were very solid. That includes the organization of the talk, the slides (nobody seems to be using the conference style, which is a good thing), and the speaking (i.e., proper preparation and rehearsals). A bad talk can really distract from the quality of the work, and that’s just too bad.

Several talks also largely consisted of well-structured demos, which is great. A good demo is much more effective than breaking the material up into slides. It’s also much more engaging to watch, and leaves a much stronger impression. And with some testing and rehearsals, the risk that things will crash and burn is really not that great (still not a bad idea to have a backup, though).

A number of people have talked about the need for sharing more materials beyond just the paper for a while, and it is now actually starting to happen. A good number of presentations ended with a pointer to a website with at least the paper and teaser video, and often more, like data and materials for studies, and source code. After the Everything But The Chart tutorial, I wonder how many papers next year will have a press kit.

The number of systems that are implemented in JavaScript and run in the browser is also increasing. That makes it much easier to try them out without the hassle of having to download software. Since many of these are prototypes that will never be turned into production software, it doesn’t matter nearly as much that they won’t be as easily maintained or extended.

VIS remains a very friendly and healthy community. There are no warring schools of thought, and nobody tries to tear down somebody else’s work in the questions after a talk. The social aspect is also getting ever stronger with the increasing number of parties. That might sound trivial, but the main point of a conference is communication and the connections that are made, not the paper presentations.

There is also a vibrant community on Twitter, at least for InfoVis and VAST talks. I wonder what it will take to get some SciVis people onto Twitter, though, or help them figure out how to use WordPress.

VIS 2014 – Friday

Fri, 11/14/2014 - 15:46



Wow, that was fast! VIS 2014 is already over. This year’s last day was shorter than in previous years, with just one morning session and then the closing session with the capstone talk.

Running Roundup

We started the day with another run. Friday saw the most runners (six), bringing the total for the week to 15, with a count distinct of about 12. I hereby declare the first season of VIS Runners a resounding success.

InfoVis: Documents, Search & Images

The first session was even more sparsely attended than on Thursday, which was really too bad. The first paper was Overview: The Design, Adoption, and Analysis of a Visual Document Mining Tool For Investigative Journalists by Matthew Brehmer, Stephen Ingram, Jonathan Stray, and Tamara Munzner, and it was great. Overview is a tool for journalists to sift through large collections of documents, like those returned from Freedom of Information Act (FOIA) requests. Instead of doing automated processing, it allows the journalists to tag and use keywords, since many of these documents are scanned PDFs. It’s a design study as well as a real tool that was developed over a long time and multiple releases. This is probably the first paper at InfoVis to report on such an extensively developed system (and the only one directly involved in somebody becoming a Pulitzer Prize finalist).

The Overview paper also wins in the number of websites category: in addition to checking out the paper and materials page, you can use the tool online, examine the source code, or read the blog.

How Hierarchical Topics Evolve in Large Text Corpora by Weiwei Cui, Shixia Liu, Zhuofeng Wu, and Hao Wei presents an interesting take on topic modeling and the ThemeRiver. Their system is called RoseRiver, and is much more user-driven. The system finds topics, but lets the user combine or split them, and work with them much more than other systems I’ve seen.

I’m a bit skeptical about Exploring the Placement and Design of Word-Scale Visualizations by Pascal Goffin, Wesley Willett, Jean-Daniel Fekete, and Petra Isenberg. The idea is to create a number of ways to include small charts within documents to show some more information for context. They have an open-source library called Sparklificator to easily add such charts to a webpage. I wonder how distracting small charts would be in most contexts, though.

A somewhat odd paper was Effects of Presentation Mode and Pace Control on Performance in Image Classification by Paul van der Corput and Jarke J. van Wijk. They investigated a new way of rapid serial visual presentation (RSVP) for images, which continuously scrolls rather than flips through pages of images. It’s a mystery to me why they only tried sideways scrolling, which seems much more difficult than vertical scrolling.

Capstone: Barbara Tversky, Understanding and Conveying Events

The capstone was given by cognitive psychology professor Barbara Tversky. She talked about the difference between events and activities (events are delimited, activities are continuous), and how we think about them when listening to a story. She has done some work on how people delineate events on both a high level and a very detailed level.

This is interesting in the context of storytelling, and particularly in comics, which break up time and space using space, and need to do so at logical boundaries. Tversky also discussed some of the advantages and disadvantages of story: that it has a point of view, causal links, emotion, etc. She listed all of those as both advantages and disadvantages, which I thought was quite clever.

It was a very fast talk, packed with lots of interesting thoughts and information nuggets. It worked quite well as a counterpoint to Alberto Cairo’s talk, and despite the complete lack of direct references to visualization (other than a handful of images), it was very appropriate and useful. Many people were taking pictures of her slides during the talk.

Next Years

IEEE VIS 2015 will be held in Chicago, October 25–30. The following years had already been announced last year (2016: Washington, DC; 2017: Santa Fe, NM), but it was interesting to see them publicly say that 2018 might see VIS in Europe again.

This concludes the individual day summaries. I will also post some more general thoughts on VIS 2014 in the next few days.

VIS 2014 – Thursday

Fri, 11/14/2014 - 07:16



Thursday was the penultimate day of VIS 2014. I ended up only going to InfoVis sessions, and unfortunately missed a panel I had been planning to see. The papers were a bit more mixed, but there were again some really good ones.

InfoVis: Evaluation

Thursday was off to a slow start (partly an effect of the party the night before, which left the room mostly empty at first), but eventually got interesting.

Staggered animation is commonly understood to be a good idea: don’t start all movement in a transition at once, but with a bit of delay. It’s supposed to help people track the objects as they are moving. The Not-so-Staggering Effect of Staggered Animated Transitions on Visual Tracking by Fanny Chevalier, Pierre Dragicevic, and Steven Franconeri describes a very well-designed study that looked into that. They developed a number of criteria that make tracking harder, then tested those with regular motion. After having established their effect, they used Monte-Carlo simulation to find the best configuration for staggered animation of a field of points (since there are many choices to be made about which to move first, etc.), and then tested those. It turns out that the effect from staggering is very small, if it exists at all. That’s quite interesting.

Since they tested this on a scatterplot with identical-looking dots, it’s not clear how this would apply to, for example, a bar chart or a line chart, where the elements are easier to identify. But the study design is very unusual and interesting, and a great model for future experiments.

Another unexpected result comes from The Influence of Contour on Similarity Perception of Star Glyphs by Johannes Fuchs, Petra Isenberg, Anastasia Bezerianos, Fabian Fischer, and Enrico Bertini. They tested the effect of outlines in star glyphs, and found that the glyph works better without it, just showing the spokes. That is interesting, since the outline supposedly would help with shape perception. There are also some differences between novices and experts, which are interesting in themselves.

The only technique paper that I have seen so far this year was Order of Magnitude Markers: An Empirical Study on Large Magnitude Number Detection by Rita Borgo, Joel Dearden, and Mark W. Jones. The idea is to design a glyph of sorts to show orders of magnitude, so values across a huge range can be shown without making most of the smaller values impossible to read. The glyphs are fairly straightforward and require some training, but seem to be working quite well.
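The underlying decomposition is essentially the scientific-notation split. As a rough illustration (my own sketch, not the authors’ actual glyph design), a value can be separated into a mantissa and an exponent, which a marker can then encode as two separate visual components:

```python
# Toy sketch of the order-of-magnitude idea (not the paper's actual glyph
# design): split a value into mantissa and exponent, so values spanning a
# huge range can be encoded as two separate, readable visual components.
from math import floor, log10

def oom_split(value):
    """Decompose a positive value into (mantissa, exponent)."""
    exponent = floor(log10(value))
    mantissa = value / 10 ** exponent
    return mantissa, exponent

print(oom_split(35400))   # a large value
print(oom_split(0.0072))  # and a tiny one, representable on the same scale
```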

InfoVis: Perception & Design

While there were some good papers in the morning, overall the day felt a bit slow. The last session of the day brought it back with a vengeance, though.

Learning Perceptual Kernels for Visualization Design by Çağatay Demiralp, Michael Bernstein, and Jeffrey Heer describes a method for designing palettes of shapes, sizes, colors, etc., based on studies. The idea is to measure responses to differences, train a model to figure out which of them can be differentiated better or worse, and then pick the best ones.

The presentation that took the cake for the day though was Ranking Visualizations of Correlation Using Weber’s Law by Lane Harrison, Fumeng Yang, Steven Franconeri, and Remco Chang. It’s known that scatterplots allow people to judge correlation quite well, with precision following what is called Weber’s Law (which describes which end of the scale is easier to differentiate). In their experiments, the authors found that this is also true for ten other techniques, including line charts, bar charts, parallel coordinates, and more. This is remarkable because Weber’s law really describes very basic perception rather than cognition, and it paves the way for a number of new ways to judge correlation in almost any chart.
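For intuition, this line of work models the just-noticeable difference (JND) in correlation as a linear function of the distance from perfect correlation. A minimal sketch, with made-up coefficients rather than the paper’s fitted values:

```python
# Weber-style model of correlation perception (illustrative coefficients,
# not the paper's fitted values): the just-noticeable difference grows
# linearly with distance from r = 1.

def jnd(r, k=0.2, intercept=0.05):
    """Just-noticeable difference at correlation r (hypothetical k, intercept)."""
    return k * (1 - r) + intercept

# High correlations are easier to discriminate than low ones:
print(jnd(0.9))  # small JND near r = 1
print(jnd(0.3))  # larger JND near r = 0
```

The practical upshot of the paper is that fitting the two coefficients per chart type yields a single number for ranking how well each visualization supports correlation judgments.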

The Relation Between Visualization Size, Grouping, and User Performance by Connor Gramazio, Karen Schloss, and David Laidlaw looked at the role of mark size in visualizations, and whether it changes people’s performance. They found that mark size does improve performance, but only to a point. From there, it doesn’t make any more difference. Grouping also helps reduce the negative effect of an increase in the number of marks.

Everybody talks about visual literacy in visualization, but nobody really does anything about it. That is, until A Principled Way of Assessing Visualization Literacy by Jeremy Boy, Ronald Rensink, Enrico Bertini, and Jean-Daniel Fekete. They developed a framework for building visual literacy tests, and showed that this could work with an actual example. This is certainly just a first step, and there are no established visual literacy levels for the general population yet. But having a way to gauge visual literacy would be fantastic and inform a lot of research, use of visualization in the media, education, etc.

The Podcasting Life

Moritz and Enrico asked me to help them record a segment for the VIS review episode of the Data Stories podcast. You can listen to that in all its raw, uncut glory by downloading the audio file.

VIS 2014 – Wednesday

Thu, 11/13/2014 - 14:29



Wednesday is more than the halfway point of the conference, and was clearly the high point so far. There were some great papers, the arts program, and I got to see the Bertin exhibit.

InfoVis: Interaction and Authoring

Revisiting Bertin matrices: New Interactions for Crafting Tabular Visualizations by Charles Perin, Pierre Dragicevic, and Jean-Daniel Fekete was the perfect paper for this year. They implemented a very nice, web-based version of Bertin’s reorderable matrix, very closely following the purely black-and-white aesthetic of the original. They are also starting to build additional things on top of that, though, using color, glyphs, etc.

The reason it fits so well is not just that VIS is in Paris this year (and Bertin actually lived just around the corner from the conference hotel), but it also ties in with the Bertin exhibit (see below). They also made the right choice in calling the tool Bertifier, a name I find endlessly entertaining (though they clearly missed the opportunity to name it Bertinator, a name both I and Mike Bostock suggested after the fact – great minds clearly think alike).

iVisDesigner: Expressive Interactive Design of Information Visualizations by Donghao Ren, Tobias Höllerer, and Xiaoru Yuan is a tool for creating visualization views on a shared canvas. It borrows quite a bit from Tableau, Lyra, and other tools, but has some interesting ways of quickly creating complex visualizations that are linked together so brushing between them works. They even showed streaming data in their tool. It looked incredibly slick in the demo, though I have a number of questions about some of the steps I didn’t understand. Since it’s available online and open-source, that’s easy to follow up on, though.

VIS Arts Program

I saw a few of the papers in the VIS Arts Program (oddly abbreviated VISAP), though not as many as I would have liked. There were some neat projects using flow visualization to paint images, some more serious ones raising awareness for homelessness with a large installation, etc.

The one that stood out in the ones I saw was PhysicSpace, a project where physicists and artists worked together to make it possible to experience some of the weird phenomena in quantum physics. The pieces are very elaborate and beautiful, and go way beyond simple translations. There is a lot of deep thinking and an enormous amount of creativity in them. It’s also remarkable how open the physicists seem to be to these projects. It’s well worth watching all the videos on their website; they’re truly stunning. This is the sort of work that really shows how crossing the boundary between art and science can produce amazing results.

InfoVis: Exploratory Data Analysis

This session was truly outstanding. All the papers were really good, and the presentations matched the quality of the content (almost all the presentations I saw yesterday were really good). InfoVis feels really strong this year, both in terms of the work and the way it is presented.

The Effects of Interactive Latency on Exploratory Visual Analysis by Zhicheng Liu and Jeffrey Heer looks at the effect latency has on people’s exploration of data. They added a half-second delay to their system and compared it to the system in its original state. It turns out that the delay reduces the amount of interaction, and people end up exploring less of the data. While that is to be expected, when asked, people said they didn’t think the delay would affect them, and a third didn’t even consciously notice it.

Visualizing Statistical Mix Effects and Simpson’s Paradox by Zan Armstrong and Martin Wattenberg examines Simpson’s Paradox (e.g., the median increases for the entire population even though it decreases in every subgroup) in visualization. They have built an interesting visualization to illustrate why the effect occurs, and make some recommendations for mitigating it in particular techniques. This is an important consideration for aggregated visualization, which is very common given today’s data sizes.
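A toy example (my own numbers, using success rates for simplicity, not data from the paper) shows how the paradox arises when the group mix shifts between the two periods being compared:

```python
# Toy illustration of Simpson's paradox (made-up numbers, not from the
# paper): each subgroup's rate drops from year 1 to year 2, yet the overall
# rate rises, because the mix shifts toward the higher-rate group B.

year1 = {"A": (10, 100), "B": (80, 100)}   # (successes, trials) per group
year2 = {"A": (1, 20),   "B": (140, 180)}  # group B dominates in year 2

def rate(successes, trials):
    return successes / trials

for g in ("A", "B"):
    print(g, rate(*year1[g]), "->", rate(*year2[g]))  # both decrease

overall1 = rate(*map(sum, zip(*year1.values())))  # (10+80)/(100+100) = 0.45
overall2 = rate(*map(sum, zip(*year2.values())))  # (1+140)/(20+180) = 0.705
print(overall1, "->", overall2)                   # yet the total increases
```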

Showing uncertainty is an important issue, and often it is done with error bars on top of bar charts. The paper Error Bars Considered Harmful: Exploring Alternate Encodings for Mean and Error by Michael Correll and Michael Gleicher shows why they are problematic: they are ambiguous (do they show standard error or a confidence interval? If the latter, then which one?), asymmetric (points inside the bar appear to be more likely than points above the bar, at the same distance from the bar’s top), and binary (a point is either within the range or outside). Their study demonstrates the issue and then tests two alternative encodings, violin plots and gradient plots, which both perform better.
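The ambiguity is easy to demonstrate: for the same data, an error bar showing the standard error is roughly half as long as one showing a 95% confidence interval, and nothing in the chart tells you which one you are looking at. A quick sketch with made-up data:

```python
# Why a bare error bar is ambiguous: the same data yields very different
# bar lengths depending on whether the bar shows the standard error or a
# 95% confidence interval (made-up data, normal approximation).
from math import sqrt
from statistics import mean, stdev

data = [4.1, 5.2, 4.8, 5.5, 4.9, 5.1, 4.6, 5.3]
m = mean(data)
se = stdev(data) / sqrt(len(data))   # standard error of the mean
ci95_half = 1.96 * se                # 95% CI half-width, normal approximation

print(f"mean = {m:.2f}")
print(f"standard error bar: +/- {se:.3f}")
print(f"95% CI bar:         +/- {ci95_half:.3f}")  # nearly twice as long
```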

My Tableau Research colleagues Justin Talbot, Vidya Setlur, and Anushka Anand presented Four Experiments on the Perception of Bar Charts. They looked at the classic Cleveland and McGill study of bar charts, and asked why the differences they found occurred. Their study is very methodical and presented very well, and opens up a number of further hypotheses and questions to look into. It has taken 30 years for somebody to finally ask the why question; hopefully we’ll see more reflection and follow-up now.

I unfortunately missed the presentation of the AlgebraicVis paper by Gordon Kindlmann and Carlos Scheidegger. But it seems like a really interesting approach to looking at visualization, and Carlos certainly won’t shut up about it on Twitter.

Bertin Exhibit

VIS being in Paris this week is the perfect reason to have an exhibit about Jacques Bertin. It is based on the reorderable matrix, an idea Bertin developed over many years. The matrix represents a numeric value broken down by two categorical dimensions, essentially a pivot table. The trick, though, is that it allows its user to rearrange and order the rows and columns to uncover patterns, find correlations, etc.
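As a crude programmatic analogue (a toy heuristic of my own, not Bertin’s manual method), even just reordering rows and columns by their totals can pull a hidden block structure into view:

```python
# Toy analogue of the reorderable-matrix idea (a simple heuristic, not
# Bertin's method): reorder rows and columns by their totals so that
# related rows and columns end up adjacent and block structure appears.

matrix = [
    [0, 3, 0, 4],
    [5, 0, 4, 0],
    [0, 2, 0, 3],
    [4, 0, 5, 0],
]

def reorder(m):
    row_order = sorted(range(len(m)), key=lambda i: -sum(m[i]))
    col_order = sorted(range(len(m[0])), key=lambda j: -sum(row[j] for row in m))
    return [[m[i][j] for j in col_order] for i in row_order]

for row in reorder(matrix):
    print(row)  # the large values now cluster into two visible blocks
```

In the original matrix the two interleaved groups are invisible; after reordering, the values form a block-diagonal pattern, which is exactly the kind of discovery the physical matrix supports by hand.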

The exhibit shows several design iterations Bertin went through to build it so it would be easy to rearrange, lock, and unlock. Things were more difficult to prototype and animate before computers.

The organizers also built a wooden version of the matrix for people to play with. The basis for this was the Bertifier program presented in the morning session. While they say that it is a simplified version of Bertin’s, they also made some improvements. One is that they can swap the top parts of the elements by attaching them with magnets. That way, different metrics can be expressed quite easily, without having to take everything apart. I guess it also lets you cheat on the reordering if you only swap two rows.

They also have some very nice hand-drawn charts from the 1960s, though not done by Bertin. They are interesting simply because they show how much effort it was to draw charts before computers.

Note the amount of white-out used above to remove extraneous grid lines, and below to correct mistakes on the scatterplot.

I was also reminded of this in the Financial Visualization panel, where one of the speakers showed photos of the huge paper charts they have at Fidelity Investments for deep historical data (going back hundreds of years). Paper still has its uses.

In addition to being interesting because of Bertin’s influence and foresight, this exhibit is also an important part of the culture of the visualization field. I hope we’ll see more of these things, in particular based on physical artifacts. Perhaps somebody can dig up Tukey’s materials, or put together a display of Bill Cleveland’s early work – preferably without having to wait for him to pass away.

Running and Partying

The second VIS Run in recorded history took place on Wednesday, and that night also saw the West Coast Party, which is becoming a real tradition. The first session on Thursday morning was consequently quite sparsely attended.

VIS 2014 – Tuesday

Wed, 11/12/2014 - 08:46



The big opening day of the conference, Tuesday, brought us a keynote, talks, and panels. Also, a new trend I really like: many talks end with the URL of a webpage that contains a brief summary of the paper, the PDF, and often even a link to the source code of the tool they developed.


That VIS would ever take place outside the U.S. was by no means a given. There was a lot of doubt about getting enough participants, sponsors, etc. to make it work (and a ton of convincing by this year’s chair, Jean-Daniel Fekete).

That made it especially interesting to hear the participant numbers. There are over 1,100 attendees this year, more than ever before. They also more than doubled the amount of money coming from sponsors compared to last year, which is very impressive. VIS outside the U.S. is clearly doable, and even though the next three years are already known to be in the U.S., I’m sure this will happen again.

One number that was presented but that I don’t believe is that there were supposedly only 79 first-time attendees. That doesn’t square with the different composition of participants (fewer Americans, more Europeans), and besides would be terrible if true.

Alberto Cairo: The Island of Knowledge and the Shorelines of Wonder

The keynote this year was by Alberto Cairo, who gave a great talk about the value of knowledge and communicating data. Perhaps my favorite quote was that good answers lead to more good questions.

There is a lot more to say, and I want to really do his talk justice. So I’m going to not go into more detail here, but rather write it up in a separate posting in the next week or two.

InfoVis: The Joy of Sets

The first InfoVis session started what I hope is a trend: ending talks with a URL that points to a website with talk materials, the paper, and often even the source code of the presented tool. This is how work can be shared, revisited, and make its way beyond the limited conference audience.

The first paper was UpSet: Visualization of Intersecting Sets by Alexander Lex, Nils Gehlenborg, Hendrik Strobelt, Romain Vuillemot, and Hanspeter Pfister. The system allows the user to compare sets and look at various intersections and aggregations. There are many different interactions to work with the sets. Because there are so many views and details, it’s almost like a systems paper, but good (most systems papers are terrible – another rant for another day).

OnSet: A Visualization Technique for Large-Scale Binary Set Data by Ramik Sadana, Timothy Major, Alistair Dove, and John Stasko describes a tool for comparing multiple sets to each other. There are some clever interactions and the tool also shows hierarchies within the sets while comparing.

Rounding out the sets theme was a paper I didn’t actually see the presentation for, but I want to mention anyway: Domino: Extracting, Comparing, and Manipulating Subsets across Multiple Tabular Datasets by Samuel Gratzl, Nils Gehlenborg, Alexander Lex, Hanspeter Pfister, and Marc Streit. From what I gather, it presents a query interface and visualization for sets and subsets, and it looks quite nifty.

InfoVis: Colors and History

I’m a bit conflicted about DimpVis: Exploring Time-varying Information Visualizations by Direct Manipulation by Brittany Kondo and Christopher Collins. They developed a way to show time in a plot so that you can navigate along the temporal development of a value (rather than use a time slider that is disconnected and doesn’t show you history). While that makes sense to me in the original example they showed, a time-varying scatterplot, I’m a bit less convinced by the bar chart, pie chart, and heatmap versions of it.

A paper I missed, but that seems to have stirred some controversy, is Tree Colors: Color Schemes for Tree-Structured Data by Martijn Tennekes and Edwin de Jonge.

“Blind Lunch”

The reason I missed some of the papers in the InfoVis session is that I was one of the people hosting a table for what is called a blind lunch. This used to be called Lunch with the Leaders, which may have sounded a bit too ambitious (and scared off potential leaders who didn’t necessarily consider themselves that), but at least it made more sense. Everybody knew who they were signing up with, and nobody was blindfolded as far as I’m aware.

It’s a good event though. I had a chance to chat with four grad students and share my wisdom about industry vs. academia. There are also a few more activities as part of the Compass program for people who are about to graduate, or just generally want to get more perspectives on the job situation in academia and/or industry.

Panel: Data with a cause: Visualization for policy change

One of the things I was looking for the most at VIS this year was the panel Data with a cause: Visualization for policy change, organized by Moritz Stefaner, with speakers from the OECD, World Bank, and the World Economic Forum.

The panelists all had interesting things to say about what they are doing to make data more accessible, make it easier to share their reports and other materials, and provide means for people to talk back. There are also some interesting issues around the different types of audience they want to serve (economists, policy makers, general public) and the general unease when handing out data to the unwashed masses.

What I was missing, though, was a bit of controversy and actual discussion. For such an important topic, it was a very tame panel. There were some really good questions to be asked though, like one coming from the audience about the responsibility of organizations not to reinforce the winners and losers through their data, and what they might do about that. I also asked about the availability not just of tables, but of the underlying data. I have some more to say on that topic in future postings.


One of my favorites of the conference so far is Multivariate Network Exploration and Presentation: From Detail to Overview via Selections and Aggregations by Stef van den Elzen and Jarke J. van Wijk. I don’t seem to be alone in this, as the paper also received the Best Paper Award at InfoVis this year.

The system they developed shows multivariate graphs, and allows the concurrent display of the network and the multivariate data in the nodes (even including small multiples). What’s perhaps most interesting is the fact that they allow the user to make selections to aggregate the graph, essentially building a sort of PivotGraph to see the higher-level structure on top of the very detailed, hairball-like, graph.

Because they are showing the detailed network first and let the user create an overview version, apparently Jarke van Wijk suggested to name the system Namredienhs – i.e., Shneiderman spelled backwards, since it’s Ben Shneiderman’s famous mantra (overview first, zoom and filter, then details on demand) in reverse.

NAMREDIENHS! The reverse Shneiderman mantra. #ieeevis pic.twitter.com/zBRJ3oipNJ

— Nils Gehlenborg (@nils_gehlenborg) November 11, 2014

This was much funnier the way Stef van den Elzen did it of course, and in particular with Ben Shneiderman sitting there in the first row, directly in front of him.

VisLies, Parties

It remains a crime that VisLies is not a regular session, but a meetup that is tacked on and usually at a time when everybody is at dinner. I think it’s a really great idea, and there should be room for it in the regular program. It deserves a lot more attention and attendance. I missed it this year again.

There were also two new parties, the Austrian Party and the NYU Party. I really like this new tradition of parties to connect people and reinforce the community aspect of the conference. It does mean even less sleep than before, though.

VIS 2014 – Monday

Tue, 11/11/2014 - 08:41



IEEE VIS 2014 technically began on Saturday, with the first full day open to all attendees being Sunday. Monday continued the workshops and tutorials, and that is where we join our intrepid reporter.

VIS Social Run

The day started at 6:30am, when five fearless runners braved the cold and dark, and completed the inaugural VIS Social Run. It was a great run, about 5km in length, in (what I consider) perfect running weather (i.e., cool bordering on cold). While the darkness limited the sightseeing potential of the run, the early morning was great because it’s the time when all the boulangeries are baking their bread, so we got to suck in the delicious smells of fresh bread.

I’ve posted the route on Strava for all to enjoy. We even took a dorky, sweaty, blurry group selfie at the end.

We’re also running Wednesday morning and Friday morning, and potentially also Thursday. Stephen Kobourov might also do a longer run on Friday afternoon. Let me know if you want to join us, or just come to the Marriott at 6:30am.


I only saw part of the BELIV workshop (the name still stands for Beyond Time And Errors: Novel Evaluation Methods For Visualization). The papers there are well worth checking out though, because they represent some of the most interesting thinking about how to better evaluate visualization work.

Pierre Dragicevic gave a great keynote discussing the use of statistics in visualization. In particular, the use of p values, often without understanding them well, cherry-picking results, ignoring effect sizes, etc. Instead, using confidence intervals is a much better idea, because it provides much more information than the largely binary (and opaque!) significance test.
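The point is easy to see in a toy computation (illustrative numbers, normal-approximation interval): the confidence interval reports the effect size and its uncertainty, while the significance verdict is just one bit derived from that same information:

```python
# A confidence interval carries the effect size and its uncertainty; the
# "significant at alpha = .05" verdict is just one bit of that information
# (toy numbers, normal-approximation 95% interval for a mean difference).
from math import sqrt
from statistics import mean, stdev

a = [12.1, 13.4, 12.8, 13.9, 12.5, 13.1]
b = [11.2, 12.0, 11.6, 12.3, 11.4, 11.9]

diff = mean(a) - mean(b)
se = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
lo, hi = diff - 1.96 * se, diff + 1.96 * se

print(f"difference: {diff:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
# The binary test result is merely: does the interval exclude zero?
print("significant:", not (lo <= 0 <= hi))
```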

This is really important to make results more useful beyond just the individual paper, easier to compare in replication, and just generally more honest. Pierre and his group have a great website with lots of resources to explore.

Bernice Rogowitz had some good points in her questions after the talk, like the fact that using more than just the plain p values makes for a much better way of telling the story of the analysis than the boring boilerplate stats you usually get. Walking the reader through the analysis also makes it easier to include the weaker results instead of hiding them.

There was also a panel on tasks, which largely talked about task taxonomies. There was an odd lack of self-awareness on that panel, because for all the talk about tasks, there didn’t seem to be much thought about what people would actually do with those taxonomies. Who are the users of the taxonomies? What are their tasks? Is any of this work actually useful, or is it just done for its own sake? That struck me as particularly odd as part of this event.

I didn’t see the actual paper presentations, but BELIV generally has a good mix of interesting new thinking and interesting results from evaluations of visualization tools and systems.

On a related note, Steve Haroz has put together a great guide to evaluation papers at VIS this year.

Everything Except The Chart Tutorial

Among the more unusual things this year was Moritz Stefaner and Dominikus Baur’s tutorial titled Everything Except The Chart. They talked about all the things a web-based visualization project needs to be successful (other than the visuals): how to make it findable, how to make it shareable, various web technologies, etc. They did that based on their own projects, like Selfiecity, the OECD Better Life Index, etc.

The room was packed, which was interesting. Who knew academics actually cared about sharing their work with the world? Apparently, they do.

There was a lot of information in that tutorial; I will not even begin to try to summarize it all. They have published their slides, and also made some demo code available.

Perhaps the best summary of the tutorial is the project checklist they used to frame part of it:

  • Is it findable?
  • Does it draw you in?
  • Is it enjoyable to use?
  • Is it informative?
  • “Why should I care?”
  • Is it shareable?

These are questions anybody can ask themselves easily, and then figure out what to do about them. This includes simple things like hidden images and text to make the page easier to index for search engines and share/pin/etc. And it even includes things like a press kit, so journalists can write about your projects more easily (and get the best images).

While I wasn’t as excited about the long list of tools (bower, grunt, snort, blurt, fart, etc. – I may have made up a few of those, guess which!), they had lots of good points about making design responsive, having it work well (or at least be usable) on small screens, etc. None of this has ever been discussed at VIS before as far as I am aware, and it has the potential to have the largest impact for getting word out about the work we do in visualization. Now all the people who attended just need to actually put these things into practice.

The VIS Sports Authority

Sun, 10/19/2014 - 23:35



When you think of a conference, does sitting around a lot come to mind? Lots of food? Bad coffee? No time to work out? For the first time in VIS history, there will be a way to exercise your body, not just your mind. The VIS Sports Authority, which is totally an official thing that I didn’t just make up, will kick your ass at VIS 2014.

There will be two disciplines: cycling and running. Jason Dykes is running the cycling team, and I will be driving the runners.

Le Tour de VIS

Jason is way more organized than I am, having put together not just a real website with a logo, but actually ordered bike jerseys. Cycling has somewhat more complicated logistics though, so that is certainly a good thing. I hear Jason has even picked out the soundtrack for the race already.

The Vélo Club de VIS will embark on Le Tour de VIS (this is apparently named after some sort of bike race) on the Saturday after the conference, November 15.

Go to one of the pages linked above to get more information, like a map of the planned route, and to sign up.

VIS Runners

The running will be a bit more low-key. I couldn’t think of a better name than VIS Runners, so let’s just run with that (unless you want me to call us Eager Runners).

However, running will not happen after the conference, but during. Since the receptions and parties are in the evenings, it makes the most sense to go out in the mornings. My current plan is to meet at the conference hotel at about 6:30am, then run for about an hour, so we’ll be back by 7:30.

For the distance, I’m thinking no more than 6 miles/10 kilometers, but that can be adjusted. We probably won’t do more than three runs, and in particular will likely skip Thursday (after the reception Wednesday night).

The course should be different every day to get some variety, and will depend on the distance people want to go. If you’re a local or just know your way around Paris, I’d appreciate your input in the route planning, too!

I’m embedding a form below (also available here) to collect some information about when and how far people want to go, and to get people’s names so I can follow up later.


Large Multiples

Mon, 10/13/2014 - 03:43



Getting a sense of scale can be difficult, and the usual chart types like bars and lines don’t help. Showing scale requires a different approach, one that makes the multiplier directly visible.


In the U.S., CEOs on average make 354 times as much as workers, according to this recent posting on the Washington Post’s Wonkblog. That is an astounding number. Put differently, a CEO makes in one day almost as much as the worker makes in an entire year. How do we show this enormous difference?
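The one-day claim is easy to verify with a quick back-of-envelope calculation (a minimal sketch, assuming calendar days and the 354x figure from the Wonkblog post; the variable names are mine):

```python
# If a CEO earns 354x a worker's annual pay, how much of a worker's
# year does a single CEO day represent?
ceo_multiple = 354    # CEO pay as a multiple of worker pay (Wonkblog average)
days_per_year = 365   # calendar days

ceo_day_in_worker_years = ceo_multiple / days_per_year
print(f"One CEO day ≈ {ceo_day_in_worker_years:.2f} worker years")
# One CEO day ≈ 0.97 worker years
```

So one day of CEO pay comes to about 97% of a worker's annual pay, i.e., almost a full year.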

Roberto A. Ferdman at Wonkblog shows the numbers as a bar chart.

The bars compare between countries, but I was interested in the comparison between the worker and the CEO. Just how much more is 354 times more? This chart doesn’t tell me that.


An article on Quartz from late last year looks at similar data, and translates it into how many months workers at different companies would have to work to make the same as the CEO does in one hour. The disparities in these examples are even more staggering, since while the Wonkblog chart above looked at averages, Quartz used specific – extreme – examples. For example, McDonald’s CEO makes 1120 times what a McDonald’s worker makes. This is shown as a sort of calendar that has months marked for how long the worker needs to work to make that much.

While that illustrates the time, it kind of misses the point. Showing days when the comparison is hours understates the true magnitude by a factor of eight (assuming an eight-hour work day). Why not show the same units?
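The factor of eight follows directly from the unit conversion (a sketch using the McDonald’s figure mentioned above; the eight-hour workday is an assumption, as in the text):

```python
# Why day-based units understate an hour-based comparison:
# if the CEO's one hour equals N worker hours, drawing N/8 workdays
# shows 8x fewer units than drawing N hours.
ceo_hour_multiple = 1120   # McDonald's example from the Quartz piece
hours_per_workday = 8      # assumed work day length

units_as_hours = ceo_hour_multiple
units_as_days = ceo_hour_multiple / hours_per_workday
print(units_as_hours, units_as_days)  # 1120 140.0
```

Showing 140 days instead of 1120 hours makes the gap look eight times smaller than it is.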

Large Multiples

The idea of showing the number of days is good, however, and I wanted to apply it to the Wonkblog numbers. So I built a little unit or multiples chart for this purpose.
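The layout behind such a unit chart is just a grid of dots, one dot per unit of worker pay. A minimal sketch of that idea (the function name and the 20-per-row choice are mine, not from the original chart):

```python
# Lay out n units as a grid of dots, filling rows left to right.
def unit_grid(n, per_row=20):
    """Return (row, col) positions for n units."""
    return [(i // per_row, i % per_row) for i in range(n)]

ceo_units = unit_grid(354)   # one dot per worker-salary unit
print(len(ceo_units))        # 354
print(ceo_units[-1])         # (17, 13) -- the last dot in row 18
```

Rendering each position as a circle, next to a single circle for the worker, makes the 354:1 ratio directly countable instead of hiding it in a bar length.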

I originally had included a bar chart as well as the unit chart, but based on Twitter feedback, decided to remove it. This focuses the chart on its main message, even if it makes comparing between countries more difficult. That comparison is not really all that interesting anyway; the point is the enormous disparity in and of itself.

While I was building an interactive chart, I added a bit of animation. The build-up of the bubbles is meant to make the number a bit more tangible by also translating it into time: the larger the actual number, the longer you have to wait for the full value to appear. This makes you feel the difference a bit more than a simple chart. I stole this idea from the UK Office of National Statistics Neighbourhood Quiz.

Click the image below to go to the interactive version of the chart. Let me know what you think!