EagerEyes.org


VIS 2014 Observations and Thoughts

Tue, 11/18/2014 - 03:19

Categories:

Visualization

While I’ve covered individual talks and events at IEEE VIS 2014, there are also some overall observations – positive and negative – I thought would be interesting to write down to see what others were thinking.

I wrote summaries for every day I was actually at the conference: Monday, Tuesday, Wednesday, Thursday, and Friday. VIS actually now starts on Saturday with a few early things like the Doctoral Colloquium, and Sunday is a full day of workshops and tutorials.

Just to be clear: my daily summaries are by no means comprehensive. I did not go to a single VAST or SciVis session this year, only saw two out of five panels, did not go to a single one of the ten workshops, attended only one of the nine tutorials, and didn’t even see all the talks in some of the sessions I did go to. I also left out some of the papers I actually saw, because I didn’t find them relevant enough.

Things I Don’t Like

I’m starting with these, because I like a lot more things than I don’t, and listing the bad stuff at the end always makes these things sound like they are much more important and severe than they really are.

The best paper selection has been quite odd at InfoVis for a while. Some of the selections made a lot of sense, but some were just downright weird. This year’s best paper was not bad, but I don’t think it was the best one that was presented. What’s more, some of the really good ones didn’t even get honorable mentions.

While it’s easy to blame the best paper committee, I think we program committee members also need to get better at nominating the good ones so they can be considered. I know I didn’t nominate any of the ones I was primary reviewer on, and I really should have for one of them. We tend to be too obsessed with criticizing the problems and don’t spend enough time making sure the good stuff gets the recognition it deserves.

Another thing I find irritating is the new organization of the proceedings. I don’t get why TVCG papers need to be in a separate category entirely, that just makes finding them harder. It also only reinforces the mess that is the conference vs. journal paper distinction at VAST. Also, why are invited TVCG papers listed under conference rather than TVCG? How does that make any sense? There has to be a better way both for handling VAST papers (and ensuring the level of quality) and integrating all papers in the electronic proceedings. There is just too much structure and bureaucracy here that I have no interest in and that only gets in the way. Just let me get to the papers.

Speaking of TVCG, I don’t think that cramming presentations for journal papers into an already overfull schedule is a great idea. That just takes time away from other things that make more sense for a conference (like having a proper session for VisLies). While I appreciate the fact that VIS papers are journal papers (with some annoying exceptions), I think doing the opposite really doesn’t make sense. Also, none of the TVCG presentations I saw this year were remarkable (though I admittedly only saw a few).

The Good Stuff

On to the good stuff. This was the best InfoVis conference in a while. There were a few papers I didn’t like, but they were outweighed by a large number of very strong ones, and some really exceptional ones. I think this year’s crop of papers will have a lasting impact on the field.

In addition to the work being good, presentations are also getting much better. I only saw two or three bad or boring presentations, most were very solid. That includes the organization of the talk, the slides (nobody seems to be using the conference style, which is a good thing), and the speaking (i.e., proper preparation and rehearsals). A bad talk can really distract from the quality of the work, and that’s just too bad.

Several talks also largely consisted of well-structured demos, which is great. A good demo is much more effective than breaking the material up into slides. It’s also much more engaging to watch, and leaves a much stronger impression. And with some testing and rehearsals, the risk that things will crash and burn is really not that great (still not a bad idea to have a backup, though).

A number of people have talked about the need for sharing more materials beyond just the paper for a while, and it is now actually starting to happen. A good number of presentations ended with a pointer to a website with at least the paper and teaser video, and often more, like data and materials for studies, and source code. After the Everything But The Chart tutorial, I wonder how many papers next year will have a press kit.

The number of systems that are implemented in JavaScript and run in the browser is also increasing. That makes it much easier to try them out without the hassle of having to download software. Since many of these are prototypes that will never be turned into production software, it doesn’t matter nearly as much that they won’t be as easily maintained or extended.

VIS remains a very friendly and healthy community. There are no warring schools of thought, and nobody tries to tear down somebody else’s work in the questions after a talk. The social aspect is also getting ever stronger with the increasing number of parties. That might sound trivial, but the main point of a conference is communication and the connections that are made, not the paper presentations.

There is also a vibrant community on Twitter, at least for InfoVis and VAST talks. I wonder what it will take to get some SciVis people onto Twitter, though, or help them figure out how to use WordPress.

VIS 2014 – Friday

Fri, 11/14/2014 - 15:46

Categories:

Visualization

Wow, that was fast! VIS 2014 is already over. This year’s last day was shorter than in previous years, with just one morning session and then the closing session with the capstone talk.

Running Roundup

We started the day with another run. Friday saw the most runners (six), bringing the total for the week to 15, with a count distinct of about 12. I hereby declare the first season of VIS Runners a resounding success.

InfoVis: Documents, Search & Images

The first session was even more sparsely attended than on Thursday, which was really too bad. The first paper was Overview: The Design, Adoption, and Analysis of a Visual Document Mining Tool For Investigative Journalists by Matthew Brehmer, Stephen Ingram, Jonathan Stray, and Tamara Munzner, and it was great. Overview is a tool for journalists to sift through large collections of documents, like those returned from Freedom of Information Act (FOIA) requests. Instead of doing automated processing, it allows the journalists to tag and use keywords, since many of these documents are scanned PDFs. It’s a design study as well as a real tool that was developed over a long time and multiple releases. This is probably the first paper at InfoVis to report on such an extensively developed system (and the only one directly involved in somebody becoming a Pulitzer Prize finalist).

The Overview paper also wins in the number of websites category: in addition to checking out the paper and materials page, you can use the tool online, examine the source code, or read the blog.

How Hierarchical Topics Evolve in Large Text Corpora by Weiwei Cui, Shixia Liu, Zhuofeng Wu, and Hao Wei presents an interesting take on topic modeling and the ThemeRiver. Their system is called RoseRiver, and is much more user-driven. The system finds topics, but lets the user combine or split them, and work with them much more than other systems I’ve seen.

I’m a bit skeptical about Exploring the Placement and Design of Word-Scale Visualizations by Pascal Goffin, Wesley Willett, Jean-Daniel Fekete, and Petra Isenberg. The idea is to create a number of ways to include small charts within documents to show some more information for context. They have an open-source library called Sparklificator to easily add such charts to a webpage. I wonder how distracting small charts would be in most contexts, though.

A somewhat odd paper was Effects of Presentation Mode and Pace Control on Performance in Image Classification by Paul van der Corput and Jarke J. van Wijk. They investigated a new way of rapid serial visual presentation (RSVP) for images, which continuously scrolls rather than flips through pages of images. It’s a mystery to me why they only tried sideways scrolling, which seems much more difficult than vertical scrolling.

Capstone: Barbara Tversky, Understanding and Conveying Events

The capstone was given by cognitive psychology professor Barbara Tversky. She talked about the difference between events and activities (events are delimited, activities are continuous), and how we think about them when listening to a story. She has done some work on how people delineate events on both a high level and a very detailed level.

This is interesting in the context of storytelling, and particularly in comics, which break up time and space using space, and need to do so at logical boundaries. Tversky also discussed some of the advantages and disadvantages of story: that it has a point of view, causal links, emotion, etc. She listed all of those as both advantages and disadvantages, which I thought was quite clever.

It was a very fast talk, packed with lots of interesting thoughts and information nuggets. It worked quite well as a counterpoint to Alberto Cairo’s talk, and despite the complete lack of direct references to visualization (other than a handful of images), it was very appropriate and useful. Many people were taking pictures of her slides during the talk.

Next Years

IEEE VIS 2015 will be held in Chicago, October 25–30. The following years had already been announced last year (2016: Washington, DC; 2017: Santa Fe, NM), but it was interesting to see them publicly say that 2018 might see VIS in Europe again.

This concludes the individual day summaries. I will also post some more general thoughts on VIS 2014 in the next few days.

VIS 2014 – Thursday

Fri, 11/14/2014 - 07:16

Categories:

Visualization

Thursday was the penultimate day of VIS 2014. I ended up only going to InfoVis sessions, and unfortunately missed a panel I had been planning to see. The papers were a bit more mixed, but there were again some really good ones.

InfoVis: Evaluation

Thursday was off to a slow start (partly because of the effects of the party the night before that had the room mostly empty at first), but eventually got interesting.

Staggered animation is commonly understood to be a good idea: don’t start all movement in a transition at once, but with a bit of delay. It’s supposed to help people track the objects as they are moving. The Not-so-Staggering Effect of Staggered Animated Transitions on Visual Tracking by Fanny Chevalier, Pierre Dragicevic, and Steven Franconeri describes a very well-designed study that looked into that. They developed a number of criteria that make tracking harder, then tested those with regular motion. After establishing their effect, they used Monte Carlo simulation to find the best configuration for staggered animation of a field of points (since there are many choices to be made about which to move first, etc.), and then tested those. It turns out that the effect from staggering is very small, if it exists at all. That’s quite interesting.

Since they tested this on a scatterplot with identical-looking dots, it’s not clear how this would apply to, for example, a bar chart or a line chart, where the elements are easier to identify. But the study design is very unusual and interesting, and a great model for future experiments.

Another unexpected result comes from The Influence of Contour on Similarity Perception of Star Glyphs by Johannes Fuchs, Petra Isenberg, Anastasia Bezerianos, Fabian Fischer, and Enrico Bertini. They tested the effect of outlines in star glyphs, and found that the glyph works better without it, just showing the spokes. That is interesting, since the outline supposedly would help with shape perception. There are also some differences between novices and experts, which are interesting in themselves.

The only technique paper that I have seen so far this year was Order of Magnitude Markers: An Empirical Study on Large Magnitude Number Detection by Rita Borgo, Joel Dearden, and Mark W. Jones. The idea is to design a glyph of sorts to show orders of magnitude, so values across a huge range can be shown without making most of the smaller values impossible to read. The glyphs are fairly straightforward and require some training, but seem to be working quite well.
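The glyphs themselves are the paper’s contribution; purely as a sketch of the arithmetic behind them (function and variable names are mine, not the authors’), splitting a value into an order of magnitude and a mantissa is straightforward:

```python
import math

def split_magnitude(value):
    """Split a positive number into (exponent, mantissa) so that
    value == mantissa * 10**exponent and 1 <= mantissa < 10."""
    exponent = math.floor(math.log10(value))
    mantissa = value / 10 ** exponent
    return exponent, mantissa

# A marker can then encode exponent and mantissa separately,
# so 3 and 3,000,000 stay distinguishable on the same axis.
print(split_magnitude(3))          # (0, 3.0)
print(split_magnitude(3_000_000))  # (6, 3.0)
```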

InfoVis: Perception & Design

While there were some good papers in the morning, overall the day felt a bit slow. The last session of the day brought it back with a vengeance, though.

Learning Perceptual Kernels for Visualization Design by Çağatay Demiralp, Michael Bernstein, and Jeffrey Heer describes a method for designing palettes of shapes, sizes, colors, etc., based on studies. The idea is to measure people’s responses to differences, train a model that captures how well the items can be differentiated, and then pick the most discriminable ones.
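The paper learns these perceptual kernels from crowdsourced judgments; as a hedged sketch of just the final selection step (the data and function names here are mine, not the authors’), a greedy max-min pick from a matrix of pairwise perceptual distances could look like this:

```python
def pick_discriminable(distance, k):
    """Greedy max-min selection: from a symmetric matrix of perceptual
    distances, pick k items whose smallest pairwise distance is large."""
    n = len(distance)
    chosen = [0]  # start from an arbitrary item
    while len(chosen) < k:
        # add the item farthest from its nearest already-chosen neighbor
        best = max((i for i in range(n) if i not in chosen),
                   key=lambda i: min(distance[i][j] for j in chosen))
        chosen.append(best)
    return chosen

# Toy 4-item "kernel": items 0 and 3 are perceptually most distinct.
D = [[0, 1, 2, 9],
     [1, 0, 1, 5],
     [2, 1, 0, 4],
     [9, 5, 4, 0]]
print(pick_discriminable(D, 2))  # [0, 3]
```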

The presentation that took the cake for the day though was Ranking Visualizations of Correlation Using Weber’s Law by Lane Harrison, Fumeng Yang, Steven Franconeri, and Remco Chang. It’s known that scatterplots allow people to judge correlation quite well, with precision following what is called Weber’s Law (which describes which end of the scale is easier to differentiate). In their experiments, the authors found that this is also true for ten other techniques, including line charts, bar charts, parallel coordinates, and more. This is remarkable because Weber’s law really describes very basic perception rather than cognition, and it paves the way for a number of new ways to judge correlation in almost any chart.

The Relation Between Visualization Size, Grouping, and User Performance by Connor Gramazio, Karen Schloss, and David Laidlaw looked at the role of mark size in visualizations, and whether it changes people’s performance. They found that mark size does improve performance, but only to a point. From there, it doesn’t make any more difference. Grouping also helps reduce the negative effect of an increase in the number of marks.

Everybody talks about visual literacy in visualization, but nobody really does anything about it. That is, until A Principled Way of Assessing Visualization Literacy by Jeremy Boy, Ronald Rensink, Enrico Bertini, and Jean-Daniel Fekete. They developed a framework for building visual literacy tests, and showed that this could work with an actual example. This is just the first step certainly, and there are no established visual literacy levels for the general population, etc. But having a way to gauge visual literacy would be fantastic and inform a lot of research, use of visualization in the media, education, etc.

The Podcasting Life

Moritz and Enrico asked me to help them record a segment for the VIS review episode of the Data Stories podcast. You can listen to that in all its raw, uncut glory by downloading the audio file.

VIS 2014 – Wednesday

Thu, 11/13/2014 - 14:29

Categories:

Visualization

Wednesday is more than the halfway point of the conference, and was clearly the high point so far. There were some great papers, the arts program, and I got to see the Bertin exhibit.

InfoVis: Interaction and Authoring

Revisiting Bertin matrices: New Interactions for Crafting Tabular Visualizations by Charles Perin, Pierre Dragicevic, and Jean-Daniel Fekete was the perfect paper for this year. They implemented a very nice, web-based version of Bertin’s reorderable matrix, very closely following the purely black-and-white aesthetic of the original. They are also starting to build additional things on top of that, though, using color, glyphs, etc.

The reason it fits so well is not just that VIS is in Paris this year (and Bertin actually lived just around the corner from the conference hotel), but it also ties in with the Bertin exhibit (see below). They also made the right choice in calling the tool Bertifier, a name I find endlessly entertaining (though they clearly missed the opportunity to name it Bertinator, a name both I and Mike Bostock suggested after the fact – great minds clearly think alike).

iVisDesigner: Expressive Interactive Design of Information Visualizations by Donghao Ren, Tobias Höllerer, and Xiaoru Yuan is a tool for creating visualization views on a shared canvas. It borrows quite a bit from Tableau, Lyra, and other tools, but has some interesting ways of quickly creating complex visualizations that are linked together so brushing between them works. They even showed streaming data in their tool. It looked incredibly slick in the demo, though I have a number of questions about some of the steps I didn’t understand. Since it’s available online and open-source, that’s easy to follow up on, though.

VIS Arts Program

I saw a few of the papers in the VIS Arts Program (oddly abbreviated VISAP), though not as many as I would have liked. There were some neat projects using flow visualization to paint images, some more serious ones raising awareness for homelessness with a large installation, etc.

The one that stood out in the ones I saw was PhysicSpace, a project where physicists and artists worked together to make it possible to experience some of the weird phenomena in quantum physics. The pieces are very elaborate and beautiful, and go way beyond simple translations. There is a lot of deep thinking and an enormous amount of creativity in them. It’s also remarkable how open the physicists seem to be to these projects. It’s well worth watching all the videos on their website, they’re truly stunning. This is the sort of work that really shows how crossing the boundary between art and science can produce amazing results.

InfoVis: Exploratory Data Analysis

This session was truly outstanding. All the papers were really good, and the presentations matched the quality of the content (almost all the presentations I saw yesterday were really good). InfoVis feels really strong this year, both in terms of the work and the way it is presented.

The Effects of Interactive Latency on Exploratory Visual Analysis by Zhicheng Liu and Jeffrey Heer looks at the effect latency has on people’s exploration of data. They added a half-second delay to their system and compared it to the system in its original state. It turns out that the delay reduces the amount of interaction, and people end up exploring less of the data. While that is to be expected, when asked, people didn’t think the delay would affect them, and a third didn’t even consciously notice it.

Visualizing Statistical Mix Effects and Simpson’s Paradox by Zan Armstrong and Martin Wattenberg examines Simpson’s Paradox (e.g., the median increases for the entire population even though it decreases in every subgroup) in visualization. They have built an interesting visualization to illustrate why the effect occurs, and make some recommendations for mitigating it in particular techniques. This is an important consideration for aggregated visualization, which is very common given today’s data sizes.
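A minimal illustration of the effect, using simple means and made-up wage data (nothing here is from the paper): every subgroup’s average falls between two years, yet the overall average rises, because the mix of groups shifts.

```python
def mean(xs):
    return sum(xs) / len(xs)

# Made-up wages: between the two years, most workers move from the
# low-paid group to the high-paid group, while both groups' averages drop.
before = {"part_time": [10] * 8, "full_time": [30] * 2}
after  = {"part_time": [9] * 1,  "full_time": [28] * 9}

for group in before:
    assert mean(after[group]) < mean(before[group])  # every subgroup falls

overall_before = mean([w for ws in before.values() for w in ws])  # 14.0
overall_after  = mean([w for ws in after.values() for w in ws])   # 26.1
print(overall_before, "->", overall_after)  # yet the overall average rises
```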

Showing uncertainty is an important issue, and often it is done with error bars on top of bar charts. The paper Error Bars Considered Harmful: Exploring Alternate Encodings for Mean and Error by Michael Correll and Michael Gleicher shows why they are problematic: they are ambiguous (do they show standard error or a confidence interval? If the latter, which one?), asymmetric (points in the bar appear to be more likely than points over the bar, at the same distance from the bar’s top), and binary (a point is either within the range or outside). Their study demonstrates the issue and then tests two alternative encodings, violin plots and gradient plots, which both perform better.
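To see how big that ambiguity is in practice, here is a quick sketch (made-up sample, normal approximation, not the paper’s code): the same data produces error bars of very different lengths depending on which convention is in use, and the chart alone doesn’t tell you which.

```python
import math
import statistics

sample = [12.1, 9.8, 11.4, 10.3, 12.7, 9.9, 11.0, 10.6]
n = len(sample)

se = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean
ci95 = 1.96 * se                              # ~95% CI half-width (normal approx.)

# Nearly a factor of two between the two bar lengths for identical data.
print(f"mean ± SE    : {statistics.mean(sample):.2f} ± {se:.2f}")
print(f"mean ± 95% CI: {statistics.mean(sample):.2f} ± {ci95:.2f}")
```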

My Tableau Research colleagues Justin Talbot, Vidya Setlur, and Anushka Anand presented Four Experiments on the Perception of Bar Charts. They looked at the classic Cleveland and McGill study of bar charts, and asked why the differences they found occurred. Their study is very methodical and presented very well, and opens up a number of further hypotheses and questions to look into. It has taken 30 years for somebody to finally ask the why question; hopefully we’ll see more reflection and follow-up now.

I unfortunately missed the presentation of the AlgebraicVis paper by Gordon Kindlmann and Carlos Scheidegger. But it seems like a really interesting approach to looking at visualization, and Carlos certainly won’t shut up about it on Twitter.

Bertin Exhibit

VIS being in Paris this week is the perfect reason to have an exhibit about Jacques Bertin. It is based on the reorderable matrix, an idea Bertin developed over many years. The matrix represents a numeric value broken down by two categorical dimensions, essentially a pivot table. The trick, though, is that it allows its user to rearrange and order the rows and columns to uncover patterns, find correlations, etc.
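As a crude sketch of what reordering can do (Bertin’s method was manual and far more subtle, and this code is mine, not the exhibit’s): simply sorting rows and columns by their totals already pulls the large values of a pivot-table-like matrix into one corner, hinting at structure.

```python
def reorder(matrix):
    """Sort rows and columns by their totals, largest first, so big
    values cluster toward the top-left corner of the matrix."""
    rows = sorted(range(len(matrix)),
                  key=lambda r: sum(matrix[r]), reverse=True)
    cols = sorted(range(len(matrix[0])),
                  key=lambda c: sum(row[c] for row in matrix), reverse=True)
    return [[matrix[r][c] for c in cols] for r in rows]

# A scrambled matrix...
m = [[0, 5, 0],
     [4, 9, 3],
     [0, 6, 1]]
for row in reorder(m):
    print(row)  # values now decrease away from the top-left corner
```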

The exhibit shows several design iterations Bertin went through to build it so it would be easy to rearrange, lock, and unlock. Things were more difficult to prototype and animate before computers.

The organizers also built a wooden version of the matrix for people to play with. The basis for this was the Bertifier program presented in the morning session. While they say that it is a simplified version of Bertin’s, they also made some improvements. One is that they can swap the top parts of the elements by attaching them with magnets. That way, different metrics can be expressed quite easily, without having to take everything apart. I guess it also lets you cheat on the reordering if you only swap two rows.

They also have some very nice hand-drawn charts from the 1960s, though not done by Bertin. They are interesting simply because they show how much effort it was to draw charts before computers.

Note the amount of white-out used above to remove extraneous grid lines, and below to correct mistakes on the scatterplot.

I was also reminded of this in the Financial Visualization panel, where one of the speakers showed photos of the huge paper charts they have at Fidelity Investments for deep historical data (going back hundreds of years). Paper still has its uses.

In addition to being interesting because of Bertin’s influence and foresight, this exhibit is also an important part of the culture of the visualization field. I hope we’ll see more of these things, in particular based on physical artifacts. Perhaps somebody can dig up Tukey’s materials, or put together a display of Bill Cleveland’s early work – preferably without having to wait for him to pass away.

Running and Partying

The second VIS Run in recorded history took place on Wednesday, and that night also saw the West Coast Party, which is becoming a real tradition. The first session on Thursday morning was consequently quite sparsely attended.

VIS 2014 – Tuesday

Wed, 11/12/2014 - 08:46

Categories:

Visualization

The big opening day of the conference, Tuesday, brought us a keynote, talks, and panels. Also, a new trend I really like: many talks end with the URL of a webpage that contains a brief summary of the paper, the PDF, and often even a link to the source code of the tool they developed.

Opening

That VIS would ever take place outside the U.S. was by no means a given. There was a lot of doubt about getting enough participants, sponsors, etc. to make it work (and a ton of convincing by this year’s chair, Jean-Daniel Fekete).

That made it especially interesting to hear the participant numbers. There are over 1,100 attendees this year, more than ever before. They also more than doubled the amount of money coming from sponsors compared to last year, which is very impressive. VIS outside the U.S. is clearly doable, and even though the next three years are already known to be in the U.S., I’m sure this will happen again.

One number that was presented but that I don’t believe is that there were supposedly only 79 first-time attendees. That doesn’t square with the different composition of participants (fewer Americans, more Europeans), and besides would be terrible if true.

Alberto Cairo: The Island of Knowledge and the Shorelines of Wonder

The keynote this year was by Alberto Cairo, who gave a great talk about the value of knowledge and communicating data. Perhaps my favorite quote was that good answers lead to more good questions.

There is a lot more to say, and I want to really do his talk justice. So I’m not going to go into more detail here, but rather write it up in a separate posting in the next week or two.

InfoVis: The Joy of Sets

The first InfoVis session started what I hope is a trend: ending talks with a URL that points to a website with talk materials, the paper, and often even the source code of the presented tool. This is how work can be shared, revisited, and make its way beyond the limited conference audience.

The first paper was UpSet: Visualization of Intersecting Sets by Alexander Lex, Nils Gehlenborg, Hendrik Strobelt, Romain Vuillemot, and Hanspeter Pfister. The system allows the user to compare sets and look at various intersections and aggregations. There are many different interactions to work with the sets. Because there are so many views and details, it’s almost like a systems paper, but good (most systems papers are terrible – another rant for another day).

OnSet: A visualization technique for large-scale binary set data by Ramik Sadana, Timothy Major, Alistair Dove, and John Stasko describes a tool for comparing multiple sets to each other. There are some clever interactions and the tool also shows hierarchies within the sets while comparing.

Rounding out the sets theme was a paper I didn’t actually see the presentation for, but I want to mention anyway: Domino: Extracting, Comparing, and Manipulating Subsets across Multiple Tabular Datasets by Samuel Gratzl, Nils Gehlenborg, Alexander Lex, Hanspeter Pfister, and Marc Streit. From what I gather, it presents a query interface and visualization for sets and subsets, and it looks quite nifty.

InfoVis: Colors and History

I’m a bit conflicted about DimpVis: Exploring Time-varying Information Visualizations by Direct Manipulation by Brittany Kondo and Christopher Collins. They developed a way to show time in a plot so that you can navigate along the temporal development of a value (rather than use a time slider that is disconnected and doesn’t show you history). While that makes sense to me in the original example they showed, a time-varying scatterplot, I’m a bit less convinced by the bar chart, pie chart, and heatmap versions of it.

A paper I missed, but that seems to have stirred some controversy, is Tree Colors: Color Schemes for Tree-Structured Data by Martijn Tennekes and Edwin de Jonge.

“Blind Lunch”

The reason I missed some of the papers in the InfoVis session is that I was one of the people hosting a table for what is called a blind lunch. This used to be called Lunch with the Leaders, which may have sounded a bit too ambitious (and scared off potential leaders who didn’t necessarily consider themselves that), but at least it made more sense. Everybody knew who they were signing up with, and nobody was blindfolded as far as I’m aware.

It’s a good event though. I had a chance to chat with four grad students and share my wisdom about industry vs. academia. There are also a few more activities as part of the Compass program for people who are about to graduate, or just generally want to get more perspectives on the job situation in academia and/or industry.

Panel: Data with a cause: Visualization for policy change

One of the things I was looking for the most at VIS this year was the panel Data with a cause: Visualization for policy change, organized by Moritz Stefaner, with speakers from the OECD, World Bank, and the World Economic Forum.

The panelists all had interesting things to say about what they are doing to make data more accessible, make it easier to share their reports and other materials, and provide means for people to talk back. There are also some interesting issues around the different types of audience they want to serve (economists, policy makers, general public) and the general unease when handing out data to the unwashed masses.

What I was missing, though, was a bit of controversy and actual discussion. For such an important topic, it was a very tame panel. There were some really good questions to be asked though, like one coming from the audience about the responsibility of organizations not to reinforce the winners and losers through their data, and what they might do about that. I also asked about the availability not just of tables, but of the underlying data. I have some more to say on that topic in future postings.

Namredienhs

One of my favorites of the conference so far is Multivariate Network Exploration and Presentation: From Detail to Overview via Selections and Aggregations by Stef van den Elzen and Jarke J. van Wijk. I don’t seem to be alone in this, as the paper also received the Best Paper Award at InfoVis this year.

The system they developed shows multivariate graphs, and allows the concurrent display of the network and the multivariate data in the nodes (even including small multiples). What’s perhaps most interesting is the fact that they allow the user to make selections to aggregate the graph, essentially building a sort of PivotGraph to see the higher-level structure on top of the very detailed, hairball-like, graph.
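A toy sketch of that aggregation step, with hypothetical data and function names of my own (not the authors’ code): nodes sharing an attribute value collapse into one super-node, and the parallel edges between groups become edge weights.

```python
from collections import Counter

def aggregate(nodes, edges, attr):
    """PivotGraph-style rollup: group nodes by one attribute and count
    the edges running between different groups as edge weights."""
    group = {n: props[attr] for n, props in nodes.items()}
    weights = Counter()
    for a, b in edges:
        if group[a] != group[b]:
            weights[tuple(sorted((group[a], group[b])))] += 1
    return weights

# Hypothetical multivariate graph: people with a 'dept' attribute.
nodes = {"ann": {"dept": "sales"}, "bob": {"dept": "sales"},
         "cho": {"dept": "eng"},   "dev": {"dept": "eng"}}
edges = [("ann", "cho"), ("ann", "dev"), ("bob", "cho"), ("ann", "bob")]
print(aggregate(nodes, edges, "dept"))  # Counter({('eng', 'sales'): 3})
```

Within-group edges (like ann–bob) disappear here; a fuller version would keep them as self-loop weights on the super-nodes.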

Because they show the detailed network first and let the user create an overview version, Jarke van Wijk apparently suggested naming the system Namredienhs – i.e., Shneiderman spelled backwards, since it’s Ben Shneiderman’s famous mantra (overview first, zoom and filter, then details on demand) in reverse.

NAMREDIENHS! The reverse Shneiderman mantra. #ieeevis pic.twitter.com/zBRJ3oipNJ

— Nils Gehlenborg (@nils_gehlenborg) November 11, 2014

This was much funnier the way Stef van den Elzen did it of course, and in particular with Ben Shneiderman sitting there in the first row, directly in front of him.

VisLies, Parties

It remains a crime that VisLies is not a regular session, but a meetup that is tacked on and usually at a time when everybody is at dinner. I think it’s a really great idea, and there should be room for it in the regular program. It deserves a lot more attention and attendance. I missed it this year again.

There were also two new parties, the Austrian Party and the NYU Party. I really like this new tradition of parties to connect people and reinforce the community aspect of the conference. It does mean even less sleep than before, though.

VIS 2014 – Monday

Tue, 11/11/2014 - 08:41

Categories:

Visualization

IEEE VIS 2014 technically began on Saturday, with the first full day open to all attendees being Sunday. Monday continued the workshops and tutorials, and that is where we join our intrepid reporter.

VIS Social Run

The day started at 6:30am, when five fearless runners braved the cold and dark, and completed the inaugural VIS Social Run. It was a great run, about 5km in length, in (what I consider) perfect running weather (i.e., cool bordering on cold). While the darkness limited the sightseeing potential of the run, the early morning was great because it’s the time when all the boulangeries are baking their bread, so we got to suck in the delicious smells of fresh bread.

I’ve posted the route on Strava for all to enjoy. We even took a dorky, sweaty, blurry group selfie at the end.

We’re also running Wednesday morning and Friday morning, and potentially also Thursday. Stephen Kobourov might also do a longer run on Friday afternoon. Let me know if you want to join us, or just come to the Marriott at 6:30am.

BELIV

I only saw part of the BELIV workshop (the name still stands for Beyond Time And Errors: Novel Evaluation Methods For Visualization). The papers there are well worth checking out though, because they represent some of the most interesting thinking about how to better evaluate visualization work.

Pierre Dragicevic gave a great keynote on the use of statistics in visualization: in particular, the reliance on p values, often without understanding them well, the cherry-picking of results, the ignoring of effect sizes, etc. Confidence intervals, he argued, are a much better idea, because they provide much more information than the largely binary (and opaque!) significance test.

This is really important to make results more useful beyond just the individual paper, easier to compare in replication, and just generally more honest. Pierre and his group have a great website with lots of resources to explore.

Bernice Rogowitz made some good points in her questions after the talk, like the fact that reporting more than just the plain p values makes for a much better way of telling the story of the analysis than the boring boilerplate stats you usually get. Walking the reader through the analysis also makes it easier to include the weaker results instead of hiding them.

There was also a panel on tasks, which largely talked about task taxonomies. There was an odd lack of self-awareness on that panel, because for all the talk about tasks, there didn’t seem to be much thought about what people would actually do with those taxonomies. Who are the users of the taxonomies? What are their tasks? Is any of this work actually useful, or is it just done for its own sake? That struck me as particularly odd as part of this event.

I didn’t see the actual paper presentations, but BELIV generally has a good mix of interesting new thinking and interesting results from evaluations of visualization tools and systems.

On a related note, Steve Haroz has put together a great guide to evaluation papers at VIS this year.

Everything Except The Chart Tutorial

Among the more unusual things this year was Moritz Stefaner and Dominikus Baur's tutorial, titled Everything Except The Chart. They talked about all the things a web-based visualization project needs to be successful (other than the visuals): how to make it findable, how to make it shareable, various web technologies, etc. They did so based on their own projects, like Selfiecity, the OECD Better Life Index, etc.

The room was packed, which was interesting. Who knew academics actually cared about sharing their work with the world? Apparently, they do.

There was a lot of information in that tutorial; I won't even begin to try to summarize it all. They have published their slides and also made some demo code available.

Perhaps the best summary of the tutorial is the project checklist they used to frame part of it:

  • Is it findable?
  • Does it draw you in?
  • Is it enjoyable to use?
  • Is it informative?
  • “Why should I care?”
  • Is it shareable?

These are questions anybody can ask themselves easily, and then figure out what to do about them. This includes simple things like hidden images and text to make the page easier to index for search engines and share/pin/etc. And it even includes things like a press kit, so journalists can write about your projects more easily (and get the best images).

While I wasn’t as excited about the long list of tools (bower, grunt, snort, blurt, fart, etc. – I may have made up a few of those, guess which!), they had lots of good points about making design responsive, having it work well (or at least be useable) on small screens, etc. None of this has ever been discussed at VIS before as far as I am aware, and it has the potential to have the largest impact for getting word out about the work we do in visualization. Now all the people who attended just need to actually put these things into practice.

The VIS Sports Authority

Sun, 10/19/2014 - 23:35

Categories:

Visualization

When you think of a conference, does sitting around a lot come to mind? Lots of food? Bad coffee? No time to work out? For the first time in VIS history, there will be a way to exercise your body, not just your mind. The VIS Sports Authority, which is totally an official thing that I didn’t just make up, will kick your ass at VIS 2014.

There will be two disciplines: cycling and running. Jason Dykes is running the cycling team, and I will be driving the runners.

Le Tour de VIS

Jason is way more organized than I am, having put together not just a real website with a logo, but actually ordered bike jerseys. Cycling has somewhat more complicated logistics though, so that is certainly a good thing. I hear Jason has even picked out the soundtrack for the race already.

The Vélo Club de VIS will embark on Le Tour de VIS (this is apparently named after some sort of bike race) on the Saturday after the conference, November 15.

Go to one of the pages linked above to get more information, like a map of the planned route, and to sign up.

VIS Runners

The running will be a bit more low-key. I couldn’t think of a better name than VIS Runners, so let’s just run with that (unless you want me to call us Eager Runners).

However, running will not happen after the conference, but during. Since the receptions and parties are in the evenings, it makes the most sense to go out in the mornings. My current plan is to meet at the conference hotel at about 6:30am, then run for about an hour, so we’ll be back by 7:30.

For the distance, I’m thinking no more than 6 miles/10 kilometers, but that can be adjusted. We probably won’t do more than three runs, and in particular will likely skip Thursday (after the reception Wednesday night).

The course should be different every day to get some variety, and will depend on the distance people want to go. If you’re a local or just know your way around Paris, I’d appreciate your input in the route planning, too!

I’m embedding a form below (also available here) to collect some information about when and how far people want to go, and to get people’s names so I can follow up later.


Large Multiples

Mon, 10/13/2014 - 03:43

Categories:

Visualization

Getting a sense of scale can be difficult, and the usual chart types like bars and lines don’t help. Showing scale requires a different approach, one that makes the multiplier directly visible.

Bars

In the U.S., CEOs on average make 354 times as much as workers, according to this recent posting on the Washington Post’s Wonkblog. That is an astounding number. Put differently, a CEO makes in one day almost as much as the worker makes in an entire year. How do we show this enormous difference?
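That "one day" claim is easy to check with back-of-the-envelope arithmetic. A quick sketch: the 354 ratio is from the Wonkblog post, the calendar math is mine:

```python
# CEO-to-worker pay ratio reported by Wonkblog
ratio = 354

# Spread the CEO's annual pay over calendar days: one CEO day
# expressed as a fraction of the worker's entire annual pay.
ceo_day_vs_worker_year = ratio / 365
print(round(ceo_day_vs_worker_year, 2))  # 0.97: one day is almost a full worker year
```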

Roberto A. Ferdman at Wonkblog shows the numbers as a bar chart.

The bars compare between countries, but I was interested in the comparison between the worker and the CEO. Just how much more is 354 times more? This chart doesn’t tell me that.

Multiples

An article on Quartz from late last year looks at similar data, and translates it into how many months workers at different companies would have to work to make the same as the CEO does in one hour. The disparities in these examples are even more staggering, since while the Wonkblog chart above looked at averages, Quartz used specific – extreme – examples. For example, McDonald’s CEO makes 1120 times what a McDonald’s worker makes. This is shown as a sort of calendar that has months marked for how long the worker needs to work to make that much.

While that illustrates the time, it kind of misses the point. Showing days when the comparison is hours understates the true magnitude by a factor of eight (assuming an eight-hour work day). Why not show the same units?
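The factor of eight is simple arithmetic. A sketch using the McDonald's figure from the Quartz piece and the assumed eight-hour day:

```python
ratio_hours = 1120   # worker hours needed to match one CEO hour (McDonald's example)
hours_per_day = 8    # assumed length of a work day

work_days = ratio_hours / hours_per_day
print(work_days)     # 140.0 working days, roughly six to seven months
# Labeling the chart in days rather than hours shrinks the number by a factor of 8.
```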

Large Multiples

The idea of showing the number of days is good, however, and I wanted to apply it to the Wonkblog numbers. So I built a little unit or multiples chart for this purpose.

I originally included a bar chart alongside the unit chart, but based on Twitter feedback decided to remove it. This focuses the chart on its main message, even if it makes comparing between countries more difficult. That comparison is not really the interesting part anyway; the enormous disparity in and of itself is.

While I was building an interactive chart anyway, I added a bit of animation. The build-up of the bubbles is meant to make the number more tangible by also translating it into time: the larger the number, the longer you have to wait for the chart to fill in. That makes you feel the difference a bit more than a static chart would. I stole this idea from the UK Office for National Statistics' Neighbourhood Quiz.

Click the image below to go to the interactive version of the chart. Let me know what you think!

Eight Years of eagereyes

Thu, 10/02/2014 - 05:20

Categories:

Visualization

What is the purpose of blogging about visualization? Is it to make fun of the bad stuff? Is it to point to pretty things? Is it to explain why things are good or bad? Is it to expand the landscape of ideas and break new ground? Or is it to discuss matters at great length that ultimately don’t matter all that much?

I criticize things, and I think it’s important to do that. I don’t regret any of my postings, however strong they may have been, and however mean they may have sounded. It was all done in good faith and with the intent to point out issues and get people to pay attention.

But increasingly, I’m questioning the thinking that some of that criticism is coming from. I’m not arguing against any particular issue people like to bring up, but I am starting to wonder how much of it is simply coming out of narrow-mindedness and stubbornness. How much of it would be obviated by sitting back, taking a deep breath, and trying to see things from a different angle?

This is not just a question of tone and intensity, but one that goes much deeper: how much do we really know? When you start to ask that question in visualization, it becomes clear very quickly how shockingly little we actually really understand. Going on and on about pie charts? Point to a paper that’s actually showing that they’re bad! Yes, such a paper exists. But how many studies have shown the same thing? Not that many. And it gets much worse for things like 3D bar charts, etc. There is very little support for the religious zealotry with which we like to damn these things.

Then there is the  question of different goals. There isn’t just one use for visualization, and things created for different purposes need to be judged against different standards. It’s all about trade-offs and making decisions. An audience of readers on the web is going to need a different approach than an audience of experts who know the data really well and have a vested interest in digging deeper. An interactive piece on a news media website will need to be much more compelling than a corporate dashboard if anybody’s going to actually bother doing something with it. There is not just one purpose, or one audience, or one way to do things.

It’s encouraging to see the huge interest in visualization. And it’s even more encouraging to see some of the recent and upcoming work on rhetoric, persuasion, and related questions. Because it matters. Communication matters. Data matters. Visualization matters.

Discussing visualization needs to matter too. But it can only do so if it comes from a place of understanding, respect, and an open mind.

Beyond the Knee-Jerk Reaction

Tue, 09/16/2014 - 03:12

Categories:

Visualization

There is a tendency to just reflexively make fun of certain types of charts, in particular pie charts and 3D charts. While that is often justified, there are also exceptions. Not all pie charts are bad, and not all 3D charts are terrible. But to spot those outliers, we have to suppress the knee-jerk reflex and give them a moment of thought before ripping them apart.

The Chart

About two weeks ago, I posted this chart on Twitter after seeing it in the Wired iPad app (September 2014 issue).

Yes, it is a 3D area chart. The vertical axis is the average salary paid in a number of sectors over time. The one horizontal axis on the left is the time axis, showing 30 years from 1983 to 2013. The other horizontal axis divides the chart up into four elements for four sectors: technology, white-collar, manufacturing, and sales. That axis has a second encoding in the width of the “mountains,” which represent the fraction of the workforce in each of those sectors.

The Good

So there’s a lot of data here. You can see that the tech sector pays a lot more than the others, roughly twice what sales pays, and a good 50% more than manufacturing or white-collar jobs. You can also see the effect of the recession in the ripples along the tops of the mountains, with an interesting lag between white-collar/manufacturing and sales.

I also have to admit to being quite surprised to see how small the tech sector really is: only 7% of the workforce, up from 4% 30 years ago. It’s sometimes hard to remember that there’s a world beyond technology when you’re working at a software company and spend your days on Twitter. White-collar jobs have grown to roughly make up for the loss in manufacturing, but not quite, while the percentage of people in sales has not changed.

Not all of that comes from the chart alone; it certainly requires some reading of the numbers, in particular for the width of the mountains. But the information is there, and it’s not hard to read. The reason for posting this was my surprise at finding myself spending several minutes with the chart, and at how informative and fun to explore it was. There is a bit of interaction, too, when you tap on the plus signs, but those don’t give you much additional information.

The Bad

What is wrong with this chart? Sure, it’s 3D. You can’t precisely read the numbers. What was the average salary for manufacturing jobs in 1992? You can’t read that with any sort of precision. 3D is wasteful; you could show more data in that space. But who cares? That is not the point of this chart. You can see the development over time, and that’s what matters. And the chart does not seem to wildly distort the reading of those values that are readable (which is a common issue with 3D charts).

I also think that this is a good way to present what are basically eight time series (salaries and workforce percentages for four sectors) in a very concise way that works well in a static image. Of course this could be broken up into two or even three charts, but you would lose some of the cohesion the 3D gives you here. And it would be a lot less fun to explore. The lines for workforce percentage would also look extremely boring (they seem to be changing at a fairly constant pace, and certainly don’t change direction). If you care not just about representing the data but also capturing readers’ interest, this is the better chart. It certainly worked in my case.

A Smarter Discussion

But beyond all those reasons, I just want to see a more nuanced and informed discussion of these things. It doesn’t take much intelligence to sneer at every 3D chart and every pie chart that floats by on Twitter. But things are a bit more complicated than that, and these things do have their place. And just throwing some supposed absolute rules around doesn’t do anybody any favors.

Perhaps Christopher Ingraham was right.

@eagereyes Twitter is only for saying mean things about charts, Robert

— Christopher Ingraham (@_cingraham) September 6, 2014

But I hope that we can get to a point where we can have a more intelligent, nuanced, and respectful discussion. We’re not going to make much progress if we just keep rehashing the same old ideas without putting any new thoughts into them.

The Semantics of the Y Axis

Mon, 09/08/2014 - 03:53

Categories:

Visualization

The vertical axis is not just important because it embodies one of the most important visual properties, but also because it is much more semantically loaded than the horizontal. Not only does the right choice of mapping help with reading a chart, the wrong choice can also confuse people.

It’s not a coincidence that the vertical is so important to us. An animal that is lying on the ground is dead or sleeping; that’s important to know. Vertical movement is also much harder than moving in the other two dimensions, and fast vertical movements can kill us. That is why we overestimate heights: better to be scared of a jump that isn’t all that dangerous than to take it lightly and get injured or killed.

We also have some very strong ideas about the vertical direction. Things moving up are generally good, things moving down less so. Being up (standing, walking, moving) is good, being down (lying, sick, dead) is not. We derive many of our metaphors from this fundamental difference too: being down meaning being sad, things looking up or moving up meaning they are good or getting better. Up also means more: more things being stacked or heaped up means more vertical space being used, and more is usually better, so more is up.

Jawbone UP’s Sleep Tracking

Jawbone wrote a blog posting about when people slept during the soccer World Cup, according to data gathered from users of their activity-tracker armband. The tracker is called UP, which causes some interesting issues when parsing the axis labels in these charts.

Parsing “% of UP wearers asleep” has you going back and forth between two interpretations: UP meaning people being up/awake, but then you read “asleep.” The number is encoded on the vertical axis as more people meaning the line going higher. So more up meaning more people asleep, fewer people being up. I remember some confused tweets from people struggling with this when this made the rounds.

Jawbone also seems to have noticed, since in their recent posting on the Napa earthquake, they flipped the axis to make the semantic connection easier to follow. Now it’s “% UP wearers awake,” which makes a lot more sense. More up, more people are awake or, well, up.

The archetype of these visualizations, the New York Times’ How Different Groups Spend Their Day also works like this: the bottom-most layer, and thus the baseline of sorts, is sleep. As it should be.

Which Quadrant

This chart of men’s vs. women’s earnings that I wrote about recently also uses the vertical axis in a simple, yet smart way. It has men’s earnings on the horizontal axis, and women’s on the vertical. That is the only way this makes sense, even if technically the other way around would be just as correct.

The difference is the message the majority of the points send. If women’s earnings were on the vertical axis, those points would be in the upper left quadrant. Up is good, right? So where’s the problem? Placing them in the lower left makes this much more obvious to read. The lines representing women making 10%, 20%, and 30% less also would be quite strange if they were to the top right of the main diagonal.

Bar Charts

I already wrote about this topic in the specific case of bar charts, but it bears repeating. Bars pointing down are unusual, and they grab the viewer’s attention. They can help get a point across and help people read the chart more easily.

Larger numbers being up in line charts, bar charts, scatterplots, etc., may be the default in practically all visualization tools (and that makes sense), but it should not just be accepted without thinking about it. The vertical direction should be chosen with care, because it communicates a lot about how to read a chart. And getting it wrong can cause considerable confusion.

My Favorite Charts

Thu, 09/04/2014 - 04:38

Categories:

Visualization

There are many charts I hate, because they’re badly done, sloppy, meaningless, deceiving, ugly, or for any number of other reasons. But then there are the ones I keep coming back to because they’re just so clear, well-designed, and effective.

All of these are a few years old. Insert your favorite fine wine analogy here: it probably takes a while – a chart coming up again and again in conversation, and when looking for examples – before you realize how good it is.

Scatterplot

My favorite scatterplot, and perhaps my favorite chart ever, is Why Is Her Paycheck Smaller? by Hannah Fairfield and Graham Roberts. It shows men’s versus women’s weekly earnings, with men on the horizontal axis and women on the vertical. A heavy black diagonal line shows equal wages, three additional lines show where women make 10%, 20%, and 30% less. Any point to the bottom right of the line means that women make less money than men.

The diagonal lines are a stroke of genius (pun fully intended). When you see a line in a scatterplot, it’s usually a regression line that models the data; i.e., a line that follows the points. But such a line does nothing to help with judging the differences between the two axes – something we’re not good at, and not typically what a scatterplot is used for anyway.

But the diagonal line, as simple as it is, makes it not just possible, but effortless. It’s such a simple device and yet so clear and effective. All the points on the line indicate occupations where men and women make the same amount of money. To the top left of the line is the area where women make more money than men, and to the bottom right where women make less.

The additional lines show where women make 10%, 20%, and 30% less. If it’s already hard to tell whether a point lies on the main diagonal of a scatterplot, guessing the percentage by which it is off is impossible. The additional lines make it possible to estimate that number to within a few percent. That is a remarkable level of precision, and it is achieved with three simple lines.
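Each reference line is just the locus where women's pay is a fixed fraction of men's, i.e., y = (1 - gap) * x. A minimal sketch with invented weekly earnings (all numbers below are made up for illustration) shows how the lines turn a hard perceptual judgment into simple banding:

```python
# Hypothetical (men, women) weekly earnings for a few occupations.
points = {
    "occupation A": (1000, 980),
    "occupation B": (1200, 1020),
    "occupation C": (900, 650),
}

for job, (men, women) in points.items():
    gap = 1 - women / men          # fraction by which women earn less
    band = int(gap * 10) * 10      # which 10% reference band the point falls in
    print(f"{job}: women earn {gap:.0%} less (between the {band}% and {band + 10}% lines)")
```

Reading a point against the nearest line amounts to exactly this one division; no cross-axis estimation is required.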

There is some interactivity: mousing over points brings up a tooltip that shows the occupation the point represents and how much more one gender makes than the other. Filters in the top left corner let you focus on just a small number of occupations, which include annotations for a few select jobs.

But the key element is the inclusion of the reference lines that help people make sense of the scatterplot and read it with a high level of precision. Simple but effective, and powerful.

Line Chart

My favorite line chart is The Jobless Rate for People Like You by Shan Carter, Amanda Cox, and Kevin Quealy. This chart is somewhat ancient, having been created in Flash and showing unemployment data from January 2007 to September 2009. But its brilliant design and interaction make it timeless.

It’s a line chart, but with a twist. The first thing you see is the heavy blue line, showing the overall unemployment rate. But there are more lines in the background, what are those? So you mouse over and they respond: they light up and there’s a tooltip telling you what they represent. Each is the unemployment rate for a subset of the population, defined as the combination of race, gender, age group, and education. How are hispanic men over 45 with only a high school diploma doing compared to the overall rate? What about black women 15–24? Or white college grads of any age and gender?

Clicking on a line moves the blue line there so it’s easier to see, but the overall rate stays easily visible. The y axis also rescales nicely when the values go above what it can currently display.

In addition, the filters at the top also respond to the selection to show who is selected. Clicking around inside the chart updates them. Hm, so maybe I can use those to explore too? And of course you can, broadening or narrowing your selection, or clicking through different age groups of the same subset of the population, etc.

The Human-Computer Interaction field has a nice term for an indication of more data and interaction: information scent. The term is usually used for widgets that indicate where more information can be found (like the little tick marks on the scrollbar in Chrome when you search within the page). What makes this chart so good is its clever use of information scent to entice viewers to dig deeper, explore, and ask questions.

It also brilliantly and clearly demonstrates the fact that the overall unemployment rate is a rather meaningless number. The actual rate in your demographic is likely to look very different, and the range is huge. This was the inspiration for my What Means Mean piece, though I don’t think that was nearly as clear as this.

The chart shows interesting data, explains a somewhat advanced concept, and invites people to interact with it. This comes in a package that is understated and elegant in its design. Best line chart ever.

Bar Chart

I have already written about the Bikini Chart, and it remains my favorite bar chart. It’s an incredibly effective piece of communication, and it’s all just based on a simple time series. The fact that the bars point down clearly communicates how it is supposed to be read: down is bad, less down is better than more down.

Bar charts are not exactly a common medium for artistic expression, but the designers of this chart managed to subtly but clearly get a message across.

Bubble Chart/Animated Scatterplot

Animated scatterplots may not have been invented by Hans Rosling and gapminder, but they certainly were not a common thing until his TED talk in 2007. And while it may seem a bit facetious to point to the only reasonably well-known example of a particular chart type as my favorite one, this is clearly one of my favorite charts, no matter what type.

The animation may seem like a bit of a gimmick – and it has been criticized as not being terribly effective – but it works to communicate a number of important pieces of information.

The main piece of information, of course, is change over time. How have different countries changed in terms of their wealth, healthcare, etc.? This is reasonably effective, because there are trends, and many countries follow them. The outliers are reasonably easy to spot, especially when you can turn on trails and replay the animation. It’s not always immediately possible to see everything, but it does invite people to play and explore.

But then, there are the explanations. There is the clever animation that constructs the two-dimensional scatterplot from a one-dimensional distribution. There is the clever drill-down animation that breaks continents down into countries, and countries down into quintiles, to show the enormous range of values covered by each. This is not just a simple data display, but a way to introduce people to statistical concepts and data operations they may have heard of but don’t understand (drill-down), or never have heard of in the first place (quintiles).

Rosling’s video, and the gapminder software, not only introduced millions of people to data they knew nothing about (the video has over 8.5 million views!), it also demonstrated how a compelling story can be told without a single photograph or other image, just with data. That is an incredible achievement that opened our eyes to the possibilities of data visualization for communication.

Appreciating Good Work

It’s easy to find, and make fun of, bad charts. But between all the pie chart bashing and general criticism of bad charts, it is equally important to find the good examples and try to figure out what makes them work so well. Even if it may be more fun to beat up the bad examples, we will ultimately learn more from understanding the design choices and ideas that went into the good ones.