November 25th, 2008 — 11:57pm
In trying to develop a visual statistical system, it is frustrating to deal with the many limitations of common statistical tests: normality, equal variance, or equal sample size conditions. These conditions make the tests brittle; that is, they only work on a subset of all interesting data sets, and it is difficult to determine which subset. How should this brittleness be accounted for in a statistical system? One possibility is to run additional tests and to warn the user if some condition is not met. In practice, warnings would occur frequently, and it would be up to the user’s judgment to decide if they were meaningful. To overcome this issue, one could imagine a statistical system that attempts to mimic the decisions of an expert statistician. It would look at a battery of test results and automatically determine what additional tests to run depending on which conditions were met. How would one train such a system? And, I think more importantly, how would one evaluate the uncertainty or error in the results of such a complicated system?
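As a rough illustration of the "run additional checks and warn the user" option, here is a minimal sketch using only Python's standard library. The function name `check_conditions` and the variance-ratio threshold of 4 are assumptions made for this example, not established cutoffs, and a real system would also need some kind of normality check.

```python
import statistics


def check_conditions(a, b, max_variance_ratio=4.0):
    """Return a list of warnings about two samples (hypothetical helper).

    The threshold is an illustrative assumption, not a standard value.
    """
    warnings = []

    # Equal sample size condition.
    if len(a) != len(b):
        warnings.append(f"unequal sample sizes: {len(a)} vs {len(b)}")

    # Equal variance condition, checked with a crude ratio of sample variances.
    low, high = sorted((statistics.variance(a), statistics.variance(b)))
    if high > max_variance_ratio * low:
        warnings.append("sample variances differ by more than the allowed ratio")

    return warnings


if __name__ == "__main__":
    for message in check_conditions([1, 2, 3, 4, 5], [0, 10, 20, 30]):
        print("warning:", message)
```

Even this toy version hints at the problem described above: the user still has to decide whether each warning matters.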
I recently came across the idea of robust statistics, which originated with Tukey and associated statisticians in the middle of the last century. This concept appears here in the introductory chapter to Mosteller and Tukey’s Data Analysis and Regression. The goal is to find statistical tests that work on a broad range of data distributions. From a system implementation standpoint, this approach is much preferable to user input or a complicated expert system. However, I have not seen these techniques used in any of my Stats classes. Have they been superseded by Bayesian or bootstrapping approaches?
The other principal topic in this first chapter is “vague concepts”. The authors give the example of standard deviation, which is a very specific method for measuring the spread of a distribution. However, to evaluate our use of the standard deviation we must be able to place it in the context of all other possible measures of spread. This meta-level or “vague concept” is lost in many introductions to statistics.
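To make the idea of "spread" as a vague concept more concrete, here is a small sketch that computes three different measures of spread for the same sample. The choice of interquartile range and median absolute deviation as the alternatives, the function name `spread_measures`, and the sample data are all my own assumptions for illustration; they are not taken from Mosteller and Tukey.

```python
import statistics


def spread_measures(values):
    """Return several measures of spread for one sample."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    med = statistics.median(values)
    # Median absolute deviation: the median distance from the median.
    mad = statistics.median([abs(v - med) for v in values])
    return {
        "stdev": statistics.stdev(values),
        "iqr": q3 - q1,
        "mad": mad,
    }


if __name__ == "__main__":
    clean = [9, 10, 10, 11, 10, 9, 11, 10]
    with_outlier = clean + [100]  # one wild value added
    print(spread_measures(clean))
    print(spread_measures(with_outlier))
```

With the single outlier added, the standard deviation grows enormously while the other two measures barely move, which is one way of seeing why a robust approach is attractive for an automated system.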
Comment » | Uncategorized
January 27th, 2007 — 2:56pm
Pete Shirley has invited people to follow along with the assignments in his Intro to Graphics course covering the Reyes architecture.
Pat Hanrahan’s new intro course is covering an eclectic mix of topics. You can also check out his interesting use of a course Wiki.
At UNC, another former BYU student, Brandon Lloyd, is teaching a more typical intro course covering rasterization and raytracing.
And back at BYU, Robert Burton’s intro course is asking students to reimplement the OpenGL pipeline.
January 25th, 2007 — 1:30am
I’ve been meaning to get back to TreeMaps for a while. The recently unveiled Many-Eyes includes a TreeMap visualization. Martin Wattenberg (who did some research on TreeMap layouts) provides a typical example:


My main complaint with TreeMaps is their ugliness. Every TreeMap I have seen, including this one, looks disorganized. Basically, we can only see the top level of the hierarchy, which is labeled and indicated with strong dark lines. All other levels disappear into the patchwork mess. So one of the “strengths” of TreeMaps, the ability to directly view the hierarchy, is effectively neutered.
The layout also ignores good graphic design. Elements are not aligned and they do not visually cluster in meaningful ways. This is in contrast to indented lists, another common way to show hierarchical data, which use alignment and clustering very effectively to communicate the organization of the hierarchy.
More effective use of whitespace could dramatically improve the appearance of TreeMaps, but so few people use it well. Here’s an example that does. Also notice how the alignment makes the diagram look very organized.
Next time: the travesty of Cushion TreeMaps.
January 25th, 2007 — 12:50am
A couple of weeks ago I attended a small Visualization workshop organized by the DHS and the CIA in New Mexico. It was my first time in the state. I met a number of interesting people in the Visualization community, including David Salesin, from Adobe, and Stephen Few.
The discussion focused on how documents and presentations could be produced more effectively and efficiently. Two main issues were raised:
- Best practices need to be codified.
  - There is a lot of research on how to communicate, spread across a large number of fields including education, rhetoric, HCI, visualization, cognition, etc.
  - Having this knowledge consolidated and organized would be helpful for people.
  - With this knowledge, programs could be developed to automatically apply (or suggest) best practices, reducing the time needed to create an effective document.
- Documents need more background.
  - The analysis process behind a document is often as important as the document itself (which contains just the final conclusions). If we had a way to track the analysis, it would be possible for document readers to drill down into questionable conclusions. It would also be possible to check the analysis for consistency with changing conditions. For example, if the document relied on a piece of intelligence that was later shown to be false, the entire document could be flagged (automatically?) as “Overtaken by Events”.
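The automatic flagging idea can be sketched as a small dependency graph: record what each conclusion or document relies on, then walk the graph when a source is retracted. This is only a sketch under my own assumptions; the class name `AnalysisTracker`, its methods, and the item names are hypothetical and do not describe any system discussed at the workshop.

```python
from collections import defaultdict, deque


class AnalysisTracker:
    """Tracks which items rely on which sources (hypothetical design)."""

    def __init__(self):
        # Maps a source to the set of items built directly on it.
        self._dependents = defaultdict(set)

    def add_dependency(self, item, relies_on):
        self._dependents[relies_on].add(item)

    def flag_overtaken(self, retracted):
        """Return every item that directly or indirectly relies on `retracted`."""
        flagged = set()
        queue = deque([retracted])
        while queue:
            current = queue.popleft()
            for dependent in self._dependents.get(current, ()):
                if dependent not in flagged:
                    flagged.add(dependent)
                    queue.append(dependent)
        return flagged


if __name__ == "__main__":
    tracker = AnalysisTracker()
    tracker.add_dependency("finding-A", relies_on="source-1")
    tracker.add_dependency("final-document", relies_on="finding-A")
    tracker.add_dependency("other-document", relies_on="source-2")
    print(sorted(tracker.flag_overtaken("source-1")))
```

The hard part in practice is capturing the dependencies in the first place; once they exist, the flagging itself is a simple traversal.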
January 5th, 2007 — 11:52am
TreeMaps are a popular hierarchical visualization technique developed by Ben Shneiderman. Despite the emphasis they have received in visualization literature over the last decade, TreeMaps remain a very limited tool. In the next few days, I want to explore why this is true.
December 18th, 2006 — 10:05pm
Today I came across the website of Marcia K. Johnson, a professor of Psychology at Yale. She links to PDFs of every single one of her publications all the way back to 1972. That’s impressive.
I was looking for her paper “Contextual prerequisites for understanding: Some investigations of comprehension and recall”, and was very surprised to find it online. Typically, anything written before about 1990 can be very hard to find online (unless it’s in the ACM Digital Library).
December 16th, 2006 — 3:54am
Today I read Animation: can it facilitate? by Tversky, Morrison, and Betrancourt. They argue that animation has little power to “facilitate comprehension, learning, memory, communication and inference” since it violates the Apprehension Principle.
I think they’re right. Comprehension requires time to analyze the graphic and to develop a mental model. An animated graphic is continually presenting new information, interrupting our attempts to understand.
When I’ve used animations or movies in my own learning, I find myself having to compensate by replaying them multiple times, pausing at key frames, etc. In essence, I’m trying to build a static understanding of the animation.
December 15th, 2006 — 12:45am
Do we really need this word?
December 15th, 2006 — 12:37am
My brother pointed me to this updated SpaceBall product. There was always an old SpaceBall lying around in the lab at BYU. It had some mysterious connector that wouldn’t work with any of the machines in the lab, so it never got used. Are these at all useful for 3D work? It seems like Maya’s interface is pretty simple. Mouse plus Ctrl, Alt, or Shift to choose pan, rotate, or zoom (not necessarily in that order). Does anyone have experience using a SpaceBall?
December 14th, 2006 — 7:32pm
Placing labels on maps is a hard problem. Although quite a bit of work has been done on how to place labels so that they don’t overlap, not much work has been done on how to place labels in an aesthetically pleasing manner. In an effort to understand what’s already in production, I decided to make a brief comparison of Google Maps and Microsoft’s Live Maps.
The two maps compared are for the Redmond, Washington area where I lived for the past year. You can compare them yourself by opening Google’s and Microsoft’s versions of the same area. I was looking at them on a relatively large monitor, so you may need to scroll around a little bit to see the same things.
- Initial Impressions. Microsoft uses a subdued color scheme which reminds me of Eduard Imhof’s maps. The parks could be slightly greener, but otherwise it’s very nice. Google’s color scheme is more saturated and, to my eye, not as clean. On the other hand, Microsoft’s map is noticeably blurry, apparently caused by too much anti-aliasing. The small, unlabeled roads in Google are too strong; they distract from the rest of the map. The unlabeled roads in Microsoft are lighter, which looks better. However, some of their labeled roads are also very light! It’s often unclear which street a label is associated with. As a general rule: if the street is labeled it should be dark; if not, it should be very light or not rendered at all.
- Label Selection. One of the most noticeable differences is the types of labels shown at this zoom level. Microsoft labels a lot of streets, but only one park out of dozens. Google’s map labels almost all the parks and only labels the largest streets. Park names are useful as landmarks, but they can’t be nearly as useful as street names. Why waste screen space on them?
- Label Clutter. Since Microsoft shows so many street labels, it can suffer from unsightly cluttering, although labels never actually overlap. Notice how the meeting of “NE 85th St” and “116th Ave NE” at right angles looks awkward. Also, the coincidence of the top of the ‘E’ in “124th Ave NE” with the edge of the freeway is distracting. Also notice the lack of any alignment and the seemingly arbitrary top-to-bottom vs. bottom-to-top reading of the vertical text.
- Label Layout. Microsoft seems inconsistent in placing street labels adjacent to or directly over the road. Google always places the text over the street.
- Text Layout. In both maps, the street label text follows the path of the street. This leads to some very strange text, especially on Microsoft’s map. “Sammamish” is really one word. An obvious solution would be to smooth the path so this doesn’t happen. The rotation and baseline of a label’s text should change smoothly over the label. A related problem is that of character spacing. Notice that the ‘i’ has almost disappeared from the word “Sammamish”.
- Label Contrast. Both maps render a background behind the text to make the label stand out from the map. Google’s is a nearly opaque white border (or yellow, for main roads). Microsoft uses a much more subtle mask. Visually, I prefer Microsoft’s approach. However, I did notice that in many places Microsoft’s mask would obscure the road, making the label appear unassociated with anything on the map. Google neatly avoids this problem by making the text border the same color as the road (yellow); this provides more visual continuity to the road even when it is obscured by the label.
- Labeling Areas. Labeling of areas doesn’t work very well as you zoom into the area. For example, city labels remain even after you have zoomed so that the city fills up almost the entire screen. It appears that the city label is identifying a specific intersection, rather than an entire area. I’d be interested to see how traditional maps deal with this problem.
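The path-smoothing suggestion from the Text Layout item can be sketched in a few lines. I’m using Chaikin’s corner-cutting scheme here purely as one convenient smoothing method; I have no idea what either map actually does, and the function names and the sample street path are made up for the example.

```python
import math


def chaikin_smooth(points, iterations=2):
    """Smooth an open polyline with Chaikin's corner cutting, keeping the endpoints."""
    pts = list(points)
    for _ in range(iterations):
        if len(pts) < 3:
            break
        smoothed = [pts[0]]
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            # Replace each segment with points at 1/4 and 3/4 along it.
            smoothed.append((0.75 * x0 + 0.25 * x1, 0.75 * y0 + 0.25 * y1))
            smoothed.append((0.25 * x0 + 0.75 * x1, 0.25 * y0 + 0.75 * y1))
        smoothed.append(pts[-1])
        pts = smoothed
    return pts


def max_turn_degrees(points):
    """Largest change in heading between consecutive segments, in degrees."""
    headings = [
        math.atan2(y1 - y0, x1 - x0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]
    turns = []
    for a, b in zip(headings, headings[1:]):
        diff = abs(b - a) % (2 * math.pi)
        turns.append(min(diff, 2 * math.pi - diff))
    return math.degrees(max(turns, default=0.0))


if __name__ == "__main__":
    street = [(0, 0), (10, 0), (10, 10)]  # a street with one sharp corner
    print(max_turn_degrees(street))
    print(max_turn_degrees(chaikin_smooth(street)))
```

Each character of a label could then be rotated to the heading of the smoothed path at its position, so that no two neighboring letters differ by a sharp angle. This does nothing for the character spacing problem, which would need separate handling.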