Category Archives: Visualization Tools

Information Visualization: Spring 2014

Course description & overview
Over the course of the semester, students will develop the critical capacity to assess and develop accurate, engaging information visualizations for journalism. In addition to readings about the art and science of information visualization, students will complete a series of critiques of published graphics, in the form of rationalized redesigns of the original work. For these assignments, students will be given a range of acceptable media (physical or digital) in which their solution may be executed; each revision will also be accompanied by a written justification of the design choices it includes. Weekly readings should be summarized either in paragraph or bullet-point form; these summaries are due in aggregate at the end of the semester, though students are expected to complete the readings in advance of the class session for which they have been assigned. Students will also have weekly individual programming assignments and a group final project which will be reviewed in an industry critique at the end of the semester.

Weekly class sessions will be held on Tuesday and Wednesday mornings. Students are expected to arrive promptly for all class meetings and trainings, and will be graded on both their attendance and participation. Students are encouraged to engage their prior knowledge and experience in all aspects of the class, including readings summaries and critiques, as well as in-class and online discussions. Students must complete all readings summaries, programming and design assignments, as well as their final project, in order to pass the class.


Week 1
Lab
Introductions, expectations and class overview, the purpose of programming.
Installing & configuring Aptana Studio 3, FileZilla, introduction to GitHub
Publishing your first webpage to your CUIT server space; overview of webpage structure.
Videos: HTML Basics
Readings due:

Seminar
Visualization as communication, visual shorthand.
First design assignment outlined.
“In the news/room” group problem-solving exercise.
Readings due:


Week 2
Lab
Programming basics and data format familiarity
Readings due: JSON, XML, Programming Parts of Speech
Videos: The browser console, data types, and getting started with GitHub
In-class resources: Unix & GitHub Quick Reference

Seminar
Visual understanding as science and influence. “In the news/room” group problem-solving exercise.
Readings due:

 


Week 3
Lab
More programming basics: functions, conditionals and loops.
Readings due: Understanding Functions, AJAX
Videos: Loops, conditionals and functions

Seminar
Visualization for inference & drawing conclusions from visual information: histograms, choropleth maps.
Readings due:

 


Week 4
Lab
Basic HTML + CSS sizing and positioning; Bootstrap style framework.
Readings due: API, CSS
Videos: API data, jQuery, AJAX & divs

Seminar
Basic best practices in visualization for journalism
Readings due:

 


Week 5
Lab
Hitting the playground, getting to “ready” with the Google Charts API
Readings due: Google Line Chart documentation: Overview, Example, Loading
Videos: Page configuration & data formatting for Google Visualization; chart rendering.

Seminar
Principles of data journalism and information design
“In the news/room” group problem-solving exercise


Week 6
Lab
Using Google Fusion Tables as a data source
Readings due: Google Line Chart documentation: Configuration Options
Videos: Using Google Fusion Tables with SQL queries as a dynamic data source for Google Visualizations.

Seminar
In the newsroom
Readings due:

 


Week 7
Lab
Customizing interaction with annotations and rollovers
Readings due: KML
Videos: Mapping with Google Fusion Tables

Seminar
Lessons from professional practice
Readings due:

 


Spring break


Week 8
Lab
Basics of data analysis.
Readings due: mean, median, outlier, normal distribution, histogram
Videos: Data analysis essentials with spreadsheets and OpenRefine.

Seminar
Final project pitch discussions


Week 9
Lab
Basic page interactivity and layout with jQuery and CSS

Seminar
Designing the design process
Readings due:

 


Week 10
Lab
Responsive designs with Twitter Bootstrap

Seminar
Essentials of information layout and design.
Readings due:

  • Thinking With Type by Ellen Lupton. Princeton Architectural Press, 2010.
    Letter: Anatomy, Size, Scale, Type Classification, Type Families, Superfamilies; Text: Tracking, Line Spacing, Alignment, Hierarchy; Grid: Grid as Program, Grid as Table, Multicolumn Grid, Modular Grid
  • Content-Out Layout by Nathan Ford. A List Apart, 2014.

 


Week 11
Lab
Optimizing code with the principles of DRY (“Don’t Repeat Yourself”) and “convention over configuration”. Using these to have data selections respond to user input.
Videos: Revising and editing your code for efficiency and modularity by applying conventions and the “DRY” (Don’t Repeat Yourself) principle. Adding responsiveness to user actions.

Seminar
Interaction and navigation, designing for users at different levels of experience
Readings due:

 


Week 12
Lab
Updating the URL for sharing user-selected content; adding annotations and customized tooltips.
Videos: “Stateful” URLs for sharing particular views of an interactive; adding annotations for context.
Seminar
Q&A with Lam Thuy Vo, Al Jazeera America
Lam Thuy Vo is a multi-platform reporter and editor currently leading Al Jazeera America’s interactive team where she produces and edits multi-platform stories. She started her journalism career at the Wall Street Journal as a videographer, spearheading the publication’s video operations in Asia, and has also worked as a producer and reporter for Planet Money, telling economic stories with charts, videos and other visuals.

Q&A with Rani Molla, WSJ Visual Journalist and CUJ ’12 graduate.


Week 13
Lab
Targeted final project assistance.

Seminar
Open work day


Week 14
Lab
Final project open lab

Seminar
Final project in-class critique, feedback from instructors and guests


Week 15
Lab
Final project open lab

Seminar
Final project presentations to industry; final project reflections due

Exploring America: Visualizing Data with Google Fusion Tables

Interactive maps and charts can be a great way to add interest and visual appeal to a primarily textual work of journalism, and in a newsroom there are many times when it’s only after the article has been largely written that we think to create them. But visualizations can also serve as essential reporting tools in and of themselves, allowing journalists to see patterns in otherwise impenetrable data sets – patterns that can provide essential leads to finding interesting stories in the first place.

This habit of pattern-finding and data analysis is already well known in the area of journalism known as “computer-assisted reporting” (CAR) and, where it exists, it is usually the purview of perhaps one or two specialist reporters. Typically, the tools associated with CAR have been both reasonably expensive and time-consuming to learn: database technologies like Microsoft Access; mapping technologies like ArcGIS. Few individuals and not all organizations could realistically afford the software, training and personnel required to do this kind of work.

At the infamously (and in this case, happily) exponential pace of technological evolution, however, there are now free, user-friendly tools whose power and versatility are rapidly surpassing the more “traditional” tools of CAR. A prime example of this is Google Fusion Tables, which combines the essential functions of a database (large-scale data storage, powerful sorting and filtering, the ability to merge tables) with the robustness of Google’s mapping resources (and their many charting tools).

A quick walk-through of how to use the merging and mapping functions is provided through the Fusion Tables Help, of which the below is an edited and annotated version. To follow along, you’ll need to be logged in to a Gmail account.

To start, click on each of the links below (these should come up in new browser tabs):

110th US Congressional District Outlines

2008 1-year American Community Survey (ACS) Data

You’ll notice that these pages basically look like giant spreadsheets, and this is essentially what they are. Looking at the column headers, take a moment to notice that the formats of the Outlines table’s “id” column and the ACS table’s “Two-Digit District” column are very similar.

Looking a little further, you’ll see that in the upper-right there is a count of the number of rows in the table (e.g. “1-100 of 436” for the ACS data); clicking on the “Next>>” link will show you the next 100 rows.

Just below the “Next>>” link, in the gray title bar, you’ll see a square button with a small triangle in it – this is the subtle clue you’re given that there are more columns of data than what you currently see. If you begin clicking this in the ACS data set, you’ll quickly discover the many, many columns that this table contains.

Browsing that many columns of data is tedious, though, and makes the data all but impossible to analyze. So let’s get it mapped instead so we can really see what’s here.

In the Outlines table, click the “Merge” link in the blue bar. In the box on the left, you see radio buttons for each of the columns in that table, with “id” selected. Above the empty box on the right are the instructions: “Merge with” followed by an input box and a GET button. Ignoring the dropdown, paste the url of the ACS table (http://www.google.com/fusiontables/DataSource?dsrcid=237928) in the input box and click “GET”.

 

 

The right-hand box now contains radio buttons for every column in the ACS table – but what it wants you to do is tell it how to match up the information in the two tables. Remember how “id” and “Two-Digit District” looked pretty similar? Make sure those are the radio buttons selected and then type a name for your about-to-be-merged table into the input box labeled “Save as a new table named”. Finally, click the “Merge Tables” button.
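Conceptually, the merge described above is a join on a shared key column. The following is a minimal sketch in Python, not a description of how Fusion Tables works internally: the rows, the key format (“NE-03” and so on), the “Total Population” column and the `merge_tables` helper are all invented for illustration, and it assumes that only rows with a match in both tables are kept.

```python
def merge_tables(left_rows, right_rows, left_key, right_key):
    """Join two lists of row dicts wherever left_key equals right_key."""
    # Index the right-hand table by its key column for quick lookup.
    right_by_key = {row[right_key]: row for row in right_rows}
    merged = []
    for left in left_rows:
        match = right_by_key.get(left[left_key])
        if match is not None:
            # Combine the columns of both rows into one wider row.
            merged.append({**left, **match})
    return merged


# Invented sample rows standing in for the Outlines and ACS tables.
outlines = [
    {"id": "NE-03", "geometry": "outline of NE-03"},
    {"id": "CA-45", "geometry": "outline of CA-45"},
    {"id": "FL-25", "geometry": "outline of FL-25"},
]
acs = [
    {"Two-Digit District": "CA-45", "Total Population": 100},
    {"Two-Digit District": "NE-03", "Total Population": 200},
]

merged = merge_tables(outlines, acs, "id", "Two-Digit District")
for row in merged:
    print(row["id"], row["Total Population"])
```

In this sketch the FL-25 outline has no matching data row, so it drops out of the result; this is why it matters that the two key columns really do use the same format.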

 

 

 

Very often a red box with the warning “Could not merge tables” will appear for a moment. Ignore this and wait for the page to finish loading.

You’ll now be looking at a new Fusion Table with the name you entered earlier. The columns from the first table have a white background; those from the merged or “joined” table are pale yellow. Because we have connected outline information with data, we can now see that data visualized by selecting “Visualize >> Map” in the blue bar. After several seconds, you’ll see a bright red Google Map of the U.S., with gray outlines marking the 110th Congressional District boundaries.

 

Click somewhere on the map. After a moment, a balloon pops up with a readout of the first 10 columns of table data for that area. That’s a little useful, but if we were interested in data only about one district, it would have been just as easy to read from the table. Instead, click the “Configure styles” link above the map.

In the “Configure map styles” popup, click on “Fill color” under the “Polygons” header at left. To the right, click “Buckets” and then select the “Divide into 2 buckets” radio button. Next, click on the “2” dropdown and change your number of buckets to “4”. Open the “Column” dropdown below this and you can quickly scroll through all of the data columns available in the ACS table. To start exploring, select one and click the “Save” button at the bottom. After a moment, a yellow “Map style saved” label will appear above the map, and it will be recolored according to your selection. To see how other columns of data map, simply click “Configure styles” again, select a different column of data from the dropdown, and click “Save”.

A few notes:

  • Keep in mind that the default “bucket” ranges (0-25, 25-50, 50-75 and 75-100) may not be ideal for the particular data column you’ve chosen. You may need to adjust these values in order for the map to be meaningful (or even show any color variation at all).
  • Also note that the default “bucket” colors should really be shades of a single color, rather than 4 distinct colors. Any time you are mapping the intensity of a single value, it should be indicated by intensity of a single color, not multiple colors.
  • Why doesn’t your chosen data point show up in the little balloon? You need to adjust its contents by clicking the “Configure info window” link above the map. There you can use the check boxes to select the exact column information you want to appear. Select a few on the left, click “Save” and then click on the map again.
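The first two notes can be sketched in a few lines of Python. This is only an illustration of the idea, not Fusion Tables’ own logic: the sample values are invented, the `bucket_index` helper is hypothetical, the hex codes are simply four shades of red running from light to dark, and the alternative ranges are the ones used in the rent-burden example later in this post.

```python
from bisect import bisect_right


def bucket_index(value, edges):
    """Return the 0-based bucket for value, where edges lists the bucket boundaries."""
    i = bisect_right(edges, value) - 1
    # Clamp so values at or beyond the outer edges land in the first or last bucket.
    return max(0, min(i, len(edges) - 2))


# Four shades of a single color, light to dark (one per bucket).
SHADES_OF_RED = ["#fee5d9", "#fcae91", "#fb6a4a", "#cb181d"]

default_edges = [0, 25, 50, 75, 100]   # the default bucket ranges
adjusted_edges = [0, 30, 45, 60, 100]  # ranges tuned to the data

# Invented percentages for six imaginary districts.
values = [29.0, 38.2, 44.1, 47.0, 52.3, 61.4]

default_buckets = [bucket_index(v, default_edges) for v in values]
adjusted_buckets = [bucket_index(v, adjusted_edges) for v in values]

print("default ranges use", len(set(default_buckets)), "of 4 colors")
print("adjusted ranges use", len(set(adjusted_buckets)), "of 4 colors")
for v, b in zip(values, adjusted_buckets):
    print(v, SHADES_OF_RED[b])
```

With these made-up numbers, the default ranges put every district into just two of the four buckets, while the adjusted ranges spread them across all four, which is the difference between a nearly uniform map and one that shows variation.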

After playing around with the data for a little while, you’ll probably find some interesting data points. I’m always interested in rent burden and housing affordability, so I chose the very last column of data, “Percent of Renter-occupied Units Spending 30 Percent or More of Household Income on Rent and Utilities”. After using the table view to sort the data (by clicking on the column header) and find its minimum and maximum values, I adjusted my ranges to be 0-30, 30-45, 45-60, and 60-100, colored in shades of red. The result is a map that shows a few interesting things – at first glance, we note that Nebraska’s 3rd Congressional district is the only one in the country where more than 70% of renter households live in affordable housing, and that two of the most rent-burdened districts are California’s 45th and Florida’s 25th – not districts in New York, San Francisco, Los Angeles or other notoriously expensive cities. What’s going on here? This data alone won’t tell us, but it has given us a lead towards what might be an interesting story.

Once you’ve done the rest of your research and discovered some of the “why” behind your visualization, you’ll want to make sure your readers have your data at their disposal. To add the map to your page/site you’ll need to do 2 things:

1. Share it. In the upper-right corner of the map or table view, you’ll see a “Share” button that brings up a popup. In the bottom half of this window, three radio buttons list the “Visibility options”. To add your map to a webpage, you’ll need to make it at least “Unlisted”, if not “Public”.

 

 

 

 

 

 

2. Embed it. Click “Get embeddable link” above the map and a small scrolling window is revealed above the map with code that you can paste into an HTML page. To embed it in WordPress, as above, you’ll need an iframe plugin like Easy iFrame Loader installed. Using the revealed code, follow the directions to add the map to your post or page.
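For a sense of what embed code of this kind generally looks like, here is a minimal Python sketch that builds an iframe tag from a link. The URL is a placeholder, the `iframe_embed` helper and its default sizes are hypothetical, and the exact markup Fusion Tables gives you may differ, so use the code it provides rather than this.

```python
from html import escape


def iframe_embed(src, width=500, height=300):
    """Build an iframe tag that displays the page at src."""
    # Escape the URL so characters like & are valid inside an HTML attribute.
    safe_src = escape(src, quote=True)
    return (
        f'<iframe src="{safe_src}" '
        f'width="{int(width)}" height="{int(height)}"></iframe>'
    )


# Placeholder link; substitute the embeddable link for your own map.
EMBED_URL = "https://example.com/your-embeddable-link?a=1&b=2"
embed_code = iframe_embed(EMBED_URL, width=600, height=400)
print(embed_code)
```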

If you only want to email your link to a few people (and not have others be able to view it), you can leave the “Visibility options” on “Private” and share it either through the “Share” popup, or else email them the link made visible when you click “Get KML network link”.

So that’s a first round on using Google Fusion Tables to explore and share data sets through interactive mapping. There are many other features available here, though, so there will undoubtedly be more Google Fusion Tables fun to come.

Google Gadget: “Geomap”

After Gmail, Google is probably known best for mapping, and their “geomap” gadget is great for generating quick world and region maps that can either be embedded directly in a webpage to function interactively, or used statically via screenshot.

These maps work best at smaller sizes; even in the small image at left you can see that Florida looks a bit like a pointy pegleg hanging off the east coast, and the shapes get only more geometric and slightly bizarre as the size increases.

All that being said, however, the geomap is a nifty little tool for quickly generating shaded maps of the world or world regions. It can map data directly from a Google Spreadsheet with a simple click – simply put the country name in the first column and its corresponding numerical data value in the column next to it, for as many countries as you have (note that you can import .txt or .csv files, common formats in which governments and NGOs make data available, directly into Google Spreadsheets). Then simply copy the URL of the spreadsheet from your browser and paste it into the “Data source url” field here, select the “Show Legend” check box, and click the “Preview Changes” button at the bottom of the page. Voilà! Your data is mapped.
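If your data starts out somewhere other than a spreadsheet, a short script can put it into the two-column shape described above before you import it. This is a minimal sketch using Python’s standard `csv` module; the countries and numbers are invented, the `to_two_column_csv` helper is hypothetical, and the header row is an assumption (drop it if your import does not need one).

```python
import csv
import io


def to_two_column_csv(rows, header=("Country", "Value")):
    """Return CSV text with the country name first and its value second."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Invented sample data: (country name, numerical value).
sample_rows = [("Brazil", 10), ("India", 20), ("Kenya", 30)]

csv_text = to_two_column_csv(sample_rows)
print(csv_text)
```

Saving this text to a .csv file gives you something you can import into a spreadsheet in one step.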

The tool does a fairly good job of generating an appropriate color ramp (the term for the range of colors in the legend or key), though the top and bottom numerical values can be a bit messy. Still, it provides options to change the size of the map and to focus on specific regions or countries (via the “Map” dropdown), which provides decent flexibility. Just remember that to see each change, you have to click the “Preview Changes” button again.
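To make the idea of a color ramp concrete, here is a minimal Python sketch that blends between two colors according to where a value sits between the lowest and highest numbers in the data. It illustrates the general technique only and is not the gadget’s actual method; the `ramp_color` helper and the white-to-red endpoints are assumptions chosen for the example.

```python
def hex_to_rgb(color):
    """Convert a color like '#ff0000' into an (r, g, b) tuple of integers."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def ramp_color(value, low, high, start="#ffffff", end="#ff0000"):
    """Blend from start to end according to where value sits between low and high."""
    if high == low:
        t = 0.0
    else:
        t = (value - low) / (high - low)
    # Keep values outside the range pinned to the ends of the ramp.
    t = max(0.0, min(1.0, t))
    start_rgb = hex_to_rgb(start)
    end_rgb = hex_to_rgb(end)
    blended = tuple(round(a + (b - a) * t) for a, b in zip(start_rgb, end_rgb))
    return "#{:02x}{:02x}{:02x}".format(*blended)


for value in (0, 50, 100):
    print(value, ramp_color(value, low=0, high=100))
```

The lowest value gets the lightest color and the highest gets the darkest, with everything else shaded in between.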

Whether you use it statically or interactively, you must credit your source for the map. Map outlines constitute information produced by someone other than yourself – and borders may differ from one source to the next – so the source you use must be cited. That means don’t crop out the little blue line of text at the bottom right that says “Gadgets powered by Google”. Either keep the original or add it back if you use an image of it. If you embed it directly in a page via the “Get the Code” button, this credit should be included automatically.

This gadget is best suited to give an overall impression of how countries or regions vary on some high-level metric, like GDP or population. Data sets with extreme outliers, or ones where more granular geographic information is important, are not good candidates for this; they will be better suited to some of the visualizations that can be created through Google Fusion Tables. But for simple data sets with reasonably limited variation, a quick geomap may be just the thing.

Get on the grid

Grid design, subtraction.com

Whether you realize it consciously or not, every major website you visit (and every WordPress or other blog theme) is laid out according to a grid pattern similar to the one at left (which comes from subtraction.com, the blog of Khoi Vinh, former design director of NYTimes.com).

Why does everyone use grids? A few reasons:

  1. It makes decisions easier. Without a grid, you’d have to make all kinds of design decisions on the fly – like how big to make a photo, or where to place a text well. Following the grid reduces the number of options so you don’t have to agonize over every single choice.
  2. It looks better. Almost any layout – unless created by a skilled artist, designer or illustrator – will look amateurish if it’s not based on some kind of grid. The grid provides an underlying structure and organization that translates to a polished look.
  3. It offers a way to make certain elements pop. Interest and surprise can be created by very occasionally and very selectively “breaking” the grid. However, this means that the element – usually an image or illustration – that does the breaking should not itself be a rectangle of some kind. Photo cutouts of people or objects are the best candidates for this.

A great walkthrough of grid design is the presentation made by Vinh and Mark Boulton (both of whom have now written books on the subject) at SXSW Interactive in 2007, called Grids Are Good.

There is also a range of grid generators and templates that can be found around the web:

  • gridsystemgenerator will create and display a custom grid according to values you enter, and will even generate a complete HTML/CSS template set for you (I have not tried these).
  • 960.gs, by Nathan Smith, offers templates in a variety of formats (including Flash, which I’ve saved down to a CS3 format here). His system is the basis for the grids created with the generator above.
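The arithmetic behind a grid like this is simple enough to sketch in Python. The numbers below assume the common 12-column version of a 960-pixel layout, with 60-pixel columns and 20-pixel gutters between them; these values and the `span_width` helper are assumptions for illustration, so check them against whichever template you actually use.

```python
def span_width(columns, column_width=60, gutter=20):
    """Width in pixels of an element spanning the given number of grid columns."""
    if columns < 1:
        raise ValueError("an element must span at least one column")
    # Each spanned column contributes its width, plus one gutter between neighbors.
    return columns * column_width + (columns - 1) * gutter


sidebar = span_width(4)     # a narrow column, e.g. for navigation
main = span_width(8)        # a wide column, e.g. for the article text
full_row = span_width(12)   # everything across the page

print(sidebar, main, full_row)
# A sidebar and main column sitting side by side, separated by one gutter,
# fill the same width as a full 12-column row.
print(sidebar + 20 + main == full_row)
```

This is the sense in which a grid makes decisions easier: once a photo or text well has to span a whole number of columns, only a handful of widths are possible.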