Knight Digital Media Center Multimedia Training

Tutorial: Protovis Part 4: External data and animated labels

By Len De Groot

For updates and discussion on this tutorial, visit:
http://kdmc.berkeley.edu/tutorials/protovis-part-4-external-data/

Introduction

Now that you have mastered the basics of Protovis, it's time to learn how to load data from an external data file. Using external data has two obvious benefits. First, it simplifies the code, making it easier to work with. Second, the data can be updated without having to rework the code on a page.

We'll work with a JSON file. JSON stands for JavaScript Object Notation. It's similar to XML but it's simpler and results in a smaller file. It's very common on the Web, and you can often get JSON files directly from government sources.

This tutorial also demonstrates how to animate a bar chart and its corresponding labels to give readers deeper access to your data.

Example

This example loads demographic data from the 2000 Census and is viewable on modern browsers. This tutorial will walk you through the steps of making this graphic.

Population by race for California counties

Point to an external data file

Open the bar chart file and, in the head area, add a new script that loads an external data file. Notice the source URL doesn't have any http information. That's because we're going to use a relative link. This means that as long as the data file is in the same folder as the html file you're working on, Protovis will be able to use it.

We also need to set the type to text/javascript. We'll examine the data file more closely in a moment, but first let's do a little setup on our document.

Document setup

We're going to work with some demographic data for California counties, so perform a find for the word example and replace it with the word race in all instances.

Add two new variables, one for the width and one for the height. Make the values of both 150.

Add a variable for leftMargin and set the value to 100.

Replace the width and height values in the panel with the corresponding variables. Then change the width variable to 300.

Next, find the left property. At the very end is the number 26 that sets the left position of the chart. Change that to read leftMargin.

Using JSON data

Now let's look at our data. Open carace.js. What you are looking at is a JSON file.

JSON is a very common file format. The easiest way to create this type of file is to export a comma-separated value file, or CSV, from a spreadsheet. Then you can open the file in a text editor and paste its contents into a free CSV to JSON converter on the web.
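To make the conversion less mysterious, here is a rough sketch of what a CSV to JSON converter does. The function name csvToObjects and the sample rows are made up for illustration, and the sketch assumes a simple file with no quoted fields or commas inside values.

```javascript
// Minimal CSV-to-objects conversion (hypothetical helper, not part of Protovis).
// Assumes no quoted fields and no commas inside values.
function csvToObjects(csvText) {
  var lines = csvText.trim().split(/\r?\n/);
  var headers = lines[0].split(",");          // first row holds the column headers
  return lines.slice(1).map(function (line) {
    var cells = line.split(",");
    var row = {};
    headers.forEach(function (header, i) {
      var cell = cells[i];
      // Keep numeric-looking cells as numbers so they can drive bar lengths.
      row[header] = cell !== "" && !isNaN(cell) ? Number(cell) : cell;
    });
    return row;
  });
}

// Placeholder rows, not Census figures.
var sampleCsv = "County,White,Asian\nAlpha,1000,300\nBeta,4000,600";
var rows = csvToObjects(sampleCsv);
console.log(JSON.stringify(rows, null, 2));
```

A real converter also handles quoted text and other edge cases, so for anything beyond a quick test the web tools mentioned above are the safer choice.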

Look at the first block of data. If you were starting with a spreadsheet, each column header appears here as a label on the left, followed by a colon. To the right of the colon is the corresponding data from the first spreadsheet row. All of this is placed inside curly brackets and forms an element in what will be your data array.

The second element consists of the same column headers and the second row of corresponding data.

This continues until every row in your spreadsheet is an element and they are all separated by commas.

Square brackets are put around all the elements to make them into an array, and the array is assigned to the variable popbyrace.
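The layout described above can be sketched as follows. The field names County, White, Asian and Black are the ones used in this tutorial; the rows and numbers are placeholders, not the contents of carace.js.

```javascript
// Sketch of the data file's layout. Placeholder rows, not Census figures.
var popbyrace = [
  { County: "Alpha", White: 1000, Black: 200, Asian: 300 },
  { County: "Beta",  White: 4000, Black: 500, Asian: 600 },
  { County: "Gamma", White: 7000, Black: 800, Asian: 900 }
];

// Each spreadsheet row became one element; each column header became a key.
console.log(popbyrace.length);     // number of rows in the spreadsheet
console.log(popbyrace[1].County);  // the County value from the second row
```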


Go back to TextMate and replace the data value with popbyrace.

Working with scales

One of our goals is to create templates that can be reused quickly. One of the most time-consuming adjustments for reusing a chart is changing the scale to fit inside the panel. Fortunately, Protovis has included a class called pv.Scale to make this more dynamic and eliminate your need to update most settings. Let's start by adding the variable dataRange and make it equal to 5000000. The highest population (or data value) is about 4.7 million, so dataRange will be used to set the scale on the chart.

Add a new variable called yScale that we'll use to distribute the bars vertically. Remember, we're going to flip the chart on its side. We'll use the pv.Scale class and define the scale as linear because the bars will be spaced evenly along the Y axis. The scale starts at 0, then popbyrace.length counts how many elements make up the data array and sets that number as the end point. Next we set the range of the bars in the panel with a start point of 0 and distribute the bars over the height of the panel.

Nifty! Now let's use the same tool to define how each bar will stretch across the panel. Add a variable called total. This time, we're going to use pv.Scale to adjust the length of the bars based on the data values. The bars will start at 0 and extend to the value of dataRange. Next we set the width of the bars in the panel with a start point of 0 and extend the bars to the width of the panel minus the leftMargin.
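If you want to see the arithmetic behind a linear scale, here is a plain-JavaScript sketch. It is not Protovis code: linearScale is a made-up stand-in for what pv.Scale.linear(...).range(...) does, and rowCount assumes the data file has one element for each of California's 58 counties.

```javascript
// Plain-JavaScript stand-in for a linear scale: maps the data interval
// [d0, d1] onto the pixel interval [r0, r1]. Illustration only.
function linearScale(d0, d1, r0, r1) {
  return function (value) {
    return r0 + (value - d0) / (d1 - d0) * (r1 - r0);
  };
}

var width = 300, height = 150, leftMargin = 100; // values set earlier in this tutorial
var dataRange = 5000000;
var rowCount = 58; // assumption: stands in for popbyrace.length

var yScale = linearScale(0, rowCount, 0, height);             // bar index -> vertical position
var total = linearScale(0, dataRange, 0, width - leftMargin); // population -> bar length

console.log(yScale(29));     // 75, halfway down the panel
console.log(total(2500000)); // 100, half the available bar length
```

Because both scales are built from the variables, changing width, height or dataRange rescales the whole chart without touching the bar code.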

Add a variable called barWidth. Divide height by popbyrace.length (the number of elements in the data) to be sure all the bars fit in the panel.

Add a new variable called gap and set the value at 4. This will be used to set the gap between the bars.

Finally, change the height variable to 800 and we're ready to edit our chart.
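A quick calculation suggests why the taller panel helps. This sketch assumes 58 elements in the data (one per California county); barThickness is a hypothetical helper, not something the tutorial defines.

```javascript
// Thickness of each drawn bar: the slot for one bar minus the gap.
function barThickness(height, rowCount, gap) {
  var barWidth = height / rowCount; // slot available to each bar
  return barWidth - gap;            // what is left after the gap
}

var rowCount = 58; // assumption: stands in for popbyrace.length
var gap = 4;

console.log(barThickness(150, rowCount, gap).toFixed(2)); // "-1.41": no room to draw a bar
console.log(barThickness(800, rowCount, gap).toFixed(2)); // "9.79"
```

With the original height of 150, the gap is larger than the slot for each bar, so a height of 800 leaves enough room for readable bars.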

Edit the chart

Start by deleting everything but the data property. Type .top(function() yScale(this.index)). Remember, this.index refers to the position of each element in the data array.

Set the width using .width(function(d) total(d.White)). Type White exactly as it appears in the data file, including the capital letter. This is important. The function acts like a little machine that scans the data and reads the same field from each element. We're telling the program to find the White data in each element and use it to draw the bars. If we wanted to use the Asian data, we would type d.Asian.
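The short function(d) total(d.White) form is a shorthand that Protovis accepts; in standard JavaScript the same function needs braces and a return statement. The sketch below uses placeholder data and a simplified stand-in for the total scale to show what the function does for each element.

```javascript
// Placeholder rows, not Census figures.
var popbyrace = [
  { County: "Alpha", White: 1000000, Asian: 250000 },
  { County: "Beta",  White: 2500000, Asian: 500000 }
];

// Stand-in for the total scale: 0..5,000,000 people mapped to 0..200 pixels.
function total(value) {
  return value * 200 / 5000000;
}

// Standard JavaScript equivalents of function(d) total(d.White) and
// function(d) total(d.Asian).
var whiteWidth = function (d) { return total(d.White); };
var asianWidth = function (d) { return total(d.Asian); };

console.log(popbyrace.map(whiteWidth)); // [ 40, 100 ]
console.log(popbyrace.map(asianWidth)); // [ 10, 20 ]

// Property names are case-sensitive: d.white is not the same as d.White.
console.log(popbyrace[0].white);        // undefined
```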

Set the height to (barWidth-gap). Set left to leftMargin and set fillStyle to steelblue.

Add data labels with .anchor

Now let's add our first set of labels. Add an anchor and set the value to "right" then type .add(pv.Label). Notice that while Protovis knows where to put the labels, it doesn't know how to fill in the information yet.

Set textAlign to "left" so the text starts at the anchor. Now type .text(function(d) d.White). d.White will fill in the corresponding labels with the White data.

Add commas to labels

It's not enough to simply display the data; the chart has to be easy to read. We'll address this as we go through the next several steps. Let's start by adding commas to our data labels. Fortunately, Protovis has built-in comma formatting. To use it, add a new variable called commas and make it equal to pv.Format.number().

Now go back to the text property, type commas and put d.White in parentheses so that it reads commas(d.White).
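To see what comma formatting does to a number, here is a plain-JavaScript sketch. addCommas is a made-up stand-in for the formatter, not the Protovis implementation, and it only handles whole numbers.

```javascript
// Insert a comma before every group of three digits, counting from the right.
// Hypothetical helper for illustration; whole numbers only.
function addCommas(n) {
  return String(Math.round(n)).replace(/\B(?=(\d{3})+(?!\d))/g, ",");
}

console.log(addCommas(4700000)); // "4,700,000"
console.log(addCommas(1000));    // "1,000"
console.log(addCommas(999));     // "999"
```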

Add county labels

Now let's add county labels on the left. Copy the entire pv.Bar code and paste it above the original.

Change the left property to 0.

Change the fillStyle to null.

Change the anchor to left and the labels sit flush left on the page.

While this is a perfectly acceptable way to label a chart, we want the labels to hang off the bars.

First, delete the width property since we no longer need it.

Change left to 99.

Change anchor to right.

Change the anchor left property to 99.

In the text property, delete commas and change d.White to d.County.

Finally, change textStyle to "gray". This small design change makes the labels secondary and the data easier to read.

Everything should match the code below.

Add mouse event animation

Not bad, but all those labels make it hard to consume the information visually. Let's add some animation so that the population data labels only appear when the mouse rolls over a bar.

Add a new variable called activeBar and set it to 0. activeBar will refer to the index position of the data. The first position in an array is always 0, not 1, because JavaScript arrays are zero-indexed. This setting means that the first bar will automatically highlight when the graphic loads.

Next, go to the new panel variable and add an event called "mousemove" to the entire panel. This is important because it sets up the animation for all the subsequent parts. We'll use a class called pv.Behavior.point to identify what the mouse is pointing at. By default, the point area is an invisible circle beneath the mouse. We can expand this area to cover the entire panel by passing Infinity to pv.Behavior.point. But we can simultaneously limit how pv.Behavior works. We really only care what the mouse is over as it moves up or down. Since there are no new objects to activate when the mouse moves horizontally, we can .collapse the "x" axis.
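The effect of collapsing the "x" axis can be illustrated with a small distance calculation. This is only a sketch of the idea, not how Protovis implements pv.Behavior.point; the function nearestIndex and the coordinates are invented for the example.

```javascript
// Find the index of the point closest to the mouse. When collapseX is true,
// horizontal distance is ignored and only vertical distance counts.
function nearestIndex(mouse, points, collapseX) {
  var best = -1, bestDist = Infinity;
  points.forEach(function (p, i) {
    var dx = collapseX ? 0 : p.x - mouse.x;
    var dy = p.y - mouse.y;
    var dist = dx * dx + dy * dy; // squared distance is enough for comparing
    if (dist < bestDist) { bestDist = dist; best = i; }
  });
  return best;
}

// Hypothetical bar end points: the middle bar is much longer than the others.
var barEnds = [{ x: 100, y: 10 }, { x: 300, y: 30 }, { x: 100, y: 50 }];
var mouse = { x: 100, y: 32 }; // vertically over the middle bar, near the left edge

console.log(nearestIndex(mouse, barEnds, false)); // 2: the long bar's end is far away
console.log(nearestIndex(mouse, barEnds, true));  // 1: only vertical distance counts
```

With the horizontal distance ignored, the bar under the mouse wins no matter how long or short it is.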

Next go to the original bar code and after the left property, add an event listener that will react to the mouse's "point" behavior. In this case, we're going to use a function that sets activeBar = this.index (the position of the data element in the array) and returns the race panel so that the chart redraws.

Animate bars and data labels

Next, let's animate the color of each bar as the mouse points at it. Change the fillStyle to use a function similar to the one we just used in the event property. Notice there are two equal signs. A single equal sign assigns a value, as it did in the event function. Two equal signs compare values, so here we are asking whether activeBar equals this.index.

Also notice that we use the ? to specify that if an element (or bar) in the array is active, its color is black. If it's not active, it defaults to steelblue.

And finally we want to apply the same fillStyle function to the population data label. Copy the entire fillStyle function. Scroll down and put your cursor in front of the semicolon.

Hit return, type textStyle and paste. Change "steelblue" to "transparent".
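The logic of the two style functions can be tried out in plain JavaScript. In this sketch the index is passed in as an argument instead of being read from this.index, and pointAt is a made-up stand-in for the "point" event.

```javascript
var activeBar = 0; // the first bar is highlighted when the graphic loads

// condition ? valueIfTrue : valueIfFalse
function barColor(index) {
  return activeBar == index ? "black" : "steelblue";
}

function labelColor(index) {
  return activeBar == index ? "black" : "transparent";
}

// Stand-in for the "point" event: a single = assigns the new active bar.
function pointAt(index) {
  activeBar = index;
}

console.log(barColor(0), labelColor(0)); // black black
pointAt(2);
console.log(barColor(0), labelColor(0)); // steelblue transparent
console.log(barColor(2), labelColor(2)); // black black
```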

Now the bars and labels animate. This emphasizes the visual story you want to tell and makes it much easier for a user to consume the information.

Clean up the template

So how is our template doing? Change the height and the gap variables to see how the chart redraws. That's not bad, but let's do a little cleanup to take better advantage of our variables and make this easier to use.

Change leftMargin variable to 90.

Find the total variable and, after leftMargin, type -40 to fit the entire Los Angeles label into the panel.

Now go down to the county labels and find the left property below the anchor property. Change 99 to leftMargin-1. Delete the left property from the bar above it.

The code should look exactly like it does here:

More animated labels

OK. That's a good, useful chart. But there is a lot of data driving this graphic that the user doesn't get to see. Let's change that with some animated labels.

Create a new variable named labelXpos and set the value at 200. Create another variable named labelYpos with the value of 400. We'll use these to position our labels in the panel.

Create a third variable named source and make it equal to US Census in quotes.

Scroll down to the bottom of the page. Above the render command, type race.add(pv.Label)

On the next line type .text(function(d) popbyrace[activeBar].County + " County"). We're telling the program to find the active element in the array popbyrace and show the county label data. We then tell it to add the word County on the end. This last bit is called concatenation and is extremely useful, as you will see.

Next, set the left property to labelXpos and set the bottom property to labelYpos. Set the font to bold 11pt sans-serif.

We can see the label is getting cut off. Go up to the labelXpos variable and set the value to 150 to move it over.

So why have this in two places? You are about to find out. Scroll back down and add another label. Type race.add(pv.Label) then write .text( function (d) ("Asian: " + commas(popbyrace[activeBar].Asian))). Set the left property to labelXpos. Set the bottom property to labelYpos - 20 to shift the label down. Now if labelYpos changes both labels will move together. This makes it easy to fine tune a chart when you use a different data set. Set the font to 9pt sans-serif to style the label.

Now, copy the entire Asian label code and paste a copy directly below it. In the text property, change the word Asian to Black in both places. In the bottom property, change -20 to -40 so that it reads labelYpos - 40. Now you have the Black populations for each county as well.

Repeat with all the other categories until they look like those shown here.
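Here is a plain-JavaScript sketch of how the label text and positions are put together. The data rows are placeholders, commas is a simplified stand-in for the Protovis formatter, and labelBottom is a hypothetical helper that is not part of the tutorial's code.

```javascript
// Placeholder rows, not Census figures.
var popbyrace = [
  { County: "Alpha", Asian: 1234567, Black: 7654 },
  { County: "Beta",  Asian: 89012,   Black: 345678 }
];
var activeBar = 1;
var labelYpos = 400;

// Simplified stand-in for the comma formatter (whole numbers only).
function commas(n) {
  return String(n).replace(/\B(?=(\d{3})+(?!\d))/g, ",");
}

// Concatenation: join fixed text and data with +.
function countyLabel() { return popbyrace[activeBar].County + " County"; }
function asianLabel()  { return "Asian: " + commas(popbyrace[activeBar].Asian); }
function blackLabel()  { return "Black: " + commas(popbyrace[activeBar].Black); }

// Every label sits a fixed offset below labelYpos, so changing one
// variable moves the whole block of labels together.
function labelBottom(offset) { return labelYpos - offset; }

console.log(countyLabel(), labelBottom(0));  // Beta County 400
console.log(asianLabel(), labelBottom(20));  // Asian: 89,012 380
console.log(blackLabel(), labelBottom(40));  // Black: 345,678 360
```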

Finishing up

The labels are still a little tight against the right edge. Go up to the width variable and change it to 400. Now you can see there is plenty of room for your labels.

Finally, let's add a source line. Scroll back to the bottom, copy the last label code and paste it below. Change the text property to source, change the bottom property to labelYpos - 120 and change the font size to 6pt. If you want, you can concatenate the word Source to your text property.

So that's it for this exercise. If you made it all the way through, you should be proud. You now have a powerful template that's easy to edit and makes it simple to swap out data. In our next tutorial, we'll do exactly that: add more complicated data and then convert the chart to create multiple lines. We'll also explore another animation technique.

Here is the complete code:


About this Tutorial

This tutorial was produced as a presentation for the Knight Digital Media Center's Interactive Census Workshop at the Berkeley Graduate School of Journalism.

Republishing Policy

This content may not be republished in print or digital form without express written permission from KDMC. Please see our Content Redistribution Policy at kdmc.berkeley.edu/license.