Hubway Rides by Neighborhood over Time

Aneesh Agrawal, Kenny Friedman, and Katie Marlowe

The data show the routes that people commonly take on Hubway bikes. We want to tell this story because Hubway can be a great alternative form of transportation for routes that the MBTA does not cover.

Our data came from a 2012 challenge to visualize data from Hubway rides. The data set includes information on rides from 2011 to 2013. We picked a chord chart to visualize this data because this type of chart emphasizes the connections between various stations. The thickness of each chord corresponds to the relative frequency of rides between the two neighborhoods it connects. The chord chart points out specific routes that are taken frequently, which leads to the question: Why are people taking Hubways between these stations? Is it because the MBTA does not currently provide a good way to get between these destinations? Or is it just that there are a lot of people traveling between these areas? Specifically, we can look at the blue region (MIT) and the gray region (Back Bay). There is a thick chord between these areas. We know that it is pretty difficult to get between these areas via public transportation, since there isn’t a T line that runs between them. This could be a good indication of a route that many people travel with few options for getting from one end to the other, so many of them decide to use Hubway.
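As a rough illustration of how chord thickness could be derived from ride records, here is a minimal sketch. The station names, the station-to-neighborhood lookup, and the choice to ignore ride direction are all assumptions made for this example; they are not taken from our actual data set or code.

```python
from collections import Counter

# Hypothetical station-to-neighborhood lookup (illustrative names only).
STATION_NEIGHBORHOOD = {
    "Station A": "MIT",
    "Station B": "MIT",
    "Station C": "Back Bay",
    "Station D": "Downtown",
}

def neighborhood_flows(rides):
    """Count rides between each pair of neighborhoods, ignoring direction."""
    flows = Counter()
    for start_station, end_station in rides:
        a = STATION_NEIGHBORHOOD[start_station]
        b = STATION_NEIGHBORHOOD[end_station]
        # Sort the pair so that A -> B and B -> A land in the same chord.
        flows[tuple(sorted((a, b)))] += 1
    return flows

def relative_thickness(flows):
    """Convert raw ride counts into each chord's share of all rides."""
    total = sum(flows.values())
    return {pair: count / total for pair, count in flows.items()}

# Made-up sample rides as (start station, end station) pairs.
rides = [
    ("Station A", "Station C"),
    ("Station C", "Station B"),
    ("Station B", "Station A"),
    ("Station D", "Station A"),
]
flows = neighborhood_flows(rides)
thickness = relative_thickness(flows)
```

With this sample, the MIT and Back Bay pair accounts for two of the four rides, so its chord would be drawn at half of the total width.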

If you look at the data over time, you can see that some stations didn’t exist at the beginning but were built partway through the period the dataset covers. By the end of the timeframe, these stations account for about half of overall monthly usage. This suggests that expanding the Hubway system is effective, and we recommend expanding it further. In late 2015, Hubway did announce some future plans for expansion.

View the visualization here.

Do Smear Campaigns work?

Group members: Michelle Thomas, Reem Alfaiz, Andrew Mikofalvy

The data show the effectiveness of negative ad campaigns, as well as their tendency to be used as a last resort for gaining support and lowering support for competitors. We wanted to tell this story because of the oversaturation of campaign ads and our curiosity about the effectiveness of negativity.

We used data from the Political TV Ad Archive to look at which candidates negative ads were targeting, as well as when they aired. We compared those findings to the voting results in each primary and caucus, using data from the New York Times. We chose to tell this story through a line graph representing voting results overlaid with a bar graph depicting the air count of negative ads per candidate. Both graphs are plotted over the time frame of February 1st to March 1st, which runs from the date of the first caucus to Super Tuesday, a day on which 12 states and 1 territory hold their primaries and caucuses. This layout allows people to see the race between candidates and relate it to when campaigns chose to start using negative ads, and to whether the candidates they targeted did worse. We felt that this is an effective time frame because Super Tuesday is a large sample of voting results and holds a perceived weight for campaign success. We chose to use delegates won as the measure of candidate success because it factors in issues such as the relative importance of states; negative ads were shown more in some states than others. We also included explanatory text so that the chart is understandable regardless of the reader’s familiarity with the political system.


View infographic here

Political TV ads: not what you think they’re about

Team members: Kalki Seksaria, Gary Burnett, Michael Drachkovitch, Argyro Nicolaou

The data say that the top topic choices for political TV ads in Iowa did not always match the issues voters considered to be the most important in that state. We want to tell this story because it provides insight into the logic of political TV ads during a primary, where the emphasis is less on pitting voters against the rival political party and more on differentiating same-party candidates and getting voters to the polls.

We mainly worked with the Political TV ad data set but also used data from the CNN entrance polls from the Iowa caucuses.


We first had to clean up the data set, giving each topic its own column, since the ‘topic’ cells were populated with every topic mentioned in each ad. After aggregating the number of times each topic came up, we picked the top five issues for each party. In order to attribute ad affiliation as either Republican or Democrat we worked with the Sponsor column and not the candidate column since the latter included every candidate mentioned in the ad. We made a list of each of the Sponsors and researched their affiliation. Our charts exclude unaffiliated, non-profit donors (there were only two such organizations that advertised in Iowa anyway).
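The cleanup and aggregation described above could look roughly like the sketch below. The field names, the comma separator in the topic cell, and the sponsor-to-party lookup are assumptions for illustration; the sketch also counts topics directly rather than building one column per topic, which gives the same totals.

```python
from collections import Counter, defaultdict

# Hypothetical sponsor-to-party lookup; in practice this was researched by hand.
SPONSOR_PARTY = {
    "Sponsor A": "Republican",
    "Sponsor B": "Democrat",
}

def top_topics_by_party(ads, n=5):
    """Return the n most frequent ad topics for each party."""
    counts = defaultdict(Counter)
    for ad in ads:
        party = SPONSOR_PARTY.get(ad["sponsor"])
        if party is None:
            continue  # skip unaffiliated sponsors
        # Split the multi-topic cell (assumed comma-separated) and drop
        # duplicates so that each topic counts once per ad.
        topics = dict.fromkeys(
            t.strip() for t in ad["topics"].split(",") if t.strip()
        )
        counts[party].update(list(topics))
    return {
        party: [topic for topic, _ in counter.most_common(n)]
        for party, counter in counts.items()
    }

# Made-up sample rows.
ads = [
    {"sponsor": "Sponsor A", "topics": "Taxes, Immigration"},
    {"sponsor": "Sponsor A", "topics": "Taxes"},
    {"sponsor": "Sponsor B", "topics": "Health Care, Taxes"},
    {"sponsor": "Sponsor B", "topics": "Health Care"},
    {"sponsor": "Sponsor C", "topics": "Taxes"},  # unaffiliated, ignored
]
top = top_topics_by_party(ads, n=1)
```

Keying on the sponsor rather than the candidate mirrors the choice described above, since the candidate field lists everyone mentioned in an ad.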

Having done this work, we used Tableau to create a line graph per issue per party (10 graphs total) mapping topic against time.

The CNN entrance polls on the Iowa caucuses were used to make a bar chart of when Republican and Democratic voters made their decisions. Since the timing offered by the entrance poll survey came as categorical values (‘Today’; ‘last week’; ‘last month’), we had to decide which range of dates to include under each of these terms, to make sure that a relationship existed between the two datasets. Kalki describes the process: I first converted the categories into dates. Today = 2/1/16 (Iowa Caucuses Date). Last few days = the 2 days before the caucuses. For before last month, I assumed it meant between 1 and 3 months ago. I then assumed that the number of people who decided in a time window were evenly distributed over that time window. For example, if 30% of people decided last month, and “last month” included 23 days (30 day month – last 7 days are “last week” or shorter), then 30% / 23 = 1.3% decided each day.
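The even-spread assumption can be sketched as follows. The caucus date and the 30% over 23 days example come from the description above; the other shares and the exact day offsets for each window are placeholders chosen for illustration, not the actual poll numbers.

```python
from datetime import date, timedelta

CAUCUS_DAY = date(2016, 2, 1)

def daily_shares(windows):
    """Spread each window's share of deciders evenly across its days.

    Each window is (share_pct, nearest_offset, farthest_offset), where the
    offsets are days before the caucus and both ends are inclusive.
    """
    per_day = {}
    for share_pct, nearest, farthest in windows:
        n_days = farthest - nearest + 1
        for offset in range(nearest, farthest + 1):
            per_day[CAUCUS_DAY - timedelta(days=offset)] = share_pct / n_days
    return per_day

# Placeholder shares, except the 30% "last month" example from the text.
windows = [
    (20.0, 0, 0),    # "today": caucus day itself
    (10.0, 1, 2),    # "last few days": the 2 days before
    (15.0, 3, 7),    # "last week": the rest of the final 7 days (assumed)
    (30.0, 8, 30),   # "last month": 30 - 7 = 23 days
]
per_day = daily_shares(windows)
```

Here every day in the “last month” window receives 30 / 23, or about 1.3%, which matches the worked example.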

Our choice to present the most popular ad topics and most important topics according to voters as a table aims at pointing to the unexpected discrepancies between the two sets of information. What other reasons could there be for pushing a specific ad topic, even if voters don’t think it is important? To try to understand this, we plotted each party’s top-5 ad topics as a line graph against time and superimposed an area chart that maps the timing of voter decisions in order to see whether there is a correlation between certain ad topics being blasted out to voters and when the voters made up their mind.



A Day in the Life of a Hubway


By Jyotishka Biswas, Phillip Graham, and Maddie Kim

The data say that Hubway has had a positive impact on health and the environment in the Greater Boston Area. We want to tell this story to show that choosing to bike can make a difference.

The Hubway Bike Share system launched in 2011, and completed over 1 million rides over the next two years. We decided to look at the benefits of biking on health and the environment, and to quantify the impact that Hubway has had along these dimensions.

We chose to focus on the positive message for this assignment, as if we were part of Hubway’s marketing team, which guided many of the decisions we made. The first was to de-emphasize the charts — we used only two charts, to show the age and gender breakdown of Hubway users, statistics which were fun facts rather than central to our message. When it came to the core of the infographic, we presented medians rather than distributions to avoid unnecessary complexity. Primarily, we focused on keeping the tone light and fun, to make the reader more receptive to the message.

The result is a scrollable infographic, in which the story is told in a loose sequential frame format. The numbers are communicated in the context of a day in the life of a Hubway bike, and small comments in speech bubbles are used to signpost the flow of the story and provide some humor. We used bright colors to frame our content, and large, bold text to emphasize important numbers. We made the conscious decision to have a clear opinion and message, rather than to lay out our analysis and ask readers to assess it for themselves. We believe that this resulted in a more accessible presentation, and hopefully one which is as informative as it is enjoyable.

You can find the infographic here. (It’s made up of large images, so don’t click if you’re worried about data usage.)



Campaign Strategy 101: Winning Hearts and Minds

By Felipe Lozano-Landinez, Jane Coffrin, and Julia Appel

The Political TV Ad Archive contains information about the televised ads during the 2016 primary campaign season. Our goal with this project was to explore this data set and see what interesting campaign strategy insights we could derive by looking at which candidates sponsored ads on which TV shows. To do this, we cleaned and modified the data set to focus specifically on candidates via the ads they sponsored (not the ones they appeared in), the program on which each ad aired, and the ad’s emotive content (i.e. positive, negative, or mixed). We took a subset of the data (all TV programs with more than 500 ads aired as of the time that we downloaded the data), and also filtered out all the presidential candidates who haven’t been relevant in the race over the last couple of weeks. Finally, we grouped TV shows into four “Show Types”: Talk Shows, Entertainment, Game Shows, and News.
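A minimal sketch of the subsetting and grouping step is shown below. The program names, the candidate labels, and the program-to-type lookup are hypothetical, and the candidate filter is left out for brevity; only the more-than-500-ads threshold comes from the description above.

```python
from collections import Counter

# Hypothetical program-to-type lookup (assigned by hand in practice).
SHOW_TYPE = {
    "Morning News Hour": "News",
    "Late Night Chat": "Talk Shows",
    "Quiz Time": "Game Shows",
}

def ads_by_show_type(airings, min_ads=500):
    """Count ads per (show type, sponsor) on programs with more than min_ads ads."""
    per_program = Counter(program for program, _sponsor in airings)
    busy = {program for program, n in per_program.items() if n > min_ads}
    totals = Counter()
    for program, sponsor in airings:
        if program in busy and program in SHOW_TYPE:
            totals[(SHOW_TYPE[program], sponsor)] += 1
    return totals

# Made-up airings as (program, sponsoring candidate) pairs.
airings = (
    [("Morning News Hour", "Candidate A")] * 3
    + [("Morning News Hour", "Candidate B")] * 2
    + [("Late Night Chat", "Candidate A")] * 4
    + [("Quiz Time", "Candidate B")] * 1
)
# A tiny threshold is used here only so the toy data survives the filter.
totals = ads_by_show_type(airings, min_ads=3)
```

In this toy run the game show is dropped because it falls under the threshold, while the news and talk show counts are kept per sponsor.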

We looked at the data in multiple layers through a series of increasingly granular questions: How did the ads get segmented by “Show Type”? Did a particular political party dominate a specific “Show Type”? Were Republicans more likely to advertise on certain types of shows than Democrats? Were there specific TV programs/shows that were targeted by specific candidates? Finally, were the ads sponsored by these candidates “pro” ads, meant to bolster their candidacy, or “con” ads, meant to bring down another candidate’s campaign?

We think this is an effective way to ask questions of the data, and ultimately derive an interesting story from them, because our top-down approach enabled us to look at the big picture, notice discrepancies, and then dig further to try and explain them. We wanted to tell a few stories that surprised people; our approach helped us look at something that made sense on the surface (candidates advertise more on news shows), but maybe not at a deeper level (Donald Trump advertised significantly less than the two remaining Republican candidates in the race).

We believe that campaign strategists are strategic in their message targeting, but wanted to better understand how they target TV viewers, and whether or not they have different assumptions than we do about the political inclinations of TV viewers. We also wanted to see if the actions of a candidate’s campaign would differ from the conventional wisdom that normal Americans have about those candidates. On the surface level, our views/perspectives may align, but when we dig deeper we deconstruct our perspectives and demonstrate where things begin to differ, leading to greater understanding of the larger political atmosphere.


If you prefer to get your late night comedy from Stephen Colbert rather than Jimmy Fallon and you’re a registered Republican planning on voting for Donald Trump, Marco Rubio’s campaign manager Terry Sullivan knows it. And he’s trying to change your mind.

While it may come as no surprise that campaign strategists profile TV viewers to target political ads and maximize impact, it may be surprising what shows they are actually targeting. What we found about the political ads you’ve seen this election cycle may give insight into the campaign strategies of the front running Republican and Democratic candidates. It also may give you a reason to change the channel.

First, we looked at which categories of TV show were most likely to air an ad from one of the nine major political candidates.

Distribution of Ads over 4 Show Types

If you’re watching a news show (the Today Show, for example), you’re almost twice as likely to see an ad for a political candidate as on any other type of TV show. Which political candidate? Bernie Sanders and Marco Rubio.


What surprised us most about the breakdown of ads on news shows wasn’t who was advertising most frequently; it was who wasn’t. 


Notice Donald Trump, Republican nominee frontrunner and winner of seven states on Super Tuesday. He ran nearly 2/3 fewer ads than Marco Rubio, half as many as Jeb Bush (who didn’t even make it to Super Tuesday), and 300 fewer than Ted Cruz.

Emotive Content of Republican Candidate Sponsored News Ads

Also, Donald Trump didn’t waste time attacking his opponents: he ran no attack ads (those marked with “con” emotive content) on news shows. Jeb Bush, Ted Cruz, and Marco Rubio, on the other hand, were slinging mud all over the place.

Why are we seeing fewer ads, and no negative ads, from Donald Trump? Maybe he doesn’t need to run attack ads, since much of his media presence revolves around negative commentary about his opponents? Maybe he doesn’t need to spend as much money on traditional media because of his polarizing candidacy? Whatever the reason, it looks like we won’t be seeing any traditionally slanderous campaign ads from the Donald any time soon.

After looking at news shows, we looked at the category that aired the second most political ads: talk shows. This is where Marco Rubio’s campaign strategy got interesting… We wanted to see the breakdown of advertisements from Republican candidates on talk shows. 
Marco Rubio out-advertised his rivals by a margin of 2 to 1. Jeb Bush and Donald Trump were, again, distant second and third place runners-up to the TV advertising machine, Marco Rubio.

Republican Ads on The Late Show and The Tonight Show


Rubio’s advertisements weren’t all positive, either. Note the lack of attack ads from Donald Trump on both shows, as compared with Rubio, Bush, and Cruz.

Again we wondered: why is Marco Rubio trying to win the hearts and minds of these TV viewers? Is he trying to attract young voters, and perhaps draw them from the front-running Democratic candidate? Is he trying to appeal to young voters as a moderate candidate? Is he a moderate candidate?

Only time will tell whether these ad strategies — or lack thereof — will really influence voters; until then, we will continue to wonder how the political strategists are targeting our favorite TV shows.

Attack of the Chains?

By Catherine Caruso, Kendra Pierre-Louis and Tiffany Wang

Every year since 2008, the New York City-based nonprofit Center for an Urban Future has compiled a tally of the number of national retail chains located across the city. Underpinning their analysis is the unspoken assumption that retail chains are somehow a detriment to the fabric of the city. In that regard, they have some support.

“The unintended consequence of their [chain stores] victories through the 1970s and beyond,” writes The Geography of Nowhere author James Howard Kunstler in a 2013 post in the Huffington Post, “was the total destruction of local economic networks, that is, Main Streets and downtowns, in effect destroying many of their own livelihoods.”

A number of films, such as Wal-Mart: The High Cost of Low Price, associate chain stores like McDonald’s, Bed Bath & Beyond, Whole Foods, and Wal-Mart with the economic and cultural destruction of the communities in which they are located. Instead of buying from national retailers, we’re told in what are called “buy local” campaigns, the best thing for local communities is to buy from local, independent retailers.

There is some evidence suggesting that it may be better to buy local. In 2003 the Maine-based Institute for Local Self-Reliance found that for every dollar spent at a local business, 45 cents stayed in the local community. Another nine cents stayed within the state. For chain stores, however, only 14 cents remained within the local community. The rest trickled out to national management and distant product suppliers. Their findings suggest that communities without chain stores would be economically stronger than those with them, and we wanted to see if this was the case in New York City.

We used Center for an Urban Future’s data on chain stores, and cross referenced it with income data to see if communities with higher incomes have fewer chain stores.
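One way to cross-reference the two data sets is to join them on a shared area key and compare average chain counts across income bands, as in the sketch below. The area labels, the counts, the incomes, and the band edges are all placeholder values for illustration, not figures from the Center for an Urban Future data or from our analysis.

```python
from statistics import mean

def income_band(income, edges):
    """Label an income as low, middle, or high given (low_edge, high_edge)."""
    low_edge, high_edge = edges
    if income < low_edge:
        return "low"
    if income < high_edge:
        return "middle"
    return "high"

def mean_chains_by_band(chain_counts, median_income, edges=(50_000, 100_000)):
    """Join chain counts to incomes by area and average the counts per band."""
    grouped = {}
    for area, n_chains in chain_counts.items():
        if area not in median_income:
            continue  # no income figure for this area, so it cannot be joined
        band = income_band(median_income[area], edges)
        grouped.setdefault(band, []).append(n_chains)
    return {band: mean(counts) for band, counts in grouped.items()}

# Made-up sample data keyed by a hypothetical area label.
chain_counts = {"A": 40, "B": 30, "C": 10, "D": 5, "E": 99}
median_income = {"A": 30_000, "B": 45_000, "C": 70_000, "D": 120_000}
by_band = mean_chains_by_band(chain_counts, median_income)
```

A falling average from the low band to the high band would be consistent with the pattern described here, though it would say nothing about the cause.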

State of the Chains

As you can tell, for the most part that’s exactly what we found. A few exceptions existed among middle-income people, but they’re within a reasonable margin of error. Generally speaking, in New York City, if you make less than $44,000 a year your neighborhood is going to be rife with chain stores, and if you’re making more than $84,500 a year your neighborhood will have very few.

One word of caution: this doesn’t tell us why this correlation exists, merely that it does exist. It could be that chain stores remove income from communities, or it could be cultural: a signal of gentrification is the emergence of local neighborhood shops. It could also be that higher-income individuals prefer to live in neighborhoods with fewer retail chains.

#NeverTrump: The Candidates are On Board, but What About the SuperPACs?

By Judy Chang, Iris Fung, and Eric Lau

The data say that SuperPACs supporting non-Trump candidates are more interested in attacking other non-Trump candidates than Donald Trump. We want to tell this story because it stands in stark contrast to the #NeverTrump movement.

The news is dominated by the 2016 election, specifically by the negativity of the candidates’ campaigns. Among the chief contributors to this negativity are SuperPACs, which wield unlimited political spending power in support of their favored candidate. We focused on exploring the data behind the SuperPAC efforts directed against Republican front-runner Donald Trump. So far, Trump has been the focus of concerted verbal attacks by Marco Rubio and Ted Cruz in the televised debates. However, after exploring the Political TV Ad Archive’s dataset in Tableau, we found that this did not hold true in SuperPAC advertising. For example, the majority of the ‘con’ ads were purchased against Rubio and sponsored by Right to Rise, a SuperPAC supporting Jeb Bush. This was surprising. For all the talk of the Republican establishment’s dislike of Trump, the data suggested that there was even more discord within the establishment itself.

We wanted to express this disconnect between the aforementioned intent and execution in an intuitive way. To do this, we built a hybrid infographic/interactive chart/article. Text snippets guide the viewer. A smaller bar graph shows current delegate numbers from the Associated Press. This recognizable, friendly graphic eases the viewer into a more technical chord diagram, which is not a commonly seen chart format and requires the reader to spend more time on it to understand the relevant relationships. We chose the chord diagram over a normal bar graph, because the directional arrows and arrangement of candidates around the circle connoted the feisty conflict of the campaign. Our infographic article combines humor and data to give the reader a non-obvious insight on the Republican race.

Check it out here!