Visualizing Binary Data

Original visualization and Author’s comments

binvis.io isn’t so much a statement as an exploration of what files on computers “look like”. In essence, it takes a binary file containing arbitrary data and generates an interactive visualization which allows users to inspect regions of the file for values such as byteclass (useful for distinguishing text from other data, for example), local entropy, and byte-level details. Admittedly, the target audience of this presentation is likely technically-minded, but the key ideas apply to a lot of areas. To keep it short, I’ll go over just two of them.
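To make "local entropy" concrete, here is a minimal sketch of one way to compute it: Shannon entropy over fixed-size blocks of a file's bytes. The function names, the non-overlapping blocks, and the 32-byte default window are assumptions for illustration, not a description of how binvis.io computes its values.

```python
import math
from collections import Counter


def shannon_entropy(chunk: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not chunk:
        return 0.0
    total = len(chunk)
    counts = Counter(chunk)  # how often each byte value occurs
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def local_entropy(data: bytes, window: int = 32) -> list[float]:
    """Entropy of each consecutive block of `window` bytes (window size is an assumption)."""
    return [
        shannon_entropy(data[i:i + window])
        for i in range(0, len(data), window)
    ]
```

Blocks of repeated bytes score near 0 and blocks where every byte value is equally likely score near 8, which is why entropy tends to separate padding or plain text from compressed or encrypted regions.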

BinVis

The first is the representation of parametrized data in 2D. Many visualizations try to present multidimensional data in 2D, often using 3D computer graphics and animation. This visualization tackles the opposite problem: presenting a 1D sequence of points in 2D. The author exploits spatial locality by clustering contiguous regions of similar type, but also allows simple sequential display of the data on the grid. He also explains the techniques he used right in the help menu.
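As a rough illustration of laying a 1D sequence out in 2D while preserving locality, the sketch below maps a byte offset to grid coordinates along a Hilbert curve and, for contrast, along a plain row-by-row scan. The Hilbert curve is one well-known locality-preserving layout; treating it as the layout used here is an assumption, and the function names are hypothetical.

```python
def hilbert_xy(order: int, d: int) -> tuple[int, int]:
    """Map index d (0 <= d < 4**order) to (x, y) on a Hilbert curve
    that fills a grid 2**order cells wide."""
    n = 1 << order
    if not 0 <= d < n * n:
        raise ValueError("d is outside the grid")
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Flip the quadrant built so far.
                x, y = s - 1 - x, s - 1 - y
            # Rotate by swapping the axes.
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def scan_xy(order: int, d: int) -> tuple[int, int]:
    """Simple sequential layout: fill the grid row by row."""
    n = 1 << order
    return d % n, d // n
```

On the Hilbert layout, consecutive offsets always land in neighboring cells, so a contiguous run of similar bytes shows up as a compact blob. On the scan layout, the same run becomes horizontal stripes that wrap at the end of each row.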

The second feature is that information isn’t withheld. For credibility, a presentation should be verifiable by giving sources or making the raw data available. This visualization takes this idea a step further by showing the raw data directly alongside the visualization, albeit in readable chunks. An even cooler feature is that the data which the user is focusing on is highlighted in each of the components (see the screenshot above). This creates a visual mapping between the raw data and the presentation, a powerful technique for showing connections across levels of abstraction and inviting users to ask their own questions.

Shutterstock Top Color Trends of 2015

Top Color Trends of 2015 – Shutterstock [source]

Shutterstock, a company that provides millions of royalty-free stock photos, illustrations and vectors, has published a collection of visualizations that analyze color trends from 2015. By matching pixel data with image downloads, Shutterstock was able to identify the fastest growing colors across its entire collection and the most popular colors by country. With stunning, professional-quality photos, this data visualization acts as an eye-catching lure to the Shutterstock website for commercial purposes. The bottom of the page leads you to a signup link and an email address for press to contact the creators of the visualization.

Fastest Growing Colors

fast

The first visualization displays the four fastest growing colors of 2015 based on image downloads. The colors are given as hex values (#01B1AE, #2E4A17, #40A1AC and #1F2A44) along with high-resolution example images that are predominantly that color and consist of simple geometric and repeating patterns. All four images are deep blues and purples.

Colors Around the World

world_colors

The second graphic is interactive and displays the most popular colors of 2015 by country on a pixelated map. Colors by country (also given in hex RGB) vary more than the fastest growing colors; they consist of peachy skin tones, forest greens, rocky grays, and ocean and sky blues. Clicking on a single country reveals the top three colors from that country along with some example images that are predominantly those colors.

For each of the hex color values and example images, Shutterstock’s search engine provides similarly colored photographs. The infographic demonstrates the power of Shutterstock’s search engine and the high-quality images that Shutterstock provides for its customers. This visualization acts as a beautiful, productive and convincing argument to use Shutterstock for your next design project.

The Zika Virus Explained

Vox, the news site that markets itself as the go-to destination for explainer journalism (journalism that breaks down the broader context of the news issues currently topping headlines), recently published an explainer article on the Zika virus using six charts and maps.

Zika is a mosquito-borne virus which, until recently, had been of limited import. Though the Aedes aegypti and Aedes albopictus mosquitoes that carry the virus have long existed in the West, including in the southern parts of the United States, Zika only migrated to the Western Hemisphere in 2013. And unlike malaria, another mosquito-borne disease (though one carried by Anopheles rather than Aedes mosquitoes), Zika is rarely deadly. In fact, people often don’t even know that they’ve been infected, either because they don’t show symptoms or because the symptoms (fever, rash, joint pain, or pink eye) are easily confused with those of other illnesses. Symptoms, if they occur, generally clear up within twelve days.

Still, Zika has spread incredibly rapidly, from 14 cases diagnosed worldwide in 2007 to an estimated 1.5 million in 2015. And in the areas where Zika has spread, microcephaly in newborn babies has increased as well. From the article:

The country has seen an unusual surge of Zika cases over the past two years — possibly after the virus arrived with World Cup travelers in 2014. Last year, more than 1.5 million people were affected.

Over that same period, Brazil has seen more and more newborns born with microcephaly, a congenital condition that’s associated with a small head and incomplete brain development. Normally Brazil gets several hundred cases a year, but since October 2015, health officials have reported more than 3,500 cases.

graph of incidences

According to the CDC, microcephaly is linked to seizures, a decreased ability to learn and function in daily life, feeding problems, hearing loss, vision problems, and developmental delays.

tiny head

As images of children with microcephaly have swept the web, panic has followed, with some countries telling women to delay pregnancy by as much as two years, and people – including women who are not pregnant and don’t plan to be soon – cancelling planned vacations to affected areas.

The Vox article serves to temper those fears with facts. The use of a drawn image of microcephaly is good because it helps to visualize the issue without stoking panic. They also do a good job of pointing out that what is known is less frightening, and that it very narrowly impacts the specific populations who should be concerned.

I also appreciate that they temper the issue while pointing readers toward what they should be concerned about: everything suggests that, at least in the United States, Zika is currently manageable. There have been no domestic transmissions, but climate change is likely what allowed Zika to spread so rapidly, and other, more deadly diseases may be lurking behind it. Climate change is what we should really be afraid of.

zika countries

climate countries

2015 Year in Music

At the beginning of each year, Spotify refreshes its “Year in Music,” where it ranks the top artists and albums of the year and also publishes interesting statistics, such as how many tracks we’ve listened to and how much time we’ve spent listening to music that year. The content can cover all Spotify users combined, but it can also be customized to the individual user. Below is an example screenshot, in which Spotify reports that the world listened to 21 million different artists in 2015.

changycj_yearinmusic

Spotify 2015 Year in Music

On the website, the user can scroll through multiple panels, each showing a different statistic. As the user browses through the website, different songs play depending on the context. For example, on the Top Tracks panel, the song “Lean On” by Major Lazer plays, as it was the top track of 2015, according to Spotify.

Spotify publishes this content to summarize global trends for its users and also to expose music habits and preferences the users themselves may not know about. Overall, I think the statistics are very interesting, as they are very relatable, but I thought Spotify could have made them even better by visualizing the data more effectively. For example, a line chart could be drawn for the number of tracks we’ve listened to over the years, instead of simply a line of text saying that we’ve listened to “167,493 more than last year.” Similarly, pie charts could be constructed comparing the top genres/artists across seasons, instead of separate panels with one-liners such as “We loved Ellie Goulding, Wiz Khalifa, and Major Lazer this Spring” or “We finished the year strong with a lot of Justin Bieber, The Weeknd, and Drake.”

With these visual changes, I think Spotify can make its Year In Music incredibly relatable and interesting for its users, as they could explore their own taste but also compare it to the rest of the world’s.

Fact Checking Politicians

The New York Times recently put out a chart rating each presidential candidate on the accuracy of their statements. The statements were grouped into six categories, ranging from pants-on-fire lies to true. The post is aimed at a wide audience, from adults who are trying to choose a candidate to the candidates and their teams. With this graph, the New York Times both urges average citizens to think more critically about what candidates say and pushes candidates to be more careful with the way they present data.

truth-lies

New York Times article- All Politicians Lie.

The chart as a whole is relatively clear and easy to read. The color scheme helps to differentiate between true and false, and I appreciate that the author steered clear of colors that are loaded within the two-party system. It is easy to find general summary information along the left and right sides, and to get a sense of the candidates’ general ranking. However, there are a few problems with this representation. The first is that, although the criteria used to place statements into categories are listed, it is still a pretty arbitrary metric. In addition, some of the candidates have had vastly more statements checked than others, and this information is not presented clearly.

I feel as though this presentation is effective in sparking people’s interest in fact-checking candidates and thinking more critically about what they are saying. There has also been a strong response in campaigns around fact-checking journalism. The graph uses humor (the titles of the categories) as a way to engage readers. However, I also feel that this takes away some of the legitimacy of the information being delivered, which is exactly what the chart is trying to fix.

The Fallen of World War II

Screen Shot 2016-02-04 at 11.11.58 AM

The Fallen of World War II is an interactive, data-driven documentary about the casualties of WWII. The documentary’s punchline, however, is that despite contemporary sentiments and contrary to what the media may make us feel, we are in fact living in a period of ‘long peace’. What this means is that we are now less likely than ever before in the recorded history of mankind (!) to die in battle.

The documentary’s central data visualization tool – the bar chart – is simple and accessible. The beauty of this piece is that it uses the easily legible bar chart in exciting new ways that really drive home some of the most insightful points that the documentary makes. For example, in order to highlight the extent of military casualties of the Soviet Union, the video slowly follows a new bar that rises up for almost a whole minute, eventually towering above the equivalent bars that detail German and French military casualties. This toggling between micro and macro views of the data helps the viewer realize both the overall human cost of WWII and how each country fared relative to the others. What’s more, the sound effect used when each bar is presented – which alludes to casino chips falling – highlights the way civilians and soldiers alike were often used as pawns by their respective governments.

Besides the inventive use of the bar chart, the documentary is effective for many reasons. Together with a cross-country comparison, the documentary offers the number of casualties from each country involved in the conflict across time. The interactive bar chart (seen in the image below) allows for multiple variables to be taken in at once: month and year, nationality and even specific battles/events are available for the user to browse through.

Screen Shot 2016-02-04 at 12.52.05 PM

Another effective technique in this documentary is the presentation of the same data in different ways. For example, when talking about the number of Jews killed during WWII, the documentary first arranges that data by country, and then arranges it again by cause of death (gas chambers, mobile killing squads, etc.).

Screen Shot 2016-02-04 at 11.10.28 AM

Beyond data visualization tools, the documentary uses narration and still photography in a way that (1) ensures the audience understands the data presented and (2) adds a ‘human’ and historic element to the story.

Screen Shot 2016-02-04 at 12.51.42 PM

World AIDS Day, 25 Years Later: What Have We Learned?

AIDS_Report_Infographic_800px_wide1

This is an infographic produced by the ONE Campaign, an international campaigning and advocacy organization dedicated to ending extreme poverty and preventable disease in Africa. It tells the story of the progress made against HIV/AIDS over the past 25 years (this post was published in 2013) and what’s needed to rid the planet of the disease.

The data shown next to the image of Africa illustrates the progress select African countries have made against AIDS by measuring the ratio of people newly infected to people newly added to treatment (assumed to be an annual measure). It introduces the “tipping point” as the moment when the number of people newly infected is equal to or less than the number of people newly added to treatment. The infographic then sorts the selected countries into four buckets: (1) Reached the Tipping Point; (2) Close to the Tipping Point; (3) Acceleration Needed; and (4) Progress Reversed. From the data presented, it’s clear that while many countries have made great progress or are on their way to reaching the tipping point (21 countries), there are still 16 countries where acceleration is needed or progress has been reversed. To illustrate this difference, the infographic compares the state of HIV/AIDS in two countries: Cameroon and Ghana.
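The ratio behind the tipping point can be written out in a few lines. This is a minimal sketch that reads the tipping point as "new infections in a year are no greater than people newly added to treatment in that year"; the function names and the example figures are hypothetical and are not taken from the infographic, and the thresholds for the other three buckets are not given in the post, so they are not modeled.

```python
def tipping_ratio(new_infections: int, newly_treated: int) -> float:
    """Ratio of people newly infected to people newly added to treatment."""
    if newly_treated <= 0:
        raise ValueError("newly_treated must be positive")
    return new_infections / newly_treated


def reached_tipping_point(new_infections: int, newly_treated: int) -> bool:
    """True when new infections do not exceed new treatment enrollments."""
    return tipping_ratio(new_infections, newly_treated) <= 1.0


# Hypothetical annual figures for two made-up countries.
print(reached_tipping_point(50_000, 100_000))  # ratio 0.5
print(reached_tipping_point(120_000, 100_000))  # ratio 1.2
```

A ratio at or below 1 means treatment enrollment is keeping pace with new infections; a ratio above 1 means the epidemic is still outrunning the response.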

The audience for this report is U.S. policymakers and international development officials.

The goal of the infographic is to drive home the message that while progress has been made, significant challenges lie ahead. The use of data around HIV/AIDS deaths is a powerful reminder of the difference that smart, effective and accessible interventions can make.

Social Progress Index – Global Data Representation Analysis

Hi Everyone,

For my first CMS.631 assignment, I have chosen a data graphic that is in line with this semester’s theme of Civic Data, which is defined as “Data about our world and how we experience it, being used with the goal of making it better for us all”. This data representation comes from the researchers at the Social Progress Imperative, whose aim is to provide substantive metrics to measure the overall “social progress” of each country in today’s world.

Link for Data Representation

Social Progress Index Data Representation

The data shown in this information graphic is the aggregate “Social Progress Index Score”, which is made up of three core metrics: “Basic Human Needs”, “Foundations of Well Being”, and “Opportunity”. Each has four sub-metrics of its own, which in turn have their own data components. This is a massive data endeavor, and many different types of standard metrics such as “Child Mortality Rate”, “School Enrollment”, and “Obesity Rates” are combined with less standard metrics such as “Freedom over life choices” and “Corruption” in order to come up with the aggregated scores. As such, the face-value data that is presented is essentially a conglomeration of thousands of economic and social indicators, boiled down to a few key scores and partitioned at the country level.

I think there are multiple audiences for this type of data representation. In my opinion the main one is researchers and policy-makers, in the sense that the data tries to provide a brand new methodology and a massive amount of already finished work for measuring progress as a country (and can be used at different levels of granularity) in a different, more holistic way than today’s methods. I think another main audience is the organizations responsible for data collection, in that it provides a clear vision of what kind of analysis and (hopefully) impact can result from the data sets that are collected. Finally, I think another important audience for this data representation is the general population. This is because all of this complicated data curation, analysis, and representation is boiled down to a few numbers per country with a focus on visual presentation, the information is presented freely online, and the Social Progress Index has been the focus of several TED talks over the last few years.

I think there are three main goals of this data presentation. The first is to get a conversation going among the general populace with regard to what social progress means, how it is measured, and why different countries have different levels of social progress. The second is to provide decision-makers who may not be that technical with a methodology upon which to have discussions with other decision-makers and also to drive forward policy and research initiatives. The third is to draw researchers into the methodology and analysis so that they consider the Social Progress Index an effective metric and measurement system for how the world is changing over time.

I think that the graphic is quite effective, because it allows for different levels of granularity (which serves multiple audiences), has a strong visual focus (colors help differentiate between metrics, and shades of color help to compare different countries), and minimizes the amount of numbers presented (instead, it uses relative positioning via a ranking and colors to help people orient themselves within the framework presented).

Overall, I found this graphic to be very thought-provoking and incredibly relevant to the world today and to our class’s theme, and I had a lot of fun looking through it. I look forward to having many more of these experiences over the rest of this semester!

Cheers,

Felipe

Ragged Mountain Snowfall

As a member of the MIT Ski Team, I frequently look at the upcoming forecasts, hoping for more snow. Unfortunately, the unseasonably warm weather lately has made training very difficult. This is a chart of yearly snowfall at the team’s home mountain, Ragged Mountain, from opensnow.com. It shows, for each season from 2010-11 until now, what percentage of average snowfall the mountain has had.

Ragged Snowfall

The chart uses circles at different heights to represent each year’s percentage, and it also colors the circles to match. Seasons of low snowfall have orange circles, while seasons with more snowfall have bluer circles.

The audience of this chart is people like me, who ski at Ragged often. Since I’ve been skiing at Ragged for a few years now, I know how much snow they had last year and the year before. So when I look at the chart and see that this year has a much lower mark than the previous couple of years, I have a good sense of how much snow to expect. I believe that that’s the goal of the presentation. It shows skiers the trend in snowfall, in a way that’s easy to compare from year to year.

I think that this presentation is effective because I can easily see that this season I should expect less snow than in recent years, which lines up with what the weather has been like lately. However, one detail I wish this chart included is the actual number of inches, rather than only a percentage. Even just an indication of how many inches “Average” is would be a helpful specification.
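The missing conversion is simple arithmetic once the average is known. The sketch below shows both directions; the function names are made up for illustration, and the 60-inch average is a hypothetical placeholder, since the chart does not state what “Average” is.

```python
def percent_of_average(season_inches: float, average_inches: float) -> float:
    """Season snowfall expressed as a percentage of the long-term average."""
    if average_inches <= 0:
        raise ValueError("average_inches must be positive")
    return 100.0 * season_inches / average_inches


def inches_from_percent(percent: float, average_inches: float) -> float:
    """Recover the season total in inches from a percent-of-average value."""
    return percent / 100.0 * average_inches


# Hypothetical: if an average season were 60 inches, a season at 75% would be 45 inches.
print(inches_from_percent(75.0, 60.0))
```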

CDC’s Women and Risks of Drinking Poster

The Centers for Disease Control and Prevention released an infographic on the risks of drinking for women last Tuesday as part of its Vital Signs report. The data displays the risks of drinking alcohol when pregnant. The data also claims that having 8 drinks a week, or binge drinking, can lead to injuries/violence, STDs, and unintended pregnancies for women, among many other risks, without discussing any other contributing influences.

The audience is doctors, nurses, and other health professionals, as the lower half of the poster shows a 5-step guide for helping women avoid drinking too much.

The goals are to show women (or the health professionals who will pass on the information) that drinking holds many risks during pregnancy, and to show the general risks of drinking. However, the risks are not attributed to anything except drinking, which is absurd.

A small portion of the data displayed effectively shows the risks of drinking while pregnant, which seems to be the main message of the poster. Supporting this is a quote on the lower half of the poster suggesting that women trying to get pregnant should avoid drinking alcohol. The biggest problem is that half of the data shown on the top half of the poster does not suggest that women who are trying to get pregnant should drink less, which would be legitimate advice considering the risks shown once a woman becomes pregnant. The poster instead suggests that women use birth control when drinking, insinuating that because women can have children, they are to be held accountable for the risks of drinking, which is why this poster has become so controversial.

Top Half of CDC Poster on Women and Risks of Drinking

Poster Link

Original CDC Article Link