Twitter Voodoo is a realtime visualization of public sentiment about 50 world-famous icons based on a sampling of 50 random tweets/second. It was a collaboration between me and Harry Seo for DESMA 161: Network Media.
Decoding the Visualization
Every time someone says something positive about one of our voodoo people, the heart scales according to how positive that tweet is. So if someone says something like, “OMG, Justin Bieber just had the most amazing show ever. Love. Puppies. Rainbows.” Justin’s heart would get really big. If someone said “Justin Bieber is alright,” the heart would be relatively small. Over time, this effect gives the illusion of a beating heart.
Every time someone says something negative about one of our voodoo people, a pin is added to a random place on the doll. Once all of the pin slots on the doll have been filled, extra pins are added below it, along with a number that keeps count of the extra needles.
The prompt for this project forced us to use data captured from a crowd of people and display it in a meaningful way.
I immediately liked the idea of taking data that wasn’t given to us but that instead was posted publicly for some other reason. To me, this gave the project a very authentic feel, one that seemed less contrived than asking people to answer predefined questions.
The voodoo theme seemed like a natural way to critique and question the role of social media in our lives today. The relationship between a person and their virtual self is strikingly similar to the one between a person and their voodoo doll. We would never allow someone in real life to prick us with a pin, yet we often let people’s online pins poke real holes in our sense of self-worth and happiness. In a way, Twitter has become a real virtual voodoo doll for the people who use it. We wanted to visualize these little instantaneous bursts of opinion that have tangible effects on the real world.
By far the most consistently and frequently discussed person was Donald Trump, followed closely by Twitter and YouTube. All the other profiles were talked about here and there, but not around the clock like these three. One time while running the site, Demi Lovato had a lot of people tweeting about her, and based on the tweets, it seemed like she had just finished a performance at a concert. So Twitter Voodoo is somewhat effective as a radar, showing which figures people are talking about at any given moment, or who is trending right now.
Overall, it seemed like a small majority of tweets were categorized negatively, though certain profiles sometimes leaned mostly positive or mostly negative. The most tweeted-about profiles were often polarizing, drawing a lot of positive tweets as well as a lot of negative ones, rather than giving us the definitive categorization I was expecting.
Perhaps this is the most significant finding of this experiment: there is no general and definite sentiment about popular figures of our or any time. There are always people who love what others do, and there are always people who hate it. Trying to characterize people as having or losing public opinion is often, but not always, a grey area rather than black and white.
Harry is an incredibly talented illustrator, so he designed the entire website, from colors to fonts, and even created all 50 illustrations. He tested the drawings with friends and iterated on them several times in order to make each doll as recognizable as possible. I focused my efforts on the technical parts of the project, breathing life into Harry’s beautiful skins and refining the overarching vision for the project.
Behind the Code
Twitter Voodoo had a number of unique development challenges that I had never tackled before, which proved to be interesting problems to solve. First of all, I needed a way to pull data from Twitter in real time. Fortunately, PubNub has a live Twitter stream that you can access with a key to their API. The second challenge was using sentiment analysis to determine whether a statement was positive or negative, as well as how positive or negative.
IBM Watson has a service called Natural Language Understanding that makes sentiment analysis as easy as an API call too, returning a positive or negative sentiment along with a strength factor. We mapped the relative positivity of a statement to the size of the heart, and each negative comment to a pin placed randomly on the doll.
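To make that mapping concrete, here is a minimal sketch in Python. It is not the site's actual code. It assumes the sentiment score arrives as a number from -1.0 (most negative) to 1.0 (most positive), with 0 meaning neutral; the names `DollState`, `apply_sentiment`, and the two scale constants are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical display constants; the real values used on the site are not known.
MIN_HEART_SCALE = 1.0  # heart size for a barely positive tweet
MAX_HEART_SCALE = 2.5  # heart size for a glowing tweet


@dataclass
class DollState:
    heart_scale: float = MIN_HEART_SCALE
    pins: int = 0


def apply_sentiment(doll: DollState, score: float) -> DollState:
    """Update one doll from one tweet's sentiment score (assumed range -1.0..1.0)."""
    if score > 0:
        # Positive tweet: the heart grows in proportion to the strength.
        strength = min(score, 1.0)
        doll.heart_scale = MIN_HEART_SCALE + strength * (MAX_HEART_SCALE - MIN_HEART_SCALE)
    elif score < 0:
        # Negative tweet: one more pin, regardless of strength.
        doll.pins += 1
    # A neutral score of exactly 0 changes nothing.
    return doll
```

Because each positive tweet resets the heart to a new size, a steady stream of tweets with different strengths is what would produce the beating effect.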
Third, the biggest visual programming challenge was how to give the illusion of relatively random placement of pins on the voodoo dolls. It’s not easy to restrict pin placement to an irregular shape like a voodoo doll without occasionally placing a pin outside of the figure, say in the space between the legs or between an arm and a leg. To solve this, I created an array of 6 possible positions for the pins to be placed in and then assigned each pin a random rotation around its tip. This gave us enough control to ensure that the pin placement was well-balanced but also not too similar to any of the other dolls’ pins.
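The slot approach can be sketched roughly like this in Python. Again, this is an illustration under assumptions rather than the original implementation: the slot coordinates, the rotation range of plus or minus 45 degrees, and the names `PIN_SLOTS`, `Doll`, and `add_pin` are all made up for the example.

```python
import random
from dataclasses import dataclass, field

# Hypothetical (x, y) slot positions inside the doll's outline, in doll-local pixels.
PIN_SLOTS = [(40, 30), (60, 30), (50, 60), (35, 90), (65, 90), (50, 110)]
MAX_ROTATION_DEGREES = 45.0  # assumed range; the post does not give one


@dataclass
class Doll:
    free_slots: list = field(default_factory=lambda: list(PIN_SLOTS))
    placed: list = field(default_factory=list)  # (x, y, rotation_degrees) per pin
    overflow: int = 0  # extra needles counted once every slot is taken


def add_pin(doll: Doll, rng: random.Random):
    """Place a pin in a random free slot, or count it as overflow if none is left."""
    if not doll.free_slots:
        doll.overflow += 1
        return None
    # Pick and remove a random free slot so no two pins share a position.
    x, y = doll.free_slots.pop(rng.randrange(len(doll.free_slots)))
    # Rotate the pin around its tip by a random angle.
    rotation = rng.uniform(-MAX_ROTATION_DEGREES, MAX_ROTATION_DEGREES)
    pin = (x, y, rotation)
    doll.placed.append(pin)
    return pin
```

Choosing from a fixed list of positions guarantees every pin lands on the figure, while the random order and rotation keep two dolls from looking identical.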
There are almost 8,000 tweets made every single second. That’s an enormous amount of data; in fact, it’s probably too much data to parse through or even read. In order to make that number more consumable, PubNub limits the number of tweets you can stream to 50 per second. Even that is an incredible amount of data.
So if this amount of data is less than 1% of all the tweets at any given second (50 out of roughly 8,000 is about 0.6%), then you can easily imagine how overwhelming the whole amount would be. This limitation means that even though these samples are random, the site is still technically only representing about 0.6% of the opinions voiced on Twitter, not all of them.
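For anyone who wants to check the proportion, here is the arithmetic as a few lines of Python, using only the two figures quoted above (roughly 8,000 tweets per second overall and 50 per second in the stream). The names are just for illustration.

```python
TOTAL_TWEETS_PER_SECOND = 8_000   # approximate figure quoted in the text
SAMPLED_TWEETS_PER_SECOND = 50    # stream limit quoted in the text


def sampled_share(sampled: int, total: int) -> float:
    """Fraction of all tweets that the sample covers."""
    return sampled / total


share = sampled_share(SAMPLED_TWEETS_PER_SECOND, TOTAL_TWEETS_PER_SECOND)
print(f"Sampled share: {share:.3%}")  # 50 / 8000 = 0.00625, i.e. 0.625%
```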
Impressively, IBM Watson’s sentiment analysis works with 8 languages including English, Arabic, French, German, Italian, Portuguese, Russian, and Spanish, but there are undoubtedly many tweets in languages that are not included in these 8. As a result, the opinions and sentiments of those people are completely left out. This may make the data we collect skewed since only users who tweet in these 8 languages are included in our visualization.
Furthermore, the very demographics of Twitter affect the results of the visualization too. Even though the US is nowhere near the most populated country in the world, it easily holds the position for the most active users on Twitter, meaning that America has its opinions disproportionately represented compared to other countries like China and India, whose primary languages aren’t even supported by Watson.
Sometimes Watson makes mistakes. Parsing tweets for sentiment is difficult, even for humans, so we have to forgive Watson for occasionally mis-categorizing certain tweets.
Some tweets are ambiguous or missing context, so even humans would have a hard time knowing whether they are positive or negative.
Other tweets include no real parsable information on which Watson can make an educated guess.
In both cases, if Watson detects anything and returns any value other than a neutral ‘0’ then that result is counted and included in the visualization, even though the result is probably not accurate.
SELECTED “TOP 50” USERS
The 50 users we chose had to be recognizable by most people because otherwise the concept would fall flat. If you don’t know who the person is, does it matter what other people think about them? We also needed people who are talked about a lot because otherwise there would be no tweets about them and the site would just be a bunch of pretty illustrations. The last parameter we had was that we didn’t want to pick and choose individual users. This felt like we were interfering too much with the data, and we didn’t want our bias to factor into this project at all.
There are many metrics with which to rank Twitter profiles, and at the end of the day, we decided to use the Top 50 Twitter profiles ranked by followers. This selection satisfied all of our parameters and seemed more objective than ranking by “engagement,” which could be gamed by bot programs. The sample was also reasonably diverse, including world leaders, entrepreneurs, musicians, athletes, media, and tech companies as well as other miscellaneous celebrities.
This selection was also made at an arbitrary time, and even though the majority of these profiles stay in the top 50, they shift positions every once in a while. The site does not add voodoo dolls dynamically, so the order of the dolls is a snapshot of that particular time in history. Maybe in future iterations of the project, we could dynamically create dolls so that we could visualize Twitter sentiment for any user. That would be cool.