This week, I read Yahoo! Research's paper Twitter Under Crisis: Can We Trust What We RT?, written by a trio of researchers out of Chile and Spain. The analysis covered all the tweets sent on Twitter from February 27 to March 2 of 2010, a data set most likely sourced through Yahoo!'s access to Twitter's 'firehose'.
The crisis was the major earthquake (apparently the seventh worst ever recorded) that hit Chile on February 27, and the tsunamis that hit shortly afterward. They filtered the data to focus on accounts based out of Chile (identified largely by timezone settings, which were the most reliable indicator they had available), along with a few other factors, so they could limit the data set while concentrating on the users most directly affected by the disaster.
The Chilean earthquake was interesting because, as a disaster, it did a ton of damage to Chile's telecommunications infrastructure. As would be expected, the traffic spiked early, quickly overtaking discussion of other events around Chile at the time, then petered out within a week or so. However, the idea that people use Twitter to trade news about disasters is hardly news.
What they found was largely unsurprising. Most people (68%) only sent out one or two tweets specifically about the disaster. The most active tweeters about the disaster generally had the most followers, but they were also generally news outlets covering the story (the top was an account named "BreakingNews"). One thing that did surprise me was the relatively low number of retweets about the disaster, but I suppose that people in the heart of it weren't spending a lot of time reading their Twitter feeds for things to resend.
The keyword analysis was also fascinating, showing that Twitter could be used to gauge the progress of a disaster. The first day was all about the earthquake, the resulting tsunamis and people dying. Days two and three focused on looking for missing people, and day four had a ton of discussion about the NASA story saying that this earthquake was so powerful, it actually disrupted the rotation of the Earth, making days approximately 1.26 microseconds shorter.
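To make the day-by-day keyword idea concrete, here is a minimal sketch in Python. It is not the paper's method: the sample tweets, the stopword list and the function name top_keywords_by_day are all made up for illustration, and a real analysis would need proper tokenization and Spanish-language handling.

```python
from collections import Counter, defaultdict

# Hypothetical sample data: (day number, tweet text) pairs invented for
# illustration. These are not tweets from the paper's data set.
SAMPLE_TWEETS = [
    (1, "Earthquake in Chile, strong earthquake felt in Santiago"),
    (1, "Tsunami warning after the earthquake"),
    (2, "Looking for missing family in Concepcion"),
    (2, "Missing person, please help find her"),
    (4, "NASA says the quake shortened the day"),
    (4, "NASA: Earth rotation changed"),
]

# A tiny, assumed stopword list; a real one would be much longer.
STOPWORDS = {"in", "the", "after", "for", "a", "please", "says", "her"}


def top_keywords_by_day(tweets, n=3):
    """Return the n most frequent non-stopword terms for each day."""
    counts = defaultdict(Counter)
    for day, text in tweets:
        # Crude tokenization: split on whitespace, trim punctuation, lowercase.
        words = [w.strip(".,!?:").lower() for w in text.split()]
        counts[day].update(w for w in words if w and w not in STOPWORDS)
    return {
        day: [word for word, _ in counter.most_common(n)]
        for day, counter in sorted(counts.items())
    }


if __name__ == "__main__":
    for day, words in top_keywords_by_day(SAMPLE_TWEETS).items():
        print(day, words)
```

Even something this crude shows the shape of the result: the dominant term shifts from one day to the next as the conversation moves on.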
The next interesting part of the analysis looked at the discussion of fourteen rumours spotted in the Twitter data, seven proven true and seven proven false. This is a small data set, but the findings are interesting. People were far more likely to question or deny the false rumours (oddly, there were still a lot of affirmations of the false rumours). This is going to require more study, but with enough data, it appears that the way people respond to a claim on Twitter can be used as a reasonable predictor of whether that claim is true.
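As a rough illustration of how that signal might be used, the sketch below computes the share of tweets about a rumour that deny or question it, and flags the rumour as doubtful past a cutoff. Everything here is an assumption on my part: the labels, the example counts, the 0.3 threshold and the function names are hypothetical and do not come from the paper.

```python
from collections import Counter


def denial_share(labels):
    """Fraction of labelled tweets that deny or question a rumour.

    labels is a list of strings, each one of "affirm", "deny" or
    "question". Any other label is ignored.
    """
    counts = Counter(labels)
    judged = counts["affirm"] + counts["deny"] + counts["question"]
    if judged == 0:
        return 0.0
    return (counts["deny"] + counts["question"]) / judged


def looks_doubtful(labels, threshold=0.3):
    """Flag a rumour when enough people push back on it.

    The 0.3 threshold is an arbitrary placeholder, not a value from the paper.
    """
    return denial_share(labels) >= threshold


if __name__ == "__main__":
    # Hypothetical label counts, invented for illustration.
    rumour_a = ["affirm"] * 9 + ["deny"]
    rumour_b = ["affirm"] * 4 + ["deny"] * 4 + ["question"] * 2
    print(denial_share(rumour_a), looks_doubtful(rumour_a))
    print(denial_share(rumour_b), looks_doubtful(rumour_b))
```

With only fourteen rumours in the study, any real threshold would need far more data behind it, and the labelling of each tweet is the hard part that this sketch skips entirely.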
There were interesting findings in this paper, but for the most part, I think it's a starting point. The findings are promising, in that if you had full access to Twitter's firehose, you could draw a lot of reasonable conclusions from the data hitting Twitter over the course of a disaster.
Next week, I'm going to be reading Language Support for Lightweight Transactions, which I'll post a link to if it's in an open access journal. The paper serves as the basis of features like Haskell's Software Transactional Memory.