

Science and Technology

Going down in #history: Library of Congress receives donation to archive tweets

Micah Benson | Art Director

A generous donation by Twitter to the Library of Congress will provide insight into the company’s entire archive of tweets: some 170 billion and counting, dating back to 2006.

Each logged tweet comes with a host of supplementary data, such as the tweet’s geographic origin, the number of retweets and the author’s follower list, according to a Jan. 3 article published in The Washington Post.

William Ward, a social media professor at the S.I. Newhouse School of Public Communications, praised the new endeavor.

“Semantic analysis can be gathered by mining through people’s moods and cross referencing historical trends,” Ward said.

Twitter is an example of big data, which Ward defined as “the collection of massive amounts of data for seeking trend insights.” Big data has become possible only in the last decade, with the rise of social media. Twitter’s enormous data pool grows by nearly 400 million tweets per day, according to The Washington Post article.



The tweets often arrive at rates surpassing 3,000 per minute and originate from nearly every country on the planet, according to an Aug. 28 USA Today article.

Ward said big data allows viewers to see spikes in tweeting rates that may help “discover trends when people are more active or non-active during events like the Arab Spring.”

However, these tweets are not necessarily representative of the population. About 41 percent of Twitter users are between the ages of 18 and 29, according to AdvertisingAge. Twitter users also tend to earn less than the general population described in the U.S. Bureau of Labor Statistics income breakdown, the publication reported.

The archive will remain in a raw state because of continued budgetary constraints. The U.S. Committee on Appropriations awarded the library $79 million less than its proposed $587 million budget. The lack of funding greatly limits search query speed and server capability, according to The Hill, a congressional newspaper.

In addition, the current archive will not include deleted tweets or tweets from users who protect their profiles with strict privacy settings, according to The Washington Post article.

Jill Ann Hurst-Wahl, an associate professor of practice and director in the School of Information Studies, said there are still some questions to be answered about the archive.

“What stories will the [archive] tell about what we are as a people? Can we see changes in how we think or act?” she said.

It will be very interesting to learn what researchers can discover from the archive, Hurst-Wahl said.

She also noted that access to the database of billions of tweets will be available only after a six-month delay and only on library premises.

The Twitter archive is the library’s first venture into preserving the seemingly fleeting messages of the digital age. The database gives the public the ability to relive a moment through the lens of the Twittersphere.




