31009 Harvesting the Twitter Firehose for Measurement and Evaluation: A Content Analysis of Tweets From the CDC's Tips From Former Smokers Campaign

Glen Szczypka, MA1, Sherry Emery, MBA, PhD2 and Eman Aly, MSW2, 1Institute for Health Research and Policy, University of Illinois, Chicago, IL, 2Institute for Health Research and Policy, University of Illinois at Chicago, Chicago, IL

Theoretical Background and research questions/hypothesis: The use of social media has grown rapidly over the last several years. Most television programs and televised advertising now have a social media component designed to expand reach and engagement with the audience. To date, the tobacco control community has relied on traditional media (paid television, radio, billboard, and print advertising) to promote its messages. On March 19, 2012, the Centers for Disease Control and Prevention (CDC) launched “Tips from Former Smokers.” This campaign was the CDC’s largest anti-smoking campaign ever and its first national advertising effort. The campaign was reported to last four months and consisted of both traditional and social media, including a website, a Facebook page, a YouTube channel, and a Twitter handle and hashtag. This presentation will measure and evaluate a key social media component of the campaign: its Twitter reach and impact. The social media goals of the “Tips from Former Smokers” campaign were to engage social media users, motivate them to become ambassadors for key campaign messages, encourage smokers to quit, and prevent youth from smoking in the first place. The CDC created a Twitter user handle, @CDCTobaccoFree, and a hashtag, #CDCTips, to tweet current data, articles, research, and campaign resources. The digital curation and content analysis of Twitter data provide a useful tool for measuring online public engagement, audience sentiment, and campaign discourse.

Methods: Since its inception in 2006, Twitter has grown to be one of the largest social media networks in the United States. Besides the text of the tweet, other metadata are attached to each individual tweet. These include hashtags, mentions, retweets, user names, the user's Twitter page, internet device, internet links, and geo-locations. This presentation will report the results of a study of all tweets and metadata associated with the Twitter handle @CDCTobaccoFree, the hashtag #CDCTips, and all smoking-related keywords during the four-month campaign. The researchers have contracted an outside data provider to obtain access to Twitter’s complete data stream, called the Firehose. In contrast to the publicly available stream, which provides a 1% sample of tweets, the Firehose provides real-time search access to 100% of tweets and metadata on Twitter. We will report on the overall reach and audience engagement of the campaign through an analysis of unique users reached, number of retweets, and mentions. This information will not only track the engagement of individual users but also measure the engagement of state tobacco control programs in the campaign. A sentiment analysis will be conducted on tweets to gauge the emotional valence of the campaign and of individual television ads. Finally, using keywords for quitting and uptake, the number of Twitter users who express interest in quitting or prevention will be reported.
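
As an illustration only, the engagement counts described above (unique users, retweets, mentions of the campaign handle, uses of the campaign hashtag, and users matching quit-related keywords) could be tallied along the following lines. This is a minimal sketch, not the study's actual pipeline: the record fields `user` and `text`, the "RT @" retweet heuristic, and the quit keyword list are all assumptions made for the example, and real Firehose records carry richer metadata than shown here.

```python
import re
from collections import Counter

HANDLE = "cdctobaccofree"   # campaign handle, lowercased for matching
HASHTAG = "cdctips"         # campaign hashtag, lowercased for matching
# Hypothetical keyword list; the study's actual keywords are not given.
QUIT_KEYWORDS = ("quit smoking", "stop smoking", "quitting")

MENTION_RE = re.compile(r"@(\w+)")
HASHTAG_RE = re.compile(r"#(\w+)")


def summarize(tweets):
    """Tally simple engagement metrics over an iterable of tweet dicts.

    Each tweet is assumed to be a dict with 'user' and 'text' keys.
    """
    users = set()
    quit_users = set()
    counts = Counter()
    for tweet in tweets:
        text = tweet["text"]
        lowered = text.lower()
        mentions = {m.lower() for m in MENTION_RE.findall(text)}
        hashtags = {h.lower() for h in HASHTAG_RE.findall(text)}

        users.add(tweet["user"])
        counts["tweets"] += 1
        # Assumption: a retweet is any tweet whose text starts with "RT @".
        if lowered.startswith("rt @"):
            counts["retweets"] += 1
        if HANDLE in mentions:
            counts["handle_mentions"] += 1
        if HASHTAG in hashtags:
            counts["hashtag_uses"] += 1
        if any(keyword in lowered for keyword in QUIT_KEYWORDS):
            quit_users.add(tweet["user"])

    return {
        "unique_users": len(users),
        "tweets": counts["tweets"],
        "retweets": counts["retweets"],
        "handle_mentions": counts["handle_mentions"],
        "hashtag_uses": counts["hashtag_uses"],
        "quit_interest_users": len(quit_users),
    }


# Made-up sample records, for demonstration only.
SAMPLE = [
    {"user": "alice", "text": "RT @CDCTobaccoFree: Tips from former smokers #CDCTips"},
    {"user": "bob", "text": "These ads make me want to quit smoking. #cdctips"},
    {"user": "alice", "text": "Watching TV tonight"},
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Matching is case-insensitive so that #CDCTips and #cdctips are counted together. Sentiment scoring is deliberately left out of the sketch, since it depends on a lexicon or classifier that the abstract does not specify.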

Results: Monitoring of tweets and metadata began on March 18th.

Conclusions:  Reach and impact of the campaign will be reported through the following social media metrics: public engagement, audience sentiment, and campaign discourse.

Implications for research and/or practice:  The analysis of social media data has become an essential tool in the evaluation of health media campaigns.