Theoretical Background and research questions/hypothesis: Both pro-vaccine and anti-vaccine information are prevalent on Twitter. Previous research has focused on how anti-vaccine information spreads on social media, leaving a gap in the literature on how pro-vaccine information spreads. This study aims to identify the factors associated with high retweet frequency and high “reach” count in a corpus of highly retweeted tweets with the hashtag #vaccineswork (a pro-vaccine hashtag). “Reach” is defined here as the amount of person-time during which subjects were potentially exposed to the tweet (without necessarily having seen or read it), and is calculated as the sum of the followers of the seed user and the followers of all users who retweeted the original tweet.
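The “reach” calculation described above can be sketched in a few lines. This is only an illustration of the stated definition: the function name and the follower counts are hypothetical and are not taken from the study data. As defined, the sum counts a follower once for every account they follow, so users who follow several of the accounts are counted more than once.

```python
def reach(seed_followers, retweeter_followers):
    """Sum the seed user's followers and the followers of every retweeter.

    Hypothetical helper illustrating the abstract's definition of "reach".
    Followers shared between accounts are counted once per account.
    """
    return seed_followers + sum(retweeter_followers)


# Illustrative numbers only (not study data): a seed user with 1,000
# followers whose tweet is retweeted by users with 200 and 50 followers.
example_reach = reach(1000, [200, 50])
print(example_reach)
```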
Methods: A corpus of Twitter data with the hashtag #vaccineswork (N=194,259; January 1, 2014 to April 30, 2015) was purchased from GNIP, Inc., of which 201 tweets with 100 or more retweets were manually coded. Separate zero-truncated negative binomial regression models were fitted to identify the content categories associated with “reach” count and retweet frequency, respectively. For both outcome variables, 99 was subtracted from each value before fitting the zero-truncated models because the data were truncated at 99.
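The shift-then-truncate step can be illustrated as follows. This is a minimal sketch, not the authors' code: the abstract does not state which software or parameterization was used, so the NB2 form (mean `mu`, dispersion `alpha`) and all function names here are assumptions. Subtracting 99 maps the smallest admissible value (100) to 1, which is the smallest value a zero-truncated count model allows.

```python
import math

TRUNCATION_POINT = 99  # value subtracted from both outcomes, per the Methods


def shift_outcome(value):
    """Shift an outcome so that the smallest admissible value (100) becomes 1."""
    shifted = value - TRUNCATION_POINT
    if shifted < 1:
        raise ValueError("outcome must exceed the truncation point")
    return shifted


def ztnb_logpmf(y, mu, alpha):
    """Log-probability of y >= 1 under a zero-truncated negative binomial.

    Assumes the NB2 parameterization: mean mu, variance mu + alpha * mu**2.
    """
    r = 1.0 / alpha          # negative binomial size parameter
    p = r / (r + mu)         # success probability
    log_pmf = (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(p) + y * math.log1p(-p)
    )
    log_p0 = r * math.log(p)  # log-probability of a zero count
    # Renormalize over y >= 1 by dividing by 1 - P(Y = 0).
    return log_pmf - math.log1p(-math.exp(log_p0))
```

In a regression setting, `mu` would be modeled as the exponential of a linear predictor built from the coded content categories, and the log-likelihood would be the sum of `ztnb_logpmf` over the shifted outcomes.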
Results: After controlling for other content categories, with “reach” as the outcome variable, tweets that mentioned vaccines preventing deaths and/or outbreaks had 39% more “reach” (adjusted probability ratio, aPR=1.3867; 95% CI 1.0755, 1.7879; P=0.012) compared with tweets that did not mention vaccine efficacy. After controlling for other content categories, with retweet frequency as the outcome variable, all content variables were statistically significantly associated with retweet frequency: mentions of vaccines preventing deaths and/or outbreaks (aPR=0.6213; 95% CI 0.6212, 0.6214; P<0.001); mentions of childhood vaccination (aPR=9.6001; 95% CI 9.5990, 9.6013; P<0.001); mentions of a professional organization (aPR=0.5954; 95% CI 0.5953, 0.5955; P<0.001); mentions of the efficacy of vaccines (aPR=0.7140; 95% CI 0.7139, 0.7141; P<0.001); mentions of global vaccination improvement/efforts (aPR=0.4487; 95% CI 0.4486, 0.4488; P<0.001); mentions of people’s lack of access to vaccines and vaccination (aPR=0.4853; 95% CI 0.4853, 0.4854; P<0.001); a focus on World Immunization Awareness Week (aPR=1.3089; 95% CI 1.3086, 1.3091; P<0.001); and mentions of outbreaks and/or deaths from vaccine-preventable diseases (aPR=0.1904; 95% CI 0.1904, 0.1905; P<0.001).
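The ratios above can be read as percent changes in the outcome relative to the reference group, which is how the 39% figure for “reach” follows from aPR=1.3867. The short sketch below shows the arithmetic; the function name is hypothetical. Ratios below 1 correspond to a lower outcome than in the reference group.

```python
def percent_change(ratio):
    """Convert a ratio estimate into a percent change versus the reference group."""
    return (ratio - 1.0) * 100.0


# Ratios taken from the Results above.
reach_change = round(percent_change(1.3867))    # "reach" model
retweet_change = round(percent_change(0.6213))  # retweet-frequency model
print(reach_change, retweet_change)
```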
Conclusions: Within the #vaccineswork corpus, the number of followers of a user was found not to be a confounder in predicting either the “reach” or the retweet frequency of a tweet. The content of #vaccineswork tweets was associated with their retweet frequency. For the potential “reach” of a tweet, however, only the mention of vaccines preventing deaths and/or outbreaks was associated with higher “reach”.
Implications for research and/or practice: A better understanding of the types of content that attract more attention in pro-vaccination tweets may help public health practitioners disseminate pro-vaccine information on Twitter.