Theoretical background and research questions/hypothesis: Previous research has analyzed individual Twitter chats as case studies of public health communication on social media. This study compared four Twitter chats held by the Centers for Disease Control and Prevention (CDC) in 2016: #PublicHealthChat was our primary interest, while #AMRChallenge, #HIVAgingChat, and #CDCPrep2016 were used as control groups. The following research questions and hypothesis are addressed. RQ 1: Compared with the other chats, how many original tweets and how many unique users were in the #PublicHealthChat corpus? RQ 2: How did users connect through the retweet network during these four events? Were the retweet networks centralized? If so, who were the hubs? H1: Twitter accounts that have more followers, post more original tweets, and retweet other accounts less frequently receive more retweets.
Methods: Twitter data were retrieved via the Twitter Search Application Programming Interface (API) and web scraping techniques (N=5,281). Social network analysis and a two-component hurdle regression model (logistic regression for zero versus non-zero counts, and negative binomial regression for positive counts) were applied to the data.
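To make the two-component structure concrete, the following is a minimal sketch of the probability mass function of a logit/negative binomial hurdle model, not the fitted model from this study. It assumes a zero-truncated NB2 parameterization for the positive counts and a natural logarithm for followers count. The slope values are the logarithms of the ratios reported in the Results; the intercepts, the dispersion `alpha`, the count-model coefficient for retweeting others, and the example user are hypothetical placeholders.

```python
import math


def logistic(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def nb_pmf(k, mu, alpha):
    """NB2 pmf with mean mu and dispersion alpha (variance = mu + alpha * mu**2)."""
    r = 1.0 / alpha
    log_p = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
             + r * math.log(r / (r + mu)) + k * math.log(mu / (r + mu)))
    return math.exp(log_p)


def hurdle_pmf(k, x, gamma, beta, alpha):
    """P(Y = k): logistic hurdle for zero vs. non-zero, truncated NB for k >= 1."""
    p_pos = logistic(sum(g * xi for g, xi in zip(gamma, x)))
    if k == 0:
        return 1.0 - p_pos
    mu = math.exp(sum(b * xi for b, xi in zip(beta, x)))
    # Rescale the NB pmf so that it sums to one over k = 1, 2, ...
    return p_pos * nb_pmf(k, mu, alpha) / (1.0 - nb_pmf(0, mu, alpha))


# Covariate order: intercept, relevant tweets, retweets of others, log(followers).
# Intercepts (-3.0 and 0.0) are hypothetical; they are not reported in the abstract.
gamma = [-3.0, math.log(1.82), math.log(0.34), math.log(1.74)]  # zero part
# The count coefficient for retweeting others is a placeholder (0.0, no effect).
beta = [0.0, math.log(1.06), 0.0, math.log(1.35)]               # count part
alpha = 1.0                                                     # hypothetical dispersion

# Hypothetical user: 5 relevant tweets, 2 retweets of others, 1,000 followers.
x_example = [1.0, 5.0, 2.0, math.log(1000)]

p_zero = hurdle_pmf(0, x_example, gamma, beta, alpha)
print(f"P(no retweets received)   = {p_zero:.3f}")
print(f"P(exactly 3 retweets)     = {hurdle_pmf(3, x_example, gamma, beta, alpha):.3f}")
```

Exponentiating a zero-part coefficient recovers the corresponding odds ratio, and exponentiating a count-part coefficient recovers the multiplicative effect on the expected positive count, which is how the ratios in the Results are read.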
Results: Among the four Twitter chats studied, #PublicHealthChat had the highest number of original tweets (n=397/1074; 37%). Up to 348 unique users engaged in #PublicHealthChat, which ranked third among the four chats. The #PublicHealthChat retweet network was moderately centralized: the top 10% of users received over 97% of retweets, and @CDCgov stood at the center, dominating the information diffusion process. The majority (81.2%) of the most retweeted 10% of users were health organizations, which dominated the #PublicHealthChat conversations. A two-component hurdle regression model was applied to investigate the relationship between retweet count, the dependent variable, and three independent variables: “frequency of relevant tweets”, “frequency of retweets from others” and “the logarithm of followers count”. All three predictors significantly discriminated between zero and non-zero retweet counts. The odds ratios for “frequency of relevant tweets”, “frequency of retweets from others” and “the logarithm of followers count” were 1.82 (95% CI, 1.49, 2.21), 0.34 (95% CI, 0.20, 0.58) and 1.74 (95% CI, 1.33, 2.27), respectively. In the count model, only the frequency of relevant tweets and the logarithm of followers count were significant, with relative risks of 1.06 (95% CI, 1.03, 1.08) and 1.35 (95% CI, 1.22, 1.48), respectively.
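The concentration figures above can be illustrated with a short sketch that computes, from a list of retweet edges, the share of retweets received by the top 10% of users and a Freeman-style in-degree centralization score. The abstract does not state which centralization measure was used, so the Freeman index is an assumption here, and the user names and edges are hypothetical toy data rather than the study's corpus.

```python
from collections import Counter


def retweets_received(edges, users):
    """Count retweets received per user; each edge is (retweeter, original_author)."""
    counts = Counter(author for _, author in edges)
    return {u: counts.get(u, 0) for u in users}


def top_share(received, fraction=0.10):
    """Share of all retweets received by the top `fraction` of users."""
    ranked = sorted(received.values(), reverse=True)
    # Number of top users, rounded to the nearest whole user and at least one.
    k = max(1, round(fraction * len(ranked)))
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


def indegree_centralization(edges, users):
    """Freeman in-degree centralization on distinct retweeter -> author pairs."""
    n = len(users)
    if n < 2:
        return 0.0
    # Collapse repeated retweets and drop self-loops so the graph is simple.
    pairs = {(a, b) for a, b in edges if a != b}
    indeg = Counter(author for _, author in pairs)
    degrees = [indeg.get(u, 0) for u in users]
    d_max = max(degrees)
    # (n - 1)**2 is the value attained by a directed star, the most centralized case.
    return sum(d_max - d for d in degrees) / (n - 1) ** 2


# Hypothetical toy network: eight users retweet "hub", one user retweets "u1".
users = ["hub"] + [f"u{i}" for i in range(1, 10)]
edges = [(f"u{i}", "hub") for i in range(1, 9)] + [("u9", "u1")]

received = retweets_received(edges, users)
share = top_share(received)
centralization = indegree_centralization(edges, users)
print(f"Top 10% share of retweets: {share:.3f}")
print(f"In-degree centralization:  {centralization:.3f}")
```

A score of 1 corresponds to a pure star in which every other user retweets a single hub, and 0 to a network in which all users are retweeted equally often.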
Conclusions: @CDCgov dominated the information diffusion process of #PublicHealthChat; this success could be interpreted as a consequence of its large number of followers combined with its posting as many as 32 tweets within a single hour. Contrary to existing literature, our results also showed that retweeting from others was negatively associated with whether a tweet would be retweeted but was unrelated to the actual retweet counts.
Implications for research and/or practice: To our knowledge, this is one of the first studies to compare multiple Twitter chats pertinent to public health. Its analytic results can serve as a reference for future CDC Twitter chats. In particular, newcomers to Twitter, despite having fewer followers, could still disseminate their information and achieve the intended effects by regularly composing original, relevant tweets.