Background: The rise of social media creates large volumes of valuable textual data. Textual data come from a variety of sources including search phrases and social commentary (Google, Facebook, Twitter, etc.). Transforming this large volume of “unstructured” data into meaningful insights presents significant technical challenges, and text analytics is increasingly being used as a tool to meet this challenge. Deriving meaningful information from this sea of texts is important for public health practitioners because an increasing body of evidence has demonstrated that public and Web-mediated discussions about disasters, widespread diseases, and even mentions of health issues by celebrities impact health attitudes and behaviors of millions of people. In the age of budget austerity, developing the ability to synthesize this type of data into meaningful, actionable insights is critically important to maximizing the return on investment for every dollar spent, and for effective targeting of public health messages.
Program background: IQ Solutions, Inc., has conducted several text analytics projects for Federal, state, and nonprofit clients— covering topics that range from drug usage, suicide, and vitamin and supplement inquiries. Results from these analytics projects have helped to inform Web content, marketing campaigns, and Web designs.
Evaluation Methods and Results: Search phrases that people used to access content for Federal health agency Web sites were explored in response to both sudden spikes in certain topic areas and events that compelled focused investigations on how people accessed specified content. In both instances, search phrases used to access content were compared against national search trends using Google Trends. News media were also explored to see how related events may have contributed to online search behaviors. By evaluating how people accessed content on Web sites through the broader lenses of online public discourse and events, IQ Solutions was able to (1) identify rising topics of interest in certain health domains in real time, (2) recommend evidence-based strategies for framing content, and (3) provide strategies for maximizing public access to existing content by matching popular search phrases with the content.
Conclusions: Using text analytics to draw meaning from emerging communication tools, such as Google Trends, to inform health communication is a vital approach for drawing actionable information from volumes of “unstructured” social data online. It is especially beneficial for public health agencies and organizations that have the difficult task of dispersing critical health information in a timely, accurate, and relevant manner on a national scale.
Implications for research and/or practice: As more and more people seek and discuss health information online, strategies for analyzing large volumes of social text will become increasingly important for identifying breakthrough topics, understanding search behaviors and its implications on terminology use for public health campaigns, maximizing use of existing content, and informing future directions for public health.