Convince Them
Published in Walter DeGrange, Lucia Darrow, Field Guide to Compelling Analytics, 2022
WHAT IF we could track and predict outbreaks of a virus before cases are confirmed by in-person physician visits? This foresight would allow healthcare providers to prepare resources and reduce the overall impact of a virus on the community. In 2008, Google attempted to do just that for seasonal influenza with its search-engine-based tool, Google Flu Trends. The idea behind the tool was relatively simple: use trending search queries to predict regional influenza-like illness physician visits within the United States. For example, if someone enters a query such as “do I have the flu?” or “flu symptoms,” there is a strong likelihood that the searcher is experiencing flu-like symptoms. By drawing on the wealth of search query data available, Google could crowdsource flu outbreak detection.
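To make the idea concrete, the sketch below fits a simple linear regression linking the weekly share of flu-related queries to the rate of influenza-like-illness (ILI) physician visits, then "nowcasts" the current week before visit data are reported. This is only an illustration of the approach, not Google's actual model, and all numbers are invented.

```python
# Illustrative sketch (not Google's actual model): relate the share of
# flu-related search queries to the ILI physician-visit rate with a
# simple linear regression. All data below are made-up toy numbers.

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (simple linear regression)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

# Toy weekly data: fraction of all queries that are flu-related, and the
# observed ILI visit rate (per 100,000) for the same weeks.
query_share = [0.010, 0.015, 0.022, 0.030, 0.026]
ili_rate = [12.0, 18.5, 27.0, 37.5, 32.0]

a, b = fit_line(query_share, ili_rate)

# Nowcast this week's ILI rate from today's query share, days before
# physician-visit counts would normally be compiled.
this_week_share = 0.028
predicted_ili = a + b * this_week_share
print(round(predicted_ili, 1))
```

The appeal of the approach is exactly this immediacy: query volumes are available in near real time, while confirmed visit counts lag by days or weeks.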
Machine Learning Modeling from Health Care Data
Published in Chengliang Yang, Chris Delcher, Elizabeth Shenkman, Sanjay Ranka, Data-Driven Approaches for Health care, 2019
Research and industry have consistently shown that machine learning approaches are effective at analyzing large amounts of data and using the results to make predictions. Amazon applies users’ search and purchase histories to predict their next purchase. Uber forecasts transportation demand from historical data to help drivers find business more efficiently. Google Flu Trends (GFT) learns about influenza outbreaks from Google search queries on medical symptoms. For each of these applications, supervised and unsupervised machine learning is the key underlying technology for unleashing the power of data. As mentioned in the previous chapter, massive amounts of data accumulate in the health care world, so machine learning looks promising for addressing the high utilizer problem. To apply machine learning techniques well, researchers need to tailor them to identify high utilizers from data, interpret the factors that contribute to high utilization, and predict future high utilizers. This section describes several supervised and unsupervised machine learning approaches that can help address the high utilizer problem. We will start with the objectives of each approach and then delve into its technical details.
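To make the supervised setting concrete, here is a minimal, hypothetical sketch: past patients are labeled as high utilizers or not, a trivial nearest-centroid classifier is fit on two toy features, and a new patient is scored. The features, data, and classifier are illustrative stand-ins for the approaches described later in this section.

```python
# Hypothetical illustration of supervised learning for the high utilizer
# problem. Each patient is (ER visits last year, number of chronic
# conditions); label 1 marks a high utilizer. All data are invented.
train = [
    ((0, 1), 0), ((1, 0), 0), ((2, 1), 0),
    ((9, 4), 1), ((7, 3), 1), ((11, 5), 1),
]

def centroid(points):
    """Component-wise mean of a list of feature tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

c0 = centroid([x for x, y in train if y == 0])  # typical non-high-utilizer
c1 = centroid([x for x, y in train if y == 1])  # typical high utilizer

def predict(x):
    """Assign the label of the nearer centroid (squared Euclidean distance)."""
    d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
    d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
    return 1 if d1 < d0 else 0

print(predict((8, 4)))  # close to the high-utilizer centroid
print(predict((1, 1)))  # close to the non-high-utilizer centroid
```

Real applications would of course use far richer features (diagnoses, prescriptions, costs) and the stronger models discussed below, but the train-then-predict structure is the same.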
Data Quality and Inference Errors
Published in Ian Foster, Rayid Ghani, Ron S. Jarmin, Frauke Kreuter, Julia Lane, Big Data and Social Science, 2020
A well-known example of the risks of bad inference is provided by the Google Flu Trends series, which used Google searches on flu symptoms, remedies, and other related keywords to provide near-real-time estimates of flu activity in the US and 24 other countries. Compared to CDC data, Google Flu Trends provided remarkably accurate indicators of flu incidence in the US between 2009 and 2011. For the 2012–2013 flu season, however, the Google Flu Trends estimates were almost double the CDC’s (Butler, 2013). Lazer et al. (2014) cite two causes of this error: big data hubris and algorithm dynamics.
Human-Centered Artificial Intelligence: Reliable, Safe & Trustworthy
Published in International Journal of Human–Computer Interaction, 2020
Moving up to the middle range of consequential applications leads to medical, legal, environmental, or financial systems that can bring substantial benefits and harms. A well-documented case is the flawed Google Flu Trends, which was designed to predict flu outbreaks, enabling public health officials to assign resources more effectively (Lazer et al., 2014). The initial success did not persist, and after two years Google withdrew the website because the programmers had not anticipated the many changes in search algorithms, user behavior, and societal context. Lazer et al. describe the harmful attitude of programmers as “algorithmic hubris,” suggesting that some programmers have unreasonable expectations of their capacity to create foolproof autonomous systems, akin to what happened with the Boeing 737 MAX.
Why we need biased AI: How including cognitive biases can enhance AI systems
Published in Journal of Experimental & Theoretical Artificial Intelligence, 2023
After having presented the theoretical argument for why it might be helpful to include cognitive biases in machine learning algorithms, we want to present an empirical basis for this claim. Unfortunately, to date little work exists that combines machine learning techniques with the implementation of cognitive biases. What does exist, and what can give us some insight, are comparisons between implemented heuristics and machine learning algorithms. Surprisingly, Makridakis et al. (2020) compared 61 forecasting methods on 100,000 datasets and found that simple models predicted the data better than more complex models. The researchers concluded that combinations of simple statistical and complex machine learning models might be best suited for forecasting tasks, supporting our own claim of combining machine learning with simple heuristics. Artinger et al. (2018) investigated 60 datasets and found that the hiatus heuristic was more accurate than random forests or logistic regressions. Lee et al. (2002) found that a simple take-the-best heuristic exceeded a more complex Bayesian model in a literature search. In 30 classification tasks from medicine, sports, and economics, fast-and-frugal trees performed similarly to more complex algorithms such as logistic regressions (Martignon et al., 2008). A very recent investigation by Katsikopoulos et al. (2022) shows that Google Flu Trends (Ginsberg et al., 2009), which drew on 45 variables from 50 million Google search queries, can be outperformed by the simple recency heuristic. Rafati et al. (2021) showed that artificial neural nets can be outperformed by simpler heuristic methods, in this case for solar power forecasting. Many other studies could be added (Makridakis et al., 2018) in which heuristics outperform complex (machine learning) models. Nevertheless, in this paper we do not want to contrast the two methods or downgrade the successes of machine learning algorithms.
On the contrary, we think it is highly important to develop them further and combine them with successful heuristics inspired by human cognition.
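For concreteness, the recency heuristic mentioned above can be sketched in a few lines: forecast next week's flu incidence as simply the most recent observed value, ignoring search-query data entirely. The weekly numbers below are invented for illustration.

```python
# Recency heuristic: predict the next value of a time series as the most
# recent observation. No model fitting, no covariates.

def recency_forecast(series):
    """Predict the next value as the last observed value."""
    return series[-1]

ili = [12.0, 18.5, 27.0, 37.5, 32.0]  # toy weekly ILI rates
print(recency_forecast(ili))  # 32.0

# One-step-ahead evaluation: each week is predicted from the previous one.
errors = [abs(ili[t] - ili[t - 1]) for t in range(1, len(ili))]
mae = sum(errors) / len(errors)
print(mae)  # 7.75
```

That a zero-parameter rule like this can rival a 45-variable model underscores the point: in unstable, noisy environments, simple heuristics can be remarkably robust.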