Third, the zero percentages in Table 2 could be due to missing data from the Yelp.com reviews and/or from the CDC reports and should therefore be treated with caution. As a result, the reported correlations could also be affected by missing data, in addition to other factors (such as the scheme used in categorizing and grouping foods). Fourth,
the term list used to extract foodborne illness reports is limited to typical symptoms of gastroenteritis and foodborne diseases, and therefore misses some terms and slang words that could be used to describe foodborne illness. In future studies, we will develop a more comprehensive list that includes additional terms to better capture reports of foodborne illness. Fifth, the data are limited to businesses closest to specific colleges, implying that only a sample of foodservices in each state was included in the dataset. This limits the conclusions that can be drawn from the comparison with the FOOD data, which, although limited, aim at statewide coverage of disease outbreaks. Sixth, the number of restaurants serving particular food items could influence the distribution of implicated foods across the food categories. For example, cities in the central part of the U.S. might
be more likely to serve meat–poultry products than aquatic products. Consequently, individuals are more likely to be exposed to foodborne pathogens present in foods that are more regularly served, which could partially explain the implication of these foods in foodborne illness reports. Lastly, the CDC warns that the data in FOOD are incomplete. However, this is the best comparator available for this analysis at a national scale. More detailed state or city-level analyses could further refine the evaluation of this online data source. The lack of near real-time reports of foodborne outbreaks at different geographical resolutions reinforces the need for alternative data sources to supplement traditional approaches to foodborne disease surveillance. In addition, data from Yelp.com can be combined
with data from other review sites, micro-blogs such as Twitter, and crowdsourced websites such as Foodborne Chicago (https://foodborne.smartchicagoapps.org) to improve coverage of foodborne disease reports. Furthermore, although this study is limited to the United States, foodborne diseases are a global issue, with outbreaks sometimes spanning multiple countries. We could therefore use a similar approach to study trends and foods implicated in foodborne disease reports in other countries. Social media and similar data sources provide one approach to improving food safety through surveillance (Newkirk et al., 2012). One major advantage of these nontraditional data sources is timeliness. Detection and release of official reports of foodborne disease outbreaks could be delayed by several months (Bernardo et al.