Categories: Computer Science

Reference #: 2018-032

OTC Contact: Zeinab Abouissa, 202-687-2702


A recent report by the University of Minnesota has shown a mounting cybersecurity risk facing the food industry. “The potential consequences of an attack on the industrial control systems used in the food industry include contaminated food that threatens public health, physical harm to workers, destroyed equipment, environmental damage, and massive financial losses for companies.” [1] Now more than ever, it is of the utmost importance that the food industry adopt strong communication and information technology to ensure a stable and safe global food supply. However, the volume of data publicly available on global networks is increasing exponentially and cannot be reviewed manually, even by a vast network of humans, quickly enough to identify all data relevant to a potential food threat.

At present, food threat surveillance focuses on “horizon scanning”, which often consists of human-based schemes for monitoring both proprietary and open source data streams thought to be relevant to known or unknown threats. While these conventional methods are the norm, they are often inefficient and cannot identify surprises, the latest developments, or novel plots, because the searches rely on a human-conceived, predefined set of interests or knowledge that a computer-aided search treats as a priori knowledge. This pre-set boundary limits the capability of a search to detect and identify unexpected events. There are no true “Big Data” approaches to food threat surveillance that are capable of avoiding surprise or detecting previously unknown threats, because:

  • They run the risk of not identifying surprises since, by definition, surprises do not occur frequently and are therefore unlikely to be considered as an interpretation of observed data
  • Keyword searches look for something specific
  • Machine classifiers are trained on the familiar
  • Logistic regression looks for risk factors of predefined, desired outcomes

Accordingly, there is a need for an improved system that identifies relevant hypotheses in data, including surprising hypotheses, recognizes known and emergent event signatures, and enables human and/or machine recognition of food safety and related events. Researchers in the Georgetown University Medical Center Division of Integrated Biodefense, along with the Department of Computer Science, have developed a novel turnkey, automated system to recognize threats to the food supply earlier than the current state of the art, thus enabling real time or near real time surveillance of massive amounts of data. The method lets the data itself define a space of possible hypotheses, optionally merges and groups similar hypotheses, and then weighs and selects a subset of relevant hypotheses for further consideration. The system thus has the ability to:

  • Monitor the internet and other data continuously
  • Discover hypotheses potentially explaining data
  • Detect leading indicators of food threats and use these as signatures of nascent events
  • Notify users of identified and potential threats
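
The hypothesis-generation step described above (let the data define the candidate hypotheses, then weigh and select a subset) can be illustrated with a toy sketch. This is not the Georgetown system: the listing does not disclose its algorithm, so everything here is an assumption made for illustration. A "hypothesis" is taken to be a pair of terms that co-occur in a document, the weight is a raw co-occurrence count, and the function names `candidate_hypotheses` and `rank_hypotheses` and the stopword list are hypothetical. The optional merging of similar hypotheses is omitted.

```python
from collections import Counter
from itertools import combinations

# Assumed, deliberately tiny stopword list for the illustration.
STOPWORDS = {"the", "a", "of", "in", "and", "to", "after", "with", "is", "by"}


def candidate_hypotheses(documents):
    """Let the data propose candidates: every pair of terms sharing a document."""
    counts = Counter()
    for text in documents:
        terms = {word.strip(".,").lower() for word in text.split()}
        terms = sorted(terms - STOPWORDS)
        # Each unordered term pair is one candidate; count documents containing it.
        counts.update(combinations(terms, 2))
    return counts


def rank_hypotheses(counts, top_k=3):
    """Weigh candidates by how often they recur (an assumed weighting) and keep top_k."""
    return counts.most_common(top_k)


headlines = [
    "Melamine found in milk powder",
    "Milk powder recalled after melamine test",
    "Olive oil diluted with seed oil",
]
for pair, weight in rank_hypotheses(candidate_hypotheses(headlines)):
    print(pair, weight)
```

No keyword list is supplied in advance; the candidate pairs come entirely from the input text. A frequency weight favors recurring associations, so a system aimed at surprising hypotheses would presumably need a different weighting than the one assumed here.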

The system enables human and/or machine event recognition by analyzing data to construct one or more qualitative metrics, establishing a baseline for the qualitative metric(s), identifying additional data over time, identifying an updated baseline, and outputting the adjusted baseline for display to the user. The system thus identifies known signatures of food threats in massive data sets and identifies hypotheses that can explain observed data, allowing it to identify unknown food threats. Rather than bringing an a priori conceived hypothesis, it lets the data itself define a ranked set of possible hypotheses. This approach to hypothesis generation in food security is:

  • Applicable to food threats broadly defined (e.g., food fraud/adulteration, chemical and biological poisoning, and supply chain vulnerability)
  • Applicable to textual (and potentially other) data
  • Executable in real time or near-real time and scalable to large applications
  • Built on a sound theoretical basis that users can understand and trust
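
The baseline procedure described above (construct a metric, establish a baseline, fold in additional data over time, and output the adjusted baseline) can be sketched as follows. The listing does not say how the baseline is computed, so this is only a minimal illustration under stated assumptions: the metric is a single number per time step (for example, a daily count of relevant mentions), the baseline is an exponentially weighted moving average, and a value is flagged when it exceeds a fixed multiple of the baseline. The class name `Baseline` and the parameters `alpha` and `threshold` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Baseline:
    mean: float              # current baseline for the metric
    alpha: float = 0.2       # smoothing factor (assumed value)
    threshold: float = 2.0   # flag values above threshold * baseline (assumed rule)

    def update(self, value: float) -> bool:
        """Compare a new observation with the baseline, then fold it in.

        Returns True when the observation departs from the current baseline.
        """
        is_anomalous = value > self.threshold * self.mean
        # Exponentially weighted moving average: recent data shifts the baseline.
        self.mean = (1 - self.alpha) * self.mean + self.alpha * value
        return is_anomalous


baseline = Baseline(mean=10.0)
flags = [baseline.update(v) for v in [9, 11, 10, 35]]
print(flags)          # [False, False, False, True]
print(baseline.mean)  # adjusted baseline after the four observations
```

The comparison is made before the update so that a sudden spike is judged against the baseline that preceded it rather than one it has already inflated.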

[1]

Patent Status

PCT Patent Application No. PCT/US2019/034824


Ophir Frieder, Ph.D.

David Hartley, Ph.D.

Nili S Yossinger