Overview
Syndromic surveillance has been defined by the U.S. Centers for Disease Control and Prevention (CDC) as “a process that regularly and systematically uses health and health-related data in near ‘real-time’ to make information available on the health of a community.”1 Based on its original definition, the purpose of syndromic surveillance would be to prevent morbidity and mortality by early identification of case clusters in which mitigation would affect the outcome of the disease’s natural course. This original definition was designed for early event detection and became prominent in the public domain after the September 11, 2001 terrorist attacks in the United States and the subsequent anthrax illnesses and deaths. Since 2001, syndromic surveillance systems have been implemented during mass gatherings, such as the 2012 London Olympic and Paralympic Games. Elliot identifies syndromic surveillance as the collection, analysis, interpretation, and dissemination of health-related data, typically on a real-time (or near real-time) basis, to determine the early impact (or absence of impact) of potential human or veterinary public health threats that require effective public health action.2
With a heightened sense of urgency related to the so-called war on terror, many systems were put into place within the United States for the protection of the public health. These included such diverse programs as vaccine initiatives (BioShield), static detectors located throughout large cities to identify specific organisms of interest in the air (BioWatch), and the beginning and sustainment of a national syndromic surveillance system for early detection of outbreaks (BioSense). These three initiatives were designed for the following reasons, respectively: 1) prevention of disease if a terrorist attack occurred; 2) early identification of airborne pathogens during the asymptomatic phase of such disease; and 3) early identification of illness prior to definitive diagnosis that would be confirmed either by culture or laboratory tests. These government initiatives were complemented by independent, non-federal syndromic surveillance systems that were designed primarily for early identification of naturally occurring illnesses but were adaptable for use in bioterrorism surveillance. Other countries, such as the United Kingdom, have also heightened the evaluation of syndromic surveillance systems for response to early detection of bioterrorism events.3 In addition, there are other syndromic surveillance initiatives in Europe and Asia (Taiwan)4–6,7.
These systems are largely based on surveillance of existing data, such as help line calls and emergency department visits. For such population-based reporting, several factors must be defined if the surveillance system is to be useful. The system must provide initial detection, that is, find an event as early as possible. It must quantify the event by defining the number of people who are potentially ill and identifying the location of the source of infection with enough granularity to allow for specific intervention. In addition, it would be useful if the surveillance system incorporated other supportive data, such as provider and laboratory testing, and permitted early, computer-based investigation of possible case clusters by using such items as patient demographics. At least in theory, this should allow for initial outbreak management, such as confirming existing cases and tracking new ones. It would also facilitate timely administration of countermeasures such as individual or community isolation, antimicrobial prophylaxis, and vaccination. In addition, if maximal utilization of data is the goal, then bioterrorism surveillance systems should also identify naturally occurring outbreaks and case clusters, because this will be the most frequent use of the data on an ongoing basis.8–16
For such data usage, there are multiple components of a surveillance system that should be defined prior to its implementation (see Table 13.1). Such questions as “What is the population under surveillance?”, “What is the time period for data collection?”, and “What data will be collected and who provides it?” are essential components that should be decided in the planning phase. For personal data security purposes, the issue of information transfer and storage is absolutely critical. Some assortment of personal identifiers is mandatory even if only regional mail code, age, and sex are used. The issue of data analysis is also critical, particularly: 1) who will analyze the data; 2) what methodology will be used and how often; and finally, 3) how will the reports be disseminated, to whom, and by what method. Although all of these factors may appear to be self-evident, there are no accepted and universally available standards for syndromic surveillance that would make the answer to these questions simple. Added to these complexities would be the need for data validation, which may or may not be possible based on the need for reporting timeliness. The quantification of uncertainty may be required for optimal use of data by decision-makers.
Theme | Examples of Questions to Consider for Surveillance |
---|---|
Who | Who is the population under surveillance? |
Who | Who uses the data and for what purposes? |
What | What data are being collected? |
Where | Where is the geographic location for consideration? |
When | When is the time period of data collection? |
Why | Why is the data collection important? |
Why | Why is personally identifiable data needed? |
How | How is the data being collected? |
How | How is the data being stored? |
How | How will the data be analyzed and distributed? |
How | How will the conclusions be conveyed? |
With an increased emphasis on surveillance for non-intentional events to augment the usefulness of expensive bioterrorism surveillance systems, the term “situational awareness” has moved from the military to the public health community. Thus, for a system to be fully operational, it should go beyond the possibility of early signal event detection and define the location, extent, and progression of multiple disease clusters and outbreaks that can occur at different times. For this mission, it may be important to evaluate a greater variety of data sources beyond traditional symptom syndromes. It may require more well-defined geographic locations for individuals (for example, including all five numbers in a zip code vs. just the first three numbers in the United States, or specific geo-tagging, particularly in rural or low-density population areas). More rapid reporting that would approach real-time may be necessary, meaning instantaneous transmission of any data point as soon as it is available with instantaneous analytic processing and report generation and distribution.
Last, the traditional use of symptoms for syndromic surveillance may not be adequate to accomplish all of these missions, including early event detection and situational awareness. A variety of other data sources have been suggested and investigated. Examples include over-the-counter medication sales, prescription drug purchases, number of phone calls to pediatricians’ offices, absenteeism from schools or work, and ambulance emergency runs. This so-called augmented syndromic surveillance could allow for greater specificity of signals that define true clusters or outbreaks versus statistical anomalies that would otherwise require large increments of time by public health authorities for investigation.17
Current State of the Art
The concept of syndromic surveillance is relatively straightforward, although proof of the concept and of its value has yet to be shown. In its simplest terms, data that can be obtained immediately, prior to definitive diagnostic testing (e.g., microbiology culture or laboratory serology), are transferred to a central repository. Examples of these types of data include healthcare diagnostic or procedural coding, such as International Classification of Diseases (ICD-10) or Current Procedural Terminology (CPT) codes, or emergency department chief complaints. After receipt at the repository, the data are parsed into groupings related to established syndromes, such as respiratory, neurological, or gastrointestinal. The philosophy of syndrome grouping rests on the assumption that, although errors may be made in specific diagnostic or procedural codes or chief complaints, the general group of the codes should be correct and allow early analysis of data. In addition, even if ICD-10 or CPT coding is not possible, natural language interpretation of chief complaints can be used with specific trigger words to allow assignment of a syndromic grouping. As an example, identification of the possible spread of avian influenza is an important public health issue. In support of this effort, syndromic surveillance is considered an important component for early detection of avian flu infecting humans. Patients potentially infected with the virus will be categorized as having an “influenza-like illness” or “respiratory” syndrome using the syndromic surveillance model. It should be noted, however, that with the greater availability of molecular testing and other rapid diagnostic modalities, the use of surrogate markers such as ICD-10 coding and natural language readers may decrease or, for certain diseases, become unnecessary.
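The trigger-word approach to chief complaints described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular surveillance system: the function name `classify_chief_complaint`, the `SYNDROME_TRIGGERS` table, and every trigger pattern in it are assumptions made for the example, and an operational system would rely on a curated and validated vocabulary.

```python
import re

# Hypothetical trigger patterns per syndrome (assumptions for illustration only).
# A leading \b anchors each pattern at the start of a word; stems such as
# "wheez" are left open-ended so that "wheezing" also matches.
SYNDROME_TRIGGERS = {
    "respiratory": [r"\bcough", r"\bshortness of breath\b", r"\bsob\b",
                    r"\bwheez", r"\bpneumonia\b"],
    "gastrointestinal": [r"\bvomit", r"\bdiarrhea\b", r"\bnausea",
                         r"\babdominal pain\b"],
    "neurological": [r"\bheadache", r"\bseizure", r"\bconfusion\b", r"\bdizz"],
}


def classify_chief_complaint(text):
    """Return the sorted list of syndromes whose trigger patterns occur in text."""
    lowered = text.lower()
    matched = set()
    for syndrome, patterns in SYNDROME_TRIGGERS.items():
        # One matching trigger is enough to assign the syndrome group.
        if any(re.search(pattern, lowered) for pattern in patterns):
            matched.add(syndrome)
    return sorted(matched)


if __name__ == "__main__":
    for complaint in ["Cough and fever x3 days",
                      "Headache with vomiting",
                      "Ankle injury"]:
        print(complaint, "->", classify_chief_complaint(complaint))
```

A single complaint can fall into more than one group, which mirrors the point that the grouping only needs to be approximately right to support early analysis.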
Additional data can be added such as blood pressure or temperature to improve the predictive value of any signal derived from statistical analysis of the syndrome groupings. For instance, if evaluating traditional respiratory syndromes, the blood pressure of most patients arriving at a healthcare facility would be reasonably normal, even during cold and flu season. On the other hand, if an airborne anthrax attack were to occur, there may be a significant increase in patients with a respiratory syndrome, high fever, and very low blood pressure. A more robust response by the public health community may be required when confronting such a severe syndrome as defined by augmented syndromic surveillance. However, it remains unclear whether such a robust approach would provide added value since the severity of illness and multiple patients presenting with similar signs and symptoms would likely alert the clinicians and public health authorities that an event had occurred.
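As a rough sketch of this kind of augmented rule, the Python below counts visits that combine a respiratory syndrome with high fever and low blood pressure. The `Visit` record, the field names, and both cut-off values are assumptions chosen for illustration; they are not clinical guidance and are not taken from any deployed system.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    """One visit record; the field names are hypothetical."""
    syndrome: str
    temperature_c: float
    systolic_bp: int


# Illustrative cut-offs only (assumptions), not clinical thresholds.
HIGH_FEVER_C = 38.5
LOW_SYSTOLIC_MMHG = 90


def is_severe_respiratory(visit):
    """True when a respiratory visit also shows high fever and low blood pressure."""
    return (visit.syndrome == "respiratory"
            and visit.temperature_c >= HIGH_FEVER_C
            and visit.systolic_bp < LOW_SYSTOLIC_MMHG)


def count_severe(visits):
    """Count the visits that satisfy the augmented rule."""
    return sum(1 for visit in visits if is_severe_respiratory(visit))


if __name__ == "__main__":
    todays_visits = [
        Visit("respiratory", 39.2, 82),       # fever and hypotension
        Visit("respiratory", 37.8, 118),      # ordinary respiratory visit
        Visit("gastrointestinal", 39.0, 85),  # different syndrome
    ]
    print(count_severe(todays_visits))
```

Tracking this narrower count alongside the overall respiratory count is one way a system could separate routine seasonal illness from a cluster of unusually severe presentations.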
There are multiple syndromic surveillance systems in use around the globe and across the United States. Often more than one is visible to regional public health entities for use as a stand-alone system or in conjunction with other local alerting systems, such as ambulance runs, to determine priorities for public health investigation and intervention. Syndromic surveillance contrasts with the “knowledgeable intermediary,” the single clinician who, recognizing that a patient or group of patients arriving for care displays an unusual set of signs or symptoms, alerts public health authorities. Such knowledgeable intermediaries were evident in the anthrax attack in the United States and the sarin gas attack in Tokyo. Syndromic surveillance is also different from standard notification of reportable diseases to public health. In the latter, such disease reporting is often made after a diagnosis is confirmed, such as with hepatitis B, meningococcal meningitis, or tuberculosis. Although such reporting is important, it generally lacks the timeliness necessary for mitigation in the case of an intentional biological event. Syndromic surveillance, therefore, is a methodology designed to gain the advantage of earlier detection (by days) of a biological attack or other infectious illness. This may facilitate: 1) earlier interventions to stop the spread of disease; 2) rapid initiation of appropriate treatment for affected individuals; and 3) perhaps, by increasing such timeliness, an enhanced probability of apprehending the perpetrators of an intentional biological event.
The current state of the art in syndromic surveillance is a rapidly moving target. There are a multitude of surveillance systems available. One review of the literature identified 36 surveillance systems, and U.S. health departments alone have implemented syndromic surveillance systems at more than 100 sites since 2003.14,18 This, coupled with the shift from early event detection to situational awareness, the development of large city systems individualized for specific geographic areas, and the creation of new technologies, necessitates using exemplary systems of syndromic surveillance in this discussion. In fact, public health entities that are better resourced often use multiple systems simultaneously to differentiate true case clusters or outbreaks from anomalies identified by the syndromic surveillance system. Improving this signal-to-noise ratio allows for optimum use of public health resources for investigation and mitigation as needed. This point receives particular emphasis in the paper by Buehler, where multiple interviews indicate that syndromic surveillance was best utilized as a component of multiple data inputs at the local level.19 The author also notes the trend for syndromic surveillance away from early detection to situational awareness. In contrast, the American Recovery and Reinvestment Act of 2009 provides financial incentives to individual physicians to fully use an electronic medical record.20,21 Participating in activities such as syndromic surveillance reporting to electronic data systems may give the opportunity to enhance the value of syndromic surveillance as a component of an overall early identification or situational awareness plan. Table 13.2 gives a brief listing of several U.S. surveillance systems, past and present, using a variety of methodologies to achieve the previously stated goals.
System | Owners/Stakeholders | Description |
---|---|---|
BioSense | CDC | Used at the local, state, and national level by health officials; cloud computing environment that has analytical capabilities. |
Epi-X | CDC | Used by CDC officials, state and local health departments, poison control centers, and public health to access and share preliminary health information. |
National Electronic Disease Surveillance System (NEDSS) | CDC | Used by hospitals and healthcare systems to submit surveillance data to public health departments, which is then submitted to the CDC. Currently forty-six states, New York City, and Washington, DC, use NEDSS-based systems. |
Electronic Surveillance System for the Early Notification of Community-based Epidemics (ESSENCE) | Department of Defense (DOD) and Johns Hopkins University | A web-based application to monitor and provide alerts on rapid and unusual increases in the occurrence of infectious diseases and biological outbreaks. DOD outpatient data is monitored worldwide. |
Real Time Outbreak and Disease Surveillance (RODS) | University of Pittsburgh | Real-time public health surveillance system that is used by multiple cities, states, and countries. The National Retail Data Monitor (NRDM) monitors 30,000 retail stores for over-the-counter medication sales throughout the United States. |
The European Surveillance System (TESSy) | European CDC | Used by all Member States of the European Union and European Economic Area countries for the reporting of communicable diseases. |
Global Outbreak and Alert Response Network (GOARN) | WHO | Global network for technical support for outbreak surveillance. IHR 2005 expanded the role of GOARN for increased surveillance and response. |
Surveillance Systems – A World View
Perhaps the most significant change in surveillance systems of all types, including syndromic surveillance, is the appreciation that a broader perspective on biosurveillance is needed. Of particular importance is the World Health Organization (WHO) issuance of the revised International Health Regulations in 2005 (IHR 2005). This was to take effect in 2007 with compliance by member countries by 2012. It calls for international cooperation for surveillance, detection, reporting, and response to biological threats. While this may seem self-evident to those in public health and epidemiology, it may not have been so clear to others with a less defined mission. With the rise in international travel, long-distance transportation of goods and services, and particularly the movement of food commodities, the need for internationalization of biosurveillance must certainly be highlighted.
The United States provides an excellent example of a distributed system for syndromic surveillance. Chen provides comments on thirteen different systems that use a variety of inputs for biosurveillance purposes.22 For the most part, these do not interconnect, but instead utilize varying input models and a multitude of statistical methodologies to provide outcome data. Not surprisingly, the usefulness of the data for decision-makers will be limited by the narrow scope of the data provided.
Regarding the systems deployed in the United States, two are perhaps the most well-known, the BioSense system from the CDC and the Electronic Surveillance System for the Early Notification of Community-Based Epidemics (ESSENCE) from the U.S. Department of Defense (DOD). For BioSense, the primary initial daily inputs for the system originally came from the electronic medical records of the DOD and the U.S. Department of Veterans Affairs (VA). This has been widely expanded to other facilities in the private sector as well as laboratory results. While BioSense has gone through several years of testing and updates, BioSense 2.0 is the current iteration for general use. It is available to public health authorities and has the advantages of incorporating a large data set, containing analytic tools, and possessing the imprimatur of the CDC. The system in general does not require data input by practitioners and relies heavily on the electronic medical record. For the current iteration, local and state health departments are able to store, analyze, and control data in a government-certified cloud environment, which provides analytical tools to the health departments.23
The second system, ESSENCE, was developed to monitor the health status of military healthcare beneficiaries around the world. It relies heavily on diagnostic billing codes (ICD-10) and has since been expanded to draw on a variety of additional data sources, with statistical algorithms used to define outliers. It is relatively user-friendly, but is generally not a system for country-wide use and does not support national decision making.
Other systems found in the United States, such as Real-Time Outbreak and Disease Surveillance (RODS), the Early Aberration Reporting System (EARS), and others, provide information with varying emphasis on such topics as spatial and temporal data monitoring systems, visualization methodologies, and data modeling. Any of the systems may be extremely useful for health authorities, but they will likely be used as adjuncts to locally adapted approaches and in combination with other procedures used to collect information. This will be particularly important since valid signals from each of the models may be difficult to separate from the inevitable “noise” commonly found in such widely disparate data sets.
Emphasizing the need for a more global approach and for greater information organization from varying sources, the ProMed, HealthMap, and Argus projects monitor open source information using a variety of data monitoring infrastructures. Argus uses primarily web-based technology to capture information for biologic events and outbreak severity. Using this methodology, worldwide information can be accrued and made available across the globe. HealthMap also uses disparate data sources with a variety of inputs, including those with varying reliability such as official alerts from WHO, accounts from ProMed mail alerts (also containing animal health data), and open news sources. While validity of these systems may vary, they do provide a more global perspective that can support not only immediate interventions, but also the consideration of future needs as diseases travel from continent to continent.23–25
As the United States has progressed with syndromic surveillance, so have Europe and other parts of the world. Over the last several years, Europe and member states of the European Union have examined many syndromic surveillance system methodologies across the continent, and perhaps the need for a new definition of syndromic surveillance.4 The goal appears to be defining a more formal network of syndromic surveillance activities. One of the European Syndromic Surveillance Systems (Triple-S) project deliverables was an inventory of the existing syndromic surveillance systems. This was followed by questionnaires, country visits, and framework discussions. Some of the same issues seen in the United States are also of concern in Europe, such as communication, minimum data sets that are available at multiple sites, evaluation criteria, and methods for data collection and analysis. In addition, veterinary syndromic surveillance systems were also reviewed. While as many as forty-five systems may be in use, very few are fully operational; many are still in the developmental stage. Animal-human biosurveillance synergies are also being considered.4 While many individual European states and governmental entities have their own syndromic surveillance systems, some are not convinced about the value of syndromic surveillance as a strategy. Overall, the issue of interoperability remains critical. Open source software provides infrastructure that may be applicable to the needs of areas with limited resources. These assets help bridge part of the gap between the developed and developing worlds.26,27 Such software can provide aggregation systems, statistical analysis components, and mapping capabilities. In addition, it can provide some of the interoperability that is necessary when dealing with multiple surveillance systems. This is critical since diseases do not respect international boundaries and move freely among countries around the globe.
In comparison to the developed, city-based environment, surveillance in the developing world, particularly in rural areas, presents different challenges. Of particular note, these regions lack capability for rapid, specific diagnostic testing for infectious diseases, either traditional or emerging. Compounding this problem is a high prevalence of infectious diseases, case clusters, and outbreaks that occur in these settings due to a variety of socioeconomic and geopolitical issues. In the past this might have generated a localized focus of cases. However, with massive amounts of travel, movement of goods, and perhaps even changes in weather patterns, localized case clusters can easily expand to involve larger cities, different countries, and span continents, as occurred with severe acute respiratory syndrome (SARS). Currently, China is testing a web-based integrated surveillance system for early detection of infectious diseases. This will include standard syndromic surveillance coupled with at least school absenteeism and over-the-counter drug sales, and will be compared with the traditional case reporting systems. Multiple algorithms and statistical methodologies will be tested. While results are not yet available, this sort of innovation is critical for control of the onset and movement of infectious diseases.28
Data Integration
The addition of nonhuman data might also add further value to any syndromic surveillance system. For instance, data on national or international water systems, unusual illnesses or deaths in animals, or unusual occurrences affecting plants or food crops might also increase the positive predictive value of any clusters of human cases found by syndromic surveillance.29
However, the complexity of integrating systems that are dramatically diverse, such as those involving plants, animals, BioWatch sensors, and human data, is daunting. In addition to the different types of data elements, there is also the question of differences in information technology architecture among the variety of datasets or between countries. Although there is variable architecture in human datasets, this situation is accentuated when automated data systems are implemented for plants, animals, or other more technical datasets such as BioWatch sensors.30–32 The information technology architecture of the originating datasets will be important if such large amounts of diverse data are to be electronically delivered, sorted, initially analyzed, and outliers defined. Developing a specific platform architecture that has been clearly defined for all of these diverse datasets remains a challenge.
In August 2007, the U.S. government established the National Biosurveillance Integration Center (NBIC). Subsequently, this center developed and provides oversight for the National Biosurveillance Integration System (NBIS).33 This system is designed to track and integrate data received electronically from individual U.S. Government agency liaisons. Such agencies include the CDC, the U.S. Environmental Protection Agency, the Department of Agriculture, and many other national and international sources. It will use these data to generate reports on the level of risk to the public health and nurture further collaboration among the government entities. In addition to the difficult task of electronic data interpretation, NBIS will also require human analysts to integrate the algorithmic quantifiable data and the more opaque threat data obtained from a variety of intelligence systems. Such sources include U.S. Embassies and other electronic surveillance traffic. Although this is groundbreaking technology, it will take significant time to fully implement and an even longer period to determine whether it is effective.
Data Analysis
There are a variety of mathematical data analysis formulae in place in the extant syndromic surveillance systems.34–48 These methods include everything from cumulative sum scores, smart scores, and anomaly detection algorithms to trends, proportions, expected to observed frequency, standard deviations, and other more descriptive statistics. At this time, system designers have not demonstrated that any of the statistical methodologies are definitively and clearly superior to any of the others, or that they define specifically which outliers are critical for investigation.
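To make one of these methods concrete, the Python below is a minimal sketch of a one-sided cumulative sum (CUSUM) detector applied to daily syndrome counts. It is a generic textbook form, not the algorithm of any system named in this chapter. The function name `cusum_alerts`, the seven-day baseline, the allowance `k`, the decision threshold `h`, and the reset after each alert are all assumptions made for illustration.

```python
from statistics import mean, stdev


def cusum_alerts(counts, baseline_days=7, k=0.5, h=4.0):
    """Return the indices of days on which a one-sided CUSUM exceeds h.

    counts        -- daily counts for one syndrome group
    baseline_days -- leading days used to estimate the expected level (assumption)
    k             -- allowance, in standard deviations, absorbed each day
    h             -- decision threshold, in standard deviations
    """
    baseline = counts[:baseline_days]
    mu = mean(baseline)
    # Fall back to 1.0 if the baseline shows no variation at all.
    sigma = stdev(baseline) or 1.0

    cumulative = 0.0
    alerts = []
    for day in range(baseline_days, len(counts)):
        z = (counts[day] - mu) / sigma
        # Accumulate only upward deviations beyond the allowance k.
        cumulative = max(0.0, cumulative + z - k)
        if cumulative > h:
            alerts.append(day)
            cumulative = 0.0  # restart accumulation after signalling
    return alerts


if __name__ == "__main__":
    daily_respiratory_visits = [10, 12, 11, 9, 10, 11, 10, 10, 11, 18, 20, 22]
    print(cusum_alerts(daily_respiratory_visits))
```

Lowering `h` or `k` makes the detector signal sooner and more often, while raising them suppresses alerts. That is the tuning choice on which no single method has been shown to be clearly superior.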
With limited resources provided to the public health community, the critical element in a syndromic surveillance system is the ability to define which outliers are sufficiently important to generate an on-the-ground investigation and determine whether a biological event has occurred. This so-called signal-to-noise ratio is one manner of defining the validity of the syndromic surveillance system. If alerts are generated very frequently with little or no outcome from laborious investigation, the system will go unused and thus have little value. If, on the other hand, the surveillance system is correct every time it defines an abnormal signal, then it is likely that such extreme specificity will not allow for sufficient sensitivity. It may allow other critical signals to go undetected at some unknown rate. Although mathematicians, statisticians, and modelers are critical to the process of syndromic surveillance analysis, the most essential element will be in the hands of the public health and epidemiologic community where the signal-to-noise ratio will truly be defined.
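One way to put numbers on this trade-off is to compare alerting thresholds by sensitivity (the share of real events that produced an alert) and positive predictive value (the share of alerts that turned out to be real events). The Python below is a small sketch under that framing. The function name `alert_performance` and all of the counts in the example are invented for illustration, and positive predictive value is used here only as one plausible stand-in for the signal-to-noise ratio discussed above.

```python
def _ratio(numerator, denominator):
    """Divide, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def alert_performance(true_alerts, false_alerts, missed_events):
    """Return (sensitivity, positive predictive value) for one threshold.

    true_alerts   -- alerts that investigation confirmed as real events
    false_alerts  -- alerts that investigation found to be noise
    missed_events -- real events that never generated an alert
    """
    sensitivity = _ratio(true_alerts, true_alerts + missed_events)
    ppv = _ratio(true_alerts, true_alerts + false_alerts)
    return sensitivity, ppv


if __name__ == "__main__":
    # Hypothetical year with 10 real events under two alerting thresholds.
    settings = {
        "low threshold": dict(true_alerts=9, false_alerts=91, missed_events=1),
        "high threshold": dict(true_alerts=4, false_alerts=0, missed_events=6),
    }
    for name, outcome in settings.items():
        sensitivity, ppv = alert_performance(**outcome)
        print(f"{name}: sensitivity={sensitivity:.2f}, PPV={ppv:.2f}")
```

In this invented example the low threshold catches nine of ten events but asks investigators to chase one hundred alerts, while the high threshold is never wrong yet misses six events. Where to settle between the two is the judgment that falls to the public health and epidemiologic community.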