TABLE 31.2 Examples of Laboratory Tests and the Decisions Based on their Results
QUALITY ASSURANCE IN POINT-OF-CARE TESTING
Because many ICU decisions are based upon the results of laboratory analyses (Table 31.1), the intensivist must understand the strengths and the limitations of laboratory testing, whether performed in a central laboratory, in a satellite laboratory, or at the POC (1).
An overview of the QA concepts that underpin quality laboratory results follows. The Clinical and Laboratory Standards Institute (CLSI, previously named the National Committee for Clinical Laboratory Standards) defined QA as “the practice which encompasses all endeavors, procedures, formats and activities directed towards ensuring that a specified quality or product is achieved and maintained.” QA programs encompass assessments of analytical quality control; monitoring of TATs, regulatory compliance, and success of proficiency testing; and supervision of personnel training and competency.
To provide quality results:
- Standard operating procedures (SOPs) must be developed and followed.
- Systems must be in place to recognize and solve random and systematic problems.
- Result reliability must be defined in terms of suitable precision and accuracy.
A QA program assesses all aspects of testing: preanalytical, analytical, and postanalytical events. Preanalytical issues concern proper patient identification and tube labeling, proper sample acquisition, appropriate transport to the central laboratory or to the POCT device (e.g., cooling of ABG samples), and timing of the test (e.g., proper timing for therapeutic drug monitoring). Analytical matters concern the instrument performance, and postanalytical issues concern proper result reporting (e.g., the correct result is reported on the correct patient).
Theoretically, the goal of laboratory testing is to produce timely and reliable (e.g., quality) measurements of analytes that assist in the diagnosis, management, and prevention of human diseases. “Analyte” is a generic term for any substance that is measured in any fluid; POC testing is most commonly carried out on blood or urine samples, and the source of blood can be arterial, capillary, or venous.
In the ICU setting, the sample of choice is usually whole blood drawn from an artery when measuring blood gases, or arterial or venous whole blood when, for example, measuring sodium, potassium, glucose, lactate, or ionized calcium. If patients are not in shock and display normal peripheral perfusion, a warmed finger or toe can be lanced to obtain a capillary whole-blood sample for glucose measurement. In the ICU setting, besides hematocrit and glucose measurements, there are no other common reasons to obtain capillary blood.
The Value of Laboratory Test Results
The value of a test result can be conceptualized as the quality of the result divided by the TAT (Equation 1). This assumes that the laboratory data can be acted upon as soon as the data become available. Certainly, the physiologic parameters that change most rapidly, such as pH, PaCO2, PaO2, glucose, and potassium, attract and demand our attention.
Quality results are accurate, and repeated measurements of the same sample demonstrate reproducibility (e.g., high precision; Equation 2). The central laboratory’s twin goals of quality results and a short TAT are often at odds with one another; more accurate and precise complex assays are usually more time consuming, and such tests may not be available on POCT devices. Figure 31.1 depicts a theoretical curve for the relationship of result quality (y axis) and TAT (x axis). If assay time is reduced below a certain limit, the quality of the assay will be reduced. On the other hand, significant delays in making critical clinical decisions can adversely affect patient outcome. We must also acknowledge that POC tests rarely, if ever, will be as accurate or precise as tests accomplished in the central laboratory.
Equation 1:
Value of a test ≈ quality of the result ÷ turnaround time
Note: Higher-quality results and lower turnaround times can provide higher-value tests.
Equation 2:
Quality of the result ≈ 1 ÷ (bias × coefficient of variation)
Note: Reduced bias (e.g., higher accuracy) and reduced coefficients of variation (e.g., higher precision) improve the quality of the test result.
Desirable intrinsic characteristics of an assay used for the diagnosis, management, or prediction of disease are high sensitivity and high specificity. In epidemiologic terms, sensitivity is the number of true-positive results divided by the number of observations in a diseased population.
Sensitivity = true positives ÷ (true positives + false negatives)
Specificity is the number of true negative results divided by the number of observations undertaken in a nondiseased population.
Specificity = true negatives ÷ (true negatives + false positives)
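A minimal sketch, in Python and purely for illustration (the counts are hypothetical, not drawn from this chapter), of how these two proportions are computed:

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of diseased patients correctly identified as positive."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of nondiseased patients correctly identified as negative."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical example: 90 of 100 diseased patients test positive,
# and 95 of 100 nondiseased patients test negative.
print(sensitivity(90, 10))   # 0.90
print(specificity(95, 5))    # 0.95
```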
In its broadest sense, TAT is the “vein to brain” time: the time from sample acquisition (e.g., venipuncture: the vein time) to result recognition by the treating physician (i.e., the brain time). Usually TAT is defined as the duration of time between sample acquisition and result reporting. Unfortunately, the laboratory often has little control over factors that determine when a sample is delivered to the central laboratory after acquisition. Similarly, preanalytical problems frequently develop because the sample is not properly drawn, labeled, or preserved prior to delivery to the laboratory. To be of value, the correct sample must be drawn from the correct patient at the correct time in the correct volume and placed in the correct tube.
If the analysis produces the most accurate result possible, but the TAT is unacceptably long, the value of the result in patient management is significantly degraded. TAT is most important in the ICU setting when the test results are used to immediately alter the patient’s care. Examples of such tests include ABG analysis for ventilated patients and glucose measurements in glycemic control protocols. There are many instances, however, where a TAT of several hours or more may be appropriate when the test is not used for immediate patient management (e.g., a karyotype result in a patient with suspected Down syndrome). Thus, the required TAT for any test result is relative. On the other hand, an instantaneous result that is not sufficiently accurate will not help—and may even hurt—the patient. It is wise to remember that bad data are worse than no data at all; physicians using bad data are misled.
Assay Performance: Precision
Precision is synonymous with reproducibility; that is, if aliquots of the original sample are retested, will the same result as the original be observed (5)? Precision can be defined in terms of the assay’s standard deviation (SD) and coefficient of variation (CV). When aliquots of a single sample are measured repeatedly, the distribution of results, plotted as a histogram, will describe a bell-shaped curve. Other descriptions for such a distribution include a Gaussian distribution or parametric distribution.
The SD for an assay is the square root of the variance. The variance is calculated as follows: the difference between each individual value and the mean is squared, these squared differences are summed, and the sum of the squares is then divided by the number of repeats minus one. Approximately 68% of the repeats will fall within ±1 SD of the mean, approximately 95% will fall within ±2 SD of the mean, and approximately 99% will fall within ±3 SD of the mean. This concept will be used in developing rules that help determine when an analysis and analyzer are or are not working properly.
The CV is expressed as a percentage: the SD divided by the mean, multiplied by 100. While SDs are expressed in the units of the analyte (mg/dL for glucose or mmHg for pO2) and are difficult to remember, CVs are unitless and allow easy comparisons among various analyses without needing to recall the specific SD or units. For example, electrolyte measurements using ion-selective electrodes usually display CVs of 1% to 2%. By way of comparison, analyses that use chemical reactions with spectrophotometric or electrical detection typically have CVs of 4% to 5%. As a consequence of their complex nature involving antigen–antibody interactions, immunoassays can show even greater variability, with CVs of 5% to 10%.
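As an illustration of these calculations, the following sketch (Python, with made-up glucose replicates; not part of the original chapter) computes the mean, the sample SD, and the CV for a set of repeated measurements:

```python
import statistics

# Hypothetical repeated measurements of one glucose sample (mg/dL)
repeats = [98, 101, 99, 102, 100, 97, 103, 100, 99, 101]

mean = statistics.mean(repeats)
sd = statistics.stdev(repeats)   # sample SD: sum of squared deviations / (n - 1), then square root
cv_percent = sd / mean * 100     # CV = (SD / mean) x 100

print(f"mean = {mean:.1f} mg/dL, SD = {sd:.2f} mg/dL, CV = {cv_percent:.1f}%")

# For a Gaussian distribution, ~68% of repeats fall within ±1 SD of the mean,
# ~95% within ±2 SD, and ~99% within ±3 SD.
within_2sd = sum(abs(x - mean) <= 2 * sd for x in repeats) / len(repeats)
print(f"fraction of repeats within ±2 SD: {within_2sd:.2f}")
```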
Precision can be further described as intra-assay or interassay reproducibility. Intra-assay precision is assessed when the same sample is run 10, 20, or more times in a single run. A “run” is a series of identical analyses accomplished in a single day, shift, or other period of time during which the analyzer is believed to be analytically stable (e.g., does not require recalibration; many modern analyses are so stable that calibration may not be required for many days or longer). Intra-assay comparisons would not exceed 1 day.
Intra-assay precision is almost always superior to interassay precision; interassay precision is determined by measuring the same sample serially on different days (e.g., measuring the same sample once per day for 20 or more workdays in a row). For a typical chemical analysis, the intra-assay CV might be 5% and the interassay CV might be 7%. Clinicians do need to know the total imprecision—the combined intra-assay (e.g., same-day or same-shift reproducibility) and interassay (e.g., reproducibility over several days) imprecision—because some patients may be, for example, on ventilatory support for days or weeks with various degrees of pulmonary failure. While CVs (or SDs) cannot be added together to determine total imprecision, the intra-assay and interassay variances can be added together. The square root of the total variance then provides the SD, and the SD divided by the sample mean (multiplied by 100) provides the percentage CV.
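A brief sketch of the combination rule described above, using hypothetical intra-assay and interassay CVs of 5% and 7% around a mean of 100 mg/dL (the numbers are invented for illustration):

```python
import math

# Hypothetical values: same sample, mean = 100 mg/dL
mean = 100.0
cv_intra = 5.0   # intra-assay CV, %
cv_inter = 7.0   # interassay CV, %

# Convert each CV back to an SD, add the variances, then recompute the total CV.
sd_intra = cv_intra / 100 * mean
sd_inter = cv_inter / 100 * mean
total_variance = sd_intra ** 2 + sd_inter ** 2
total_sd = math.sqrt(total_variance)
total_cv = total_sd / mean * 100

print(f"total SD = {total_sd:.1f} mg/dL, total CV = {total_cv:.1f}%")  # ~8.6%
```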
Assay Performance: Accuracy
Accuracy is a measure of bias. Bias is the difference between the “real” (or “true”) result and the measured result; bias can be positive or negative. A positive bias is present when the measured result exceeds the true result. A negative bias is found when the measured result is less than the true result. Bias must not be excessive; the bias that does exist must not lead to incorrect diagnosis, management, or disease prediction.
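A one-line worked example of this arithmetic, with hypothetical potassium values:

```python
true_value = 5.0       # "true" potassium, mmol/L (hypothetical)
measured_value = 5.3   # measured result, mmol/L (hypothetical)

bias = measured_value - true_value
print(f"bias = {bias:+.1f} mmol/L")   # +0.3 mmol/L: a positive bias
```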
The true result of an assay may be difficult to define or determine. This is especially true when there is only one basic method available for the measurement of an analyte. For many measurements, the only method of analysis is the field method (i.e., the analytical procedure that is used in the central laboratory or at the POC). For example, pO2 can only be measured using an oxygen-sensitive electrode. Reference methods, by definition, are more specific for the measurement of the analyte in question than the field method. Definitive methods are the best available methods of measurement, with the highest specificity. Ideally, reference and definitive methods also have better precision than field methods.
Because reference intervals (i.e., the “normal” ranges) depend on proper calibration, a significant calibration bias between the method used to establish the reference interval and the method in real-time use in the care of the patient can lead to errors in interpreting whether a result falls within the reference interval, and to what degree it exceeds or falls below that interval. On the other hand, relative change (i.e., the present result compared to a previous result) will not be affected by bias if instrument calibration is stable and the assay is precise. However, a lack of precision can have a major misleading effect on the interpretation of serial results. Imprecision implies that larger absolute differences occur between serial measurements. With a highly precise assay, small serial differences are more likely to represent a true difference in the patient’s condition; with a highly imprecise assay, larger serial differences are required to indicate a true difference in the patient’s condition.
To further complicate the distinction between a normal and an abnormal result, we must consider biologic variation: the normal variation in a biologic measurement. This variation can occur minute to minute or hour to hour (ultradian rhythms; e.g., luteinizing hormone [LH] or follicle-stimulating hormone [FSH] secretion), daily (circadian rhythms; e.g., am vs. pm levels of cortisol), or over periods greater than a day (infradian rhythms; e.g., the menstrual cycle).
Analytical Sensitivity and Specificity
In analytical terms, sensitivity is the lowest concentration of an analyte that can reliably be measured. As measurements approach zero concentration of the analyte, the uncertainty of the measurement increases. At a certain point, with a progressive decline in analyte concentration, the uncertainty of the measurement becomes so great that reporting a lower number is meaningless. Analyzer manufacturers should define their lower limit of detection (LLD) to inform the user of the analyzer’s expected analytical sensitivity. In addition, it is routine policy for laboratories to define their own LLD or, at a minimum, to confirm the manufacturer’s stated LLD. In the ICU setting, the LLD is probably most important in the measurement of glucose: “How low a glucose concentration can our POCT analyzer reliably report?” There are two forms of LLD, and it is important to define which one the laboratory is using: the limit of detection (LOD) and the limit of quantitation (LOQ). The LOD is the lowest concentration of an analyte that can reliably be distinguished from zero; the LOQ (also called the functional sensitivity) is the lowest concentration of an analyte that gives reasonable precision, usually a CV of not more than 20%.
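As an illustration, the sketch below (Python; the precision profile is invented for this example) estimates the LOQ as the lowest tested concentration whose CV does not exceed 20%:

```python
# Hypothetical precision profile: (glucose concentration in mg/dL, observed CV in %)
precision_profile = [(10, 35.0), (20, 22.0), (30, 18.0), (40, 12.0), (60, 8.0)]

# LOQ (functional sensitivity): lowest tested concentration whose CV is <= 20%.
loq_candidates = [conc for conc, cv in precision_profile if cv <= 20.0]
loq = min(loq_candidates) if loq_candidates else None
print(f"estimated LOQ: {loq} mg/dL")   # 30 mg/dL in this made-up profile
```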
Analytical specificity is the certainty that the assay only measures the analyte of interest and does not measure other unintended substances in solution (e.g., “What is the assay’s cross-reactivity to other analytes?”). Cross-reactivity is not usually an issue for POCT in ICUs based on the types of assays run in such situations. However, in the central laboratory, cross-reactivity can be a significant issue. For example, cardiac troponin-T or troponin-I measurements should not cross-react with skeletal muscle troponin-T or troponin-I. On the other hand, assay cross-reactivity is desirable if one wishes to test for a class of drugs (e.g., drug abuse testing for benzodiazepines, opiates, sympathomimetics, or barbiturates).
Quality Control Testing
For all inpatient testing, whether waived, moderate-complexity, or high-complexity testing (see below), quality control must be assessed at least daily for all analytes measured on the device. For certain types of testing, such as radioimmunoassays or enzyme-linked immunosorbent assays (ELISAs), control testing may need to be performed with each run of patient samples.
To perform quality control testing, a sample of known concentration is measured with the device in question (5). This is the “control material” or, simply, the “control.” The control material is usually available in a large volume and is prepared in many aliquots (e.g., >100) in a stable (e.g., frozen) form, so that the control material can be used over the course of many months, or even longer than 1 year. If the control result for a run of samples falls within previously defined limits, the device and run are said to be “in control,” and patient results can be reported. If the control result is outside defined limits, the device and run are said to be “out of control,” meaning an analytical error has occurred and patient results cannot be reported. Another way to express an out-of-control run is to state that the run was “rejected” or “failed.” Thus, before any patient results can be reported, the operator must ensure that the analyzer is functioning correctly. Clearly, the control material must be measured prior to the release of any patient results. For moderate- and high-complexity testing, at least two levels of control are usually assessed. For example, the mean value of one control can be near a clinical decision point, while the mean value of the other control can be considerably above the clinical decision point.
If the assay is out of control, the operator must troubleshoot the problem. Possible causes of out-of-control runs include:
- Machine mechanical errors (e.g., pipetting too little or too much liquid)
- Outdated reagents
- Reagents that have lost potency due to heating or lack of refrigeration
- Degraded control materials
- Operator error (e.g., mislabeled or switched controls, as in reversing the low-level and high-level controls)
- Spectrophotometric error (e.g., bulb loss or degraded function)
- Detector error
Fortunately, most POCT devices, even if moderately complex, are self-contained, are fairly robust, and can be simply “fixed” by replacing the reagent cartridge. If nothing else, another POCT analyzer can be used.
James Westgard created a series of rules that can be used to determine if a run or device is in control or is out of control (5). These “Westgard rules,” or their variations, are used essentially universally throughout the laboratory community. For each control material, the performance of the material is initially established by running this sample daily over the course of 20 to 30 days when the assay is otherwise known to be in control by using previously characterized control materials. From these data, the mean and the SD for the sample’s measurement (e.g., the control material’s “performance”) on the device in question can be calculated.
Once the performance of the control material is known (i.e., its mean value and SD are established), this material can then be used to determine if subsequent runs are in control. If a single control value is 3 or fewer SDs away from the mean, the assay is in control and the results can be released. Strictly speaking, a control result that is 2 to 3 SDs above or below the mean is still in control, but it constitutes a “warning”: the operator should review previous control data and confirm that other instrument parameters are functioning normally.
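A minimal sketch (Python; the control mean, SD, and results are hypothetical) of this single-control decision logic, corresponding to the 1-2s warning and 1-3s rejection rules:

```python
def check_control(result: float, mean: float, sd: float) -> str:
    """Classify a single control result against its established mean and SD.

    A deviation beyond 3 SD rejects the run (1-3s rule); a deviation between
    2 and 3 SD is still 'in control' but flags a warning (1-2s rule).
    """
    deviation = abs(result - mean) / sd
    if deviation > 3:
        return "out of control: reject the run, do not report patient results"
    if deviation > 2:
        return "in control, but a warning: review previous control data"
    return "in control: patient results can be reported"


# Hypothetical control material established at mean 100 mg/dL, SD 4 mg/dL
print(check_control(103.0, 100.0, 4.0))   # in control
print(check_control(109.5, 100.0, 4.0))   # warning (2-3 SD from the mean)
print(check_control(114.0, 100.0, 4.0))   # out of control (>3 SD from the mean)
```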
TABLE 31.3 Westgard Quality Control Rules for a Single Level of Control