A systematic review on quality indicators for tight glycaemic control in critically ill patients: need for an unambiguous indicator reference subset

Introduction The objectives of this study were to systematically identify and summarize quality indicators of tight glycaemic control in critically ill patients, and to inspect the applicability of their definitions. Methods We searched in MEDLINE® for all studies evaluating a tight glycaemic control protocol and/or quality of glucose control that reported original data from a clinical trial or observational study on critically ill adult patients. Results Forty-nine studies met the inclusion criteria; 30 different indicators were extracted and categorized into four nonorthogonal categories: blood glucose zones (for example, 'hypoglycaemia'); blood glucose levels (for example, 'mean blood glucose level'); time intervals (for example, 'time to occurrence of an event'); and protocol characteristics (for example, 'blood glucose sampling frequency'). Hypoglycaemia-related indicators were used in 43 out of 49 studies, acting as a proxy for safety, but they employed many different definitions. Blood glucose level summaries were used in 41 out of 49 studies, reported as means and/or medians during the study period or at a certain time point (for example, the morning blood glucose level or blood glucose level upon starting insulin therapy). Time spent in the predefined blood glucose level range, time needed to reach the defined blood glucose level target, hyperglycaemia-related indicators and protocol-related indicators were other frequently used indicators. Most indicators differ in their definitions even when they are meant to measure the same underlying concept. More importantly, many definitions are not precise, prohibiting their applicability and hence the reproducibility and comparability of research results. Conclusions An unambiguous indicator reference subset is necessary. 
The result of this systematic review can be used as a starting point from which to develop a standard list of well defined indicators that are associated with clinical outcomes or that concur with clinicians' subjective views on the quality of the regulatory process.


Introduction
Hyperglycaemia is frequently encountered in critically ill patients [1,2]. Even critically ill patients without diabetes develop hyperglycaemia. Until recently it was common practice to treat only marked hyperglycaemia in these patients, because hyperglycaemia was considered to be an adaptive response to critical illness [3]. Blood glucose control aiming to achieve normoglycaemia (blood glucose levels of 80 to 110 mg/dl), frequently referred to as 'tight glycaemic control' (TGC), decreases mortality and morbidity in critically ill patients [4,5]. It is the lowered blood glucose level (BGL) rather than the insulin dose that is related to reduced mortality and morbidity [6]. Attempts at achieving TGC, however, are not perfect and carry a risk for hypoglycaemia [4,5].
Several observational studies have reported on the quality of the glucose control process itself. The results and conclusions of these studies are contradictory [7]. Some show that the protocol prescribing the control process improves blood glucose control whereas others do not. Apart from differences in case-mix and in the associated therapy (for example, steroid therapy), two important issues hamper comparability between studies. The first impediment is the existing variability in the intervention's evaluation. The following interpretations, based on intention and on process, are both possible: the patient is intended to be treated according to a TGC protocol (for example, when a specific intensive care unit is designated an intervention group), independent of actual adherence to the glucose control protocol; or the characterization of the patient's blood glucose regulation is evaluated according to the actual intensity of blood glucose control. The latter interpretation requires agreement on the level of adherence to the TGC protocol in terms of timing of glucose measurements and insulin provision to qualify a patient as being on TGC. The second impediment concerns the variability in outcome measures; studies may not use a standard list of well defined indicators for evaluating the quality of glucose control. Work presented in this paper concerns this second impediment.
The objective of the present systematic review is to identify and summarize quality indicators for glucose control in published studies of TGC in critically ill patients. It also assesses the applicability of definitions of quality indicators and organizes the indicators into categories. This review may form a basis for future developments of a standard list of well defined indicators that may correlate with clinical outcomes or that reflect clinicians' intuition regarding the quality of a given regulatory process.

Materials and methods
We searched for relevant English language articles based on keywords in title, abstract and MeSH terms, using Ovid MEDLINE® and Ovid MEDLINE® In-Process (1950 to 31 December 2007). The final literature search was performed on 31 December 2007.
The following search strategy was used to identify the relevant articles. In the first stage we searched for 'glucose' and 'insulin'. In the second stage we limited the search using 'critical illness', 'critical care' or 'intensive care'. The results of these two stages were combined using the Boolean operator 'and'. Searching was supplemented by scanning the bibliographies of the identified articles.
Two reviewers independently examined all titles and abstracts. Discrepancies between the two reviewers were resolved by consensus involving a third reviewer. Articles were selected if they reported original data from a clinical trial or observational study conducted in critically ill adult patients, and only if one of their main objectives concerned the evaluation of quality of TGC, with or without implementing an explicitly specified protocol. A study was defined as evaluating a TGC protocol if the (implicit or explicit) protocol implied an upper target range. Adherence to the protocol did not influence whether the study was included. Opinion papers, surveys and letters were excluded. Studies employing glucose-insulin-potassium protocols were excluded because they are not originally designed to achieve TGC.
From the selected papers, the same two reviewers extracted data on TGC quality indicators (their definition and applicability). A quality indicator was defined as a measurable quantity of the TGC process that may, alone or in combination with other quantities, indicate some aspect of its quality. This includes, for example, mean (or median) BGLs as well as interpretations thereof in terms of counts of hyperglycaemic events. Discrepancies between the two reviewers were again resolved by consensus after involving the same third reviewer. We then attempted to categorize the quality indicators into coherent categories that capture their essence.

Results
Searching the online databases yielded 486 articles. Initial screening of titles and abstracts resulted in 50 articles eligible for further full-text review. One additional article was identified by reviewing bibliographies, for a total of 51 articles. Based on the full-text review, two studies were excluded because they turned out not to address original data, leaving 49 articles for detailed analysis. Only five out of 49 studies reported on a target upper limit above 150 mg/dl.
All quality indicators of the 49 studies are summarized in Tables 1 and 2. Most papers evaluated multiple quality indicators. The median number of quality indicators was five (range 2 to 10).
By inspecting the quality indicators, we arrived at four indicator categories based on the following: zones (adverse-zone [hypoglycaemia and hyperglycaemia] and in-range zone); BGLs (for example, mean morning BGL); time intervals (for example, time elapsed until an event occurs or time spent in some state); and protocol characteristics (for instance, blood sampling frequency).
The categories are not mutually exclusive. For example, the amount of time during which a patient is regarded to be in a hyperglycaemic state is related to an adverse-zone as well as to time. Below, we list indicators, in decreasing order of reported frequency, and describe our findings about them.

Hypoglycaemia (adverse-zone and time categories)
Almost all studies reported at least one hypoglycaemia-related indicator (43/49 studies). Hypoglycaemia-related indicators address TGC safety. Because of its central position among the TGC quality indicators reported, hypoglycaemia is reported in Table 1 as an overall class of indicators.  moderate hypoglycaemia, and three different levels for defining severe or marked hypoglycaemia. Although a BGL <40 mg/dl was reported in eight studies as a hypoglycaemic event, 10 other studies considered this to be severe hypoglycaemia.
One study reported severe hypoglycaemia only when a low BGL was accompanied by clinical symptoms such as sweating and decreased level of consciousness [8].
In some studies the number and/or percentage of BGL measurements below a given threshold and/or the number and percentage of patients with at least one measurement below this threshold were used as safety-related quality indicators. One article considered all measurements below the selected hypoglycaemic threshold value over a period of at least 1 hour to represent a single hypoglycaemic event; hence only when the BGL increased to within the normal range and then dropped below the hypoglycaemic threshold in a subsequent hour was it counted as a second hypoglycaemic event.
In other studies, the definition of hypoglycaemia was not clear, and it appeared that any measurement below the threshold was considered a hypoglycaemic event. Seven studies reported the number and/or percentage of dextrose injections when BGL was under a threshold value (using five different thresholds from 45 to 65 mg/dl) as quality indicators. Seven other indicators in this category were reported in at least one out of nine studies. Except for 'time from starting TGC till first hypoglycaemia', the other six indicators referred to the duration of hypoglycaemia or speed and quality of recovery after a hypoglycaemic event.

BGL summaries over time (BGL category)
BGL summaries were used in 41 out of 49 studies. BGL summaries correspond to the efficiency of TGC. This indicator was calculated in different ways and was represented as mean and/or median. In some studies the BGL itself was the unit of observation. In other studies the mean BGL per patient or per time unit (for example, 1 hour) was the unit of observation. One

Measurements and time in predefined BGL ranges (in-range zone and time categories)
Thirty-eight of the 49 studies reported the number of measurements and/or the time during which BGL was within a predefined range. These indicators are intended to address TGC efficiency. In 31 out of 49 studies, the percentage of measurements within the predefined range was considered a proxy for the proportion of time in each predefined BGL range. In 12 out of 49 studies, the percentages of time during which BGL was within the predefined range were calculated, in most of them under the assumption that BGL was linear over time. As shown in Figure 1, under this assumption a straight line is drawn between each two consecutive BGL measurements, and the time to the intersection between the line and a threshold value defining the range was taken as the time spent within the predefined BGL range. Five studies used both the number/percentage of measurements within the predefined range as well as the time during which BGL was within the predefined range. The unit of observation also differed among these studies. Only in one study was the percentage of measurements within the predefined range calculated per patient, and the mean percentage per patient was reported [11].
Table fragment: Next BGL after hypoglycaemia. Represented as mean BGL. 2 articles [8,49]. Table footnote: the unit of all BGL thresholds is mg/dl. BGL, blood glucose level; IIT, intensive insulin therapy.
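The difference between the two in-range indicators can be illustrated with a minimal Python sketch. It is not taken from any of the reviewed studies: the function names, the example values and the 80 to 110 mg/dl range are assumptions chosen for illustration, and the time-based indicator relies on the linear interpolation assumption described for Figure 1.

```python
def fraction_of_measurements_in_range(bgls, low, high):
    """Share of individual measurements that fall inside [low, high]."""
    return sum(low <= g <= high for g in bgls) / len(bgls)


def fraction_of_time_in_range(times, bgls, low, high):
    """Share of observed time inside [low, high], assuming BGL changes
    linearly between consecutive measurements.

    times must be strictly increasing; any time unit may be used.
    """
    time_in_range = 0.0
    for i in range(1, len(times)):
        t0, t1 = times[i - 1], times[i]
        g0, g1 = bgls[i - 1], bgls[i]
        if g0 == g1:
            # Flat segment: entirely inside or entirely outside the range.
            frac = 1.0 if low <= g0 <= high else 0.0
        else:
            # Positions (0..1 along the segment) where the line meets
            # the lower and upper limits of the range.
            s_a = (low - g0) / (g1 - g0)
            s_b = (high - g0) / (g1 - g0)
            s_enter, s_exit = min(s_a, s_b), max(s_a, s_b)
            frac = max(0.0, min(1.0, s_exit) - max(0.0, s_enter))
        time_in_range += frac * (t1 - t0)
    return time_in_range / (times[-1] - times[0])


# Hypothetical patient: minutes since start, BGL in mg/dl.
times = [0, 60, 240]
bgls = [150, 90, 90]
print(fraction_of_measurements_in_range(bgls, 80, 110))
print(fraction_of_time_in_range(times, bgls, 80, 110))
```

For this hypothetical patient, about 67% of the measurements but about 83% of the interpolated time lie within range, which shows why the measurement-based percentage is only a proxy for time in range when sampling intervals are unequal.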

Time to capture the defined BGL target (time category)
The time needed to capture the defined BGL target was reported in 25 out of 49 studies and was represented as mean or median.
Similar to the time-related indicators in the predefined BGL range subcategory, in most of these studies it was unclear how this indicator was calculated. Linearity of BGL over time was explicitly mentioned in some studies [12-15] and appeared to have been assumed in other studies. It is possible that some of the studies might have used the time needed to capture the actual first BGL measurement within the target range, instead of the interpolated value shown in Figure 1.
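The two possible readings of this indicator can be contrasted in a short Python sketch. The function names and example values are hypothetical, and the sketch assumes that the patient starts above the target and that "reaching the target" means falling to or below the upper target limit.

```python
def time_to_target_first_measurement(times, bgls, upper_limit):
    """Time from the first measurement to the first measured BGL that is
    at or below the upper target limit (None if never reached)."""
    for t, g in zip(times, bgls):
        if g <= upper_limit:
            return t - times[0]
    return None


def time_to_target_interpolated(times, bgls, upper_limit):
    """Time to the point where the straight line between two consecutive
    measurements crosses the upper target limit (None if never reached)."""
    if bgls[0] <= upper_limit:
        return 0.0
    for i in range(1, len(times)):
        g0, g1 = bgls[i - 1], bgls[i]
        if g1 <= upper_limit:
            # g0 is above the limit here, so g0 - g1 is positive.
            s = (g0 - upper_limit) / (g0 - g1)
            crossing = times[i - 1] + s * (times[i] - times[i - 1])
            return crossing - times[0]
    return None


# Hypothetical patient: minutes since start of TGC, BGL in mg/dl.
times = [0, 60, 120, 180]
bgls = [200, 160, 120, 100]
print(time_to_target_first_measurement(times, bgls, 110))  # 180
print(time_to_target_interpolated(times, bgls, 110))       # 150.0
```

In this example the two readings differ by 30 minutes for the same data, so a study that does not state which one it used cannot be compared directly with one that does.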

Hyperglycaemic indicators (adverse zone category)
Although the reduction in duration of a hyperglycaemic period forms a major goal of TGC, only 13 out of 49 studies explicitly mentioned how a hyperglycaemic event or indicator was defined.
[31-36], frequency per patient hour [23], percentage of patients with more than one measurement in a predefined time interval (2 hours) [31], or overall per day [12,21], and as the percentage of time during which at least one measurement per 2 hours was taken [32].
Frequent BGL measurement is a key element in TGC, in order to steer the process in a timely manner. However, greater sampling frequency increases nursing and laboratory utilization [30]. In some studies [14,18,19] the total number of BGL measurements was reported.
Adherence to protocol (reported in 12/49 studies) is another frequently used indicator. Evaluation of adherence to protocol mainly focused on the difference between the protocol-recommended time of the next BGL test and the actual time of testing.
The remaining indicators (19/30) were mentioned in fewer than six studies, and 12 of them in only one study.
Table fragment: Time until reaction to hypoglycaemic event. Represented as maximum time till hypoglycaemia recognition [8] or mean time till IIT adjustment after hypoglycaemic event [28]. 2 articles [8,28].
On the whole, the included studies did not comment on why a specific group of indicators was selected, and, after further inspection, we could not find an association between indicator selection and patient population, disease or specification of the designed protocols.

Discussion
We have identified, listed and categorized TGC quality indicators, as used in 49 studies. In our search for studies pertaining to TGC, we allowed any synonym, without limiting the search a priori. A limitation of our search is that we addressed only studies in which evaluation and quality measurement formed a main objective; we might therefore have missed some studies with a limited evaluation and quality measurement focus. In addition, frequency was used as the ordering principle for presenting and describing indicators. Although this approach provides a good overview of the popular indicators used, it may overlook less frequent but useful indicators. Finally, although indicator categories are useful in terms of managing and understanding indicators, their induction is subjective. One may for example also consider the complexity of calculation of indicators (for instance, calculating mean BGL is simpler and faster than time-weighted mean BGL).
To our knowledge this is the first review dedicated exclusively to quality indicators for TGC in critically ill patients. Existing reviews on TGC have focused on its effects [7,37]; they reported evidence of its utility and advantages, and discussed ways to implement TGC protocols successfully.

Indicators and indicator groups have merits and limitations.
Measures of mean BGL may mask measurements within adverse zones (for instance, two high BGLs may 'compensate' for one or more BGLs that are too low). Looking at hypoglycaemia and hyperglycaemic events separately would solve this problem, but this requires a way to combine both indicators into one quality indicator of blood glucose management. The Glycaemic Penalty Index, proposed very recently [38], is an attempt to address each zone and combine the two results. Indicators that neglect measurement timing, including the Glycaemic Penalty Index, may be sensitive to sampling. For example, the mean BGL of two determinations yielding the same BGL value taken at $t_1$ and $t_2$ or at $t_1$ and $t_3$, where say $t_3 > t_2$, will provide the same result, although the BGL, behaving as a function of time, may differ markedly. The hyperglycaemia index, which measures the area under the BGL over the time during which it was above a threshold, can mitigate this problem. The use of BGL measurement as the independent unit of observation neglects the within-patient correlation in BGLs.
On the other hand, when providing summaries at the patient level, some information is lost. Finally, a statistical point worth noting is that most BGL distributions are log-normal rather than normal [39], and hence nonparametric measures such as the median and interquartile range are likely to be more appropriate for summarizing the data and inference [40].
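The sensitivity of the plain mean to sampling can be made concrete with a small Python sketch. The function names and values are hypothetical, and the time-weighted mean shown here uses the trapezoidal rule, which is one possible choice and again assumes that BGL is linear between measurements.

```python
def simple_mean(bgls):
    """Arithmetic mean of the measurements, ignoring when they were taken."""
    return sum(bgls) / len(bgls)


def time_weighted_mean(times, bgls):
    """Mean BGL weighted by time, using the trapezoidal rule
    (assumes linear BGL between consecutive measurements)."""
    area = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        area += 0.5 * (bgls[i - 1] + bgls[i]) * dt
    return area / (times[-1] - times[0])


# Hypothetical patient: hours since admission, BGL in mg/dl.
# One high value is followed by a long period at 100 mg/dl.
times = [0, 1, 10]
bgls = [200, 100, 100]
print(simple_mean(bgls))               # about 133.3
print(time_weighted_mean(times, bgls)) # 105.0
```

The plain mean would be the same if the third measurement had been taken at hour 2 instead of hour 10, whereas the time-weighted mean changes with the timing of the measurements.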
Because hypoglycaemia is the main potential risk from implementation of a TGC protocol, almost all studies reported at least one indicator related to hypoglycaemia. The number of hypoglycaemic events before and after TGC implementation and/or the management of these events form the main TGC safety indicators. However, we found several definitions and ambiguous terminology for defining a blood glucose measurement (or a set of measurements) as a hypoglycaemic event.
Hypoglycaemic events were usually represented as the percentage or number of measurements below a defined level. Based on most glucose management protocols, the next BGL measurement after a hypoglycaemic event should be taken within 15 to 30 minutes. Only one study clearly stated that all measurements below the hypoglycaemia threshold over 1 hour after the first hypoglycaemic measurement were considered part of a single hypoglycaemic event or episode. In other studies it was not clear how these hypoglycaemic measurements were dealt with within a short interval, and hence whether they were regarded as a single or as multiple events. Some studies also reported the number of dextrose injections to address this problem, where each injection corresponds to one hypoglycaemic event regardless of the number of measurements within the short interval. Even in these studies, the criteria and the BGL threshold for dextrose injection were different.
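How much the event definition matters can be shown with a brief Python sketch. It implements one possible reading of the single-event rule described above (all measurements below the threshold within 1 hour of the first low measurement belong to one event); the function names, the 40 mg/dl threshold and the data are assumptions for illustration only.

```python
def count_hypo_measurements(bgls, threshold):
    """Count every measurement below the threshold as its own event."""
    return sum(1 for g in bgls if g < threshold)


def count_hypo_events(times, bgls, threshold, window=60.0):
    """Count events, merging all low measurements taken within `window`
    minutes of the first low measurement of the current event."""
    events = 0
    event_start = None
    for t, g in zip(times, bgls):
        if g < threshold:
            if event_start is None or t - event_start > window:
                events += 1
                event_start = t
    return events


# Hypothetical patient: minutes since start, BGL in mg/dl.
# Repeat measurements follow each low value after 15 minutes.
times = [0, 15, 30, 60, 240, 255]
bgls = [38, 35, 50, 90, 39, 70]
print(count_hypo_measurements(bgls, 40))   # 3
print(count_hypo_events(times, bgls, 40))  # 2
```

The same six measurements yield three events under the per-measurement reading and two under the 1-hour rule, so reported event counts are only comparable when the grouping rule is stated.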
Indicators such as the percentage or number of BGL measurements and the time during which BGL was within predefined ranges were frequently used to represent the time duration in each predefined BGL range. However, the predefined ranges were different in the various studies, once again hampering comparability among them. Summary measures themselves, like mean and median of BGLs, were calculated with different units of analysis. Studies reporting the percentage of measurements tended to base calculations on all BGL measurements, regardless of the number of measurements per patient.
In contrast, the percentage of time was calculated by taking a summary of each patient as the unit of analysis. These two ways of performing calculations do not necessarily yield the same results, because of within-patient correlations in measurements. It seems prudent to provide both results.
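The effect of the unit of analysis can be sketched in Python as follows. For simplicity the sketch applies both units of analysis to the percentage of measurements in range (rather than to time in range); the function names, patient labels and values are hypothetical.

```python
def pooled_percent_in_range(patients, low, high):
    """Measurement level: pool all measurements of all patients."""
    values = [g for bgls in patients.values() for g in bgls]
    return 100.0 * sum(low <= g <= high for g in values) / len(values)


def mean_patient_percent_in_range(patients, low, high):
    """Patient level: compute the percentage per patient, then average."""
    per_patient = [
        100.0 * sum(low <= g <= high for g in bgls) / len(bgls)
        for bgls in patients.values()
    ]
    return sum(per_patient) / len(per_patient)


# Hypothetical data (mg/dl): patient A is sampled often and well
# controlled, patient B is sampled rarely and poorly controlled.
patients = {
    "A": [90, 100, 105, 95, 100, 108, 92, 99],
    "B": [150, 160],
}
print(pooled_percent_in_range(patients, 80, 110))        # 80.0
print(mean_patient_percent_in_range(patients, 80, 110))  # 50.0
```

The pooled figure is dominated by the frequently sampled patient, whereas the patient-level figure gives each patient equal weight, which is one reason why reporting both is informative.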
The strong relation between hyperglycaemia and mortality and morbidity is well known from the literature. Hence, hyperglycaemia reduction forms the main goal of TGC. Surprisingly, only nine studies explicitly defined a hyperglycaemic event, and they employed different definitions of an event in terms of timing and the BGL thresholds (between 150 and 250 mg/dl). Reporting the fraction of time above a threshold instead of the percentage of measurements above a threshold, without time consideration, seems more useful as a proxy for reducing time in a hyperglycaemic state. The hyperglycaemia index, calculated as the area between the curve and the hyperglycaemia threshold divided by time, seems to be a useful time-weighted indicator for hyperglycaemia. In some other studies, the percentage of BGL measurements or time in a predefined BGL range above the defined normoglycaemic threshold was reported but without explicitly labeling them as hyperglycaemia.
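A minimal Python sketch of an index of this kind is given below. It follows the verbal description only: the function name, the 150 mg/dl threshold and the data are hypothetical, the BGL curve is assumed to be linear between measurements, and "divided by time" is taken here to mean the total observation time, which is an assumption rather than a definition confirmed by the reviewed studies.

```python
def hyperglycaemia_index(times, bgls, threshold):
    """Area between the interpolated BGL curve and the threshold
    (counting only the part above the threshold), divided by the
    total observation time."""
    area = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        e0 = bgls[i - 1] - threshold  # excess at start of segment
        e1 = bgls[i] - threshold      # excess at end of segment
        if e0 >= 0 and e1 >= 0:
            # Whole segment at or above the threshold: trapezoid.
            area += 0.5 * (e0 + e1) * dt
        elif e0 > 0 > e1:
            # Curve falls through the threshold: triangle before crossing.
            area += 0.5 * e0 * dt * e0 / (e0 - e1)
        elif e0 < 0 < e1:
            # Curve rises through the threshold: triangle after crossing.
            area += 0.5 * e1 * dt * e1 / (e1 - e0)
        # Otherwise the segment lies at or below the threshold: no area.
    return area / (times[-1] - times[0])


# Hypothetical patient: hours since admission, BGL in mg/dl.
times = [0, 2, 4]
bgls = [200, 100, 100]
print(hyperglycaemia_index(times, bgls, 150))  # 6.25
```

The result is expressed in the same unit as BGL (here mg/dl) and is zero for a patient who never exceeds the threshold.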
The quality of TGC in individual patients was rarely reported. Useful indicators include the percentages of patients with well and poorly controlled BGL, as defined by Carr and coworkers [31]; patients with at least one BGL outside the blood glucose target range; and patients who were not within the target blood glucose range
On the whole, the authors of studies did not explain their choices of specific subsets of indicators. It is conceivable that an indicator was described by a specific statistic such as a median because of an underlying non-normal distribution, in order to permit sound statistical inference. Although this may explain the specific choice of a statistic, it does not account for the choice for the underlying concept in the first place.

Conclusion
When comparing the results of studies, one must consider differences in case-mix, in insulin therapy, in other associated therapies, in the power of the analysis and in outcome measures. The latter was the focus of this paper. The ambiguity and variability in the definitions of indicators and the threshold values for reporting an event as hypoglycaemia or hyperglycaemia severely hamper comparability among studies. A main problem is the absence of a 'gold standard' against which to compare indicators. Although there are almost no studies comparing different glycaemic metrics with relevant clinical outcomes, such as severity-associated mortality, deciding upon a common glycaemic vocabulary is an essential first step.
One possible useful way to proceed is to investigate further the relationship between indicators and clinical outcomes, for example their prognostic value (for example [24,41]). A second possible way is to ask a committee of experts to assess, for a wide range of patients, the perceived adequacy of TGC.
Ideally, for this sample of patients the BGL would be continuously measured, with insulin provision being based only on protocol-based measurement sampling. Because this is an ethically questionable approach (because not all measured BGLs are acted upon), an alternative is to attempt to achieve very high sampling of BGL measurements. Then, the indicators could be assessed according to their concordance with how well BGL is controlled, as assessed by expert opinion. This approach is subjective but it can provide important insight into the merits of indicators. In the meantime, studies should report on a more comprehensive set of indicators, including at least one pertaining to each of time, hyperglycaemia and hypoglycaemia. One should also report results at the measurement as well as patient level. An important message of this review is that many indicators are not but should be precisely defined, using formulas when necessary, to facilitate their assessment and comparability.

Key messages
• TGC indicators differ widely in their definitions, even when they are meant to measure the same underlying concept.
• Many definitions of indicators are not precise, limiting their applicability and hence the reproducibility and comparability of research findings.
• An unambiguous indicator reference subset is necessary for evaluating quality of TGC.
• The result of this systematic review can be used as a starting point from which to develop a standard list of well defined indicators, which are associated with clinical outcomes or concur with clinicians' subjective views on the quality of the regulatory process.