Clinical review: Can we predict which patients are at risk of complications following surgery?

There are a vast number of operations carried out every year, with a small proportion of patients being at highest risk of mortality and morbidity. There has been considerable work to try and identify these high-risk patients. In this paper, we look in detail at the commonly used perioperative risk prediction models. Finally, we will be looking at the evolution and evidence for functional assessment and the National Surgical Quality Improvement Program (in the USA), both topical and exciting areas of perioperative prediction.


Introduction
There are an estimated 234 million surgical operations every year worldwide [1], of which 4.2 million operations are carried out in England [2]. A precise estimation of perioperative complications and postoperative morbidity is difficult to gain, but it has been suggested this may occur in between 3 and 17% of cases [3,4]. This wide range in reported complications is probably related to variable reporting, as well as disputed classification of complications. These complications cover a range of organ systems, including gastrointestinal, infectious, pulmonary, renal, haematological and cardiovascular [5,6]. These complications can be anaesthetic related (for example, postoperative nausea and vomiting or hypoxaemia in the recovery room) or surgical (for example, wound related, ileus or haemorrhage).
Postoperative mortality across all procedures is approximately 0.5%, although it may exceed 12% in older patients undergoing emergency surgery in the UK [7]. A small high-risk group of patients has been shown to be responsible for approximately 83% of deaths and significantly longer hospital stays, despite making up only 12.5% of hospital admissions for surgery [7]. Of note, almost 90% of the patients in this high-risk group had emergency surgery, but <15% of them were admitted to critical care directly from the operating theatre. By comparison, cardiac surgery, which traditionally involves high-risk patients, routinely admits the majority of its patients to critical care postoperatively. Cardiac surgery has openly published mortality rates for a number of years. These rates have demonstrated a steady improvement, with a typical mortality rate of <2 to 3% [8].
Ideally, we would like to identify the patients who are most likely to suffer postoperative complications or mortality, both to inform the decision to operate, and to target postoperative care and critical care provision for these patients. Unfortunately, outcomes for patients undergoing surgery currently vary widely, and (particularly emergency) surgical care is often disjointed and may not be appropriately patient centred [9].

Complications
Accurate figures for surgical complication rates are difficult to obtain because of the lack of consensus amongst surgeons on what constitutes a postoperative complication. This difficulty is further exacerbated by disagreement on a structured classification of postoperative complications and morbidity, making it difficult to compare different surgical techniques or predictive models for surgical complications. In 1992 a model for classification of surgical complications was proposed by Clavien and colleagues [10]. Uptake of this model of classification was slow, due in part to a lack of evidence of international validation. The model was updated in 2004, and evaluated in a large cohort of patients by an international survey. This new model allows grading of postoperative complications, regardless of the initial surgery. The different categories are broad, permitting clear placement of complications in the various grades (Table 1).
To accurately record postoperative complications, it is important to have a validated questionnaire. The Postoperative Morbidity Survey is one such questionnaire [5,11]. This survey is well-validated and provides objective evidence of postoperative complications, fitting the

Guidelines
There are a number of guidelines available to both aid in the identification of and guide the care of the high-risk patient.
In 2010 the Association of Anaesthetists of Great Britain and Ireland published guidelines on the preoperative assessment of a patient having an anaesthetic [12]. This document encourages a formal preoperative assessment process, which should start the process of identifying high-risk patients, as well as preparing the patient for their anaesthetic. These guidelines incorporate the guidelines issued by the National Institute for Clinical Excellence in 2003 on the use of routine preoperative tests for elective surgery [13].
The American Heart Association published guidelines on perioperative cardiovascular evaluation and care for noncardiac surgery in 2007 [14]. These were updated in 2009 to incorporate new evidence relating to perioperative β-blockade [15]. Similar guidelines were also issued by the European Society of Cardiology and endorsed by the European Society of Anesthesiology in 2009 [16]. One important predictive element suggested by the guidelines is the use of metabolic equivalents (METs): 1 MET is the resting oxygen consumption of a 40-year-old, 70 kg man, and is approximately 3.5 ml/minute/kg. Patients unable to reach 4 METs (equivalent of climbing a flight of stairs) are suggested to be at increased risk during surgery [17].
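The MET arithmetic above can be illustrated with a minimal sketch. It assumes only the figures given in the text (1 MET of approximately 3.5 ml/minute/kg and a 4 MET threshold); the function names are hypothetical, and the example is illustrative rather than a clinical tool.

```python
# Illustrative conversion between oxygen uptake and METs.
# Both constants come from the text; the function names are hypothetical.
RESTING_VO2_ML_PER_KG_MIN = 3.5  # approximate oxygen consumption at 1 MET
MET_THRESHOLD = 4.0              # below this, increased risk is suggested


def vo2_to_mets(vo2_ml_per_kg_min: float) -> float:
    """Convert an oxygen uptake (ml/minute/kg) into metabolic equivalents."""
    if vo2_ml_per_kg_min < 0:
        raise ValueError("oxygen uptake cannot be negative")
    return vo2_ml_per_kg_min / RESTING_VO2_ML_PER_KG_MIN


def below_met_threshold(peak_vo2_ml_per_kg_min: float) -> bool:
    """Return True when the patient cannot reach the 4 MET threshold."""
    return vo2_to_mets(peak_vo2_ml_per_kg_min) < MET_THRESHOLD
```

On these figures, 4 METs corresponds to an oxygen uptake of about 14 ml/minute/kg.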
The Royal College of Surgeons of England and the Department of Health have also set up a Working Group on the Peri-operative Care of the Higher Risk General Surgical Patient, which has issued a set of guidelines on the care of the high-risk surgical patient [9]. In addition to the detection of complications following surgery, these guidelines emphasise the importance of a rapid, appropriate response to limit the number and severity of complications. Part of this response would include appropriate early use of critical care facilities.

Risk prediction
Evidently it would be preferable to identify high-risk patients prior to starting any operations. To make this identification it is necessary to have an agreed definition of what constitutes a high-risk patient. The Royal College of Surgeons of England Working Group has defined a high-risk patient as one with an estimated mortality ≥5%, with consultant presence being encouraged if this value exceeds 10%. The group go on to suggest that any patients with estimated mortality >10% should be admitted to critical care postoperatively.
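The Working Group thresholds described above can be expressed as a short sketch. The cut-offs (≥5% and >10%) are taken from the text; the function name and the returned keys are hypothetical, and the sketch does not replace clinical judgement.

```python
# Illustrative mapping from an estimated mortality to the Working Group
# recommendations quoted in the text. Names and keys are hypothetical.
HIGH_RISK_CUTOFF = 0.05      # estimated mortality >= 5% defines high risk
ESCALATION_CUTOFF = 0.10     # above 10%: consultant presence, critical care


def working_group_flags(estimated_mortality: float) -> dict:
    """Return the recommendations triggered by an estimated mortality (0 to 1)."""
    if not 0.0 <= estimated_mortality <= 1.0:
        raise ValueError("estimated mortality must be a proportion between 0 and 1")
    above_escalation = estimated_mortality > ESCALATION_CUTOFF
    return {
        "high_risk": estimated_mortality >= HIGH_RISK_CUTOFF,
        "consultant_presence_encouraged": above_escalation,
        "critical_care_admission_suggested": above_escalation,
    }
```

Note that the high-risk definition is inclusive of 5%, whereas the escalation steps apply only when the estimate exceeds 10%.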
To accurately estimate probable mortality and morbidity, we should ideally use an approach that combines the patient's physiological characteristics with the procedure to be carried out to calculate a predictive risk. The ideal risk prediction score should be simple, easily reproducible, objective, applicable to all patients and operations, and both sensitive and specific. Furthermore, this score should be equally easily applied to both the emergent and non-emergent patient and setting. Whilst in the non-emergent setting the anaesthetist has access to all of the patients' investigations and to more elaborate physiological investigations, the emergent scenario requires decisions based on the acute physiological condition and quick investigations. The two scenarios can therefore be very different, and it may not be possible to use one risk score for both emergent and non-emergent operations.
There are various risk scoring systems that have been described in the literature. These systems can be classified as those estimating population risk or individual risk [18,19]. Scores predicting individual risk can be general, organ specific, or procedure specific. It is important not to use population-based scoring systems in isolation to make individual decisions because they cannot always be extrapolated to specific patients. An example of a general score that is based on estimating population risk is the American Society of Anesthesiologists (ASA) classification [20]. The ASA classification was not originally composed as a risk prediction score, although it is often used as such. The different ASA classes have been shown to be good predictors of mortality [21], while the rate of postoperative morbidity has also been noted to vary with class [22]. The ASA system has the advantage of being a simple, easily applied score, which is widely known. However, the ASA classification is subjective and does not provide individual or procedure specific information. The system has also been shown to have poor sensitivity and specificity for individual patient morbidity and mortality [23].
The Charlson Comorbidity Index is a generic score based on weighting various preoperative diseases and predicting long-term survival [24]. This score is relatively simple to use, but also does not take into account the surgical operation, and relies on a subjective assessment of the patient, which may lead to errors. As such, it tends to be used as a research tool rather than in daily clinical practice [25].
In 1999 Lee and colleagues published a Revised Cardiac Risk Index [26]. This index is a scoring system used solely to predict the risk of major cardiac events after noncardiac surgery. Whilst the Revised Cardiac Risk Index is a simple, well-validated system that also considers the scale of surgery undertaken, it can only be used to predict single-organ risk.
The Acute Physiology and Chronic Health Evaluation (APACHE) score was first introduced in 1981 [27] before the updated APACHE II score was published in 1985 [28]. The APACHE II system assigns a score based on 12 physiological variables, with further points for age and chronic health, but it does not consider the type of surgery undertaken as the score was originally designed for use in critical care. This score therefore provides an individualised risk of mortality and morbidity, but does not differentiate between different procedures. Despite this lack of differentiation, APACHE has been shown to give a better prediction of outcome than the ASA system [29], and has been shown to predict different levels of surgical complications (minor, major and death). APACHE III and APACHE IV have subsequently been released, but have not been validated to the same extent as APACHE II for preoperative risk prediction. In addition, these scores are considerably more complex, requiring 17 physiological variables to be measured over the first 24 hours of critical care stay. This requirement for the variables to be recorded over the first 24 hours of critical care stay is present in all variations of the APACHE score, and is a major impediment to the regular use of this score preoperatively in emergency or urgent surgery. A derivation of the APACHE system that is useful for comparing patients with different diseases is the Simplified Acute Physiology Score II [30]. This score also requires the collection of 17 variables over the first 24 hours of critical care stay, resulting in a predicted mortality score. The Simplified Acute Physiology Score II is not designed for use in perioperative prediction, although it can be used in this field.
The Physiological and Operative Severity Score for the Enumeration of Mortality and Morbidity (POSSUM) score was designed for use in preoperative risk prediction, allowing for both individual physiological risk and the type of surgery performed [31]. This scoring system examines 12 physiological and six operative variables, which are then entered into two mathematical equations to predict mortality and morbidity. Unfortunately, there was a tendency to overpredict mortality in low-risk patients as a result of using logistic regression to predict risk (the lowest possible mortality risk is 1.08%). In 1998 Portsmouth-POSSUM was published in an attempt to reduce this overprediction [32]. Whilst improving the mortality scoring, Portsmouth-POSSUM did not update the equation for morbidity scoring. Another variation of POSSUM is colorectal-POSSUM, designed in 2004 for use in colorectal surgery [33]. Despite some evidence that POSSUM may overestimate or underestimate risk in specific populations, POSSUM and its various surgery-specific iterations remain the most validated and used scoring system for predicting individual patient risk (Table 4).
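The 1.08% floor mentioned above follows directly from the logistic form of the equation. The sketch below is illustrative only: the coefficients are those commonly quoted for the original POSSUM mortality equation and are an assumption here, since this review does not print them, as is the minimum of one point per variable. The function name is hypothetical.

```python
import math

# Assumed coefficients of the original POSSUM mortality equation,
# ln(R / (1 - R)) = -7.04 + 0.13 * physiology + 0.16 * operative.
# These values are not given in this review and should be checked
# against the original publication before any real use.
INTERCEPT = -7.04
PHYSIOLOGY_COEFFICIENT = 0.13
OPERATIVE_COEFFICIENT = 0.16

# Assumption: each of the 12 physiological and 6 operative variables
# scores at least 1, so these are the lowest attainable totals.
MIN_PHYSIOLOGY_SCORE = 12
MIN_OPERATIVE_SCORE = 6


def possum_mortality_risk(physiology_score: int, operative_score: int) -> float:
    """Return the predicted mortality as a proportion between 0 and 1."""
    if physiology_score < MIN_PHYSIOLOGY_SCORE or operative_score < MIN_OPERATIVE_SCORE:
        raise ValueError("scores are below the minimum attainable totals")
    logit = (INTERCEPT
             + PHYSIOLOGY_COEFFICIENT * physiology_score
             + OPERATIVE_COEFFICIENT * operative_score)
    # Inverse of the logit transform.
    return 1.0 / (1.0 + math.exp(-logit))


# The fittest patient having the most minor operation still receives
# a non-zero predicted mortality.
floor_risk = possum_mortality_risk(MIN_PHYSIOLOGY_SCORE, MIN_OPERATIVE_SCORE)
print(f"{floor_risk:.2%}")  # prints 1.08%
```

Because the lowest attainable scores still produce a logit of about -4.52, the predicted mortality cannot fall below roughly 1.08%, which is consistent with the overprediction in low-risk patients described above.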
These scores are often used to calculate the mortality and morbidity risk prior to surgery. However, it is important to keep in mind the fact that high-risk surgery may still be of benefit in certain patients. It is also important not to base postoperative critical care admissions purely on the scoring systems above. Strict criteria for admission to and discharge from a critical care unit remain difficult to define objectively. Occasionally we will see patients who do not have a high score on the above systems, but clinically are frail, have multiple minor comorbidities, or have fewer but more significant comorbidities. Treating these cases as high-risk patients with postoperative critical care is important despite the low score. Ultimately, the various risk stratification scores can only be accurate for a proportion of patients, and there will always be patients in whom they are not accurate. These patients are those who can only be selected out through clinical acumen, or by paying attention to the much-talked-about gut feeling. It is important to remember that some scores are designed to be calculated preoperatively (POSSUM), while others are designed for postoperative use (APACHE). While the scores can be adapted and used at any stage in the patient's care, they may not be as accurate.
An area of anaesthetic preoperative assessment that is receiving a high level of interest currently is functional assessment. Traditionally, functional assessment has always been a part of preoperative assessment prior to the removal of organs (pulmonary testing before pneumonectomy or dimercaptosuccinic acid scan before nephrectomy). In addition, functional testing is often used to quantify the level of disease in a patient with known disease (stress echocardiography or pulmonary function testing). Cardiopulmonary exercise testing is an integrated test that looks at both cardiac and pulmonary function. This testing involves incremental physical exercise, up to the patient's maximal level (at which they are unable to do more, or become symptomatic). Whilst doing this exercise, the ventilatory effort, inspiratory and expiratory gases, blood pressure and electrocardiogram are recorded. These are used to calculate two values: the body's maximal oxygen uptake and the point at which anaerobic metabolism exceeds aerobic metabolism (anaerobic threshold). These figures are used to demonstrate the ability of the cardiopulmonary system to oxygenate the body. Measurement of the maximal oxygen uptake, and hence the patient's true MET status, by cardiopulmonary exercise testing has demonstrated that the traditional estimation of MET is often inaccurate. Recognition of this inaccuracy has led to increased identification of patients who are at increased risk without being symptomatic or having identifiable factors in their medical and anaesthetic history. Cardiopulmonary exercise testing has long been shown to have good predictive value for postoperative complications in pulmonary resection surgery [34,35].
There is now increasing evidence for the benefit of using cardiopulmonary exercise testing in general surgery as a predictive test for postoperative morbidity and mortality [36-40]. However, there are still doubts about the evidence base in certain surgical specialties and hence the global suitability of cardiopulmonary exercise testing at present [41].
In 1991, in the USA, the National Veterans Affairs Surgical Risk Study prospectively collected data on major operations at 44 Veterans Affairs hospitals [42]. Based on these data, the study developed risk-adjusted models for 30-day morbidity and mortality for a number of surgical subspecialties [43,44]. Following on from this study, the Veterans Affairs National Surgical Quality Improvement Program (NSQIP) was set up in 1994 at all of the Veterans Affairs hospitals, leading to a 45% reduction in morbidity and a 27% decrease in mortality (and hence large cost savings) [45]. The NSQIP was subsequently expanded to include a number of university teaching hospitals in the Patient Safety in Surgery study funded by the American College of Surgeons (ACS) from 2001 to 2004. The Patient Safety in Surgery study demonstrated a significantly lower 30-day unadjusted mortality for men in the study hospital [46,47].
As a result, in 2004 the ACS-NSQIP was started. By 2008, 198 hospitals were receiving ACS-NSQIP feedback on their outcomes [48]. By using the hospitals with lower morbidity or mortality as benchmarks, the adjustable factors behind poor outcomes in individual hospitals can be identified and changed to improve outcomes [49,50]. One example of this relates to colectomies performed in ACS-NSQIP enrolled hospitals. These operations have been shown to increasingly be performed laparoscopically in these hospitals, with significant reductions in most major complications (including surgical-site infections, pneumonia and sepsis) [51]. One should remember, despite the potential benefits of the ACS-NSQIP programme, that there are limits to its usefulness. The input of data is labour intensive, and the results are only as good as the data input. Furthermore, the results are based on interpretation of data in specific categories, thus missing complications that do not fall into these specific areas [52-54]. This ACS-NSQIP programme is also building up a large database of information that should hopefully produce more effective risk stratification scores in the future. One area of healthcare policy that is very topical is the improved outcomes provided by carrying out certain operations in fewer high-volume surgical centres [55,56]. Low-risk patients, however, have been shown to have comparable outcomes in both low-volume and high-volume centres [57]. The moderate-risk to high-risk patients do still have better outcomes in the larger regional centres. Hence, it is important to risk stratify a patient before selecting a hospital for an elective operation (the local smaller hospital may still be an appropriate place to undergo surgery).

Conclusion
Currently, preoperative risk stratification is often not part of the standard preoperative assessment (with the exception of the ASA classification). There are a number of reasons for this omission. The currently available scores are often complicated, needing multiple tests or time to complete. Facilities and staff time/training may not be available for functional testing. Traditionally, junior doctors, in addition to their other clinical duties, carried out preoperative assessment; they may not have been aware of the guidelines and risk stratification scores for use in surgery. Additionally, mortality and morbidity tables for individual hospitals and surgeons/surgeries are not routinely published for noncardiac surgery. As a result, risk stratification is often not a priority for hospital managers or clinicians, who may or may not know accurate outcome statistics for their patients. However, the current financial constraints on the National Health Service are likely to lead to renewed efforts to reduce the length of stay in hospital by reducing postoperative morbidity. The government's stated aim to increase competition (and in so doing improve results) is likely to lead to increased interest in also reducing mortality. In the absence of a British version of NSQIP, there is likely to be increased focus on preoperative risk stratification scoring. As well as potentially reducing costs and improving performance, preoperative scoring has the potential to ensure better informed consent and patient/procedural selection, as well as appropriate targeting of postoperative critical care services.
Unfortunately, all of the currently used risk scoring systems have limitations. These limitations include interobserver variability for the ASA classification, the complicated nature and need for 24 hours of observations with APACHE, and the overestimation of mortality in lower risk groups with POSSUM. The single-organ scores are often useful in predicting organ dysfunction, but only provide a limited picture. The present limitations do not preclude the use of the tests, but they do make it important to select the test based on the patient population and the surgery being performed. Currently, assigning patients to bands of risk (that is, high, medium or low) may be the best we can achieve, but it is still not a routine calculation.
An area of great interest in preoperative assessment for elective surgery is functional testing. This area presently generates a lot of debate, with strong views on both sides. There is good evidence for the use of functional testing in specific surgical specialties. However, the situation does remain unclear in other forms of surgery. In addition, functional testing is time consuming, and requires investment and training to get started. This investment is clearly difficult at present with budgets being reduced across the board. For functional testing to become established, further evidence is needed to demonstrate its relevance across all surgical specialties. This is an area that is still in its infancy, but as further research is carried out it will probably become more established and see wider use. The potential to provide individualised risk prediction based on an individual's physiological response to stress is an exciting area, with the possibility of high predictive value and better use of critical resources to improve patient care.