Clinicians have struggled with the capricious nature of predicting surgical outcomes for thousands of years. If one wanders off the beaten track to the basement of the Louvre in Paris, one will come across a black diorite plinth inscribed in cuneiform from the time of King Hammurabi of Babylon (Figure 1). As early as 1750 BC he was issuing edicts aimed at practising clinicians, the best known being:
‘If a surgeon operates on a free man and the man dies or goes blind then the surgeon should have his hand cut off’.
‘If a surgeon operates on a slave and the slave dies then it is the responsibility of the surgeon to replace the slave’.
It would appear at first sight that little has changed over the intervening four thousand years, but over the past thirty years there has been an increasing clinical awareness of the importance of clinical audit and clinical governance as tools to help with overall quality improvement. Although mortality alone is often used as a quality measure in itself, clearly a number of factors can influence the outcome of surgical endeavour. The quality and experience of the surgeon and the anaesthetist preparing the patient for surgery, and its subsequent performance, can have a significant effect on outcome. However, the patient will often bring with them the major prognostic factor with regard to subsequent outcome: their physiological fitness. This may be reflected in their chronic disease status or in the acute physiological disturbance caused by their acute illness. Finally, the procedure itself will have a major effect on surgical outcome.
All these variables are amenable to change. We can expand our clinical knowledge to encompass new procedures. We can contract our practice to those areas in which we can excel. We may be able to improve a patient’s chronic disease status, devise new methods of anaesthesia to minimise risk in particular patients, or amend a patient’s acute physiological disturbance. We can even alter the magnitude of our surgical intervention to a degree. It was probably these thoughts, rather than fear of lawyers and legislators, that led clinicians to look at methods for measuring and predicting the outcome of surgical intervention.
Let us look at some of the methodologies available for predicting and measuring surgical performance and examine the application of clinical audit and outcome measures to this field.
Clinical audit tools
Clinical audit techniques use existing research or effectiveness data to formulate quality standards against which it is possible to assess performance. In the main, these standards are process rather than outcome driven and assume that improvements in process will result in improvements in care and thus, ultimately, outcome. There is now a wealth of published guidelines produced by national and international bodies as well as local specialist networks. In the UK, the National Institute for Health and Clinical Excellence (NICE) publishes guidelines on new treatments, new interventions and clinical treatment, with an expectation that the health service as a whole will comply with this guidance within a three-month period. The currently available NICE treatment guidelines focus mainly on high-volume disease states such as cancer, chronic disease states (e.g. diabetes, ischaemic heart disease, hypertension) and a smaller number of acute conditions (e.g. depression, anxiety). Nationally, there is now some evidence to suggest that improvements in process measures have resulted in improvements in outcome, with regard to mortality at least, in medical conditions such as myocardial infarction and stroke. There is, however, little national evidence that clinical audit has as yet significantly improved outcome in the surgical specialities. At a local level there is some evidence that implementation of guidelines and audit quality improvement programmes can improve survival and reduce complaints and litigation (see Table 1), but it is always difficult to separate the contributions made to quality improvement by audit, by the introduction of new staff and by the development of multidisciplinary team working.
Whatever the contribution, there can be little doubt that regular clinical audit monitoring of process guidelines prevents performance slippage and will identify outliers at an early stage, provided the guidelines are up to date and widely available.
Techniques for assessing true outcome
Whereas clinical audit methods tend to concentrate on process and structures, most patients, and indeed surgeons, are more interested in true outcome. This can fall into three main end points: death, complication or survival. Most experienced surgeons and anaesthetists are able, accepting wide confidence limits, to guess the probable mortality outcome from a particular intervention. Interestingly, the ability to predict morbidity often deteriorates with the seniority of the clinician (see Tables 2 and 3). Some specialist societies are now attempting to define outcomes for individual procedures (vascular, colorectal and orthopaedic in particular) to allow comparisons to be made between units. The Vascular Society of Great Britain and Ireland has produced mortality ranges for surgical intervention for abdominal aneurysm and peripheral vascular disease (Table 4), and the Association of Coloproctology of Great Britain and Ireland also produces mortality rates for emergency and elective colon resection and anastomotic leak rates for anterior resection and other anastomoses (Table 5). In England the Department of Health now publishes Standardised Mortality Ratios (SMRs) for four procedures (hip replacement, knee replacement, and elective and emergency aneurysm repair).
Although the latter has tried to introduce some form of risk adjustment for age, sex, social deprivation and co-morbidity, the methodology is far from accurate and confidence limits are wide. It remains to be seen whether the availability of such SMRs to the general public reassures them of the equality of care or produces patient flows from units with SMRs above 100 to those below 100, despite all units performing within 99 per cent confidence limits. As with many surgeons, the public at large do not always understand complex mathematical models but do understand the concept of good (SMR under 100) and bad (SMR over 100).
Perhaps differing models may provide the solution. Models that merely produce an assessment of high or low risk with various gradations in between, such as ASA, clearly do not offer the solution. Neither do similar systems that apportion risk but without a numerical prediction of outcome for the individual patient. APACHE requires observation over a twenty-four-hour period, and the worst variables are applied to a mathematical formula which has extensive correction weightings for individual disease conditions. In comparison with the methods discussed previously it produces an individual numerical patient prediction for mortality, but more variables are necessary and the mathematics can be complex, usually requiring significant hardware and software support. These problems have limited the application of APACHE in general surgery, where successful surgical intervention can have a major and immediate effect on physiological status.
In an attempt to overcome some of these difficulties, general surgeons during the late 1980s began to develop a methodology which would produce an individual patient prediction of both mortality and morbidity utilising data which was regularly collected and easy to obtain. This led to the development of the POSSUM system (see Tables 6 and 7), first published in 1991, which has now become one of the best known and most widely applied methods for surgical audit. It has been validated in a wide range of surgical specialities including vascular surgery, colorectal surgery, thoracic surgery and general surgery. An orthopaedic POSSUM has recently been described and validated, in which the general equations are still utilised but there are minor modifications to the operative severity score assessment. A modification of the POSSUM system has been devised which is of particular use in individual patient prediction. The p-POSSUM (Portsmouth POSSUM) system has proved to be particularly popular in vascular surgery. The same variables are assessed but a linear rather than logistic model (Table 8) is used, making the mathematics easier to apply and to build into locally designed software.
More recently, further refinements of the original POSSUM system have been described specifically for colorectal and oesophageal surgeons. Tekkis et al. have described both a CR-POSSUM for colorectal surgeons and an O-POSSUM (Table 9) for oesophagogastric surgeons. These have the advantage of reducing the variables required for prediction and improving the accuracy for these particular fields of surgery. O-POSSUM is, however, somewhat complex and requires knowledge of individual variable coefficients, similar to the APACHE systems. As yet, unlike the original POSSUM equations, they have not been validated in units outside the UK, but the original estimation data set was obtained from many differing sites across the UK and, as the variables and weightings are similar to those of the original POSSUM scoring system, it is likely that their accuracy will be confirmed by other observers. However, all these adaptations, unlike the original POSSUM system, have as yet no morbidity predictive model, and cross-speciality comparison is, of course, not possible.
Using predictive models of surgical outcome
If one has the ability to assess and predict individual patient outcomes, how can this information be utilised? The easiest and most widely utilised technique is as an audit aid when discussing adverse events. However, it soon became apparent that techniques of this sort could also be used to assess the performance of individual surgeons, anaesthetists and units.
Systems such as POSSUM and APACHE, which produce such a prediction, have obvious advantages in this regard. Some authors have suggested that the p-POSSUM mathematical model has advantages in individual case review, and this may well be so in low-risk cases, as both the POSSUM and APACHE models are logistic equations based on populations of patients rather than individuals. Certainly the p-POSSUM and POSSUM systems are the ones recommended by the Royal Colleges of Surgeons of both England and Edinburgh and by NCEPOD, and they are probably the methods of choice. The POSSUM system is the only system that produces a numerical prediction of morbidity across the surgical spectrum.
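The general shape of a logistic risk equation of the kind referred to above can be sketched as follows. This is only an illustration: the function name, the choice of two score inputs and the coefficient values are hypothetical placeholders, not the published POSSUM or APACHE values, which should be taken from the relevant tables.

```python
import math
from typing import Sequence


def logistic_risk(intercept: float,
                  coefficients: Sequence[float],
                  scores: Sequence[float]) -> float:
    """Return a predicted risk between 0 and 1 from a logistic equation.

    The log-odds are modelled as: intercept + sum(coefficient * score).
    All coefficient values passed in are the caller's responsibility;
    nothing here encodes a published scoring system.
    """
    if len(coefficients) != len(scores):
        raise ValueError("need one coefficient per score")
    z = intercept + sum(c * s for c, s in zip(coefficients, scores))
    # Numerically stable form of 1 / (1 + exp(-z)).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    return math.exp(z) / (1.0 + math.exp(z))


# Placeholder values only (hypothetical, for illustration).
example_risk = logistic_risk(-4.0, [0.1, 0.1], [20, 10])
print(f"predicted risk: {example_risk:.3f}")
```

Because the equation is fitted to populations, the result is best read as the proportion of similar patients expected to suffer the outcome rather than a verdict on one individual.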
Clinical audit of adverse outcomes can be a particularly depressing affair. While it can be of great value to discuss cases where death occurs and predictive models indicate a risk of death of less than 20 per cent, the opposite end of the spectrum (risk greater than 80 per cent) often yields little audit gain except to discuss whether the operation was indeed indicated. Predictive models of these types can produce a new audit spectrum, that of the patients whose risk exceeds a certain level (for example >50 per cent) but who survive. Often, audit of these cases can identify best practice and produce changes in resuscitative protocols which lead to a sustained quality improvement. Such an approach has the added value of making clinical audit an uplifting rather than depressing experience.
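The audit spectrum described above can be expressed as a simple case-selection rule. The sketch below uses the thresholds quoted in the text (20, 80 and 50 per cent); the function name and the category labels are hypothetical and chosen purely for illustration.

```python
def audit_category(predicted_risk: float, died: bool) -> str:
    """Assign a case to an audit group from its predicted risk of death.

    predicted_risk is a probability between 0 and 1. Thresholds follow
    the text; the labels are illustrative, not an established taxonomy.
    """
    if not 0.0 <= predicted_risk <= 1.0:
        raise ValueError("predicted_risk must lie between 0 and 1")
    if died and predicted_risk < 0.20:
        return "low-risk death"        # usually the most valuable to discuss
    if died and predicted_risk > 0.80:
        return "very-high-risk death"  # mainly: was the operation indicated?
    if not died and predicted_risk > 0.50:
        return "high-risk survivor"    # may reveal best practice
    return "other"


cases = [(0.05, True), (0.90, True), (0.65, False), (0.10, False)]
for risk, died in cases:
    print(risk, died, audit_category(risk, died))
```

Listing the high-risk survivors alongside the low-risk deaths gives the audit meeting both kinds of case to learn from.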
Over the past fifteen years, there has been increasing interest in the outcomes of individual units as well as individual surgeons. If one simply applied mortality rates and took the radical stance of closing the worst-performing 5 per cent of units each year then, as any mathematician will point out, after 10 years one would have closed 40 per cent of units and probably still not improved overall care. Fortunately no country has, to date, chosen to take such a radical decision.
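The 40 per cent figure follows from compounding, on the assumption that the closure is repeated every year and always removes 5 per cent of the units still open. A minimal check of the arithmetic (the function name is hypothetical):

```python
def fraction_closed(annual_closure_rate: float, years: int) -> float:
    """Cumulative fraction of units closed after repeated annual closures.

    Assumes the same proportion of the remaining units is closed each year.
    """
    remaining = (1.0 - annual_closure_rate) ** years
    return 1.0 - remaining


print(round(fraction_closed(0.05, 10), 3))  # 0.401
```

Ranking on crude rates always produces a bottom 5 per cent, however good the units are, so the process never stops of its own accord.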
Methods that assess individual patient variables would appear to offer the best means of assessing surgeon and anaesthetist performance. Table 10 illustrates the marked differences in outcome between surgeons with varying case mix. However, by applying the POSSUM system it is possible to predict the expected number of deaths; comparing this with the actual number yields a ratio (the observed to expected ratio; O/E ratio) which potentially provides a true quality measure (see Tables 10 and 11). Software is now commercially available (the CRAB system, from CRAB Clinical Informatics Ltd) which includes all available POSSUM algorithms and which allows, for the first time, analysis of all POSSUM-related quality indicators, from audit aid to surgeon, unit and specialty-specific outcome measures.
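One simple way to picture the O/E ratio is sketched below. It assumes that the expected number of deaths is estimated by summing the individual predicted risks, which is a simplification adopted here for illustration and not necessarily the analysis used in the POSSUM publications or in any commercial software. The function name and the sample data are hypothetical.

```python
from typing import Sequence


def oe_ratio(predicted_risks: Sequence[float], died: Sequence[bool]) -> float:
    """Observed-to-expected mortality ratio for a series of patients.

    predicted_risks: per-patient predicted probability of death (0 to 1).
    died: per-patient outcome, True if the patient died.
    Assumption: expected deaths = sum of the individual predicted risks.
    """
    if len(predicted_risks) != len(died):
        raise ValueError("need one outcome per predicted risk")
    expected = sum(predicted_risks)
    if expected <= 0:
        raise ValueError("expected deaths must be greater than zero")
    observed = sum(1 for outcome in died if outcome)
    return observed / expected


# Hypothetical series: one death observed, one death expected.
risks = [0.5, 0.25, 0.25]
outcomes = [True, False, False]
print(oe_ratio(risks, outcomes))  # 1.0
```

A ratio near 1.00 indicates that deaths match the prediction for that case mix; values persistently above 1.00 suggest more deaths than the case mix would explain.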
These techniques have now been widely validated, and from personal observation it would appear that when performance deteriorates, the major differences in unit performance lie in the management of patients whose risk is between 10 and 80 per cent. Where O/E ratios are persistently above 1.00, examination of individual patient deaths and of the morbidity spectrum, compared with the spectra of similar clinicians or units, can often identify the cause of poor performance. Local complications and wound-related problems are often surgeon related. Respiratory and cardiac problems are often anaesthetist related. Renal and, to a lesser extent, respiratory problems are often related to the availability of appropriate high dependency facilities and the overall quality of nursing services. While these may be oversimplifications, from a personal perspective I have found them to be useful tools over the past ten years when assessing both my own and other units.
It would seem that, when assessing surgical performance, process and structures are best measured using classical clinical audit techniques. When assessing true outcome, be this mortality or morbidity, some form of refined risk adjustment is necessary to avoid the risks of utilising simple mortality or morbidity rates.