

To investigate the use of the risk-adjusted sequential probability ratio test in monitoring the cumulative occurrence of adverse clinical outcomes.

Design. Retrospective analysis of three longitudinal datasets. Patients aged 65 years and over under the care of Harold Shipman between 19, patients under 1 year of age undergoing paediatric heart surgery in Bristol Royal Infirmary between 19, adult patients receiving cardiac surgery from a team of cardiac surgeons in London, UK. Using reasonable boundaries, the procedure could have indicated an ‘alarm’ in Bristol after publication of the 1991 Cardiac Surgical Register, and in 1985 or 1997 for Harold Shipman depending on the data source and the comparator. The cardiac surgeons showed no significant deviation from expected performance.

Conclusions. The risk-adjusted sequential probability ratio test is simple to implement, can be applied in a variety of contexts, and might have been useful to detect specific instances of past divergent performance. The use of this and related techniques deserves further attention in the context of prospectively monitoring adverse clinical outcomes.

Keywords: adverse clinical outcomes, general practitioners, monitoring, mortality, paediatric and adult cardiac surgery, risk-adjustment, sequential probability ratio tests

The need for systems to monitor clinical performance has been brought into particular focus by the report of the Bristol Royal Infirmary Inquiry and the finding that general practitioner Harold Shipman murdered over 200 of his patients. Rarer adverse events may require cumulative monitoring over time rather than, for example, examination of annual data. Historical industrial quality control procedures, such as Shewhart control charts and cumulative sum (CUSUM) techniques, have been recommended. The medical context does, however, add an additional complexity in the need to adjust for case-mix in an attempt to avoid clinicians or trusts being unfairly penalized for treating higher-risk patients.

A common suggestion is to plot the accumulating difference between the observed number of adverse events and the number expected according to an established risk-adjustment procedure. If concerned with mortality, for example, this means the cumulative ‘excess deaths’ (or conversely ‘lives saved’) can be monitored. The problem then arises of setting appropriate ‘thresholds’ on such plots to indicate the need for further scrutiny. These can be set for a single time point using standard methods for confidence intervals. However, this does not allow for the well-known problem of repeated testing, in which a true null hypothesis is certain to be eventually rejected, which in this context could correspond to an unreasonable number of false accusations of poor performance. The ‘risk-adjusted CUSUM’ has been suggested as a technique for dealing with both risk-adjustment and sequential testing, but setting appropriate thresholds is not straightforward.

In this paper we investigate a risk-adjusted version of the classic sequential probability ratio test (SPRT) that was developed for quality control of military supplies in the Second World War: this is shown to take the form of a simple adaptation of a cumulative ‘observed – expected’ plot with horizontal thresholds. Examples are provided in three monitoring contexts: annual surgical mortality (paediatric cardiac surgery in Bristol), adverse events in a population (Harold Shipman’s practice) and individual operations (cardiac surgery by a group of surgeons). We conclude that the risk-adjusted SPRT is a simple technique that could be useful within a clinical monitoring system.

Materials and methods

Formal statistical methods for sequential analysis were developed in 1943 independently by Barnard in the UK and Wald in the US. The SPRT is the most powerful method for discriminating between two hypotheses, and was recommended well over 40 years ago in a medical context for clinical trials and clinical experiments. The formal procedure has three components: a running test statistic, thresholds for the statistic that determine statistical significance, and actions to be taken on crossing a threshold. Suppose we have two hypotheses: a null hypothesis H0 corresponding to performance as expected, and an alternative hypothesis H1 corresponding to a level of performance deemed importantly divergent. Wald showed that the most powerful sequential comparison based on an accumulating set of data takes the form of a running ‘log-likelihood ratio’ (LLR), which is increased or decreased after each observation by a quantity depending on the event observed and, in its risk-adjusted form, the expected outcome if that event were performance ‘as normal’.
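
As a rough illustration of the risk-adjusted SPRT discussed in this paper, the following Python sketch accumulates a running LLR alongside the cumulative ‘observed – expected’ count and stops when a threshold is crossed. It is a sketch under stated assumptions, not the implementation used for the analyses reported here. It assumes that H1 is expressed as a fixed odds ratio applied to each patient's predicted risk, and that thresholds are set by Wald's approximate boundaries log(β/(1−α)) and log((1−β)/α). The function names, the default odds ratio of 2 and the default error rates of 0.01 are hypothetical choices made for illustration.

```python
import math
from typing import List, Optional, Sequence, Tuple


def wald_thresholds(alpha: float, beta: float) -> Tuple[float, float]:
    """Wald's approximate (lower, upper) boundaries for the running LLR.

    alpha: nominal false-alarm rate, beta: nominal missed-detection rate.
    """
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper


def risk_adjusted_sprt(
    outcomes: Sequence[int],      # 1 = adverse event, 0 = no event
    risks: Sequence[float],       # predicted risk under H0, each in (0, 1)
    odds_ratio: float = 2.0,      # assumed form of H1 (hypothetical default)
    alpha: float = 0.01,          # hypothetical default error rates
    beta: float = 0.01,
) -> Tuple[List[Tuple[float, float]], Optional[str]]:
    """Return the path of (LLR, cumulative observed - expected) and a decision.

    The decision is "H1" if the upper threshold is crossed, "H0" if the
    lower threshold is crossed, and None if monitoring should continue.
    """
    lower, upper = wald_thresholds(alpha, beta)
    llr = 0.0
    observed_minus_expected = 0.0
    path: List[Tuple[float, float]] = []

    for y, p in zip(outcomes, risks):
        # LLR increment for a Bernoulli outcome when H1 multiplies the
        # odds of an adverse event by `odds_ratio` (assumed formulation).
        llr += y * math.log(odds_ratio) - math.log(1.0 - p + odds_ratio * p)
        observed_minus_expected += y - p
        path.append((llr, observed_minus_expected))

        if llr >= upper:
            return path, "H1"   # evidence of divergent performance
        if llr <= lower:
            return path, "H0"   # evidence of performance as expected

    return path, None           # no threshold crossed yet
```

Returning the cumulative ‘observed – expected’ count next to the LLR keeps the familiar ‘excess deaths’ scale available for plotting, while the stopping decision itself depends only on the LLR and the two horizontal thresholds.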
