Real-world Evidence From the First Online Healthcare Analytics Platform, Livingstone: Validation of Its Descriptive Epidemiology Module

Benjamin R. Heywood, Christopher Ll. Morgan, Thomas R. Berni, Darren R. Summers, Bethan I. Jones, Sara Jenkins-Jones, Sarah E. Holden, Lauren D. Riddick, Harry Fisher, James D. Bateman, Christian A. Bannister, John Threlfall, Aron Buxton, Christopher P. Shepherd, Elgan R. Mathias, Rhiannon K. Thomason, Ellen Hubbuck, Craig J. Currie


Incidence and prevalence are key epidemiological determinants characterising the quantum of a disease. To validate the descriptive epidemiology module of Livingstone, the first online, essentially real-time healthcare analytics platform, we compared the incidence and prevalence estimates it derives automatically against findings from comparable peer-reviewed studies. The source of routine NHS data for Livingstone was the Clinical Practice Research Datalink (CPRD). A general search strategy covering any disease or condition retrieved 76 relevant studies, of which 10 met pre-specified inclusion and exclusion criteria. Findings reported in these studies were compared with estimates produced automatically by Livingstone. The published reports described elements of the epidemiology of 14 diseases or conditions.


Two of the most commonly used metrics characterising the descriptive epidemiology of any disease, condition or clinical intervention are its incidence and prevalence. Rassen and colleagues explained some of the many technical challenges involved in deriving these parameters for chronic diseases [1]. Lifelong conditions are technically the easiest to characterise: once an individual is diagnosed, they remain in the pool of prevalent cases, and only their first recorded event is incident. Acute or chronic conditions that do not last a lifetime are more complicated. For instance, in determining the epidemiology of acute cough, it is not obvious whether two cough diagnoses recorded 12 weeks apart are two distinct incident events or represent a single chronic cough [2]. Researchers who use different case definitions can therefore arrive at differing estimates. Accounting for these considerations in an automated analytical system that produces reliable, replicable descriptive epidemiology requires standardised methods for eliciting and capturing user requirements, plus algorithmic decision rules.
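The effect of such a decision rule can be illustrated with a minimal sketch. It is not Livingstone's algorithm: the function name `incident_events`, the `min_gap_days` parameter and the gap-based episode rule are assumptions introduced here purely for illustration, and the dates are invented.

```python
from datetime import date


def incident_events(event_dates, lifelong, min_gap_days=None):
    """Return the dates counted as incident events for one patient.

    Hypothetical rule, for illustration only:
    - lifelong=True: only the first recorded event is incident.
    - lifelong=False: an event starts a new incident episode only when it
      falls more than min_gap_days after the previous recorded event.
    """
    ordered = sorted(event_dates)
    if not ordered:
        return []
    if lifelong:
        return ordered[:1]
    if min_gap_days is None:
        raise ValueError("min_gap_days is required for non-lifelong conditions")

    incident = [ordered[0]]
    previous = ordered[0]
    for current in ordered[1:]:
        # A long enough gap since the last record means a new episode.
        if (current - previous).days > min_gap_days:
            incident.append(current)
        previous = current
    return incident


# Two cough diagnoses recorded 12 weeks (84 days) apart.
coughs = [date(2020, 1, 1), date(2020, 3, 25)]

# With an 8-week gap rule they count as two incident events;
# with a 90-day gap rule they count as one continuing episode.
two_events = incident_events(coughs, lifelong=False, min_gap_days=56)
one_event = incident_events(coughs, lifelong=False, min_gap_days=90)
```

The same two records yield an incidence count of two or one depending only on the chosen gap, which is why the case definition has to be captured explicitly from the user rather than left implicit.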

Materials and methods

Livingstone is a cloud-based analytics platform that analyses complex healthcare data in near-real time [4]. Livingstone presents technical and non-technical users with analytical tools enabling the rapid production of complex health intelligence. Livingstone allows the user to create code lists through browsable clinical dictionaries or to upload existing code lists. Such lists can then be used to define and select a study cohort, which may then be further refined, if necessary, based upon detailed real-time exploration of various patient characteristics. The final study cohort is then analysed by Livingstone to produce the epidemiological findings. A corresponding cost module is also available, calculating the resource use and financial costs of general practice contacts, prescribed drugs and devices, outpatient attendances and inpatient admissions. Other modules are either in development or planned.


From the initial search of the PubMed archive, 76 studies were retrieved (S1 Table), of which 10 met our pre-specified criteria and were compared with estimates from Livingstone. These comparator studies are detailed in Table 1. S2 Table summarises the reasons why studies were eliminated.


This study compared incidence and prevalence estimates for a range of diseases derived from published studies with those generated automatically by Livingstone, an online, cloud-based analytical platform. In the comparison of estimates for the most recent years, concordance was near perfect for prevalence (1.00) and substantial for incidence (0.96).
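This excerpt does not name the concordance statistic. Assuming it is Lin's concordance correlation coefficient, which measures how closely paired estimates fall on the line of perfect agreement, a minimal sketch of the calculation is given below. The function name and the example values are hypothetical and are not taken from the study.

```python
def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient for paired estimates.

    Uses population (1/n) variances and covariance:
        ccc = 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y)) ** 2)
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired sequences of length >= 2")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n

    denominator = var_x + var_y + (mean_x - mean_y) ** 2
    if denominator == 0:
        raise ValueError("coefficient is undefined for identical constant inputs")
    return 2 * cov_xy / denominator


# Invented example: published versus platform-derived estimates.
published = [1.0, 2.0, 3.0]
shifted = [2.0, 3.0, 4.0]  # perfectly correlated but offset by 1

ccc_identical = concordance_correlation(published, published)
ccc_shifted = concordance_correlation(published, shifted)
```

Unlike Pearson's correlation, the coefficient penalises systematic offsets: the shifted series is perfectly correlated with the published one, yet its concordance is 4/7, well below 1.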

Whilst both sets of estimates were derived from routine NHS data, they did not necessarily use the same data sources, so we did not anticipate replicating published estimates precisely. The estimates derived from Livingstone were based on the combined CPRD Aurum (EMIS) and GOLD (Vision) datasets, but only two of the comparator studies used the same combined data [12,13]. The remaining studies used either CPRD GOLD alone or THIN, both of which are derived from primary care practices using Vision software.

Citation: Heywood BR, Morgan CL, Berni TR, Summers DR, Jones BI, Jenkins-Jones S, et al. (2023) Real-world evidence from the first online healthcare analytics platform-Livingstone. Validation of its descriptive epidemiology module. PLOS Digit Health 2(7): e0000310.

Editor: Nan Liu, Duke-NUS Medical School, SINGAPORE

Received: January 16, 2023; Accepted: June 26, 2023; Published: July 25, 2023

Copyright: © 2023 Heywood et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The underlying data are available from the United Kingdom Medical and Healthcare Products Regulatory Agency (

Funding: The authors received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.
