AI-based Approach for Transcribing and Classifying Unstructured Emergency Call Data: a Methodological Proposal

Dalton Breno Costa, Felipe Coelho de Abreu Pinna, Anjni Patel Joiner, Brian Rice, João Vítor Perez de Souza, Júlia Loverde Gabella, Luciano Andrade, João Ricardo Nickenig Vissoci, João Carlos Néto


Emergency care-sensitive conditions (ECSCs) require rapid identification and treatment and are responsible for over half of all deaths worldwide. Prehospital emergency care (PEC) can provide rapid treatment and access to definitive care for many ECSCs and can reduce mortality in several different settings. The objective of this study is to propose a method that uses artificial intelligence (AI) and machine learning (ML) to transcribe audio and to extract and classify unstructured emergency call data in the Serviço de Atendimento Móvel de Urgência (SAMU) system in southern Brazil. The study used all “1-9-2” calls received in 2019 by the SAMU Novo Norte Emergency Regulation Center (ERC) in Maringá, in the Brazilian state of Paraná.


Emergency care-sensitive conditions (ECSCs), which require rapid identification and treatment, account for over half of all deaths worldwide [1]. Prehospital emergency care (PEC) can provide rapid treatment and access to definitive care for many ECSCs and can reduce mortality in several different settings [2]. Before prehospital resources are dispatched, emergency calls are typically routed through emergency call centers, where call-takers work through questions of varying standardization to determine the most appropriate response. Accurately identifying the nature of the emergency and dispatching the correct response are key to optimal resource management and to rapid triage and treatment of ECSCs.

Materials and method

All “1-9-2” calls received in 2019 by the SAMU Novo Norte ERC in Maringá, in the Brazilian state of Paraná, were included in the analysis. Emergency calls are made to a central number either by the person having the emergency or by a bystander. Calls to the ERC are answered by nursing technicians, who screen and classify them before transferring them to a physician. The physician can then either provide telephone guidance for lower-acuity emergencies or dispatch an ambulance crew to the caller. The ambulance dispatched is either a basic life support unit (with a driver and a nursing technician) or an advanced life support unit (with a driver, a physician and a registered nurse) [8].


A total of 182,273 audio files were included in the analysis, with a mean duration of 157.60 seconds (SD = 123.31) and a range of 0 to 2,076 seconds. A random subset of 10,010 of these calls was manually classified, and 2,326 of them were classified as emergency calls.
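The counts above can be turned into a share of emergency calls with a short calculation. The sketch below is illustrative only: it uses the three counts reported in this section, adds a normal-approximation 95% confidence interval (an assumption of this example, not a statistic reported by the authors), and projects the share onto the full set of recordings on the assumption that the random subset is representative. The function and variable names are hypothetical.

```python
import math

# Counts reported in the text above.
TOTAL_CALLS = 182_273          # all audio files included in the analysis
LABELED_CALLS = 10_010         # random subset that was manually classified
LABELED_EMERGENCIES = 2_326    # subset calls classified as emergencies


def proportion_with_ci(successes, n, z=1.96):
    """Return (p, lower, upper) using a normal-approximation interval.

    The interval method is an assumption of this sketch.
    """
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)
    return p, p - z * se, p + z * se


share, low, high = proportion_with_ci(LABELED_EMERGENCIES, LABELED_CALLS)

# Projection assumes the random subset is representative of all calls.
projected = round(share * TOTAL_CALLS)

print(f"Emergency share: {share:.1%} (95% CI {low:.1%} to {high:.1%})")
print(f"Projected emergency calls in the full set: about {projected:,}")
```

The projected figure is only a rough extrapolation; the study itself reports the classification for the manually labeled subset.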


Our findings provide insight into existing emergency call data in Brazil and demonstrate the feasibility of training a machine learning model to classify emergencies using Brazilian Portuguese audio recordings. Although our findings focus mainly on methods and processes, the data also yielded some notable insights. Emergency care in low- and middle-income countries (LMICs) generally, and the SAMU system in Brazil specifically, remains largely understudied, and there is value in helping to describe this system. First, it was noteworthy that despite the relatively large number of audio recordings available, fewer than 25% of the manually classified calls (2,326 of 10,010) were actual emergency calls.


By adapting open-source, freely available English-language automatic speech recognition (ASR) and natural language understanding (NLU) models, we were able to transcribe and classify a subset of Portuguese-language emergency calls to an ERC in Paraná, Brazil. Our NLU model was highly accurate in differentiating between four disparate chief complaints, which served as an initial use case for the model. We were also able to create a large corpus of artificial sentences to train the NLU model, augmenting a limited dataset of transcribed calls.
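One common way to build an artificial training corpus of this kind is to expand sentence templates with interchangeable slot values. The sketch below illustrates that general idea only; it is not the authors' procedure, which this section does not describe. The two labels, the Portuguese templates, the slot values, and the function names are all hypothetical, chosen purely for illustration (the four chief complaints used in the study are not named here).

```python
import itertools
from string import Formatter

# Hypothetical templates per label; "{who}", "{when}", "{where}" are slots.
TEMPLATES = {
    "chest_pain": [
        "{who} está com dor no peito {when}",   # "... has chest pain ..."
        "{who} sente uma dor forte no peito",   # "... feels a strong chest pain"
    ],
    "fall": [
        "{who} caiu {where}",                   # "... fell ..."
        "{who} caiu {where} {when}",
    ],
}

# Hypothetical slot values.
SLOTS = {
    "who": ["meu pai", "minha mãe", "o paciente"],
    "when": ["há dez minutos", "desde ontem"],
    "where": ["da escada", "no banheiro"],
}


def slot_names(template):
    """Return the slot names used in a template, in order of appearance."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]


def generate_corpus(templates, slots):
    """Expand every template with every combination of its slot values."""
    corpus = []
    for label, patterns in templates.items():
        for pattern in patterns:
            names = slot_names(pattern)
            for values in itertools.product(*(slots[n] for n in names)):
                sentence = pattern.format(**dict(zip(names, values)))
                corpus.append((sentence, label))
    return corpus


corpus = generate_corpus(TEMPLATES, SLOTS)
for sentence, label in corpus[:3]:
    print(label, "|", sentence)
```

Because every template is expanded with every combination of its slot values, a handful of templates can yield many labeled sentences. The trade-off is that such sentences are more regular than real call transcripts, so they complement rather than replace transcribed calls.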


The authors acknowledge the SAMU of Maringá, Brazil for providing data used in this study; Duke University’s Compute Cluster for their assistance in data processing; and funding from the Duke Global Health Institute AI Pilot grant. They would also like to thank the research students from the State University of Maringá (UEM) who contributed to this study.

Citation: Costa DB, Pinna FCdA, Joiner AP, Rice B, Souza JVPd, Gabella JL, et al. (2023) AI-based approach for transcribing and classifying unstructured emergency call data: A methodological proposal. PLOS Digit Health 2(12): e0000406.

Editor: Mengyu Wang, Harvard University, UNITED STATES

Received: July 14, 2023; Accepted: November 7, 2023; Published: December 6, 2023

Copyright: © 2023 Costa et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: Data contains sensitive patient information and cannot be publicly shared as per the rules of the Serviço de Atendimento Móvel de Urgência (SAMU). Inquiries to access data can be forwarded to the Maringá Health Department at or by accessing the website:

Funding: This study was supported by the AI pilot grant from the Duke Global Health Institute (to JRNV). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

