J Lipid Atheroscler. 2021 Sep;10(3):282-290. English.
Published online Jul 13, 2021.
Copyright © 2021 The Korean Society of Lipid and Atherosclerosis.
Review

Prospect of Artificial Intelligence Based on Electronic Medical Record

Suehyun Lee,1,2 and Hun-Sung Kim3,4
    • 1Department of Biomedical Informatics, College of Medicine, Konyang University, Daejeon, Korea.
    • 2Health Care Data Science Center, Konyang University Hospital, Daejeon, Korea.
    • 3Department of Medical Informatics, College of Medicine, The Catholic University of Korea, Seoul, Korea.
    • 4Division of Endocrinology and Metabolism, Department of Internal Medicine, Seoul St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea.
Received April 01, 2021; Revised June 04, 2021; Accepted July 05, 2021.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

With the advent of the big data era, the international community is increasingly focused on expanding the use of medical big data. Many hospitals are attempting to increase the efficiency of their operations and patient management by adopting artificial intelligence (AI) technology that enables the use of electronic medical record (EMR) data. The EMR includes information about a patient's health history, such as diagnoses, medicines, tests, allergies, immunizations, and treatment plans, and it can support personalized medical care and the improvement of medical quality and safety. EMR data can also be used for AI-based new drug development. In particular, it is effective to develop AI that can predict the occurrence of specific diseases or provide individualized treatments by classifying the characteristics of individual patients. To improve the performance of AI research using EMR data, standardization and refinement of data are essential. In addition, since EMR data contain sensitive personal information of patients, it is also vital to protect patient privacy. The Korean government already provides various forms of support for the use of EMR data, and researchers are encouraged to be proactive.

Keywords
Artificial intelligence; Electronic medical record; Big data; Efficiency; Organizational; Patient-centered care; Personalized medicine

INTRODUCTION

In recent years, especially in the era of big data and artificial intelligence (AI), the demand for the use of medical big data has been increasing, and the international community's attention is also focused on this endeavor.1 The electronic medical record (EMR), which stores all medical processes such as patient reception, examination, blood tests, medication, surgery, and medical expenses, is regarded as the most reliable medical data in the healthcare system.2, 3 With the development of AI, the use of large-scale medical data and the necessity of individualized treatments have been highlighted. Thus, further use of AI in collating EMR data is critical.4 For this reason, in the United States, not only has the use of EMR increased, but efforts are also under way to improve the quality of EMR.5 Through the Health Information Technology for Economic and Clinical Health (HITECH) Act, an investment plan to improve the poor state of informatization in US medical institutions was specified, and for this purpose, a certification system was created to introduce an accredited EMR.6 In Korea, in the era of big data and AI, many researchers are interested in structuring and standardizing data in order to properly use medical data clinically, and are trying to develop guidelines or certification standards for this purpose.5

Many researchers have attempted to conduct various studies or develop AI algorithms using EMR data. However, to increase the clinical value of EMR data, both the data collection stage and the quality of the data must be thoroughly managed.7 Because EMR data contain confidential patient information, an appropriate balance is required between data security and clinical use.8 Therefore, researchers should consider what needs to be prepared or taken into account for various AI studies using EMR data.

TYPES OF MEDICAL DATA IN THE EMR

The EMR systematically stores and manages each patient's health information record, personal information, medical information such as medical/family history, drug reaction, health status, medical examination, and admission/discharge records in a database (DB) format.9 In general, EMR data can be classified into structured data, semi-structured data, and unstructured data according to the degree of structuring (Fig. 1).2, 7, 10

Fig. 1
Classification of medical data (structured, semi-structured, unstructured data).10
EMR, electronic medical record; CT, computed tomography; MRI, magnetic resonance imaging; ECG, electrocardiogram.

1. Structured data

“Structured data” denote data that are stored according to a predetermined format and structure. A typical example is data entered in a specified format (numerical value, date, etc.) or selected as an item in a fixed field in the EMR systems mainly used in hospitals. Various metadata may also be included, such as personal information (name, age, physical information, etc.) and information related to data generation (creating organization, creator, creation date, etc.). Structured data are known to be relatively easy to use for research purposes or for building an AI prediction model, but a strict operational definition of the disease is required, and data quality management is essential.2, 7 There are already ongoing discussions on the merits and limitations of data quality management, and various methodological solutions have been proposed to overcome its limitations. Structured data are expected to have the highest utilization value at present.
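
As an illustration of what a strict operational definition might look like on structured data, the following minimal Python sketch flags a patient as having diabetes only when a coded diagnosis and a laboratory criterion are both present, and reports simple data quality issues. The record layout, the ICD-10 prefix list, the HbA1c cut-off of 6.5%, and the plausible range are all assumptions chosen for illustration; they are not criteria taken from the cited studies.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical record layout; real EMR schemas differ between institutions.
@dataclass
class PatientRecord:
    patient_id: str
    diagnosis_codes: List[str] = field(default_factory=list)  # e.g. ICD-10 codes
    hba1c_percent: Optional[float] = None  # most recent HbA1c, if measured

# Assumed ICD-10 prefixes for diabetes mellitus (E10 to E14), illustration only.
DM_CODE_PREFIXES = ("E10", "E11", "E12", "E13", "E14")
HBA1C_THRESHOLD = 6.5  # assumed cut-off in percent


def meets_dm_definition(rec: PatientRecord) -> bool:
    """Return True only if both the coded and the laboratory criteria hold."""
    has_code = any(code.startswith(DM_CODE_PREFIXES) for code in rec.diagnosis_codes)
    has_lab = rec.hba1c_percent is not None and rec.hba1c_percent >= HBA1C_THRESHOLD
    return has_code and has_lab


def quality_issues(rec: PatientRecord) -> List[str]:
    """Very small data quality check: report missing or implausible values."""
    issues = []
    if rec.hba1c_percent is None:
        issues.append("missing HbA1c")
    elif not (3.0 <= rec.hba1c_percent <= 20.0):  # assumed plausible range
        issues.append("implausible HbA1c")
    if not rec.diagnosis_codes:
        issues.append("no diagnosis codes")
    return issues
```

Requiring both criteria is one possible design choice: it trades sensitivity for specificity, which may or may not suit a given study.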

2. Semi-structured data

“Semi-structured data” are data whose format and structure can change. They are stored in a file format that provides structural information about the data along with the data themselves. In general, text is classified as unstructured data, but in many cases the contents of the text follow a regular pattern. This class of data is classified as semi-structured data. In a medical image readout report, for example, the medical staff provide brief information about the patient, such as the procedure, smoking status, chronic disease, and pain level, as well as the test/diagnosis results, in text format or as comments.10
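
To make the idea of a regular pattern inside text concrete, the sketch below parses a made-up readout report in which some lines follow a "Label: value" layout. The report text, the field labels, and the "n/10" pain score format are hypothetical; actual report templates vary by hospital, so this is only an assumption-laden illustration.

```python
import re
from typing import Dict, Optional

# Assumed "Label: value" layout; lines without this pattern are ignored.
FIELD_PATTERN = re.compile(r"^\s*([A-Za-z ]+?)\s*:\s*(.+?)\s*$")


def parse_report(text: str) -> Dict[str, str]:
    """Collect 'Label: value' lines into a dict keyed by lower-cased label."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        match = FIELD_PATTERN.match(line)
        if match:
            fields[match.group(1).lower()] = match.group(2)
    return fields


def pain_level(fields: Dict[str, str]) -> Optional[int]:
    """Extract a numeric pain score written as 'n/10', if present."""
    match = re.search(r"(\d+)\s*/\s*10", fields.get("pain level", ""))
    return int(match.group(1)) if match else None


# Hypothetical report used only to exercise the functions above.
report = """Procedure: chest CT
Smoking status: former smoker
Pain level: 3/10
Findings are unremarkable."""
fields = parse_report(report)
```

The free-text sentence at the end of the report is left unparsed, which mirrors the distinction drawn above between the patterned and the unpatterned parts of a text.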

3. Unstructured data

“Unstructured data” are data that lack a defined structure. They are difficult to define because of their irregular form. Typically, text and images correspond to unstructured data.10 In the medical field, radiographic images or photo data are classified as unstructured data, including various types of video data, such as coronary angiography or ultrasound images, and various types of image (picture, photo) data, such as computed tomography (CT), magnetic resonance imaging (MRI), or electrocardiogram (ECG).10 Unlike text data, image data do not require data quality management, so a large amount of data can be obtained easily, except when personal information protection or privacy issues must be considered. In recent years, the anonymization of image information has drawn attention, and imaging is expected to be the area where AI algorithms can be created the fastest in the future. In fact, applying machine learning technology to the picture archiving and communication system (PACS) made it possible to increase productivity in the medical field by replacing existing manual image reading work with AI that aids decision-making.11

Many studies are being conducted on text data, but because of their inherently unpredictable irregularity, such data still have various limitations.12 Before machine learning algorithms are applied, unstructured data must be standardized (normalized), and unnecessary or redundant data must be filtered out. Such preprocessing must produce data in a uniform format, after which various AI analyses become possible.13 Medical records written as free text by medical staff exist as unstructured data, and medical notes recorded in natural language with medical abbreviations are technically difficult to analyze, so they were excluded from many previous studies; however, some recent studies have attempted to analyze them.14
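
The preprocessing steps described here (normalization, removal of unnecessary or redundant data, and output in a uniform format) can be sketched in a few lines of Python. The abbreviation map and the sample notes are hypothetical, and the whitespace tokenization is deliberately naive: an abbreviation with punctuation attached would not be expanded. A real project would rely on a curated clinical dictionary and a proper tokenizer.

```python
import re
import unicodedata
from typing import Dict, Iterable, List

# Hypothetical abbreviation map, for illustration only.
ABBREVIATIONS: Dict[str, str] = {
    "dm": "diabetes mellitus",
    "htn": "hypertension",
    "hf": "heart failure",
}


def normalize_note(note: str) -> str:
    """Lower-case, normalize Unicode, collapse whitespace, expand abbreviations."""
    text = unicodedata.normalize("NFKC", note).lower()
    text = re.sub(r"\s+", " ", text).strip()
    words = [ABBREVIATIONS.get(word, word) for word in text.split(" ")]
    return " ".join(words)


def preprocess(notes: Iterable[str]) -> List[str]:
    """Normalize notes, then drop empty and duplicate entries, keeping order."""
    seen = set()
    result = []
    for note in notes:
        clean = normalize_note(note)
        if clean and clean not in seen:
            seen.add(clean)
            result.append(clean)
    return result
```

For example, `preprocess(["Pt has  DM and HTN", "pt has dm and htn"])` reduces the two differently formatted notes to a single normalized entry.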

APPLICATIONS OF EMR

Numerous hospitals around the world are already actively adopting AI technology and trying to convert into “smart” hospitals.15, 16 This is expected to increase the efficiency of hospital operations, patient management, and treatment.15 Through AI, it would be possible to accurately classify diseases, reclassify preexisting disease categories according to individual characteristics, quickly analyze images and medical data in EMR, and provide appropriate services (Fig. 2).17 With the emergence of a medical AI integrated platform that can implement many medical algorithms, AI has become essential for the creation of new services, such as improvement of medical quality and real-time health management.18

Fig. 2
Application of electronic medical record.22
IoT, internet of things.

1. New drug development using AI

To use medical data for new drug development, data standardization is essential. In the case of EMR data, various attempts have been made to standardize data, such as the “EMR certification system,”19 but there is no standard model for clinical trial data for new drug development, so there is a limitation in data utilization. In addition, clinical trial data of each institution are managed through different systems, limiting the use of data. A national-level clinical trial data standard model that combines EMR and clinical trial data should be developed, and a foundation laid for sharing all experiences. Various studies are underway to establish a nationwide clinical trial data utilization system through the spread of standard models for hospitals and guidelines for exchanging standard data for clinical trials for new drug development.19

2. Personalized/customized healthcare management service

A personalized medical service comprehensively and scientifically analyzes a variety of medical information, such as a patient's individual disease and treatment history, personal health record, genetic characteristics, daily life patterns, or eating habits, and ultimately provides an optimized and customized diagnosis/treatment for the individual.10 It is possible to predict future diseases of patients through personalized medical services; a representative example is IBM Watson, an AI program.14 In fact, it supports cancer treatment by using genetic big data, and it has been introduced in several hospitals in Korea and applied to patient care.10

3. Improving medical quality and safety

The Korean Ministry of Health and Welfare (MOHW) conducts a “pilot system for EMR of nationally certified hospitals” to apply EMR data to the clinical field of a medical institution and evaluate its clinical utility. The Korea Health Information Service (KHIS) is also conducting standardizations of hospital/clinic EMR.20 Ultimately, the results will be used to establish national medical information policies in the future. This can contribute to and ensure patient safety and continuity of treatment and strengthen the interoperability of medical information, thereby creating a health management ecosystem based on the medical information of the patient.21

AI USING EMR

Various AI studies have already been conducted using the EMR. Based on 3 years of EMR data, a cardiac arrest algorithm was developed that uses blood pressure, pulse rate, respiration rate, and body temperature to cope with emergency situations.23 An AI program for predicting the incidence of diabetes mellitus (DM) based on routine health checkup records showed 95% accuracy.24 An AI algorithm was also developed to predict the risk of developing essential hypertension using EMR data.25

One study developed a deep-learning-based artificial intelligence algorithm (DLA) for predicting cardiac arrest and validated it using ECG.23 The study used 47,505 ECGs of 25,672 adult patients, collected from October 2016 to September 2019. The areas under the receiver operating characteristic curves of the DLA in predicting cardiac arrest within 24 hours were 0.913 and 0.948 in internal and external validation, respectively. This was the first study to develop and verify a DLA for cardiac arrest using electrocardiography, indicating that deep learning algorithms, one of the powerful tools of AI, can detect subtle ECG changes through heart failure prediction.

Recently, AI research using EMR data has been actively applied to effective hypertension management. One study used AI to predict the risk of hypertension, with the aim of evaluating and preventing that risk in order to reduce both the prevalence of hypertension and the medical expenses for its management.25 In addition, a new machine learning model was developed and validated to predict heart failure (HF) risk using patient data. The model predicts the risk of HF in patients with type 2 diabetes by integrating clinical variables, laboratory values, and electrocardiogram variables, and its effectiveness was verified.26 As described above, AI studies related to cardiovascular disease and diabetes using EMR data have been conducted recently (Table 1).23, 25, 26, 27, 28, 29
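
For readers less familiar with the metric, the area under the receiver operating characteristic curve can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it directly from that definition. The function name and the toy labels and scores are illustrative assumptions and are unrelated to the data of the cited study.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as one half. The pairwise loop is O(n_pos * n_neg), which is
    fine for small examples but not for large datasets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: 1 = event occurred, 0 = no event; scores are model outputs.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auroc(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly: 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a sense of scale for reported values above 0.9.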

Table 1
Cases of AI research on cardiovascular diseases and diabetes using EMR

VARIOUS POLICIES FOR THE USE OF EMR DATA

In the United States, the Precision Medicine Initiative (PMI) was announced in 2015, and work has begun to establish a voluntary PMI cohort of more than 1 million people.30 As the first step in establishing the cohort, medical information, examination information, and biobanks held by existing medical institutions or research institutions were linked. To this end, the US enacted the HITECH Act in 2009 and is working to establish an interoperable medical record system nationwide.6

In Korea, large-scale investments in biohealth research and development (R&D), such as the development of innovative new drugs and medical devices, are planned,31 and big data platforms are being built, including national bio big data on a scale of up to 1 million people.32 In addition, in 2019 the MOHW opened a health and medical big data platform that links the information systems of four public institutions (National Health Insurance Service, Health Insurance Review and Assessment Service, Korea Disease Control and Prevention Agency, and National Cancer Center) to provide data to researchers for public-interest purposes.33 To support the improvement of medical technology and the development of new drugs based on clinical data, a “healthcare data-centered hospital support project” is being promoted to plan a healthcare platform in the private sector.20, 34 In addition, the KHIS is implementing a certification system to secure interoperability of medical information and improve quality by verifying that the functions of EMR systems conform to national standards.34

CONSIDERATIONS WHEN RESEARCHING EMR DATA

AI algorithm development is not possible simply by collecting and utilizing EMR data. To develop AI algorithms, knowledge of both the medical and analysis domains must be reflected in harmony. Appropriate data preprocessing technology is also essential for generating high-quality AI algorithms according to the purpose of use. No matter how well an algorithm is developed, it is difficult to expect excellent performance without structured, good-quality EMR data.7 This is why many researchers, including not only medical staff but also data scientists, have teamed up to work with medical big data.

From a macroscopic point of view, data standardization and domain knowledge-based refinements are also essential. In particular, in the case of data built in a single institution, there is a possibility that the artificial intelligence learning model learns the biased features that the data have. In this case, the learning model should be evaluated using data from other institutions or data generated in completely different periods.35
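
The two evaluation strategies mentioned above, testing on data from another institution and testing on data from a different period, amount to two ways of splitting the records. The following sketch shows both splits. The record fields (`institution`, `visit_date`) and the sample values are hypothetical and only serve to make the example runnable.

```python
from datetime import date
from typing import Dict, List, Tuple

Record = Dict[str, object]  # hypothetical flat record with the fields used below


def split_external(records: List[Record], held_out_institution: str
                   ) -> Tuple[List[Record], List[Record]]:
    """Train on all institutions except one; evaluate on the held-out one."""
    train = [r for r in records if r["institution"] != held_out_institution]
    test = [r for r in records if r["institution"] == held_out_institution]
    return train, test


def split_temporal(records: List[Record], cutoff: date
                   ) -> Tuple[List[Record], List[Record]]:
    """Train on visits before the cutoff; evaluate on visits on or after it."""
    train = [r for r in records if r["visit_date"] < cutoff]
    test = [r for r in records if r["visit_date"] >= cutoff]
    return train, test


# Made-up records for illustration.
records: List[Record] = [
    {"institution": "A", "visit_date": date(2018, 3, 1)},
    {"institution": "A", "visit_date": date(2019, 7, 15)},
    {"institution": "B", "visit_date": date(2018, 11, 2)},
    {"institution": "B", "visit_date": date(2020, 1, 20)},
]
train_ext, test_ext = split_external(records, held_out_institution="B")
train_tmp, test_tmp = split_temporal(records, cutoff=date(2019, 1, 1))
```

Splitting by institution or by time, rather than at random, keeps the evaluation data free of the site-specific or period-specific patterns the model may have learned.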

Researchers should also pay attention to the protection of personal information, which has recently become an issue. Because a patient's medical information is personal and sensitive, providing it to a third party may infringe on the individual's privacy. Under the “three data-related bills,” pseudonymized information, which has been processed to prevent identification of a specific individual, can now be used without the consent of the individual.36 It is now possible to link and integrate big data from different institutions, de-identify them, and provide them to private researchers. However, hospitals should consult with one another in advance when conducting research in this way.37 Naturally, researchers should familiarize themselves with the “Guideline for Pseudonymization”38 or the “Guideline for Utilization of Healthcare Data”39 in advance.
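
As a rough illustration of what pseudonymization can involve at the record level, the sketch below replaces a patient identifier with a keyed hash and drops direct identifiers. This is only one common technique and is not the procedure prescribed by the guidelines cited above; the field names, the list of direct identifiers, and the key handling are assumptions. A keyed hash by itself does not guarantee that a dataset satisfies any legal standard.

```python
import hashlib
import hmac
from typing import Dict

DIRECT_IDENTIFIERS = ("name", "phone")  # assumed fields to drop entirely


def pseudonymize(record: Dict[str, str], secret_key: bytes) -> Dict[str, str]:
    """Replace patient_id with a keyed hash and remove direct identifiers."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    digest = hmac.new(secret_key, record["patient_id"].encode("utf-8"),
                      hashlib.sha256).hexdigest()
    out["patient_id"] = digest[:16]  # truncated for readability in this sketch
    return out


# Hypothetical record and key, for illustration only.
sample = {"patient_id": "12345", "name": "Hong Gildong",
          "phone": "010-0000-0000", "diagnosis": "E11.9"}
pseudo = pseudonymize(sample, secret_key=b"example-key-not-for-production")
```

Because the same key always maps the same identifier to the same pseudonym, records for one patient can still be linked across tables, while anyone without the key cannot recompute the mapping from the identifier alone.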

CONCLUSIONS

Various studies and projects have already been conducted to utilize EMR data. In recent years, EMR data have been used in various ways as a data source for AI, and several studies are being conducted on the methodology for additional multi-center expansion.40 However, in order to develop AI that can be used clinically, not only is the correct collection of data essential but also various efforts and policies for clinical use are required.

Ultimately, to properly use EMR data for clinical research purposes, it is better to secure the data and check their characteristics in advance so that they can be used for new medical research. In addition, considering the characteristics of the medical field, researchers using medical data should conduct their research with a sense of legal and ethical responsibility.36 Only when all of these elements work in harmony will the use of EMR data be valuable and, consequently, contribute to improving medical services and, ultimately, the health of patients.

Notes

Funding: This study was supported by the Daewoong Pharmaceutical company.

Conflict of Interests: The authors have no conflicts of interest to declare.

Author Contributions:

  • Conceptualization: Lee S, Kim HS.

  • Methodology: Kim HS.

  • Writing - original draft: Lee S, Kim HS.

  • Writing - review & editing: Kim HS.

References

    1. Na JY. ‘Medical Bigdata’… Attention as a key resource in the era of the Fourth Industrial Revolution [Internet]. Seoul: BIOTIMES; 2020 May 27; [cited 2021 Feb 24].
    2. Kim HS, Kim JH. Proceed with caution when using real world data and real world evidence. J Korean Med Sci 2019;34:e28.
    3. Kim HS, Lee S, Kim JH. Real-world evidence versus randomized controlled trial: clinical research based on electronic medical records. J Korean Med Sci 2018;33:e213.
    4. Shin JW. Task for collecting and enhancing the utilization of electronic medical records. Health Welf Policy Forum 2018;262:29–38.
    5. Shin SY. Adoption of certification system for upgrading electronic medical records (EMR). HIRA Policy Trend 2018;12:17–23.
    6. Hoggle L. The Health Information Technology for Economic and Clinical Health (HITECH) Act and Nutrition Inclusion in Medicare/Medicaid Electronic Health Records: leveraging policy to support nutrition care. J Acad Nutr Diet 2012;112:1935–1940.
    7. Kim HS, Kim DJ, Yoon KH. Medical big data is not yet available: why we need realism rather than exaggeration. Endocrinol Metab (Seoul) 2019;34:349–354.
    8. Jeun YJ. EMR system and patient medical information protection. Korean J Health Serv Manag 2013;7:213–224.
    9. Jung KH, Park SC, Shim WH. Analysis of next-generation EMR technology. Proc Inf Process Soc Conf 2012;19:916–919.
    10. Lee MY, Park YS, Kim MH, Lee JW. Classification and analysis of medical data according to the level of data standardization. Inf Commun Mag 2014;31:57–63.
    11. Lee C, Kim SM, Choi Y. Case analysis for introduction of machine learning technology to the mining industry. Tunn Undergr Space 2019;29:1–11.
    12. Koleck TA, Dreisbach C, Bourne PE, Bakken S. Natural language processing of symptoms documented in free-text narratives of electronic health records: a systematic review. J Am Med Inform Assoc 2019;26:364–379.
    13. Ahn YA, Cho HJ. Hospital system model for personalized medical service. J Korea Converg Soc 2017;8:77–84.
    14. Choi YS. Google's artificial intelligence predicts treatment results with medical records [Internet]. place unknown: Choi Yoon-Sup's Healthcare Innovation; 2018 Mar 14; [cited 2021 Feb 24].
    15. Uslu BÇ, Okay E, Dursun E. Analysis of factors affecting IoT-based smart hospital design. J Cloud Comput (Heidelb) 2020;9:67.
    16. Amisha MP, Malik P, Pathania M, Rathaur VK. Overview of artificial intelligence in medicine. J Family Med Prim Care 2019;8:2328–2331.
    17. Tian S, Yang W, Le Grange JM, Wang P, Huang W, Ye Z. Smart healthcare: making medical care more intelligent. Glob Health J 2019;3:62–65.
    18. Lee KY, Kim JH, Kim HC. KHIDI Brief. Current status and challenges of medical artificial intelligence [Internet]. Cheongju: Korea Health Industry Development Institute; 2016 Aug 22; [cited 2021 Feb 24].
    19. Bae BJ. Discovering opportunities for new drug development and patient treatment: how medical real-data can be used. HIRA Policy Trend 2020;14:15–20.
    20. Korea Health Information Service. Standardization of hospital/clinic electric medical record (EMR) [Internet]. Seoul: Korea Health Information Service; 2021 [cited 2021 Feb 24].
    21. Ministry of Health and Welfare. Introducing a pilot system for electronic medical records of nationally certified hospitals [Internet]. Sejong: Korea Policy Briefing; 2020 Nov 13; [cited 2021 Feb 24].
    22. Seo YH. Artificial intelligence, key to healthcare future [Internet]. Seongnam: Software Policy & Research Institute; 2016 Jul 19; [cited 2021 Feb 24].
    23. Kwon JM, Kim KH, Jeon KH, Lee SY, Park J, Oh BH. Artificial intelligence algorithm for predicting cardiac arrest using electrocardiography. Scand J Trauma Resusc Emerg Med 2020;28:98.
    24. Nomura A, Yamamoto S, Hayakawa Y, Taniguchi K, Higashitani T, Aono D, et al. SAT-LB121 development of a machine-learning method for predicting new onset of diabetes mellitus: a retrospective analysis of 509,153 annual specific health checkup records. J Endocr Soc 2020;4:SAT-LB121.
    25. Ye C, Fu T, Hao S, Zhang Y, Wang O, Jin B, et al. Prediction of incident hypertension within the next year: prospective study using statewide electronic health records and machine learning. J Med Internet Res 2018;20:e22.
    26. Segar MW, Vaduganathan M, Patel KV, McGuire DK, Butler J, Fonarow GC, et al. Machine learning to predict the risk of incident heart failure hospitalization among patients with diabetes: the WATCH-DM risk score. Diabetes Care 2019;42:2298–2306.
    27. Fan Y, Li Y, Li Y, Feng S, Bao X, Feng M, et al. Development and assessment of machine learning algorithms for predicting remission after transsphenoidal surgery among patients with acromegaly. Endocrine 2020;67:412–422.
    28. Choi BG, Rha SW, Kim SW, Kang JH, Park JY, Noh YK. Machine learning for the prediction of new-onset diabetes mellitus during 5-year follow-up in non-diabetic patients with cardiovascular risks. Yonsei Med J 2019;60:191–199.
    29. Somnay YR, Craven M, McCoy KL, Carty SE, Wang TS, Greenberg CC, et al. Improving diagnostic recognition of primary hyperparathyroidism with machine learning. Surgery 2017;161:1113–1121.
    30. Wagner JK, Peltz-Rauchman C, Rahm AK, Johnson CC. Precision engagement: the PMI’s success will depend on more than genomes and big data. Genet Med 2017;19:620–624.
    31. Kim SG. The Ministry of Health and Welfare will provide 787.8 billion won in research and development (R&D) budget for 2021 to respond to infectious diseases, foster the bio-health industry, and address high-burden diseases [Internet]. Sejong: Ministry of Health and Welfare; 2020 Dec 15; [cited 2021 Feb 24].
    32. Cha HS, Jung JM, Shin SY, Jang YM, Park P, Lee JW, et al. The Korea Cancer Big Data Platform (K-CBP) for cancer research. Int J Environ Res Public Health 2019;16:2290.
    33. Choi J, Nam T, Cho RM. Issues related to the public use of healthcare big data and medical platform: focusing on the implementation of the Healthcare Big Data Platform pilot project. J Gov Stud 2020;15:139–176.
    34. Jung YC, Shin JW, Lee KH. Information and statistics policy in health and welfare. Health Welf Policy Forum 2020;279:66–80.
    35. Lee SH, Kim JY. Artificial Intelligence Technology Trends Based on Medical Big Data. Inf Commun Mag 2020;37:85–91.
    36. Korea Legislation Research Institute. Personal Information Protection Act [Internet]. Seoul: Korea Legislation Research Institute; 2020 [cited 2021 Mar 13].
    37. Lee KH, Kim KH. A study on the contents and limitation of guidelines for utilization of healthcare data. Hannam J Law Technol 2020;26:89–118.
    38. Personal Information Protection Commission. Processing of personal information alias and combination of alias information [Internet]. Seoul: Personal Information Protection Commission; 2020 Sep 03; [cited 2021 Feb 24].
    39. Ministry of Health and Welfare. Guidelines for health care data utilization [Internet]. Sejong: Ministry of Health and Welfare; 2020 Sep 25; [cited 2021 Feb 24].
    40. Lee CS, Kim JE, No SH, Kim TH, Yoon KH, Jeong CW. Construction of artificial intelligence training platform for multi-center clinical research. KIPS Trans Comput Commun Syst 2020;9:239–246.
