%0 Journal Article %J J Am Med Inform Assoc %D 2016 %T A long journey to short abbreviations: developing an open-source framework for clinical abbreviation recognition and disambiguation (CARD). %A Wu, Yonghui %A Denny, Joshua C %A Rosenbloom, S Trent %A Miller, Randolph A %A Giuse, Dario A %A Wang, Lulu %A Blanquicett, Carmelo %A Soysal, Ergin %A Xu, Jun %A Xu, Hua %X

OBJECTIVE: The goal of this study was to develop a practical framework for recognizing and disambiguating clinical abbreviations, thereby improving current clinical natural language processing (NLP) systems' capability to handle abbreviations in clinical narratives.

METHODS: We developed an open-source framework for clinical abbreviation recognition and disambiguation (CARD) that leverages our previously developed methods, including: (1) machine learning-based approaches to recognize abbreviations from a clinical corpus, (2) clustering-based semiautomated methods to generate possible senses of abbreviations, and (3) profile-based word sense disambiguation methods for clinical abbreviations. We applied CARD to clinical corpora from Vanderbilt University Medical Center (VUMC) and generated 2 comprehensive sense inventories for abbreviations in discharge summaries and clinic visit notes. Furthermore, we developed a wrapper that integrates CARD with MetaMap, a widely used general clinical NLP system.

RESULTS AND CONCLUSION: CARD detected 27 317 and 107 303 distinct abbreviations from discharge summaries and clinic visit notes, respectively. Two sense inventories were constructed for the 1000 most frequent abbreviations in these 2 corpora. Using the sense inventories created from discharge summaries, CARD achieved an F1 score of 0.755 for identifying and disambiguating all abbreviations in a corpus from the VUMC discharge summaries, which is superior to MetaMap and Apache's clinical Text Analysis Knowledge Extraction System (cTAKES). Using additional external corpora, we also demonstrated that the MetaMap-CARD wrapper improved MetaMap's performance in recognizing disorder entities in clinical notes. The CARD framework, 2 sense inventories, and the wrapper for MetaMap are publicly available at https://sbmi.uth.edu/ccb/resources/abbreviation.htm. We believe the CARD framework can be a valuable resource for improving abbreviation identification in clinical NLP systems.
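
NOTE (illustrative sketch, not part of the abstract): The profile-based word sense disambiguation mentioned in this abstract can be pictured roughly as follows. This is a simplified bag-of-words version with cosine similarity, not CARD's actual implementation; the function names, the abbreviation "RA", its two senses, and the example contexts are all hypothetical.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_profiles(examples):
    """Merge the context words of each sense into one profile per sense."""
    profiles = {}
    for sense, context in examples:
        profiles.setdefault(sense, Counter()).update(context.lower().split())
    return profiles


def disambiguate(context: str, profiles) -> str:
    """Return the sense whose profile is most similar to the context."""
    vec = Counter(context.lower().split())
    return max(profiles, key=lambda sense: cosine(vec, profiles[sense]))


# Hypothetical sense-labelled contexts for the abbreviation "RA".
EXAMPLES = [
    ("rheumatoid arthritis", "joint pain and swelling treated with methotrexate"),
    ("rheumatoid arthritis", "chronic joint stiffness in both hands"),
    ("right atrium", "echo shows enlarged chamber with normal pressure"),
    ("right atrium", "catheter advanced into the chamber of the heart"),
]

profiles = build_profiles(EXAMPLES)
print(disambiguate("patient with RA reports joint pain in hands", profiles))
```

A production system would presumably use richer features and weighting than raw word counts; the sketch only shows the compare-context-to-sense-profile idea.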

%B J Am Med Inform Assoc %8 2016 Aug 18 %G eng %R 10.1093/jamia/ocw109

%0 Journal Article %J Can Fam Physician %D 2015 %T Identifying patients with asthma in primary care electronic medical record systems: Chart analysis-based electronic algorithm validation study. %A Xi, Nancy %A Wallace, Rebecca %A Agarwal, Gina %A Chan, David %A Gershon, Andrea %A Gupta, Samir %K Adult %K Aged %K Algorithms %K Asthma %K Data Accuracy %K electronic health records %K Female %K Humans %K Male %K Middle Aged %K Ontario %K Primary Health Care %K Pulmonary Disease, Chronic Obstructive %K Registries %K Retrospective Studies %K Sensitivity and Specificity %X

OBJECTIVE: To develop and test a variety of electronic medical record (EMR) search algorithms to allow clinicians to accurately identify their patients with asthma in order to enable improved care.

DESIGN: A retrospective chart analysis identified 5 relevant unique EMR information fields (electronic disease registry, cumulative patient profile, billing diagnostic code, medications, and chart notes); asthma-related search terms were designated for each field. The accuracy of each term was tested for its ability to identify the asthma patients among all patients whose charts were reviewed. Increasingly sophisticated search algorithms were then designed and evaluated by serially combining individual searches with Boolean operators.

SETTING: Two large academic primary care clinics in Hamilton, Ont.

PARTICIPANTS: Charts for 600 randomly selected patients aged 16 years and older identified in an initial EMR search as likely having asthma (n = 150), chronic obstructive pulmonary disease (n = 150), other respiratory conditions (n = 150), or nonrespiratory conditions (n = 150) were reviewed until 100 patients per category were identified (or until all available names were exhausted). A total of 398 charts were reviewed in full and included.

MAIN OUTCOME MEASURES: Sensitivity and specificity of each search for asthma diagnosis (against the reference standard of a physician chart review-based diagnosis).

RESULTS: Two physicians reviewed the charts identified in the initial EMR search using a standardized data collection form and ascribed the following diagnoses in 398 patients: 112 (28.1%) had asthma, 81 (20.4%) had chronic obstructive pulmonary disease, 104 (26.1%) had other respiratory conditions, and 101 (25.4%) had nonrespiratory conditions. Concordance between reviewers in chart abstraction diagnosis was high (κ = 0.89, 95% CI 0.80 to 0.97). Overall, the algorithm searching for patients who had asthma in their cumulative patient profiles or for whom an asthma billing code had been used was the most accurate (sensitivity of 90.2%, 95% CI 87.3% to 93.1%; specificity of 83.9%, 95% CI 80.3% to 87.5%).

CONCLUSION: Usable, practical search algorithms that accurately identify patients with asthma in existing EMRs are presented. Clinicians can apply 1 of these algorithms to generate asthma registries for targeted quality improvement initiatives and outcome measurements. This methodology can be emulated for other diseases.
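
NOTE (illustrative sketch, not part of the abstract): The approach described in this abstract, combining individual EMR field searches with Boolean operators and scoring each combination against physician chart review, might look roughly like this. The `Chart` fields, function names, and the seven toy charts are hypothetical and are not study data.

```python
from dataclasses import dataclass


@dataclass
class Chart:
    in_cpp: bool        # asthma listed in the cumulative patient profile
    billing_code: bool  # an asthma billing diagnostic code was used
    has_asthma: bool    # reference standard: physician chart review


def cpp_or_billing(chart: Chart) -> bool:
    """Boolean OR of two single-field searches."""
    return chart.in_cpp or chart.billing_code


def sensitivity_specificity(charts, search):
    """Score a search function against the reference standard."""
    tp = sum(1 for c in charts if search(c) and c.has_asthma)
    fn = sum(1 for c in charts if not search(c) and c.has_asthma)
    tn = sum(1 for c in charts if not search(c) and not c.has_asthma)
    fp = sum(1 for c in charts if search(c) and not c.has_asthma)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative chart")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data for illustration only.
CHARTS = [
    Chart(True, False, True),
    Chart(False, True, True),
    Chart(True, True, True),
    Chart(False, False, True),
    Chart(False, False, False),
    Chart(False, True, False),
    Chart(False, False, False),
]

sens, spec = sensitivity_specificity(CHARTS, cpp_or_billing)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Passing the search as a function makes it easy to compare alternative Boolean combinations (for example, AND instead of OR) on the same reviewed charts.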

%B Can Fam Physician %V 61 %P e474-83 %8 2015 Oct %G eng %N 10

%0 Journal Article %J J Am Med Inform Assoc %D 2015 %T OpenFDA: an innovative platform providing access to a wealth of FDA's publicly available data. %A Kass-Hout, Taha A %A Xu, Zhiheng %A Mohebbi, Matthew %A Nelsen, Hans %A Baker, Adam %A Levine, Jonathan %A Johanson, Elaine %A Bright, Roselie A %X

OBJECTIVE: The objective of openFDA is to facilitate access to and use of large, important Food and Drug Administration public datasets by developers, researchers, and the public through harmonization of data across disparate FDA datasets provided via application programming interfaces (APIs).

MATERIALS AND METHODS: Using cutting-edge technologies deployed on FDA's new public cloud computing infrastructure, openFDA provides open data for easier, faster (over 300 requests per second per process), and better access to FDA datasets; open source code and documentation shared on GitHub for open community contributions of examples, apps and ideas; and infrastructure that can be adopted for other public health big data challenges.

RESULTS: Since its launch on June 2, 2014, openFDA has developed four APIs for drug and device adverse events, recall information for all FDA-regulated products, and drug labeling. There have been more than 20 million API calls (more than half from outside the United States), 6000 registered users, 20,000 connected Internet Protocol addresses, and dozens of new software (mobile or web) apps developed. A case study demonstrates a use of openFDA data to understand an apparent association of a drug with an adverse event.

CONCLUSION: With easier and faster access to these datasets, consumers worldwide can learn more about FDA-regulated products.
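
NOTE (illustrative sketch, not part of the abstract): A query against the drug adverse event API mentioned in this abstract could be assembled along these lines. The base URL, endpoint path, and field name are assumptions recalled from openFDA's public documentation, not details given in the abstract, and should be verified there before use. The helper name is hypothetical, and the sketch only builds the URL without sending a request.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Assumed values; not stated in the abstract.
BASE_URL = "https://api.fda.gov"
DRUG_EVENT_PATH = "/drug/event.json"


def build_drug_event_query(drug_name: str, limit: int = 5) -> str:
    """Build a URL that searches adverse event reports by drug name."""
    params = {
        # Assumed field name for the reported product name.
        "search": f'patient.drug.medicinalproduct:"{drug_name}"',
        "limit": limit,
    }
    return f"{BASE_URL}{DRUG_EVENT_PATH}?{urlencode(params)}"


url = build_drug_event_query("aspirin")
print(url)
```

The resulting URL can be fetched with any HTTP client; the response is expected to be JSON.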

%B J Am Med Inform Assoc %8 2015 Dec 7 %G eng %R 10.1093/jamia/ocv153

%0 Journal Article %J J Digit Imaging %D 2012 %T Integration of the Image-Guided Surgery Toolkit (IGSTK) into the Medical Imaging Interaction Toolkit (MITK). %A Lu, Tong %A Liang, Ping %A Wu, Wen-Bo %A Xue, Jin %A Lei, Cheng-Long %A Li, Yin-Yan %A Sun, Yun-Na %A Liu, Fang-Yi %X The development cycle of an image-guided surgery navigation system is too long to meet current clinical needs. This paper presents an integrated system developed by the integration of two open-source software libraries (IGSTK and MITK) to shorten the development cycle of the image-guided surgery navigation system and save human resources simultaneously. An image-guided surgery navigation system was established by connecting the two aforementioned open-source software libraries. It used the Medical Imaging Interaction Toolkit (MITK) as a framework providing image processing tools for the image-guided surgery navigation system of medical imaging software with a high degree of interaction and used the Image-Guided Surgery Toolkit (IGSTK) as a library that provided the basic components of the system for location, tracking, and registration. An electromagnetic tracking device was used to measure the real-time position of surgical tools and fiducials attached to the patient's anatomy. IGSTK was integrated into MITK; at the same time, the compatibility and the stability of this system were emphasized. Experiments showed that an integrated image-guided surgery navigation system could be developed in 2 months. The integration of IGSTK into MITK is feasible. Several techniques for 3D reconstruction, geometric analysis, mesh generation, and surface data analysis for medical image analysis of MITK can connect with the techniques for location, tracking, and registration of IGSTK. 
This integration of advanced modalities can decrease software development time and emphasize the precision, safety, and robustness of the image-guided surgery navigation system.

%B J Digit Imaging %8 2012 Apr 26 %G eng %R 10.1007/s10278-012-9477-3

%0 Conference Paper %B Consumer Electronics, Communications and Networks (CECNet), 2012 2nd International Conference on %D 2012 %T Research and implementation of Electronic Medical Records Editing System based on CDA %A Lu Xiaoqi %A Yu Ning %A Gu Yu %A Jia Weitao %K CDA document structure %K CDA standards %K clinical document architecture %K document handling %K electronic medical records editing system %K EMR %K health level 7 %K HIS %K HL7 %K hospital information systems %K information exchange %K medical information systems %K regional medical system establishment %X It is difficult to exchange information between heterogeneous Hospital Information Systems (HIS). This is an obstacle to establishing regional medical systems. The emergence of the HL7 (Health Level 7) CDA (Clinical Document Architecture) standard provides the technical basis for such exchange. This paper describes CDA standards and CDA document structure. Through a complete analysis of the Electronic Medical Record (EMR), a model of an Electronic Medical Records Editing Subsystem is given. Based on this model, functional modules of the Electronic Medical Records Editing Subsystem are designed and implemented. Through the Electronic Medical Records Editing Subsystem, clinical documents can be effectively produced, parsed, validated, and viewed. It provides technical support for the exchange of medical and health information. 
%B Consumer Electronics, Communications and Networks (CECNet), 2012 2nd International Conference on %8 april %G eng %U http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=6201451&contentType=Conference+Publications&queryText%3DResearch+and+implementation+of+Electronic+Medical+Records+Editing+System+based+on+CDA %R 10.1109/CECNet.2012.6201451

%0 Journal Article %J Bioinformatics %D 2011 %T DDN: a caBIG® analytical tool for differential network analysis. %A Zhang, Bai %A Tian, Ye %A Jin, Lu %A Li, Huai %A Shih, Ie-Ming %A Madhavan, Subha %A Clarke, Robert %A Hoffman, Eric P %A Xuan, Jianhua %A Hilakivi-Clarke, Leena %A Wang, Yue %K Animals %K Computational Biology %K Epigenesis, Genetic %K Female %K Gene Regulatory Networks %K Mammary Glands, Animal %K Rats %K Software %K Systems Biology %X

Differential dependency network (DDN) is a caBIG® (cancer Biomedical Informatics Grid) analytical tool for detecting and visualizing statistically significant topological changes in transcriptional networks representing two biological conditions. Developed under caBIG®'s In Silico Research Centers of Excellence (ISRCE) Program, DDN enables differential network analysis and provides an alternative way of defining network biomarkers predictive of phenotypes. DDN also serves as a useful systems biology tool for users across biomedical research communities to infer how genetic, epigenetic, or environmental variables may affect biological networks and clinical phenotypes. Besides the standalone Java application, we have also developed a Cytoscape plug-in, CytoDDN, to integrate network analysis and visualization seamlessly.

AVAILABILITY: The Java and MATLAB source code can be downloaded at the authors' web site http://www.cbil.ece.vt.edu/software.htm.

%B Bioinformatics %V 27 %P 1036-8 %8 2011 Apr 1 %G eng %N 7 %R 10.1093/bioinformatics/btr052