Journal information
Journal of biomedical informatics.
Academic Press,

1532-0464

Officially published
Years covered

    DEEPEN: A negation detection system for clinical text incorporating dependency relation into NegEx

    Mehrabi, Saeed; Krishnan, Anand; Sohn, Sunghwan; Roch, Alexandra M.; ...
    7 pages
    Abstract: In Electronic Health Records (EHRs), much valuable information regarding patients' conditions is embedded in free text format. Natural language processing (NLP) techniques have been developed to extract clinical information from free text. One challenge faced in clinical NLP is that the meaning of clinical entities is heavily affected by modifiers such as negation. A negation detection algorithm, NegEx, applies a simplistic approach that has been shown to be powerful in clinical NLP. However, due to the failure to consider the contextual relationship between words within a sentence, NegEx fails to correctly capture the negation status of concepts in complex sentences. Incorrect negation assignment could cause inaccurate diagnosis of patients' condition or contaminated study cohorts. We developed a negation algorithm called DEEPEN to decrease NegEx's false positives by taking into account the dependency relationship between negation words and concepts within a sentence using the Stanford dependency parser. The system was developed and tested using EHR data from Indiana University (IU) and it was further evaluated on a Mayo Clinic dataset to assess its generalizability. The evaluation results demonstrate that DEEPEN, which incorporates dependency parsing into NegEx, can reduce the number of incorrect negation assignments for patients with positive findings, and therefore improve the identification of patients with the target clinical findings in EHRs. (C) 2015 Elsevier Inc. All rights reserved.
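The idea of filtering trigger-window matches through dependency relations can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the trigger list, the window size, the example sentence, and the hand-written dependency edges are hypothetical, the sketch omits NegEx's termination terms, and it does not reproduce the actual rules of NegEx or DEEPEN or the output of the Stanford parser.

```python
# Toy illustration only: a window-based trigger match followed by a
# dependency check. Triggers, window size, and edges are assumptions.

NEGATION_TRIGGERS = {"no", "not", "denies", "without"}
WINDOW = 5  # number of preceding tokens scanned for a trigger


def negex_like(tokens, concept_index):
    """True if a trigger appears within WINDOW tokens before the concept."""
    start = max(0, concept_index - WINDOW)
    return any(t.lower() in NEGATION_TRIGGERS for t in tokens[start:concept_index])


def dependency_confirms(tokens, edges, concept_index):
    """True if a trigger is the concept's head or one of its direct dependents.

    edges is a list of (head_index, dependent_index) pairs.
    """
    for head, dependent in edges:
        if head == concept_index and tokens[dependent].lower() in NEGATION_TRIGGERS:
            return True
        if dependent == concept_index and tokens[head].lower() in NEGATION_TRIGGERS:
            return True
    return False


def is_negated(tokens, edges, concept_index):
    """Keep a window-based negation only when the dependency check agrees."""
    return negex_like(tokens, concept_index) and dependency_confirms(
        tokens, edges, concept_index
    )


# Hypothetical sentence with hand-written (head, dependent) edges.
tokens = "There is no mass and the cyst is stable".split()
edges = [(1, 0), (1, 3), (3, 2), (1, 4), (8, 6), (6, 5), (8, 7), (1, 8)]

mass_negated = is_negated(tokens, edges, 3)  # "no" attaches directly to "mass"
cyst_negated = is_negated(tokens, edges, 6)  # "no" is in the window but unrelated
```

In this toy sentence the window pass alone flags both "mass" and "cyst", while the dependency check keeps the negation only for "mass", which is the kind of false positive the abstract describes.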

    An improved electromagnetism-like mechanism algorithm and its application to the prediction of diabetes mellitus

    Wang, Kung-Jeng; Adrian, Angelia Melani; Chen, Kun-Huang; Wang, Kung-Min; ...
    10 pages
    Abstract: Recently, the use of artificial intelligence based data mining techniques for massive medical data classification and diagnosis has gained popularity, but the effectiveness and efficiency of feature selection merit further investigation. In this paper, we present a novel method for feature selection that uses the opposite sign test (OST) as a local search for the electromagnetism-like mechanism (EM) algorithm, denoted as the improved electromagnetism-like mechanism (IEM) algorithm. A nearest neighbor algorithm serves as the classifier for the wrapper method. The proposed IEM algorithm is compared with nine popular feature selection and classification methods. Forty-six datasets from the UCI repository and eight gene expression microarray datasets are collected for comprehensive evaluation. Non-parametric statistical tests are conducted to justify the performance of the methods in terms of classification accuracy and Kappa index. The results confirm that the proposed IEM method is superior to the common state-of-the-art methods. Furthermore, we apply IEM to predict the occurrence of Type 2 diabetes mellitus (DM) after gestational DM. Our research helps identify the risk factors for this disease; accordingly, accurate diagnosis and prognosis can be achieved to reduce the morbidity and mortality rate caused by DM. (C) 2015 Elsevier Inc. All rights reserved.
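The wrapper idea, in which a candidate feature subset is scored by the accuracy of a nearest neighbor classifier, can be sketched as follows. This is a minimal sketch under stated assumptions: an exhaustive search over subsets is swapped in for the IEM search (which the abstract does not specify in detail), leave-one-out accuracy is an assumed fitness measure, and the tiny dataset is made up.

```python
from itertools import combinations


def loo_1nn_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier that
    only looks at the given feature indices (squared Euclidean distance)."""
    correct = 0
    for i in range(len(X)):
        best_j, best_d = None, float("inf")
        for j in range(len(X)):
            if j == i:
                continue
            d = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if d < best_d:
                best_d, best_j = d, j
        correct += y[best_j] == y[i]
    return correct / len(X)


def best_subset(X, y):
    """Exhaustive wrapper search (a stand-in for IEM): return the
    (accuracy, subset) pair with the highest leave-one-out accuracy."""
    n_features = len(X[0])
    best = (-1.0, ())
    for k in range(1, n_features + 1):
        for subset in combinations(range(n_features), k):
            acc = loo_1nn_accuracy(X, y, subset)
            if acc > best[0]:  # ties keep the earlier, smaller subset
                best = (acc, subset)
    return best


# Made-up data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.1], [1.1, 1.1], [1.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]
accuracy, selected = best_subset(X, y)
```

Exhaustive search is only feasible for a handful of features; a metaheuristic such as the one described in the abstract would replace the loop in `best_subset` while reusing the same fitness function.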

    Filtering big data from social media - Building an early warning system for adverse drug reactions

    Yang, Ming; Kiang, Melody; Shang, Wei
    11 pages
    Abstract: Objectives: Adverse drug reactions (ADRs) are believed to be a leading cause of death in the world. Pharmacovigilance systems are aimed at early detection of ADRs. With the popularity of social media, Web forums and discussion boards have become important venues for consumers to share their drug use experiences, and as a result may provide useful information on drugs and their adverse reactions. In this study, we propose an automated filtering mechanism for ADR-related posts using text classification methods. In real-life settings, ADR-related messages are highly distributed in social media, while non-ADR-related messages are unspecific and topically diverse. It is expensive to manually label a large number of ADR-related messages (positive examples) and non-ADR-related messages (negative examples) to train classification systems. To mitigate this challenge, we examine the use of a partially supervised learning classification method to automate the process.

    Visual aggregate analysis of eligibility features of clinical trials

    Carini, Simona; He, Zhe; Sim, Ida; Weng, Chunhua; ...
    15 pages
    Abstract: Objective: To develop a method for profiling the collective populations targeted for recruitment by multiple clinical studies addressing the same medical condition using one eligibility feature each time.

    Can-Evo-Ens: Classifier stacking based evolutionary ensemble system for prediction of human breast cancer using amino acid sequences

    Ali, Safdar; Majid, Abdul
    14 pages
    Abstract: The diagnosis of human breast cancer is an intricate process and specific indicators may produce negative results. In order to avoid misleading results, an accurate and reliable diagnostic system for breast cancer is indispensable. Recently, several interesting machine-learning (ML) approaches have been proposed for prediction of breast cancer. To this end, we developed a novel classifier stacking based evolutionary ensemble system "Can-Evo-Ens" for predicting amino acid sequences associated with breast cancer. In this paper, first, we selected four diverse types of ML algorithms as base-level classifiers: Naive Bayes, K-Nearest Neighbor, Support Vector Machines, and Random Forest. These classifiers are trained individually in different feature spaces using physicochemical properties of amino acids. In order to exploit the decision spaces, the preliminary predictions of base-level classifiers are stacked. Genetic programming (GP) is then employed to develop a meta-classifier that optimally combines the predictions of the base classifiers. The most suitable threshold value of the best-evolved predictor is computed using the Particle Swarm Optimization technique. Our experiments have demonstrated the robustness of the Can-Evo-Ens system on an independent validation dataset. The proposed system has achieved the highest value of Area Under Curve (AUC) of ROC Curve of 99.95% for cancer prediction. The comparative results revealed that the proposed approach is better than individual ML approaches and conventional ensemble approaches of AdaBoostM1, Bagging, GentleBoost, and Random Subspace. It is expected that the proposed novel system would have a major impact on the fields of Biomedical, Genomics, Proteomics, Bioinformatics, and Drug Development. (C) 2015 Elsevier Inc. All rights reserved.

    LGscore: A method to identify disease-related genes using biological literature and Google data

    Kim, Hyunjin; Yoon, Youngmi; Park, Sanghyun; Kim, Jeongwoo; ...
    13 pages
    Abstract: Since the genome project in the 1990s, a number of studies associated with genes have been conducted and researchers have confirmed that genes are involved in disease. For this reason, the identification of the relationships between diseases and genes is important in biology. We propose a method called LGscore, which identifies disease-related genes using Google data and literature data. To implement this method, first, we construct a disease-related gene network using text-mining results. We then extract gene-gene interactions based on co-occurrences in abstract data obtained from PubMed, and calculate the weights of edges in the gene network by means of Z-scoring. The weights contain two values: the frequency and the Google search results. The frequency value is extracted from literature data, and the Google search result is obtained using Google. We assign a score to each gene through a network analysis. We assume that genes with a large number of links and numerous Google search results and frequency values are more likely to be involved in disease. For validation, we investigated the top 20 inferred genes for five different diseases using answer sets. The answer sets comprised six databases that contain information on disease-gene relationships. We identified a significant number of disease-related genes as well as candidate genes for Alzheimer's disease, diabetes, colon cancer, lung cancer, and prostate cancer. Our method was up to 40% more accurate than existing methods. (C) 2015 Elsevier Inc. All rights reserved.
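The literature half of this pipeline (co-occurrence counting, Z-scoring of edge weights, and a per-gene score) can be sketched roughly as follows. This is a hedged illustration, not LGscore itself: the Google search component is left out, the per-gene score is assumed to be a weighted degree because the abstract does not name the network analysis used, and the gene sets per abstract are made-up toy data.

```python
from collections import Counter, defaultdict
from itertools import combinations
from statistics import mean, pstdev


def cooccurrence_counts(abstract_gene_sets):
    """Count, for each gene pair, the abstracts in which both genes appear."""
    counts = Counter()
    for genes in abstract_gene_sets:
        for a, b in combinations(sorted(set(genes)), 2):
            counts[(a, b)] += 1
    return counts


def z_scores(counts):
    """Standardize edge counts: (count - mean) / population std deviation."""
    values = list(counts.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return {edge: 0.0 for edge in counts}
    return {edge: (c - mu) / sigma for edge, c in counts.items()}


def weighted_degree(edge_weights):
    """Assumed scoring rule: sum the weights of all edges touching a gene."""
    scores = defaultdict(float)
    for (a, b), w in edge_weights.items():
        scores[a] += w
        scores[b] += w
    return dict(scores)


# Toy input: genes mentioned in four hypothetical abstracts.
abstracts = [
    {"APP", "PSEN1"},
    {"APP", "PSEN1", "APOE"},
    {"APP", "APOE"},
    {"APP", "PSEN1"},
]
counts = cooccurrence_counts(abstracts)
z = z_scores(counts)
scores = weighted_degree(z)
top_gene = max(scores, key=scores.get)
```

A second weight derived from search-result counts could be standardized with the same `z_scores` helper and combined with the frequency weight; how the two are combined in the original method is not stated in the abstract.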

    A spline-based tool to assess and visualize the calibration of multiclass risk predictions

    Van Hoorde, K.; Van Huffel, S.; Timmerman, D.; Bourne, T.; ...
    11 pages
    Abstract: When validating risk models (or probabilistic classifiers), calibration is often overlooked. Calibration refers to the reliability of the predicted risks, i.e. whether the predicted risks correspond to observed probabilities. In medical applications this is important because treatment decisions often rely on the estimated risk of disease. The aim of this paper is to present generic tools to assess the calibration of multiclass risk models.
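What "predicted risks correspond to observed probabilities" means can be shown with a simple binned check for one class at a time. This sketch deliberately uses equal-width bins rather than the spline-based approach of the paper, and the function name, bin count, and data are illustrative assumptions.

```python
def binned_calibration(probs, outcomes, n_bins=5):
    """Compare mean predicted risk with observed event rate per bin.

    probs    : predicted probabilities for one class (one-vs-rest)
    outcomes : 1 if the case belongs to that class, else 0
    Returns a list of (mean_predicted, observed_rate, n_cases) tuples
    for the non-empty bins, ordered from lowest to highest risk.
    """
    bins = [[] for _ in range(n_bins)]
    for p, o in zip(probs, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((p, o))
    table = []
    for members in bins:
        if members:
            mean_p = sum(p for p, _ in members) / len(members)
            observed = sum(o for _, o in members) / len(members)
            table.append((mean_p, observed, len(members)))
    return table


# Made-up predictions for a single class of a multiclass model.
probs = [0.1, 0.1, 0.3, 0.3, 0.9, 0.9]
outcomes = [0, 0, 0, 1, 1, 1]
table = binned_calibration(probs, outcomes)
```

For a multiclass model the same check would be repeated for each class in turn; a well-calibrated model shows observed rates close to the mean predicted risk in every bin.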

    Clustering clinical models from local electronic health records based on semantic similarity

    Goeg, Kirstine Rosenbeck; Cornet, Ronald; Andersen, Stig Kjaer
    11 pages
    Abstract: Background: Clinical models in electronic health records are typically expressed as templates which support the multiple clinical workflows in which the system is used. The templates are often designed using local rather than standard information models and terminology, which hinders semantic interoperability. Semantic challenges can be solved by harmonizing and standardizing clinical models. However, methods supporting harmonization based on existing clinical models are lacking. One approach is to explore semantic similarity estimation as a basis of an analytical framework. Therefore, the aim of this study is to develop and apply methods for intrinsic similarity-estimation based analysis that can compare and give an overview of multiple clinical models.

    A novel neural-inspired learning algorithm with application to clinical risk prediction

    Tay, Darwin; Poh, Chueh Loo; Kitney, Richard I.
    10 pages
    Abstract: Clinical risk prediction - the estimation of the likelihood an individual is at risk of a disease - is a coveted and exigent clinical task, and a cornerstone to the recommendation of life saving management strategies. This is especially important for individuals at risk of cardiovascular disease (CVD) given the fact that it is the leading cause of death in many developed countries. To this end, we introduce a novel learning algorithm - a key factor that influences the performance of machine learning-based prediction models - and utilize it to develop a CVD risk prediction tool. This novel neural-inspired algorithm, called the Artificial Neural Cell System for classification (ANCSc), is inspired by mechanisms that develop the brain and empower it with capabilities such as information processing/storage and recall, decision making and initiating actions on the external environment. Specifically, we exploit 3 natural neural mechanisms responsible for developing and enriching the brain - namely neurogenesis, neuroplasticity via nurturing and apoptosis - when implementing the ANCSc algorithm. Benchmark testing was conducted using the Honolulu Heart Program (HHP) dataset and results are juxtaposed with 2 other algorithms - i.e. Support Vector Machine (SVM) and Evolutionary Data-Conscious Artificial Immune Recognition System (EDC-AIRS). Empirical experiments indicate that the ANCSc algorithm (statistically) outperforms both the SVM and EDC-AIRS algorithms. Key clinical markers identified by the ANCSc algorithm include risk factors related to diet/lifestyle, pulmonary function, personal/family/medical history, blood data, blood pressure, and electrocardiography. These clinical markers, in general, are also found to be clinically significant - providing a promising avenue for identifying potential cardiovascular risk factors to be evaluated in clinical trials. (C) 2015 Elsevier Inc. All rights reserved.

    Mapping patient path in the Pediatric Emergency Department: A workflow model driven approach

    Ajmi, Ines; Zgaya, Hayfa; Gammoudi, Lotfi; Hammadi, Slim; ...
    14 pages
    Abstract: A workflow model of the patient journey in a Pediatric Emergency Department (PED) seems to be an effective approach to develop an accurate and complete representation of the PED processes. This model can drive the collection of comprehensive quantitative and qualitative service delivery and patient treatment data as an evidence base for the PED service planning. Our objective in this study is to identify crowded situation indicators and bottlenecks that contribute to over-crowding. The greatest source of delay in patient flow is the waiting time from the health care request, and especially the bed request, to exit from the PED for hospital admission. It represented 70% of the time that these patients occupied in the PED waiting rooms. The use of real data to construct the workflow model of the patient path is effective in identifying sources of delay in patient flow, and aspects of the PED activity that could be improved. The development of this model was based on accurate visits made in the PED of the Regional University Hospital Center (CHRU) of Lille (France). This modeling, which must represent the reality of the PED of the CHRU of Lille as faithfully as possible, is necessary. It must be detailed enough to produce an analysis that identifies the dysfunctions of the PED and to propose and estimate prevention indicators of crowded situations. Our survey is integrated into the French National Research Agency (ANR) project titled "Hospital: Optimization, Simulation and avoidance of strain" (HOST). (C) 2014 Elsevier Inc. All rights reserved.