Abstract: The main objective of this study is to generate decision tree (DT) models, a type of supervised machine learning (ML) method, using RapidMiner Studio, a popular visual platform for predictive analytics. The dataset used in the study contains attributes describing car accidents, such as 'gender', 'casualty class', 'age group' and 'type of vehicle'. These features are important for deciding whether the 'survival chance' of a traffic accident patient is 'high' or 'low'. Our goal is therefore to apply DTs to predict the 'survival chance' attribute in order to identify high-risk groups within the dataset. The resulting DTs show that when 'gender' has the value 'male', 'casualty class' has the value 'passenger' and 'age group' has the value 'teenager', the 'survival chance' of the traffic accident patient is extremely 'low'.
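The rule path reported above can be written as a simple predicate. This is only an illustrative sketch: the function name, the dictionary keys and the example record are hypothetical, and the code does not reproduce the RapidMiner model or its data.

```python
def matches_low_survival_rule(record):
    """Return True when a record satisfies the rule path quoted in the abstract.

    The keys 'gender', 'casualty_class' and 'age_group' are assumed names
    for the attributes mentioned in the abstract.
    """
    return (
        record.get("gender") == "male"
        and record.get("casualty_class") == "passenger"
        and record.get("age_group") == "teenager"
    )


# Hypothetical record, made up purely to show how the predicate is called.
example = {
    "gender": "male",
    "casualty_class": "passenger",
    "age_group": "teenager",
    "type_of_vehicle": "car",
}

if matches_low_survival_rule(example):
    print("Record falls in the group the tree labels as low survival chance")
```

A learned tree usually contains many such paths; each root-to-leaf path reads as one conjunction of attribute tests like this one.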
Abstract: In the absence of an Amazigh knowledge base that structures biographical information, we propose a methodology for building such a database. In this paper, we give a brief overview of the topics concerned and of the respective approaches, and we present the process of building our biographical collection. Furthermore, we discuss the challenges we met in balancing what we need to collect against what is available online. Our methodology focuses on extracting biographical information from press dispatches that have been annotated with named entities. It is based on linguistic patterns and lexical markers, and adopts the concept of 'local grammars'.
Abstract: Social networks have become a customary source of news in recent times. However, the open and unrestricted way in which information is shared on social networks fosters the spread of rumours, which may cause severe economic, social and other damage. Motivated by this, our paper focuses on the rumour detection problem in Algerian Arabizi. By studying the linguistic rules of Algerian Arabizi, we propose a lemmatiser and a parser for analysing and standardising text in order to produce better rumour detection models. We also propose an approach for classifying rumours and news in social networks based on the expression of emotions and on users' positions. Experiments were run on several n-gram representations, the best of which reached an F-score of more than 94%. In addition, this research creates resources for Algerian Arabizi, an under-resourced dialect: a corpus and several lexicons have been built, which can support other work on this dialect.
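For readers unfamiliar with the terms, the sketch below shows what an n-gram representation and an F-score are. It is a generic illustration, not the authors' pipeline: the function names and placeholder tokens are hypothetical, and the score is assumed to be the balanced F1 measure, which the abstract does not specify.

```python
def word_ngrams(tokens, n):
    """Return the list of contiguous n-token tuples in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def f_score(true_pos, false_pos, false_neg):
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Placeholder tokens; a real input would be a lemmatised Arabizi message.
tokens = ["w1", "w2", "w3"]
print(word_ngrams(tokens, 2))
```

Comparing unigram, bigram and higher-order representations then amounts to training the same classifier on each feature set and comparing the resulting F-scores.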
Abstract: Today, millions of people suffer from diabetes, which contributes to many other potentially lethal conditions, e.g., heart, kidney and nerve damage. Diabetes mellitus is a chronic disease characterised by the body's inability to metabolise glucose, which can be life-threatening. Several researchers have therefore attempted over the years to construct an accurate predictive model for diabetes. Big data analytics has played a vital role in healthcare by building predictive models for diabetes mellitus using various machine learning techniques. The large amount of data now being collected has opened the opportunity to develop more complex and more accurate predictive models. This paper discusses the various machine learning models that have been proposed over the years to predict diabetes mellitus more accurately. We have conducted a thorough review of the literature on predicting diabetes using PIMA and other datasets, which demonstrates how various machine learning algorithms can be used to predict diabetes.