Faculty of Computer Science Researcher Highlights Research in Machine Learning (Automated Text Annotation Using a Semi-Supervised Approach with Meta Vectorizer and Machine Learning Algorithms for Hate Speech Detection)

Investigators publish new report on artificial intelligence. According to news reporting out of Krakow, Poland, by NewsRx editors, research stated, “Text annotation is an essential element of the natural language processing approaches. The manual annotation process performed by humans has various drawbacks, such as subjectivity, slowness, fatigue, and possibly carelessness.”

The news journalists obtained a quote from the research from the Faculty of Computer Science: “In addition, annotators may annotate ambiguous data. Therefore, we have developed the concept of automated annotation to get the best annotations using several machine-learning approaches. The proposed approach is based on an ensemble algorithm of meta-learners and meta-vectorizer techniques. The approach employs a semi-supervised learning technique for automated annotation to detect hate speech. This involves leveraging various machine learning algorithms, including Support Vector Machine (SVM), Decision Tree (DT), K-Nearest Neighbors (KNN), and Naive Bayes (NB), in conjunction with Word2Vec and TF-IDF text extraction methods. The annotation process is performed using 13,169 Indonesian YouTube comments data. The proposed model used a Stemming approach using data from Sastrawi and new data of 2245 words. Semi-supervised learning uses 5%, 10%, and 20% of labeled data compared to performing labeling based on 80% of the datasets. In semi-supervised learning, the model learns from the labeled data, which provides explicit information, and the unlabeled data, which offers implicit insights. This hybrid approach enables the model to generalize and make informed predictions even when limited labeled data is available (based on self-learning). Ultimately, this enhances its ability to handle real-world scenarios with scarce annotated information.”
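The self-learning idea described in the quote can be illustrated with a minimal sketch. It is not the researchers' pipeline: the report describes an ensemble of SVM, DT, KNN, and NB over Word2Vec and TF-IDF features on Indonesian comments, whereas this sketch substitutes a single hand-written Naive Bayes classifier on raw word counts. The function names, the toy English comments, and the 0.7 confidence threshold are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Fit a multinomial Naive Bayes model on tokenized documents."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_proba(model, tokens):
    """Return {label: posterior probability} using add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for tok in tokens:
            count = word_counts[label][tok] + 1
            score += math.log(count / (total_words + len(vocab)))
        log_scores[label] = score
    # Normalize in log space for numerical stability.
    top = max(log_scores.values())
    exp_scores = {k: math.exp(v - top) for k, v in log_scores.items()}
    norm = sum(exp_scores.values())
    return {k: v / norm for k, v in exp_scores.items()}


def self_train(labeled, unlabeled, threshold=0.7, max_rounds=5):
    """Pseudo-label confident unlabeled comments, then retrain (self-learning)."""
    docs = [text.split() for text, _ in labeled]
    labels = [label for _, label in labeled]
    pool = [text.split() for text in unlabeled]
    for _ in range(max_rounds):
        model = train_nb(docs, labels)
        remaining = []
        for tokens in pool:
            probs = predict_proba(model, tokens)
            best = max(probs, key=probs.get)
            if probs[best] >= threshold:
                # Confident prediction: treat it as an automatic annotation.
                docs.append(tokens)
                labels.append(best)
            else:
                remaining.append(tokens)
        if len(remaining) == len(pool):
            break  # no new pseudo-labels this round, so stop early
        pool = remaining
    return train_nb(docs, labels), pool


# Hypothetical toy data: a small labeled seed set plus unlabeled comments.
labeled = [
    ("i hate you idiot", "hate"),
    ("you are stupid idiot", "hate"),
    ("nice video thanks", "neutral"),
    ("great song love it", "neutral"),
]
unlabeled = ["stupid idiot", "love this great video", "thanks"]

model, leftover = self_train(labeled, unlabeled)
probs = predict_proba(model, ["idiot"])
```

Comments that never reach the threshold stay in `leftover`, which mirrors the practical choice of routing low-confidence items to a human annotator instead of forcing a label.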

Keywords: Faculty of Computer Science, Krakow, Poland, Europe, Algorithms, Cyborgs, Emerging Technologies, Machine Learning, Supervised Learning

2024

Robotics & Machine Learning Daily News


Year, Volume (Issue): 2024 (Feb. 8)
  • 59