U.S. Environmental Protection Agency (EPA) Reports Findings in Machine Learning (Systematic Approaches for the Encoding of Chemical Groups: A Case Study)

By a News Reporter-Staff News Editor at Robotics & Machine Learning Daily News - New research on Machine Learning is the subject of a report. According to news originating from Research Triangle Park, North Carolina, by NewsRx correspondents, research stated, "Regulatory authorities aim to organize substances into groups to facilitate prioritization within hazard and risk assessment processes. Often, such chemical groupings are not explicitly defined by structural rules or physicochemical property information."
Our news journalists obtained a quote from the research from U.S. Environmental Protection Agency (EPA), "This is largely due to how these groupings are developed, namely, a manual expert curation process, which in turn makes updating and refining groupings, as new substances are evaluated, a practical challenge. Herein, machine learning methods were leveraged to build models that could preliminarily assign substances to predefined groups. A set of 86 groupings containing 2,184 substances as published on the European Chemicals Agency (ECHA) website were mapped to the U.S. Environmental Protection Agency (EPA) Distributed Toxicity Structure Database (DSSTox) content to extract chemical and structural information. Substances were represented using Morgan fingerprints, and two machine learning approaches were used to classify test substances into 56 groups containing at least 10 substances with a structural representation in the data set: k-nearest neighbor (kNN) and random forest (RF), that led to mean 5-fold cross-validation test accuracies (average F1 scores) of 0.781 and 0.853, respectively. With a 9% improvement, the RF classifier was significantly more accurate than kNN (p-value = 0.001). The approach offers promise as a means of the initial profiling of new substances into predefined groups to facilitate prioritization efforts and streamline the assessment of new substances when earlier groupings are available."
According to the news editors, the research concluded: "The algorithm to fit and use these models has been made available in the accompanying repository, thereby enabling both use of the produced models and refitting of these models, as new groupings become available by regulatory authorities or industry." This research has been peer-reviewed.
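
For readers unfamiliar with the workflow described in the quoted abstract, the following is a minimal, illustrative sketch of the kNN half of it: fingerprint-based nearest-neighbor group assignment, scored by macro-averaged F1 under 5-fold cross-validation. It is not the authors' code. The toy fingerprints (sets of on-bit indices) stand in for real Morgan fingerprints, which in practice would be computed with a cheminformatics toolkit. The group names, the choice of Tanimoto similarity, the value k=3, and the simple fold assignment are all assumptions made for illustration; the report does not state these details. The random forest model is omitted.

```python
from collections import Counter

def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)

def knn_predict(query, train, k=3):
    """Majority vote among the k training items most similar to the query."""
    ranked = sorted(train, key=lambda item: tanimoto(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-group F1 scores over the groups in y_true."""
    scores = []
    for group in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == group and p == group for t, p in pairs)
        fp = sum(t != group and p == group for t, p in pairs)
        fn = sum(t == group and p != group for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def cross_validate(data, n_folds=5, k=3):
    """Mean macro F1 over n_folds folds; item i is held out in fold i % n_folds."""
    fold_scores = []
    for fold in range(n_folds):
        train = [d for i, d in enumerate(data) if i % n_folds != fold]
        test = [d for i, d in enumerate(data) if i % n_folds == fold]
        y_true = [label for _, label in test]
        y_pred = [knn_predict(fp, train, k) for fp, _ in test]
        fold_scores.append(macro_f1(y_true, y_pred))
    return sum(fold_scores) / n_folds

# Hypothetical toy data: each group shares three "substructure" bits,
# plus one bit unique to each substance.
TOY = []
for i in range(5):
    TOY.append((frozenset({1, 2, 3, 20 + i}), "group_A"))
    TOY.append((frozenset({10, 11, 12, 30 + i}), "group_B"))

if __name__ == "__main__":
    print(f"mean macro F1 over 5 folds: {cross_validate(TOY):.3f}")
```

On real data the groups overlap structurally, so scores would fall below the perfect separation this toy set produces; the sketch only shows how the pieces fit together.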

Research Triangle Park, North Carolina, United States, North and Central America, Cyborgs, Emerging Technologies, Machine Learning

2024

Robotics & Machine Learning Daily News

Year, Volume (Issue): 2024 (Apr. 1)