Study Data from Carnegie Mellon University Provide New Insights into Machine Learning (An Empirical Investigation of the Role of Pre-training In Lifelong Learning)
By a News Reporter-Staff News Editor at Robotics & Machine Learning Daily News. Researchers detail new data in Machine Learning. According to news reporting from Pittsburgh, Pennsylvania, by NewsRx journalists, research stated, "The lifelong learning paradigm in machine learning is an attractive alternative to the more prominent isolated learning scheme not only due to its resemblance to biological learning but also its potential to reduce energy waste by obviating excessive model re-training. A key challenge to this paradigm is the phenomenon of catastrophic forgetting."

Financial supporters for this research include DSO National Laboratories, the Canada CIFAR AI Chair, and the Natural Sciences and Engineering Research Council of Canada (NSERC).

The news correspondents obtained a quote from the research from Carnegie Mellon University: "With the increasing popularity and success of pre-trained models in machine learning, we pose the question: What role does pre-training play in lifelong learning, specifically with respect to catastrophic forgetting? We investigate existing methods in the context of large, pre-trained models and evaluate their performance on a variety of text and image classification tasks, including a large-scale study using a novel data set of 15 diverse NLP tasks. Across all settings, we observe that generic pre-training implicitly alleviates the effects of catastrophic forgetting when learning multiple tasks sequentially compared to randomly initialized models. We then further investigate why pre-training alleviates forgetting in this setting. We study this phenomenon by analyzing the loss landscape, finding that pre-trained weights appear to ease forgetting by leading to wider minima. Based on this insight, we propose jointly optimizing for current task loss and loss basin sharpness to explicitly encourage wider basins during sequential fine-tuning."
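The idea quoted above, jointly optimizing the current task loss and the sharpness of its loss basin during sequential fine-tuning, is closely related to sharpness-aware minimization. The following is a minimal illustrative sketch of such a joint update in PyTorch, not the authors' reported implementation; the perturbation radius `rho`, the model, and the data passed in are hypothetical placeholders.

```python
# Illustrative sketch only: a sharpness-aware (SAM-style) update that penalizes
# sharp loss basins for the current task. Assumed, not taken from the paper.
import torch
import torch.nn as nn


def sharpness_aware_step(model: nn.Module,
                         loss_fn,
                         x: torch.Tensor,
                         y: torch.Tensor,
                         optimizer: torch.optim.Optimizer,
                         rho: float = 0.05) -> torch.Tensor:
    """One update that approximately minimizes the worst-case task loss inside
    an L2 ball of radius `rho` around the current weights (a sharpness proxy)."""
    # 1) Gradients of the plain task loss at the current weights.
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()

    # 2) Climb to the (approximate) worst point inside the rho-ball.
    with torch.no_grad():
        grad_norm = torch.norm(
            torch.stack([p.grad.norm(p=2)
                         for p in model.parameters() if p.grad is not None]),
            p=2) + 1e-12
        perturbations = []  # remember each perturbation so it can be undone
        for p in model.parameters():
            if p.grad is None:
                perturbations.append(None)
                continue
            e = rho * p.grad / grad_norm
            p.add_(e)
            perturbations.append(e)

    # 3) Gradient of the perturbed (sharpness-probing) loss.
    optimizer.zero_grad()
    perturbed_loss = loss_fn(model(x), y)
    perturbed_loss.backward()

    # 4) Undo the perturbation, then step with the perturbed gradient,
    #    which favors wider basins for the current task.
    with torch.no_grad():
        for p, e in zip(model.parameters(), perturbations):
            if e is not None:
                p.sub_(e)
    optimizer.step()
    optimizer.zero_grad()
    return perturbed_loss.detach()
```

In a lifelong-learning loop, a step like this would replace the plain optimizer step while fine-tuning the pre-trained model on each task in sequence.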
Keywords: Pittsburgh, Pennsylvania, United States, North and Central America, Cyborgs, Emerging Technologies, Machine Learning, Carnegie Mellon University.