Research on Natural Language Processing Technology Based on Deep Learning

To ensure the quality of natural language mining and processing for text consisting mainly of English vocabulary and Chinese corpora, this study adopts a word vector model together with an improved Apriori algorithm. Based on the correlations between word features, words of the same type are mapped into a separate vector space, which supports part-of-speech tagging, lexical chunking, and named entity recognition for words and phrases. An improved Apriori dynamic association rule algorithm is then applied to the candidate itemsets of the English or Chinese corpus: the different types of datasets are preprocessed and analyzed for support and confidence, and words or characters whose frequency exceeds the minimum support are added to the corresponding frequent itemsets, yielding the association relationships between itemsets in the network database. Finally, the mined natural language text data are cleaned (filtered), extracted, and stored by category, to ensure the accuracy of English and Chinese word segmentation and retrieval, part-of-speech and word-sense recognition, and image recognition and classified storage.
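The abstract does not specify how the Apriori algorithm was improved, so the following is only a minimal sketch of the classic (unimproved) Apriori procedure, illustrating the support and confidence analysis it describes. The function names `apriori` and `association_rules`, the toy token lists, and the thresholds are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Classic Apriori: return {itemset: support} for every frequent itemset.

    Uses the conventional `support >= min_support` test; the abstract
    words the condition as "greater than" the minimum support.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = dict(current)
    k = 2
    while current:
        prev = list(current)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, sup in frequent.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for antecedent in combinations(itemset, r):
                a = frozenset(antecedent)
                # confidence(A -> B) = support(A and B) / support(A)
                confidence = sup / frequent[a]
                if confidence >= min_confidence:
                    rules.append((a, itemset - a, sup, confidence))
    return rules


# Hypothetical toy corpus: each "transaction" is the token set of one text.
docs = [
    ["deep", "learning", "model"],
    ["deep", "learning", "text"],
    ["deep", "learning"],
    ["text", "model"],
]
frequent_itemsets = apriori(docs, min_support=0.5)
rules = association_rules(frequent_itemsets, min_confidence=0.8)
```

On this toy corpus, `deep` and `learning` co-occur in three of the four texts, so the pair is the only frequent itemset beyond the single words, and it produces a rule in each direction. A real pipeline would first segment and clean the corpus, which the sketch leaves out.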

Keywords: deep learning; word vector model; improved Apriori algorithm; natural language processing

谌颃、张袖斌、钟贵

Guangzhou Vocational College of Technology and Business (广州科技贸易职业学院), Guangzhou 511442

2024

数码设计

ISSN:1672-9129
Year, volume (issue): 2024, (12)