TechWord: Development of a technology lexical database for structuring textual technology information based on natural language processing

Authors:

Highlights:

• This paper proposes a methodology for designing a TechWord-based lexical database.

• The approach can improve text mining performance on technological information.

• This paper defines TechWord, a unit of lexical information about technology.

• A TechSynset is constructed by network analysis based on word embedding vectors.

• A case study in the automotive field is presented to validate the proposed approach.
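
The highlight about building a TechSynset by network analysis over word embedding vectors can be illustrated with a minimal sketch. This is not the paper's procedure, which this page does not describe. It assumes one plausible reading: link two terms when the cosine similarity of their vectors reaches a threshold, then treat each connected component of the resulting graph as a synset-like group. The toy vectors, the 0.9 threshold, and the names `EMBEDDINGS` and `group_terms` are all hypothetical.

```python
import math
from itertools import combinations

# Toy 3-dimensional vectors invented for illustration (not data from the paper).
EMBEDDINGS = {
    "engine": [0.9, 0.1, 0.0],
    "motor": [0.8, 0.2, 0.1],
    "brake": [0.1, 0.9, 0.1],
    "braking": [0.2, 0.8, 0.0],
    "sensor": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def build_graph(embeddings, threshold):
    """Undirected graph: an edge joins two terms whose similarity >= threshold."""
    graph = {word: set() for word in embeddings}
    for a, b in combinations(embeddings, 2):
        if cosine(embeddings[a], embeddings[b]) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Return each connected component as a sorted list of terms."""
    seen = set()
    groups = []
    for start in graph:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(graph[node] - seen)
        groups.append(sorted(group))
    return groups


def group_terms(embeddings, threshold=0.9):
    """Hypothetical helper: synset-like groups from thresholded similarity."""
    return connected_components(build_graph(embeddings, threshold))


if __name__ == "__main__":
    for group in group_terms(EMBEDDINGS):
        print(group)
```

With real embeddings the threshold controls how coarse the groups are: a lower value merges more terms into one group, and a higher value leaves more terms as singletons. Connected components are the simplest grouping rule; community detection or clustering could be substituted.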

Keywords: Patent mining, Natural language processing, Text mining, Lexical analysis, WordNet

Review history: Received 15 April 2020, Revised 22 August 2020, Accepted 22 September 2020, Available online 25 September 2020, Version of Record 29 September 2020.

Official paper URL: https://doi.org/10.1016/j.eswa.2020.114042