RCM-extractor: an automated NLP-based approach for extracting a semi formal representation model from natural language requirements

Authors: Aya Zaki-Ismail, Mohamed Osama, Mohamed Abdelrazek, John Grundy, Amani Ibrahim

Abstract

Most existing (semi-)automated requirements formalisation techniques assume that requirements are specified in predefined templates, and they employ template-specific transformation rules to produce the corresponding formal representation. Hence, such techniques have limited expressiveness and, more importantly, require system engineers to rewrite their system requirements to follow the defined templates, both initially and during maintenance and evolution. In this paper, we introduce an automated requirements extraction technique (RCM-Extractor) that automatically extracts the key constructs of a comprehensive and formalisable semi-formal representation model from textual requirements. This avoids the expressiveness issues affecting existing requirement specification templates and eliminates the need to rewrite the requirements to match the structure of such templates. We evaluated RCM-Extractor on a dataset of 162 requirements curated from several papers in the literature. RCM-Extractor achieved 87% precision, 98% recall, 92% F-measure, and 86% accuracy. In addition, we evaluated the extraction capabilities of RCM-Extractor on a dataset of 15,000 automatically synthesised requirements constructed specifically to evaluate our approach. This dataset completely covers the possible structures and arrangements of the properties that can exist in system requirements. Our approach achieved 57%, 92% and 100% accuracy for uncorrected, partially corrected and fully corrected Stanford typed-dependencies representations of the synthesised requirements, respectively.
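
As a quick consistency check on the reported figures, the sketch below recomputes the F-measure from the stated precision and recall. It assumes the abstract's F-measure is the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_measure` is illustrative and not taken from the paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the reported F-measure is the balanced F1 score.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are 0
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract for the 162-requirement dataset.
reported_precision = 0.87
reported_recall = 0.98

f1 = f_measure(reported_precision, reported_recall)
print(f"F-measure: {f1:.0%}")  # prints "F-measure: 92%"
```

Under that assumption, the recomputed value rounds to the 92% given in the abstract. The 86% accuracy cannot be rederived in the same way, because it also depends on the true-negative count, which the abstract does not report.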

Keywords: Requirements extraction, Requirements formalization, Natural-language extraction

Paper URL: https://doi.org/10.1007/s10515-021-00312-y