Scientific journals : Article
Engineering, computing & technology : Computer science
Security, Reliability and Trust
http://hdl.handle.net/10993/43584
Automated Demarcation of Requirements in Textual Specifications: A Machine Learning-Based Approach
English
Abualhaija, Sallam [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)]
Arora, Chetan
Sabetzadeh, Mehrdad [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)]
Briand, Lionel [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)]
Traynor, Michael
2020
Empirical Software Engineering
Yes
International
[en] Natural-language Requirements ; Requirements Identification and Classification ; Machine Learning ; Natural Language Processing
[en] A simple but important task during the analysis of a textual
requirements specification is to determine which statements in the
specification represent requirements. In principle, by following
suitable writing and markup conventions, one can provide an immediate
and unequivocal demarcation of requirements at the time a specification
is being developed. However, neither the presence nor a fully accurate
enforcement of such conventions is guaranteed. The result is that, in
many practical situations, analysts end up resorting to after-the-fact
reviews for sifting requirements from other material in a requirements
specification. This is both tedious and time-consuming.

We propose an automated approach for demarcating requirements in
free-form requirements specifications. The approach, which is based on
machine learning, can be applied to a wide variety of specifications in
different domains and with different writing styles. The approach is
push-button, requiring no user-provided parameters before it can process
a given specification. We train and evaluate our approach over an
independently labeled dataset comprised of 33 industrial requirements
specifications. Over this dataset, our approach yields an average
precision of 81.2% and an average recall of 95.7%. Compared to simple
baselines that demarcate requirements based on the presence of modal
verbs and identifiers, our approach leads to an average gain of 16.4%
in precision and 25.5% in recall. We collect and analyze expert
feedback on the demarcations produced by our approach for industrial
requirements specifications. The results indicate that experts find our
approach useful and efficient in practice. We developed a prototype tool,
named DemaRQ, in support of our approach. To
facilitate replication, we make available to the research community this
prototype tool alongside the non-proprietary portion of our training
data.
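
The abstract compares the approach against simple baselines that flag a statement as a requirement when it contains a modal verb or an identifier. The following is a minimal sketch of such a baseline together with the precision and recall computation, not the authors' implementation. The modal-verb list, the identifier pattern, and the example sentences and labels are all assumptions invented for illustration; none of them come from the paper's dataset.

```python
import re

# Hypothetical modal-verb list; the paper's actual baseline may use a different set.
MODAL_VERBS = {"shall", "must", "will", "should"}

# Hypothetical identifier pattern: a leading label such as "REQ-12" or "SR_7".
IDENTIFIER = re.compile(r"^\s*[A-Z]{2,5}[-_]?\d+\b")


def baseline_is_requirement(sentence: str) -> bool:
    """Flag a sentence if it has a modal verb or starts with an identifier."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    return bool(words & MODAL_VERBS) or bool(IDENTIFIER.match(sentence))


def precision_recall(predicted, actual):
    """Compute precision and recall over parallel lists of booleans."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Invented sentences with invented labels (True = requirement).
SENTENCES = [
    ("The system shall log every failed login.", True),
    ("This section describes the logging subsystem.", False),
    ("Users will typically review logs weekly.", False),
    ("REQ-12 The pump stops when the tank is full.", True),
    ("The operator is required to confirm each shutdown.", True),
]

predicted = [baseline_is_requirement(text) for text, _ in SENTENCES]
actual = [label for _, label in SENTENCES]
precision, recall = precision_recall(predicted, actual)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

On these toy sentences the baseline wrongly flags a descriptive statement that happens to contain "will" and misses a requirement phrased without a modal verb or identifier, which are the kinds of errors a learned classifier is meant to reduce.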
QRA Corp ; Fonds National de la Recherche Luxembourg ; H2020 European Research Council ; NSERC
FnR ; FNR12632261 > Mehrdad Sabetzadeh > EQUACS > Early QUality Assurance of Critical Systems > 01/01/2019 > 31/12/2021 > 2018

File(s) associated to this reference

Fulltext file(s):

File: Abualhaija20.pdf ; Version: Publisher postprint ; Size: 4.2 MB ; Access: Open access
