User simulations for evaluating answers to question series

Authors:

Abstract

Recently, question series have become one focus of research in question answering. These series consist of individual factoid, list, and “other” questions organized around a central topic, and represent abstractions of user–system dialogs. Existing evaluation methodologies have yet to catch up with this richer task model, as they fail to take into account contextual dependencies and different user behaviors. This paper presents a novel simulation-based methodology for evaluating answers to question series that addresses some of these shortcomings. Using this methodology, we examine two different behavior models: a “QA-styled” user and an “IR-styled” user. Results suggest that an off-the-shelf document retrieval system is competitive with state-of-the-art QA systems in this task. Advantages and limitations of evaluations based on user simulations are also discussed.

Keywords: Question answering, Evaluation, User simulations

Article history: Received 20 April 2006, Revised 28 June 2006, Accepted 28 June 2006, Available online 22 August 2006.

Paper URL: https://doi.org/10.1016/j.ipm.2006.06.006