Male or female: What traits characterize questions prompted by each gender in community question answering?

Authors:

Highlights:

• We study discriminant models for automatically recognizing gender across cQA members.

• Textual, demographic, metadata, and web search features were considered.

• Good non-linguistic indicators were age, industry, and second-level question categories.

• Models can still infer these indicators from textual sources via semantic and dependency relations. Linguistic traits from the question and the self-description were also useful.

Keywords: Demographics, User analysis, Question classification, Natural language processing, Community question answering, Large-scale experimentation, Data-driven information systems

Review history: Received 23 May 2017, Revised 19 August 2017, Accepted 20 August 2017, Available online 24 August 2017, Version of Record 31 August 2017.

Paper URL: https://doi.org/10.1016/j.eswa.2017.08.037