Quantifying and alleviating political bias in language models

Authors:

Abstract

Current large-scale language models can be politically biased as a result of the data they are trained on, potentially causing serious problems when they are deployed in real-world settings. In this paper, we first describe metrics for measuring political bias in GPT-2 generation, and discuss several interesting takeaways: 1) the generation of the vanilla GPT-2 model is mostly liberal-leaning; 2) such political bias depends on the sensitive attributes mentioned in the context; and 3) when generation is primed with an explicit political identifier, the extent of political bias is imbalanced between liberal and conservative.
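
The abstract does not spell out the measurement protocol, but the setup it describes (probing GPT-2 continuations for prompts that mention a sensitive attribute or an explicit political identifier, then scoring their leaning) can be illustrated with a minimal, hypothetical sketch. Everything below is an assumption for illustration, not the paper's method: the prompts are invented, and `score_leaning` is a toy lexicon stand-in for whatever stance classifier the paper actually uses.

```python
# Hypothetical sketch: probe GPT-2's political leaning by sampling
# continuations for attribute-bearing prompts and aggregating stance scores.
# NOT the paper's protocol; `score_leaning` is a toy placeholder classifier.

from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

def sample_continuations(prompt, n=5, max_new_tokens=40):
    """Sample n stochastic continuations of a prompt from vanilla GPT-2."""
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        do_sample=True,
        top_p=0.9,
        max_new_tokens=max_new_tokens,
        num_return_sequences=n,
        pad_token_id=tokenizer.eos_token_id,
    )
    return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]

def score_leaning(text):
    """Toy lexicon scorer standing in for a real political-stance classifier:
    positive = liberal-leaning, negative = conservative-leaning."""
    liberal = {"equality", "climate", "rights", "diversity"}
    conservative = {"tradition", "taxes", "security", "borders"}
    words = set(text.lower().split())
    return len(words & liberal) - len(words & conservative)

# Invented example prompts: one with a sensitive attribute, two primed with
# explicit political identifiers (to compare liberal vs. conservative priming).
prompts = [
    "The immigrants living in this country",
    "As a liberal, I think healthcare",
    "As a conservative, I think healthcare",
]
for prompt in prompts:
    scores = [score_leaning(t) for t in sample_continuations(prompt)]
    print(prompt, "->", sum(scores) / len(scores))
```

Averaging such scores over many samples per prompt gives a crude per-attribute bias estimate; comparing the averages for the liberal- and conservative-primed prompts is one way to expose the imbalance the abstract mentions.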

Keywords: Bias in language models, Natural language generation, Political bias, Measuring bias, Mitigating bias

Article history: Received 31 October 2021, Revised 26 December 2021, Accepted 28 December 2021, Available online 3 January 2022, Version of Record 21 January 2022.

DOI: https://doi.org/10.1016/j.artint.2021.103654