Using Wikipedia to learn semantic feature representations of concrete concepts in neuroimaging experiments

Authors:

Abstract

In this paper we show that a corpus of a few thousand Wikipedia articles about concrete or visualizable concepts can be used to produce a low-dimensional semantic feature representation of those concepts. The purpose of such a representation is to serve as a model of the mental context of a subject during functional magnetic resonance imaging (fMRI) experiments. A recent study by Mitchell et al. (2008) [19] showed that it was possible to predict fMRI data acquired while subjects thought about a concrete concept, given a representation of those concepts in terms of semantic features obtained with human supervision. We use topic models on our corpus to learn semantic features from text in an unsupervised manner, and show that these features can outperform those in Mitchell et al. (2008) [19] in demanding 12-way and 60-way classification tasks. We also show that these features can be used to uncover similarity relations in brain activation for different concepts which parallel those relations in behavioral data from human subjects.

Keywords: Wikipedia, Matrix factorization, fMRI, Semantic features

Article history: Available online 10 July 2012.

Paper URL: https://doi.org/10.1016/j.artint.2012.06.005