Few-shot learning with adaptively initialized task optimizer: a practical meta-learning approach
Authors: Han-Jia Ye, Xiang-Rong Sheng, De-Chuan Zhan
Abstract
Given the cost of data collection and labeling in real-world applications, training a model with limited examples is an essential problem in machine learning, visual recognition, and related fields. Directly training a model on such few-shot learning (FSL) tasks tends to over-fit, so an effective task-level inductive bias is needed as key supervision. By treating each few-shot task as a whole, extracting task-level patterns, and learning a task-agnostic model initialization, the model-agnostic meta-learning (MAML) framework enables various models to be applied to FSL tasks. Given a training set with a few examples, MAML optimizes a model via a fixed number of gradient descent steps from an initial point chosen beforehand. Although this general framework achieves empirically satisfactory results, its initialization neglects task-specific characteristics and also aggravates the computational burden. In this manuscript, we propose the AdaptiVely InitiAlized Task OptimizeR (Aviator) approach for few-shot learning, which incorporates task context into the determination of the model initialization. This task-specific initialization facilitates the model optimization process so that high-quality model solutions are obtained efficiently. To this end, we decouple the model and apply a set transformation over the training set to determine the initial top-layer classifier. Re-parameterization of the first-order gradient descent approximation promotes gradient back-propagation. Experiments on synthetic and benchmark data sets validate that Aviator achieves state-of-the-art performance, and visualization results demonstrate the task-adaptive features of the proposed method.
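The contrast drawn in the abstract, a fixed number of gradient descent steps from a shared initial point versus an initialization computed from the task's own training set, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the model is a linear softmax classifier over fixed, unit-normalized embeddings, the data are synthetic, and `set_based_init` is a hypothetical stand-in that uses per-class means as a simple permutation-invariant set function. It is not the set transformation or the architecture used by Aviator.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # stabilize the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(W, X, y):
    p = softmax(X @ W)
    return -np.mean(np.log(p[np.arange(len(y)), y] + 1e-12))

def inner_loop(W0, X, y, n_classes, steps=3, lr=1.0):
    """Run a fixed number of gradient descent steps from the initial point W0."""
    W = W0.copy()
    Y = np.eye(n_classes)[y]  # one-hot labels
    for _ in range(steps):
        grad = X.T @ (softmax(X @ W) - Y) / len(y)
        W -= lr * grad
    return W

def set_based_init(X, y, n_classes):
    """Hypothetical task-specific init (an assumption, not the paper's method):
    per-class means of the support embeddings, stacked as classifier columns."""
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)], axis=1)

# Synthetic 5-way 5-shot task with 16-dimensional embeddings.
rng = np.random.default_rng(0)
n_classes, shots, dim = 5, 5, 16
centers = rng.normal(size=(n_classes, dim))
y = np.repeat(np.arange(n_classes), shots)
X = centers[y] + 0.5 * rng.normal(size=(n_classes * shots, dim))
X /= np.linalg.norm(X, axis=1, keepdims=True)  # unit-norm embeddings

W_shared = np.zeros((dim, n_classes))      # task-agnostic initial point
W_task = set_based_init(X, y, n_classes)   # task-specific initial point

for name, W0 in [("shared init", W_shared), ("set-based init", W_task)]:
    before = cross_entropy(W0, X, y)
    after = cross_entropy(inner_loop(W0, X, y, n_classes), X, y)
    print(f"{name}: loss {before:.3f} -> {after:.3f}")
```

Because the class mean does not depend on the order of the support examples, shuffling the training set leaves the task-specific initial classifier unchanged, which is the property that makes a set-level function a natural fit for this role.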
Keywords: Few-shot learning, Meta-learning, Supervised-learning, Multi-task learning, Task-specific
Paper URL: https://doi.org/10.1007/s10994-019-05838-7