Language to Network: Conditional Parameter Adaptation with Natural Language Descriptions

Transfer learning using ImageNet pre-trained models has been the de facto approach in a wide range of computer vision tasks. However, fine-tuning still requires task-specific training data. In this paper, we propose N3 (Neural Networks from Natural Language), a new paradigm for synthesizing task-specific neural networks from language descriptions and a generic pre-trained model. N3 leverages language descriptions to generate parameter adaptations as well as a new task-specific classification layer for a pre-trained neural network, effectively "fine-tuning" the network for a new task using only language descriptions as input. To the best of our knowledge, N3 is the first method to synthesize entire neural networks from natural language. Experimental results show that N3 can outperform previous natural-language-based zero-shot learning methods across four zero-shot image classification benchmarks. We also demonstrate a simple method for identifying the keywords in language descriptions that N3 leverages when synthesizing model parameters.
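As a rough illustration of the idea in the abstract, the sketch below shows one way class descriptions could drive both a parameter adaptation and a new classification layer for a frozen pre-trained layer. It is not the authors' architecture: the hashed bag-of-words `embed` encoder, the per-unit multiplicative scaling, the linear generators `adapt_gen` and `cls_gen`, and all dimensions are assumptions made only for this example, and the generators are random here rather than trained.

```python
import random


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix standing in for learned or pre-trained weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def embed(description, dim=8):
    """Hypothetical language encoder: a deterministic hashed bag of words."""
    vec = [0.0] * dim
    for word in description.lower().split():
        vec[sum(ord(c) for c in word) % dim] += 1.0
    return vec


def synthesize(descriptions, pretrained_w, adapt_gen, cls_gen):
    """Build task-specific weights from class descriptions only."""
    embs = [embed(d) for d in descriptions]
    # Pool all class descriptions into one task-level vector.
    pooled = [sum(col) / len(embs) for col in zip(*embs)]
    # Assumed form of adaptation: one multiplicative scale per output unit.
    deltas = matvec(adapt_gen, pooled)
    adapted_w = [[w * (1.0 + d) for w in row]
                 for row, d in zip(pretrained_w, deltas)]
    # New classification layer: one weight row per class description.
    classifier = [matvec(cls_gen, e) for e in embs]
    return adapted_w, classifier


def predict(image_features, adapted_w, classifier):
    """Run the adapted layer (with ReLU) and the generated classifier."""
    hidden = [max(0.0, h) for h in matvec(adapted_w, image_features)]
    logits = matvec(classifier, hidden)
    return max(range(len(logits)), key=logits.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    pretrained_w = random_matrix(4, 6, rng)   # frozen "pre-trained" layer
    adapt_gen = random_matrix(4, 8, rng)      # description -> per-unit scale
    cls_gen = random_matrix(4, 8, rng)        # description -> classifier row
    descriptions = ["a small bird with a red crest",
                    "a large bird with black wings"]
    adapted_w, classifier = synthesize(descriptions, pretrained_w,
                                       adapt_gen, cls_gen)
    print(predict([1.0] * 6, adapted_w, classifier))
```

In a real system the two generators would be trained on seen classes so that descriptions of unseen classes yield useful weights; with the random generators used here, the predicted index carries no meaning.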
