Building computers able to answer questions on any subject is a long standing
goal of artificial intelligence. Promising progress has recently been achieved
by methods that learn to map questions to logical forms or database queries.
Such approaches can be effective but at the cost of either large amounts of
human-labeled data or by defining lexicons and grammars tailored by
practitioners. In this paper, we instead take the radical approach of learning
to map questions to vectorial feature representations. By mapping answers into
the same space one can query any knowledge base independent of its schema,
without requiring any grammar or lexicon. Our method is trained with a new
optimization procedure combining stochastic gradient descent followed by a
fine-tuning step using the weak supervision provided by blending automatically
and collaboratively generated resources. We empirically demonstrate that our
model can capture meaningful signals from its noisy supervision leading to
major improvements over paralex, the only existing method able to be trained on
similar weakly labeled data.