Existing deep multitask learning (MTL) approaches align layers shared between
tasks in a parallel ordering. Such an organization significantly constricts the
types of shared structure that can be learned...
The necessity of parallel
ordering for deep MTL is first tested by comparing it with permuted ordering of
shared layers. The results indicate that a flexible ordering can enable more
effective sharing, thus motivating the development of a soft ordering approach,
which learns how shared layers are applied in different ways for different
tasks. Deep MTL with soft ordering outperforms parallel ordering methods across
a series of domains. These results suggest that the power of deep MTL comes
from learning highly general building blocks that can be assembled to meet the
demands of each task.
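
To make the idea of soft ordering concrete, the following is a minimal sketch and not the authors' implementation. It assumes that, at each depth, a task combines the outputs of all shared layers using a softmax over task-specific scalars, so each task can use the same layers in its own order. The toy dense-plus-ReLU layers, the names `make_layer` and `soft_order_forward`, and the hand-set logits are hypothetical, chosen only for illustration. In practice the logits would presumably be learned jointly with the shared layer weights.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_layer(dim, rng):
    """Build a toy shared layer (hypothetical): dense weights followed by ReLU."""
    w = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]

    def layer(x):
        return [max(0.0, sum(wi * xi for wi, xi in zip(row, x))) for row in w]

    return layer


def soft_order_forward(x, layers, logits):
    """Apply shared layers under one task's soft ordering.

    logits[k][j] is the task's unnormalized preference for shared layer j
    at depth k. At every depth, all shared layers are applied to the current
    activation and their outputs are mixed with softmax weights.
    """
    for depth_logits in logits:
        weights = softmax(depth_logits)
        outputs = [layer(x) for layer in layers]
        x = [
            sum(w * out[i] for w, out in zip(weights, outputs))
            for i in range(len(x))
        ]
    return x


rng = random.Random(0)
dim, num_layers = 4, 3
layers = [make_layer(dim, rng) for _ in range(num_layers)]

# Hand-set logits for two hypothetical tasks (assumed values, not learned).
task_logits = {
    # Uniform mixture of all shared layers at every depth.
    "task_a": [[0.0, 0.0, 0.0] for _ in range(num_layers)],
    # Sharply peaked logits: approximately the fixed ordering 0, 1, 2.
    "task_b": [[50.0, 0.0, 0.0], [0.0, 50.0, 0.0], [0.0, 0.0, 50.0]],
}

x0 = [1.0, -2.0, 0.5, 3.0]
out_a = soft_order_forward(x0, layers, task_logits["task_a"])
out_b = soft_order_forward(x0, layers, task_logits["task_b"])
```

With sharply peaked logits, as for `task_b`, the soft ordering approaches a fixed ordering of the layers. Flatter logits, as for `task_a`, blend the layers at every depth.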