Lesion detection is an important problem within medical imaging analysis. Most previous work focuses on detecting and segmenting a specialized category of lesions (e.g., lung nodules). However, in clinical practice, radiologists are responsible for finding all possible types of anomalies. The task of universal lesion detection (ULD) was proposed to address this challenge by detecting a large variety of lesions from the whole body. There are multiple heterogeneously labeled datasets with varying label completeness: DeepLesion, the largest dataset of 32,735 annotated lesions of various types, but with even more missing annotation instances; and several fully-labeled single-type lesion datasets, such as LUNA for lung nodules and LiTS for liver tumors. In this work, we propose a novel framework to leverage all these datasets together to improve the performance of ULD. First, we learn a multi-head multi-task lesion detector using all datasets and generate lesion proposals on DeepLesion. Second, missing annotations in DeepLesion are retrieved by a new method of embedding matching that exploits clinical prior knowledge. Last, we discover suspicious but unannotated lesions using knowledge transfer from single-type lesion detectors. In this way, reliable positive and negative regions are obtained from partially-labeled and unlabeled images, which are effectively utilized to train ULD. To assess the clinically realistic protocol of 3D volumetric ULD, we fully annotated 1071 CT sub-volumes in DeepLesion. Our method outperforms the current state-of-the-art approach by 29% in the metric of average sensitivity.