no code implementations • 27 Sep 2023 • Haichao Yu, Yu Tian, Sateesh Kumar, Linjie Yang, Heng Wang
DataComp is a new benchmark dedicated to evaluating different methods for data filtering.
1 code implementation • 23 Jul 2023 • Yiming Cui, Linjie Yang, Haichao Yu
Transformer-based detection and segmentation methods use a list of learned detection queries to retrieve information from the transformer network and learn to predict the location and category of one specific object from each query.
1 code implementation • ICCV 2023 • Cheng-En Wu, Yu Tian, Haichao Yu, Heng Wang, Pedro Morgado, Yu Hen Hu, Linjie Yang
Vision-language models such as CLIP learn a generic text-image embedding from large-scale training data.
no code implementations • 21 Jun 2023 • YuHan Shen, Linjie Yang, Longyin Wen, Haichao Yu, Ehsan Elhamifar, Heng Wang
Recent focus in video captioning has been on designing architectures that can consume both video and text modalities, and using large-scale video datasets with text transcripts for pre-training, such as HowTo100M.
1 code implementation • 30 Nov 2022 • Haichao Yu, Haoxiang Li, Gang Hua, Gao Huang, Humphrey Shi
To optimize the model, these prediction heads together with the network backbone are trained on every batch of training data.
no code implementations • 13 Oct 2021 • Haichao Yu, Zhe Chen, Dong Lin, Gil Shamir, Jie Han
Dropout has been commonly used to quantify prediction uncertainty, i.e., the variations of model predictions on a given input example.
no code implementations • 16 May 2021 • Haichao Yu, Linjie Yang, Humphrey Shi
Post-training quantization methods use a set of calibration data to compute quantization ranges for network parameters and activations.
no code implementations • 14 Sep 2020 • Haichao Yu, Ning Xu, Zilong Huang, Yuqian Zhou, Humphrey Shi
Image matting is a key technique for image and video editing and composition.
no code implementations • CVPR 2020 • Hanchao Yu, Shanhui Sun, Haichao Yu, Xiao Chen, Honghui Shi, Thomas Huang, Terrence Chen
In clinical deployment, however, they suffer dramatic performance drops due to mismatched distributions between training and testing datasets, commonly encountered in the clinical environment.
2 code implementations • 17 Nov 2019 • Haichao Yu, Haoxiang Li, Honghui Shi, Thomas S. Huang, Gang Hua
When all layers are set to low-bits, we show that the model achieved accuracy comparable to dedicated models trained at the same precision.