no code implementations • 1 Nov 2023 • Chen Wei, Chenxi Liu, Siyuan Qiao, Zhishuai Zhang, Alan Yuille, Jiahui Yu
We demonstrate text as a strong cross-modal interface.
no code implementations • 22 Jun 2023 • Paul K. Rubenstein, Chulayuth Asawaroengchai, Duc Dung Nguyen, Ankur Bapna, Zalán Borsos, Félix de Chaumont Quitry, Peter Chen, Dalia El Badawy, Wei Han, Eugene Kharitonov, Hannah Muckenhirn, Dirk Padfield, James Qin, Danny Rozenberg, Tara Sainath, Johan Schalkwyk, Matt Sharifi, Michelle Tadmor, Ramanovich, Marco Tagliasacchi, Alexandru Tudor, Mihajlo Velimirović, Damien Vincent, Jiahui Yu, Yongqiang Wang, Vicky Zayats, Neil Zeghidour, Yu Zhang, Zhishuai Zhang, Lukas Zilka, Christian Frank
AudioPaLM inherits the capability to preserve paralinguistic information such as speaker identity and intonation from AudioLM and the linguistic knowledge present only in text large language models such as PaLM-2.
no code implementations • 1 Jun 2023 • Jiachen Li, Xinwei Shi, Feiyu Chen, Jonathan Stroud, Zhishuai Zhang, Tian Lan, Junhua Mao, Jeonhyung Kang, Khaled S. Refaat, Weilong Yang, Eugene Ie, CongCong Li
Accurate understanding and prediction of human behaviors are critical prerequisites for autonomous vehicles, especially in highly dynamic and interactive scenarios such as intersections in dense urban areas.
no code implementations • 8 Feb 2023 • Qingqing Huang, Daniel S. Park, Tao Wang, Timo I. Denk, Andy Ly, Nanxin Chen, Zhengdong Zhang, Zhishuai Zhang, Jiahui Yu, Christian Frank, Jesse Engel, Quoc V. Le, William Chan, Zhifeng Chen, Wei Han
We introduce Noise2Music, where a series of diffusion models is trained to generate high-quality 30-second music clips from text prompts.
Ranked #2 on
Text-to-Music Generation
on MusicCaps
1 code implementation • CVPR 2022 • Qing Liu, Adam Kortylewski, Zhishuai Zhang, Zizhang Li, Mengqi Guo, Qihao Liu, Xiaoding Yuan, Jiteng Mu, Weichao Qiu, Alan Yuille
We believe our dataset provides a rich testbed to study UDA for part segmentation and will help to significantly push forward research in this area.
no code implementations • 1 Dec 2020 • Mengqi Guo, Yutong Bai, Zhishuai Zhang, Adam Kortylewski, Alan Yuille
Specifically, given a training image, we find a set of similar images that show instances of the same object category in the same pose, through an affine alignment of their corresponding feature maps.
no code implementations • CVPR 2020 • Zhishuai Zhang, Jiyang Gao, Junhua Mao, Yukai Liu, Dragomir Anguelov, Cong-Cong Li
For the Waymo Open Dataset, we achieve a bird-eyes-view (BEV) detection AP of 80. 73 and trajectory prediction average displacement error (ADE) of 33. 67cm for pedestrians, which establish the state-of-the-art for both tasks.
no code implementations • 18 Nov 2019 • Adam Kortylewski, Qing Liu, Huiyu Wang, Zhishuai Zhang, Alan Yuille
Our experimental results demonstrate that the proposed extensions increase the model's performance at localizing occluders as well as at classifying partially occluded objects.
no code implementations • 3 Sep 2019 • Yuyin Zhou, Yingwei Li, Zhishuai Zhang, Yan Wang, Angtian Wang, Elliot Fishman, Alan Yuille, Seyoun Park
Pancreatic ductal adenocarcinoma (PDAC) is one of the most lethal cancers with an overall five-year survival rate of 8%.
no code implementations • 23 Jun 2019 • Yuyin Zhou, David Dreizin, Yingwei Li, Zhishuai Zhang, Yan Wang, Alan Yuille
Trauma is the worldwide leading cause of death and disability in those younger than 45 years, and pelvic fractures are a major source of morbidity and mortality.
no code implementations • 28 May 2019 • Adam Kortylewski, Qing Liu, Huiyu Wang, Zhishuai Zhang, Alan Yuille
In this work, we combine DCNNs and compositional object models to retain the best of both approaches: a discriminative model that is robust to partial occlusion and mask attacks.
1 code implementation • 9 Dec 2018 • Yingwei Li, Song Bai, Yuyin Zhou, Cihang Xie, Zhishuai Zhang, Alan Yuille
The critical principle of ghost networks is to apply feature-level perturbations to an existing model to potentially create a huge set of diverse models.
1 code implementation • 28 Nov 2018 • Zhishuai Zhang, Wei Shen, Siyuan Qiao, Yan Wang, Bo wang, Alan Yuille
In this paper, we propose that the robustness of a face detector against hard faces can be improved by learning small faces on hard images.
Ranked #8 on
Face Detection
on WIDER Face (Hard)
1 code implementation • 31 Mar 2018 • Alexey Kurakin, Ian Goodfellow, Samy Bengio, Yinpeng Dong, Fangzhou Liao, Ming Liang, Tianyu Pang, Jun Zhu, Xiaolin Hu, Cihang Xie, Jian-Yu Wang, Zhishuai Zhang, Zhou Ren, Alan Yuille, Sangxia Huang, Yao Zhao, Yuzhe Zhao, Zhonglin Han, Junjiajia Long, Yerkebulan Berdibekov, Takuya Akiba, Seiya Tokui, Motoki Abe
To accelerate research on adversarial examples and robustness of machine learning classifiers, Google Brain organized a NIPS 2017 competition that encouraged researchers to develop new methods to generate adversarial examples as well as to develop new ways to defend against them.
1 code implementation • CVPR 2019 • Cihang Xie, Zhishuai Zhang, Yuyin Zhou, Song Bai, Jian-Yu Wang, Zhou Ren, Alan Yuille
We hope that our proposed attack strategy can serve as a strong benchmark baseline for evaluating the robustness of networks to adversaries and the effectiveness of different defense methods in the future.
1 code implementation • ECCV 2018 • Siyuan Qiao, Wei Shen, Zhishuai Zhang, Bo wang, Alan Yuille
We present Deep Co-Training, a deep learning based method inspired by the Co-Training framework.
no code implementations • CVPR 2018 • Zhishuai Zhang, Siyuan Qiao, Cihang Xie, Wei Shen, Bo wang, Alan L. Yuille
Our motivation is to enrich the semantics of object detection features within a typical deep detector, by a semantic segmentation branch and a global activation module.
no code implementations • ICML 2018 • Siyuan Qiao, Zhishuai Zhang, Wei Shen, Bo wang, Alan Yuille
Our method is by introducing computation orderings to the channels within convolutional layers or blocks, based on which we gradually compute the outputs in a channel-wise manner.
no code implementations • 13 Nov 2017 • Jianyu Wang, Zhishuai Zhang, Cihang Xie, Yuyin Zhou, Vittal Premachandran, Jun Zhu, Lingxi Xie, Alan Yuille
We use clustering algorithms to study the population activities of the features and extract a set of visual concepts which we show are visually tight and correspond to semantic parts of vehicles.
2 code implementations • ICLR 2018 • Cihang Xie, Jian-Yu Wang, Zhishuai Zhang, Zhou Ren, Alan Yuille
Convolutional neural networks have demonstrated high accuracy on various tasks in recent years.
no code implementations • CVPR 2018 • Zhishuai Zhang, Cihang Xie, Jian-Yu Wang, Lingxi Xie, Alan L. Yuille
The first layer extracts the evidence of local visual cues, and the second layer performs a voting mechanism by utilizing the spatial relationship between visual cues and semantic parts.
no code implementations • 25 Jul 2017 • Jianyu Wang, Cihang Xie, Zhishuai Zhang, Jun Zhu, Lingxi Xie, Alan Yuille
Our approach detects semantic parts by accumulating the confidence of local visual cues.
2 code implementations • ICCV 2017 • Cihang Xie, Jian-Yu Wang, Zhishuai Zhang, Yuyin Zhou, Lingxi Xie, Alan Yuille
Our observation is that both segmentation and detection are based on classifying multiple targets on an image (e. g., the basic target is a pixel or a receptive field in segmentation, and an object proposal in detection), which inspires us to optimize a loss function over a set of pixels/proposals for generating adversarial perturbations.
13 code implementations • 3 Oct 2016 • Nicolas Papernot, Fartash Faghri, Nicholas Carlini, Ian Goodfellow, Reuben Feinman, Alexey Kurakin, Cihang Xie, Yash Sharma, Tom Brown, Aurko Roy, Alexander Matyasko, Vahid Behzadan, Karen Hambardzumyan, Zhishuai Zhang, Yi-Lin Juang, Zhi Li, Ryan Sheatsley, Abhibhav Garg, Jonathan Uesato, Willi Gierke, Yinpeng Dong, David Berthelot, Paul Hendricks, Jonas Rauber, Rujun Long, Patrick McDaniel
An adversarial example library for constructing attacks, building defenses, and benchmarking both
1 code implementation • 21 Nov 2015 • Jianyu Wang, Zhishuai Zhang, Cihang Xie, Vittal Premachandran, Alan Yuille
We address the key question of how object part representations can be found from the internal states of CNNs that are trained for high-level tasks, such as object classification.