no code implementations • 12 Feb 2025 • Zhenxing Mi, Kuan-Chieh Wang, Guocheng Qian, Hanrong Ye, Runtao Liu, Sergey Tulyakov, Kfir Aberman, Dan Xu
Without complex training or dedicated datasets, ThinkDiff effectively unleashes understanding, reasoning, and composing capabilities in diffusion models.
1 code implementation • 1 Feb 2025 • Jie Zhang, Kuan-Chieh Wang, Bo-Wei Chiu, Min-Te Sun
Recent advances in deep learning have established Transformer architectures as the predominant modeling paradigm.
Ranked #3 on Long-range modeling on LRA
no code implementations • 2 Jan 2025 • Gaurav Parmar, Or Patashnik, Kuan-Chieh Wang, Daniil Ostashev, Srinivasa Narasimhan, Jun-Yan Zhu, Daniel Cohen-Or, Kfir Aberman
A key challenge in this task is to preserve the identity of the objects depicted in the input visual prompts, while also generating diverse compositions across different images.
no code implementations • 12 Dec 2024 • Guocheng Qian, Kuan-Chieh Wang, Or Patashnik, Negin Heravi, Daniil Ostashev, Sergey Tulyakov, Daniel Cohen-Or, Kfir Aberman
Our approach uses a few-to-many identity reconstruction training paradigm, where a limited set of input images is used to reconstruct multiple target images of the same individual in various poses and expressions.
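A minimal sketch of how such a few-to-many training batch might be assembled (illustrative only; the sampler assumes images are stored as per-identity lists of file paths, and none of the names come from the paper's code):

```python
import random

def few_to_many_batch(images_by_identity, n_inputs=4, n_targets=8):
    """Sample a few input images of one identity to condition on, and a
    disjoint, larger set of target images of the same identity (in other
    poses and expressions) to be reconstructed."""
    identity = random.choice(list(images_by_identity))
    pool = images_by_identity[identity]          # list of image paths
    inputs = random.sample(pool, n_inputs)
    remaining = [p for p in pool if p not in inputs]
    targets = random.sample(remaining, min(n_targets, len(remaining)))
    return inputs, targets
```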
no code implementations • 15 Nov 2024 • Jaewoo Heo, Kuan-Chieh Wang, Karen Liu, Serena Yeung-Levy
Motion capture technologies have transformed numerous fields, from the film and gaming industries to sports science and healthcare, by providing a tool to capture and analyze human movement in great detail.
no code implementations • 1 Oct 2024 • Laura Bravo-Sánchez, Jaewoo Heo, Zhenzhen Weng, Kuan-Chieh Wang, Serena Yeung-Levy
Social dynamics in close human interactions pose significant challenges for Human Mesh Estimation (HME), particularly due to the complexity of physical contacts and the scarcity of training data.
1 code implementation • 13 Jun 2024 • Amil Dravid, Yossi Gandelsman, Kuan-Chieh Wang, Rameen Abdal, Gordon Wetzstein, Alexei A. Efros, Kfir Aberman
First, sampling a set of weights from this space results in a new model encoding a novel identity.
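One way to picture this, as a hedged sketch: treat each personalized model's flattened weights as a point, fit a low-dimensional linear subspace to those points, and draw new points from it. Everything below (PCA via SVD, the scaling, the dimensions) is an assumed illustration, not the paper's released code:

```python
import numpy as np

def sample_new_identity(weight_matrix, n_components=100, scale=1.0):
    """weight_matrix: (n_models, d) flattened weights of many
    identity-personalized models. Sampling coefficients along the
    principal directions of this set yields the weights of a new
    model, i.e. a novel identity."""
    mean = weight_matrix.mean(axis=0)
    _, s, vt = np.linalg.svd(weight_matrix - mean, full_matrices=False)
    basis = vt[:n_components]                             # (k, d) directions
    std = s[:n_components] / np.sqrt(len(weight_matrix))  # per-direction spread
    coeffs = np.random.randn(n_components) * std * scale
    return mean + coeffs @ basis                          # (d,) new weights
```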
no code implementations • 17 Apr 2024 • Kuan-Chieh Wang, Daniil Ostashev, Yuwei Fang, Sergey Tulyakov, Kfir Aberman
MoA is designed to retain the original model's prior by fixing its attention layers in the prior branch, while minimally intervening in the generation process with the personalized branch that learns to embed subjects in the layout and context generated by the prior branch.
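A hedged sketch of that two-branch design (an illustrative module, not the authors' implementation; class and parameter names are assumptions):

```python
import torch
import torch.nn as nn

class MixtureOfAttention(nn.Module):
    """A frozen 'prior' attention branch preserves the base model's
    behavior; a trainable 'personalized' branch injects the subject;
    a learned router blends the two per token."""
    def __init__(self, dim, n_heads):
        super().__init__()
        self.prior = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        for p in self.prior.parameters():
            p.requires_grad = False      # keep the original prior intact
        self.personalized = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.router = nn.Linear(dim, 1)  # per-token mixing weight

    def forward(self, x, context):
        out_prior, _ = self.prior(x, context, context)
        out_pers, _ = self.personalized(x, context, context)
        w = torch.sigmoid(self.router(x))          # (B, T, 1) in [0, 1]
        return (1 - w) * out_prior + w * out_pers  # minimal intervention
```

Keeping the prior branch frozen is what lets the personalized branch act as a small perturbation on top of the layout and context the base model already generates.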
no code implementations • 15 Dec 2023 • Purvi Goel, Kuan-Chieh Wang, C. Karen Liu, Kayvon Fatahalian
Text-to-motion diffusion models can generate realistic animations from text prompts, but do not support fine-grained motion editing controls.
no code implementations • 10 Dec 2023 • Orr Zohar, Alejandro Lozano, Shelly Goel, Serena Yeung, Kuan-Chieh Wang
We exploit the inherent connection between classes in application-driven datasets and introduce a novel method, Foundation Object detection Model for the Open world, or FOMO, which identifies unknown objects based on their shared attributes with the base known objects.
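The attribute idea can be pictured with a small scoring sketch (hypothetical; it assumes region features and attribute descriptions already live in a shared vision-language embedding space):

```python
import torch
import torch.nn.functional as F

def unknown_object_scores(region_feats, attribute_embeds):
    """Score each region proposal by its best match to attributes
    shared with the known classes (e.g. 'has wheels', 'is furry');
    regions that match attributes strongly, even when no known class
    fires, are flagged as likely unknown objects."""
    r = F.normalize(region_feats, dim=-1)      # (N, d) proposals
    a = F.normalize(attribute_embeds, dim=-1)  # (A, d) attributes
    sims = r @ a.T                             # cosine similarities (N, A)
    return sims.max(dim=-1).values             # per-region objectness
```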
1 code implementation • 14 Sep 2023 • James Burgess, Kuan-Chieh Wang, Serena Yeung-Levy
We conclude that since the view token controls the 3D 'rendering' viewpoint, there is likely a scene representation embedded in frozen 2D diffusion models.
no code implementations • ICCV 2023 • Jeffrey Gu, Kuan-Chieh Wang, Serena Yeung
Neural fields, which represent signals as a function parameterized by a neural network, are a promising alternative to traditional discrete vector or grid-based representations.
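A minimal neural field, to make the definition concrete (a generic illustration, not this paper's architecture):

```python
import torch
import torch.nn as nn

# Instead of storing a signal on a discrete grid, an MLP maps
# continuous coordinates directly to signal values: here, a 2D
# coordinate to an RGB color.
field = nn.Sequential(
    nn.Linear(2, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 3),
)
coords = torch.rand(1024, 2)  # query points anywhere in [0, 1]^2
colors = field(coords)        # continuous output; no grid resolution limit
```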
1 code implementation • NeurIPS 2023 • Orr Zohar, Shih-Cheng Huang, Kuan-Chieh Wang, Serena Yeung
As the number of open-source VLM variants increases, there is a need for an efficient model selection strategy that does not require access to a curated evaluation dataset.
1 code implementation • 8 Feb 2023 • Yuhui Zhang, Jeff Z. HaoChen, Shih-Cheng Huang, Kuan-Chieh Wang, James Zou, Serena Yeung
Our proposed method can discover high-error data slices, identify influential attributes and further rectify undesirable model behaviors, without requiring any visual data.
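The step that makes text-only diagnosis possible is the shared image-text embedding space of models like CLIP: a classifier head trained on image embeddings can be probed with text embeddings instead. A hedged sketch (function and variable names are assumptions):

```python
import torch
import torch.nn.functional as F

def probe_with_text(classifier_head, text_embeds, phrases):
    """Query a head trained on CLIP *image* embeddings with CLIP *text*
    embeddings of candidate descriptions; phrases the head misclassifies
    point to likely error slices, without touching any visual data."""
    logits = classifier_head(F.normalize(text_embeds, dim=-1))
    preds = logits.argmax(dim=-1)
    return list(zip(phrases, preds.tolist()))
```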
1 code implementation • CVPR 2023 • Kuan-Chieh Wang, Zhenzhen Weng, Maria Xenochristou, João Pedro Araújo, Jeffrey Gu, C. Karen Liu, Serena Yeung
Empirically, we show that NeMo can recover 3D motion in sports using videos from the Penn Action dataset, where NeMo outperforms existing HMR methods in terms of 2D keypoint detection.
1 code implementation • CVPR 2023 • Orr Zohar, Kuan-Chieh Wang, Serena Yeung
The resulting Probabilistic Objectness transformer-based open-world detector, PROB, integrates our framework into traditional object detection models, adapting them for the open-world setting.
1 code implementation • 21 Jun 2022 • Zhenzhen Weng, Kuan-Chieh Wang, Angjoo Kanazawa, Serena Yeung
The ability to perceive 3D human bodies from a single image has a multitude of applications ranging from entertainment and robotics to neuroscience and healthcare.
1 code implementation • NeurIPS 2021 • Jixuan Wang, Kuan-Chieh Wang, Frank Rudzicz, Michael Brudno
Large pretrained language models (LMs) like BERT have improved performance in many disparate natural language processing (NLP) tasks.
1 code implementation • NeurIPS 2021 • Kuan-Chieh Wang, Yan Fu, Ke Li, Ashish Khisti, Richard Zemel, Alireza Makhzani
In this work, we provide a probabilistic interpretation of model inversion attacks, and formulate a variational objective that accounts for both diversity and accuracy.
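One natural form of such an objective, written as a sketch (the trade-off weight $\lambda$ is an assumption, not notation from the paper): maximize over a variational family $q(x)$

$$
\mathcal{L}(q) = \underbrace{\mathbb{E}_{x \sim q(x)}\big[\log p(y \mid x)\big]}_{\text{accuracy: } x \text{ classified as target } y} \;-\; \lambda\, \underbrace{D_{\mathrm{KL}}\big(q(x) \,\|\, p(x)\big)}_{\text{stay close to the prior, keep diversity}}
$$

so the attack trades off producing images the target model confidently labels as the attacked class against covering the prior broadly rather than collapsing to a single mode.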
1 code implementation • 29 Dec 2021 • Christina M. Funke, Paul Vicol, Kuan-Chieh Wang, Matthias Kümmerer, Richard Zemel, Matthias Bethge
Exploiting such correlations may increase predictive performance on noisy data; however, correlations are often not robust (e.g., they may change between domains, datasets, or applications), and models that exploit them do not generalize when correlations shift.
no code implementations • 1 Jan 2021 • Mengye Ren, Eleni Triantafillou, Kuan-Chieh Wang, James Lucas, Jake Snell, Xaq Pitkow, Andreas S. Tolias, Richard Zemel
In this work, we consider a realistic setting where the relationship between examples can change from episode to episode depending on the task context, which is not given to the learner.
no code implementations • 10 Dec 2020 • Mengye Ren, Eleni Triantafillou, Kuan-Chieh Wang, James Lucas, Jake Snell, Xaq Pitkow, Andreas S. Tolias, Richard Zemel
Despite impressive progress in deep learning, generalizing far beyond the training distribution is an important open challenge.
1 code implementation • 16 Jun 2020 • Jens Behrmann, Paul Vicol, Kuan-Chieh Wang, Roger Grosse, Jörn-Henrik Jacobsen
For problems where global invertibility is necessary, such as applying normalizing flows on OOD data, we show the importance of designing stable INN building blocks.
1 code implementation • ICML 2020 • Will Grathwohl, Kuan-Chieh Wang, Jorn-Henrik Jacobsen, David Duvenaud, Richard Zemel
We estimate the Stein discrepancy between the data density p(x) and the model density q(x), a discrepancy defined via a vector-valued critic function of the data.
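For a critic $f : \mathbb{R}^d \to \mathbb{R}^d$, the Stein discrepancy being estimated has the standard form

$$
S(p, q) = \mathbb{E}_{x \sim p}\!\left[\nabla_x \log q(x)^{\top} f(x) + \operatorname{Tr}\big(\nabla_x f(x)\big)\right],
$$

maximized over a family of critics. Only the score $\nabla_x \log q(x)$ of the model appears, so the normalizing constant of $q$ cancels, which is what makes the quantity usable for unnormalized (energy-based) models.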
4 code implementations • ICLR 2020 • Will Grathwohl, Kuan-Chieh Wang, Jörn-Henrik Jacobsen, David Duvenaud, Mohammad Norouzi, Kevin Swersky
In this setting, the standard class probabilities can be computed easily, as can unnormalized values of p(x) and p(x|y).
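Concretely, if $f(x)$ are the classifier's logits, this reads $\log p(y|x)$ off the softmax and treats $\operatorname{logsumexp}_y f(x)[y]$ as an unnormalized $\log p(x)$, with the single logit $f(x)[y]$ playing the role of an unnormalized $\log p(x|y)$ for fixed $y$. A minimal sketch:

```python
import torch

def energy_views(logits):
    """logits: classifier outputs f(x) of shape (batch, n_classes)."""
    log_p_y_given_x = logits.log_softmax(dim=-1)  # exact class probabilities
    log_p_x_unnorm = logits.logsumexp(dim=-1)     # unnormalized log p(x)
    return log_p_y_given_x, log_p_x_unnorm

logits = torch.randn(8, 10)  # e.g. a batch of 8 inputs, 10 classes
log_py_x, log_px = energy_views(logits)
```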
no code implementations • 25 Sep 2019 • Jens Behrmann, Paul Vicol, Kuan-Chieh Wang, Roger B. Grosse, Jörn-Henrik Jacobsen
Guarantees in deep learning are hard to achieve due to the interplay of flexible modeling schemes and complex tasks.
no code implementations • 25 Sep 2019 • Kuan-Chieh Wang, Paul Vicol, Eleni Triantafillou, Chia-Cheng Liu, Richard Zemel
In this work, we propose tasks for out-of-distribution detection in the few-shot setting and establish benchmark datasets, based on four popular few-shot classification datasets.
2 code implementations • 21 Feb 2019 • Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia X. Chen, Ye Jia, Anjuli Kannan, Tara Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel A. U. Bacchiani, Thomas B. Jablin, Rob Suderman, Ian Williams, Benjamin Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon
Lingvo is a Tensorflow framework offering a complete solution for collaborative deep learning research, with a particular focus on sequence-to-sequence models.
no code implementations • 6 Feb 2019 • Jixuan Wang, Kuan-Chieh Wang, Marc Law, Frank Rudzicz, Michael Brudno
Speaker embedding models that utilize neural networks to map utterances to a space where distances reflect similarity between speakers have driven recent progress in the speaker recognition task.
1 code implementation • ICML 2018 • Kuan-Chieh Wang, Paul Vicol, James Lucas, Li Gu, Roger Grosse, Richard Zemel
We propose a framework, Adversarial Posterior Distillation, to distill the SGLD samples using a Generative Adversarial Network (GAN).
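A hedged sketch of the distillation loop (architectures, dimensions, and the plain GAN loss are illustrative assumptions; the point is only that the "real" examples are SGLD weight samples):

```python
import torch
import torch.nn as nn

dim_theta, dim_z = 10_000, 128  # flattened network weights, latent noise
G = nn.Sequential(nn.Linear(dim_z, 512), nn.ReLU(), nn.Linear(512, dim_theta))
D = nn.Sequential(nn.Linear(dim_theta, 512), nn.ReLU(), nn.Linear(512, 1))
bce = nn.BCEWithLogitsLoss()

def gan_step(sgld_samples, opt_g, opt_d):
    """sgld_samples: (batch, dim_theta) posterior weight samples from SGLD.
    The generator is trained to become a compact sampler for them."""
    z = torch.randn(len(sgld_samples), dim_z)
    fake = G(z)
    # Discriminator: real SGLD posterior samples vs. generated weights.
    loss_d = bce(D(sgld_samples), torch.ones(len(sgld_samples), 1)) \
           + bce(D(fake.detach()), torch.zeros(len(fake), 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Generator: fool the discriminator.
    loss_g = bce(D(G(z)), torch.ones(len(z), 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```

After training, `G(torch.randn(n, dim_z))` yields fresh approximate posterior samples without storing the full SGLD chain.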
9 code implementations • ICML 2018 • Thomas Kipf, Ethan Fetaya, Kuan-Chieh Wang, Max Welling, Richard Zemel
Interacting systems are prevalent in nature, from dynamical systems in physics to complex societal dynamics.
no code implementations • NeurIPS 2017 • Yujia Li, Alexander Schwing, Kuan-Chieh Wang, Richard Zemel
We start from linear discriminators in which case conjugate duality provides a mechanism to reformulate the saddle point objective into a maximization problem, such that both the generator and the discriminator of this 'dualing GAN' act in concert.
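A simplified illustration of the duality step (not the paper's exact objective): for a linear discriminator $f_w(x) = w^{\top}\phi(x)$ with an $\ell_2$ penalty, the inner maximization has a closed form,

$$
\max_{w}\; w^{\top}(\mu_{\text{data}} - \mu_{G}) - \tfrac{\lambda}{2}\lVert w \rVert^{2} = \tfrac{1}{2\lambda}\big\lVert \mu_{\text{data}} - \mu_{G} \big\rVert^{2},
$$

where $\mu_{\text{data}} = \mathbb{E}_{x \sim p_{\text{data}}}[\phi(x)]$ and $\mu_{G} = \mathbb{E}_{x \sim G}[\phi(x)]$, so eliminating the inner player collapses the min-max game into a single optimization problem for the generator.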