2 code implementations • 16 Jan 2025 • Wanqi Yin, Zhongang Cai, Ruisi Wang, Ailing Zeng, Chen Wei, Qingping Sun, Haiyi Mei, Yanjun Wang, Hui En Pang, Mingyuan Zhang, Lei Zhang, Chen Change Loy, Atsushi Yamashita, Lei Yang, Ziwei Liu
To exclude the influence of algorithmic design, we base our experiments on two minimalist architectures: SMPLer-X, which consists of an intermediate step for hand and face localization, and SMPLest-X, an even simpler version that reduces the network to its bare essentials and highlights significant advances in the capture of articulated hands.
no code implementations • 5 Dec 2024 • Zhouyingcheng Liao, Mingyuan Zhang, Wenjia Wang, Lei Yang, Taku Komura
While motion generation has made substantial progress, its practical application remains constrained by dataset diversity and scale, limiting its ability to handle out-of-distribution scenarios.
no code implementations • 22 Oct 2024 • Ryuma Nakahata, Shehtab Zaman, Mingyuan Zhang, Fake Lu, Kenneth Chiu
We present PtychoFormer, a hierarchical transformer-based model for data-driven single-shot ptychographic phase retrieval.
1 code implementation • 30 Aug 2024 • Mingyuan Zhang, Zhicheng Zhang, Hao Wu, Yong Wang
We present flow matching for reaction coordinates (FMRC), a novel deep learning algorithm designed to identify optimal reaction coordinates (RC) in biomolecular reversible dynamics.
no code implementations • 8 Jul 2024 • Xinying Guo, Mingyuan Zhang, Haozhe Xie, Chenyang Gu, Ziwei Liu
Crowd Motion Generation is essential in entertainment industries such as animation and games as well as in strategic fields like urban simulation and planning.
Ranked #1 on Motion Generation on KIL-ML
no code implementations • 5 Jul 2024 • Zhikun Zhang, Yiting Duan, Xiangjun Wang, Mingyuan Zhang
This paper proposes a novel fast online methodology for outlier detection called the exception maximization outlier detection method (EMODM), which employs probabilistic models and statistical algorithms to detect abnormal patterns from the outputs of complex systems.
1 code implementation • 8 Apr 2024 • Xingyu Zheng, Xianglong Liu, Haotong Qin, Xudong Ma, Mingyuan Zhang, Haojie Hao, Jiakai Wang, Zixiang Zhao, Jinyang Guo, Michele Magno
From the optimization perspective, a Low-rank Representation Mimicking (LRM) is applied to assist the optimization of binarized DMs.
no code implementations • 1 Apr 2024 • Mingyuan Zhang, Daisheng Jin, Chenyang Gu, Fangzhou Hong, Zhongang Cai, Jingfang Huang, Chongzhi Zhang, Xinying Guo, Lei Yang, Ying He, Ziwei Liu
In this work, we present Large Motion Model (LMM), a motion-centric, multi-modal framework that unifies mainstream motion generation tasks into a generalist model.
1 code implementation • 1 Feb 2024 • Mingyuan Zhang, Shivani Agarwal
Most work on learning from noisy labels has focused on standard loss-based performance measures.
no code implementations • 16 Jan 2024 • Chongzhi Zhang, Mingyuan Zhang, Zhiyang Teng, Jiayi Li, Xizhou Zhu, Lewei Lu, Ziwei Liu, Aixin Sun
Our method involves the direct generation of a global 2D temporal map via a conditional denoising diffusion process, based on the input video and language query.
no code implementations • NeurIPS 2023 • Jiawei Ren, Mingyuan Zhang, Cunjun Yu, Xiao Ma, Liang Pan, Ziwei Liu
Generating animation of physics-based characters with intuitive control has long been a desirable task with numerous applications.
1 code implementation • NeurIPS 2023 • Mingyuan Zhang, Huirong Li, Zhongang Cai, Jiawei Ren, Lei Yang, Ziwei Liu
Notably, FineMoGen further enables zero-shot motion editing capabilities with the aid of modern large language models (LLMs), faithfully manipulating motion sequences with fine-grained instructions.
Ranked #6 on Motion Synthesis on KIT Motion-Language
no code implementations • CVPR 2024 • Zhongang Cai, Jianping Jiang, Zhongfei Qing, Xinying Guo, Mingyuan Zhang, Zhengyu Lin, Haiyi Mei, Chen Wei, Ruisi Wang, Wanqi Yin, Xiangyu Fan, Han Du, Liang Pan, Peng Gao, Zhitao Yang, Yang Gao, Jiaqi Li, Tianxiang Ren, Yukun Wei, Xiaogang Wang, Chen Change Loy, Lei Yang, Ziwei Liu
In this work, we present Digital Life Project, a framework utilizing language as the universal medium to build autonomous 3D characters, who are capable of engaging in social interactions and expressing with articulated body motions, thereby simulating life in a digital environment.
Ranked #3 on Motion Synthesis on InterHuman
2 code implementations • NeurIPS 2023 • Zhongang Cai, Wanqi Yin, Ailing Zeng, Chen Wei, Qingping Sun, Yanjun Wang, Hui En Pang, Haiyi Mei, Mingyuan Zhang, Lei Zhang, Chen Change Loy, Lei Yang, Ziwei Liu
1) For the data scaling, we perform a systematic investigation on 32 EHPS datasets, including a wide range of scenarios that a model trained on any single dataset cannot handle.
Ranked #2 on 3D Human Pose Estimation on UBody
no code implementations • 5 Sep 2023 • Mingyuan Zhang, Ambuj Tewari
In online ranking, a learning algorithm sequentially ranks a set of items and receives feedback on its ranking in the form of relevance scores.
no code implementations • 28 Aug 2023 • Zhongang Cai, Liang Pan, Chen Wei, Wanqi Yin, Fangzhou Hong, Mingyuan Zhang, Chen Change Loy, Lei Yang, Ziwei Liu
To tackle these challenges, we propose a principled framework, PointHPS, for accurate 3D HPS from point clouds captured in real-world settings, which iteratively refines point features through a cascaded architecture.
1 code implementation • ICCV 2023 • Mingyuan Zhang, Xinying Guo, Liang Pan, Zhongang Cai, Fangzhou Hong, Huirong Li, Lei Yang, Ziwei Liu
However, the performance on more diverse motions remains unsatisfactory.
Ranked #2 on Motion Synthesis on KIT Motion-Language
1 code implementation • 26 Jan 2023 • Haotong Qin, Mingyuan Zhang, Yifu Ding, Aoyu Li, Zhongang Cai, Ziwei Liu, Fisher Yu, Xianglong Liu
Network binarization emerges as one of the most promising compression approaches offering extraordinary computation and memory savings by minimizing the bit-width.
2 code implementations • 31 Aug 2022 • Mingyuan Zhang, Zhongang Cai, Liang Pan, Fangzhou Hong, Xinying Guo, Lei Yang, Ziwei Liu
Instead of a deterministic language-motion mapping, MotionDiffuse generates motions through a series of denoising steps in which variations are injected.
Ranked #25 on Motion Synthesis on KIT Motion-Language
1 code implementation • 17 May 2022 • Fangzhou Hong, Mingyuan Zhang, Liang Pan, Zhongang Cai, Lei Yang, Ziwei Liu
Our key insight is to take advantage of the powerful vision-language model CLIP for supervising neural human generation, in terms of 3D geometry, texture and animation.
no code implementations • 28 Apr 2022 • Zhongang Cai, Daxuan Ren, Ailing Zeng, Zhengyu Lin, Tao Yu, Wenjia Wang, Xiangyu Fan, Yang Gao, Yifan Yu, Liang Pan, Fangzhou Hong, Mingyuan Zhang, Chen Change Loy, Lei Yang, Ziwei Liu
4D human sensing and modeling are fundamental tasks in vision and graphics with numerous applications.
1 code implementation • CVPR 2022 • Jiawei Ren, Mingyuan Zhang, Cunjun Yu, Ziwei Liu
Data imbalance exists ubiquitously in real-world visual regressions, e.g., age estimation and pose estimation, hurting the model's generalizability and fairness.
1 code implementation • ICLR 2022 • Haotong Qin, Yifu Ding, Mingyuan Zhang, Qinghua Yan, Aishan Liu, Qingqing Dang, Ziwei Liu, Xianglong Liu
The large pre-trained BERT has achieved remarkable performance on Natural Language Processing (NLP) tasks but is also computation and memory expensive.
no code implementations • 14 Oct 2021 • Zhongang Cai, Mingyuan Zhang, Jiawei Ren, Chen Wei, Daxuan Ren, Zhengyu Lin, Haiyu Zhao, Lei Yang, Chen Change Loy, Ziwei Liu
Specifically, we contribute GTA-Human, a large-scale 3D human dataset generated with the GTA-V game engine, featuring a highly diverse set of subjects, actions, and scenarios.
1 code implementation • 29 Sep 2021 • Ricardo Bigolin Lanfredi, Mingyuan Zhang, William F. Auffermann, Jessica Chan, Phuong-Anh T. Duong, Vivek Srikumar, Trafton Drew, Joyce D. Schroeder, Tolga Tasdizen
Furthermore, a small subset of the data contains readings from all radiologists, allowing for the calculation of inter-rater scores.
no code implementations • 29 Sep 2021 • Jiawei Ren, Mingyuan Zhang, Cunjun Yu, Ziwei Liu
Compared to imbalanced and long-tailed classification, imbalanced regression has its unique challenges as the regression label space can be continuous, boundless, and high-dimensional.
1 code implementation • ICCV 2021 • Daxuan Ren, Jianmin Zheng, Jianfei Cai, Jiatong Li, Haiyong Jiang, Zhongang Cai, Junzhe Zhang, Liang Pan, Mingyuan Zhang, Haiyu Zhao, Shuai Yi
Generating an interpretable and compact representation of 3D shapes from point clouds is an important and challenging problem.
1 code implementation • CVPR 2022 • Chongzhi Zhang, Mingyuan Zhang, Shanghang Zhang, Daisheng Jin, Qiang Zhou, Zhongang Cai, Haiyu Zhao, Xianglong Liu, Ziwei Liu
By comprehensively investigating these GE-ViTs and comparing with their corresponding CNN models, we observe: 1) For the enhanced model, larger ViTs still benefit more in OOD generalization.
no code implementations • 1 Jan 2021 • Hangfeng He, Mingyuan Zhang, Qiang Ning, Dan Roth
Real-world applications often require making use of a range of incidental supervision signals.
no code implementations • 23 Dec 2020 • Daisheng Jin, Xiao Ma, Chongzhi Zhang, Yizhuo Zhou, Jiashu Tao, Mingyuan Zhang, Haiyu Zhao, Shuai Yi, Zhoujun Li, Xianglong Liu, Hongsheng Li
We observe that during training, the relationship proposal distribution is highly imbalanced: most of the negative relationship proposals are easy to identify, e.g., the inaccurate object detection, which leads to the under-fitting of low-frequency difficult proposals.
no code implementations • 15 Dec 2020 • Jiawei Ren, Cunjun Yu, Zhongang Cai, Mingyuan Zhang, Chongsong Chen, Haiyu Zhao, Shuai Yi, Hongsheng Li
Panoptic segmentation aims at generating pixel-wise class and instance predictions for each pixel in the input image, which is a challenging task and far more complicated than naively fusing the semantic and instance segmentation results.
Ranked #11 on Panoptic Segmentation on COCO test-dev
no code implementations • NeurIPS 2020 • Mingyuan Zhang, Shivani Agarwal
When H is the class of linear models, the class F consists of certain piecewise linear scoring functions that are characterized by the same number of parameters as in the linear case, and minimization over which can be performed using an adaptation of the min-pooling idea from neural network training.
1 code implementation • ICLR 2021 • Haotong Qin, Zhongang Cai, Mingyuan Zhang, Yifu Ding, Haiyu Zhao, Shuai Yi, Xianglong Liu, Hao Su
To alleviate the resource constraint for real-time point cloud applications that run on edge devices, in this paper we present BiPointNet, the first model binarization approach for efficient deep learning on point clouds.
no code implementations • ICML 2020 • Mingyuan Zhang, Harish G. Ramaswamy, Shivani Agarwal
In particular, the F-measure explicitly balances recall (fraction of active labels predicted to be active) and precision (fraction of labels predicted to be active that are actually so), both of which are important in evaluating the overall performance of a multi-label classifier.
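The definitions of recall and precision above can be made concrete with a small worked example. This is a minimal sketch for a single multi-label instance, not the paper's algorithm: the function name `precision_recall_f1`, the choice of the balanced F1 variant (harmonic mean of precision and recall), and the convention of returning 0.0 when a denominator is empty are all assumptions made for illustration.

```python
def precision_recall_f1(active_labels, predicted_labels):
    """Return (precision, recall, F1) for one multi-label instance.

    Hypothetical helper for illustration only. Empty denominators are
    mapped to 0.0, which is an assumed convention.
    """
    active = set(active_labels)
    predicted = set(predicted_labels)
    hits = len(active & predicted)  # labels both active and predicted active

    # Precision: fraction of predicted-active labels that are actually active.
    precision = hits / len(predicted) if predicted else 0.0
    # Recall: fraction of active labels that were predicted to be active.
    recall = hits / len(active) if active else 0.0

    # Balanced F-measure (F1): harmonic mean of precision and recall.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Three labels are active; four are predicted, two of them correctly.
p, r, f1 = precision_recall_f1({"cat", "dog", "bird"},
                               {"cat", "dog", "fish", "car"})
print(p, r, f1)
```

Here precision is 2/4 and recall is 2/3, so F1 works out to 4/7: predicting extra labels lowers precision, while missing active labels lowers recall, and the F-measure penalizes both.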
2 code implementations • EMNLP 2021 • Hangfeng He, Mingyuan Zhang, Qiang Ning, Dan Roth
Real-world applications often require improved models by leveraging a range of cheap incidental supervision signals.
14 code implementations • 4 Dec 2018 • Zhuoran Shen, Mingyuan Zhang, Haiyu Zhao, Shuai Yi, Hongsheng Li
Dot-product attention has wide applications in computer vision and natural language processing.
Ranked #2 on Extractive Text Summarization on GovReport