Search Results for author: Michael Bi Mi

Found 22 papers, 14 papers with code

GhostRNN: Reducing State Redundancy in RNN with Cheap Operations

no code implementations 20 Nov 2024 Hang Zhou, Xiaoxu Zheng, Yunhe Wang, Michael Bi Mi, Deyi Xiong, Kai Han

Recurrent neural networks (RNNs) that are capable of modeling long-distance dependencies are widely used in various speech tasks, e.g., keyword spotting (KWS) and speech enhancement (SE).

Keyword Spotting, Speech Enhancement

Vista3D: Unravel the 3D Darkside of a Single Image

1 code implementation 18 Sep 2024 Qiuhong Shen, Xingyi Yang, Michael Bi Mi, Xinchao Wang

We embark on the age-old quest: unveiling the hidden dimensions of objects from mere glimpses of their visible parts.

3D Generation, Diversity

Isomorphic Pruning for Vision Models

1 code implementation 5 Jul 2024 Gongfan Fang, Xinyin Ma, Michael Bi Mi, Xinchao Wang

For instance, we improve the accuracy of DeiT-Tiny from 74.52% to 77.50% by pruning an off-the-shelf DeiT-Base model.

Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching

1 code implementation 3 Jun 2024 Xinyin Ma, Gongfan Fang, Michael Bi Mi, Xinchao Wang

In this study, we make an interesting and somewhat surprising observation: by introducing a caching mechanism, the computation of a large proportion of layers in the diffusion transformer can be readily removed, even without updating the model parameters.

Denoising

MotionMix: Weakly-Supervised Diffusion for Controllable Motion Generation

1 code implementation 20 Jan 2024 Nhat M. Hoang, Kehong Gong, Chuan Guo, Michael Bi Mi

Specifically, we separate the denoising objectives of a diffusion model into two stages: obtaining conditional rough motion approximations in the initial $T-T^*$ steps by learning from noisy annotated motions, followed by unconditional refinement of these preliminary motions during the last $T^*$ steps using unannotated motions.

Denoising, Motion Generation

Semantic Segmentation in Multiple Adverse Weather Conditions with Domain Knowledge Retention

no code implementations 15 Jan 2024 Xin Yang, Wending Yan, Yuan Yuan, Michael Bi Mi, Robby T. Tan

They struggle to acquire new knowledge while also retaining previously learned knowledge. To address these problems, we propose a semantic segmentation method for multiple adverse weather conditions that incorporates adaptive knowledge acquisition, pseudo-label blending, and weather composition replay.

Multi-target Domain Adaptation, Semantic Segmentation +1

HEAP: Unsupervised Object Discovery and Localization with Contrastive Grouping

no code implementations 29 Dec 2023 Xin Zhang, Jinheng Xie, Yuan Yuan, Michael Bi Mi, Robby T. Tan

Further, to ensure that different regions remain distinguishable, we introduce a region-level contrastive clustering loss that pulls similar regions across images closer together.

Object, Object Discovery +2

DreamDrone: Text-to-Image Diffusion Models are Zero-shot Perpetual View Generators

no code implementations 14 Dec 2023 Hanyang Kong, Dongze Lian, Michael Bi Mi, Xinchao Wang

We introduce DreamDrone, a novel zero-shot and training-free pipeline for generating unbounded flythrough scenes from textual prompts.

Image Generation, Perpetual View Generation +1

Priority-Centric Human Motion Generation in Discrete Latent Space

no code implementations ICCV 2023 Hanyang Kong, Kehong Gong, Dongze Lian, Michael Bi Mi, Xinchao Wang

We also present a motion discrete diffusion model that employs an innovative noise schedule, determined by the significance of each motion token within the entire motion sequence.

Motion Generation

Enhancing Video Super-Resolution via Implicit Resampling-based Alignment

1 code implementation CVPR 2024 Kai Xu, Ziwei Yu, Xin Wang, Michael Bi Mi, Angela Yao

We show that bilinear interpolation inherently attenuates high-frequency information while an MLP-based coordinate network can approximate more frequencies.

Video Super-Resolution

DepGraph: Towards Any Structural Pruning

1 code implementation CVPR 2023 Gongfan Fang, Xinyin Ma, Mingli Song, Michael Bi Mi, Xinchao Wang

Structural pruning enables model acceleration by removing structurally-grouped parameters from neural networks.

Network Pruning, Neural Network Compression

Bias-Compensated Integral Regression for Human Pose Estimation

no code implementations 25 Jan 2023 Kerui Gu, Linlin Yang, Michael Bi Mi, Angela Yao

Experimental results on both the human body and hand benchmarks show that BCIR is faster to train and more accurate than the original integral regression, making it competitive with state-of-the-art detection methods.

Hand Pose Estimation, Regression

Improving Deep Regression with Ordinal Entropy

1 code implementation 21 Jan 2023 Shihao Zhang, Linlin Yang, Michael Bi Mi, Xiaoxu Zheng, Angela Yao

In computer vision, it is often observed that formulating regression problems as classification tasks yields better performance.

Classification, Crowd Counting +2

PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision

1 code implementation CVPR 2022 Kehong Gong, Bingbing Li, Jianfeng Zhang, Tao Wang, Jing Huang, Michael Bi Mi, Jiashi Feng, Xinchao Wang

Existing self-supervised 3D human pose estimation schemes have largely relied on weak supervision, such as consistency loss, to guide the learning, which inevitably leads to inferior results in real-world scenarios with unseen poses.

3D Human Pose Estimation, Hallucination

Point2Seq: Detecting 3D Objects as Sequences

1 code implementation CVPR 2022 Yujing Xue, Jiageng Mao, Minzhe Niu, Hang Xu, Michael Bi Mi, Wei Zhang, Xiaogang Wang, Xinchao Wang

We further propose a lightweight scene-to-sequence decoder that can auto-regressively generate words conditioned on features from a 3D scene as well as cues from the preceding words.

3D Object Detection, Decoder +2
