Search Results for author: Ka Chun Cheung

Found 29 papers, 10 papers with code

One-Minute Video Generation with Test-Time Training

no code implementations • 7 Apr 2025 • Karan Dalal, Daniel Koceja, Gashon Hussein, Jiarui Xu, Yue Zhao, Youjin Song, Shihao Han, Ka Chun Cheung, Jan Kautz, Carlos Guestrin, Tatsunori Hashimoto, Sanmi Koyejo, Yejin Choi, Yu Sun, Xiaolong Wang

We have only experimented with one-minute videos due to resource constraints, but the approach can be extended to longer videos and more complex stories.

Mamba, Video Generation

M3Net: Multimodal Multi-task Learning for 3D Detection, Segmentation, and Occupancy Prediction in Autonomous Driving

no code implementations • 23 Mar 2025 • Xuesong Chen, Shaoshuai Shi, Tao Ma, Jingqiu Zhou, Simon See, Ka Chun Cheung, Hongsheng Li

In this paper, we introduce M3Net, a novel multimodal and multi-task network that simultaneously tackles detection, segmentation, and 3D occupancy prediction for autonomous driving and achieves performance superior to single-task models.

Autonomous Driving, Decoder, +2 more

Parallel Sequence Modeling via Generalized Spatial Propagation Network

no code implementations • 21 Jan 2025 • Hongjun Wang, Wonmin Byeon, Jiarui Xu, Jinwei Gu, Ka Chun Cheung, Xiaolong Wang, Kai Han, Jan Kautz, Sifei Liu

We present the Generalized Spatial Propagation Network (GSPN), a new attention mechanism optimized for vision tasks that inherently captures 2D spatial structures.

16k, Computational Efficiency, +3 more

Geometry Cloak: Preventing TGS-based 3D Reconstruction from Copyrighted Images

no code implementations • 30 Oct 2024 • Qi Song, Ziyuan Luo, Ka Chun Cheung, Simon See, Renjie Wan

Single-view 3D reconstruction methods like Triplane Gaussian Splatting (TGS) have enabled high-quality 3D model generation from just a single image input within seconds.

3D Reconstruction, Single-View 3D Reconstruction

GeometrySticker: Enabling Ownership Claim of Recolorized Neural Radiance Fields

no code implementations • 18 Jul 2024 • Xiufeng Huang, Ka Chun Cheung, Simon See, Renjie Wan

While approaches like CopyRNeRF have been introduced to embed binary messages into NeRF models as digital signatures for copyright protection, the process of recolorization can remove these binary messages.

NeRF

Unlocking Continual Learning Abilities in Language Models

1 code implementation • 25 Jun 2024 • Wenyu Du, Shuang Cheng, Tongxu Luo, Zihan Qiu, Zeyu Huang, Ka Chun Cheung, Reynold Cheng, Jie Fu

To address this limitation, we introduce MIGU (MagnItude-based Gradient Updating for continual learning), a rehearsal-free and task-label-free method that only updates the model parameters with large magnitudes of output in LMs' linear layers.

Continual Learning, Inductive Bias

RegionGPT: Towards Region Understanding Vision Language Model

no code implementations • CVPR 2024 • Qiushan Guo, Shalini De Mello, Hongxu Yin, Wonmin Byeon, Ka Chun Cheung, Yizhou Yu, Ping Luo, Sifei Liu

Vision language models (VLMs) have experienced rapid advancements through the integration of large language models (LLMs) with image-text pairs, yet they struggle with detailed regional visual understanding due to limited spatial awareness of the vision encoder, and the use of coarse-grained training data that lacks detailed, region-specific captions.

Language Modeling, +1 more

Resilient Practical Test-Time Adaptation: Soft Batch Normalization Alignment and Entropy-driven Memory Bank

no code implementations • 26 Jan 2024 • Xingzhi Zhou, Zhiliang Tian, Ka Chun Cheung, Simon See, Nevin L. Zhang

Test-time domain adaptation effectively adjusts the source domain model to accommodate unseen domain shifts in a target domain during inference.

Test-time Adaptation

SVD-PINNs: Transfer Learning of Physics-Informed Neural Networks via Singular Value Decomposition

no code implementations • 16 Nov 2022 • Yihang Gao, Ka Chun Cheung, Michael K. Ng

Physics-informed neural networks (PINNs) have attracted significant attention for solving partial differential equations (PDEs) in recent years because they alleviate the curse of dimensionality that appears in traditional methods.

Transfer Learning

Adaptive Label Smoothing with Self-Knowledge in Natural Language Generation

no code implementations • 22 Oct 2022 • Dongkyu Lee, Ka Chun Cheung, Nevin L. Zhang

Furthermore, inspired by recent work in bridging label smoothing and knowledge distillation, our work utilizes self-knowledge as a prior label distribution in softening target labels, and presents theoretical support for the regularization effect by knowledge distillation and the dynamic smoothing parameter.

Knowledge Distillation, Text Generation

Hard Gate Knowledge Distillation -- Leverage Calibration for Robust and Reliable Language Model

no code implementations • 22 Oct 2022 • Dongkyu Lee, Zhiliang Tian, Yingxiu Zhao, Ka Chun Cheung, Nevin L. Zhang

The question is answered in our work with the concept of model calibration; we view a teacher model not only as a source of knowledge but also as a gauge to detect miscalibration of a student.

Knowledge Distillation, Language Modeling, +3 more

NeuralMarker: A Framework for Learning General Marker Correspondence

no code implementations • 19 Sep 2022 • Zhaoyang Huang, Xiaokun Pan, Weihong Pan, Weikang Bian, Yan Xu, Ka Chun Cheung, Guofeng Zhang, Hongsheng Li

We tackle the problem of estimating correspondences from a general marker, such as a movie poster, to an image that captures such a marker.

Video Editing

MPPNet: Multi-Frame Feature Intertwining with Proxy Points for 3D Temporal Object Detection

1 code implementation • 12 May 2022 • Xuesong Chen, Shaoshuai Shi, Benjin Zhu, Ka Chun Cheung, Hang Xu, Hongsheng Li

Accurate and reliable 3D detection is vital for many applications including autonomous driving vehicles and service robots.

Autonomous Driving, Object Detection, +1 more

FlowFormer: A Transformer Architecture for Optical Flow

1 code implementation • 30 Mar 2022 • Zhaoyang Huang, Xiaoyu Shi, Chao Zhang, Qiang Wang, Ka Chun Cheung, Hongwei Qin, Jifeng Dai, Hongsheng Li

We introduce optical Flow transFormer, dubbed as FlowFormer, a transformer-based neural network architecture for learning optical flow.

Decoder, Optical Flow Estimation

Adaptive Label Smoothing with Self-Knowledge

no code implementations • 29 Sep 2021 • Dongkyu Lee, Ka Chun Cheung, Nevin Zhang

Overconfidence has been shown to impair generalization and calibration of a neural network.

Knowledge Distillation, Machine Translation

LIFE: Lighting Invariant Flow Estimation

no code implementations • 7 Apr 2021 • Zhaoyang Huang, Xiaokun Pan, Runsen Xu, Yan Xu, Ka Chun Cheung, Guofeng Zhang, Hongsheng Li

However, local image contents are inevitably ambiguous and error-prone during the cross-image feature matching process, which hinders downstream tasks.

Understanding Top-k Sparsification in Distributed Deep Learning

1 code implementation • 20 Nov 2019 • Shaohuai Shi, Xiaowen Chu, Ka Chun Cheung, Simon See

Distributed stochastic gradient descent (SGD) algorithms are widely deployed in training large-scale deep learning models, while the communication overhead among workers becomes the new system bottleneck.

Deep Learning
