Search Results for author: Dit-yan Yeung

Found 74 papers, 22 papers with code

Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting

18 code implementations • NeurIPS 2015 • Xingjian Shi, Zhourong Chen, Hao Wang, Dit-yan Yeung, Wai-kin Wong, Wang-chun Woo

The goal of precipitation nowcasting is to predict the future rainfall intensity in a local region over a relatively short period of time.

Ranked #1 on Video Prediction on KTH (Cond metric)

BIG-bench Machine Learning Video Prediction +1

1,811

SongRewriter: A Chinese Song Rewriting System with Controllable Content and Rhyme Scheme

1 code implementation • 28 Nov 2022 • Yusen Sun, Liangyou Li, Qun Liu, Dit-yan Yeung

Although lyrics generation has achieved significant progress in recent years, it has limited practical applications because the generated lyrics cannot be performed without composing compatible melodies.

833

Deep Learning for Precipitation Nowcasting: A Benchmark and A New Model

4 code implementations • NeurIPS 2017 • Xingjian Shi, Zhihan Gao, Leonard Lausen, Hao Wang, Dit-yan Yeung, Wai-kin Wong, Wang-chun Woo

To address these problems, we propose both a new model and a benchmark for precipitation nowcasting.

Ranked #1 on Video Prediction on KTH (Cond metric)

Optical Flow Estimation Video Prediction

532

A Survey on Bayesian Deep Learning

1 code implementation • 6 Apr 2016 • Hao Wang, Dit-yan Yeung

The past decade has seen major advances in many perception tasks such as visual object recognition and speech recognition using deep learning models.

Object Recognition Recommendation Systems +3

489

Earthformer: Exploring Space-Time Transformers for Earth System Forecasting

1 code implementation • 12 Jul 2022 • Zhihan Gao, Xingjian Shi, Hao Wang, Yi Zhu, Yuyang Wang, Mu Li, Dit-yan Yeung

With the explosive growth of the spatiotemporal Earth observation data in the past decade, data-driven models that apply Deep Learning (DL) are demonstrating impressive potential for various Earth system forecasting tasks.

Ranked #1 on Earth Surface Forecasting on EarthNet2021 OOD Track

Earth Observation Earth Surface Forecasting +1

326

Addressing Two Problems in Deep Knowledge Tracing via Prediction-Consistent Regularization

2 code implementations • 6 Jun 2018 • Chun-kit Yeung, Dit-yan Yeung

In recent years, a recurrent neural network model called deep knowledge tracing (DKT) has been proposed to handle the knowledge tracing task and literature has shown that DKT generally outperforms traditional methods.

Knowledge Tracing Vocal Bursts Valence Prediction

109

Collaborative Deep Learning for Recommender Systems

1 code implementation • 10 Sep 2014 • Hao Wang, Naiyan Wang, Dit-yan Yeung

(CF-based) input and propose in this paper a hierarchical Bayesian model called collaborative deep learning (CDL), which jointly performs deep representation learning for the content information and collaborative filtering for the ratings (feedback) matrix.

Collaborative Filtering Recommendation Systems +1

Multilingual and Multi-Aspect Hate Speech Analysis

1 code implementation • IJCNLP 2019 • Nedjma Ousidhoum, Zizheng Lin, Hongming Zhang, Yangqiu Song, Dit-yan Yeung

Current research on hate speech analysis is typically oriented towards monolingual and single classification tasks.

Classification General Classification +1

Detection Recovery in Online Multi-Object Tracking with Sparse Graph Tracker

1 code implementation • 2 May 2022 • Jeongseok Hyun, Myunggu Kang, Dongyoon Wee, Dit-yan Yeung

The strong edge features allow SGT to track targets with tracking candidates selected by top-K scored detections with large K. As a result, even low-scored detections can be tracked, and the missed detections are also recovered.

Ranked #2 on Multi-Object Tracking on HiEve

motion prediction Multi-Object Tracking +3

GaAN: Gated Attention Networks for Learning on Large and Spatiotemporal Graphs

1 code implementation • 20 Mar 2018 • Jiani Zhang, Xingjian Shi, Junyuan Xie, Hao Ma, Irwin King, Dit-yan Yeung

We propose a new network architecture, Gated Attention Networks (GaAN), for learning on graphs.

Ranked #1 on Node Property Prediction on ogbn-proteins

General Classification Node Classification

MODALS: Modality-agnostic Automated Data Augmentation in the Latent Space

1 code implementation • ICLR 2021 • Tsz-Him Cheung, Dit-yan Yeung

Data augmentation is an efficient way to expand a training dataset by creating additional artificial data.

Data Augmentation Time Series +1

Stereo Waterdrop Removal with Row-wise Dilated Attention

1 code implementation • 7 Aug 2021 • Zifan Shi, Na Fan, Dit-yan Yeung, Qifeng Chen

Thus, we propose a learning-based model for waterdrop removal with stereo images.

Autonomous Driving

MultiSiam: Self-supervised Multi-instance Siamese Representation Learning for Autonomous Driving

1 code implementation • ICCV 2021 • Kai Chen, Lanqing Hong, Hang Xu, Zhenguo Li, Dit-yan Yeung

By pre-training on SODA10M, a large-scale autonomous driving dataset, MultiSiam exceeds the ImageNet pre-trained MoCo-v2, demonstrating the potential of domain-specific pre-training.

Autonomous Driving Image Clustering +2

Mixed Autoencoder for Self-supervised Visual Representation Learning

1 code implementation • CVPR 2023 • Kai Chen, Zhili Liu, Lanqing Hong, Hang Xu, Zhenguo Li, Dit-yan Yeung

Specifically, our MixedAE outperforms MAE by +0.3% accuracy, +1.7 mIoU and +0.9 AP on ImageNet-1K, ADE20K and COCO respectively with a standard ViT-Base.

Contrastive Learning Data Augmentation +1

Dynamic Key-Value Memory Networks for Knowledge Tracing

1 code implementation • 24 Nov 2016 • Jiani Zhang, Xingjian Shi, Irwin King, Dit-yan Yeung

Knowledge Tracing (KT) is a task of tracing evolving knowledge state of students with respect to one or more concepts as they engage in a sequence of learning activities.

Knowledge Tracing

Knowledge Query Network for Knowledge Tracing

1 code implementation • International Conference on Learning Analytics & Knowledge 2019 • Jinseok Lee, Dit-yan Yeung

This involves abstract concepts of students' states of knowledge and the interactions between those states and skills.

Knowledge Tracing

Incorporating Features Learned by an Enhanced Deep Knowledge Tracing Model for STEM/Non-STEM Job Prediction

1 code implementation • 6 Jun 2018 • Chun-kit Yeung, Zizheng Lin, Kai Yang, Dit-yan Yeung

The 2017 ASSISTments Data Mining competition aims to use data from a longitudinal study for predicting a brand-new outcome of students which had never been studied before by the educational data mining research community.

Job Prediction Knowledge Tracing

AdaAug: Learning Class- and Instance-adaptive Data Augmentation Policies

1 code implementation • ICLR 2022 • Tsz-Him Cheung, Dit-yan Yeung

However, the augmentation policies found are not adaptive to the dataset used, hindering the effectiveness of these AutoDA methods.

Data Augmentation

Natural-Parameter Networks: A Class of Probabilistic Neural Networks

1 code implementation • NeurIPS 2016 • Hao Wang, Xingjian Shi, Dit-yan Yeung

Another shortcoming of NN is the lack of flexibility to customize different distributions for the weights and neurons according to the data, as is often done in probabilistic graphical models.

Decision Making Under Uncertainty Link Prediction

Probing Toxic Content in Large Pre-Trained Language Models

1 code implementation • ACL 2021 • Nedjma Ousidhoum, Xinran Zhao, Tianqing Fang, Yangqiu Song, Dit-yan Yeung

Large pre-trained language models (PTLMs) have been shown to carry biases towards different social groups which leads to the reproduction of stereotypical and toxic content by major NLP systems.

Probing Language Models Sentence

Comparative Evaluation of Label-Agnostic Selection Bias in Multilingual Hate Speech Datasets

1 code implementation • EMNLP 2020 • Nedjma Ousidhoum, Yangqiu Song, Dit-yan Yeung

Work on bias in hate speech typically aims to improve classification performance while relatively overlooking the quality of the data.

Hate Speech Detection Selection bias +3

Towards General Error Diagnosis via Behavioral Testing in Machine Translation

1 code implementation • 20 Oct 2023 • Junjie Wu, Lemao Liu, Dit-yan Yeung

Behavioral testing offers a crucial means of diagnosing linguistic errors and assessing capabilities of NLP models.

Machine Translation Translation

Learning Unmanned Aerial Vehicle Control for Autonomous Target Following

no code implementations • 24 Sep 2017 • Siyi Li, Tianbo Liu, Chi Zhang, Dit-yan Yeung, Shaojie Shen

While deep reinforcement learning (RL) methods have achieved unprecedented successes in a range of challenging problems, their applicability has been mainly limited to simulation or game domains due to the high sample complexity of the trial-and-error learning process.

reinforcement-learning Reinforcement Learning (RL)

Temporal Dynamic Graph LSTM for Action-driven Video Object Detection

no code implementations • ICCV 2017 • Yuan Yuan, Xiaodan Liang, Xiaolong Wang, Dit-yan Yeung, Abhinav Gupta

A common issue, however, is that objects of interest that are not involved in human actions are often absent from global action descriptions, a problem known as "missing label".

Ranked #3 on Weakly Supervised Object Detection on Charades

Object object-detection +3

Spatiotemporal Modeling for Crowd Counting in Videos

no code implementations • ICCV 2017 • Feng Xiong, Xingjian Shi, Dit-yan Yeung

To exploit the otherwise very useful temporal information in video sequences, we propose a variant of a recent deep learning model called convolutional LSTM (ConvLSTM) for crowd counting.

Crowd Counting Transfer Learning

Fine-Grained Categorization via CNN-Based Automatic Extraction and Integration of Object-Level and Part-Level Features

no code implementations • 22 Jun 2017 • Ting Sun, Lin Sun, Dit-yan Yeung

Fine-grained categorization can benefit from part-based features which reveal subtle visual differences between object categories.

Object

ZM-Net: Real-time Zero-shot Image Manipulation Network

no code implementations • 21 Mar 2017 • Hao Wang, Xiaodan Liang, Hao Zhang, Dit-yan Yeung, Eric P. Xing

We cast this problem as manipulating an input image according to a parametric model whose key parameters can be conditionally generated from any guiding signal (even unseen ones).

Colorization Descriptive +2

Collaborative Recurrent Autoencoder: Recommend while Learning to Fill in the Blanks

no code implementations • NeurIPS 2016 • Hao Wang, Xingjian Shi, Dit-yan Yeung

To address this problem, we develop a collaborative recurrent autoencoder (CRAE) which is a denoising recurrent autoencoder (DRAE) that models the generation of content sequences in the collaborative filtering (CF) setting.

Collaborative Filtering Denoising +1

Sparse Boltzmann Machines with Structure Learning as Applied to Text Analysis

no code implementations • 17 Sep 2016 • Zhourong Chen, Nevin L. Zhang, Dit-yan Yeung, Peixian Chen

We are interested in exploring the possibility and benefits of structure learning for deep models.

Towards Bayesian Deep Learning: A Framework and Some Existing Methods

no code implementations • 24 Aug 2016 • Hao Wang, Dit-yan Yeung

While perception tasks such as visual object recognition and text understanding play an important role in human intelligence, the subsequent tasks that involve inference, reasoning and planning require an even higher level of intelligence.

Object Recognition Recommendation Systems +1

Human Action Recognition using Factorized Spatio-Temporal Convolutional Networks

no code implementations • ICCV 2015 • Lin Sun, Kui Jia, Dit-yan Yeung, Bertram E. Shi

Human actions in video sequences are three-dimensional (3D) spatio-temporal signals characterizing both the visual appearance and motion dynamics of the involved humans and objects.

Action Recognition Image Classification +1

Understanding and Diagnosing Visual Tracking Systems

no code implementations • ICCV 2015 • Naiyan Wang, Jianping Shi, Dit-yan Yeung, Jiaya Jia

Surprisingly, our findings are discrepant with some common beliefs in the visual tracking research community.

Visual Tracking

Transferring Rich Feature Hierarchies for Robust Visual Tracking

no code implementations • 19 Jan 2015 • Naiyan Wang, Siyi Li, Abhinav Gupta, Dit-yan Yeung

To fit the characteristics of object tracking, we first pre-train the CNN to recognize what is an object, and then propose to generate a probability map instead of producing a simple class label.

Image Classification Object +4

Machine Learning for Spatiotemporal Sequence Forecasting: A Survey

no code implementations • 21 Aug 2018 • Xingjian Shi, Dit-yan Yeung

Forecasting the multi-step future of these spatiotemporal systems based on the past observations, or, Spatiotemporal Sequence Forecasting (STSF), is a significant and challenging problem.

BIG-bench Machine Learning Trajectory Forecasting

Point-cloud-based place recognition using CNN feature extraction

no code implementations • 23 Oct 2018 • Ting Sun, Ming Liu, Haoyang Ye, Dit-yan Yeung

This paper proposes a novel point-cloud-based place recognition system that adopts a deep learning approach for feature extraction.

Semi-Semantic Line-Cluster Assisted Monocular SLAM for Indoor Environments

no code implementations • 5 Nov 2018 • Ting Sun, Dezhen Song, Dit-yan Yeung, Ming Liu

In the back end, we optimize the map imposing the constraint that the line segments of the same cluster should be the same.

Simultaneous Localization and Mapping

Effective Feature Learning with Unsupervised Learning for Improving the Predictive Models in Massive Open Online Courses

no code implementations • 12 Dec 2018 • Mucong Ding, Kai Yang, Dit-yan Yeung, Ting-Chuen Pong

A major challenge that has to be addressed when building such models is to design handcrafted features that are effective for the prediction task at hand.

Learning a Deep Compact Image Representation for Visual Tracking

no code implementations • NeurIPS 2013 • Naiyan Wang, Dit-yan Yeung

In this paper, we study the challenging problem of tracking the trajectory of a moving object in a video with possibly very complex background.

Denoising General Classification +3

Co-Regularized Hashing for Multimodal Data

no code implementations • NeurIPS 2012 • Yi Zhen, Dit-yan Yeung

Hashing-based methods provide a very promising approach to large-scale similarity search.

Probabilistic Multi-Task Feature Selection

no code implementations • NeurIPS 2010 • Yu Zhang, Dit-yan Yeung, Qian Xu

In this paper, we unify the $l_{1, 2}$ and $l_{1,\infty}$ norms by considering a family of $l_{1, q}$ norms for $1 < q\le\infty$ and study the problem of determining the most appropriate sparsity enforcing norm to use in the context of multi-task feature selection.

feature selection Multi-Task Learning

Worst-Case Linear Discriminant Analysis

no code implementations • NeurIPS 2010 • Yu Zhang, Dit-yan Yeung

In this paper, we first analyze the scatter measures used in the conventional linear discriminant analysis (LDA) model and note that the formulation is based on the average-case view.

Dimensionality Reduction Metric Learning

Probabilistic Relational PCA

no code implementations • NeurIPS 2009 • Wu-Jun Li, Dit-yan Yeung, Zhihua Zhang

assumption is unreasonable for relational data.

Dimensionality Reduction

Posterior Consistency of the Silverman g-prior in Bayesian Model Choice

no code implementations • NeurIPS 2008 • Zhihua Zhang, Michael I. Jordan, Dit-yan Yeung

The duality between regularization and prior leads to interpreting regularization methods in terms of maximum a posteriori estimation and has motivated Bayesian interpretations of kernel methods.

Bayesian Adaptive Matrix Factorization With Automatic Model Selection

no code implementations • CVPR 2015 • Peixian Chen, Naiyan Wang, Nevin L. Zhang, Dit-yan Yeung

Low-rank matrix factorization has long been recognized as a fundamental problem in many computer vision applications.

Model Selection

DevNet: A Deep Event Network for Multimedia Event Detection and Evidence Recounting

no code implementations • CVPR 2015 • Chuang Gan, Naiyan Wang, Yi Yang, Dit-yan Yeung, Alex G. Hauptmann

Taking key frames of videos as input, we first detect the event of interest at the video level by aggregating the CNN features of the key frames.

Action Recognition Event Detection +2

Fully Using Classifiers for Weakly Supervised Semantic Segmentation with Modified Cues

no code implementations • 3 Apr 2019 • Ting Sun, Lei Tai, Zhihan Gao, Ming Liu, Dit-yan Yeung

This paper proposes a novel weakly-supervised semantic segmentation method using image-level label only.

Segmentation Weakly supervised Semantic Segmentation +1

Marginalized Average Attentional Network for Weakly-Supervised Learning

no code implementations • ICLR 2019 • Yuan Yuan, Yueming Lyu, Xi Shen, Ivor W. Tsang, Dit-yan Yeung

The MAAN employs a novel marginalized average aggregation (MAA) module and learns a set of latent discriminative probabilities in an end-to-end fashion.

Ranked #11 on Weakly Supervised Action Localization on ActivityNet-1.3 (mAP@0.5 metric)

Weakly Supervised Action Localization Weakly-supervised Learning +2

Movable-Object-Aware Visual SLAM via Weakly Supervised Semantic Segmentation

no code implementations • 9 Jun 2019 • Ting Sun, Yuxiang Sun, Ming Liu, Dit-yan Yeung

Moving objects can greatly jeopardize the performance of a visual simultaneous localization and mapping (vSLAM) system which relies on the static-world assumption.

Segmentation Simultaneous Localization and Mapping +2

Knowledge Query Network: How Knowledge Interacts with Skills

no code implementations • 3 Aug 2019 • Jinseok Lee, Dit-yan Yeung

This involves abstract concepts of students' states of knowledge and the interactions between those states and skills.

Clustering Knowledge Tracing

Robust path-based spectral clustering

no code implementations • 1 Jan 2018 • Hong Chang, Dit-yan Yeung

In this paper, based on M-estimation from robust statistics, we develop a robust path-based spectral clustering method by defining a robust path-based similarity measure for spectral clustering under both unsupervised and semi-supervised settings.

Clustering Image Segmentation +2

Trading Quality for Efficiency of Graph Partitioning: An Inductive Method across Graphs

no code implementations • 29 Sep 2021 • Meng Qin, Chaorui Zhang, Bo Bai, Gong Zhang, Dit-yan Yeung

IGP is also a generic framework that can capture the permutation invariant partitioning ground-truth of historical snapshots in the offline training and tackle the online GP on graphs with non-fixed number of nodes and clusters.

Combinatorial Optimization graph partitioning

Aug-ILA: More Transferable Intermediate Level Attacks with Augmented References

no code implementations • 29 Sep 2021 • Chiu Wai Yan, Dit-yan Yeung

We start by looking into the effect of common image augmentation techniques and exploring novel augmentation with the aid of adversarial perturbations.

Adversarial Attack Image Augmentation

3D-Aware Indoor Scene Synthesis with Depth Priors

no code implementations • 17 Feb 2022 • Zifan Shi, Yujun Shen, Jiapeng Zhu, Dit-yan Yeung, Qifeng Chen

In this way, the discriminator can take the spatial arrangement into account and advise the generator to learn an appropriate depth condition.

3D-Aware Image Synthesis Indoor Scene Synthesis

CODA: A Real-World Road Corner Case Dataset for Object Detection in Autonomous Driving

no code implementations • 15 Mar 2022 • Kaican Li, Kai Chen, Haoyu Wang, Lanqing Hong, Chaoqiang Ye, Jianhua Han, Yukuai Chen, Wei Zhang, Chunjing Xu, Dit-yan Yeung, Xiaodan Liang, Zhenguo Li, Hang Xu

One main reason that impedes the development of truly reliable self-driving systems is the lack of public datasets for evaluating the performance of object detectors on corner cases.

Autonomous Driving Object +2

Controlled Text Generation Using Dictionary Prior in Variational Autoencoders

no code implementations • Findings (ACL) 2022 • Xianghong Fang, Jian Li, Lifeng Shang, Xin Jiang, Qun Liu, Dit-yan Yeung

While variational autoencoders (VAEs) have been widely applied in text generation tasks, they are troubled by two challenges: insufficient representation capacity and poor controllability.

Contrastive Learning Language Modelling +2

Trading off Quality for Efficiency of Community Detection: An Inductive Method across Graphs

no code implementations • 29 Sep 2022 • Meng Qin, Chaorui Zhang, Bo Bai, Gong Zhang, Dit-yan Yeung

The trained model is then directly generalized to new unseen graphs for online CD without additional optimization, where a better trade-off between quality and efficiency can be achieved.

Combinatorial Optimization Community Detection

Improving 3D-aware Image Synthesis with A Geometry-aware Discriminator

no code implementations • 30 Sep 2022 • Zifan Shi, Yinghao Xu, Yujun Shen, Deli Zhao, Qifeng Chen, Dit-yan Yeung

We argue that, considering the two-player game in the formulation of GANs, only making the generator 3D-aware is not enough.

3D-Aware Image Synthesis domain classification +2

Learning 3D-aware Image Synthesis with Unknown Pose Distribution

no code implementations • CVPR 2023 • Zifan Shi, Yujun Shen, Yinghao Xu, Sida Peng, Yiyi Liao, Sheng Guo, Qifeng Chen, Dit-yan Yeung

Existing methods for 3D-aware image synthesis largely depend on the 3D pose distribution pre-estimated on the training set.

3D-Aware Image Synthesis

Deep COVID-19 Forecasting for Multiple States with Data Augmentation

no code implementations • 2 Feb 2023 • Chung Yan Fong, Dit-yan Yeung

As such, it has a two-fold advantage: 1) more actual observations can be used for training, and 2) the model can be validated on data which has distribution closer to the expected situation.

Data Augmentation Time Series +1

CLIP$^2$: Contrastive Language-Image-Point Pretraining from Real-World Point Cloud Data

no code implementations • 22 Mar 2023 • Yihan Zeng, Chenhan Jiang, Jiageng Mao, Jianhua Han, Chaoqiang Ye, Qingqiu Huang, Dit-yan Yeung, Zhen Yang, Xiaodan Liang, Hang Xu

Contrastive Language-Image Pre-training, benefiting from large-scale unlabeled text-image pairs, has demonstrated great performance in open-world vision understanding tasks.

Ranked #3 on Zero-shot 3D Point Cloud Classification on ScanNetV2

Zero-shot 3D Point Cloud Classification

CLIP2: Contrastive Language-Image-Point Pretraining From Real-World Point Cloud Data

no code implementations • CVPR 2023 • Yihan Zeng, Chenhan Jiang, Jiageng Mao, Jianhua Han, Chaoqiang Ye, Qingqiu Huang, Dit-yan Yeung, Zhen Yang, Xiaodan Liang, Hang Xu

Contrastive Language-Image Pre-training, benefiting from large-scale unlabeled text-image pairs, has demonstrated great performance in open-world vision understanding tasks.

A Convex Formulation for Learning Task Relationships in Multi-Task Learning

no code implementations • 15 Mar 2012 • Yu Zhang, Dit-yan Yeung

In this paper, we propose a regularization formulation for learning the relationships between tasks in multi-task learning.

Multi-Task Learning

GeoDiffusion: Text-Prompted Geometric Control for Object Detection Data Generation

no code implementations • 7 Jun 2023 • Kai Chen, Enze Xie, Zhe Chen, Yibo Wang, Lanqing Hong, Zhenguo Li, Dit-yan Yeung

Diffusion models have attracted significant attention due to the remarkable ability to create content and generate data for tasks like image classification.

Image Classification Layout-to-Image Generation +2

SCAT: Robust Self-supervised Contrastive Learning via Adversarial Training for Text Classification

no code implementations • 4 Jul 2023 • Junjie Wu, Dit-yan Yeung

Specifically, SCAT modifies random augmentations of the data in a fully label-free manner to generate adversarial examples.

Contrastive Learning text-classification +1

SVQNet: Sparse Voxel-Adjacent Query Network for 4D Spatio-Temporal LiDAR Semantic Segmentation

no code implementations • ICCV 2023 • Xuechao Chen, Shuangjie Xu, Xiaoyi Zou, Tongyi Cao, Dit-yan Yeung, Lu Fang

To take full advantage of the historical frames efficiently, we shunt the historical points into two groups with reference to the current points.

Autonomous Driving LIDAR Semantic Segmentation +1

MagicDrive: Street View Generation with Diverse 3D Geometry Control

no code implementations • 4 Oct 2023 • Ruiyuan Gao, Kai Chen, Enze Xie, Lanqing Hong, Zhenguo Li, Dit-yan Yeung, Qiang Xu

Recent advancements in diffusion models have significantly enhanced the data synthesis with 2D control.

3D Object Detection Object +1

Implicit Concept Removal of Diffusion Models

no code implementations • 9 Oct 2023 • Zhili Liu, Kai Chen, Yifan Zhang, Jianhua Han, Lanqing Hong, Hang Xu, Zhenguo Li, Dit-yan Yeung, James Kwok

To address this, we utilize the intrinsic geometric characteristics of implicit concepts and present the Geom-Erasing, a novel concept removal method based on geometric-driven control.

Gaining Wisdom from Setbacks: Aligning Large Language Models via Mistake Analysis

no code implementations • 16 Oct 2023 • Kai Chen, Chunwei Wang, Kuo Yang, Jianhua Han, Lanqing Hong, Fei Mi, Hang Xu, Zhengying Liu, Wenyong Huang, Zhenguo Li, Dit-yan Yeung, Lifeng Shang, Xin Jiang, Qun Liu

The rapid development of large language models (LLMs) has not only provided numerous opportunities but also presented significant challenges.

Instruction Following

Gaussian Shell Maps for Efficient 3D Human Generation

no code implementations • 29 Nov 2023 • Rameen Abdal, Wang Yifan, Zifan Shi, Yinghao Xu, Ryan Po, Zhengfei Kuang, Qifeng Chen, Dit-yan Yeung, Gordon Wetzstein

Instead of rasterizing the shells directly, we sample 3D Gaussians on the shells whose attributes are encoded in the texture features.

TrackDiffusion: Tracklet-Conditioned Video Generation via Diffusion Models

no code implementations • 1 Dec 2023 • Pengxiang Li, Kai Chen, Zhili Liu, Ruiyuan Gao, Lanqing Hong, Guo Zhou, Hua Yao, Dit-yan Yeung, Huchuan Lu, Xu Jia

Despite remarkable achievements in video synthesis, achieving granular control over complex dynamics, such as nuanced movement among multiple interacting objects, still presents a significant hurdle for dynamic world modeling, compounded by the necessity to manage appearance and disappearance, drastic scale changes, and ensure consistency for instances across frames.

Image Classification Multi-Object Tracking +4

Mixture of Cluster-conditional LoRA Experts for Vision-language Instruction Tuning

no code implementations • 19 Dec 2023 • Yunhao Gou, Zhili Liu, Kai Chen, Lanqing Hong, Hang Xu, Aoxue Li, Dit-yan Yeung, James T. Kwok, Yu Zhang

Instruction tuning of Large Vision-language Models (LVLMs) has revolutionized the development of versatile models with zero-shot generalization across a wide range of downstream vision-language tasks.

Instruction Following Zero-shot Generalization

Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation

no code implementations • 14 Mar 2024 • Yunhao Gou, Kai Chen, Zhili Liu, Lanqing Hong, Hang Xu, Zhenguo Li, Dit-yan Yeung, James T. Kwok, Yu Zhang

Multimodal large language models (MLLMs) have shown impressive reasoning abilities, which, however, are also more vulnerable to jailbreak attacks than their LLM predecessors.

Optical Character Recognition (OCR)

TransformMix: Learning Transformation and Mixing Strategies from Data

no code implementations • 19 Mar 2024 • Tsz-Him Cheung, Dit-yan Yeung

Sample-mixing is a popular data augmentation approach that creates additional data by combining existing samples.

Data Augmentation Knowledge Distillation +3

DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception

no code implementations • 20 Mar 2024 • Yibo Wang, Ruiyuan Gao, Kai Chen, Kaiqiang Zhou, Yingjie Cai, Lanqing Hong, Zhenguo Li, Lihui Jiang, Dit-yan Yeung, Qiang Xu, Kai Zhang

Furthermore, image syntheses from DetDiffusion can effectively augment training data, significantly enhancing downstream detection performance.

Attribute Data Augmentation +3
