Search Results for author: Lijun Yu

Found 16 papers, 3 papers with code

Towards Multi-Task Multi-Modal Models: A Video Generative Perspective

no code implementations · 26 May 2024 · Lijun Yu

This thesis chronicles our endeavor to build multi-task models for generating videos and other modalities under diverse conditions, as well as for understanding and compression applications.

A Versatile Diffusion Transformer with Mixture of Noise Levels for Audiovisual Generation

no code implementations · 22 May 2024 · Gwanghyun Kim, Alonso Martinez, Yu-Chuan Su, Brendan Jou, José Lezama, Agrim Gupta, Lijun Yu, Lu Jiang, Aren Jansen, Jacob Walker, Krishna Somandepalli

Here, we propose a novel training approach to effectively learn arbitrary conditional distributions in the audiovisual space. Our key contribution lies in how we parameterize the diffusion timestep in the forward diffusion process.

Efficient LLM Jailbreak via Adaptive Dense-to-sparse Constrained Optimization

no code implementations · 15 May 2024 · Kai Hu, Weichen Yu, Tianjun Yao, Xiang Li, Wenhe Liu, Lijun Yu, Yining Li, Kai Chen, Zhiqiang Shen, Matt Fredrikson

Our approach relaxes the discrete jailbreak optimization into a continuous optimization and progressively increases the sparsity of the optimizing vectors.

Improving and Unifying Discrete&Continuous-time Discrete Denoising Diffusion

2 code implementations · 6 Feb 2024 · Lingxiao Zhao, Xueying Ding, Lijun Yu, Leman Akoglu

Discrete diffusion models have seen a surge of attention with applications on naturally discrete data such as language and graphs.


SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs

no code implementations · NeurIPS 2023 · Lijun Yu, Yong Cheng, Zhiruo Wang, Vivek Kumar, Wolfgang Macherey, Yanping Huang, David A. Ross, Irfan Essa, Yonatan Bisk, Ming-Hsuan Yang, Kevin Murphy, Alexander G. Hauptmann, Lu Jiang

In this work, we introduce Semantic Pyramid AutoEncoder (SPAE) for enabling frozen LLMs to perform both understanding and generation tasks involving non-linguistic modalities such as images or videos.

In-Context Learning, multimodal generation

DocumentNet: Bridging the Data Gap in Document Pre-Training

no code implementations · 15 Jun 2023 · Lijun Yu, Jin Miao, Xiaoyu Sun, Jiayi Chen, Alexander G. Hauptmann, Hanjun Dai, Wei Wei

Document understanding tasks, in particular, Visually-rich Document Entity Retrieval (VDER), have gained significant attention in recent years thanks to their broad applications in enterprise AI.

document understanding, Entity Retrieval

Score-based Continuous-time Discrete Diffusion Models

no code implementations · 30 Nov 2022 · Haoran Sun, Lijun Yu, Bo Dai, Dale Schuurmans, Hanjun Dai

Score-based modeling through stochastic differential equations (SDEs) has provided a new perspective on diffusion models, and demonstrated superior performance on continuous data.

Argus++: Robust Real-time Activity Detection for Unconstrained Video Streams with Overlapping Cube Proposals

no code implementations · 14 Jan 2022 · Lijun Yu, Yijun Qian, Wenhe Liu, Alexander G. Hauptmann

Activity detection is one of the attractive computer vision tasks to exploit the video streams captured by widely installed cameras.

Action Detection, Activity Detection

Training-free Monocular 3D Event Detection System for Traffic Surveillance

no code implementations · 1 Feb 2020 · Lijun Yu, Peng Chen, Wenhe Liu, Guoliang Kang, Alexander G. Hauptmann

To deal with the aforementioned problems, in this paper, we propose a training-free monocular 3D event detection system for traffic surveillance.

Event Detection

Traffic Danger Recognition With Surveillance Cameras Without Training Data

no code implementations · 29 Nov 2018 · Lijun Yu, Dawei Zhang, Xiangqun Chen, Alexander Hauptmann

Therefore, we developed a model to predict and identify car crashes from surveillance cameras based on a 3D reconstruction of the road plane and prediction of trajectories.

3D Reconstruction, Position

MOBA-Slice: A Time Slice Based Evaluation Framework of Relative Advantage between Teams in MOBA Games

no code implementations · 22 Jul 2018 · Lijun Yu, Dawei Zhang, Xiangqun Chen, Xing Xie

In this paper, we introduce MOBA-Slice, a time slice based evaluation framework of relative advantage between teams in MOBA games.
