Search Results for author: Lu Jiang

Found 62 papers, 34 papers with code

Self-Paced Learning with Diversity

no code implementations • NeurIPS 2014 • Lu Jiang, Deyu Meng, Shoou-I Yu, Zhenzhong Lan, Shiguang Shan, Alexander Hauptmann

Self-paced learning (SPL) is a recently proposed learning regime inspired by the learning process of humans and animals that gradually incorporates easy to more complex samples into training.

What Objective Does Self-paced Learning Indeed Optimize?

no code implementations • 19 Nov 2015 • Deyu Meng, Qian Zhao, Lu Jiang

Self-paced learning (SPL) is a recently raised methodology designed through simulating the learning principle of humans/animals.

A Self-Paced Multiple-Instance Learning Framework for Co-Saliency Detection

no code implementations • ICCV 2015 • Dingwen Zhang, Deyu Meng, Chao Li, Lu Jiang, Qian Zhao, Junwei Han

As an interesting and emerging topic, co-saliency detection aims at simultaneously extracting common salient objects in a group of images.

Co-Salient Object Detection Multiple Instance Learning +1

Strategies for Searching Video Content with Text Queries or Video Examples

no code implementations • 17 Jun 2016 • Shoou-I Yu, Yi Yang, Zhongwen Xu, Shicheng Xu, Deyu Meng, Zexi Mao, Zhigang Ma, Ming Lin, Xuanchong Li, Huan Li, Zhenzhong Lan, Lu Jiang, Alexander G. Hauptmann, Chuang Gan, Xingzhong Du, Xiaojun Chang

The large number of user-generated videos uploaded to the Internet every day has led to many commercial video search engines, which mainly rely on text metadata for search.

Event Detection Retrieval +1

Exploiting Multi-modal Curriculum in Noisy Web Data for Large-scale Concept Learning

1 code implementation • 16 Jul 2016 • Junwei Liang, Lu Jiang, Deyu Meng, Alexander Hauptmann

Learning video concept detectors automatically from the big but noisy web data with no additional manual annotations is a novel but challenging area in the multimedia and the machine learning community.

BIG-bench Machine Learning

Self-paced Learning for Weakly Supervised Evidence Discovery in Multimedia Event Search

no code implementations • 12 Aug 2016 • Mengyi Liu, Lu Jiang, Shiguang Shan, Alexander G. Hauptmann

Multimedia event detection has been receiving increasing attention in recent years.

Event Detection

Video Representation Learning and Latent Concept Mining for Large-scale Multi-label Video Classification

no code implementations • 5 Jul 2017 • Po-Yao Huang, Ye Yuan, Zhenzhong Lan, Lu Jiang, Alexander G. Hauptmann

We report on CMU Informedia Lab's system used in Google's YouTube 8 Million Video Understanding Challenge.

Attribute General Classification +3

MemexQA: Visual Memex Question Answering

1 code implementation • 4 Aug 2017 • Lu Jiang, Junwei Liang, Liangliang Cao, Yannis Kalantidis, Sachin Farfade, Alexander Hauptmann

This paper proposes a new task, MemexQA: given a collection of photos or videos from a user, the goal is to automatically answer questions that help users recover their memory about events captured in the collection.

Memex Question Answering Question Answering +1

Graph Distillation for Action Detection with Privileged Modalities

1 code implementation • ECCV 2018 • Zelun Luo, Jun-Ting Hsieh, Lu Jiang, Juan Carlos Niebles, Li Fei-Fei

We propose a technique that tackles action detection in multimodal videos under a realistic and challenging condition in which only limited training data and partially observed modalities are available.

Action Classification Action Detection +1

MentorNet: Learning Data-Driven Curriculum for Very Deep Neural Networks on Corrupted Labels

1 code implementation • ICML 2018 • Lu Jiang, Zhengyuan Zhou, Thomas Leung, Li-Jia Li, Li Fei-Fei

Recent deep networks are capable of memorizing the entire data even when the labels are completely random.

Ranked #16 on Image Classification on WebVision-1000

Image Classification

320

Decoupled Novel Object Captioner

1 code implementation • 11 Apr 2018 • Yu Wu, Linchao Zhu, Lu Jiang, Yi Yang

Thus, the sequence model can be decoupled from the novel object descriptions.

Image Captioning Novel Concepts +2

Focal Visual-Text Attention for Visual Question Answering

2 code implementations • CVPR 2018 • Junwei Liang, Lu Jiang, Liangliang Cao, Li-Jia Li, Alexander Hauptmann

Recent insights on language and vision with neural networks have been successfully applied to simple single-image visual question answering.

Ranked #1 on Memex Question Answering on MemexQA

Memex Question Answering Question Answering +1

Focal Visual-Text Attention for Memex Question Answering

1 code implementation • IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2018 • Junwei Liang, Lu Jiang, Liangliang Cao, Yannis Kalantidis, Li-Jia Li, Alexander Hauptmann

In addition to a text answer, a few grounding photos are also given to justify the answer.

Ranked #1 on Memex Question Answering on MemexQA

Memex Question Answering Question Answering +1

Composing Text and Image for Image Retrieval - An Empirical Odyssey

4 code implementations • CVPR 2019 • Nam Vo, Lu Jiang, Chen Sun, Kevin Murphy, Li-Jia Li, Li Fei-Fei, James Hays

In this paper, we study the task of image retrieval, where the input query is specified in the form of an image plus some text that describes desired modifications to the input image.

Ranked #2 on Image Retrieval with Multi-Modal Query on MIT-States

Image Retrieval Image Retrieval with Multi-Modal Query +1

294

Contrastive Adaptation Network for Unsupervised Domain Adaptation

2 code implementations • CVPR 2019 • Guoliang Kang, Lu Jiang, Yi Yang, Alexander G. Hauptmann

Unsupervised Domain Adaptation (UDA) makes predictions for the target domain data while manual annotations are only available in the source domain.

Ranked #7 on Domain Adaptation on Office-31

Unsupervised Domain Adaptation

313

Peeking into the Future: Predicting Future Person Activities and Locations in Videos

2 code implementations • CVPR 2019 • Junwei Liang, Lu Jiang, Juan Carlos Niebles, Alexander Hauptmann, Li Fei-Fei

To facilitate the training, the network is learned with an auxiliary task of predicting future location in which the activity will happen.

Ranked #1 on Activity Prediction on ActEV

Future prediction Human motion prediction +4

350

Let's Transfer Transformations of Shared Semantic Representations

2 code implementations • 2 Mar 2019 • Nam Vo, Lu Jiang, James Hays

In this work we show how one can learn transformations with no training examples by learning them on another domain and then transferring them to the target domain.

Attribute Image Retrieval +3

Revisiting EmbodiedQA: A Simple Baseline and Beyond

no code implementations • 8 Apr 2019 • Yu Wu, Lu Jiang, Yi Yang

In this paper, we empirically study this problem and introduce 1) a simple yet effective baseline that achieves promising performance; 2) an easier and practical setting for EmbodiedQA where an agent has a chance to adapt the trained model to a new environment before it actually answers users' questions.

Embodied Question Answering Question Answering

Eidetic 3D LSTM: A Model for Video Prediction and Beyond

3 code implementations • ICLR 2019 • Yunbo Wang, Lu Jiang, Ming-Hsuan Yang, Li-Jia Li, Mingsheng Long, Li Fei-Fei

We first evaluate the E3D-LSTM network on widely-used future video prediction datasets and achieve the state-of-the-art performance.

Ranked #1 on Video Prediction on KTH (Cond metric)

Activity Recognition Video Prediction +1

572

State-aware Re-identification Feature for Multi-target Multi-camera Tracking

no code implementations • 4 Jun 2019 • Peng Li, Jiabin Zhang, Zheng Zhu, Yanwei Li, Lu Jiang, Guan Huang

Multi-target Multi-camera Tracking (MTMCT) aims to extract the trajectories from videos captured by a set of cameras.

Robust Neural Machine Translation with Doubly Adversarial Inputs

no code implementations • ACL 2019 • Yong Cheng, Lu Jiang, Wolfgang Macherey

Neural machine translation (NMT) often suffers from the vulnerability to noisy perturbations in the input.

Machine Translation NMT +1

Feature Partitioning for Efficient Multi-Task Architectures

no code implementations • ICLR 2020 • Alejandro Newell, Lu Jiang, Chong Wang, Li-Jia Li, Jia Deng

Multi-task learning holds the promise of less data, parameters, and time than training of separate models.

Multi-Task Learning

Synthetic vs Real: Deep Learning on Controlled Noise

no code implementations • 25 Sep 2019 • Lu Jiang, Di Huang, Weilong Yang

Performing controlled experiments on noisy data is essential in thoroughly understanding deep learning across a spectrum of noise levels.

Confident Learning: Estimating Uncertainty in Dataset Labels

4 code implementations • 31 Oct 2019 • Curtis G. Northcutt, Lu Jiang, Isaac L. Chuang

Confident learning (CL) is an alternative approach which focuses instead on label quality by characterizing and identifying label errors in datasets, based on the principles of pruning noisy data, counting with probabilistic thresholds to estimate noise, and ranking examples to train with confidence.

Learning with noisy labels Sentiment Analysis +1

8,645

Beyond Synthetic Noise: Deep Learning on Controlled Noisy Labels

2 code implementations • ICML 2020 • Lu Jiang, Di Huang, Mason Liu, Weilong Yang

Due to the lack of suitable datasets, previous research has only examined deep learning on controlled synthetic label noise, and real-world label noise has never been studied in a controlled setting.

Ranked #12 on Image Classification on WebVision-1000

Image Classification

32,803

The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction

1 code implementation • CVPR 2020 • Junwei Liang, Lu Jiang, Kevin Murphy, Ting Yu, Alexander Hauptmann

The first contribution is a new dataset, created in a realistic 3D simulator, which is based on real world trajectory data, and then extrapolated by human annotators to achieve different latent goals.

Ranked #1 on Multi-future Trajectory Prediction on ForkingPaths

Autonomous Driving Human motion prediction +5

248

Neural Design Network: Graphic Layout Generation with Constraints

no code implementations • ECCV 2020 • Hsin-Ying Lee, Lu Jiang, Irfan Essa, Phuong B Le, Haifeng Gong, Ming-Hsuan Yang, Weilong Yang

The first module predicts a graph with complete relations from a graph with user-specified relations.

Image Generation

Controllable and Progressive Image Extrapolation

no code implementations • 25 Dec 2019 • Yijun Li, Lu Jiang, Ming-Hsuan Yang

Image extrapolation aims at expanding the narrow field of view of a given image patch.

SimAug: Learning Robust Representations from 3D Simulation for Pedestrian Trajectory Prediction in Unseen Cameras

1 code implementation • 4 Apr 2020 • Junwei Liang, Lu Jiang, Alexander Hauptmann

We refer to our method as SimAug.

Ranked #2 on Trajectory Prediction on ActEV

Adversarial Attack Adversarial Defense +2

248

AdvAug: Robust Adversarial Augmentation for Neural Machine Translation

no code implementations • ACL 2020 • Yong Cheng, Lu Jiang, Wolfgang Macherey, Jacob Eisenstein

In this paper, we propose a new adversarial augmentation method for Neural Machine Translation (NMT).

Ranked #22 on Machine Translation on WMT2014 English-German

Data Augmentation Machine Translation +3

RetrieveGAN: Image Synthesis via Differentiable Patch Retrieval

no code implementations • ECCV 2020 • Hung-Yu Tseng, Hsin-Ying Lee, Lu Jiang, Ming-Hsuan Yang, Weilong Yang

Image generation from scene description is a cornerstone technique for controlled generation, which is beneficial to applications such as content creation and image editing.

Image Generation Retrieval

Text as Neural Operator: Image Manipulation by Text Instruction

1 code implementation • 11 Aug 2020 • Tianhao Zhang, Hung-Yu Tseng, Lu Jiang, Weilong Yang, Honglak Lee, Irfan Essa

In recent years, text-guided image manipulation has gained increasing attention in the multimedia and computer vision community.

Conditional Image Generation Image Captioning +2

Simplifying Reinforced Feature Selection via Restructured Choice Strategy of Single Agent

no code implementations • 19 Sep 2020 • Xiaosa Zhao, Kunpeng Liu, Wei Fan, Lu Jiang, Xiaowei Zhao, Minghao Yin, Yanjie Fu

To address the question, we develop a single-agent reinforced feature selection approach integrated with restructured choice strategy.

feature selection

Regularizing Generative Adversarial Networks under Limited Data

1 code implementation • CVPR 2021 • Hung-Yu Tseng, Lu Jiang, Ce Liu, Ming-Hsuan Yang, Weilong Yang

Recent years have witnessed the rapid progress of generative adversarial networks (GANs).

Ranked #1 on Image Generation on CIFAR-100

Data Augmentation Image Generation

163

Faster Meta Update Strategy for Noise-Robust Deep Learning

1 code implementation • 30 Apr 2021 • Youjiang Xu, Linchao Zhu, Lu Jiang, Yi Yang

It has been shown that deep neural networks are prone to overfitting on biased training data.

Ranked #1 on Image Classification on CIFAR-10, 40% Symmetric Noise

Learning with noisy labels Meta-Learning

Self-supervised and Supervised Joint Training for Resource-rich Machine Translation

no code implementations • 8 Jun 2021 • Yong Cheng, Wei Wang, Lu Jiang, Wolfgang Macherey

Self-supervised pre-training of text representations has been successfully applied to low-resource Neural Machine Translation (NMT).

Low-Resource Neural Machine Translation NMT +1

ViTGAN: Training GANs with Vision Transformers

3 code implementations • ICLR 2022 • Kwonjoon Lee, Huiwen Chang, Lu Jiang, Han Zhang, Zhuowen Tu, Ce Liu

Recently, Vision Transformers (ViTs) have shown competitive performance on image recognition while requiring fewer vision-specific inductive biases.

Ranked #68 on Image Generation on CIFAR-10

Image Generation

505

Discrete Representations Strengthen Vision Transformer Robustness

1 code implementation • ICLR 2022 • Chengzhi Mao, Lu Jiang, Mostafa Dehghani, Carl Vondrick, Rahul Sukthankar, Irfan Essa

Vision Transformer (ViT) is emerging as the state-of-the-art architecture for image recognition.

Ranked #3 on Domain Generalization on Stylized-ImageNet

Domain Generalization Image Classification

305

Pyramid Adversarial Training Improves ViT Performance

1 code implementation • CVPR 2022 • Charles Herrmann, Kyle Sargent, Lu Jiang, Ramin Zabih, Huiwen Chang, Ce Liu, Dilip Krishnan, Deqing Sun

In this work, we present pyramid adversarial training (PyramidAT), a simple and effective technique to improve ViT's overall performance.

Ranked #9 on Domain Generalization on ImageNet-C (using extra training data)

Adversarial Attack Data Augmentation +2

2,994

BLT: Bidirectional Layout Transformer for Controllable Layout Generation

1 code implementation • 9 Dec 2021 • Xiang Kong, Lu Jiang, Huiwen Chang, Han Zhang, Yuan Hao, Haifeng Gong, Irfan Essa

During inference, BLT first generates a draft layout from the input and then iteratively refines it into a high-quality layout by masking out low-confidence attributes.

MaskGIT: Masked Generative Image Transformer

6 code implementations • CVPR 2022 • Huiwen Chang, Han Zhang, Lu Jiang, Ce Liu, William T. Freeman

At inference time, the model begins with generating all tokens of an image simultaneously, and then refines the image iteratively conditioned on the previous generation.

Ranked #2 on Text-to-Image Generation on LHQC

Image Manipulation Image Outpainting +1

1,114

Improved Masked Image Generation with Token-Critic

1 code implementation • 9 Sep 2022 • José Lezama, Huiwen Chang, Lu Jiang, Irfan Essa

Given a masked-and-reconstructed real image, the Token-Critic model is trained to distinguish which visual tokens belong to the original image and which were sampled by the generative transformer.

Image Generation

712

Visual Prompt Tuning for Generative Transfer Learning

1 code implementation • CVPR 2023 • Kihyuk Sohn, Yuan Hao, José Lezama, Luisa Polania, Huiwen Chang, Han Zhang, Irfan Essa, Lu Jiang

We base our framework on state-of-the-art generative vision transformers that represent an image as a sequence of visual tokens to the autoregressive or non-autoregressive transformers.

Image Generation Transfer Learning +1

MAGVIT: Masked Generative Video Transformer

1 code implementation • CVPR 2023 • Lijun Yu, Yong Cheng, Kihyuk Sohn, José Lezama, Han Zhang, Huiwen Chang, Alexander G. Hauptmann, Ming-Hsuan Yang, Yuan Hao, Irfan Essa, Lu Jiang

We introduce the MAsked Generative VIdeo Transformer, MAGVIT, to tackle various video synthesis tasks with a single model.

Ranked #1 on Video Prediction on Something-Something V2

Multi-Task Learning Text-to-Video Generation +2

846

Streaming Traffic Flow Prediction Based on Continuous Reinforcement Learning

no code implementations • 24 Dec 2022 • Yanan Xiao, Minyu Liu, Zichen Zhang, Lu Jiang, Minghao Yin, Jianan Wang

We propose to formulate the problem as a continuous reinforcement learning task, where the agent is the next flow value predictor, the action is the next time-series flow value in the sensor, and the environment state is a dynamically fused representation of the sensor and transportation network.

reinforcement-learning Reinforcement Learning (RL) +2

Multi-View MOOC Quality Evaluation via Information-Aware Graph Representation Learning

no code implementations • 1 Jan 2023 • Lu Jiang, Yibin Wang, Jianan Wang, Pengyang Wang, Minghao Yin

To tackle these challenges, we formulate the problem as a course representation learning task and develop Information-aware Graph Representation Learning (IaGRL) for multi-view MOOC quality evaluation.

Graph Representation Learning

A Multi-Source Information Learning Framework for Airbnb Price Prediction

no code implementations • 1 Jan 2023 • Lu Jiang, Yuanhan Li, Na Luo, Jianan Wang, Qiao Ning

Thirdly, we use the points of interest (POI) around the rental house to generate a variety of spatial network graphs, and learn the embedding of the network to obtain the spatial feature embedding.

Muse: Text-To-Image Generation via Masked Generative Transformers

4 code implementations • 2 Jan 2023 • Huiwen Chang, Han Zhang, Jarred Barber, AJ Maschinot, Jose Lezama, Lu Jiang, Ming-Hsuan Yang, Kevin Murphy, William T. Freeman, Michael Rubinstein, Yuanzhen Li, Dilip Krishnan

Compared to pixel-space diffusion models, such as Imagen and DALL-E 2, Muse is significantly more efficient due to the use of discrete tokens and requiring fewer sampling iterations; compared to autoregressive models, such as Parti, Muse is more efficient due to the use of parallel decoding.

Ranked #1 on Text-to-Image Generation on MS-COCO (FID metric)

Language Modelling Large Language Model +1

814

Auditing Gender Presentation Differences in Text-to-Image Models

1 code implementation • 7 Feb 2023 • Yanzhe Zhang, Lu Jiang, Greg Turk, Diyi Yang

Text-to-image models, which can generate high-quality images based on textual input, have recently enabled various content-creation tools.

StyleDrop: Text-to-Image Generation in Any Style

3 code implementations • 1 Jun 2023 • Kihyuk Sohn, Nataniel Ruiz, Kimin Lee, Daniel Castro Chin, Irina Blok, Huiwen Chang, Jarred Barber, Lu Jiang, Glenn Entis, Yuanzhen Li, Yuan Hao, Irfan Essa, Michael Rubinstein, Dilip Krishnan

Pre-trained large text-to-image models synthesize impressive images with an appropriate use of text prompts.

Text-to-Image Generation

548

Learning Disentangled Prompts for Compositional Image Synthesis

no code implementations • 1 Jun 2023 • Kihyuk Sohn, Albert Shaw, Yuan Hao, Han Zhang, Luisa Polania, Huiwen Chang, Lu Jiang, Irfan Essa

We study domain-adaptive image synthesis, the problem of teaching pretrained image generative models a new style or concept from as few as one image to synthesize novel images, to better understand the compositional image synthesis.

Domain Adaptation Image Generation +1

SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs

no code implementations • NeurIPS 2023 • Lijun Yu, Yong Cheng, Zhiruo Wang, Vivek Kumar, Wolfgang Macherey, Yanping Huang, David A. Ross, Irfan Essa, Yonatan Bisk, Ming-Hsuan Yang, Kevin Murphy, Alexander G. Hauptmann, Lu Jiang

In this work, we introduce Semantic Pyramid AutoEncoder (SPAE) for enabling frozen LLMs to perform both understanding and generation tasks involving non-linguistic modalities such as images or videos.

In-Context Learning multimodal generation

VideoGLUE: Video General Understanding Evaluation of Foundation Models

1 code implementation • 6 Jul 2023 • Liangzhe Yuan, Nitesh Bharadwaj Gundavarapu, Long Zhao, Hao Zhou, Yin Cui, Lu Jiang, Xuan Yang, Menglin Jia, Tobias Weyand, Luke Friedman, Mikhail Sirotenko, Huisheng Wang, Florian Schroff, Hartwig Adam, Ming-Hsuan Yang, Ting Liu, Boqing Gong

We evaluate existing foundation models' video understanding capabilities using a carefully designed experiment protocol consisting of three hallmark tasks (action recognition, temporal localization, and spatiotemporal localization), eight datasets well received by the community, and four adaptation methods tailoring a foundation model (FM) for a downstream task.

Action Recognition Temporal Localization +1

76,589

Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation

no code implementations • 9 Oct 2023 • Lijun Yu, José Lezama, Nitesh B. Gundavarapu, Luca Versari, Kihyuk Sohn, David Minnen, Yong Cheng, Vighnesh Birodkar, Agrim Gupta, Xiuye Gu, Alexander G. Hauptmann, Boqing Gong, Ming-Hsuan Yang, Irfan Essa, David A. Ross, Lu Jiang

While Large Language Models (LLMs) are the dominant models for generative tasks in language, they do not perform as well as diffusion models on image and video generation.

Ranked #2 on Video Prediction on Kinetics-600 12 frames, 64x64

Action Recognition Image Generation +4

Text-Driven Image Editing via Learnable Regions

1 code implementation • 28 Nov 2023 • Yuanze Lin, Yi-Wen Chen, Yi-Hsuan Tsai, Lu Jiang, Ming-Hsuan Yang

Language has emerged as a natural interface for image editing.

Image Generation

Fine-grained Controllable Video Generation via Object Appearance and Context

no code implementations • 5 Dec 2023 • Hsin-Ping Huang, Yu-Chuan Su, Deqing Sun, Lu Jiang, Xuhui Jia, Yukun Zhu, Ming-Hsuan Yang

To achieve detailed control, we propose a unified framework to jointly inject control signals into the existing text-to-video model.

Text-to-Video Generation Video Generation

Photorealistic Video Generation with Diffusion Models

no code implementations • 11 Dec 2023 • Agrim Gupta, Lijun Yu, Kihyuk Sohn, Xiuye Gu, Meera Hahn, Li Fei-Fei, Irfan Essa, Lu Jiang, José Lezama

We present W.A.L.T, a transformer-based approach for photorealistic video generation via diffusion modeling.

Ranked #1 on Video Prediction on Kinetics-600 12 frames, 64x64

Text-to-Video Generation Video Generation +1

VideoPoet: A Large Language Model for Zero-Shot Video Generation

no code implementations • 21 Dec 2023 • Dan Kondratyuk, Lijun Yu, Xiuye Gu, José Lezama, Jonathan Huang, Grant Schindler, Rachel Hornung, Vighnesh Birodkar, Jimmy Yan, Ming-Chang Chiu, Krishna Somandepalli, Hassan Akbari, Yair Alon, Yong Cheng, Josh Dillon, Agrim Gupta, Meera Hahn, Anja Hauth, David Hendon, Alonso Martinez, David Minnen, Mikhail Sirotenko, Kihyuk Sohn, Xuan Yang, Hartwig Adam, Ming-Hsuan Yang, Irfan Essa, Huisheng Wang, David A. Ross, Bryan Seybold, Lu Jiang

We present VideoPoet, a language model capable of synthesizing high-quality video, with matching audio, from a large variety of conditioning signals.

Ranked #3 on Text-to-Video Generation on MSR-VTT

Language Modelling Large Language Model +2

Spatial-Temporal Interplay in Human Mobility: A Hierarchical Reinforcement Learning Approach with Hypergraph Representation

no code implementations • 25 Dec 2023 • Zhaofan Zhang, Yanan Xiao, Lu Jiang, Dingqi Yang, Minghao Yin, Pengyang Wang

In the realm of human mobility, the decision-making process for selecting the next-visit location is intricately influenced by a trade-off between spatial and temporal constraints, which are reflective of individual needs and preferences.

Decision Making Hierarchical Reinforcement Learning +1

BRAU-Net++: U-Shaped Hybrid CNN-Transformer Network for Medical Image Segmentation

1 code implementation • 1 Jan 2024 • Libin Lan, Pengzhou Cai, Lu Jiang, Xiaojuan Liu, Yongmei Li, Yudong Zhang

Specifically, BRAU-Net++ uses bi-level routing attention as the core building block to design our u-shaped encoder-decoder structure, in which both encoder and decoder are hierarchically constructed, so as to learn global semantic information while reducing computational complexity.

Image Segmentation Medical Image Segmentation +3

A Generalized Framework with Adaptive Weighted Soft-Margin for Imbalanced SVM Classification

no code implementations • 13 Mar 2024 • Lu Jiang, Qi Wang, Yuhang Chang, Jianing Song, Haoyue Fu

In this paper, we present a new generalized framework with Adaptive Weight function for soft-margin Weighted SVM (AW-WSVM), which aims to mitigate the issues of imbalance and outlier sensitivity in the standard support vector machine (SVM) for classifying two-class data.

Emotion Classification

SimAug: Learning Robust Representations from Simulation for Trajectory Prediction

1 code implementation • ECCV 2020 • Junwei Liang, Lu Jiang, Alexander Hauptmann

We approach this problem through the real-data-free setting in which the model is trained only on 3D simulation data and applied out-of-the-box to a wide variety of real cameras.

Ranked #1 on Trajectory Forecasting on ActEV

Adversarial Attack Adversarial Defense +2

248
