1 code implementation • 16 Jul 2016 • Junwei Liang, Lu Jiang, Deyu Meng, Alexander Hauptmann
Learning video concept detectors automatically from big but noisy web data, with no additional manual annotation, is a novel but challenging problem in the multimedia and machine learning communities.
1 code implementation • 4 Aug 2017 • Lu Jiang, Junwei Liang, Liangliang Cao, Yannis Kalantidis, Sachin Farfade, Alexander Hauptmann
This paper proposes a new task, MemexQA: given a collection of photos or videos from a user, the goal is to automatically answer questions that help users recover their memory about events captured in the collection.
2 code implementations • CVPR 2018 • Junwei Liang, Lu Jiang, Liangliang Cao, Li-Jia Li, Alexander Hauptmann
Recent insights on language and vision with neural networks have been successfully applied to simple single-image visual question answering.
Ranked #1 on Memex Question Answering on MemexQA
1 code implementation • IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2018 • Junwei Liang, Lu Jiang, Liangliang Cao, Yannis Kalantidis, Li-Jia Li, Alexander Hauptmann
In addition to a text answer, a few grounding photos are also given to justify the answer.
Ranked #1 on Memex Question Answering on MemexQA
2 code implementations • CVPR 2019 • Junwei Liang, Lu Jiang, Juan Carlos Niebles, Alexander Hauptmann, Li Fei-Fei
To facilitate training, the network is learned with an auxiliary task of predicting the future location in which the activity will happen.
Ranked #1 on Activity Prediction on ActEV
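The auxiliary-task idea above can be sketched as multi-task learning with a shared encoder: one head predicts the activity, another predicts the future location, and both losses are summed. The layer sizes, class count, and names below are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

# Minimal multi-task sketch (assumed shapes/names): a shared encoder
# feeds the main activity classifier and an auxiliary head predicting
# the future (x, y) location where the activity will happen.
rng = np.random.default_rng(0)
W_enc = rng.normal(size=(64, 32))   # shared encoder weights
W_act = rng.normal(size=(32, 10))   # activity head (10 classes, assumed)
W_loc = rng.normal(size=(32, 2))    # auxiliary head: future (x, y)

def forward(x):
    h = np.maximum(x @ W_enc, 0.0)  # shared features (ReLU)
    return h @ W_act, h @ W_loc     # activity logits, predicted location

x = rng.normal(size=(4, 64))        # a batch of 4 person encodings
act_logits, loc_pred = forward(x)

# joint objective: classification loss plus auxiliary regression loss
labels = np.array([1, 3, 5, 7])
shifted = act_logits - act_logits.max(axis=1, keepdims=True)
log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
cls_loss = -log_probs[np.arange(4), labels].mean()
aux_loss = ((loc_pred - np.zeros((4, 2))) ** 2).mean()  # dummy targets
total_loss = cls_loss + aux_loss
```

Training on the combined loss forces the shared features to encode location cues, which is the sense in which the auxiliary task "facilitates" the main prediction.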
2 code implementations • 26 May 2019 • Junwei Liang, Jay D. Aronson, Alexander Hauptmann
Among other uses, VERA enables the localization of a shooter from just a few videos that include the sound of gunshots.
1 code implementation • CVPR 2020 • Junwei Liang, Lu Jiang, Kevin Murphy, Ting Yu, Alexander Hauptmann
The first contribution is a new dataset, created in a realistic 3D simulator, based on real-world trajectory data and then extrapolated by human annotators to achieve different latent goals.
Ranked #1 on Multi-future Trajectory Prediction on ForkingPaths
1 code implementation • Proceedings of the IEEE Winter Conference on Applications of Computer Vision Workshops 2020 • Wenhe Liu, Guoliang Kang, Po-Yao Huang, Xiaojun Chang, Yijun Qian, Junwei Liang, Liangke Gui, Jing Wen, Peng Chen
We propose Argus, an efficient activity detection system for extended video analysis in surveillance scenarios.
1 code implementation • 4 Apr 2020 • Junwei Liang, Lu Jiang, Alexander Hauptmann
We refer to our method as SimAug.
Ranked #2 on Trajectory Prediction on ActEV
1 code implementation • 30 Jun 2020 • Xiaoyu Zhu, Junwei Liang, Alexander Hauptmann
This provides the first benchmark for quantitative evaluation of models to assess building damage using aerial videos.
4 code implementations • 20 Nov 2020 • Junwei Liang
With advances in deep learning for computer vision, systems are now able to analyze an unprecedented amount of rich visual information from videos, enabling applications such as autonomous driving, socially aware robot assistants, and public safety monitoring.
no code implementations • 4 Dec 2020 • Junwei Liang, Liangliang Cao, Xuehan Xiong, Ting Yu, Alexander Hauptmann
The experimental results show that the STAN model can consistently improve the state of the art in both action detection and action recognition tasks.
no code implementations • ICCV 2021 • Xiaoyu Zhu, Jeffrey Chen, Xiangrui Zeng, Junwei Liang, Chengqi Li, Sinuo Liu, Sima Behpour, Min Xu
We propose a novel weakly supervised approach for 3D semantic segmentation on volumetric images.
1 code implementation • IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops 2022 • Junwei Liang, He Zhu, Enwei Zhang, Jun Zhang
Distracted driver actions can be dangerous and cause severe accidents.
1 code implementation • 26 Sep 2022 • Junwei Liang, Enwei Zhang, Jun Zhang, Chunhua Shen
We study the task of learning robust feature representations that generalize well across multiple action recognition datasets.
no code implementations • 27 Sep 2022 • Chengzhi Lin, AnCong Wu, Junwei Liang, Jun Zhang, Wenhang Ge, Wei-Shi Zheng, Chunhua Shen
To address this problem, we propose a Text-Adaptive Multiple Visual Prototype Matching model, which automatically captures multiple prototypes to describe a video by adaptive aggregation of video token features.
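The multiple-prototype idea described above can be sketched as follows: several learnable queries each adaptively aggregate the video token features into one prototype, and the text embedding is scored against its best-matching prototype. The shapes, query count, and names below are illustrative assumptions, not the paper's actual model.

```python
import numpy as np

# Sketch (assumed shapes/names) of multiple visual prototypes per video.
rng = np.random.default_rng(1)
tokens = rng.normal(size=(16, 8))   # 16 video token features, dim 8
queries = rng.normal(size=(4, 8))   # 4 prototype queries (assumed count)
text = rng.normal(size=(8,))        # a text embedding

# adaptive aggregation: each query attends over the video tokens
attn = queries @ tokens.T
attn = np.exp(attn - attn.max(axis=1, keepdims=True))
attn /= attn.sum(axis=1, keepdims=True)
prototypes = attn @ tokens          # (4, 8): one prototype per query

# text-adaptive matching: score the video by its closest prototype
sims = prototypes @ text / (
    np.linalg.norm(prototypes, axis=1) * np.linalg.norm(text))
score = sims.max()
```

Keeping several prototypes instead of one pooled vector lets a single video match different texts through different prototypes, which is the intuition behind the adaptive aggregation.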
7 code implementations • 5 Oct 2022 • Silvio Giancola, Anthony Cioppa, Adrien Deliège, Floriane Magera, Vladimir Somers, Le Kang, Xin Zhou, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdulrahman Darwish, Adrien Maglo, Albert Clapés, Andreas Luyts, Andrei Boiarov, Artur Xarles, Astrid Orcesi, Avijit Shah, Baoyu Fan, Bharath Comandur, Chen Chen, Chen Zhang, Chen Zhao, Chengzhi Lin, Cheuk-Yiu Chan, Chun Chuen Hui, Dengjie Li, Fan Yang, Fan Liang, Fang Da, Feng Yan, Fufu Yu, Guanshuo Wang, H. Anthony Chan, He Zhu, Hongwei Kan, Jiaming Chu, Jianming Hu, Jianyang Gu, Jin Chen, João V. B. Soares, Jonas Theiner, Jorge De Corte, José Henrique Brito, Jun Zhang, Junjie Li, Junwei Liang, Leqi Shen, Lin Ma, Lingchi Chen, Miguel Santos Marques, Mike Azatov, Nikita Kasatkin, Ning Wang, Qiong Jia, Quoc Cuong Pham, Ralph Ewerth, Ran Song, RenGang Li, Rikke Gade, Ruben Debien, Runze Zhang, Sangrok Lee, Sergio Escalera, Shan Jiang, Shigeyuki Odashima, Shimin Chen, Shoichi Masui, Shouhong Ding, Sin-wai Chan, Siyu Chen, Tallal El-Shabrawy, Tao He, Thomas B. Moeslund, Wan-Chi Siu, Wei zhang, Wei Li, Xiangwei Wang, Xiao Tan, Xiaochuan Li, Xiaolin Wei, Xiaoqing Ye, Xing Liu, Xinying Wang, Yandong Guo, YaQian Zhao, Yi Yu, YingYing Li, Yue He, Yujie Zhong, Zhenhua Guo, Zhiheng Li
The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team.
1 code implementation • CVPR 2023 • Xiaoyu Zhu, Po-Yao Huang, Junwei Liang, Celso M. de Melo, Alexander Hauptmann
The model uses a hierarchical transformer with intra-frame offset attention and inter-frame self-attention.
no code implementations • 19 Aug 2023 • Jinhui Ye, Junwei Liang
This paper studies introducing viewpoint-invariant feature representations into existing action recognition architectures.
1 code implementation • 21 Aug 2023 • Teli Ma, Rong Li, Junwei Liang
A challenging new task is subsequently added to evaluate the robustness of GVLMs against inherent inclination toward syntactical correctness.
Ranked #83 on Visual Reasoning on Winoground
no code implementations • 14 Sep 2023 • Rong Li, Shijie Li, Xieyuanli Chen, Teli Ma, Juergen Gall, Junwei Liang
In this paper, we present TFNet, a range-image-based LiDAR semantic segmentation method that utilizes temporal information to address this issue.
Ranked #1 on Semantic Segmentation on SemanticPOSS
2 code implementations • 1 Oct 2023 • Zeying Gong, Yujin Tang, Junwei Liang
Although the Transformer has been the dominant architecture for time series forecasting tasks in recent years, a fundamental challenge remains: the permutation-invariant self-attention mechanism within Transformers leads to a loss of temporal information.
Ranked #1 on Time Series Forecasting on ETTh2 (336) Multivariate
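The permutation-invariance claim above is easy to verify directly: self-attention without positional information is permutation-equivariant, so shuffling the time steps merely shuffles the outputs, and the mechanism by itself cannot distinguish temporal order. A minimal demonstration:

```python
import numpy as np

# Plain (single-head, no positional encoding) self-attention.
def self_attention(X):
    scores = X @ X.T / np.sqrt(X.shape[1])
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)   # row-wise softmax
    return w @ X

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))             # 6 time steps, feature dim 4
perm = rng.permutation(6)               # shuffle the time axis

out = self_attention(X)
out_perm = self_attention(X[perm])

# Shuffled input yields the same outputs, just shuffled: the attention
# output carries no information about the original temporal order.
assert np.allclose(out[perm], out_perm)
```

This is why Transformer forecasters must inject order through positional encodings or other mechanisms, and it motivates the "loss of temporal information" challenge stated above.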
1 code implementation • 4 Oct 2023 • Yujin Tang, Jiaming Zhou, Xiang Pan, Zeying Gong, Junwei Liang
To address these limitations, we introduce the PostRainBench, a comprehensive multi-variable NWP post-processing benchmark consisting of three datasets for NWP post-processing-based precipitation forecasting.
no code implementations • 28 Nov 2023 • Jiaming Zhou, Hanjun Li, Kun-Yu Lin, Junwei Liang
Under the weak supervision setting, action labels are provided for the whole video without precise start and end times of the action clip.
Ranked #1 on Long-video Activity Recognition on Breakfast
no code implementations • 29 Nov 2023 • Jinhui Ye, Jiaming Zhou, Hui Xiong, Junwei Liang
Specifically, at the core of GeoDeformer is the Geometric Deformation Predictor, a module designed to identify and quantify potential spatial and temporal geometric deformations within the given video.
no code implementations • 22 Jan 2024 • Jiaming Zhou, Junwei Liang, Kun-Yu Lin, Jinrui Yang, Wei-Shi Zheng
With the proposed ActionHub dataset, we further propose a novel Cross-modality and Cross-action Modeling (CoCo) framework for ZSAR, which consists of a Dual Cross-modality Alignment module and a Cross-action Invariance Mining module.
no code implementations • 18 Mar 2024 • Xander Sun, Louis Lau, Hoyard Zhi, Ronghe Qiu, Junwei Liang
Furthermore, for the popular HM3D environment, we present an Instance Navigation (InstanceNav) task that requires going to a specific object instance with detailed descriptions, as opposed to the Object Navigation (ObjectNav) task where the goal is defined merely by the object category.
no code implementations • 24 Mar 2024 • Xiaoyu Zhu, Junwei Liang, Po-Yao Huang, Alex Hauptmann
The second is a Masked Consistency Learning module to learn class-discriminative representations.
1 code implementation • 25 Mar 2024 • Yujin Tang, Peijie Dong, Zhenheng Tang, Xiaowen Chu, Junwei Liang
Combining CNNs or ViTs with RNNs for spatiotemporal forecasting has yielded unparalleled results in predicting temporal and spatial dynamics.
1 code implementation • ECCV 2020 • Junwei Liang, Lu Jiang, Alexander Hauptmann
We approach this problem through the real-data-free setting in which the model is trained only on 3D simulation data and applied out-of-the-box to a wide variety of real cameras.
Ranked #1 on Trajectory Forecasting on ActEV