1 code implementation • 16 Jul 2016 • Junwei Liang, Lu Jiang, Deyu Meng, Alexander Hauptmann
Learning video concept detectors automatically from big but noisy web data, with no additional manual annotation, is a novel but challenging problem in the multimedia and machine learning communities.
1 code implementation • 4 Aug 2017 • Lu Jiang, Junwei Liang, Liangliang Cao, Yannis Kalantidis, Sachin Farfade, Alexander Hauptmann
This paper proposes a new task, MemexQA: given a collection of photos or videos from a user, the goal is to automatically answer questions that help users recover their memory about events captured in the collection.
2 code implementations • CVPR 2018 • Junwei Liang, Lu Jiang, Liangliang Cao, Li-Jia Li, Alexander Hauptmann
Recent insights on language and vision with neural networks have been successfully applied to simple single-image visual question answering.
Ranked #1 on Memex Question Answering on MemexQA
1 code implementation • IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2018 • Junwei Liang, Lu Jiang, Liangliang Cao, Yannis Kalantidis, Li-Jia Li, Alexander Hauptmann
In addition to a text answer, a few grounding photos are also given to justify the answer.
Ranked #1 on Memex Question Answering on MemexQA
2 code implementations • CVPR 2019 • Junwei Liang, Lu Jiang, Juan Carlos Niebles, Alexander Hauptmann, Li Fei-Fei
To facilitate training, the network is learned with an auxiliary task of predicting the future location in which the activity will happen.
Ranked #1 on Activity Prediction on ActEV
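The auxiliary-task idea above can be sketched as multi-task learning with a shared encoder: one head predicts the activity, another predicts the future location, and both losses are summed. The layer sizes, class count, and names below are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

# Minimal multi-task sketch (assumed shapes/names): a shared encoder
# feeds the main activity classifier and an auxiliary head predicting
# the future (x, y) location where the activity will happen.
rng = np.random.default_rng(0)
W_enc = rng.normal(size=(64, 32))   # shared encoder weights
W_act = rng.normal(size=(32, 10))   # activity head (10 classes, assumed)
W_loc = rng.normal(size=(32, 2))    # auxiliary head: future (x, y)

def forward(x):
    h = np.maximum(x @ W_enc, 0.0)  # shared features (ReLU)
    return h @ W_act, h @ W_loc     # activity logits, predicted location

x = rng.normal(size=(4, 64))        # a batch of 4 person encodings
act_logits, loc_pred = forward(x)

# joint objective: classification loss plus auxiliary regression loss
labels = np.array([1, 3, 5, 7])
shifted = act_logits - act_logits.max(axis=1, keepdims=True)
log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
cls_loss = -log_probs[np.arange(4), labels].mean()
aux_loss = ((loc_pred - np.zeros((4, 2))) ** 2).mean()  # dummy targets
total_loss = cls_loss + aux_loss
```

Training on the combined loss forces the shared features to encode location cues, which is the sense in which the auxiliary task "facilitates" the main prediction.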
2 code implementations • 26 May 2019 • Junwei Liang, Jay D. Aronson, Alexander Hauptmann
Among other uses, VERA enables the localization of a shooter from just a few videos that include the sound of gunshots.
1 code implementation • CVPR 2020 • Junwei Liang, Lu Jiang, Kevin Murphy, Ting Yu, Alexander Hauptmann
The first contribution is a new dataset, created in a realistic 3D simulator, based on real-world trajectory data and then extrapolated by human annotators to achieve different latent goals.
Ranked #1 on Multi-future Trajectory Prediction on ForkingPaths
1 code implementation • Proceedings of the IEEE Winter Conference on Applications of Computer Vision Workshops 2020 • Wenhe Liu, Guoliang Kang, Po-Yao Huang, Xiaojun Chang, Yijun Qian, Junwei Liang, Liangke Gui, Jing Wen, Peng Chen
We propose Argus, an efficient activity detection system for extended video analysis in surveillance scenarios.
1 code implementation • 4 Apr 2020 • Junwei Liang, Lu Jiang, Alexander Hauptmann
We refer to our method as SimAug.
Ranked #2 on Trajectory Prediction on ActEV
1 code implementation • 30 Jun 2020 • Xiaoyu Zhu, Junwei Liang, Alexander Hauptmann
This provides the first benchmark for quantitative evaluation of models to assess building damage using aerial videos.
4 code implementations • 20 Nov 2020 • Junwei Liang
With advances in deep learning for computer vision, systems are now able to analyze an unprecedented amount of rich visual information from videos, enabling applications such as autonomous driving, socially aware robot assistants, and public safety monitoring.
no code implementations • 4 Dec 2020 • Junwei Liang, Liangliang Cao, Xuehan Xiong, Ting Yu, Alexander Hauptmann
The experimental results show that the STAN model can consistently improve the state of the art in both action detection and action recognition tasks.
no code implementations • ICCV 2021 • Xiaoyu Zhu, Jeffrey Chen, Xiangrui Zeng, Junwei Liang, Chengqi Li, Sinuo Liu, Sima Behpour, Min Xu
We propose a novel weakly supervised approach for 3D semantic segmentation on volumetric images.
1 code implementation • IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops 2022 • Junwei Liang, He Zhu, Enwei Zhang, Jun Zhang
Distracted driver actions can be dangerous and cause severe accidents.
1 code implementation • 26 Sep 2022 • Junwei Liang, Enwei Zhang, Jun Zhang, Chunhua Shen
We study the task of learning robust feature representations that generalize well across multiple action recognition datasets.
no code implementations • 27 Sep 2022 • Chengzhi Lin, AnCong Wu, Junwei Liang, Jun Zhang, Wenhang Ge, Wei-Shi Zheng, Chunhua Shen
To address this problem, we propose a Text-Adaptive Multiple Visual Prototype Matching model, which automatically captures multiple prototypes to describe a video by adaptive aggregation of video token features.
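The multiple-prototype idea described above can be sketched as follows: several learnable queries each adaptively aggregate the video token features into one prototype, and the text embedding is scored against its best-matching prototype. The shapes, query count, and names below are illustrative assumptions, not the paper's actual model.

```python
import numpy as np

# Sketch (assumed shapes/names) of multiple visual prototypes per video.
rng = np.random.default_rng(1)
tokens = rng.normal(size=(16, 8))   # 16 video token features, dim 8
queries = rng.normal(size=(4, 8))   # 4 prototype queries (assumed count)
text = rng.normal(size=(8,))        # a text embedding

# adaptive aggregation: each query attends over the video tokens
attn = queries @ tokens.T
attn = np.exp(attn - attn.max(axis=1, keepdims=True))
attn /= attn.sum(axis=1, keepdims=True)
prototypes = attn @ tokens          # (4, 8): one prototype per query

# text-adaptive matching: score the video by its closest prototype
sims = prototypes @ text / (
    np.linalg.norm(prototypes, axis=1) * np.linalg.norm(text))
score = sims.max()
```

Keeping several prototypes instead of one pooled vector lets a single video match different texts through different prototypes, which is the intuition behind the adaptive aggregation.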
7 code implementations • 5 Oct 2022 • Silvio Giancola, Anthony Cioppa, Adrien Deliège, Floriane Magera, Vladimir Somers, Le Kang, Xin Zhou, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdulrahman Darwish, Adrien Maglo, Albert Clapés, Andreas Luyts, Andrei Boiarov, Artur Xarles, Astrid Orcesi, Avijit Shah, Baoyu Fan, Bharath Comandur, Chen Chen, Chen Zhang, Chen Zhao, Chengzhi Lin, Cheuk-Yiu Chan, Chun Chuen Hui, Dengjie Li, Fan Yang, Fan Liang, Fang Da, Feng Yan, Fufu Yu, Guanshuo Wang, H. Anthony Chan, He Zhu, Hongwei Kan, Jiaming Chu, Jianming Hu, Jianyang Gu, Jin Chen, João V. B. Soares, Jonas Theiner, Jorge De Corte, José Henrique Brito, Jun Zhang, Junjie Li, Junwei Liang, Leqi Shen, Lin Ma, Lingchi Chen, Miguel Santos Marques, Mike Azatov, Nikita Kasatkin, Ning Wang, Qiong Jia, Quoc Cuong Pham, Ralph Ewerth, Ran Song, RenGang Li, Rikke Gade, Ruben Debien, Runze Zhang, Sangrok Lee, Sergio Escalera, Shan Jiang, Shigeyuki Odashima, Shimin Chen, Shoichi Masui, Shouhong Ding, Sin-wai Chan, Siyu Chen, Tallal El-Shabrawy, Tao He, Thomas B. Moeslund, Wan-Chi Siu, Wei zhang, Wei Li, Xiangwei Wang, Xiao Tan, Xiaochuan Li, Xiaolin Wei, Xiaoqing Ye, Xing Liu, Xinying Wang, Yandong Guo, YaQian Zhao, Yi Yu, YingYing Li, Yue He, Yujie Zhong, Zhenhua Guo, Zhiheng Li
The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team.
1 code implementation • CVPR 2023 • Xiaoyu Zhu, Po-Yao Huang, Junwei Liang, Celso M. de Melo, Alexander Hauptmann
The model uses a hierarchical transformer with intra-frame offset attention and inter-frame self-attention.
no code implementations • 19 Aug 2023 • Jinhui Ye, Junwei Liang
This paper studies introducing viewpoint-invariant feature representations into existing action recognition architectures.
1 code implementation • 21 Aug 2023 • Teli Ma, Rong Li, Junwei Liang
A challenging new task is subsequently added to evaluate the robustness of GVLMs against inherent inclination toward syntactical correctness.
Ranked #83 on Visual Reasoning on Winoground
no code implementations • 14 Sep 2023 • Rong Li, Shijie Li, Xieyuanli Chen, Teli Ma, Juergen Gall, Junwei Liang
In this paper, we present TFNet, a range-image-based LiDAR semantic segmentation method that utilizes temporal information to address this issue.
Ranked #1 on Semantic Segmentation on SemanticPOSS
2 code implementations • 1 Oct 2023 • Zeying Gong, Yujin Tang, Junwei Liang
Although the Transformer has been the dominant architecture for time series forecasting tasks in recent years, a fundamental challenge remains: the permutation-invariant self-attention mechanism within Transformers leads to a loss of temporal information.
Ranked #1 on Time Series Forecasting on ETTh2 (336) Multivariate
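The permutation-invariance claim above is easy to verify directly: self-attention without positional information is permutation-equivariant, so shuffling the time steps merely shuffles the outputs, and the mechanism by itself cannot distinguish temporal order. A minimal demonstration:

```python
import numpy as np

# Plain (single-head, no positional encoding) self-attention.
def self_attention(X):
    scores = X @ X.T / np.sqrt(X.shape[1])
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)   # row-wise softmax
    return w @ X

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))             # 6 time steps, feature dim 4
perm = rng.permutation(6)               # shuffle the time axis

out = self_attention(X)
out_perm = self_attention(X[perm])

# Shuffled input yields the same outputs, just shuffled: the attention
# output carries no information about the original temporal order.
assert np.allclose(out[perm], out_perm)
```

This is why Transformer forecasters must inject order through positional encodings or other mechanisms, and it motivates the "loss of temporal information" challenge stated above.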
1 code implementation • 4 Oct 2023 • Yujin Tang, Jiaming Zhou, Xiang Pan, Zeying Gong, Junwei Liang
To address these limitations, we introduce the PostRainBench, a comprehensive multi-variable NWP post-processing benchmark consisting of three datasets for NWP post-processing-based precipitation forecasting.
no code implementations • 28 Nov 2023 • Jiaming Zhou, Hanjun Li, Kun-Yu Lin, Junwei Liang
Under the weak supervision setting, action labels are provided for the whole video without precise start and end times of the action clip.
Ranked #1 on Long-video Activity Recognition on Breakfast
no code implementations • 29 Nov 2023 • Jinhui Ye, Jiaming Zhou, Hui Xiong, Junwei Liang
Specifically, at the core of GeoDeformer is the Geometric Deformation Predictor, a module designed to identify and quantify potential spatial and temporal geometric deformations within the given video.
no code implementations • 22 Jan 2024 • Jiaming Zhou, Junwei Liang, Kun-Yu Lin, Jinrui Yang, Wei-Shi Zheng
With the proposed ActionHub dataset, we further propose a novel Cross-modality and Cross-action Modeling (CoCo) framework for ZSAR, which consists of a Dual Cross-modality Alignment module and a Cross-action Invariance Mining module.
no code implementations • 18 Mar 2024 • Xander Sun, Louis Lau, Hoyard Zhi, Ronghe Qiu, Junwei Liang
Furthermore, for the popular HM3D environment, we present an Instance Navigation (InstanceNav) task that requires going to a specific object instance with detailed descriptions, as opposed to the Object Navigation (ObjectNav) task where the goal is defined merely by the object category.
no code implementations • 24 Mar 2024 • Xiaoyu Zhu, Junwei Liang, Po-Yao Huang, Alex Hauptmann
The second is a Masked Consistency Learning module to learn class-discriminative representations.
1 code implementation • 25 Mar 2024 • Yujin Tang, Peijie Dong, Zhenheng Tang, Xiaowen Chu, Junwei Liang
Combining CNNs or ViTs with RNNs for spatiotemporal forecasting has yielded unparalleled results in predicting temporal and spatial dynamics.
1 code implementation • ECCV 2020 • Junwei Liang, Lu Jiang, Alexander Hauptmann
We approach this problem through the real-data-free setting in which the model is trained only on 3D simulation data and applied out-of-the-box to a wide variety of real cameras.
Ranked #1 on Trajectory Forecasting on ActEV