1 code implementation • CVPR 2022 • Yicong Li, Xiang Wang, Junbin Xiao, Wei Ji, Tat-Seng Chua
At its core is understanding the alignments between visual scenes in video and linguistic semantics in question to yield the answer.
1 code implementation • 23 May 2022 • Yuan YAO, Qianyu Chen, Ao Zhang, Wei Ji, Zhiyuan Liu, Tat-Seng Chua, Maosong Sun
We show that PEVL enables state-of-the-art performance of detector-free VLP models on position-sensitive tasks such as referring expression comprehension and phrase grounding, and also improves the performance on position-insensitive tasks with grounded inputs.
1 code implementation • ICLR 2022 • Wei Ji, Jingjing Li, Qi Bi, Chuan Guo, Jie Liu, Li Cheng
The laborious and time-consuming manual annotation has become a real bottleneck in various practical scenarios.
no code implementations • 27 Apr 2022 • Zhedong Zheng, Jiayin Zhu, Wei Ji, Yi Yang, Tat-Seng Chua
In particular, to solve the inherent ambiguity among four implicit variables, i.e., camera position, shape, texture, and illumination, we study existing works and introduce an explainable structural causal map (SCM) to build our model.
1 code implementation • 22 Mar 2022 • Ao Zhang, Yuan YAO, Qianyu Chen, Wei Ji, Zhiyuan Liu, Maosong Sun, Tat-Seng Chua
Scene graph generation (SGG) aims to extract (subject, predicate, object) triplets in images.
no code implementations • 2 Mar 2022 • Yaoyao Zhong, Wei Ji, Junbin Xiao, Yicong Li, Weihong Deng, Tat-Seng Chua
Video Question Answering (VideoQA) aims to answer natural language questions according to the given videos.
1 code implementation • 26 Feb 2022 • Guanghao Yin, Wei Wang, Zehuan Yuan, Chuchu Han, Wei Ji, Shouqian Sun, Changhu Wang
The comparisons of distribution differences between HQ and LQ images can help our model better assess the image quality.
no code implementations • CVPR 2022 • Jingjing Li, Tianyu Yang, Wei Ji, Jue Wang, Li Cheng
Inspired by recent success in unsupervised contrastive representation learning, we propose a novel denoised cross-video contrastive algorithm, aiming to enhance the feature discrimination ability of video snippets for accurate temporal action localization in the weakly-supervised setting.
no code implementations • CVPR 2022 • Chuan Guo, Shihao Zou, Xinxin Zuo, Sen Wang, Wei Ji, Xingyu Li, Li Cheng
Automated generation of 3D human motions from text is a challenging problem.
1 code implementation • 12 Dec 2021 • Junbin Xiao, Angela Yao, Zhiyuan Liu, Yicong Li, Wei Ji, Tat-Seng Chua
To align with the multi-granular essence of linguistic concepts in language queries, we propose to model video as a conditional graph hierarchy which weaves together visual facts of different granularity in a level-wise manner, with the guidance of corresponding textual cues.
1 code implementation • 10 Dec 2021 • Meng Wei, Long Chen, Wei Ji, Xiaoyu Yue, Tat-Seng Chua
Since each verb is associated with a specific set of semantic roles, all existing GSR methods resort to a two-stage framework: predicting the verb in the first stage and detecting the semantic roles in the second stage.
1 code implementation • NeurIPS 2021 • Jingjing Li, Wei Ji, Qi Bi, Cheng Yan, Miao Zhang, Yongri Piao, Huchuan Lu, Li Cheng
As a by-product, a CapS dataset is constructed by augmenting the existing benchmark training set with additional image tags and captions.
1 code implementation • 16 Nov 2021 • Andras Huebner, Wei Ji, Xiang Xiao
Lastly, we compare the performance of our baseline models with BART, a state-of-the-art language model that is effective for summarization.
2 code implementations • 1 Nov 2021 • Guanghua Yu, Qinyao Chang, Wenyu Lv, Chang Xu, Cheng Cui, Wei Ji, Qingqing Dang, Kaipeng Deng, Guanzhong Wang, Yuning Du, Baohua Lai, Qiwen Liu, Xiaoguang Hu, Dianhai Yu, Yanjun Ma
We investigate the applicability of the anchor-free strategy on lightweight object detection models.
Ranked #1 on Object Detection on MSCOCO
no code implementations • 29 Sep 2021 • Chenchen Ye, Lizi Liao, Fuli Feng, Wei Ji, Tat-Seng Chua
The core is to construct a latent content space for strategy optimization and disentangle the surface style from it.
no code implementations • 24 Jun 2021 • Tianjie Yang, Yaoru Luo, Wei Ji, Ge Yang
We conclude with an outlook on how deep learning could shape the future of this new generation of light microscopy technology.
1 code implementation • CVPR 2021 • Wei Ji, Jingjing Li, Shuang Yu, Miao Zhang, Yongri Piao, Shunyu Yao, Qi Bi, Kai Ma, Yefeng Zheng, Huchuan Lu, Li Cheng
Complex backgrounds and similar appearances between objects and their surroundings are generally recognized as challenging scenarios in Salient Object Detection (SOD).
1 code implementation • CVPR 2021 • Wei Ji, Shuang Yu, Junde Wu, Kai Ma, Cheng Bian, Qi Bi, Jingjing Li, Hanruo Liu, Li Cheng, Yefeng Zheng
To our knowledge, our work is the first in producing calibrated predictions under different expertise levels for medical image segmentation.
1 code implementation • 3 Jun 2021 • Xun Yang, Fuli Feng, Wei Ji, Meng Wang, Tat-Seng Chua
To fill the research gap, we propose a causality-inspired VMR framework that builds a structural causal model to capture the true effect of query and video content on the prediction.
no code implementations • 26 May 2021 • Feifei Shao, Long Chen, Jian Shao, Wei Ji, Shaoning Xiao, Lu Ye, Yueting Zhuang, Jun Xiao
With the success of deep neural networks in object detection, both WSOD and WSOL have received unprecedented attention.
1 code implementation • 8 Apr 2021 • Guanghao Yin, Wei Wang, Zehuan Yuan, Wei Ji, Dongdong Yu, Shouqian Sun, Tat-Seng Chua, Changhu Wang
We extract a degradation prior at the task level with the proposed ConditionNet, which is then used to adapt the parameters of the basic SR network (BaseNet).
no code implementations • 15 Mar 2021 • Shaoning Xiao, Long Chen, Songyang Zhang, Wei Ji, Jian Shao, Lu Ye, Jun Xiao
State-of-the-art NLVL methods are almost all one-stage and typically fall into two categories: 1) anchor-based approaches, which first pre-define a series of video segment candidates (e.g., by sliding window) and then classify each candidate; 2) anchor-free approaches, which directly predict, for each video frame, the probability of being a boundary frame or an intermediate frame inside the positive segment.
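The two categories can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the half-open `[start, end)` frame convention, the stride ratio, and the four label names are all assumptions chosen for illustration. The first function enumerates sliding-window candidates as an anchor-based method might, and the last one builds the per-frame targets an anchor-free method might be trained against.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # half-open frame interval [start, end); an assumed convention


def sliding_window_candidates(num_frames: int,
                              window_sizes: List[int],
                              stride_ratio: float = 0.5) -> List[Segment]:
    """Anchor-based view: pre-define segment candidates by sliding windows."""
    candidates = []
    for w in window_sizes:
        if w > num_frames:
            continue  # a window longer than the video yields no candidate
        stride = max(1, int(w * stride_ratio))
        for start in range(0, num_frames - w + 1, stride):
            candidates.append((start, start + w))
    return candidates


def temporal_iou(a: Segment, b: Segment) -> float:
    """Overlap ratio, usable to decide which candidates count as positive."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def frame_targets(num_frames: int, positive: Segment) -> List[str]:
    """Anchor-free view: one label per frame (hypothetical label names)."""
    start, end = positive
    labels = []
    for t in range(num_frames):
        if t == start:
            labels.append("start")      # first frame of the positive segment
        elif t == end - 1:
            labels.append("end")        # last frame of the positive segment
        elif start < t < end - 1:
            labels.append("inside")     # intermediate frame
        else:
            labels.append("outside")
    return labels
```

In this sketch an anchor-based model would score each candidate returned by `sliding_window_candidates`, while an anchor-free model would predict a distribution over the labels from `frame_targets` for every frame and decode a segment from the predicted boundaries.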
1 code implementation • ICCV 2021 • Miao Zhang, Jie Liu, Yifei Wang, Yongri Piao, Shunyu Yao, Wei Ji, Jingjing Li, Huchuan Lu, Zhongxuan Luo
Our bidirectional dynamic fusion strategy encourages the interaction of spatial and temporal information in a dynamic manner.
Ranked #13 on Video Polyp Segmentation on SUN-SEG-Easy
no code implementations • 1 Jan 2021 • Zhuoyu Wei, Wei Ji, Xiubo Geng, Yining Chen, Baihua Chen, Tao Qin, Daxin Jiang
We notice that some real-world QA tasks are more complex, which cannot be solved by end-to-end neural networks or translated to any kind of formal representations.
no code implementations • 24 Aug 2020 • Mengyu Zhou, Qingtao Li, Xinyi He, Yuejiang Li, Yibo Liu, Wei Ji, Shi Han, Yining Chen, Daxin Jiang, Dongmei Zhang
It is common for people to create different types of charts to explore a multi-dimensional dataset (table).
2 code implementations • ECCV 2020 • Wei Ji, Jingjing Li, Miao Zhang, Yongri Piao, Huchuan Lu
The explicitly extracted edge information goes together with saliency to give more emphasis to the salient regions and object boundaries.
Ranked #18 on RGB-D Salient Object Detection on NJU2K
no code implementations • 30 Apr 2020 • Jing Han, Kun Qian, Meishu Song, Zijiang Yang, Zhao Ren, Shuo Liu, Juan Liu, Huaiyuan Zheng, Wei Ji, Tomoya Koike, Xiao Li, Zixing Zhang, Yoshiharu Yamamoto, Björn W. Schuller
In particular, by analysing speech recordings from these patients, we construct audio-only-based models to automatically categorise the health state of patients from four aspects, including the severity of illness, sleep quality, fatigue, and anxiety.
no code implementations • 6 Oct 2018 • Yiming Wu, Wei Ji, Xi Li, Gang Wang, Jianwei Yin, Fei Wu
As a fundamental and challenging problem in computer vision, hand pose estimation aims to estimate the hand joint locations from depth images.