no code implementations • 22 Mar 2024 • Zhonghua Zhai, Chen Ju, Jinsong Lan, Shuai Xiao
In this work, we propose the Cell Variational Information Bottleneck Network (cellVIB), a convolutional neural network with an information-bottleneck mechanism that can be combined with the latest feedforward architectures and trained end-to-end.
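The variational information-bottleneck objective underlying approaches like cellVIB can be sketched generically: an encoder predicts a diagonal Gaussian over a latent code, a KL penalty toward a standard normal compresses the representation, and the reparameterization trick keeps sampling differentiable. This is a minimal NumPy sketch of the standard VIB recipe, not the paper's implementation; the function names and the `beta` weight are illustrative.

```python
import numpy as np

def vib_loss(task_loss, mu, log_var, beta=1e-3):
    """Add the information-bottleneck penalty to a task loss.

    The encoder predicts a diagonal Gaussian q(z|x) = N(mu, diag(exp(log_var)));
    the penalty is KL(q(z|x) || N(0, I)), averaged over the batch.
    """
    kl = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)
    return task_loss + beta * np.mean(kl)

def reparameterize(mu, log_var, rng):
    # z = mu + sigma * eps keeps the sampling step differentiable
    # when ported to an autodiff framework
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps
```

With `mu = 0` and `log_var = 0`, the KL term vanishes and only the task loss remains; any deviation from the standard normal adds a positive penalty.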
no code implementations • 19 Mar 2024 • Mengting Chen, Xi Chen, Zhonghua Zhai, Chen Ju, Xuewen Hong, Jinsong Lan, Shuai Xiao
This paper introduces a novel framework for virtual try-on, termed Wear-Any-Way.
no code implementations • 17 Mar 2024 • Jinxiang Liu, Yikun Liu, Fei Zhang, Chen Ju, Ya Zhang, Yanfeng Wang
Neighboring frames (NFs), temporally adjacent to the labeled frame, often contain rich motion information that assists in accurately localizing sounding objects.
no code implementations • 12 Dec 2023 • Chen Ju, Haicheng Wang, Zeqian Li, Xu Chen, Zhonghua Zhai, Weilin Huang, Shuai Xiao
Vision-Language Large Models (VLMs) have become a primary backbone of AI due to their impressive performance.
no code implementations • 30 Nov 2023 • Xu Chen, Zida Cheng, Jiangchao Yao, Chen Ju, Weilin Huang, Jinsong Lan, Xiaoyi Zeng, Shuai Xiao
The augmentation network then employs the explicit cross-domain knowledge as augmented information to boost CTR prediction in the target domain.
no code implementations • NeurIPS 2023 • Chaofan Ma, Yuhuan Yang, Chen Ju, Fei Zhang, Ya Zhang, Yanfeng Wang
The results show the superior performance of attribute decomposition-aggregation.
no code implementations • 25 Jul 2023 • Jinxiang Liu, Chen Ju, Chaofan Ma, Yanfeng Wang, Yu Wang, Ya Zhang
The goal of the audio-visual segmentation (AVS) task is to segment the sounding objects in the video frames using audio cues.
no code implementations • 5 Jul 2023 • Yuhuan Yang, Chaofan Ma, Chen Ju, Ya Zhang, Yanfeng Wang
In this paper, we define a unified setting termed open-set semantic segmentation (O3S), which aims to learn seen and unseen semantics from both visual examples and textual names.
no code implementations • 18 May 2023 • Jinxiang Liu, Yu Wang, Chen Ju, Chaofan Ma, Ya Zhang, Weidi Xie
The objective of Audio-Visual Segmentation (AVS) is to localise the sounding objects within visual scenes by accurately predicting pixel-wise segmentation masks.
no code implementations • 6 May 2023 • Zida Cheng, Chen Ju, Xu Chen, Zhonghua Zhai, Shuai Xiao, Xiaoyi Zeng, Weilin Huang
We formally define a novel and valuable information-retrieval task, image-to-multi-modal retrieval (IMMR), where the query is an image and each document is an entity with both an image and a textual description.
no code implementations • 21 Mar 2023 • Chen Ju, Zeqian Li, Peisen Zhao, Ya Zhang, Xiaopeng Zhang, Qi Tian, Yanfeng Wang, Weidi Xie
In this paper, we consider temporal action localization under low-shot (zero-shot and few-shot) scenarios, with the goal of detecting and classifying action instances from arbitrary categories in untrimmed videos, including categories unseen at training time.
no code implementations • 17 Mar 2023 • Chaofan Ma, Yuhuan Yang, Chen Ju, Fei Zhang, Jinxiang Liu, Yu Wang, Ya Zhang, Yanfeng Wang
However, challenges arise from a structural difference between generative and discriminative models, which limits their direct use.
no code implementations • 20 Feb 2023 • Chen Ju, Haicheng Wang, Jinxiang Liu, Chaofan Ma, Ya Zhang, Peisen Zhao, Jianlong Chang, Qi Tian
Temporal sentence grounding aims to detect, in untrimmed videos, the timestamps of events described by a natural-language query.
no code implementations • CVPR 2023 • Chen Ju, Kunhao Zheng, Jinxiang Liu, Peisen Zhao, Ya Zhang, Jianlong Chang, Yanfeng Wang, Qi Tian
As a result, the complementary strengths of the two branches are effectively fused into a strong alliance.
Weakly-Supervised Temporal Action Localization
no code implementations • 26 Jun 2022 • Jinxiang Liu, Chen Ju, Weidi Xie, Ya Zhang
We present a simple yet effective self-supervised framework for audio-visual representation learning, to localize the sound source in videos.
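Self-supervised sound-source localizers of this kind typically score each spatial location of a visual feature map against a clip-level audio embedding; high-similarity locations are predicted as the sound source. The sketch below shows that generic cosine-similarity map in NumPy; it illustrates the common recipe, not this paper's code, and the shapes are assumptions.

```python
import numpy as np

def localization_map(audio_emb, visual_feats):
    """Cosine similarity between an audio embedding (D,) and a
    visual feature map (H, W, D); high values mark likely sound sources."""
    a = audio_emb / (np.linalg.norm(audio_emb) + 1e-8)
    v = visual_feats / (np.linalg.norm(visual_feats, axis=-1, keepdims=True) + 1e-8)
    return v @ a  # (H, W) similarity map
```

In training, such a map is usually sharpened with a temperature and supervised by a contrastive loss pairing each video's audio with its own frames against other clips in the batch.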
1 code implementation • 8 Dec 2021 • Chen Ju, Tengda Han, Kunhao Zheng, Ya Zhang, Weidi Xie
Image-based visual-language (I-VL) pre-training has shown great success in learning joint visual-textual representations from large-scale web data, demonstrating a remarkable ability for zero-shot generalisation.
Ranked #5 on Zero-Shot Action Detection on ActivityNet-1.3
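Adapting an image-text model to zero-shot action recognition usually reduces to comparing frame embeddings against embeddings of class-name prompts. Below is a minimal NumPy sketch of that CLIP-style scoring; the temperature value is an assumption and this is not the paper's pipeline, which additionally learns prompt vectors and temporal modelling.

```python
import numpy as np

def zero_shot_scores(frame_embs, class_text_embs, temperature=0.07):
    """Score frames against class-name prompt embeddings:
    cosine similarity scaled by a temperature, softmax over classes."""
    f = frame_embs / np.linalg.norm(frame_embs, axis=-1, keepdims=True)
    t = class_text_embs / np.linalg.norm(class_text_embs, axis=-1, keepdims=True)
    logits = (f @ t.T) / temperature            # (num_frames, num_classes)
    logits -= logits.max(axis=-1, keepdims=True)  # stabilise the softmax
    probs = np.exp(logits)
    return probs / probs.sum(axis=-1, keepdims=True)
```

Because no classifier weights are trained, unseen action categories can be scored simply by embedding their names as prompts.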
no code implementations • 6 Apr 2021 • Chen Ju, Peisen Zhao, Siheng Chen, Ya Zhang, Xiaoyun Zhang, Qi Tian
To solve this issue, we introduce an adaptive mutual supervision framework (AMS) with two branches, where the base branch adopts CAS to localize the most discriminative action regions, while the supplementary branch localizes the less discriminative action regions through a novel adaptive sampler.
Ranked #7 on Weakly Supervised Action Localization on THUMOS14
Weakly Supervised Action Localization • Weakly-Supervised Temporal Action Localization +1
no code implementations • ICCV 2021 • Chen Ju, Peisen Zhao, Siheng Chen, Ya Zhang, Yanfeng Wang, Qi Tian
Single-frame temporal action localization (STAL) aims to localize actions in untrimmed videos with only one timestamp annotation for each action instance.
no code implementations • 15 Dec 2020 • Chen Ju, Peisen Zhao, Ya Zhang, Yanfeng Wang, Qi Tian
Point-Level temporal action localization (PTAL) aims to localize actions in untrimmed videos with only one timestamp annotation for each action instance.
Ranked #3 on Weakly Supervised Action Localization on BEOID
1 code implementation • ECCV 2020 • Peisen Zhao, Lingxi Xie, Chen Ju, Ya Zhang, Yan-Feng Wang, Qi Tian
To alleviate this problem, we introduce two regularization terms that mutually regularize the learning procedure: Intra-phase Consistency (IntraC) regularization enforces consistent predictions within each phase, and Inter-phase Consistency (InterC) regularization keeps predictions consistent across phases.
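Intra-phase consistency can be illustrated by penalising the variance of per-frame predictions inside each region where a phase is active. The NumPy sketch below is a generic stand-in, assuming a per-frame probability sequence and `(start, end)` regions; it is not the paper's exact formulation.

```python
import numpy as np

def intra_consistency(phase_probs, regions):
    """Penalise disagreement inside each region of one phase.

    phase_probs: (T,) per-frame probabilities for a phase (e.g. "starting").
    regions: list of (start, end) index pairs where that phase is active.
    Returns the mean within-region variance of the predictions.
    """
    loss = 0.0
    for s, e in regions:
        seg = phase_probs[s:e]
        loss += np.mean((seg - seg.mean()) ** 2)  # variance within the region
    return loss / max(len(regions), 1)
```

A region with uniform predictions incurs (numerically) zero penalty, while disagreement inside a region is pushed down during training.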