Search Results for author: Yueting Zhuang

Found 114 papers, 40 papers with code

De-Biased Court's View Generation with Causality

no code implementations EMNLP 2020 Yiquan Wu, Kun Kuang, Yating Zhang, Xiaozhong Liu, Changlong Sun, Jun Xiao, Yueting Zhuang, Luo Si, Fei Wu

Court's view generation is a novel but essential task for legal AI, aiming at improving the interpretability of judgment prediction results and enabling automatic legal document generation.

counterfactual Text Generation

Triad: A Framework Leveraging a Multi-Role LLM-based Agent to Solve Knowledge Base Question Answering

no code implementations 22 Feb 2024 Chang Zong, Yuchen Yan, Weiming Lu, Eliot Huang, Jian Shao, Yueting Zhuang

We evaluated the performance of our framework using three benchmark datasets, and the results show that our framework outperforms state-of-the-art systems on the LC-QuAD and YAGO-QA benchmarks, yielding F1 scores of 11.8% and 20.7%, respectively.

Knowledge Base Question Answering

Self-Contrast: Better Reflection Through Inconsistent Solving Perspectives

no code implementations 4 Jan 2024 Wenqi Zhang, Yongliang Shen, Linjuan Wu, Qiuying Peng, Jun Wang, Yueting Zhuang, Weiming Lu

Experiments conducted on a series of reasoning and translation tasks with different LLMs serve to underscore the effectiveness and generality of our strategy.

Language Modelling Large Language Model

HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data

1 code implementation 22 Nov 2023 Qifan Yu, Juncheng Li, Longhui Wei, Liang Pang, Wentao Ye, Bosheng Qin, Siliang Tang, Qi Tian, Yueting Zhuang

Multi-modal Large Language Models (MLLMs) tuned on machine-generated instruction-following data have demonstrated remarkable performance in various multi-modal understanding and generation tasks.

Attribute counterfactual +3

Revisiting the Domain Shift and Sample Uncertainty in Multi-source Active Domain Transfer

no code implementations 21 Nov 2023 Wenqiao Zhang, Zheqi Lv, Hao Zhou, Jia-Wei Liu, Juncheng Li, Mengze Li, Siliang Tang, Yueting Zhuang

Active Domain Adaptation (ADA) aims to maximally boost model adaptation in a new target domain by actively selecting a limited number of target data to annotate. This setting neglects the more practical scenario where training data are collected from multiple sources.

Domain Adaptation Transfer Learning

De-fine: Decomposing and Refining Visual Programs with Auto-Feedback

no code implementations 21 Nov 2023 Minghe Gao, Juncheng Li, Hao Fei, Liang Pang, Wei Ji, Guoming Wang, Wenqiao Zhang, Siliang Tang, Yueting Zhuang

Visual programming, a modular and generalizable paradigm, integrates different modules and Python operators to solve various vision-language tasks.

Logical Reasoning

Adapt Anything: Tailor Any Image Classifiers across Domains And Categories Using Text-to-Image Diffusion Models

no code implementations 25 Oct 2023 WeiJie Chen, Haoyu Wang, Shicai Yang, Lei Zhang, Wei Wei, Yanning Zhang, Luojun Lin, Di Xie, Yueting Zhuang

Such a one-for-all adaptation paradigm allows us to adapt anything in the world using only one text-to-image generator as well as the corresponding unlabeled target data.

Domain Adaptation Image Classification

Improving Vision Anomaly Detection with the Guidance of Language Modality

1 code implementation 4 Oct 2023 Dong Chen, Kaihang Pan, Guoming Wang, Yueting Zhuang, Siliang Tang

To learn a more compact latent space for the vision anomaly detector, CMLE learns a correlation structure matrix from the language modality, and then the latent space of vision modality will be learned with the guidance of the matrix.

Anomaly Detection Defect Detection +1

Dancing Avatar: Pose and Text-Guided Human Motion Videos Synthesis with Image Diffusion Model

no code implementations 15 Aug 2023 Bosheng Qin, Wentao Ye, Qifan Yu, Siliang Tang, Yueting Zhuang

Our approach employs a pretrained T2I diffusion model to generate each video frame in an autoregressive fashion.

Image Inpainting

Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions

1 code implementation 8 Aug 2023 Juncheng Li, Kaihang Pan, Zhiqi Ge, Minghe Gao, Hanwang Zhang, Wei Ji, Wenqiao Zhang, Tat-Seng Chua, Siliang Tang, Yueting Zhuang

This shortcoming results in MLLMs' underperformance in comprehending demonstrative instructions consisting of multiple, interleaved, and multimodal instructions that demonstrate the required context to complete a task.

Image Captioning Instruction Following

Degeneration-Tuning: Using Scrambled Grid shield Unwanted Concepts from Stable Diffusion

no code implementations 2 Aug 2023 Zixuan Ni, Longhui Wei, Jiacheng Li, Siliang Tang, Yueting Zhuang, Qi Tian

In this work, we propose a novel strategy named Degeneration-Tuning (DT) to shield contents of unwanted concepts from SD weights.

ZJU ReLER Submission for EPIC-KITCHEN Challenge 2023: TREK-150 Single Object Tracking

no code implementations 5 Jul 2023 Yuanyou Xu, Jiahao Li, Zongxin Yang, Yi Yang, Yueting Zhuang

MSDeAOT efficiently propagates object masks from previous frames to the current frame using two feature scales of 16 and 8.

Object Segmentation +4

ZJU ReLER Submission for EPIC-KITCHEN Challenge 2023: Semi-Supervised Video Object Segmentation

no code implementations 5 Jul 2023 Jiahao Li, Yuanyou Xu, Zongxin Yang, Yi Yang, Yueting Zhuang

The Associating Objects with Transformers (AOT) framework has exhibited exceptional performance in a wide range of complex scenarios for video object segmentation.

Object Position +4

Improving Reference-based Distinctive Image Captioning with Contrastive Rewards

no code implementations 25 Jun 2023 Yangjun Mao, Jun Xiao, Dong Zhang, Meng Cao, Jian Shao, Yueting Zhuang, Long Chen

A recent DIC method proposes to generate distinctive captions by comparing the target image with a set of semantic-similar reference images, i.e., reference-based DIC (Ref-DIC).

Benchmarking Contrastive Learning +1

Data-Copilot: Bridging Billions of Data and Humans with Autonomous Workflow

1 code implementation 12 Jun 2023 Wenqi Zhang, Yongliang Shen, Weiming Lu, Yueting Zhuang

Various industries such as finance, meteorology, and energy generate vast amounts of heterogeneous data every day.

Interactive Data Synthesis for Systematic Vision Adaptation via LLMs-AIGCs Collaboration

1 code implementation 22 May 2023 Qifan Yu, Juncheng Li, Wentao Ye, Siliang Tang, Yueting Zhuang

Recent text-to-image generation models have shown promising results in generating high-fidelity photo-realistic images.

Data Augmentation Prompt Engineering

DiffusionNER: Boundary Diffusion for Named Entity Recognition

2 code implementations 22 May 2023 Yongliang Shen, Kaitao Song, Xu Tan, Dongsheng Li, Weiming Lu, Yueting Zhuang

In this paper, we propose DiffusionNER, which formulates the named entity recognition task as a boundary-denoising diffusion process and thus generates named entities from noisy spans.

Chinese Named Entity Recognition Denoising +4

InstructVid2Vid: Controllable Video Editing with Natural Language Instructions

no code implementations 21 May 2023 Bosheng Qin, Juncheng Li, Siliang Tang, Tat-Seng Chua, Yueting Zhuang

To improve the consistency between adjacent frames of generated videos, we propose the Frame Difference Loss, which is incorporated during the training process.

Attribute Image Generation +2

Zero-shot Visual Relation Detection via Composite Visual Cues from Large Language Models

1 code implementation NeurIPS 2023 Lin Li, Jun Xiao, Guikun Chen, Jian Shao, Yueting Zhuang, Long Chen

To dynamically fuse different cues, we further introduce a chain-of-thought method that prompts LLMs to generate reasonable weights for different visual cues.

Relation

Continual Vision-Language Representation Learning with Off-Diagonal Information

no code implementations 11 May 2023 Zixuan Ni, Longhui Wei, Siliang Tang, Yueting Zhuang, Qi Tian

Moreover, we empirically and theoretically demonstrate how SD leads to a performance decline for CLIP on cross-modal retrieval tasks.

Continual Learning Contrastive Learning +4

HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face

1 code implementation NeurIPS 2023 Yongliang Shen, Kaitao Song, Xu Tan, Dongsheng Li, Weiming Lu, Yueting Zhuang

Solving complicated AI tasks with different domains and modalities is a key step toward artificial general intelligence.

Philosophy

Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World

1 code implementation ICCV 2023 Qifan Yu, Juncheng Li, Yu Wu, Siliang Tang, Wei Ji, Yueting Zhuang

Based on that, we further introduce a novel Entangled cross-modal prompt approach for open-world predicate scene graph generation (Epic), where models can generalize to unseen predicates in a zero-shot manner.

Graph Generation Language Modelling +1

Gradient-Regulated Meta-Prompt Learning for Generalizable Vision-Language Models

no code implementations ICCV 2023 Juncheng Li, Minghe Gao, Longhui Wei, Siliang Tang, Wenqiao Zhang, Mengze Li, Wei Ji, Qi Tian, Tat-Seng Chua, Yueting Zhuang

Prompt tuning, a recently emerging paradigm, enables the powerful vision-language pre-training models to adapt to downstream tasks in a parameter- and data-efficient way, by learning the "soft prompts" to condition frozen pre-training models.

Domain Generalization Few-Shot Learning

Lformer: Text-to-Image Generation with L-shape Block Parallel Decoding

no code implementations 7 Mar 2023 Jiacheng Li, Longhui Wei, Zongyuan Zhan, Xin He, Siliang Tang, Qi Tian, Yueting Zhuang

To better accelerate the generative transformers while keeping good generation quality, we propose Lformer, a semi-autoregressive text-to-image generation model.

Variational Cross-Graph Reasoning and Adaptive Structured Semantics Learning for Compositional Temporal Grounding

no code implementations 22 Jan 2023 Juncheng Li, Siliang Tang, Linchao Zhu, Wenqiao Zhang, Yi Yang, Tat-Seng Chua, Fei Wu, Yueting Zhuang

To systematically benchmark the compositional generalizability of temporal grounding models, we introduce a new Compositional Temporal Grounding task and construct two new dataset splits, i.e., Charades-CG and ActivityNet-CG.

Semantic correspondence Sentence

1st Place Solution for ECCV 2022 OOD-CV Challenge Object Detection Track

no code implementations 12 Jan 2023 Wei Zhao, Binbin Chen, WeiJie Chen, Shicai Yang, Di Xie, ShiLiang Pu, Yueting Zhuang

The domain adaptation part is implemented as a Source-Free Domain Adaptation paradigm, which only uses the pre-trained model and the unlabeled target data to further optimize in a self-supervised training manner.

Domain Generalization object-detection +3

1st Place Solution for ECCV 2022 OOD-CV Challenge Image Classification Track

no code implementations 12 Jan 2023 Yilu Guo, Xingyue Shi, WeiJie Chen, Shicai Yang, Di Xie, ShiLiang Pu, Yueting Zhuang

In the test-time training stage, we use the pre-trained model to assign noisy labels to the unlabeled target data, and propose a Label-Periodically-Updated DivideMix method for noisy label learning.

Data Augmentation Domain Generalization +3

Unsupervised Prompt Tuning for Text-Driven Object Detection

no code implementations ICCV 2023 Weizhen He, WeiJie Chen, Binbin Chen, Shicai Yang, Di Xie, Luojun Lin, Donglian Qi, Yueting Zhuang

In this paper, we delve into this problem and propose an Unsupervised Prompt Tuning framework for text-driven object detection, which is composed of two novel mean teaching mechanisms.

Data Augmentation Object +4

DBA: Efficient Transformer with Dynamic Bilinear Low-Rank Attention

no code implementations 24 Nov 2022 Bosheng Qin, Juncheng Li, Siliang Tang, Yueting Zhuang

Furthermore, we show that the hidden state dimension can be approximated by extending the Johnson-Lindenstrauss lemma, optimizing the attention in bilinear form.

LEMMA

Citation Trajectory Prediction via Publication Influence Representation Using Temporal Knowledge Graph

no code implementations 2 Oct 2022 Chang Zong, Yueting Zhuang, Weiming Lu, Jian Shao, Siliang Tang

In this paper, we propose CTPIR, a new citation trajectory prediction framework that is able to represent the influence (the momentum of citation) of either new or existing publications using the history information of all their attributes.

Attribute Graph Embedding +1

Dilated Context Integrated Network with Cross-Modal Consensus for Temporal Emotion Localization in Videos

1 code implementation 3 Aug 2022 Juncheng Li, Junlin Xie, Linchao Zhu, Long Qian, Siliang Tang, Wenqiao Zhang, Haochen Shi, Shengyu Zhang, Longhui Wei, Qi Tian, Yueting Zhuang

In this paper, we introduce a new task, named Temporal Emotion Localization in videos (TEL), which aims to detect human emotions and localize their corresponding temporal boundaries in untrimmed videos with aligned subtitles.

Emotion Classification Temporal Action Localization +1

BOSS: Bottom-up Cross-modal Semantic Composition with Hybrid Counterfactual Training for Robust Content-based Image Retrieval

no code implementations 9 Jul 2022 Wenqiao Zhang, Jiannan Guo, Mengze Li, Haochen Shi, Shengyu Zhang, Juncheng Li, Siliang Tang, Yueting Zhuang

In this scenario, the input image serves as an intuitive context and background for the search, while the corresponding language expressly requests new traits on how specific characteristics of the query image should be modified in order to get the intended target image.

Content-Based Image Retrieval counterfactual +2

ReLER@ZJU-Alibaba Submission to the Ego4D Natural Language Queries Challenge 2022

1 code implementation 1 Jul 2022 Naiyuan Liu, Xiaohan Wang, Xiaobo Li, Yi Yang, Yueting Zhuang

In this report, we present the ReLER@ZJU-Alibaba submission to the Ego4D Natural Language Queries (NLQ) Challenge in CVPR 2022.

Data Augmentation Natural Language Queries

Slimmable Domain Adaptation

1 code implementation CVPR 2022 Rang Meng, WeiJie Chen, Shicai Yang, Jie Song, Luojun Lin, Di Xie, ShiLiang Pu, Xinchao Wang, Mingli Song, Yueting Zhuang

In this paper, we introduce a simple framework, Slimmable Domain Adaptation, to improve cross-domain generalization with a weight-sharing model bank, from which models of different capacities can be sampled to accommodate different accuracy-efficiency trade-offs.

Domain Generalization Unsupervised Domain Adaptation

Label Matching Semi-Supervised Object Detection

3 code implementations CVPR 2022 Binbin Chen, WeiJie Chen, Shicai Yang, Yunyi Xuan, Jie Song, Di Xie, ShiLiang Pu, Mingli Song, Yueting Zhuang

To remedy this issue, we present a novel label assignment mechanism for self-training framework, namely proposal self-assignment, which injects the proposals from student into teacher and generates accurate pseudo labels to match each proposal in the student model accordingly.

Object object-detection +2

Transductive CLIP with Class-Conditional Contrastive Learning

no code implementations 13 Jun 2022 Junchu Huang, WeiJie Chen, Shicai Yang, Di Xie, ShiLiang Pu, Yueting Zhuang

This framework can reduce the impact of noisy labels from CLIP model effectively by combining both techniques.

Contrastive Learning Pseudo Label +1

Learning Domain Adaptive Object Detection with Probabilistic Teacher

2 code implementations 13 Jun 2022 Meilin Chen, WeiJie Chen, Shicai Yang, Jie Song, Xinchao Wang, Lei Zhang, Yunfeng Yan, Donglian Qi, Yueting Zhuang, Di Xie, ShiLiang Pu

In addition, we conduct anchor adaptation in parallel with localization adaptation, since anchor can be regarded as a learnable parameter.

Object object-detection +1

Robust Meta-learning with Sampling Noise and Label Noise via Eigen-Reptile

1 code implementation 4 Jun 2022 Dong Chen, Lingfei Wu, Siliang Tang, Xiao Yun, Bo Long, Yueting Zhuang

Moreover, when handling the data with noisy labels, the meta-learner could be extremely sensitive to label noise on a corrupted dataset.

Few-Shot Learning

Compositional Temporal Grounding with Structured Variational Cross-Graph Correspondence Learning

1 code implementation CVPR 2022 Juncheng Li, Junlin Xie, Long Qian, Linchao Zhu, Siliang Tang, Fei Wu, Yi Yang, Yueting Zhuang, Xin Eric Wang

To systematically measure the compositional generalizability of temporal grounding models, we introduce a new Compositional Temporal Grounding task and construct two new dataset splits, i.e., Charades-CG and ActivityNet-CG.

Semantic correspondence Sentence

Boosting RGB-D Saliency Detection by Leveraging Unlabeled RGB Images

1 code implementation 1 Jan 2022 Xiaoqiang Wang, Lei Zhu, Siliang Tang, Huazhu Fu, Ping Li, Fei Wu, Yi Yang, Yueting Zhuang

The depth estimation branch is trained with RGB-D images and then used to estimate the pseudo depth maps for all unlabeled RGB images to form the paired data.

Depth Estimation object-detection +3

Learning To Learn by Jointly Optimizing Neural Architecture and Weights

no code implementations CVPR 2022 Yadong Ding, Yu Wu, Chengyue Huang, Siliang Tang, Yi Yang, Longhui Wei, Yueting Zhuang, Qi Tian

Existing NAS-based meta-learning methods apply a two-stage strategy, i.e., first searching architectures and then re-training meta-weights on the searched architecture.

Meta-Learning

Consensus Graph Representation Learning for Better Grounded Image Captioning

no code implementations 2 Dec 2021 Wenqiao Zhang, Haochen Shi, Siliang Tang, Jun Xiao, Qiang Yu, Yueting Zhuang

Contemporary visual captioning models frequently hallucinate objects that are not actually in a scene, due to visual misclassification or over-reliance on priors, which results in semantic inconsistency between the visual information and the target lexical words.

Graph Representation Learning Hallucination +1

Relational Graph Learning for Grounded Video Description Generation

no code implementations 2 Dec 2021 Wenqiao Zhang, Xin Eric Wang, Siliang Tang, Haizhou Shi, Haocheng Shi, Jun Xiao, Yueting Zhuang, William Yang Wang

Such a setting can help explain the decisions of captioning models and prevents the model from hallucinating object words in its description.

Graph Learning Hallucination +2

Learning to Generate Visual Questions with Noisy Supervision

1 code implementation NeurIPS 2021 Shen Kai, Lingfei Wu, Siliang Tang, Yueting Zhuang, Zhen He, Zhuoye Ding, Yun Xiao, Bo Long

The task of visual question generation (VQG) aims to generate human-like neural questions from an image and potentially other side information (e.g., answer type or the answer itself).

Question Generation Question-Generation +1

Self-Supervised Class Incremental Learning

no code implementations 18 Nov 2021 Zixuan Ni, Siliang Tang, Yueting Zhuang

Existing Class Incremental Learning (CIL) methods are based on a supervised classification framework sensitive to data labels.

Class Incremental Learning Data Augmentation +2

Towards Communication-Efficient and Privacy-Preserving Federated Representation Learning

no code implementations 29 Sep 2021 Haizhou Shi, Youcai Zhang, Zijin Shen, Siliang Tang, Yaqian Li, Yandong Guo, Yueting Zhuang

This paper investigates the feasibility of federated representation learning under the constraints of communication cost and privacy protection.

Contrastive Learning Federated Learning +2

Natural Language Video Localization with Learnable Moment Proposals

1 code implementation EMNLP 2021 Shaoning Xiao, Long Chen, Jian Shao, Yueting Zhuang, Jun Xiao

Given an untrimmed video and a natural language query, Natural Language Video Localization (NLVL) aims to identify the video moment described by the query.

Adaptive Hierarchical Graph Reasoning with Semantic Coherence for Video-and-Language Inference

no code implementations ICCV 2021 Juncheng Li, Siliang Tang, Linchao Zhu, Haochen Shi, Xuanwen Huang, Fei Wu, Yi Yang, Yueting Zhuang

Secondly, we introduce semantic coherence learning to explicitly encourage the semantic coherence of the adaptive hierarchical graph network from three hierarchies.

Revisiting Catastrophic Forgetting in Class Incremental Learning

no code implementations 26 Jul 2021 Zixuan Ni, Haizhou Shi, Siliang Tang, Longhui Wei, Qi Tian, Yueting Zhuang

After investigating existing strategies, we observe that there is a lack of study on how to prevent the inter-phase confusion.

Class Incremental Learning Contrastive Learning +2

Empower Distantly Supervised Relation Extraction with Collaborative Adversarial Training

1 code implementation 21 Jun 2021 Tao Chen, Haochen Shi, Liyuan Liu, Siliang Tang, Jian Shao, Zhigang Chen, Yueting Zhuang

In this paper, we propose collaborative adversarial training to improve the data utilization, which coordinates virtual adversarial training (VAT) and adversarial training (AT) at different levels.

Relation Relation Extraction

CIL: Contrastive Instance Learning Framework for Distantly Supervised Relation Extraction

1 code implementation ACL 2021 Tao Chen, Haizhou Shi, Siliang Tang, Zhigang Chen, Fei Wu, Yueting Zhuang

The journey of reducing noise from distant supervision (DS) generated training data has been started since the DS was first introduced into the relation extraction (RE) task.

Relation Relation Extraction +1

Deep Learning for Weakly-Supervised Object Detection and Object Localization: A Survey

no code implementations 26 May 2021 Feifei Shao, Long Chen, Jian Shao, Wei Ji, Shaoning Xiao, Lu Ye, Yueting Zhuang, Jun Xiao

With the success of deep neural networks in object detection, both WSOD and WSOL have received unprecedented attention.

Object object-detection +2

A Sequence-to-Set Network for Nested Named Entity Recognition

1 code implementation 19 May 2021 Zeqi Tan, Yongliang Shen, Shuai Zhang, Weiming Lu, Yueting Zhuang

We utilize a non-autoregressive decoder to predict the final set of entities in one pass, in which we are able to capture dependencies between entities.

named-entity-recognition Named Entity Recognition +2

VL-NMS: Breaking Proposal Bottlenecks in Two-Stage Visual-Language Matching

no code implementations 12 May 2021 Chenchi Zhang, Wenbo Ma, Jun Xiao, Hanwang Zhang, Jian Shao, Yueting Zhuang, Long Chen

In this paper, we argue that these methods overlook an obvious mismatch between the roles of proposals in the two stages: they generate proposals solely based on the detection confidence (i.e., query-agnostic), hoping that the proposals contain all instances mentioned in the text query (i.e., query-aware).

Image-text matching Referring Expression +2

Self-Supervised Noisy Label Learning for Source-Free Unsupervised Domain Adaptation

no code implementations 23 Feb 2021 WeiJie Chen, Luojun Lin, Shicai Yang, Di Xie, ShiLiang Pu, Yueting Zhuang, Wenqi Ren

Usually, the given source domain pre-trained model is expected to optimize with only unlabeled target data, which is termed as source-free unsupervised domain adaptation.

Self-Supervised Learning Unsupervised Domain Adaptation

Ask Question with Double Hints: Visual Question Generation with Answer-awareness and Region-reference

no code implementations 1 Jan 2021 Shen Kai, Lingfei Wu, Siliang Tang, Fangli Xu, Zhu Zhang, Yu Qiang, Yueting Zhuang

The task of visual question generation (VQG) aims to generate human-like questions from an image and potentially other side information (e.g., answer type or the answer itself).

Graph-to-Sequence Question Generation +1

Connection-Adaptive Meta-Learning

no code implementations 1 Jan 2021 Yadong Ding, Yu Wu, Chengyue Huang, Siliang Tang, Yi Yang, Yueting Zhuang

In this paper, we aim to obtain better meta-learners by co-optimizing the architecture and meta-weights simultaneously.

Meta-Learning

Run Away From your Teacher: a New Self-Supervised Approach Solving the Puzzle of BYOL

no code implementations 1 Jan 2021 Haizhou Shi, Dongliang Luo, Siliang Tang, Jian Wang, Yueting Zhuang

Recently, a newly proposed self-supervised framework Bootstrap Your Own Latent (BYOL) seriously challenges the necessity of negative samples in contrastive-based learning frameworks.

Self-Supervised Learning

Robust Meta-learning with Noise via Eigen-Reptile

no code implementations 1 Jan 2021 Dong Chen, Lingfei Wu, Siliang Tang, Fangli Xu, Juncheng Li, Chang Zong, Chilie Tan, Yueting Zhuang

In particular, we first cast the meta-overfitting problem (overfitting on sampling and label noise) as a gradient noise problem since few available samples cause meta-learner to overfit on existing examples (clean or corrupted) of an individual task at every gradient step.

Few-Shot Learning

Differentiable Graph Optimization for Neural Architecture Search

no code implementations 1 Jan 2021 Chengyue Huang, Lingfei Wu, Yadong Ding, Siliang Tang, Fangli Xu, Chang Zong, Chilie Tan, Yueting Zhuang

To this end, we learn a differentiable graph neural network as a surrogate model to rank candidate architectures, which enables us to obtain gradients w.r.t. the input architectures.

Bayesian Optimization Neural Architecture Search

Semi-Supervised Active Learning for Semi-Supervised Models: Exploit Adversarial Examples With Graph-Based Virtual Labels

no code implementations ICCV 2021 Jiannan Guo, Haochen Shi, Yangyang Kang, Kun Kuang, Siliang Tang, Zhuoren Jiang, Changlong Sun, Fei Wu, Yueting Zhuang

Although current mainstream methods begin to combine SSL and AL (SSL-AL) to excavate the diverse expressions of unlabeled samples, these methods' fully supervised task models are still trained only with labeled data.

Active Learning

Run Away From your Teacher: Understanding BYOL by a Novel Self-Supervised Approach

no code implementations 22 Nov 2020 Haizhou Shi, Dongliang Luo, Siliang Tang, Jian Wang, Yueting Zhuang

Recently, a newly proposed self-supervised framework Bootstrap Your Own Latent (BYOL) seriously challenges the necessity of negative samples in contrastive learning frameworks.

Contrastive Learning Self-Supervised Learning

Federated Unsupervised Representation Learning

no code implementations 18 Oct 2020 Fengda Zhang, Kun Kuang, Zhaoyang You, Tao Shen, Jun Xiao, Yin Zhang, Chao Wu, Yueting Zhuang, Xiaolin Li

FURL poses two new challenges: (1) data distribution shift (Non-IID distribution) among clients would make local models focus on different categories, leading to the inconsistency of representation spaces.

Federated Learning Representation Learning

Two Step Joint Model for Drug Drug Interaction Extraction

no code implementations 28 Aug 2020 Siliang Tang, Qi Zhang, Tianpeng Zheng, Mengdi Zhou, Zhan Chen, Lixing Shen, Xiang Ren, Yueting Zhuang, ShiLiang Pu, Fei Wu

When patients need to take medicine, particularly taking more than one kind of drug simultaneously, they should be alarmed that there possibly exists drug-drug interaction.

Drug–drug Interaction Extraction named-entity-recognition +4

Topic Adaptation and Prototype Encoding for Few-Shot Visual Storytelling

no code implementations 11 Aug 2020 Jiacheng Li, Siliang Tang, Juncheng Li, Jun Xiao, Fei Wu, ShiLiang Pu, Yueting Zhuang

In this paper, we focus on enhancing the generalization ability of the VIST model by considering the few-shot setting.

Meta-Learning Visual Storytelling

Learning Decomposed Representation for Counterfactual Inference

1 code implementation 12 Jun 2020 Anpeng Wu, Kun Kuang, Junkun Yuan, Bo Li, Runze Wu, Qiang Zhu, Yueting Zhuang, Fei Wu

The fundamental problem in treatment effect estimation from observational data is confounder identification and balancing.

counterfactual Counterfactual Inference

Stable Prediction via Leveraging Seed Variable

no code implementations 9 Jun 2020 Kun Kuang, Bo Li, Peng Cui, Yue Liu, Jianrong Tao, Yueting Zhuang, Fei Wu

By assuming the relationships between causal variables and response variable are invariant across data, to address this problem, we propose a conditional independence test based algorithm to separate those causal variables with a seed variable as priori, and adopt them for stable prediction.

Test

Balance-Subsampled Stable Prediction

no code implementations 8 Jun 2020 Kun Kuang, Hengtao Zhang, Fei Wu, Yueting Zhuang, Aijun Zhang

However, this assumption is often violated in practice because the sample selection bias may induce the distribution shift from training data to test data.

Selection bias Test

Counterfactual Samples Synthesizing for Robust Visual Question Answering

2 code implementations CVPR 2020 Long Chen, Xin Yan, Jun Xiao, Hanwang Zhang, ShiLiang Pu, Yueting Zhuang

To reduce the language biases, several recent works introduce an auxiliary question-only model to regularize the training of targeted VQA model, and achieve dominating performance on VQA-CP.

Ranked #1 on Visual Question Answering (VQA) on VQA-CP (using extra training data)

counterfactual Question Answering +2

Bi-Decoder Augmented Network for Neural Machine Translation

no code implementations 14 Jan 2020 Boyuan Pan, Yazheng Yang, Zhou Zhao, Yueting Zhuang, Deng Cai

Neural Machine Translation (NMT) has become a popular technology in recent years, and the encoder-decoder framework is the mainstream among all the methods.

Machine Translation NMT +1

Deep Neural Network for Fast and Accurate Single Image Super-Resolution via Channel-Attention-based Fusion of Orientation-aware Features

no code implementations 9 Dec 2019 Du Chen, Zewei He, Yanpeng Cao, Jiangxin Yang, Yanlong Cao, Michael Ying Yang, Siliang Tang, Yueting Zhuang

Firstly, we proposed a novel Orientation-Aware feature extraction and fusion Module (OAM), which contains a mixture of 1D and 2D convolutional kernels (i.e., 5 x 1, 1 x 5, and 3 x 3) for extracting orientation-aware features.

Computational Efficiency Image Super-Resolution

Unsupervised Reinforcement Learning of Transferable Meta-Skills for Embodied Navigation

no code implementations CVPR 2020 Juncheng Li, Xin Wang, Siliang Tang, Haizhou Shi, Fei Wu, Yueting Zhuang, William Yang Wang

Visual navigation is a task of training an embodied agent by intelligently navigating to a target object (e.g., television) using only visual observations.

Object reinforcement-learning +3

Time2Graph: Revisiting Time Series Modeling with Dynamic Shapelets

1 code implementation 11 Nov 2019 Ziqiang Cheng, Yang Yang, Wei Wang, Wenjie Hu, Yueting Zhuang, Guojie Song

Time series modeling has attracted extensive research efforts; however, achieving both reliable efficiency and interpretability from a unified model still remains a challenging problem.

Graph Embedding Time Series +1

Video Dialog via Progressive Inference and Cross-Transformer

no code implementations IJCNLP 2019 Weike Jin, Zhou Zhao, Mao Gu, Jun Xiao, Furu Wei, Yueting Zhuang

Video dialog is a new and challenging task, which requires the agent to answer questions combining video information with dialog history.

Answer Generation Question Answering +4

Learning Dynamic Context Augmentation for Global Entity Linking

2 code implementations IJCNLP 2019 Xiyuan Yang, Xiaotao Gu, Sheng Lin, Siliang Tang, Yueting Zhuang, Fei Wu, Zhigang Chen, Guoping Hu, Xiang Ren

Despite the recent success of collective entity linking (EL) methods, these "global" inference methods may yield sub-optimal results when the "all-mention coherence" assumption breaks, and often suffer from high computational cost at the inference stage, due to the complex search space.

Entity Disambiguation Entity Linking +1

Walking with MIND: Mental Imagery eNhanceD Embodied QA

no code implementations 5 Aug 2019 Juncheng Li, Siliang Tang, Fei Wu, Yueting Zhuang

The experimental results and further analysis prove that the agent with the MIND module is superior to its counterparts not only in EQA performance but in many other aspects such as route planning, behavioral interpretation, and the ability to generalize from a few examples.

Informative Visual Storytelling with Cross-modal Rules

1 code implementation 7 Jul 2019 Jiacheng Li, Haizhou Shi, Siliang Tang, Fei Wu, Yueting Zhuang

To solve this problem, we propose a method to mine the cross-modal rules to help the model infer these informative concepts given certain visual input.

Visual Storytelling

Weak Supervision Enhanced Generative Network for Question Generation

no code implementations 1 Jul 2019 Yutong Wang, Jiyuan Zheng, Qijiong Liu, Zhou Zhao, Jun Xiao, Yueting Zhuang

More specifically, we devise a discriminator, Relation Guider, to capture the relations between the whole passage and the associated answer and then the Multi-Interaction mechanism is deployed to transfer the knowledge dynamically for our question generation system.

Question Answering Question Generation +1

Cross-relation Cross-bag Attention for Distantly-supervised Relation Extraction

1 code implementation 27 Dec 2018 Yujin Yuan, Liyuan Liu, Siliang Tang, Zhongfei Zhang, Yueting Zhuang, ShiLiang Pu, Fei Wu, Xiang Ren

Distant supervision leverages knowledge bases to automatically label instances, thus allowing us to train a relation extractor without human annotations.

Relation Relation Extraction +1

Representation Learning for Scale-free Networks

no code implementations 29 Nov 2017 Rui Feng, Yang Yang, Wenjie Hu, Fei Wu, Yueting Zhuang

Existing network embedding works primarily focus on preserving the microscopic structure, such as the first- and second-order proximity of vertexes, while the macroscopic scale-free property is largely ignored.

Link Prediction Network Embedding

Deeply-Learned Part-Aligned Representations for Person Re-Identification

1 code implementation ICCV 2017 Liming Zhao, Xi Li, Jingdong Wang, Yueting Zhuang

In this paper, we address the problem of person re-identification, which refers to associating the persons captured from different cameras.

Person Re-Identification

Video Question Answering via Attribute-Augmented Attention Network Learning

no code implementations 20 Jul 2017 Yunan Ye, Zhou Zhao, Yimeng Li, Long Chen, Jun Xiao, Yueting Zhuang

Video Question Answering is a challenging problem in visual information retrieval, which provides the answer to the referenced video content according to the question.

Attribute Information Retrieval +6

Zero-Shot Recognition using Dual Visual-Semantic Mapping Paths

no code implementations CVPR 2017 Yanan Li, Donghui Wang, Huanhang Hu, Yuetan Lin, Yueting Zhuang

This mapping is learned on training data of seen classes and is expected to have transfer ability to unseen classes.

Zero-Shot Learning

Task-driven Visual Saliency and Attention-based Visual Question Answering

no code implementations 22 Feb 2017 Yuetan Lin, Zhangyang Pang, Donghui Wang, Yueting Zhuang

Visual question answering (VQA) has witnessed great progress since May, 2015 as a classic problem unifying visual and textual data into a system.

Question Answering Visual Question Answering

Deep Learning Driven Visual Path Prediction from a Single Image

no code implementations 27 Jan 2016 Siyu Huang, Xi Li, Zhongfei Zhang, Zhouzhou He, Fei Wu, Wei Liu, Jinhui Tang, Yueting Zhuang

The highly effective visual representation and deep context models ensure that our framework makes a deep semantic understanding of the scene and motion pattern, consequently improving the performance of the visual path prediction task.

DeepSaliency: Multi-Task Deep Neural Network Model for Salient Object Detection

no code implementations 19 Oct 2015 Xi Li, Liming Zhao, Lina Wei, Ming-Hsuan Yang, Fei Wu, Yueting Zhuang, Haibin Ling, Jingdong Wang

A key problem in salient object detection is how to effectively model the semantic properties of salient objects in a data-driven manner.

Image Segmentation Multi-Task Learning +6

Online Metric-Weighted Linear Representations for Robust Visual Tracking

no code implementations 21 Jul 2015 Xi Li, Chunhua Shen, Anthony Dick, Zhongfei Zhang, Yueting Zhuang

Object identification results for an entire video sequence are achieved by systematically combining the tracking information and visual recognition at each frame.

Metric Learning Object +2

Metric Learning Driven Multi-Task Structured Output Optimization for Robust Keypoint Tracking

no code implementations 4 Dec 2014 Liming Zhao, Xi Li, Jun Xiao, Fei Wu, Yueting Zhuang

As an important and challenging problem in computer vision and graphics, keypoint-based object tracking is typically formulated in a spatio-temporal statistical learning framework.

Metric Learning Object Tracking
