1 code implementation • 19 Aug 2024 • Tianwei Lin, Jiang Liu, Wenqiao Zhang, Zhaocheng Li, Yang Dai, Haoyuan Li, Zhelun Yu, Wanggui He, Juncheng Li, Hao Jiang, Siliang Tang, Yueting Zhuang
Considering this, we introduce an innovative PEFT method, TeamLoRA, consisting of a collaboration and competition module for experts, and thus achieving the right balance of effectiveness and efficiency: (i) For collaboration, a novel knowledge-sharing and -organizing mechanism is devised to appropriately reduce the scale of matrix operations, thereby boosting the training and inference speed.
1 code implementation • 18 Aug 2024 • Jianhao Guo, Zixuan Ni, Yun Zhu, Siliang Tang
Continual graph learning, where graphs continuously evolve based on streaming graph data, presents unique challenges that require adaptive and efficient graph learning methods in addition to addressing the problem of catastrophic forgetting.
1 code implementation • 15 Aug 2024 • Boci Peng, Yun Zhu, Yongchao Liu, Xiaohe Bo, Haizhou Shi, Chuntao Hong, Yan Zhang, Siliang Tang
Recently, Retrieval-Augmented Generation (RAG) has achieved remarkable success in addressing the challenges of Large Language Models (LLMs) without necessitating retraining.
no code implementations • 28 Jul 2024 • Dong Chen, Shilin Zhang, Fei Gao, Yueting Zhuang, Siliang Tang, Qidong Liu, Mingliang Xu
Subsequently, based on the function base, LD fine-tunes S-LLMs to learn the logic employed by L-LLMs in planning and decision-making.
1 code implementation • 15 Jul 2024 • Jie Cao, Dian Jiao, Qiang Yan, Wenqiao Zhang, Siliang Tang, Yueting Zhuang
Query-focused summarization (QFS) aims to produce summaries that answer particular questions of interest, enabling greater user control and personalization.
1 code implementation • 6 Jul 2024 • Kai Shen, Lingfei Wu, Siliang Tang, Fangli Xu, Bo Long, Yueting Zhuang, Jian Pei
The visual question generation (VQG) task aims to generate human-like questions from an image and potentially other side information (e.g., answer type).
no code implementations • 18 Jun 2024 • Yaoke Wang, Yun Zhu, Wenqiao Zhang, Yueting Zhuang, Yunfei Li, Siliang Tang
Representation learning on text-attributed graphs (TAGs) is vital for real-world applications, as they combine semantic textual and contextual structural information.
1 code implementation • 15 Jun 2024 • Dong Chen, Shuo Zhang, Yueting Zhuang, Siliang Tang, Qidong Liu, Hua Wang, Mingliang Xu
On the other hand, certain tasks can be broken down into multiple subtasks, some of which can be completed without powerful capabilities.
no code implementations • 11 Jun 2024 • Aoxiong Yin, Haoyuan Li, Kai Shen, Siliang Tang, Yueting Zhuang
In this work, we propose a two-stage sign language production (SLP) paradigm that first encodes sign language sequences into discrete codes and then autoregressively generates sign language from text based on the learned codebook.
no code implementations • 12 May 2024 • Dian Jiao, Li Cai, Jingsheng Huang, Wenqiao Zhang, Siliang Tang, Yueting Zhuang
Retrieval-Augmented Generation (RAG) methods augment the input of Large Language Models (LLMs) with relevant retrieved passages, reducing factual errors in knowledge-intensive tasks.
1 code implementation • 3 May 2024 • Kaihang Pan, Siliang Tang, Juncheng Li, Zhaoyu Fan, Wei Chow, Shuicheng Yan, Tat-Seng Chua, Yueting Zhuang, Hanwang Zhang
For multimodal LLMs, the synergy of visual comprehension (textual output) and generation (visual output) presents an ongoing challenge.
1 code implementation • 28 Apr 2024 • Zhiqi Ge, Hongzhe Huang, Mingze Zhou, Juncheng Li, Guoming Wang, Siliang Tang, Yueting Zhuang
As for evaluation, we build WorldNet, a multimodal state transition prediction benchmark encompassing varied real-life scenarios.
no code implementations • 21 Apr 2024 • Haoyu Zheng, Wenqiao Zhang, Yaoke Wang, Hao Zhou, Jiang Liu, Juncheng Li, Zheqi Lv, Siliang Tang, Yueting Zhuang
Revolutionary advancements in text-to-image models have unlocked new dimensions for sophisticated content creation, e.g., text-conditioned image editing, allowing us to edit the diverse images that convey highly complex visual concepts according to the textual guidance.
no code implementations • 17 Apr 2024 • Minghe Gao, Shuang Chen, Liang Pang, Yuan YAO, Jisheng Dang, Wenqiao Zhang, Juncheng Li, Siliang Tang, Yueting Zhuang, Tat-Seng Chua
Their ability to execute intricate compositional reasoning tasks is also constrained, culminating in a stagnation of learning progression for these models.
1 code implementation • 20 Mar 2024 • Wenqiao Zhang, Tianwei Lin, Jiang Liu, Fangxun Shu, Haoyuan Li, Lei Zhang, He Wanggui, Hao Zhou, Zheqi Lv, Hao Jiang, Juncheng Li, Siliang Tang, Yueting Zhuang
Recent advancements indicate that scaling up Multimodal Large Language Models (MLLMs) effectively enhances performance on downstream multimodal tasks.
Ranked #112 on Visual Question Answering on MM-Vet
no code implementations • 5 Mar 2024 • Zeqian Ju, Yuancheng Wang, Kai Shen, Xu Tan, Detai Xin, Dongchao Yang, Yanqing Liu, Yichong Leng, Kaitao Song, Siliang Tang, Zhizheng Wu, Tao Qin, Xiang-Yang Li, Wei Ye, Shikun Zhang, Jiang Bian, Lei He, Jinyu Li, Sheng Zhao
Specifically, 1) we design a neural codec with factorized vector quantization (FVQ) to disentangle speech waveform into subspaces of content, prosody, timbre, and acoustic details; 2) we propose a factorized diffusion model to generate attributes in each subspace following its corresponding prompt.
1 code implementation • 18 Feb 2024 • Long Qian, Juncheng Li, Yu Wu, Yaobo Ye, Hao Fei, Tat-Seng Chua, Yueting Zhuang, Siliang Tang
Large Language Models (LLMs) demonstrate remarkable proficiency in comprehending and handling text-based tasks.
1 code implementation • 28 Jan 2024 • Yun Zhu, Yaoke Wang, Haizhou Shi, Siliang Tang
In light of recent advancements in large language models (LLMs), it is apparent that integrating LLMs for enhanced textual encoding can substantially improve the performance of textual graphs.
no code implementations • CVPR 2024 • Xinyi Jiang, Guoming Wang, Junhao Guo, Juncheng Li, Wenqiao Zhang, Rongxing Lu, Siliang Tang
On MM-Vet, our method improves the score from 31.1 to 32.4.
no code implementations • 22 Dec 2023 • Aoxiong Yin, Tianyun Zhong, Haoyuan Li, Siliang Tang, Zhou Zhao
Subsequently, we utilize the predicted source words to decode the output in advance.
1 code implementation • CVPR 2024 • Qifan Yu, Juncheng Li, Longhui Wei, Liang Pang, Wentao Ye, Bosheng Qin, Siliang Tang, Qi Tian, Yueting Zhuang
Multi-modal Large Language Models (MLLMs) tuned on machine-generated instruction-following data have demonstrated remarkable performance in various multi-modal understanding and generation tasks.
no code implementations • 21 Nov 2023 • Minghe Gao, Juncheng Li, Hao Fei, Liang Pang, Wei Ji, Guoming Wang, Zheqi Lv, Wenqiao Zhang, Siliang Tang, Yueting Zhuang
Visual programming, a modular and generalizable paradigm, integrates different modules and Python operators to solve various vision-language tasks.
no code implementations • CVPR 2024 • Wenqiao Zhang, Zheqi Lv, Hao Zhou, Jia-Wei Liu, Juncheng Li, Mengze Li, Siliang Tang, Yueting Zhuang
Active Domain Adaptation (ADA) aims to maximally boost model adaptation in a new target domain by actively selecting a limited number of target data to annotate. This setting neglects the more practical scenario where training data are collected from multiple sources.
1 code implementation • 15 Oct 2023 • Xiangnan Chen, Wen Zhang, Zhen Yao, Mingyang Chen, Siliang Tang
Most existing negative sampling methods assume that non-existent triples with high scores are high-quality negative triples.
no code implementations • 11 Oct 2023 • Yun Zhu, Yaoke Wang, Haizhou Shi, Zhenshuo Zhang, Dian Jiao, Siliang Tang
These pre-trained models can be applied to various downstream Web applications, saving training time and improving downstream (target) performance.
1 code implementation • 4 Oct 2023 • Dong Chen, Kaihang Pan, Guoming Wang, Yueting Zhuang, Siliang Tang
To learn a more compact latent space for the vision anomaly detector, CMLE learns a correlation structure matrix from the language modality, and then the latent space of vision modality will be learned with the guidance of the matrix.
no code implementations • 19 Aug 2023 • Kaihang Pan, Juncheng Li, Wenjie Wang, Hao Fei, Hongye Song, Wei Ji, Jun Lin, Xiaozhong Liu, Tat-Seng Chua, Siliang Tang
Recent studies indicate that dense retrieval models struggle to perform well on a wide variety of retrieval tasks that lack dedicated training data, as different retrieval tasks often entail distinct search intents.
no code implementations • 15 Aug 2023 • Bosheng Qin, Wentao Ye, Qifan Yu, Siliang Tang, Yueting Zhuang
Our approach employs a pretrained T2I diffusion model to generate each video frame in an autoregressive fashion.
1 code implementation • 8 Aug 2023 • Juncheng Li, Kaihang Pan, Zhiqi Ge, Minghe Gao, Wei Ji, Wenqiao Zhang, Tat-Seng Chua, Siliang Tang, Hanwang Zhang, Yueting Zhuang
This shortcoming results in MLLMs' underperformance in comprehending demonstrative instructions consisting of multiple, interleaved, and multimodal instructions that demonstrate the required context to complete a task.
no code implementations • 2 Aug 2023 • Zixuan Ni, Longhui Wei, Jiacheng Li, Siliang Tang, Yueting Zhuang, Qi Tian
In this work, we propose a novel strategy named Degeneration-Tuning (DT) to shield contents of unwanted concepts from SD weights.
1 code implementation • 24 Jul 2023 • Yun Zhu, Haizhou Shi, Zhenshuo Zhang, Siliang Tang
In this work, we investigate the problem of out-of-distribution (OOD) generalization for unsupervised learning methods on graph data.
1 code implementation • 23 May 2023 • Xiangnan Chen, Qian Xiao, Juncheng Li, Duo Dong, Jun Lin, Xiaozhong Liu, Siliang Tang
GOSE begins by generating preliminary relation predictions on entity pairs extracted from a scanned image of the document.
1 code implementation • 22 May 2023 • Qifan Yu, Juncheng Li, Wentao Ye, Siliang Tang, Yueting Zhuang
Recent text-to-image generation models have shown promising results in generating high-fidelity photo-realistic images.
no code implementations • 21 May 2023 • Bosheng Qin, Juncheng Li, Siliang Tang, Tat-Seng Chua, Yueting Zhuang
We introduce InstructVid2Vid, an end-to-end diffusion-based methodology for video editing guided by human language instructions.
no code implementations • 11 May 2023 • Zixuan Ni, Longhui Wei, Siliang Tang, Yueting Zhuang, Qi Tian
Moreover, we empirically and theoretically demonstrate how SD leads to a performance decline for CLIP on cross-modal retrieval tasks.
no code implementations • 8 May 2023 • Xiaoqiang Wang, Bang Liu, Siliang Tang, Lingfei Wu
We present SkillQG: a question generation framework with controllable comprehension types for assessing and improving machine reading comprehension models.
no code implementations • ICCV 2023 • Wenqiao Zhang, Changshuo Liu, Lingze Zeng, Beng Chin Ooi, Siliang Tang, Yueting Zhuang
Conventional multi-label classification (MLC) methods assume that all samples are fully labeled and identically distributed.
1 code implementation • ICCV 2023 • Qifan Yu, Juncheng Li, Yu Wu, Siliang Tang, Wei Ji, Yueting Zhuang
Based on that, we further introduce a novel Entangled cross-modal prompt approach for open-world predicate scene graph generation (Epic), where models can generalize to unseen predicates in a zero-shot manner.
1 code implementation • 22 Mar 2023 • Kaihang Pan, Juncheng Li, Hongye Song, Jun Lin, Xiaozhong Liu, Siliang Tang
Though effective, prompt tuning under few-shot settings relies heavily on a good initialization of soft prompts.
no code implementations • 16 Mar 2023 • Boren Hu, Yun Zhu, Jiacheng Li, Siliang Tang
In this paper, we propose a novel dynamic early exiting combined with layer skipping for BERT inference named SmartBERT, which adds a skipping gate and an exiting operator into each layer of BERT.
no code implementations • ICCV 2023 • Juncheng Li, Minghe Gao, Longhui Wei, Siliang Tang, Wenqiao Zhang, Mengze Li, Wei Ji, Qi Tian, Tat-Seng Chua, Yueting Zhuang
Prompt tuning, a recently emerging paradigm, enables the powerful vision-language pre-training models to adapt to downstream tasks in a parameter- and data-efficient way, by learning the "soft prompts" to condition frozen pre-training models.
no code implementations • 9 Mar 2023 • Zhenshuo Zhang, Yun Zhu, Haizhou Shi, Siliang Tang
Albeit having gained significant progress lately, large-scale graph representation learning remains expensive to train and deploy for two main reasons: (i) the repetitive computation of multi-hop message passing and non-linearity in graph neural networks (GNNs); (ii) the computational cost of complex pairwise contrastive learning loss.
no code implementations • 7 Mar 2023 • Jiacheng Li, Longhui Wei, Zongyuan Zhan, Xin He, Siliang Tang, Qi Tian, Yueting Zhuang
To better accelerate the generative transformers while keeping good generation quality, we propose Lformer, a semi-autoregressive text-to-image generation model.
no code implementations • 24 Feb 2023 • Yun Zhu, Jianhao Guo, Siliang Tang
For the graph classification task, we unify pre-training and fine-tuning by designing a novel verbalizer-free prompting function, which reformulates the downstream task in a format similar to the pretext task.
no code implementations • 13 Feb 2023 • Kai Shen, Junliang Guo, Xu Tan, Siliang Tang, Rui Wang, Jiang Bian
This paper sheds light on the following points: 1) Softmax and ReLU use different normalization methods over elements which lead to different variances of results, and ReLU is good at dealing with a large number of key-value slots; 2) FFN and key-value memory are equivalent, and thus the Transformer can be viewed as a memory network where FFNs and self-attention networks are both key-value memories.
no code implementations • 22 Jan 2023 • Juncheng Li, Siliang Tang, Linchao Zhu, Wenqiao Zhang, Yi Yang, Tat-Seng Chua, Fei Wu, Yueting Zhuang
To systematically benchmark the compositional generalizability of temporal grounding models, we introduce a new Compositional Temporal Grounding task and construct two new dataset splits, i.e., Charades-CG and ActivityNet-CG.
no code implementations • 24 Nov 2022 • Bosheng Qin, Juncheng Li, Siliang Tang, Yueting Zhuang
Furthermore, we show that the hidden state dimension can be approximated by extending the Johnson-Lindenstrauss lemma, optimizing the attention in bilinear form.
1 code implementation • 23 Nov 2022 • Kai Shen, Yichong Leng, Xu Tan, Siliang Tang, Yuan Zhang, Wenjie Liu, Edward Lin
Since the error rate of the incorrect sentence is usually low (e.g., 10%), the correction model can only learn to correct on limited error tokens but trivially copy on most tokens (correct tokens), which harms the effective training of error correction.
1 code implementation • 14 Oct 2022 • Wenbin An, Feng Tian, Ping Chen, Siliang Tang, Qinghua Zheng, Qianying Wang
Novel category discovery aims at adapting models trained on known categories to novel categories.
no code implementations • 6 Oct 2022 • Tao Chen, Luxin Liu, Xuepeng Jia, Baoliang Cui, Haihong Tang, Siliang Tang
Specifically, we borrow recent prompt-based language models as the knowledge expert to yield initial seed rules, and based on the formed high-quality instance pool that acts as an intermediary role, we keep teaching the expert to fit our task and learning task-specific logical rules.
no code implementations • 2 Oct 2022 • Chang Zong, Yueting Zhuang, Weiming Lu, Jian Shao, Siliang Tang
In this paper, we propose CTPIR, a new citation trajectory prediction framework that is able to represent the influence (the momentum of citation) of either new or existing publications using the history information of all their attributes.
1 code implementation • 4 Aug 2022 • Juncheng Li, Xin He, Longhui Wei, Long Qian, Linchao Zhu, Lingxi Xie, Yueting Zhuang, Qi Tian, Siliang Tang
Large-scale vision-language pre-training has shown impressive advances in a wide range of downstream tasks.
1 code implementation • 3 Aug 2022 • Juncheng Li, Junlin Xie, Linchao Zhu, Long Qian, Siliang Tang, Wenqiao Zhang, Haochen Shi, Shengyu Zhang, Longhui Wei, Qi Tian, Yueting Zhuang
In this paper, we introduce a new task, named Temporal Emotion Localization in videos (TEL), which aims to detect human emotions and localize their corresponding temporal boundaries in untrimmed videos with aligned subtitles.
no code implementations • 9 Jul 2022 • Wenqiao Zhang, Jiannan Guo, Mengze Li, Haochen Shi, Shengyu Zhang, Juncheng Li, Siliang Tang, Yueting Zhuang
In this scenario, the input image serves as an intuitive context and background for the search, while the corresponding language expressly requests new traits on how specific characteristics of the query image should be modified in order to get the intended target image.
no code implementations • 7 Jun 2022 • Jiannan Guo, Yangyang Kang, Yu Duan, Xiaozhong Liu, Siliang Tang, Wenqiao Zhang, Kun Kuang, Changlong Sun, Fei Wu
Motivated by the industry practice of labeling data, we propose an innovative Inconsistency-based virtual aDvErsarial Active Learning (IDEAL) algorithm to further investigate SSL-AL's potential superiority and achieve mutual enhancement of AL and SSL, i.e., SSL propagates label information to unlabeled samples and provides smoothed embeddings for AL, while AL excludes samples with inconsistent predictions and considerable uncertainty for SSL.
1 code implementation • 4 Jun 2022 • Dong Chen, Lingfei Wu, Siliang Tang, Xiao Yun, Bo Long, Yueting Zhuang
Moreover, when handling the data with noisy labels, the meta-learner could be extremely sensitive to label noise on a corrupted dataset.
1 code implementation • 29 Apr 2022 • Yun Zhu, Jianhao Guo, Fei Wu, Siliang Tang
To the best of our knowledge, RoSA is the first work that focuses on the non-aligned node-node graph contrastive learning problem.
no code implementations • 29 Apr 2022 • Xiaoqiang Wang, Bang Liu, Siliang Tang, Lingfei Wu
Existing metrics for assessing question generation not only require costly human references but also fail to take into account the input context of generation, resulting in a lack of deep understanding of the relevance between the generated questions and input contexts.
1 code implementation • CVPR 2022 • Juncheng Li, Junlin Xie, Long Qian, Linchao Zhu, Siliang Tang, Fei Wu, Yi Yang, Yueting Zhuang, Xin Eric Wang
To systematically measure the compositional generalizability of temporal grounding models, we introduce a new Compositional Temporal Grounding task and construct two new dataset splits, i.e., Charades-CG and ActivityNet-CG.
no code implementations • ACL 2022 • Xiaoqiang Wang, Bang Liu, Fangli Xu, Bo Long, Siliang Tang, Lingfei Wu
In this paper, we argue that a deep understanding of model capabilities and data properties can help us feed a model with appropriate training data based on its learning status.
no code implementations • CVPR 2022 • Yadong Ding, Yu Wu, Chengyue Huang, Siliang Tang, Yi Yang, Longhui Wei, Yueting Zhuang, Qi Tian
Existing NAS-based meta-learning methods apply a two-stage strategy, i.e., first searching architectures and then re-training meta-weights on the searched architecture.
1 code implementation • 1 Jan 2022 • Xiaoqiang Wang, Lei Zhu, Siliang Tang, Huazhu Fu, Ping Li, Fei Wu, Yi Yang, Yueting Zhuang
The depth estimation branch is trained with RGB-D images and then used to estimate the pseudo depth maps for all unlabeled RGB images to form the paired data.
no code implementations • 2 Dec 2021 • Wenqiao Zhang, Haochen Shi, Siliang Tang, Jun Xiao, Qiang Yu, Yueting Zhuang
Contemporary visual captioning models frequently hallucinate objects that are not actually in a scene, due to visual misclassification or over-reliance on priors, resulting in semantic inconsistency between the visual information and the target lexical words.
no code implementations • 2 Dec 2021 • Wenqiao Zhang, Xin Eric Wang, Siliang Tang, Haizhou Shi, Haocheng Shi, Jun Xiao, Yueting Zhuang, William Yang Wang
Such a setting can help explain the decisions of captioning models and prevent the model from hallucinating object words in its description.
1 code implementation • NeurIPS 2021 • Shen Kai, Lingfei Wu, Siliang Tang, Yueting Zhuang, Zhen He, Zhuoye Ding, Yun Xiao, Bo Long
The task of visual question generation (VQG) aims to generate human-like neural questions from an image and potentially other side information (e.g., answer type or the answer itself).
no code implementations • 18 Nov 2021 • Zixuan Ni, Siliang Tang, Yueting Zhuang
Existing Class Incremental Learning (CIL) methods are based on a supervised classification framework sensitive to data labels.
no code implementations • 29 Sep 2021 • Haizhou Shi, Youcai Zhang, Zijin Shen, Siliang Tang, Yaqian Li, Yandong Guo, Yueting Zhuang
This paper investigates the feasibility of federated representation learning under the constraints of communication cost and privacy protection.
no code implementations • 30 Jul 2021 • Haizhou Shi, Youcai Zhang, Siliang Tang, Wenjie Zhu, Yaqian Li, Yandong Guo, Yueting Zhuang
It is a consensus that small models perform quite poorly under the paradigm of self-supervised contrastive learning.
no code implementations • ICCV 2021 • Juncheng Li, Siliang Tang, Linchao Zhu, Haochen Shi, Xuanwen Huang, Fei Wu, Yi Yang, Yueting Zhuang
Secondly, we introduce semantic coherence learning to explicitly encourage the semantic coherence of the adaptive hierarchical graph network from three hierarchies.
no code implementations • 26 Jul 2021 • Zixuan Ni, Haizhou Shi, Siliang Tang, Longhui Wei, Qi Tian, Yueting Zhuang
After investigating existing strategies, we observe that there is a lack of study on how to prevent the inter-phase confusion.
1 code implementation • ACL 2021 • Tao Chen, Haizhou Shi, Siliang Tang, Zhigang Chen, Fei Wu, Yueting Zhuang
The effort to reduce noise in training data generated by distant supervision (DS) began when DS was first introduced into the relation extraction (RE) task.
1 code implementation • 21 Jun 2021 • Tao Chen, Haochen Shi, Liyuan Liu, Siliang Tang, Jian Shao, Zhigang Chen, Yueting Zhuang
In this paper, we propose collaborative adversarial training to improve the data utilization, which coordinates virtual adversarial training (VAT) and adversarial training (AT) at different levels.
1 code implementation • 21 Apr 2021 • Feifei Shao, Yawei Luo, Li Zhang, Lu Ye, Siliang Tang, Yi Yang, Jun Xiao
Recently emerged weakly supervised object localization (WSOL) methods can learn to localize an object in an image using only image-level labels.
no code implementations • 13 Apr 2021 • Zongshen Mu, Siliang Tang, Jie Tan, Qiang Yu, Yueting Zhuang
In this paper, we propose a novel graph learning framework for phrase grounding in the image.
Ranked #6 on Phrase Grounding on Flickr30k Entities Test
no code implementations • 1 Jan 2021 • Chengyue Huang, Lingfei Wu, Yadong Ding, Siliang Tang, Fangli Xu, Chang Zong, Chilie Tan, Yueting Zhuang
To this end, we learn a differentiable graph neural network as a surrogate model to rank candidate architectures, which enables us to obtain gradients w.r.t. the input architectures.
no code implementations • 1 Jan 2021 • Shen Kai, Lingfei Wu, Siliang Tang, Fangli Xu, Zhu Zhang, Yu Qiang, Yueting Zhuang
The task of visual question generation (VQG) aims to generate human-like questions from an image and potentially other side information (e.g., answer type or the answer itself).
no code implementations • ICCV 2021 • Jiannan Guo, Haochen Shi, Yangyang Kang, Kun Kuang, Siliang Tang, Zhuoren Jiang, Changlong Sun, Fei Wu, Yueting Zhuang
Although current mainstream methods begin to combine SSL and AL (SSL-AL) to excavate the diverse expressions of unlabeled samples, these methods' fully supervised task models are still trained only with labeled data.
no code implementations • 1 Jan 2021 • Dong Chen, Lingfei Wu, Siliang Tang, Fangli Xu, Juncheng Li, Chang Zong, Chilie Tan, Yueting Zhuang
In particular, we first cast the meta-overfitting problem (overfitting on sampling and label noise) as a gradient noise problem, since few available samples cause the meta-learner to overfit on existing examples (clean or corrupted) of an individual task at every gradient step.
no code implementations • 1 Jan 2021 • Haizhou Shi, Dongliang Luo, Siliang Tang, Jian Wang, Yueting Zhuang
Recently, a newly proposed self-supervised framework Bootstrap Your Own Latent (BYOL) seriously challenges the necessity of negative samples in contrastive-based learning frameworks.
no code implementations • 1 Jan 2021 • Yadong Ding, Yu Wu, Chengyue Huang, Siliang Tang, Yi Yang, Yueting Zhuang
In this paper, we aim to obtain better meta-learners by co-optimizing the architecture and meta-weights simultaneously.
no code implementations • 22 Nov 2020 • Haizhou Shi, Dongliang Luo, Siliang Tang, Jian Wang, Yueting Zhuang
Recently, a newly proposed self-supervised framework Bootstrap Your Own Latent (BYOL) seriously challenges the necessity of negative samples in contrastive learning frameworks.
no code implementations • 2 Oct 2020 • Shengyu Zhang, Donghui Wang, Zhou Zhao, Siliang Tang, Di Xie, Fei Wu
In this paper, we investigate the problem of text-to-pedestrian synthesis, which has many potential applications in art, design, and video surveillance.
no code implementations • 28 Aug 2020 • Siliang Tang, Qi Zhang, Tianpeng Zheng, Mengdi Zhou, Zhan Chen, Lixing Shen, Xiang Ren, Yueting Zhuang, ShiLiang Pu, Fei Wu
When patients need to take medicine, particularly more than one kind of drug simultaneously, they should be alerted to possible drug-drug interactions.
no code implementations • 11 Aug 2020 • Jiacheng Li, Siliang Tang, Juncheng Li, Jun Xiao, Fei Wu, ShiLiang Pu, Yueting Zhuang
In this paper, we focus on enhancing the generalization ability of the VIST model by considering the few-shot setting.
no code implementations • ACL 2020 • Jie Tan, Changlin Yang, Ying Li, Siliang Tang, Chen Huang, Yueting Zhuang
Measuring the scholarly impact of a document without citations is an important and challenging problem.
no code implementations • 4 Jun 2020 • Yesheng Xu, Ming Kong, Wenjia Xie, Runping Duan, Zhengqing Fang, Yuxiao Lin, Qiang Zhu, Siliang Tang, Fei Wu, Yu-Feng Yao
Infectious keratitis is the most common entity among corneal diseases, in which pathogens grow in the cornea, leading to inflammation and destruction of the corneal tissues.
1 code implementation • 8 May 2020 • Abdelrahman Abdelhamed, Mahmoud Afifi, Radu Timofte, Michael S. Brown, Yue Cao, Zhilu Zhang, WangMeng Zuo, Xiaoling Zhang, Jiye Liu, Wendong Chen, Changyuan Wen, Meng Liu, Shuailin Lv, Yunchao Zhang, Zhihong Pan, Baopu Li, Teng Xi, Yanwen Fan, Xiyu Yu, Gang Zhang, Jingtuo Liu, Junyu Han, Errui Ding, Songhyun Yu, Bumjun Park, Jechang Jeong, Shuai Liu, Ziyao Zong, Nan Nan, Chenghua Li, Zengli Yang, Long Bao, Shuangquan Wang, Dongwoon Bai, Jungwon Lee, Youngjung Kim, Kyeongha Rho, Changyeop Shin, Sungho Kim, Pengliang Tang, Yiyun Zhao, Yuqian Zhou, Yuchen Fan, Thomas Huang, Zhihao LI, Nisarg A. Shah, Wei Liu, Qiong Yan, Yuzhi Zhao, Marcin Możejko, Tomasz Latkowski, Lukasz Treszczotko, Michał Szafraniuk, Krzysztof Trojanowski, Yanhong Wu, Pablo Navarrete Michelini, Fengshuo Hu, Yunhua Lu, Sujin Kim, Wonjin Kim, Jaayeon Lee, Jang-Hwan Choi, Magauiya Zhussip, Azamat Khassenov, Jong Hyun Kim, Hwechul Cho, Priya Kansal, Sabari Nathan, Zhangyu Ye, Xiwen Lu, Yaqi Wu, Jiangxin Yang, Yanlong Cao, Siliang Tang, Yanpeng Cao, Matteo Maggioni, Ioannis Marras, Thomas Tanay, Gregory Slabaugh, Youliang Yan, Myungjoo Kang, Han-Soo Choi, Kyungmin Song, Shusong Xu, Xiaomu Lu, Tingniao Wang, Chunxia Lei, Bin Liu, Rajat Gupta, Vineet Kumar
This challenge is based on newly collected validation and testing image datasets and is hence named SIDD+.
no code implementations • 7 May 2020 • Siwei Fu, Kai Xiong, Xiaodong Ge, Siliang Tang, Wei Chen, Yingcai Wu
To address this challenge, we present a new dataset, called Quda, that aims to help V-NLIs recognize analytic tasks from free-form natural language by training and evaluating cutting-edge multi-label classification models.
no code implementations • 10 Mar 2020 • Yankun Ren, Jianbin Lin, Siliang Tang, Jun Zhou, Shuang Yang, Yuan Qi, Xiang Ren
It can attack text classification models with a higher success rate than existing methods, and provide acceptable quality for humans in the meantime.
no code implementations • 29 Feb 2020 • Shengyu Zhang, Tan Jiang, Qinghao Huang, Ziqi Tan, Zhou Zhao, Siliang Tang, Jin Yu, Hongxia Yang, Yi Yang, Fei Wu
Existing image completion procedures are highly subjective because they consider only visual context, which may produce unpredictable results that are plausible but not faithful to grounded knowledge.
no code implementations • 9 Dec 2019 • Du Chen, Zewei He, Yanpeng Cao, Jiangxin Yang, Yanlong Cao, Michael Ying Yang, Siliang Tang, Yueting Zhuang
Firstly, we proposed a novel Orientation-Aware feature extraction and fusion Module (OAM), which contains a mixture of 1D and 2D convolutional kernels (i.e., 5 x 1, 1 x 5, and 3 x 3) for extracting orientation-aware features.
no code implementations • CVPR 2020 • Juncheng Li, Xin Wang, Siliang Tang, Haizhou Shi, Fei Wu, Yueting Zhuang, William Yang Wang
Visual navigation is the task of training an embodied agent to intelligently navigate to a target object (e.g., television) using only visual observations.
2 code implementations • IJCNLP 2019 • Xiyuan Yang, Xiaotao Gu, Sheng Lin, Siliang Tang, Yueting Zhuang, Fei Wu, Zhigang Chen, Guoping Hu, Xiang Ren
Despite the recent success of collective entity linking (EL) methods, these "global" inference methods may yield sub-optimal results when the "all-mention coherence" assumption breaks, and often suffer from high computational cost at the inference stage, due to the complex search space.
Ranked #5 on Entity Disambiguation on AIDA-CoNLL
no code implementations • 5 Aug 2019 • Juncheng Li, Siliang Tang, Fei Wu, Yueting Zhuang
The experimental results and further analysis prove that the agent with the MIND module is superior to its counterparts not only in EQA performance but in many other aspects such as route planning, behavioral interpretation, and the ability to generalize from a few examples.
1 code implementation • 7 Jul 2019 • Jiacheng Li, Haizhou Shi, Siliang Tang, Fei Wu, Yueting Zhuang
To solve this problem, we propose a method to mine the cross-modal rules to help the model infer these informative concepts given certain visual input.
Ranked #11 on Visual Storytelling on VIST
1 code implementation • ACL 2019 • Sheng Lin, Luye Zheng, Bo Chen, Siliang Tang, Yueting Zhuang, Fei Wu, Zhigang Chen, Guoping Hu, Xiang Ren
Fine-grained Entity Typing is a tough task that suffers from noisy samples extracted from distant supervision.
1 code implementation • NAACL 2019 • Qi Zhang, Siliang Tang, Xiang Ren, Fei Wu, ShiLiang Pu, Yueting Zhuang
This paper provides a new way to improve the efficiency of the REINFORCE training process.
no code implementations • NAACL 2019 • Bo Chen, Xiaotao Gu, Yu-Feng Hu, Siliang Tang, Guoping Hu, Yueting Zhuang, Xiang Ren
Recently, distant supervision has gained great success on Fine-grained Entity Typing (FET).
1 code implementation • 27 Dec 2018 • Yujin Yuan, Liyuan Liu, Siliang Tang, Zhongfei Zhang, Yueting Zhuang, ShiLiang Pu, Fei Wu, Xiang Ren
Distant supervision leverages knowledge bases to automatically label instances, thus allowing us to train relation extractors without human annotations.
no code implementations • EMNLP 2017 • Siliang Tang, Ning Zhang, Jinjiang Zhang, Fei Wu, Yueting Zhuang
In domain-specific NER, due to insufficient labeled training data, deep models usually fail to behave normally.