Re-examining the Role of Schema Linking in Text-to-SQL

no code implementations EMNLP 2020 Wenqiang Lei, Weixin Wang, Zhixin Ma, Tian Gan, Wei Lu, Min-Yen Kan, Tat-Seng Chua

By providing a schema linking corpus based on the Spider text-to-SQL dataset, we systematically study the role of schema linking.


Preview-based Category Contrastive Learning for Knowledge Distillation

no code implementations 18 Oct 2024 Muhe Ding, Jianlong Wu, Xue Dong, Xiaojie Li, Pengda Qin, Tian Gan, Liqiang Nie

It first distills the structural knowledge of both instance-level feature correspondence and the relation between instance features and category centers in a contrastive learning fashion, which can explicitly optimize the category representation and explore the distinct correlation between representations of instances and categories, contributing to discriminative category centers and better classification results.

Contrastive Learning, Knowledge Distillation +1

Social Debiasing for Fair Multi-modal LLMs

no code implementations 13 Aug 2024 Harry Cheng, Yangyang Guo, Qingpei Guo, Ming Yang, Tian Gan, Liqiang Nie

Multi-modal Large Language Models (MLLMs) have advanced significantly, offering powerful vision-language understanding capabilities.


SHE-Net: Syntax-Hierarchy-Enhanced Text-Video Retrieval

no code implementations 22 Apr 2024 Xuzheng Yu, Chen Jiang, Xingning Dong, Tian Gan, Ming Yang, Qingpei Guo

In particular, text-video retrieval, which aims to find the top matching videos given text descriptions from a vast video corpus, is an essential function, the primary challenge of which is to bridge the modality gap.

Retrieval, Video Retrieval

SNP-S3: Shared Network Pre-training and Significant Semantic Strengthening for Various Video-Text Tasks

1 code implementation 31 Jan 2024 Xingning Dong, Qingpei Guo, Tian Gan, Qing Wang, Jianlong Wu, Xiangyuan Ren, Yuan Cheng, Wei Chu

By employing one shared BERT-type network to refine textual and cross-modal features simultaneously, SNP is lightweight and could support various downstream applications.


Knowledge-enhanced Multi-perspective Video Representation Learning for Scene Recognition

no code implementations 9 Jan 2024 Xuzheng Yu, Chen Jiang, Wei Zhang, Tian Gan, Linlin Chao, Jianan Zhao, Yuan Cheng, Qingpei Guo, Wei Chu

With the explosive growth of video data in real-world applications, a comprehensive representation of videos becomes increasingly important.

Representation Learning, Scene Recognition

RTQ: Rethinking Video-language Understanding Based on Image-text Model

2 code implementations 1 Dec 2023 Xiao Wang, Yaoyu Li, Tian Gan, Zheng Zhang, Jingjing Lv, Liqiang Nie

Recent advancements in video-language understanding have been established on the foundation of image-text models, resulting in promising outcomes due to the shared knowledge between images and videos.

Video Captioning, Video Question Answering +1

EVE: Efficient zero-shot text-based Video Editing with Depth Map Guidance and Temporal Consistency Constraints

1 code implementation 21 Aug 2023 Yutao Chen, Xingning Dong, Tian Gan, Chunluan Zhou, Ming Yang, Qingpei Guo

Compared with images, we conjecture that videos necessitate more constraints to preserve the temporal consistency during editing.

Video Editing

Temporal Sentence Grounding in Streaming Videos

1 code implementation 14 Aug 2023 Tian Gan, Xiao Wang, Yan Sun, Jianlong Wu, Qingpei Guo, Liqiang Nie

The goal of TSGSV is to evaluate the relevance between a video stream and a given sentence query.

Sentence, Temporal Sentence Grounding

Micro-video Tagging via Jointly Modeling Social Influence and Tag Relation

1 code implementation 15 Mar 2023 Xiao Wang, Tian Gan, Yinwei Wei, Jianlong Wu, Dai Meng, Liqiang Nie

Existing methods mostly focus on analyzing video content, neglecting users' social influence and tag relation.

Link Prediction, Relation +3

CHMATCH: Contrastive Hierarchical Matching and Robust Adaptive Threshold Boosted Semi-Supervised Learning

1 code implementation CVPR 2023 Jianlong Wu, Haozhe Yang, Tian Gan, Ning Ding, Feijun Jiang, Liqiang Nie

In the meantime, we make full use of the structured information in the hierarchical labels to learn an accurate affinity graph for contrastive learning.

Contrastive Learning

CNVid-3.5M: Build, Filter, and Pre-Train the Large-Scale Public Chinese Video-Text Dataset

1 code implementation CVPR 2023 Tian Gan, Qing Wang, Xingning Dong, Xiangyuan Ren, Liqiang Nie, Qingpei Guo

Though there are certain methods studying the Chinese video-text pre-training, they pre-train their models on private datasets whose videos and text are unavailable.

Stacked Hybrid-Attention and Group Collaborative Learning for Unbiased Scene Graph Generation

1 code implementation CVPR 2022 Xingning Dong, Tian Gan, Xuemeng Song, Jianlong Wu, Yuan Cheng, Liqiang Nie

Scene Graph Generation, which generally follows a regular encoder-decoder pipeline, aims to first encode the visual contents within the given image and then parse them into a compact summary graph.

Decoder, Graph Generation +1

Explicit Interaction Model towards Text Classification

1 code implementation 23 Nov 2018 Cunxiao Du, Zhaozheng Chin, Fuli Feng, Lei Zhu, Tian Gan, Liqiang Nie

To address this problem, we introduce the interaction mechanism to incorporate word-level matching signals into the text classification task.

General Classification, Multi Class Text Classification +3

