Search Results for author: Xin Tan

Found 43 papers, 20 papers with code

Coupling Context Modeling with Zero Pronoun Recovering for Document-Level Natural Language Generation

1 code implementation EMNLP 2021 Xin Tan, Longyin Zhang, Guodong Zhou

Natural language generation (NLG) tasks on pro-drop languages are known to suffer from zero pronoun (ZP) problems, and the problems remain challenging due to the scarcity of ZP-annotated NLG corpora.

Machine Translation Question Answering +2

EDTC: A Corpus for Discourse-Level Topic Chain Parsing

1 code implementation Findings (EMNLP) 2021 Longyin Zhang, Xin Tan, Fang Kong, Guodong Zhou

Discourse analysis has long been known to be fundamental in natural language processing.

Mutual Information Guided Optimal Transport for Unsupervised Visible-Infrared Person Re-identification

no code implementations 17 Jul 2024 Zhizhong Zhang, Jiangming Wang, Xin Tan, Yanyun Qu, JunPing Wang, Yong Xie, Yuan Xie

In the training stage, we utilize this matching information to introduce prototype-based contrastive learning for minimizing the intra- and cross-modality entropy ("Sharpness").

Contrastive Learning Fairness +1

Exploring the Untouched Sweeps for Conflict-Aware 3D Segmentation Pretraining

no code implementations 10 Jul 2024 Tianfang Sun, Zhizhong Zhang, Xin Tan, Yanyun Qu, Yuan Xie

We utilized timestamps and the semantic priors from VFMs to identify well-synchronized training pairs and to discover samples with diverse content.

3D Semantic Segmentation Autonomous Driving

Teola: Towards End-to-End Optimization of LLM-based Applications

no code implementations 29 Jun 2024 Xin Tan, Yimin Jiang, Yitao Yang, Hong Xu

Existing frameworks employ coarse-grained orchestration with task modules, which confines optimizations to within each module and yields suboptimal scheduling decisions.

Language Modelling Large Language Model +1

PIG: Prompt Images Guidance for Night-Time Scene Parsing

1 code implementation 15 Jun 2024 Zhifeng Xie, Rui Qiu, Sen Wang, Xin Tan, Yuan Xie, Lizhuang Ma

In this paper, we leverage Prompt Images Guidance (PIG) to enhance UDA with supplementary night knowledge.

Data Augmentation Pseudo Label +2

FastLGS: Speeding up Language Embedded Gaussians with Feature Grid Mapping

no code implementations 4 Jun 2024 Yuzhou Ji, He Zhu, Junshu Tang, Wuyi Liu, Zhizhong Zhang, Yuan Xie, Lizhuang Ma, Xin Tan

The semantically interactive radiance field has always been an appealing task for its potential to facilitate user-friendly and automated real-world 3D scene understanding applications.

Scene Understanding

Gradient Projection For Continual Parameter-Efficient Tuning

no code implementations 22 May 2024 Jingyang Qiao, Zhizhong Zhang, Xin Tan, Yanyun Qu, Wensheng Zhang, Zhi Han, Yuan Xie

Parameter-efficient tunings (PETs) have demonstrated impressive performance and promising perspectives in training large models, while they are still confronted with a common problem: the trade-off between learning new content and protecting old knowledge, leading to zero-shot generalization collapse, and cross-modal hallucination.

Continual Learning Hallucination +1

GEOcc: Geometrically Enhanced 3D Occupancy Network with Implicit-Explicit Depth Fusion and Contextual Self-Supervision

no code implementations 17 May 2024 Xin Tan, Wenbin Wu, Zhiwei Zhang, Chaojie Fan, Yong Peng, Zhizhong Zhang, Yuan Xie, Lizhuang Ma

Nevertheless, current models still encounter two main challenges: modeling depth accurately in the 2D-3D view transformation stage, and overcoming the lack of generalizability caused by sparse LiDAR supervision.

Autonomous Driving Decoder +3

Efficient Multimodal Large Language Models: A Survey

1 code implementation 17 May 2024 Yizhang Jin, Jian Li, Yexin Liu, Tianjun Gu, Kai Wu, Zhengkai Jiang, Muyang He, Bo Zhao, Xin Tan, Zhenye Gan, Yabiao Wang, Chengjie Wang, Lizhuang Ma

In the past year, Multimodal Large Language Models (MLLMs) have demonstrated remarkable performance in tasks such as visual question answering, visual understanding and reasoning.

Edge-computing Question Answering +1

Building a Strong Pre-Training Baseline for Universal 3D Large-Scale Perception

1 code implementation CVPR 2024 Haoming Chen, Zhizhong Zhang, Yanyun Qu, Ruixin Zhang, Xin Tan, Yuan Xie

Such inconsiderate consistency greatly hampers the promising path of reaching a universal pre-training framework: (1) the cross-scene semantic self-conflict, i.e., the intense collision between primitive segments of the same semantics from different scenes; (2) the lack of a globally unified bond that pushes the cross-scene semantic consistency into 3D representation learning.

object-detection Object Detection +2

PromptAD: Learning Prompts with only Normal Samples for Few-Shot Anomaly Detection

1 code implementation CVPR 2024 Xiaofan Li, Zhizhong Zhang, Xin Tan, Chengwei Chen, Yanyun Qu, Yuan Xie, Lizhuang Ma

The vision-language model has brought great improvement to few-shot industrial anomaly detection, which usually requires designing hundreds of prompts through prompt engineering.

Anomaly Detection Language Modelling +1

CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion

1 code implementation 12 Mar 2024 Qibing Ren, Chang Gao, Jing Shao, Junchi Yan, Xin Tan, Wai Lam, Lizhuang Ma

The rapid advancement of Large Language Models (LLMs) has brought about remarkable generative capabilities but also raised concerns about their potential misuse.

Code Completion

Continuous Piecewise-Affine Based Motion Model for Image Animation

1 code implementation 17 Jan 2024 Hexiang Wang, Fengqi Liu, Qianyu Zhou, Ran Yi, Xin Tan, Lizhuang Ma

To address this issue, we propose to model motion from the source image to the driving frame in highly-expressive diffeomorphism spaces.

Image Animation

Beyond the Label Itself: Latent Labels Enhance Semi-supervised Point Cloud Panoptic Segmentation

no code implementations 13 Dec 2023 Yujun Chen, Xin Tan, Zhizhong Zhang, Yanyun Qu, Yuan Xie

Second, in the Image Branch, we propose the Instance Position-scale Learning (IPSL) Module to learn and fuse the information of instance position and scale, which is from a 2D pre-trained detector and a type of latent label obtained from 3D to 2D projection.

Panoptic Segmentation Position

Evidence-based Interpretable Open-domain Fact-checking with Large Language Models

no code implementations 10 Dec 2023 Xin Tan, Bowei Zou, Ai Ti Aw

Universal fact-checking systems for real-world claims face significant challenges in gathering valid and sufficient real-time evidence and making reasoned decisions.

Fact Checking valid

COTR: Compact Occupancy TRansformer for Vision-based 3D Occupancy Prediction

1 code implementation CVPR 2024 Qihang Ma, Xin Tan, Yanyun Qu, Lizhuang Ma, Zhizhong Zhang, Yuan Xie

The autonomous driving community has shown significant interest in 3D occupancy prediction, driven by its exceptional geometric perception and general object recognition capabilities.

Autonomous Driving Decoder +1

Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer

1 code implementation CVPR 2024 Zhen Zhao, Jingqun Tang, Chunhui Lin, Binghong Wu, Can Huang, Hao Liu, Xin Tan, Zhizhong Zhang, Yuan Xie

A straightforward solution is performing model fine-tuning tailored to a specific scenario, but it is computationally intensive and requires multiple model copies for various scenarios.

Diversity In-Context Learning +1

Generalized Category Discovery in Semantic Segmentation

1 code implementation 20 Nov 2023 Zhengyuan Peng, Qijian Tian, Jianqing Xu, Yizhang Jin, Xuequan Lu, Xin Tan, Yuan Xie, Lizhuang Ma

This paper explores a novel setting called Generalized Category Discovery in Semantic Segmentation (GCDSS), aiming to segment unlabeled images given prior knowledge from a labeled set of base classes.

Segmentation Semantic Segmentation

Unveiling the Power of CLIP in Unsupervised Visible-Infrared Person Re-Identification

1 code implementation journal 2023 Zhong Chen, Zhizhong Zhang, Xin Tan, Yanyun Qu, Yuan Xie

In this paper, we propose a new prompt learning paradigm for unsupervised visible-infrared person re-identification (USL-VI-ReID) by taking full advantage of the visual-text representation ability from CLIP.

Contrastive Learning Person Re-Identification

Boosting Semi-Supervised Learning by Exploiting All Unlabeled Data

1 code implementation CVPR 2023 Yuhao Chen, Xin Tan, Borui Zhao, Zhaowei Chen, RenJie Song, Jiajun Liang, Xuequan Lu

ANL introduces the additional negative pseudo-label for all unlabeled data to leverage low-confidence examples.

Pseudo Label

Learning To Detect Mirrors From Videos via Dual Correspondences

no code implementations CVPR 2023 Jiaying Lin, Xin Tan, Rynson W.H. Lau

However, detecting mirrors over dynamic scenes is still under-explored due to the lack of a high-quality dataset and an effective method for video mirror detection (VMD).

Multi-Centroid Task Descriptor for Dynamic Class Incremental Inference

no code implementations CVPR 2023 Tenghao Cai, Zhizhong Zhang, Xin Tan, Yanyun Qu, Guannan Jiang, Chengjie Wang, Yuan Xie

As a result, our dynamic inference network is trained independently of baseline and provides a flexible, efficient solution to distinguish between tasks.

Class Incremental Learning Incremental Learning

Rethinking Gradient Projection Continual Learning: Stability / Plasticity Feature Space Decoupling

no code implementations CVPR 2023 Zhen Zhao, Zhizhong Zhang, Xin Tan, Jun Liu, Yanyun Qu, Yuan Xie, Lizhuang Ma

In this paper, we propose a space decoupling (SD) algorithm to decouple the feature space into a pair of complementary subspaces, i.e., the stability space I and the plasticity space R. I is established by conducting space intersection between the historic and current feature spaces, and thus I contains more task-shared bases.

Continual Learning

Image Understands Point Cloud: Weakly Supervised 3D Semantic Segmentation via Association Learning

1 code implementation 16 Sep 2022 Tianfang Sun, Zhizhong Zhang, Xin Tan, Yanyun Qu, Yuan Xie, Lizhuang Ma

In this paper, we propose a novel cross-modality weakly supervised method for 3D segmentation, incorporating complementary information from unlabeled images.

3D Semantic Segmentation Pseudo Label +2

Boosting Night-time Scene Parsing with Learnable Frequency

1 code implementation 30 Aug 2022 Zhifeng Xie, Sen Wang, Ke Xu, Zhizhong Zhang, Xin Tan, Yuan Xie, Lizhuang Ma

Based on this, we propose to exploit the image frequency distributions for night-time scene parsing.

Autonomous Driving Scene Parsing

Discourse Cohesion Evaluation for Document-Level Neural Machine Translation

no code implementations 19 Aug 2022 Xin Tan, Longyin Zhang, Guodong Zhou

It is well known that translations generated by an excellent document-level neural machine translation (NMT) model are consistent and coherent.

Machine Translation NMT +2

Dual Windows Are Significant: Learning from Mediastinal Window and Focusing on Lung Window

no code implementations 8 Jun 2022 Qiuli Wang, Xin Tan, Chen Liu

Since the onset of the COVID-19 pandemic, several deep learning methods have been proposed to analyze chest Computed Tomography (CT) scans for diagnosis.

Computed Tomography (CT)

Novelty Detection via Contrastive Learning with Negative Data Augmentation

no code implementations 18 Jun 2021 Chengwei Chen, Yuan Xie, Shaohui Lin, Ruizhi Qiao, Jian Zhou, Xin Tan, Yi Zhang, Lizhuang Ma

Moreover, our model is more stable for training in a non-adversarial manner, compared to other adversarial based novelty detection methods.

Clustering Contrastive Learning +5

Instance and Pair-Aware Dynamic Networks for Re-Identification

no code implementations 9 Mar 2021 Bingliang Jiao, Xin Tan, Jinghao Zhou, Lu Yang, Yunlong Wang, Peng Wang

The proposed model is composed of three main branches where a self-guided dynamic branch is constructed to strengthen instance-specific features, focusing on every single image.

Boundary-Aware Geometric Encoding for Semantic Segmentation of Point Clouds

no code implementations 7 Jan 2021 Jingyu Gong, Jiachen Xu, Xin Tan, Jie Zhou, Yanyun Qu, Yuan Xie, Lizhuang Ma

Boundary information plays a significant role in 2D image segmentation, while usually being ignored in 3D point cloud segmentation where ambiguous features might be generated in feature extraction, leading to misclassification in the transition area between two objects.

Image Segmentation Point Cloud Segmentation +2

Coreference Resolution: Are the eliminated spans totally worthless?

no code implementations 4 Jan 2021 Xin Tan, Longyin Zhang, Guodong Zhou

Various neural-based methods have been proposed so far for joint mention detection and coreference resolution.

coreference-resolution Diversity

Weakly-Supervised Saliency Detection via Salient Object Subitizing

no code implementations 4 Jan 2021 Xiaoyang Zheng, Xin Tan, Jie Zhou, Lizhuang Ma, Rynson W. H. Lau

This allows the supervision to be aligned with the property of saliency detection, where the salient objects of an image could be from more than one class.

Object object-detection +4

Night-time Scene Parsing with a Large Real Dataset

no code implementations 15 Mar 2020 Xin Tan, Ke Xu, Ying Cao, Yiheng Zhang, Lizhuang Ma, Rynson W. H. Lau

Although huge progress has been made on scene analysis in recent years, most existing works assume the input images to be in day-time with good lighting conditions.

Scene Parsing Semantic Segmentation

SceneEncoder: Scene-Aware Semantic Segmentation of Point Clouds with A Learnable Scene Descriptor

1 code implementation 24 Jan 2020 Jiachen Xu, Jingyu Gong, Jie Zhou, Xin Tan, Yuan Xie, Lizhuang Ma

Besides local features, global information plays an essential role in semantic segmentation, while recent works usually fail to explicitly extract the meaningful global information and make full use of it.

Segmentation Semantic Segmentation

Re-ID Driven Localization Refinement for Person Search

no code implementations ICCV 2019 Chuchu Han, Jiacheng Ye, Yunshan Zhong, Xin Tan, Chi Zhang, Changxin Gao, Nong Sang

The state-of-the-art methods train the detector individually, and the detected bounding boxes may be sub-optimal for the following re-ID task.

Person Re-Identification Person Search

FVNet: 3D Front-View Proposal Generation for Real-Time Object Detection from Point Clouds

no code implementations 26 Mar 2019 Jie Zhou, Xin Tan, Zhiwei Shao, Lizhuang Ma

We then introduce a proposal generation network to predict 3D region proposals from the generated maps and further extrude objects of interest from the whole point cloud.

3D Object Detection Object +2

Deep Multi-Center Learning for Face Alignment

1 code implementation 5 Aug 2018 Zhiwen Shao, Hengliang Zhu, Xin Tan, Yangyang Hao, Lizhuang Ma

Most of the existing deep learning methods only use one fully-connected layer called shape prediction layer to estimate the locations of facial landmarks.

Face Alignment
