Search Results for author: Yan Xia

Found 45 papers, 20 papers with code

Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models

1 code implementation • 4 Apr 2024 • Wenshan Wu, Shaoguang Mao, Yadong Zhang, Yan Xia, Li Dong, Lei Cui, Furu Wei

Large language models (LLMs) have exhibited impressive performance in language comprehension and various reasoning tasks.

Visual Navigation

LLM as a Mastermind: A Survey of Strategic Reasoning with Large Language Models

no code implementations • 1 Apr 2024 • Yadong Zhang, Shaoguang Mao, Tao Ge, Xun Wang, Adrian de Wynter, Yan Xia, Wenshan Wu, Ting Song, Man Lan, Furu Wei

This paper presents a comprehensive survey of the current status and opportunities for Large Language Models (LLMs) in strategic reasoning, a sophisticated form of reasoning that necessitates understanding and predicting adversary actions in multi-agent settings while adjusting strategies accordingly.

Decision Making

VXP: Voxel-Cross-Pixel Large-scale Image-LiDAR Place Recognition

no code implementations • 21 Mar 2024 • Yun-Jin Li, Mariia Gladkova, Yan Xia, Rui Wang, Daniel Cremers

Recent works on global place recognition treat the task as a retrieval problem, where an off-the-shelf global descriptor is commonly designed in image-based and LiDAR-based modalities.

Cross-Modal Retrieval • Retrieval

Unlocking the Potential of Multimodal Unified Discrete Representation through Training-Free Codebook Optimization and Hierarchical Alignment

1 code implementation • 8 Mar 2024 • Hai Huang, Yan Xia, Shengpeng Ji, Shulei Wang, Hanting Wang, Jieming Zhu, Zhenhua Dong, Zhou Zhao

The Dual Cross-modal Information Disentanglement (DCID) model, utilizing a unified codebook, shows promising results in achieving fine-grained representation and cross-modal generalization.

Disentanglement

Unsupervised Domain Adaptation for Brain Vessel Segmentation through Transwarp Contrastive Learning

1 code implementation • 23 Feb 2024 • Fengming Lin, Yan Xia, Michael MacRaild, Yash Deo, Haoran Dou, Qiongyao Liu, Kun Wu, Nishant Ravikumar, Alejandro F. Frangi

Unsupervised domain adaptation (UDA) aims to align the labelled source distribution with the unlabelled target distribution to obtain domain-invariant predictive models.

Contrastive Learning • Unsupervised Domain Adaptation

GS-EMA: Integrating Gradient Surgery Exponential Moving Average with Boundary-Aware Contrastive Learning for Enhanced Domain Generalization in Aneurysm Segmentation

1 code implementation • 23 Feb 2024 • Fengming Lin, Yan Xia, Michael MacRaild, Yash Deo, Haoran Dou, Qiongyao Liu, Nina Cheng, Nishant Ravikumar, Alejandro F. Frangi

The automated segmentation of cerebral aneurysms is pivotal for accurate diagnosis and treatment planning.

Contrastive Learning • Domain Generalization • +1

K-Level Reasoning with Large Language Models

no code implementations • 2 Feb 2024 • Yadong Zhang, Shaoguang Mao, Tao Ge, Xun Wang, Yan Xia, Man Lan, Furu Wei

While Large Language Models (LLMs) have demonstrated their proficiency in complex reasoning tasks, their performance in dynamic, interactive, and competitive scenarios - such as business strategy and stock market analysis - remains underexplored.

Decision Making

Multi-Modal Domain Adaptation Across Video Scenes for Temporal Video Grounding

no code implementations • 21 Dec 2023 • Haifeng Huang, Yang Zhao, Zehan Wang, Yan Xia, Zhou Zhao

Thus, to address this issue and enhance model performance on new scenes, we explore the TVG task in an unsupervised domain adaptation (UDA) setting across scenes for the first time, where the video-query pairs in the source scene (domain) are labeled with temporal boundaries, while those in the target scene are not.

Unsupervised Domain Adaptation • Video Grounding

StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis

no code implementations • 17 Dec 2023 • Yu Zhang, Rongjie Huang, RuiQi Li, Jinzheng He, Yan Xia, Feiyang Chen, Xinyu Duan, Baoxing Huai, Zhou Zhao

Moreover, existing SVS methods encounter a decline in the quality of synthesized singing voices in OOD scenarios, as they rest upon the assumption that the target vocal attributes are discernible during the training phase.

Quantization • Singing Voice Synthesis • +1

Text2Loc: 3D Point Cloud Localization from Natural Language

no code implementations • 27 Nov 2023 • Yan Xia, Letian Shi, Zifeng Ding, João F. Henriques, Daniel Cremers

We tackle the problem of 3D point cloud localization based on a few natural linguistic descriptions and introduce a novel neural network, Text2Loc, that fully interprets the semantic relationship between points and text.

Contrastive Learning

Multi-view Hybrid Graph Convolutional Network for Volume-to-mesh Reconstruction in Cardiovascular MRI

no code implementations • 22 Nov 2023 • Nicolás Gaggion, Benjamin A. Matheson, Yan Xia, Rodrigo Bonazzola, Nishant Ravikumar, Zeike A. Taylor, Diego H. Milone, Alejandro F. Frangi, Enzo Ferrante

Cardiovascular magnetic resonance imaging is emerging as a crucial tool to examine cardiac morphology and function.

Anatomy

Cross-modal Prompts: Adapting Large Pre-trained Models for Audio-Visual Downstream Tasks

1 code implementation • NeurIPS 2023 • Haoyi Duan, Yan Xia, Mingze Zhou, Li Tang, Jieming Zhu, Zhou Zhao

This mechanism leverages audio and visual modalities as soft prompts to dynamically adjust the parameters of pre-trained models based on the current multi-modal input features.

ALYMPICS: LLM Agents Meet Game Theory -- Exploring Strategic Decision-Making with AI Agents

1 code implementation • 6 Nov 2023 • Shaoguang Mao, Yuzhe Cai, Yan Xia, Wenshan Wu, Xun Wang, Fengyi Wang, Tao Ge, Furu Wei

This paper introduces Alympics (Olympics for Agents), a systematic simulation framework utilizing Large Language Model (LLM) agents for game theory research.

Decision Making • Language Modelling • +1

EIPE-text: Evaluation-Guided Iterative Plan Extraction for Long-Form Narrative Text Generation

no code implementations • 12 Oct 2023 • Wang You, Wenshan Wu, Yaobo Liang, Shaoguang Mao, Chenfei Wu, Maosong Cao, Yuzhe Cai, Yiduo Guo, Yan Xia, Furu Wei, Nan Duan

In this paper, we propose a new framework called Evaluation-guided Iterative Plan Extraction for long-form narrative text generation (EIPE-text), which extracts plans from the corpus of narratives and utilizes the extracted plans to construct a better planner.

In-Context Learning • Text Generation

SCP: Scene Completion Pre-training for 3D Object Detection

no code implementations • 12 Sep 2023 • Yiming Shan, Yan Xia, Yuhong Chen, Daniel Cremers

In this paper, we propose a Scene Completion Pre-training (SCP) method to enhance the performance of 3D object detectors with less labeled data.

3D Object Detection • Autonomous Driving • +2

Learned Local Attention Maps for Synthesising Vessel Segmentations

no code implementations • 24 Aug 2023 • Yash Deo, Rodrigo Bonazzola, Haoran Dou, Yan Xia, Tianyou Wei, Nishant Ravikumar, Alejandro F. Frangi, Toni Lassila

We present an encoder-decoder model for synthesising segmentations of the main cerebral arteries in the circle of Willis (CoW) from only T2 MRI.

Adaptive Semi-Supervised Segmentation of Brain Vessels with Ambiguous Labels

1 code implementation • 7 Aug 2023 • Fengming Lin, Yan Xia, Nishant Ravikumar, Qiongyao Liu, Michael MacRaild, Alejandro F Frangi

Accurate segmentation of brain vessels is crucial for cerebrovascular disease diagnosis and treatment.

Domain Generalization • Segmentation • +1

Exploring Link Prediction over Hyper-Relational Temporal Knowledge Graphs Enhanced with Time-Invariant Relational Knowledge

no code implementations • 14 Jul 2023 • Zifeng Ding, Jingcheng Wu, Jingpei Wu, Yan Xia, Volker Tresp

We develop two new benchmark hyper-relational TKG (HTKG) datasets, i.e., Wiki-hy and YAGO-hy, and propose an HTKG reasoning model that efficiently models both temporal facts and qualifiers.

Knowledge Graphs • Link Prediction • +1

Assessing Phrase Break of ESL Speech with Pre-trained Language Models and Large Language Models

no code implementations • 8 Jun 2023 • Zhiyi Wang, Shaoguang Mao, Wenshan Wu, Yan Xia, Yan Deng, Jonathan Tien

To leverage NLP models, speech input is first force-aligned with texts, and then pre-processed into a token sequence, including words and phrase break information.

Text Classification

Smart Word Suggestions for Writing Assistance

1 code implementation • 17 May 2023 • Chenshuo Wang, Shaoguang Mao, Tao Ge, Wenshan Wu, Xun Wang, Yan Xia, Jonathan Tien, Dongyan Zhao

The training dataset comprises over 3.7 million sentences and 12.7 million suggestions generated through rules.

Not All Languages Are Created Equal in LLMs: Improving Multilingual Capability by Cross-Lingual-Thought Prompting

no code implementations • 11 May 2023 • Haoyang Huang, Tianyi Tang, Dongdong Zhang, Wayne Xin Zhao, Ting Song, Yan Xia, Furu Wei

Large language models (LLMs) demonstrate impressive multilingual capability, but their performance varies substantially across different languages.

Arithmetic Reasoning • Logical Reasoning • +1

Scan2LoD3: Reconstructing semantic 3D building models at LoD3 using ray casting and Bayesian networks

1 code implementation • 10 May 2023 • Olaf Wysocki, Yan Xia, Magdalena Wysocki, Eleonora Grilli, Ludwig Hoegner, Daniel Cremers, Uwe Stilla

To this end, we leverage laser physics and 3D building model priors to probabilistically identify model conflicts.

3D Reconstruction • Autonomous Driving • +2

Low-code LLM: Graphical User Interface over Large Language Models

2 code implementations • 17 Apr 2023 • Yuzhe Cai, Shaoguang Mao, Wenshan Wu, Zehua Wang, Yaobo Liang, Tao Ge, Chenfei Wu, Wang You, Ting Song, Yan Xia, Jonathan Tien, Nan Duan, Furu Wei

By introducing this framework, we aim to bridge the gap between humans and LLMs, enabling more effective and efficient utilization of LLMs for complex tasks.

Prompt Engineering

TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs

no code implementations • 29 Mar 2023 • Yaobo Liang, Chenfei Wu, Ting Song, Wenshan Wu, Yan Xia, Yu Liu, Yang Ou, Shuai Lu, Lei Ji, Shaoguang Mao, Yun Wang, Linjun Shou, Ming Gong, Nan Duan

On the other hand, there are also many existing models and systems (symbolic-based or neural-based) that can do some domain-specific tasks very well.

Code Generation • Common Sense Reasoning • +1

Unsupervised ensemble-based phenotyping helps enhance the discoverability of genes related to heart morphology

no code implementations • 7 Jan 2023 • Rodrigo Bonazzola, Enzo Ferrante, Nishant Ravikumar, Yan Xia, Bernard Keavney, Sven Plein, Tanveer Syeda-Mahmood, Alejandro F Frangi

Here, we propose a new framework for gene discovery entitled Unsupervised Phenotype Ensembles (UPE).

Extensible Prompts for Language Models on Zero-shot Language Style Customization

no code implementations • NeurIPS 2023 • Tao Ge, Jing Hu, Li Dong, Shaoguang Mao, Yan Xia, Xun Wang, Si-Qing Chen, Furu Wei

We propose eXtensible Prompt (X-Prompt) for prompting a large language model (LLM) beyond natural language (NL).

Descriptive • Language Modelling • +1

Joint segmentation and discontinuity-preserving deformable registration: Application to cardiac cine-MR images

1 code implementation • 24 Nov 2022 • Xiang Chen, Yan Xia, Nishant Ravikumar, Alejandro F Frangi

In such scenarios, enforcing smooth, globally continuous deformation fields leads to incorrect/implausible registration results.

Image Registration • Image Segmentation • +3

CASSPR: Cross Attention Single Scan Place Recognition

1 code implementation • ICCV 2023 • Yan Xia, Mariia Gladkova, Rui Wang, Qianyun Li, Uwe Stilla, João F. Henriques, Daniel Cremers

CASSPR uses queries from one branch to try to match structures in the other branch, ensuring that both extract self-contained descriptors of the point cloud (rather than one branch dominating), but using both to inform the output global descriptor of the point cloud.

Assessing Phrase Break of ESL speech with Pre-trained Language Models

no code implementations • 28 Oct 2022 • Zhiyi Wang, Shaoguang Mao, Wenshan Wu, Yan Xia

The token sequence is then fed into the pre-training and fine-tuning pipeline.

Text Classification

Video-Guided Curriculum Learning for Spoken Video Grounding

1 code implementation • 1 Sep 2022 • Yan Xia, Zhou Zhao, Shangwei Ye, Yang Zhao, Haoyuan Li, Yi Ren

To rectify the discriminative phonemes and extract video-related information from noisy audio, we develop a novel video-guided curriculum learning (VGCL) during the audio pre-training process, which can make use of the vital visual perceptions to help understand the spoken language and suppress the external noise.

Video Grounding

A Lightweight and Detector-free 3D Single Object Tracker on Point Clouds

1 code implementation • 8 Mar 2022 • Yan Xia, Qiangqiang Wu, Wei Li, Antoni B. Chan, Uwe Stilla

Recent works on 3D single object tracking treat the task as a target-specific 3D detection task, where an off-the-shelf 3D detector is commonly employed for the tracking.

3D Single Object Tracking • Motion Prediction • +1

Cross-Modal Background Suppression for Audio-Visual Event Localization

1 code implementation • CVPR 2022 • Yan Xia, Zhou Zhao

Audiovisual Event (AVE) localization requires the model to jointly localize an event by observing audio and visual information.

audio-visual event localization

An Approach to Mispronunciation Detection and Diagnosis with Acoustic, Phonetic and Linguistic (APL) Embeddings

no code implementations • 14 Oct 2021 • Wenxuan Ye, Shaoguang Mao, Frank Soong, Wenshan Wu, Yan Xia, Jonathan Tien, Zhiyong Wu

These embeddings, when used as implicit phonetic supplementary information, can alleviate the data shortage of explicit phoneme annotations.

Rapid Assessments of Light-Duty Gasoline Vehicle Emissions Using On-Road Remote Sensing and Machine Learning

no code implementations • 1 Oct 2021 • Yan Xia, Linhui Jiang, Lu Wang, Xue Chen, Jianjie Ye, Tangyan Hou, Liqiang Wang, Yibo Zhang, Mengying Li, Zhen Li, Zhe Song, Yaping Jiang, Weiping Liu, Pengfei Li, Daniel Rosenfeld, John H. Seinfeld, Shaocai Yu

Our results show that the ORRS measurements, assisted by the machine-learning-based ensemble model developed here, can realize day-to-day supervision of on-road vehicle-specific emissions.

A Deep Discontinuity-Preserving Image Registration Network

1 code implementation • 9 Jul 2021 • Xiang Chen, Nishant Ravikumar, Yan Xia, Alejandro F Frangi

Image registration aims to establish spatial correspondence across pairs, or groups of images, and is a cornerstone of medical image computing and computer-assisted-interventions.

Image Registration • Medical Image Registration • +1

CAR-Net: Unsupervised Co-Attention Guided Registration Network for Joint Registration and Structure Learning

no code implementations • 11 Jun 2021 • Xiang Chen, Yan Xia, Nishant Ravikumar, Alejandro F Frangi

Image registration is a fundamental building block for various applications in medical image analysis.

Image Registration

Lessons Learned Addressing Dataset Bias in Model-Based Candidate Generation at Twitter

no code implementations • 13 May 2021 • Alim Virani, Jay Baxter, Dan Shiebler, Philip Gautier, Shivam Verma, Yan Xia, Apoorv Sharma, Sumit Binnani, LinLin Chen, Chenguang Yu

Traditionally, heuristic methods are used to generate candidates for large scale recommender systems.

Recommendation Systems

ASFM-Net: Asymmetrical Siamese Feature Matching Network for Point Completion

1 code implementation • 19 Apr 2021 • Yaqi Xia, Yan Xia, Wei Li, Rui Song, Kailang Cao, Uwe Stilla

We tackle the problem of object completion from point clouds and propose a novel point cloud completion network employing an Asymmetrical Siamese Feature Matching strategy, termed as ASFM-Net.

Point Cloud Completion

SOE-Net: A Self-Attention and Orientation Encoding Network for Point Cloud based Place Recognition

1 code implementation • CVPR 2021 • Yan Xia, Yusheng Xu, Shuang Li, Rui Wang, Juan Du, Daniel Cremers, Uwe Stilla

We tackle the problem of place recognition from point cloud data and introduce a self-attention and orientation encoding network (SOE-Net) that fully explores the relationship between points and incorporates long-range context into point-wise local descriptors.

Ranked #5 on 3D Place Recognition on Oxford RobotCar Dataset (AR@1% metric)

3D Place Recognition • Metric Learning • +1

Improving pronunciation assessment via ordinal regression with anchored reference samples

no code implementations • 26 Oct 2020 • Bin Su, Shaoguang Mao, Frank Soong, Yan Xia, Jonathan Tien, Zhiyong Wu

Traditional speech pronunciation assessment, based on the Goodness of Pronunciation (GOP) algorithm, has some weaknesses in assessing a speech utterance: 1) phoneme GOP scores cannot be easily translated into a sentence score with a simple average for effective assessment; 2) the rank-ordering information has not been well exploited in GOP scoring to deliver a robust assessment that correlates well with a human rater's evaluations.

Regression • Sentence

VPC-Net: Completion of 3D Vehicles from MLS Point Clouds

1 code implementation • 8 Aug 2020 • Yan Xia, Yusheng Xu, Cheng Wang, Uwe Stilla

Moreover, a new refiner module is also presented to preserve the vehicle details from inputs and refine the complete outputs with fine-grained information.

Autonomous Driving

RealPoint3D: Point Cloud Generation from a Single Image with Complex Background

1 code implementation • 8 Sep 2018 • Yan Xia, Yang Zhang, Dingfu Zhou, Xinyu Huang, Cheng Wang, Ruigang Yang

Then, the image together with the retrieved shape model is fed into the proposed network to generate the fine-grained 3D point cloud.

3D Generation • Point Cloud Generation

Mixed one-bit compressive sensing with applications to overexposure correction for CT reconstruction

no code implementations • 3 Jan 2017 • Xiaolin Huang, Yan Xia, Lei Shi, Yixing Huang, Ming Yan, Joachim Hornegger, Andreas Maier

Aiming at overexposure correction for computed tomography (CT) reconstruction, we in this paper propose a mixed one-bit compressive sensing (M1bit-CS) to acquire information from both regular and saturated measurements.

Compressive Sensing • Computed Tomography (CT) • +1

Learning Discriminative Reconstructions for Unsupervised Outlier Removal

no code implementations • ICCV 2015 • Yan Xia, Xudong Cao, Fang Wen, Gang Hua, Jian Sun

We study the problem of automatically removing outliers from noisy data, with application for removing outlier images from an image collection.

Sparse Projections for High-Dimensional Binary Codes

no code implementations • CVPR 2015 • Yan Xia, Kaiming He, Pushmeet Kohli, Jian Sun

This paper addresses the problem of learning long binary codes from high-dimensional data.

Image Classification • Image Retrieval • +2
