Search Results for author: Xiaoyong Du

Found 27 papers, 15 papers with code

Cost-Effective In-Context Learning for Entity Resolution: A Design Space Exploration

1 code implementation 7 Dec 2023 Meihao Fan, Xiaoyue Han, Ju Fan, Chengliang Chai, Nan Tang, Guoliang Li, Xiaoyong Du

However, existing ICL approaches to ER typically necessitate providing a task description and a set of demonstrations for each entity pair and thus have limitations on the monetary cost of interfacing LLMs.

Entity Resolution In-Context Learning

Create and Find Flatness: Building Flat Training Spaces in Advance for Continual Learning

1 code implementation 20 Sep 2023 Wenhang Shi, Yiren Chen, Zhe Zhao, Wei Lu, Kimmo Yan, Xiaoyong Du

Therefore, we shift the attention to the current task learning stage, presenting a novel framework, C&F (Create and Find Flatness), which builds a flat training space for each task in advance.

Continual Learning

WavMark: Watermarking for Audio Generation

no code implementations 24 Aug 2023 Guangyu Chen, Yu Wu, Shujie Liu, Tao Liu, Xiaoyong Du, Furu Wei

Recent breakthroughs in zero-shot voice synthesis have enabled imitating a speaker's voice using just a few seconds of recording while maintaining a high level of realism.

Audio Generation

REAL: A Representative Error-Driven Approach for Active Learning

1 code implementation 3 Jul 2023 Cheng Chen, Yong Wang, Lizi Liao, Yueguo Chen, Xiaoyong Du

Given a limited labeling budget, active learning (AL) aims to sample the most informative instances from an unlabeled pool to acquire labels for subsequent model training.

Active Learning Informativeness +2

Interleaving Pre-Trained Language Models and Large Language Models for Zero-Shot NL2SQL Generation

1 code implementation 15 Jun 2023 Zihui Gu, Ju Fan, Nan Tang, Songyue Zhang, Yuxin Zhang, Zui Chen, Lei Cao, Guoliang Li, Sam Madden, Xiaoyong Du

PLMs can perform well in schema alignment but struggle with complex reasoning, while LLMs are superior in complex reasoning tasks but cannot achieve precise schema alignment.

Unicorn: A Unified Multi-tasking Model for Supporting Matching Tasks in Data Integration

1 code implementation SIGMOD/PODS 2023 Jianhong Tu, Ju Fan, Nan Tang, Peng Wang, Guoliang Li, Xiaoyong Du, Xiaofeng Jia, Song Gao

The widely used practice is to build task-specific or even dataset-specific solutions, which are hard to generalize and disable the opportunities of knowledge sharing that can be learned from different datasets and multiple tasks.

Entity Resolution Zero-Shot Learning

ChatPipe: Orchestrating Data Preparation Program by Optimizing Human-ChatGPT Interactions

no code implementations 7 Apr 2023 Sibei Chen, Hanbing Liu, Weiting Jin, Xiangyu Sun, Xiaoyao Feng, Ju Fan, Xiaoyong Du, Nan Tang

Orchestrating a high-quality data preparation program is essential for successful machine learning (ML), but it is known to be time and effort consuming.

TencentPretrain: A Scalable and Flexible Toolkit for Pre-training Models of Different Modalities

3 code implementations 13 Dec 2022 Zhe Zhao, Yudong Li, Cheng Hou, Jing Zhao, Rong Tian, Weijie Liu, Yiren Chen, Ningyuan Sun, Haoyan Liu, Weiquan Mao, Han Guo, Weigang Guo, Taiqiang Wu, Tao Zhu, Wenhang Shi, Chen Chen, Shan Huang, Sihong Chen, Liqun Liu, Feifei Li, Xiaoshuai Chen, Xingwu Sun, Zhanhui Kang, Xiaoyong Du, Linlin Shen, Kimmo Yan

The proposed pre-training models of different modalities are showing a rising trend of homogeneity in their model structures, which brings the opportunity to implement different pre-training models within a uniform framework.

PASTA: Table-Operations Aware Fact Verification via Sentence-Table Cloze Pre-training

1 code implementation 5 Nov 2022 Zihui Gu, Ju Fan, Nan Tang, Preslav Nakov, Xiaoman Zhao, Xiaoyong Du

In particular, on the complex set of TabFact, which contains multiple operations, PASTA largely outperforms the previous state of the art by 4.7 points (85.6% vs. 80.9%), and the gap between PASTA and human performance on the small TabFact test set is narrowed to just 1.5 points (90.6% vs. 92.1%).

Fact Checking Fact Verification +5

Human Pose Driven Object Effects Recommendation

no code implementations 17 Sep 2022 Zhaoxin Fan, Fengxin Li, Hongyan Liu, Jun He, Xiaoyong Du

In this paper, we research the new topic of object effects recommendation in micro-video platforms, which is a challenging but important task for many practical applications such as advertisement insertion.

Object

RPR-Net: A Point Cloud-based Rotation-aware Large Scale Place Recognition Network

no code implementations 29 Aug 2021 Zhaoxin Fan, Zhenbo Song, Wenping Zhang, Hongyan Liu, Jun He, Xiaoyong Du

Third, we apply these kernels to previous point cloud features to generate new features, which is the well-known SO(3) mapping process.

Autonomous Driving Point Cloud Retrieval +2

A Benchmark for Voice-Face Cross-Modal Matching and Retrieval

no code implementations 1 Jan 2021 Chuyuan Xiong, Deyuan Zhang, Tao Liu, Xiaoyong Du, Jiankun Tian, Songyan Xue

In this paper, a baseline evaluation framework is proposed for voice-face matching and retrieval tasks.

Retrieval

RPT: Relational Pre-trained Transformer Is Almost All You Need towards Democratizing Data Preparation

no code implementations 4 Dec 2020 Nan Tang, Ju Fan, Fangyi Li, Jianhong Tu, Xiaoyong Du, Guoliang Li, Sam Madden, Mourad Ouzzani

RPT is pre-trained for a tuple-to-tuple model by corrupting the input tuple and then learning a model to reconstruct the original tuple.

Denoising Entity Resolution +4

Scalable Graph Neural Networks via Bidirectional Propagation

1 code implementation NeurIPS 2020 Ming Chen, Zhewei Wei, Bolin Ding, Yaliang Li, Ye Yuan, Xiaoyong Du, Ji-Rong Wen

Most notably, GBP can deliver superior performance on a graph with over 60 million nodes and 1.8 billion edges in less than half an hour on a single machine.

Graph Sampling

Relational Data Synthesis using Generative Adversarial Networks: A Design Space Exploration

1 code implementation 28 Aug 2020 Ju Fan, Tongyu Liu, Guoliang Li, Junyou Chen, Yuwei Shen, Xiaoyong Du

We conduct extensive experiments to explore the design space and compare with traditional data synthesis approaches.

Privacy Preserving

Deep Manifold Embedding for Hyperspectral Image Classification

1 code implementation 24 Dec 2019 Zhiqiang Gong, Weidong Hu, Xiaoyong Du, Ping Zhong, Panhe Hu

Deep learning methods have played an increasingly important role in hyperspectral image classification.

Classification Clustering +2

Voice-Face Cross-modal Matching and Retrieval: A Benchmark

no code implementations 21 Nov 2019 Chuyuan Xiong, Deyuan Zhang, Tao Liu, Xiaoyong Du

It achieves state-of-the-art performance with various performance metrics on different tasks and with high test confidence on large scale datasets, which can be taken as a baseline for the follow-up research.

Retrieval

UER: An Open-Source Toolkit for Pre-training Models

1 code implementation IJCNLP 2019 Zhe Zhao, Hui Chen, Jinbin Zhang, Xin Zhao, Tao Liu, Wei Lu, Xi Chen, Haotang Deng, Qi Ju, Xiaoyong Du

Existing works, including ELMO and BERT, have revealed the importance of pre-training for NLP tasks.

Subword-level Composition Functions for Learning Word Embeddings

no code implementations WS 2018 Bofang Li, Aleksandr Drozd, Tao Liu, Xiaoyong Du

Subword-level information is crucial for capturing the meaning and morphology of words, especially for out-of-vocabulary entries.

Learning Word Embeddings

Learning Document Embeddings by Predicting N-grams for Sentiment Classification of Long Movie Reviews

1 code implementation 27 Dec 2015 Bofang Li, Tao Liu, Xiaoyong Du, Deyuan Zhang, Zhe Zhao

Many document embedding methods have been proposed to capture semantics, but they still cannot outperform bag-of-ngram based methods on this task.

General Classification Sentiment Analysis +1
