Search Results for author: Luo Si

Found 99 papers, 49 papers with code

mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections

3 code implementations 24 May 2022 Chenliang Li, Haiyang Xu, Junfeng Tian, Wei Wang, Ming Yan, Bin Bi, Jiabo Ye, Hehong Chen, Guohai Xu, Zheng Cao, Ji Zhang, Songfang Huang, Fei Huang, Jingren Zhou, Luo Si

Large-scale pretrained foundation models have been an emerging paradigm for building artificial intelligence (AI) systems, which can be quickly adapted to a wide range of downstream tasks.

Computational Efficiency Image Captioning +6

PALM: Pre-training an Autoencoding&Autoregressive Language Model for Context-conditioned Generation

2 code implementations 14 Apr 2020 Bin Bi, Chenliang Li, Chen Wu, Ming Yan, Wei Wang, Songfang Huang, Fei Huang, Luo Si

An extensive set of experiments shows that PALM achieves new state-of-the-art results on a variety of language generation benchmarks covering generative question answering (Rank 1 on the official MARCO leaderboard), abstractive summarization on CNN/DailyMail as well as Gigaword, question generation on SQuAD, and conversational response generation on Cornell Movie Dialogues.

Abstractive Text Summarization Conversational Response Generation +8

VECO: Variable and Flexible Cross-lingual Pre-training for Language Understanding and Generation

1 code implementation ACL 2021 Fuli Luo, Wei Wang, Jiahao Liu, Yijia Liu, Bin Bi, Songfang Huang, Fei Huang, Luo Si

Existing work in multilingual pretraining has demonstrated the potential of cross-lingual transferability by training a unified Transformer encoder for multiple languages.

Language Modelling Question Answering +4

Proton: Probing Schema Linking Information from Pre-trained Language Models for Text-to-SQL Parsing

2 code implementations 28 Jun 2022 Lihan Wang, Bowen Qin, Binyuan Hui, Bowen Li, Min Yang, Bailin Wang, Binhua Li, Fei Huang, Luo Si, Yongbin Li

The importance of building text-to-SQL parsers which can be applied to new databases has long been acknowledged, and a critical step to achieve this goal is schema linking, i.e., properly recognizing mentions of unseen columns or tables when generating SQLs.

SQL Parsing Text-To-SQL

SUN: Exploring Intrinsic Uncertainties in Text-to-SQL Parsers

1 code implementation COLING 2022 Bowen Qin, Lihan Wang, Binyuan Hui, Bowen Li, Xiangpeng Wei, Binhua Li, Fei Huang, Luo Si, Min Yang, Yongbin Li

To improve the generalizability and stability of neural text-to-SQL parsers, we propose a model uncertainty constraint to refine the query representations by enforcing the output representations of different perturbed encoding networks to be consistent with each other.

SQL Parsing Text-To-SQL

SPACE-3: Unified Dialog Model Pre-training for Task-Oriented Dialog Understanding and Generation

1 code implementation 14 Sep 2022 Wanwei He, Yinpei Dai, Min Yang, Jian Sun, Fei Huang, Luo Si, Yongbin Li

To capture the structured dialog semantics, we pre-train the dialog understanding module via a novel tree-induced semi-supervised contrastive learning objective with the help of extra dialog annotations.

Contrastive Learning dialog state tracking +1

STAR: SQL Guided Pre-Training for Context-dependent Text-to-SQL Parsing

1 code implementation 21 Oct 2022 ZeFeng Cai, Xiangyu Li, Binyuan Hui, Min Yang, Bowen Li, Binhua Li, Zheng Cao, Weijie Li, Fei Huang, Luo Si, Yongbin Li

Concretely, we propose two novel pre-training objectives which respectively explore the context-dependent interactions of NL utterances and SQL queries within each text-to-SQL conversation: (i) schema state tracking (SST) objective that tracks and explores the schema states of context-dependent SQL queries in the form of schema-states by predicting and updating the value of each schema slot during interaction; (ii) utterance dependency tracking (UDT) objective that employs weighted contrastive learning to pull together two semantically similar NL utterances and push away the representations of semantically dissimilar NL utterances within each conversation.

Contrastive Learning SQL Parsing +1

Towards Generalizable and Robust Text-to-SQL Parsing

1 code implementation 23 Oct 2022 Chang Gao, Bowen Li, Wenxuan Zhang, Wai Lam, Binhua Li, Fei Huang, Luo Si, Yongbin Li

Text-to-SQL parsing tackles the problem of mapping natural language questions to executable SQL queries.

SQL Parsing Text-To-SQL

Graphix-T5: Mixing Pre-Trained Transformers with Graph-Aware Layers for Text-to-SQL Parsing

1 code implementation 18 Jan 2023 Jinyang Li, Binyuan Hui, Reynold Cheng, Bowen Qin, Chenhao Ma, Nan Huo, Fei Huang, Wenyu Du, Luo Si, Yongbin Li

Recently, the pre-trained text-to-text transformer model, namely T5, though not specialized for text-to-SQL parsing, has achieved state-of-the-art performance on standard benchmarks targeting domain generalization.

Domain Generalization Inductive Bias +3

Relation Extraction as Open-book Examination: Retrieval-enhanced Prompt Tuning

1 code implementation 4 May 2022 Xiang Chen, Lei LI, Ningyu Zhang, Chuanqi Tan, Fei Huang, Luo Si, Huajun Chen

Note that the previous parametric learning paradigm can be viewed as memorization regarding training data as a book and inference as the close-book test.

Few-Shot Learning Memorization +3

Decoupling Knowledge from Memorization: Retrieval-augmented Prompt Learning

2 code implementations 29 May 2022 Xiang Chen, Lei LI, Ningyu Zhang, Xiaozhuan Liang, Shumin Deng, Chuanqi Tan, Fei Huang, Luo Si, Huajun Chen

Specifically, vanilla prompt learning may struggle to utilize atypical instances by rote during fully-supervised training or overfit shallow patterns with low-shot data.

Few-Shot Text Classification Memorization +5

KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction

1 code implementation 15 Apr 2021 Xiang Chen, Ningyu Zhang, Xin Xie, Shumin Deng, Yunzhi Yao, Chuanqi Tan, Fei Huang, Luo Si, Huajun Chen

To this end, we focus on incorporating knowledge among relation labels into prompt-tuning for relation extraction and propose a Knowledge-aware Prompt-tuning approach with synergistic optimization (KnowPrompt).

Ranked #5 on Dialog Relation Extraction on DialogRE (F1 (v1) metric)

Dialog Relation Extraction Language Modelling +3

Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion

1 code implementation 4 May 2022 Xiang Chen, Ningyu Zhang, Lei LI, Shumin Deng, Chuanqi Tan, Changliang Xu, Fei Huang, Luo Si, Huajun Chen

Since most MKGs are far from complete, extensive knowledge graph completion studies have been proposed focusing on the multimodal entity, relation extraction and link prediction.

Information Retrieval Link Prediction +4

Document-level Relation Extraction as Semantic Segmentation

2 code implementations 7 Jun 2021 Ningyu Zhang, Xiang Chen, Xin Xie, Shumin Deng, Chuanqi Tan, Mosha Chen, Fei Huang, Luo Si, Huajun Chen

Specifically, we leverage an encoder module to capture the context information of entities and a U-shaped segmentation module over the image-style feature map to capture global interdependency among triples.

Document-level Relation Extraction Relation +2

Good Visual Guidance Makes A Better Extractor: Hierarchical Visual Prefix for Multimodal Entity and Relation Extraction

1 code implementation 7 May 2022 Xiang Chen, Ningyu Zhang, Lei LI, Yunzhi Yao, Shumin Deng, Chuanqi Tan, Fei Huang, Luo Si, Huajun Chen

To deal with these issues, we propose a novel Hierarchical Visual Prefix fusion NeTwork (HVPNeT) for visual-enhanced entity and relation extraction, aiming to achieve more effective and robust performance.

named-entity-recognition Named Entity Recognition +3

Relational Learning with Gated and Attentive Neighbor Aggregator for Few-Shot Knowledge Graph Completion

1 code implementation 27 Apr 2021 Guanglin Niu, Yang Li, Chengguang Tang, Ruiying Geng, Jian Dai, Qiao Liu, Hao Wang, Jian Sun, Fei Huang, Luo Si

Moreover, modeling and inferring complex relations of one-to-many (1-N), many-to-one (N-1), and many-to-many (N-N) by previous knowledge graph completion approaches requires high model complexity and a large amount of training instances.

Few-Shot Learning Relational Reasoning

ENT-DESC: Entity Description Generation by Exploring Knowledge Graph

1 code implementation EMNLP 2020 Liying Cheng, Dekun Wu, Lidong Bing, Yan Zhang, Zhanming Jie, Wei Lu, Luo Si

Previous works on knowledge-to-text generation take as input a few RDF triples or key-value pairs conveying the knowledge of some entities to generate a natural language description.

Graph-to-Sequence KG-to-Text Generation +2

"Bilingual Expert" Can Find Translation Errors

1 code implementation 25 Jul 2018 Kai Fan, Jiayi Wang, Bo Li, Fengming Zhou, Boxing Chen, Luo Si

Recent advances in statistical machine translation via the adoption of neural sequence-to-sequence models empower the end-to-end system to achieve state-of-the-art in many WMT benchmarks.

Language Modelling Machine Translation +1

MReD: A Meta-Review Dataset for Structure-Controllable Text Generation

1 code implementation Findings (ACL) 2022 Chenhui Shen, Liying Cheng, Ran Zhou, Lidong Bing, Yang You, Luo Si

A more useful text generator should leverage both the input text and the control signal to guide the generation, which can only be built with a deep understanding of the domain knowledge.

Text Generation Text Summarization

A Dataset for Hyper-Relational Extraction and a Cube-Filling Approach

1 code implementation 18 Nov 2022 Yew Ken Chia, Lidong Bing, Sharifah Mahani Aljunied, Luo Si, Soujanya Poria

Hence, we propose CubeRE, a cube-filling model that is inspired by table-filling approaches and explicitly considers the interaction between relation triplets and qualifiers.

graph construction Hyper-Relational Extraction +1

DialogueCSE: Dialogue-based Contrastive Learning of Sentence Embeddings

1 code implementation EMNLP 2021 Che Liu, Rui Wang, Jinghua Liu, Jian Sun, Fei Huang, Luo Si

Learning sentence embeddings from dialogues has drawn increasing attention due to its low annotation cost and high domain adaptability.

Contrastive Learning Semantic Textual Similarity +2

Syntax-aware Neural Semantic Role Labeling

1 code implementation 22 Jul 2019 Qingrong Xia, Zhenghua Li, Min Zhang, Meishan Zhang, Guohong Fu, Rui Wang, Luo Si

Semantic role labeling (SRL), also known as shallow semantic parsing, is an important yet challenging task in NLP.

Semantic Parsing Semantic Role Labeling +1

A Unified Span-Based Approach for Opinion Mining with Syntactic Constituents

1 code implementation NAACL 2021 Qingrong Xia, Bo Zhang, Rui Wang, Zhenghua Li, Yue Zhang, Fei Huang, Luo Si, Min Zhang

Fine-grained opinion mining (OM) has attracted increasing attention in the natural language processing (NLP) community; it aims to find the opinion structures of "Who expressed what opinions towards what" in one sentence.

Multi-Task Learning Opinion Mining +1

IAM: A Comprehensive and Large-Scale Dataset for Integrated Argument Mining Tasks

1 code implementation ACL 2022 Liying Cheng, Lidong Bing, Ruidan He, Qian Yu, Yan Zhang, Luo Si

Traditionally, a debate usually requires a manual preparation process, including reading plenty of articles, selecting the claims, identifying the stances of the claims, seeking the evidence for the claims, etc.

Claim-Evidence Pair Extraction (CEPE) Claim Extraction with Stance Classification (CESC) +1

Semi-supervised Domain Adaptation for Dependency Parsing

1 code implementation ACL 2019 Zhenghua Li, Xue Peng, Min Zhang, Rui Wang, Luo Si

During the past decades, due to the lack of sufficient labeled data, most studies on cross-domain parsing focus on unsupervised domain adaptation, assuming there is no target-domain training data.

Chinese Dependency Parsing Dependency Parsing +3

ConNER: Consistency Training for Cross-lingual Named Entity Recognition

1 code implementation 17 Nov 2022 Ran Zhou, Xin Li, Lidong Bing, Erik Cambria, Luo Si, Chunyan Miao

We propose ConNER as a novel consistency training framework for cross-lingual NER, which comprises: (1) translation-based consistency training on unlabeled target-language data, and (2) dropout-based consistency training on labeled source-language data.

Cross-Lingual NER Knowledge Distillation +3

Enhancing Multilingual Language Model with Massive Multilingual Knowledge Triples

1 code implementation 22 Nov 2021 Linlin Liu, Xin Li, Ruidan He, Lidong Bing, Shafiq Joty, Luo Si

In this work, we explore methods to make better use of the multilingual annotation and language agnostic property of KG triples, and present novel knowledge based multilingual language models (KMLMs) trained directly on the knowledge triples.

Knowledge Graphs Language Modelling +9

Towards Robust Low-Resource Fine-Tuning with Multi-View Compressed Representations

1 code implementation 16 Nov 2022 Linlin Liu, Xingxuan Li, Megh Thakkar, Xin Li, Shafiq Joty, Luo Si, Lidong Bing

Due to the huge number of parameters, fine-tuning of pretrained language models (PLMs) is prone to overfitting in low-resource scenarios.

Argument Pair Extraction via Attention-guided Multi-Layer Multi-Cross Encoding

1 code implementation ACL 2021 Liying Cheng, Tianyu Wu, Lidong Bing, Luo Si

Prior research work treats this task as a sequence labeling problem and a binary classification problem on two passages that are directly concatenated together, which has a limitation of not fully utilizing the unique characteristics and inherent relations of two different passages.

Argument Pair Extraction (APE) Binary Classification

SentBS: Sentence-level Beam Search for Controllable Summarization

1 code implementation 26 Oct 2022 Chenhui Shen, Liying Cheng, Lidong Bing, Yang You, Luo Si

A wide range of control perspectives have been explored in controllable text generation.

Sentence Text Generation

From Cloze to Comprehension: Retrofitting Pre-trained Masked Language Model to Pre-trained Machine Reader

1 code implementation 9 Dec 2022 Weiwen Xu, Xin Li, Wenxuan Zhang, Meng Zhou, Wai Lam, Luo Si, Lidong Bing

We present Pre-trained Machine Reader (PMR), a novel method for retrofitting pre-trained masked language models (MLMs) to pre-trained machine reading comprehension (MRC) models without acquiring labeled data.

Classification Extractive Question-Answering +6

Symmetric Regularization based BERT for Pair-wise Semantic Reasoning

1 code implementation 8 Sep 2019 Weidi Xu, Xingyi Cheng, Kunlong Chen, Wei Wang, Bin Bi, Ming Yan, Chen Wu, Luo Si, Wei Chu, Taifeng Wang

To remedy this, we propose to augment the NSP task to a 3-class categorization task, which includes a category for previous sentence prediction (PSP).

Machine Reading Comprehension Natural Language Inference +2

Perceive Your Users in Depth: Learning Universal User Representations from Multiple E-commerce Tasks

no code implementations 28 May 2018 Yabo Ni, Dan Ou, Shichen Liu, Xiang Li, Wenwu Ou, An-Xiang Zeng, Luo Si

In this work, we propose to learn universal user representations across multiple tasks for more effective personalization.

Cascade Ranking for Operational E-commerce Search

no code implementations 7 Jun 2017 Shichen Liu, Fei Xiao, Wenwu Ou, Luo Si

Real-world search applications often involve multiple factors of preferences or constraints with respect to user experience and computational costs, such as search accuracy, search latency, size of search results and total CPU cost, while most existing search solutions only address one or two factors.

A Joint Probabilistic Classification Model of Relevant and Irrelevant Sentences in Mathematical Word Problems

no code implementations 21 Nov 2014 Suleyman Cetintas, Luo Si, Yan Ping Xin, Dake Zhang, Joo Young Park, Ron Tzur

Identification of relevant and irrelevant sentences in math word problems is an important step for calculating the difficulty levels of such problems.

Classification General Classification +4

A Deep Cascade Model for Multi-Document Reading Comprehension

no code implementations 28 Nov 2018 Ming Yan, Jiangnan Xia, Chen Wu, Bin Bi, Zhongzhou Zhao, Ji Zhang, Luo Si, Rui Wang, Wei Wang, Haiqing Chen

To address this problem, we develop a novel deep cascade learning model, which progressively evolves from the document-level and paragraph-level ranking of candidate texts to more precise answer extraction with machine reading comprehension.

Machine Reading Comprehension Question Answering +2

Supervised Treebank Conversion: Data and Approaches

no code implementations ACL 2018 Xinzhou Jiang, Zhenghua Li, Bo Zhang, Min Zhang, Sheng Li, Luo Si

Treebank conversion is a straightforward and effective way to exploit various heterogeneous treebanks for boosting parsing performance.

Dependency Parsing Multi-Task Learning +1

NAI-SEA at SemEval-2018 Task 5: An Event Search System

no code implementations SEMEVAL 2018 Yingchi Liu, Quanzhi Li, Luo Si

In this paper, we describe Alibaba's participating system in SemEval-2018 Task 5: Counting Events and Participants in the Long Tail.

Clustering Retrieval

Alibaba Submission for WMT18 Quality Estimation Task

no code implementations WS 2018 Jiayi Wang, Kai Fan, Bo Li, Fengming Zhou, Boxing Chen, Yangbin Shi, Luo Si

The goal of WMT 2018 Shared Task on Translation Quality Estimation is to investigate automatic methods for estimating the quality of machine translation results without reference translations.

Automatic Post-Editing Language Modelling +2

Alibaba at IJCNLP-2017 Task 2: A Boosted Deep System for Dimensional Sentiment Analysis of Chinese Phrases

no code implementations IJCNLP 2017 Xin Zhou, Jian Wang, Xu Xie, Changlong Sun, Luo Si

For the word-level task, our best run achieved MAE 0.545 (ranked 2nd) and PCC 0.892 (ranked 2nd) in valence prediction, and MAE 0.857 (ranked 1st) and PCC 0.678 (ranked 2nd) in arousal prediction.

Clustering Feature Engineering +3

Multi-Instance Learning for End-to-End Knowledge Base Question Answering

no code implementations 6 Mar 2019 Mengxi Wei, Yifan He, Qiong Zhang, Luo Si

This paper proposes a novel approach based on multiple instance learning to address the problem of noisy answers by exploring consensus among answers to the same question in training end-to-end KBQA models.

Knowledge Base Question Answering Multiple Instance Learning

Aspect Sentiment Classification Towards Question-Answering with Reinforced Bidirectional Attention Network

no code implementations ACL 2019 Jingjing Wang, Changlong Sun, Shoushan Li, Xiaozhong Liu, Luo Si, Min Zhang, Guodong Zhou

This paper extends the research to interactive reviews and proposes a new research task, namely Aspect Sentiment Classification towards Question-Answering (ASC-QA), for real-world applications.

General Classification Question Answering +2

StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding

no code implementations ICLR 2020 Wei Wang, Bin Bi, Ming Yan, Chen Wu, Zuyi Bao, Jiangnan Xia, Liwei Peng, Luo Si

Recently, the pre-trained language model, BERT (and its robustly optimized version RoBERTa), has attracted a lot of attention in natural language understanding (NLU), and achieved state-of-the-art accuracy in various NLU tasks, such as sentiment classification, natural language inference, semantic textual similarity and question answering.

Language Modelling Linguistic Acceptability +7

Syntax-Enhanced Self-Attention-Based Semantic Role Labeling

no code implementations IJCNLP 2019 Yue Zhang, Rui Wang, Luo Si

As a fundamental NLP task, semantic role labeling (SRL) aims to discover the semantic roles for each predicate within one sentence.

Semantic Role Labeling Sentence

Uncover Sexual Harassment Patterns from Personal Stories by Joint Key Element Extraction and Categorization

no code implementations IJCNLP 2019 Yingchi Liu, Quanzhi Li, Marika Cifor, Xiaozhong Liu, Qiong Zhang, Luo Si

Sexual harassment occurred in a variety of situations, and categorization of the stories and extraction of their key elements will provide great help for the related parties to understand and address sexual harassment.

Review-based Question Generation with Adaptive Instance Transfer and Augmentation

no code implementations ACL 2020 Qian Yu, Lidong Bing, Qiong Zhang, Wai Lam, Luo Si

We propose an iterative learning framework for handling this challenge via adaptive transfer and augmentation of the training instances with the help of the available user-posed question-answer data.

Question Generation Question-Generation

Rumor Detection on Social Media: Datasets, Methods and Opportunities

no code implementations WS 2019 Quanzhi Li, Qiong Zhang, Luo Si, Yingchi Liu

Social media platforms have been used for information and news gathering, and they are very valuable in many applications.

Tracing the Propagation Path: A Flow Perspective of Representation Learning on Graphs

no code implementations 12 Dec 2019 Menghan Wang, Kun Zhang, Gulin Li, Keping Yang, Luo Si

We generalize the propagation strategies of current GCNs as a "Sink→Source" mode, which seems to be an underlying cause of the two challenges.

Representation Learning

Aspect Sentiment Classification with Document-level Sentiment Preference Modeling

no code implementations ACL 2020 Xiao Chen, Changlong Sun, Jingjing Wang, Shoushan Li, Luo Si, Min Zhang, Guodong Zhou

This justifies the importance of the document-level sentiment preference information to ASC and the effectiveness of our approach capturing such information.

Classification General Classification +4

Recommending Complementary Products in E-Commerce Push Notifications with a Mixture Model Approach

no code implementations 25 Jul 2017 Huasha Zhao, Luo Si, Xiaogang Li, Qiong Zhang

The item with the highest predicted open rate is then chosen to be included in the push notification message for each user.

Product Recommendation

De-Biased Court's View Generation with Causality

no code implementations EMNLP 2020 Yiquan Wu, Kun Kuang, Yating Zhang, Xiaozhong Liu, Changlong Sun, Jun Xiao, Yueting Zhuang, Luo Si, Fei Wu

Court's view generation is a novel but essential task for legal AI, aiming at improving the interpretability of judgment prediction results and enabling automatic legal document generation.

counterfactual Text Generation

PALM: Pre-training an Autoencoding&Autoregressive Language Model for Context-conditioned Generation

no code implementations EMNLP 2020 Bin Bi, Chenliang Li, Chen Wu, Ming Yan, Wei Wang, Songfang Huang, Fei Huang, Luo Si

An extensive set of experiments shows that PALM achieves new state-of-the-art results on a variety of language generation benchmarks covering generative question answering (Rank 1 on the official MARCO leaderboard), abstractive summarization on CNN/DailyMail as well as Gigaword, question generation on SQuAD, and conversational response generation on Cornell Movie Dialogues.

Abstractive Text Summarization Conversational Response Generation +8

Leveraging Online Shopping Behaviors as a Proxy for Personal Lifestyle Choices: New Insights into Chronic Disease Prevention Literacy

no code implementations 29 Apr 2021 Yongzhen Wang, Xiaozhong Liu, Katy Börner, Jun Lin, Yingnan Ju, Changlong Sun, Luo Si

Objective: Ubiquitous internet access is reshaping the way we live, but it is accompanied by unprecedented challenges in preventing chronic diseases that are usually planted by long exposure to unhealthy lifestyles.

Medical Diagnosis

Preview, Attend and Review: Schema-Aware Curriculum Learning for Multi-Domain Dialog State Tracking

no code implementations 1 Jun 2021 Yinpei Dai, Hangyu Li, Yongbin Li, Jian Sun, Fei Huang, Luo Si, Xiaodan Zhu

Existing dialog state tracking (DST) models are trained with dialog data in a random order, neglecting rich structural information in a dataset.

Ranked #1 on Multi-domain Dialogue State Tracking on MULTIWOZ 2.1 (using extra training data)

dialog state tracking Multi-domain Dialogue State Tracking

On the Effectiveness of Adapter-based Tuning for Pretrained Language Model Adaptation

no code implementations ACL 2021 Ruidan He, Linlin Liu, Hai Ye, Qingyu Tan, Bosheng Ding, Liying Cheng, Jia-Wei Low, Lidong Bing, Luo Si

It works by adding light-weight adapter modules to a pretrained language model (PrLM) and only updating the parameters of adapter modules when learning on a downstream task.

Language Modelling

MulDA: A Multilingual Data Augmentation Framework for Low-Resource Cross-Lingual NER

no code implementations ACL 2021 Linlin Liu, Bosheng Ding, Lidong Bing, Shafiq Joty, Luo Si, Chunyan Miao

With the source-language data as well as the translated data, a generation-based multilingual data augmentation method is introduced to further increase diversity by generating synthetic labeled data in multiple languages.

Cross-Lingual NER Data Augmentation +5

VECO: Variable Encoder-decoder Pre-training for Cross-lingual Understanding and Generation

no code implementations 28 Sep 2020 Fuli Luo, Wei Wang, Jiahao Liu, Yijia Liu, Bin Bi, Songfang Huang, Fei Huang, Luo Si

Recent studies about learning multilingual representations have achieved significant performance gains across a wide range of downstream cross-lingual tasks.

Language Modelling Masked Language Modeling +5

Duplex Conversation: Towards Human-like Interaction in Spoken Dialogue Systems

no code implementations 30 May 2022 Ting-En Lin, Yuchuan Wu, Fei Huang, Luo Si, Jian Sun, Yongbin Li

In this paper, we present Duplex Conversation, a multi-turn, multimodal spoken dialogue system that enables telephone-based agents to interact with customers like a human.

Data Augmentation Spoken Dialogue Systems

Bi-VLDoc: Bidirectional Vision-Language Modeling for Visually-Rich Document Understanding

no code implementations 27 Jun 2022 Chuwei Luo, Guozhi Tang, Qi Zheng, Cong Yao, Lianwen Jin, Chenliang Li, Yang Xue, Luo Si

Multi-modal document pre-trained models have proven to be very effective in a variety of visually-rich document understanding (VrDU) tasks.

Document Classification document understanding +2

A Survey on Text-to-SQL Parsing: Concepts, Methods, and Future Directions

no code implementations 29 Aug 2022 Bowen Qin, Binyuan Hui, Lihan Wang, Min Yang, Jinyang Li, Binhua Li, Ruiying Geng, Rongyu Cao, Jian Sun, Luo Si, Fei Huang, Yongbin Li

In recent years, deep neural networks have significantly advanced this task by neural generation models, which automatically learn a mapping function from an input NL question to an output SQL query.

SQL Parsing Text-To-SQL

Doc2Bot: Accessing Heterogeneous Documents via Conversational Bots

no code implementations 20 Oct 2022 Haomin Fu, Yeqin Zhang, Haiyang Yu, Jian Sun, Fei Huang, Luo Si, Yongbin Li, Cam-Tu Nguyen

This paper introduces Doc2Bot, a novel dataset for building machines that help users seek information via conversations.

dialog state tracking Response Generation

Ranking Preserving Hashing for Fast Similarity Search

no code implementations AAAI 2015 Qifan Wang, Zhiwei Zhang, Luo Si

But in many real world applications, ranking measure is important for evaluating the quality of hashing codes. In this paper, we propose a novel Ranking Preserving Hashing (RPH) approach that directly optimizes a popular ranking measure, Normalized Discounted Cumulative Gain (NDCG), to obtain effective hashing codes with high ranking accuracy.

Computational Efficiency
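
The Ranking Preserving Hashing entry above names Normalized Discounted Cumulative Gain (NDCG) as the measure being optimized. As a reminder of what that measure computes, here is a minimal sketch of NDCG in its common exponential-gain form. It is not the paper's optimization procedure; the function names `dcg` and `ndcg` and the optional cutoff `k` are illustrative assumptions, not taken from the paper.

```python
import math
from typing import Optional, Sequence


def dcg(relevances: Sequence[float]) -> float:
    """Discounted cumulative gain with exponential gain (2^rel - 1)."""
    # rank is 0-based, so the discount for the first item is log2(2) = 1.
    return sum(
        (2.0 ** rel - 1.0) / math.log2(rank + 2)
        for rank, rel in enumerate(relevances)
    )


def ndcg(relevances: Sequence[float], k: Optional[int] = None) -> float:
    """NDCG of a ranked list of graded relevance labels.

    `relevances` lists the labels in the order the system ranked the items.
    `k` (hypothetical parameter) truncates the evaluation to the top k items.
    """
    ranked = list(relevances)
    ideal = sorted(ranked, reverse=True)  # best possible ordering
    if k is not None:
        ranked, ideal = ranked[:k], ideal[:k]
    ideal_dcg = dcg(ideal)
    # Define NDCG as 0 when no item is relevant, to avoid dividing by zero.
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0
```

A ranking that already lists items in decreasing relevance scores 1.0, and any other ordering of the same labels scores lower, which is why the measure is sensitive to the order of retrieved items rather than only to their set.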

Distinguish Before Answer: Generating Contrastive Explanation as Knowledge for Commonsense Question Answering

no code implementations 14 May 2023 Qianglong Chen, Guohai Xu, Ming Yan, Ji Zhang, Fei Huang, Luo Si, Yin Zhang

Existing knowledge-enhanced methods have achieved remarkable results in certain QA tasks via obtaining diverse knowledge from different knowledge bases.

Explanation Generation Question Answering
