no code implementations • CCL 2020 • Tianyang Zhao, Zhao Yan, Yunbo Cao, Zhoujun Li
Joint entity and relation extraction has received increasing interest recently, due to its capability of utilizing the interactions between both steps.
no code implementations • Findings (EMNLP) 2021 • Tongliang Li, Lei Fang, Jian-Guang Lou, Zhoujun Li
In this paper, we propose to generate text conditioned on the structured data (table) and a prefix (the written text) by leveraging the pre-trained models.
no code implementations • 14 Sep 2024 • Hongcheng Guo, Wei Zhang, JunHao Chen, Yaonan Gu, Jian Yang, Junjia Du, Binyuan Hui, Tianyu Liu, Jianxin Ma, Chang Zhou, Zhoujun Li
We have conducted extensive experiments on existing large multimodal models, offering insights into their performance and areas for improvement in the image-to-web domain.
no code implementations • 3 Sep 2024 • Liqun Yang, Jian Yang, Chaoren Wei, Guanglin Niu, Ge Zhang, Yunli Wang, Linzheng Chai, Wanxu Xia, Hongcheng Guo, Shun Zhang, Jiaheng Liu, Yuwei Yin, Junran Peng, Jiaxin Ma, Liang Sun, Zhoujun Li
In this work, we propose to adopt fine-tuned large language models (FuzzCoder) to learn patterns in the input files from successful attacks to guide future fuzzing explorations.
no code implementations • 17 Aug 2024 • Xianjie Wu, Jian Yang, Linzheng Chai, Ge Zhang, Jiaheng Liu, Xinrun Du, Di Liang, Daixin Shu, Xianfu Cheng, Tianzhen Sun, Guanglin Niu, Tongliang Li, Zhoujun Li
Recent advancements in Large Language Models (LLMs) have markedly enhanced the interpretation and processing of tabular data, introducing previously unimaginable capabilities.
no code implementations • 11 Jul 2024 • Zhenhe Wu, Zhongqiu Li, Jie Zhang, Mengxiang Li, Yu Zhao, Ruiyu Fang, Zhongjiang He, Xuelong Li, Zhoujun Li, Shuangyong Song
Large language models (LLMs) with in-context learning have significantly improved performance on the text-to-SQL task.
no code implementations • 3 Jul 2024 • Xia Hou, QiFeng Li, Jian Yang, Tongliang Li, Linzheng Chai, Xianjie Wu, Hangyuan Ji, Zhoujun Li, Jixuan Nie, Jingbo Dun, Wenfeng Song
In this paper, we present a novel framework named R2S that leverages the CoD-Chain of Dialogue logic to guide large language models (LLMs) in generating knowledge-intensive multi-turn dialogues for instruction tuning.
no code implementations • 24 Jun 2024 • Tao Sun, Linzheng Chai, Jian Yang, Yuwei Yin, Hongcheng Guo, Jiaheng Liu, Bing Wang, Liqun Yang, Zhoujun Li
When applying LLMs for code generation, recent works mainly focus on directing the models to articulate intermediate natural-language reasoning steps, as in chain-of-thought (CoT) prompting, and then output code with the natural language or other structured intermediate steps.
no code implementations • 5 Jun 2024 • Shun Zhang, Chaoran Yan, Jian Yang, Jiaheng Liu, Ying Mo, Jiaqi Bai, Tongliang Li, Zhoujun Li
New Intent Discovery (NID) aims at detecting known and previously undefined categories of user intent by utilizing limited labeled and massive unlabeled data.
no code implementations • 27 May 2024 • Yinghao Zhu, Changyu Ren, Zixiang Wang, Xiaochen Zheng, Shiyun Xie, Junlan Feng, Xi Zhu, Zhoujun Li, Liantao Ma, Chengwei Pan
However, current models that utilize clinical notes and multivariate time-series EHR data often lack the necessary medical context for precise clinical tasks.
no code implementations • 27 May 2024 • Xianfu Cheng, Hang Zhang, Jian Yang, Xiang Li, Weixiao Zhou, Kui Wu, Fei Liu, Wei Zhang, Tao Sun, Tongliang Li, Zhoujun Li
In the domain of document AI, semi-structured form parsing plays a crucial role.
no code implementations • 22 May 2024 • Wei Zhang, Xianfu Cheng, Yi Zhang, Jian Yang, Hongcheng Guo, Zhoujun Li, Xiaolin Yin, Xiangyuan Guan, Xu Shi, Liangfan Zheng, Bo Zhang
These challenges are two-fold: 1) Massive log templates: the performance and efficiency of most existing parsers drop significantly as logs grow in quantity and vary in length; 2) Complex and changeable semantics: traditional template-matching algorithms cannot accurately match the log templates of complicated industrial logs because they cannot utilize cross-language logs with similar semantics.
no code implementations • 27 Apr 2024 • Chenhao Cui, Yufan Jiang, Shuangzhi Wu, Zhoujun Li
Multi-choice Machine Reading Comprehension (MMRC) aims to select the correct answer from a set of options based on a given passage and question.
no code implementations • 13 Apr 2024 • Shun Zhang, Chaoran Yan, Jian Yang, Changyu Ren, Jiaqi Bai, Tongliang Li, Zhoujun Li
To address the aforementioned challenges, we propose a Robust New Intent Discovery (RoNID) framework optimized by an EM-style method, which focuses on constructing reliable pseudo-labels and obtaining cluster-friendly discriminative representations.
no code implementations • 26 Mar 2024 • Jian Yang, Hongcheng Guo, Yuwei Yin, Jiaqi Bai, Bing Wang, Jiaheng Liu, Xinnian Liang, Linzheng Chai, Liqun Yang, Zhoujun Li
Our method aims to minimize the representation distance of different languages by regarding the image as a central language.
no code implementations • 25 Mar 2024 • Shun Zhang, Jian Yang, Jiaqi Bai, Chaoran Yan, Tongliang Li, Zhao Yan, Zhoujun Li
New Intent Discovery (NID) aims to recognize known and infer new intent categories with the help of limited labeled and large-scale unlabeled data.
1 code implementation • 28 Feb 2024 • Shuhua Shi, Shaohan Huang, Minghui Song, Zhoujun Li, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang
As one of the most popular parameter-efficient fine-tuning (PEFT) methods, low-rank adaptation (LoRA) is commonly applied to fine-tune large language models (LLMs).
1 code implementation • 28 Feb 2024 • Wei Zhang, Hongcheng Guo, Anjie Le, Jian Yang, Jiaheng Liu, Zhoujun Li, Tieqiao Zheng, Shi Xu, Runqiang Zang, Liangfan Zheng, Bo Zhang
Log parsing, which entails transforming raw log messages into structured templates, constitutes a critical phase in the automation of log analytics.
no code implementations • 17 Feb 2024 • Ying Mo, Jiahao Liu, Jian Yang, Qifan Wang, Shun Zhang, Jingang Wang, Zhoujun Li
There has been increasing interest in exploring the capabilities of advanced large language models (LLMs) in the field of information extraction (IE), specifically focusing on tasks related to named entity recognition (NER) and relation extraction (RE).
1 code implementation • 10 Feb 2024 • Yinghao Zhu, Changyu Ren, Shiyun Xie, Shukai Liu, Hangyuan Ji, Zixiang Wang, Tao Sun, Long He, Zhoujun Li, Xi Zhu, Chengwei Pan
Leveraging clinical notes and multivariate time-series EHR, existing models often lack the medical context relevant to clinical tasks, prompting the incorporation of external knowledge, particularly from the knowledge graph (KG).
1 code implementation • 18 Jan 2024 • Xianfu Cheng, Weixiao Zhou, Xiang Li, Jian Yang, Hang Zhang, Tao Sun, Wei Zhang, Yuying Mai, Tongliang Li, Xiaoming Chen, Zhoujun Li
In this work, we propose a VIsion Permutable extractor for fast and efficient Scene Text Recognition (SVIPTR), which achieves an impressive balance between high performance and rapid inference speeds in the domain of STR.
no code implementations • 15 Jan 2024 • Runqiang Zang, Hongcheng Guo, Jian Yang, Jiaheng Liu, Zhoujun Li, Tieqiao Zheng, Xu Shi, Liangfan Zheng, Bo Zhang
In spite of the rapid advancements in unsupervised log anomaly detection techniques, the current mainstream models still necessitate specific training for individual system datasets, resulting in costly procedures and limited scalability due to dataset size, thereby leading to performance bottlenecks.
no code implementations • 13 Jan 2024 • Linzheng Chai, Jian Yang, Tao Sun, Hongcheng Guo, Jiaheng Liu, Bing Wang, Xiannian Liang, Jiaqi Bai, Tongliang Li, Qiyao Peng, Zhoujun Li
To bridge the gap among different languages, we propose a cross-lingual instruction fine-tuning framework (xCOT) to transfer knowledge from high-resource languages to low-resource languages.
1 code implementation • 9 Jan 2024 • Hongcheng Guo, Jian Yang, Jiaheng Liu, Jiaqi Bai, Boyang Wang, Zhoujun Li, Tieqiao Zheng, Bo Zhang, Junran Peng, Qi Tian
Log anomaly detection is a key component in the field of artificial intelligence for IT operations (AIOps).
1 code implementation • 18 Dec 2023 • Bing Wang, Changyu Ren, Jian Yang, Xinnian Liang, Jiaqi Bai, Linzheng Chai, Zhao Yan, Qian-Wen Zhang, Di Yin, Xing Sun, Zhoujun Li
Our framework comprises a core decomposer agent for Text-to-SQL generation with few-shot chain-of-thought reasoning, accompanied by two auxiliary agents that utilize external tools or models to acquire smaller sub-databases and refine erroneous SQL queries.
1 code implementation • 26 Oct 2023 • Hongcheng Guo, Boyang Wang, Jiaqi Bai, Jiaheng Liu, Jian Yang, Zhoujun Li
In other words, the Multimodal Manga Complement (M2C) task has not been investigated, which aims to handle the aforementioned issues by providing a shared semantic space for vision and language understanding.
1 code implementation • 16 Oct 2023 • Weixiao Zhou, Gengyao Li, Xianfu Cheng, Xinnian Liang, Junnan Zhu, FeiFei Zhai, Zhoujun Li
Specifically, we first conduct domain-aware pre-training using large-scale multi-scenario multi-domain dialogue data to enhance the adaptability of our pre-trained model.
1 code implementation • 17 Sep 2023 • Hongcheng Guo, Jian Yang, Jiaheng Liu, Liqun Yang, Linzheng Chai, Jiaqi Bai, Junran Peng, Xiaorong Hu, Chao Chen, Dongfeng Zhang, Xu Shi, Tieqiao Zheng, Liangfan Zheng, Bo Zhang, Ke Xu, Zhoujun Li
However, there is a lack of specialized LLMs for IT operations.
no code implementations • 15 Sep 2023 • Xianjie Wu, Jian Yang, Tongliang Li, Di Liang, Shiwei Zhang, Yiyang Du, Zhoujun Li
To fully Unleash the potential of evidence, we propose a framework to effectively incorporate Evidence in knowledge-Intensive Dialogue Generation (u-EIDG).
no code implementations • 17 Aug 2023 • Ying Mo, Jian Yang, Jiahao Liu, Qifan Wang, Ruoyu Chen, Jingang Wang, Zhoujun Li
A multi-view contrastive learning framework is introduced to encompass semantic contrasts between source, codeswitched, and target sentences, as well as contrasts among token-to-token relations.
2 code implementations • 12 Aug 2023 • Tongliang Li, Zixiang Wang, Linzheng Chai, Jian Yang, Jiaqi Bai, Yuwei Yin, Jiaheng Liu, Hongcheng Guo, Liqun Yang, Hebboul Zine el-abidine, Zhoujun Li
Cross-lingual open information extraction aims to extract structured information from raw text across multiple languages.
1 code implementation • 27 Jun 2023 • Jiaqi Bai, Zhao Yan, Jian Yang, Xinnian Liang, Hongcheng Guo, Zhoujun Li
We propose Knowledgeable Prefix Tuning (KnowPrefix-Tuning), a two-stage tuning framework, bypassing the retrieval process in a knowledge-grounded conversation system by injecting prior knowledge into the lightweight knowledge prefix.
no code implementations • 29 May 2023 • Jiaqi Bai, Hongcheng Guo, Jiaheng Liu, Jian Yang, Xinnian Liang, Zhao Yan, Zhoujun Li
However, the retrieved passages are not ideal for guiding answer generation because of the discrepancy between retrieval and generation, i.e., the candidate passages are all treated equally during the retrieval procedure without considering their potential to generate a proper answer.
no code implementations • 11 May 2023 • Linzheng Chai, Dongling Xiao, Jian Yang, Liqun Yang, Qian-Wen Zhang, Yunbo Cao, Zhoujun Li, Zhao Yan
Context-dependent Text-to-SQL aims to translate multi-turn natural language questions into SQL queries.
1 code implementation • 26 Apr 2023 • Bing Wang, Xinnian Liang, Jian Yang, Hui Huang, Shuangzhi Wu, Peihao Wu, Lu Lu, Zejun Ma, Zhoujun Li
Large Language Models (LLMs) are constrained by their inability to process lengthy inputs, resulting in the loss of critical historical information.
no code implementations • 14 Apr 2023 • Minghao Li, Yingxiu Zhao, Bowen Yu, Feifan Song, Hangyu Li, Haiyang Yu, Zhoujun Li, Fei Huang, Yongbin Li
(2) How can we enhance LLMs' ability to utilize tools?
1 code implementation • 23 Mar 2023 • Xinnian Liang, Shuangzhi Wu, Hui Huang, Jiaqi Bai, Chao Bian, Zhoujun Li
Retrieval augmented methods have shown promising results in various classification tasks.
1 code implementation • 20 Mar 2023 • Xinnian Liang, Zefan Zhou, Hui Huang, Shuangzhi Wu, Tong Xiao, Muyun Yang, Zhoujun Li, Chao Bian
We conduct extensive experiments on various Chinese NLP tasks to evaluate existing PLMs as well as the proposed MigBERT.
no code implementations • 20 Mar 2023 • Ying Mo, Hongyin Tang, Jiahao Liu, Qifan Wang, Zenglin Xu, Jingang Wang, Wei Wu, Zhoujun Li
There are three types of NER tasks, including flat, nested and discontinuous entity recognition.
1 code implementation • 29 Jan 2023 • Xinnian Liang, Shuangzhi Wu, Chenhao Cui, Jiaqi Bai, Chao Bian, Zhoujun Li
The global one aims to identify vital sub-topics in the dialogue and the local one aims to select the most important context in each sub-topic.
no code implementations • 17 Jan 2023 • Jian Yang, Yuwei Yin, Shuming Ma, Liqun Yang, Hongcheng Guo, Haoyang Huang, Dongdong Zhang, Yutao Zeng, Zhoujun Li, Furu Wei
Context-aware neural machine translation aims to use the document-level context to improve translation quality.
no code implementations • 11 Jan 2023 • Zixiang Wang, Jian Yang, Tongliang Li, Jiaheng Liu, Ying Mo, Jiaqi Bai, Longtao He, Zhoujun Li
In this paper, we propose a two-stage multilingual training method and a joint model called Multilingual Entity and Relation Extraction framework (mERE) to mitigate language interference across languages.
1 code implementation • 20 Dec 2022 • Jian Yang, Shuming Ma, Li Dong, Shaohan Huang, Haoyang Huang, Yuwei Yin, Dongdong Zhang, Liqun Yang, Furu Wei, Zhoujun Li
Inspired by the idea of Generative Adversarial Networks (GANs), we propose a GAN-style model for encoder-decoder pre-training by introducing an auxiliary discriminator, unifying the ability of language understanding and generation in a single model.
1 code implementation • ACL 2022 • Xinyu Pi, Bing Wang, Yan Gao, Jiaqi Guo, Zhoujun Li, Jian-Guang Lou
The robustness of Text-to-SQL parsers against adversarial perturbations plays a crucial role in delivering highly reliable applications.
1 code implementation • 17 Dec 2022 • Bing Wang, Yan Gao, Zhoujun Li, Jian-Guang Lou
Following this study, we propose a simple yet effective counterfactual example generation approach that automatically produces ambiguous and unanswerable text-to-SQL examples.
no code implementations • 22 Oct 2022 • Yupeng Zhang, Hongzhi Zhang, Sirui Wang, Wei Wu, Zhoujun Li
A wide range of NLP tasks benefit from the fine-tuning of pretrained language models (PLMs).
no code implementations • 19 Oct 2022 • Hongcheng Guo, Jiaheng Liu, Haoyang Huang, Jian Yang, Zhoujun Li, Dongdong Zhang, Zheng Cui, Furu Wei
To this end, we first propose the Multilingual MMT task by establishing two new Multilingual MMT benchmark datasets covering seven languages.
1 code implementation • 13 Oct 2022 • Jian Yang, Shaohan Huang, Shuming Ma, Yuwei Yin, Li Dong, Dongdong Zhang, Hongcheng Guo, Zhoujun Li, Furu Wei
Specifically, the target sequence is first translated into the source language and then tagged by a source NER model.
no code implementations • 24 Aug 2022 • Chenhao Cui, Xinnian Liang, Shuangzhi Wu, Zhoujun Li
The core of ViL-Sum is a joint multi-modal encoder with two well-designed tasks, image reordering and image selection.
no code implementations • 23 Aug 2022 • Hongcheng Guo, Yuhui Guo, Renjie Chen, Jian Yang, Jiaheng Liu, Zhoujun Li, Tieqiao Zheng, Weichao Hou, Liangfan Zheng, Bo Zhang
Experiments on five benchmarks validate the effectiveness of LogLG for detecting anomalies on unlabeled log data and demonstrate that LogLG, as the state-of-the-art weakly supervised method, achieves significant performance improvements compared to existing methods.
1 code implementation • COLING 2022 • Xinnian Liang, Jing Li, Shuangzhi Wu, Jiali Zeng, Yufan Jiang, Mu Li, Zhoujun Li
To tackle this problem, in this paper, we propose an efficient Coarse-to-Fine Facet-Aware Ranking (C2F-FAR) framework for unsupervised long document summarization, which is based on the semantic block.
1 code implementation • 29 Jul 2022 • Jian Yang, Yuwei Yin, Liqun Yang, Shuming Ma, Haoyang Huang, Dongdong Zhang, Furu Wei, Zhoujun Li
The Transformer architecture, built from a stack of encoder and decoder network layers, has driven significant progress in neural machine translation.
1 code implementation • 11 Jul 2022 • Jian Yang, Yuwei Yin, Shuming Ma, Dongdong Zhang, Zhoujun Li, Furu Wei
Nonetheless, multilingual training is plagued by language interference degeneration in shared parameters because of the negative interference among different translation directions, especially on high-resource languages.
1 code implementation • 11 Jul 2022 • Jian Yang, Yuwei Yin, Shuming Ma, Dongdong Zhang, Shuangzhi Wu, Hongcheng Guo, Zhoujun Li, Furu Wei
Most translation tasks among languages belong to the zero-resource translation problem where parallel corpora are unavailable.
no code implementations • 16 May 2022 • Dongling Xiao, Linzheng Chai, Qian-Wen Zhang, Zhao Yan, Zhoujun Li, Yunbo Cao
Context-dependent text-to-SQL is the task of translating multi-turn questions into database-related SQL queries.
no code implementations • Findings (NAACL) 2022 • Ze Yang, Liran Wang, Zhoujin Tian, Wei Wu, Zhoujun Li
Another is that applying the existing pre-trained models to this task is tricky because of the structural dependence within the conversation and its informal expression, etc.
1 code implementation • NAACL 2022 • Xinnian Liang, Shuangzhi Wu, Mu Li, Zhoujun Li
In this paper, we propose a novel method to extract multi-granularity features based solely on the original input sentences.
no code implementations • 30 Mar 2022 • Hao Chen, Zhong Huang, Yue Xu, Zengde Deng, Feiran Huang, Peng He, Zhoujun Li
The experimental results verify that our proposed NEGCN framework can significantly enhance the performance for various typical GCN models on both node classification and recommendation tasks.
no code implementations • COLING 2022 • Juncheng Wan, Jian Yang, Shuming Ma, Dongdong Zhang, Weinan Zhang, Yong Yu, Zhoujun Li
While end-to-end neural machine translation (NMT) has achieved impressive progress, noisy input usually leads models to become fragile and unstable.
no code implementations • 31 Dec 2021 • Hongcheng Guo, Xingyu Lin, Jian Yang, Yi Zhuang, Jiaqi Bai, Tieqiao Zheng, Bo Zhang, Zhoujun Li
Therefore, we propose a unified Transformer-based framework for log anomaly detection, which comprises a pretraining stage and an adapter-based tuning stage.
no code implementations • EMNLP 2021 • Jiaqi Bai, Long Zhou, Ambrosio Blanco, Shujie Liu, Furu Wei, Ming Zhou, Zhoujun Li
We propose a novel task of jointly repairing program codes and generating commit messages.
6 code implementations • 21 Sep 2021 • Minghao Li, Tengchao Lv, Jingye Chen, Lei Cui, Yijuan Lu, Dinei Florencio, Cha Zhang, Zhoujun Li, Furu Wei
Text recognition is a long-standing research problem for document digitalization.
Ranked #1 on Handwritten Text Recognition on IAM(line-level) (using extra training data)
1 code implementation • EMNLP 2021 • Xinnian Liang, Shuangzhi Wu, Mu Li, Zhoujun Li
In terms of the local view, we first build a graph structure based on the document where phrases are regarded as vertices and the edges are similarities between vertices.
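As a rough illustration of the local-view graph described in the entry above (not the authors' implementation), the sketch below treats phrases as vertices and stores a similarity score on each edge. The toy vectors, the choice of cosine similarity, and the names `cosine` and `build_phrase_graph` are assumptions made for this example only.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_phrase_graph(phrase_vectors):
    """Return {(phrase_a, phrase_b): similarity} for every unordered phrase pair.

    Phrases are the vertices; each dictionary entry is one weighted edge.
    Keys are ordered alphabetically so each pair appears exactly once.
    """
    graph = {}
    for a, b in combinations(sorted(phrase_vectors), 2):
        graph[(a, b)] = cosine(phrase_vectors[a], phrase_vectors[b])
    return graph


# Hypothetical 2-d phrase embeddings, chosen by hand for illustration.
vectors = {
    "neural network": [1.0, 0.0],
    "deep learning": [1.0, 0.0],
    "stock market": [0.0, 1.0],
}
graph = build_phrase_graph(vectors)
```

In this toy setting the two related phrases share an edge of weight 1.0, while both are connected to the unrelated phrase with weight 0.0.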
no code implementations • 5 Sep 2021 • Tong Sha, Wei Zhang, Tong Shen, Zhoujun Li, Tao Mei
Deep person generation has attracted extensive research attention due to its wide applications in virtual agents, video conferencing, online shopping and art/movie production.
no code implementations • ACL 2021 • Bo Zhang, XiaoMing Zhang, Yun Liu, Lei Cheng, Zhoujun Li
Unsupervised Domain Adaptation (UDA) aims to transfer the knowledge of the source domain to the unlabeled target domain.
no code implementations • ACL 2021 • Jian Yang, Yuwei Yin, Shuming Ma, Haoyang Huang, Dongdong Zhang, Zhoujun Li, Furu Wei
Although multilingual neural machine translation (MNMT) enables multiple language translations, the training process is based on independent multilingual objectives.
no code implementations • NAACL 2021 • Jian Yang, Shuming Ma, Dongdong Zhang, Juncheng Wan, Zhoujun Li, Ming Zhou
Most current neural machine translation models adopt a monotonic decoding order of either left-to-right or right-to-left.
no code implementations • 9 May 2021 • Hao Chen, Zengde Deng, Yue Xu, Zhoujun Li
In this way, each node can be directly represented by concatenating the information extracted independently from each hop of its neighbors thereby avoiding the recursive neighborhood expansion across layers.
no code implementations • 23 Dec 2020 • Daisheng Jin, Xiao Ma, Chongzhi Zhang, Yizhuo Zhou, Jiashu Tao, Mingyuan Zhang, Haiyu Zhao, Shuai Yi, Zhoujun Li, Xianglong Liu, Hongsheng Li
We observe that during training, the relationship proposal distribution is highly imbalanced: most of the negative relationship proposals are easy to identify, e.g., the inaccurate object detection, which leads to the under-fitting of low-frequency difficult proposals.
1 code implementation • COLING 2020 • Yunli Wang, Yu Wu, Lili Mou, Zhoujun Li, WenHan Chao
Conventional approaches for formality style transfer borrow models from neural machine translation, which typically requires massive parallel data for training.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Ze Yang, Wei Wu, Can Xu, Xinnian Liang, Jiaqi Bai, Liran Wang, Wei Wang, Zhoujun Li
Generating responses following a desired style has great potential to extend applications of open-domain dialogue systems, yet is hindered by the lack of parallel data for training.
1 code implementation • IJCAI 2020 • Tianyang Zhao, Zhao Yan, Yunbo Cao, Zhoujun Li
Then, we propose to predict a subset of potential relations and filter out irrelevant ones to generate questions effectively.
Ranked #1 on Relation Extraction on ACE 2005 (Sentence Encoder metric)
no code implementations • ACL 2020 • Jian Yang, Shuming Ma, Dong-dong Zhang, Zhoujun Li, Ming Zhou
Although neural machine translation (NMT) has achieved significant progress in recent years, most previous NMT models only depend on the source text to generate translation.
2 code implementations • COLING 2020 • Minghao Li, Yiheng Xu, Lei Cui, Shaohan Huang, Furu Wei, Zhoujun Li, Ming Zhou
DocBank is constructed in a simple yet effective way with weak supervision from the LaTeX documents available on arXiv.com.
1 code implementation • LREC 2020 • Minghao Li, Lei Cui, Shaohan Huang, Furu Wei, Ming Zhou, Zhoujun Li
We present TableBank, a new image-based table detection and recognition dataset built with novel weak supervision from Word and LaTeX documents on the internet.
no code implementations • 4 Apr 2020 • Ze Yang, Wei Wu, Huang Hu, Can Xu, Wei Wang, Zhoujun Li
Thus, we propose learning a response generation model with both image-grounded dialogues and textual dialogues by assuming that the visual scene information at the time of a conversation can be represented by an image, and trying to recover the latent images of the textual dialogues through text-to-image generation techniques.
no code implementations • IJCNLP 2019 • Haoyan Liu, Lei Fang, Qian Liu, Bei Chen, Jian-Guang Lou, Zhoujun Li
One key component in text-to-SQL is to predict the comparison relations between columns and their values.
no code implementations • IJCNLP 2019 • Yunli Wang, Yu Wu, Lili Mou, Zhoujun Li, WenHan Chao
Formality text style transfer plays an important role in various NLP applications, such as non-native speaker assistants and child education.
no code implementations • EMNLP 2019 • Ze Yang, Can Xu, Wei Wu, Zhoujun Li
Automatic news comment generation is a new testbed for techniques of natural language generation.
1 code implementation • IJCNLP 2019 • Ze Yang, Wei Wu, Jian Yang, Can Xu, Zhoujun Li
Since the paired data now is no longer enough to train a neural generation model, we consider leveraging the large scale of unpaired data that are much easier to obtain, and propose response generation with both paired and unpaired data.
no code implementations • IJCNLP 2019 • Ze Yang, Can Xu, Wei Wu, Zhoujun Li
Automatic news comment generation is a new testbed for techniques of natural language generation.
no code implementations • 10 Jul 2019 • Hao Chen, Yue Xu, Feiran Huang, Zengde Deng, Wenbing Huang, Senzhang Wang, Peng He, Zhoujun Li
In this paper, we consider the problem of node classification and propose the Label-Aware Graph Convolutional Network (LAGCN) framework which can directly identify valuable neighbors to enhance the performance of existing GCN models.
no code implementations • 7 Mar 2019 • Chaozhuo Li, Senzhang Wang, Philip S. Yu, Zhoujun Li
Specifically, we propose a MCNE model to learn compact embeddings from pre-learned node features.
no code implementations • EMNLP 2018 • Jun Chen, Xiao-Ming Zhang, Yu Wu, Zhao Yan, Zhoujun Li
In this paper, we study automatic keyphrase generation.
3 code implementations • 19 Jun 2018 • Yu Wu, Furu Wei, Shaohan Huang, Yunli Wang, Zhoujun Li, Ming Zhou
Open domain response generation has achieved remarkable progress in recent years, but sometimes yields short and uninformative responses.
2 code implementations • 3 Jun 2018 • Yang Yang, Lei Zheng, Jiawei Zhang, Qingcai Cui, Zhoujun Li, Philip S. Yu
By projecting the explicit and latent features into a unified feature space, TI-CNN is trained with both the text and image information simultaneously.
no code implementations • 28 May 2018 • Yang Yang, Haoyan Liu, Xia Hu, Jiawei Zhang, Xiao-Ming Zhang, Zhoujun Li, Philip S. Yu
The number of missing people (i.e., people who get lost) has greatly increased in recent years.
no code implementations • ACL 2018 • Yu Wu, Wei Wu, Zhoujun Li, Ming Zhou
We propose a method that can leverage unlabeled data to learn a matching model for response selection in retrieval-based chatbots.
no code implementations • 23 Jan 2018 • Zhao Yan, Duyu Tang, Nan Duan, Shujie Liu, Wendi Wang, Daxin Jiang, Ming Zhou, Zhoujun Li
We present assertion based question answering (ABQA), an open domain question answering task that takes a question and a passage as inputs, and outputs a semi-structured assertion consisting of a subject, a predicate and a list of arguments.
no code implementations • ICLR 2018 • Wei Wu, Can Xu, Yu Wu, Zhoujun Li
Conventional methods model open domain dialogue generation as a black box through end-to-end learning from large scale conversation data.
no code implementations • WS 2017 • Shuihua Li, Xiao-Ming Zhang, Zhoujun Li
The POS tree is useful for candidate answer extraction in web-based question answering.
no code implementations • 30 Nov 2017 • Yu Wu, Wei Wu, Dejian Yang, Can Xu, Zhoujun Li, Ming Zhou
We study response generation for open domain conversation in chatbots.
no code implementations • CL 2019 • Yu Wu, Wei Wu, Chen Xing, Can Xu, Zhoujun Li, Ming Zhou
The task requires matching a response candidate with a conversation context, whose challenges include how to recognize important parts of the context, and how to model the relationships among utterances in the context.
no code implementations • 18 Oct 2017 • Feiran Huang, Xiao-Ming Zhang, Zhoujun Li, Tao Mei, Yueying He, Zhonghua Zhao
Extensive experiments are conducted to investigate the effectiveness of our approach in the applications of multi-label classification and cross-modal search.
no code implementations • SEMEVAL 2017 • Wenzheng Feng, Yu Wu, Wei Wu, Zhoujun Li, Ming Zhou
This paper presents a system for SemEval-2017 Task 3, Community Question Answering (CQA).
no code implementations • 8 Jun 2017 • Zhao Yan, Duyu Tang, Nan Duan, Junwei Bao, Yuanhua Lv, Ming Zhou, Zhoujun Li
Understanding the connections between unstructured text and semi-structured tables is an important yet neglected problem in natural language processing.
1 code implementation • ACL 2017 • Hai Ye, WenHan Chao, Zhunchen Luo, Zhoujun Li
Exploiting class ties between relations of one entity tuple will be promising for distantly supervised relation extraction.
3 code implementations • ACL 2017 • Yu Wu, Wei Wu, Chen Xing, Ming Zhou, Zhoujun Li
Existing work either concatenates utterances in context or matches a response with a highly abstract context vector finally, which may lose relationships among utterances or important contextual information.
Ranked #7 on Conversational Response Selection on RRS
no code implementations • 15 Nov 2016 • Yu Wu, Wei Wu, Zhoujun Li, Ming Zhou
Long texts pose a big challenge to semantic matching due to their complicated semantic and syntactic structures.
no code implementations • COLING 2016 • Chaozhuo Li, Yu Wu, Wei Wu, Chen Xing, Zhoujun Li, Ming Zhou
While automatic response generation for building chatbot systems has drawn a lot of attention recently, there is limited understanding of when we need to consider the linguistic context of an input text in the generation process.
1 code implementation • 30 Apr 2016 • Yu Wu, Wei Wu, Zhoujun Li, Ming Zhou
The message vector, the response vector, and the two topic vectors are fed to neural tensors to calculate a matching score.
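To make the "neural tensor" step in the entry above more concrete, here is a minimal sketch of a generic bilinear tensor layer followed by a linear scoring step. It is not the paper's architecture: the entry does not say how the four vectors are paired, so the example scores only a message and response pair, and the slice matrices, weights, and function names are all illustrative assumptions.

```python
from math import tanh


def neural_tensor_layer(a, b, slices):
    """Return [tanh(a^T M_k b)] for each slice matrix M_k in `slices`.

    Each slice captures one bilinear interaction between the two vectors.
    """
    out = []
    for m in slices:
        bilinear = sum(
            a[i] * m[i][j] * b[j]
            for i in range(len(a))
            for j in range(len(b))
        )
        out.append(tanh(bilinear))
    return out


def matching_score(tensor_features, weights, bias=0.0):
    """Collapse the tensor-layer outputs into a single score (linear layer)."""
    return sum(w * f for w, f in zip(weights, tensor_features)) + bias


# Hand-picked toy inputs; a trained model would learn `slices` and `weights`.
message = [1.0, 0.0]
response = [0.0, 1.0]
slices = [
    [[0.0, 1.0], [0.0, 0.0]],  # fires when message[0] and response[1] co-occur
    [[0.0, 0.0], [0.0, 0.0]],  # inactive slice
]
tensor_features = neural_tensor_layer(message, response, slices)
score = matching_score(tensor_features, weights=[1.0, 1.0])
```

The same `neural_tensor_layer` could be applied to other vector pairs (for example, a message vector and a topic vector) and the resulting features concatenated before scoring; which pairs are used is left open here.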