no code implementations • CCL 2022 • Xiaoxu Zhang, Zhiqiang Ma, Zhiqiang Liu, Caijilahu Bao
Transformer models in Mongolian speech recognition fail to learn the correspondence between speech and Mongolian words that contain control characters, leaving the model poorly adapted to Mongolian. We propose a Mongolian word encoding method for the Transformer that mixes Mongolian letter features with word features; by incorporating letter information, the Transformer can distinguish Mongolian words containing control characters and learn the correspondence between Mongolian words and speech. On the IMUT-MC dataset, we build a Transformer model and run ablation and comparison experiments on the word-feature encoding method. The ablation results show that the method reduces HWER, WER, and SER by 23.4%, 6.9%, and 2.6%, respectively; the comparison results show that it outperforms all other methods, reaching 11.8% HWER and 19.8% WER.
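As a rough illustration of the mixed letter/word encoding the abstract describes, here is a minimal PyTorch sketch (all names and the pooling choice are ours, not the paper's implementation): pooled letter embeddings are concatenated with the word embedding, so two words that differ only in control characters receive distinct encodings.

```python
import torch
import torch.nn as nn

class HybridWordEncoder(nn.Module):
    """Sketch: mix letter (character) features with word features so that
    words differing only in control characters get distinct encodings."""
    def __init__(self, n_letters, n_words, dim):
        super().__init__()
        self.letter_emb = nn.Embedding(n_letters, dim)
        self.word_emb = nn.Embedding(n_words, dim)
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, word_ids, letter_ids):
        # word_ids: (batch, seq); letter_ids: (batch, seq, max_letters)
        w = self.word_emb(word_ids)                  # (batch, seq, dim)
        l = self.letter_emb(letter_ids).mean(dim=2)  # pool letters per word
        return self.proj(torch.cat([w, l], dim=-1))  # mixed word encoding
```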
no code implementations • CCL 2022 • Fangyuan Zhu, Zhiqiang Ma, Zhiqiang Liu, Caijilahu Bao, Hongbin Wang
Speaker features produced by speaker feature extraction models are poorly discriminative, so a Mongolian acoustic model cannot learn discriminative information and fails to adapt to different speakers. We propose an attention-based speaker adaptation method that introduces a Neural Turing Machine for adaptation: a memory module stores speaker features, an attention mechanism computes a similarity weight matrix between the stored features and the current utterance's speaker features, and the weight matrix recombines them into a new speaker feature, the s-vector, improving discriminability between speaker features. On the IMUT-MCT dataset, we run ablation experiments on speaker feature extraction, model adaptation experiments, and a case study. Comparing the s-vector with i-vector and d-vector speaker features, the s-vector lowers SER and WER by 4.96% and 1.08%, respectively; across different Mongolian acoustic models, the proposed method consistently outperforms the baseline.
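A minimal sketch of the attention-over-memory recombination described above, under our own assumptions (dot-product attention, a fixed memory matrix; the paper's Neural Turing Machine read mechanism may differ):

```python
import torch
import torch.nn.functional as F

def s_vector(query, memory):
    """Sketch: attention over a memory of stored speaker features.
    query:  (dim,) speaker feature of the current utterance
    memory: (n_speakers, dim) speaker features held in the memory module
    Returns a recombined speaker feature (the "s-vector")."""
    # Similarity weights between memory entries and the current speaker.
    weights = F.softmax(memory @ query / memory.shape[-1] ** 0.5, dim=0)
    # Weighted recombination of the stored features.
    return weights @ memory
```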
no code implementations • 27 Mar 2025 • Xianzhi Li, Ethan Callanan, Xiaodan Zhu, Mathieu Sibue, Antony Papadimitriou, Mahmoud Mahfouz, Zhiqiang Ma, Xiaomo Liu
While Large Language Models (LLMs) are effectively aligned through extensive pre-training and fine-tuning, they still struggle with varying levels of uncertainty during token generation.
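To make the notion of token-level uncertainty concrete, a small sketch computing per-token predictive entropy from logits; the function and setup are ours, not the paper's method.

```python
import torch

def token_entropy(logits):
    """Sketch: per-token predictive entropy as an uncertainty signal.
    logits: (seq_len, vocab_size) raw model outputs per generated token."""
    probs = torch.softmax(logits, dim=-1)
    # High entropy = the model spreads mass over many tokens (uncertain).
    return -(probs * torch.log(probs.clamp_min(1e-12))).sum(dim=-1)
```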
no code implementations • 20 Oct 2024 • Ran Zmigrod, Pranav Shetty, Mathieu Sibue, Zhiqiang Ma, Armineh Nourbakhsh, Xiaomo Liu, Manuela Veloso
In this work, we present K2Q, a diverse collection of five datasets converted from KIE to a prompt-response format using a plethora of bespoke templates.
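A hedged sketch of what a KIE-to-prompt-response conversion might look like; the template below is purely illustrative and is not one of the bespoke K2Q templates.

```python
def kie_to_prompt_response(document_text, key, value, template=None):
    """Sketch: recast a key-information-extraction record (key, value)
    as a prompt-response pair for a generative model."""
    template = template or "What is the {key} in this document?\n{doc}"
    prompt = template.format(key=key, doc=document_text)
    return {"prompt": prompt, "response": value}

# Example: kie_to_prompt_response("Invoice total: $42", "total", "$42")
```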
no code implementations • 3 Oct 2024 • Xianzhi Li, Ran Zmigrod, Zhiqiang Ma, Xiaomo Liu, Xiaodan Zhu
Language models are capable of memorizing detailed patterns and information, leading to a double-edged effect: they achieve impressive modeling performance on downstream tasks with the stored knowledge but also raise significant privacy concerns.
no code implementations • 5 Apr 2024 • Ran Zmigrod, Dongsheng Wang, Mathieu Sibue, Yulong Pei, Petr Babkin, Ivan Brugere, Xiaomo Liu, Nacho Navarro, Antony Papadimitriou, William Watson, Zhiqiang Ma, Armineh Nourbakhsh, Sameena Shah
Several datasets exist for research on specific VRDU tasks such as document classification (DC), key entity extraction (KEE), entity linking, and visual question answering (VQA), among others.
no code implementations • 7 Feb 2024 • Ran Zmigrod, Zhiqiang Ma, Armineh Nourbakhsh, Sameena Shah
Visually Rich Form Understanding (VRFU) poses a complex research problem due to the documents' highly structured nature and yet highly variable style and content.
no code implementations • 5 Jan 2024 • Dongsheng Wang, Zhiqiang Ma, Armineh Nourbakhsh, Kang Gu, Sameena Shah
Advances in Visually Rich Document Understanding (VrDU) have enabled information extraction and question answering over documents with complex layouts.
no code implementations • 31 Dec 2023 • Dongsheng Wang, Natraj Raman, Mathieu Sibue, Zhiqiang Ma, Petr Babkin, Simerjot Kaur, Yulong Pei, Armineh Nourbakhsh, Xiaomo Liu
Enterprise documents such as forms, invoices, receipts, reports, contracts, and other similar records, often carry rich semantics at the intersection of textual and spatial modalities.
no code implementations • 12 Oct 2023 • Ethan Callanan, Amarachi Mbakwe, Antony Papadimitriou, Yulong Pei, Mathieu Sibue, Xiaodan Zhu, Zhiqiang Ma, Xiaomo Liu, Sameena Shah
Large Language Models (LLMs) have demonstrated remarkable performance on a wide range of Natural Language Processing (NLP) tasks, often matching or even beating state-of-the-art task-specific models.
no code implementations • 27 Jun 2023 • Haitao Tang, Yu Fu, Lei Sun, Jiabin Xue, Dan Liu, Yongchao Li, Zhiqiang Ma, Minghui Wu, Jia Pan, Genshun Wan, Ming'en Zhao
In this paper, we propose an adaptive two-stage knowledge distillation method consisting of hidden layer learning and output layer learning.
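A sketch of what the two stages might optimize, under standard distillation assumptions (MSE on hidden states, temperature-scaled KL on outputs); these are conventional loss choices, not necessarily the paper's exact objectives.

```python
import torch.nn.functional as F

def hidden_layer_loss(student_h, teacher_h):
    # Stage 1: match intermediate representations (MSE on hidden states).
    return F.mse_loss(student_h, teacher_h)

def output_layer_loss(student_logits, teacher_logits, T=2.0):
    # Stage 2: match output distributions (KL divergence at temperature T).
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T * T
```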
1 code implementation • 5 Jun 2023 • Lin-Chi Wu, Zengjie Zhang, Sofie Haesaert, Zhiqiang Ma, Zhiyong Sun
Reinforcement learning (RL) is an effective approach to motion planning in autonomous driving, where an optimal driving policy can be learned automatically from interaction data with the environment.
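For readers unfamiliar with the setup, a sketch of the generic RL interaction loop the entry refers to; the environment and agent here are placeholders, not the paper's driving setup.

```python
def train(env, agent, episodes=1000):
    """Sketch: learn a policy from interaction data with the environment."""
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            action = agent.act(state)                    # current policy
            next_state, reward, done = env.step(action)  # interaction data
            agent.update(state, action, reward, next_state)
            state = next_state
```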
no code implementations • 10 May 2023 • Xianzhi Li, Samuel Chan, Xiaodan Zhu, Yulong Pei, Zhiqiang Ma, Xiaomo Liu, Sameena Shah
The most recent large language models (LLMs) such as ChatGPT and GPT-4 have shown exceptional capabilities of generalist models, achieving state-of-the-art performance on a wide range of NLP tasks with little or no adaptation.
Ranked #1 on Question Answering on ConvFinQA
1 code implementation • 7 Oct 2022 • Zhiyu Chen, Shiyang Li, Charese Smiley, Zhiqiang Ma, Sameena Shah, William Yang Wang
With the recent advance in large pre-trained language models, researchers have achieved record performances in NLP tasks that mostly focus on language pattern matching.
Ranked #2 on Question Answering on ConvFinQA
no code implementations • 23 Aug 2020 • Zhiqiang Ma, Grace Bang, Chong Wang, Xiaomo Liu
Earnings calls are hosted by management of public companies to discuss the company's financial performance with analysts and investors.
no code implementations • 17 May 2020 • Zhiqiang Ma, Steven Pomerville, Mingyang Di, Armineh Nourbakhsh
In this paper we present SPot, an automated tool for detecting operating segments and their related performance indicators from earnings reports.
no code implementations • 25 Aug 2019 • Azadeh Nematzadeh, Grace Bang, Xiaomo Liu, Zhiqiang Ma
Companies and financial investors are paying increasing attention to social consciousness in developing their corporate strategies and making investment decisions to support a sustainable economy for the future.
no code implementations • IWSLT (EMNLP) 2018 • Dan Liu, Junhua Liu, Wu Guo, Shifu Xiong, Zhiqiang Ma, Rui Song, Chongliang Wu, Quan Liu
This paper describes the USTC-NEL system to the speech translation task of the IWSLT Evaluation 2018.