Search Results for author: Zhiqiang Ma

Found 15 papers, 2 papers with code

Attention-based Mongolian Speaker Feature Extraction

no code implementations CCL 2022 Fangyuan Zhu, Zhiqiang Ma, Zhiqiang Liu, Caijilahu Bao, Hongbin Wang

The speaker features produced by existing speaker feature extraction models have low discriminability, so the Mongolian acoustic model cannot learn discriminative information and fails to adapt to different speakers. We propose an attention-based speaker adaptation method that introduces a Neural Turing Machine for adaptation: a memory module stores speaker features, an attention mechanism computes a similarity weight matrix between the speaker features in memory and those of the current utterance, and the weight matrix is used to recombine them into a new speaker feature, the s-vector, which improves the discriminability between speaker features. On the IMUT-MCT dataset, we conduct ablation experiments on the speaker feature extraction method, model adaptation experiments, and case analysis. The results show that, compared with the i-vector and d-vector speaker features, the s-vector lowers SER and WER by 4.96% and 1.08%, respectively; across different Mongolian acoustic models, the proposed method consistently improves over the baselines.
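A minimal sketch of the attention-based recombination described in the abstract, assuming a memory bank of stored speaker features and dot-product attention against the current utterance's feature; the shapes, names, and scoring function are illustrative assumptions, not the authors' implementation:

```python
# Hypothetical sketch: recombine a memory of speaker features into an
# "s-vector" using attention weights against the current utterance's feature.
import numpy as np

def s_vector(memory: np.ndarray, current: np.ndarray) -> np.ndarray:
    """memory: (N, D) stored speaker features; current: (D,) feature of the
    current utterance. Returns a (D,) recombined speaker feature."""
    scores = memory @ current / np.sqrt(memory.shape[1])  # similarity scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                              # softmax over memory slots
    return weights @ memory                               # weighted recombination

# Example: 8 stored speaker features of dimension 64
mem = np.random.randn(8, 64)
cur = np.random.randn(64)
print(s_vector(mem, cur).shape)  # (64,)
```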

Word Feature Encoding Method for Transformer-based Mongolian Speech Recognition

no code implementations CCL 2022 Xiaoxu Zhang, Zhiqiang Ma, Zhiqiang Liu, Caijilahu Bao

In Mongolian speech recognition, the Transformer model cannot learn the correspondence between speech and Mongolian words that contain control characters, which leaves the model poorly adapted to Mongolian. We propose a Mongolian word encoding method for the Transformer model that mixes Mongolian letter features with word features; by incorporating letter information, the Transformer can distinguish Mongolian words containing control characters and learn the correspondence between Mongolian words and speech. On the IMUT-MC dataset, we build a Transformer model and run ablation and comparison experiments on the word feature encoding method. The ablation results show that the word feature encoding method reduces HWER, WER, and SER by 23.4%, 6.9%, and 2.6%, respectively; the comparison results show that it outperforms all other methods, reaching 11.8% HWER and 19.8% WER.

Speech Recognition
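A minimal sketch of mixing letter-level and word-level features into a single Transformer input embedding, as the abstract above describes; the vocabulary sizes, dimensions, mean-pooling of letters, and projection layer are all assumptions for illustration:

```python
# Hypothetical sketch: combine word embeddings with pooled letter embeddings
# to form the input representation fed to a Transformer encoder.
import torch
import torch.nn as nn

class MixedWordEncoder(nn.Module):
    def __init__(self, n_words=5000, n_letters=40, dim=256):
        super().__init__()
        self.word_emb = nn.Embedding(n_words, dim)
        self.letter_emb = nn.Embedding(n_letters, dim, padding_idx=0)
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, word_ids, letter_ids):
        # word_ids: (batch, seq); letter_ids: (batch, seq, max_letters)
        w = self.word_emb(word_ids)                  # word-level features
        l = self.letter_emb(letter_ids).mean(dim=2)  # pooled letter-level features
        return self.proj(torch.cat([w, l], dim=-1))  # mixed encoding

enc = MixedWordEncoder()
out = enc(torch.zeros(2, 7, dtype=torch.long), torch.zeros(2, 7, 12, dtype=torch.long))
print(out.shape)  # torch.Size([2, 7, 256])
```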

BuDDIE: A Business Document Dataset for Multi-task Information Extraction

no code implementations5 Apr 2024 Ran Zmigrod, Dongsheng Wang, Mathieu Sibue, Yulong Pei, Petr Babkin, Ivan Brugere, Xiaomo Liu, Nacho Navarro, Antony Papadimitriou, William Watson, Zhiqiang Ma, Armineh Nourbakhsh, Sameena Shah

Several datasets exist for research on specific tasks of VRDU such as document classification (DC), key entity extraction (KEE), entity linking, visual question answering (VQA), inter alia.

Document Classification document understanding +5

TreeForm: End-to-end Annotation and Evaluation for Form Document Parsing

no code implementations7 Feb 2024 Ran Zmigrod, Zhiqiang Ma, Armineh Nourbakhsh, Sameena Shah

Visually Rich Form Understanding (VRFU) poses a complex research problem due to the documents' highly structured nature and yet highly variable style and content.

DocGraphLM: Documental Graph Language Model for Information Extraction

no code implementations5 Jan 2024 Dongsheng Wang, Zhiqiang Ma, Armineh Nourbakhsh, Kang Gu, Sameena Shah

Advances in Visually Rich Document Understanding (VrDU) have enabled information extraction and question answering over documents with complex layouts.

document understanding Language Modelling +2

DocLLM: A layout-aware generative language model for multimodal document understanding

no code implementations31 Dec 2023 Dongsheng Wang, Natraj Raman, Mathieu Sibue, Zhiqiang Ma, Petr Babkin, Simerjot Kaur, Yulong Pei, Armineh Nourbakhsh, Xiaomo Liu

Enterprise documents such as forms, invoices, receipts, reports, contracts, and other similar records, often carry rich semantics at the intersection of textual and spatial modalities.

document understanding Language Modelling

Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on mock CFA Exams

no code implementations12 Oct 2023 Ethan Callanan, Amarachi Mbakwe, Antony Papadimitriou, Yulong Pei, Mathieu Sibue, Xiaodan Zhu, Zhiqiang Ma, Xiaomo Liu, Sameena Shah

Large Language Models (LLMs) have demonstrated remarkable performance on a wide range of Natural Language Processing (NLP) tasks, often matching or even beating state-of-the-art task-specific models.

Risk-Aware Reward Shaping of Reinforcement Learning Agents for Autonomous Driving

1 code implementation5 Jun 2023 Lin-Chi Wu, Zengjie Zhang, Sofie Haesaert, Zhiqiang Ma, Zhiyong Sun

Reinforcement learning (RL) is an effective approach to motion planning in autonomous driving, where an optimal driving policy can be automatically learned using the interaction data with the environment.

Autonomous Driving Motion Planning +3
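The listing above notes a code implementation for this paper; as a generic illustration of reward shaping in this setting (not the paper's actual scheme), one could wrap a gymnasium-style driving environment and subtract a risk penalty from the raw reward. The info key and weight below are hypothetical:

```python
# Hypothetical sketch: subtract a risk term from the environment reward
# so the agent is penalized for risky states during training.
import gymnasium as gym

class RiskShapedReward(gym.Wrapper):
    def __init__(self, env, risk_weight=0.5):
        super().__init__(env)
        self.risk_weight = risk_weight

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        risk = info.get("min_distance_risk", 0.0)  # assumed to be exposed by the env
        shaped = reward - self.risk_weight * risk
        return obs, shaped, terminated, truncated, info
```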

Are ChatGPT and GPT-4 General-Purpose Solvers for Financial Text Analytics? A Study on Several Typical Tasks

no code implementations10 May 2023 Xianzhi Li, Samuel Chan, Xiaodan Zhu, Yulong Pei, Zhiqiang Ma, Xiaomo Liu, Sameena Shah

The most recent large language models (LLMs) such as ChatGPT and GPT-4 have shown exceptional capabilities as generalist models, achieving state-of-the-art performance on a wide range of NLP tasks with little or no adaptation.

Binary Classification named-entity-recognition +5

ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering

1 code implementation7 Oct 2022 Zhiyu Chen, Shiyang Li, Charese Smiley, Zhiqiang Ma, Sameena Shah, William Yang Wang

With recent advances in large pre-trained language models, researchers have achieved record performance on NLP tasks that mostly focus on language pattern matching.

Conversational Question Answering

Towards Earnings Call and Stock Price Movement

no code implementations23 Aug 2020 Zhiqiang Ma, Grace Bang, Chong Wang, Xiaomo Liu

Earnings calls are hosted by management of public companies to discuss the company's financial performance with analysts and investors.

Management Stock Price Prediction

SPot: A tool for identifying operating segments in financial tables

no code implementations17 May 2020 Zhiqiang Ma, Steven Pomerville, Mingyang Di, Armineh Nourbakhsh

In this paper we present SPot, an automated tool for detecting operating segments and their related performance indicators from earnings reports.

Benchmarking

Empirical Study on Detecting Controversy in Social Media

no code implementations25 Aug 2019 Azadeh Nematzadeh, Grace Bang, Xiaomo Liu, Zhiqiang Ma

Companies and financial investors are paying increasing attention to social consciousness in developing their corporate strategies and making investment decisions to support a sustainable economy for the future.
