Search Results for author: Mark Liberman

Found 25 papers, 6 papers with code

Benchmarking: Past, Present and Future

1 code implementation ACL (BPPF) 2021 Kenneth Church, Mark Liberman, Valia Kordoni

There used to be more top-down leadership from government (and industry, in the case of systems, with benchmarks such as SPEC).

Benchmarking Reading Comprehension

Reflections on 30 Years of Language Resource Development and Sharing

no code implementations LREC 2022 Christopher Cieri, Mark Liberman, Sunghye Cho, Stephanie Strassel, James Fiumara, Jonathan Wright

The Linguistic Data Consortium was founded in 1992 to solve the problem that limitations in access to shareable data were impeding progress in Human Language Technology research and development.

Management Open-Ended Question Answering

GRAIL—Generalized Representation and Aggregation of Information Layers

no code implementations LREC (LAW) 2022 Sameer Pradhan, Mark Liberman

This paper identifies novel characteristics necessary to successfully represent multiple streams of natural language information from speech and text simultaneously, and proposes a multi-tiered system that implements these characteristics centered around a declarative configuration.

Using Mixed Incentives to Document Xi’an Guanzhong

no code implementations NIDCP (LREC) 2022 Juhong Zhan, Yue Jiang, Christopher Cieri, Mark Liberman, Jiahong Yuan, Yiya Chen, Odette Scharenborg

This paper describes our use of mixed incentives and the citizen science portal LanguageARC to prepare, collect and quality control a large corpus of object namings for the purpose of providing speech data to document the under-represented Guanzhong dialect of Chinese spoken in the Shaanxi province in the environs of Xi’an.

The NIEUW Project: Developing Language Resources through Novel Incentives

no code implementations NIDCP (LREC) 2022 James Fiumara, Christopher Cieri, Mark Liberman, Chris Callison-Burch, Jonathan Wright, Robert Parker

NIEUW leverages the power of novel incentives to elicit linguistic data and annotations from a wide variety of contributors including citizen scientists, game players, and language students and professionals.

Automatic recognition of suprasegmentals in speech

no code implementations 2 Aug 2021 Jiahong Yuan, Neville Ryant, Xingyu Cai, Kenneth Church, Mark Liberman

This study reports our efforts to improve automatic recognition of suprasegmentals by fine-tuning wav2vec 2.0 with CTC, a method that has been successful in automatic speech recognition.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

The Third DIHARD Diarization Challenge

3 code implementations 2 Dec 2020 Neville Ryant, Prachi Singh, Venkat Krishnamohan, Rajat Varma, Kenneth Church, Christopher Cieri, Jun Du, Sriram Ganapathy, Mark Liberman

DIHARD III was the third in a series of speaker diarization challenges intended to improve the robustness of diarization systems to variability in recording equipment, noise conditions, and conversational domain.

speaker-diarization Speaker Diarization +1

Neural Representations for Modeling Variation in Speech

1 code implementation 25 Nov 2020 Martijn Bartelds, Wietse de Vries, Faraz Sanal, Caitlin Richter, Mark Liberman, Martijn Wieling

We show that speech representations extracted from a specific type of neural model (i.e., Transformers) lead to a better match with human perception than two earlier approaches on the basis of phonetic transcriptions and MFCC-based acoustic features.

Probing Acoustic Representations for Phonetic Properties

1 code implementation 25 Oct 2020 Danni Ma, Neville Ryant, Mark Liberman

Pre-trained acoustic representations such as wav2vec and DeCoAR have attained impressive word error rates (WER) for speech recognition benchmarks, particularly when labeled data is limited.

Benchmarking speech-recognition +1

LanguageARC: Developing Language Resources Through Citizen Linguistics

no code implementations LREC 2020 James Fiumara, Christopher Cieri, Jonathan Wright, Mark Liberman

Like other Citizen Science platforms and projects, LanguageARC harnesses the power and efforts of volunteers who are motivated by the incentives of contributing to science, learning and discovery, and belonging to a community dedicated to social improvement.

A Progress Report on Activities at the Linguistic Data Consortium Benefitting the LREC Community

no code implementations LREC 2020 Christopher Cieri, James Fiumara, Stephanie Strassel, Jonathan Wright, Denise DiPersio, Mark Liberman

This latest in a series of Linguistic Data Consortium (LDC) progress reports to the LREC community does not describe any single language resource, evaluation campaign or technology but sketches the activities, since the last report, of a data center devoted to supporting the work of LREC attendees among other research communities.

The Second DIHARD Diarization Challenge: Dataset, task, and baselines

1 code implementation 18 Jun 2019 Neville Ryant, Kenneth Church, Christopher Cieri, Alejandrina Cristia, Jun Du, Sriram Ganapathy, Mark Liberman

This paper introduces the second DIHARD challenge, the second in a series of speaker diarization challenges intended to improve the robustness of diarization systems to variation in recording equipment, noise conditions, and conversational domain.

Action Detection Activity Detection +5

From Human Language Technology to Human Language Science

no code implementations JEPTALNRECITAL 2016 Mark Liberman

Thirty years ago, in order to get past roadblocks in Machine Translation and Automatic Speech Recognition, DARPA invented a new way to organize and manage technological R&D: a "common task" is defined by a formal quantitative evaluation metric and a body of shared training data, and researchers join an open competition to compare approaches.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Building Language Resources for Exploring Autism Spectrum Disorders

no code implementations LREC 2016 Julia Parish-Morris, Christopher Cieri, Mark Liberman, Leila Bateman, Emily Ferguson, Robert T. Schultz

Autism spectrum disorder (ASD) is a complex neurodevelopmental condition that would benefit from low-cost and reliable improvements to screening and diagnosis.

New Directions for Language Resource Development and Distribution

no code implementations LREC 2014 Christopher Cieri, Denise DiPersio, Mark Liberman, Andrea Mazzucchi, Stephanie Strassel, Jonathan Wright

Despite the growth in the number of linguistic data centers around the world, their accomplishments and expansions and the advances they have helped enable, the language resources that exist are a small fraction of those required to meet the goals of Human Language Technologies (HLT) for the world's languages and the promises they offer: broad access to knowledge, direct communication across language boundaries and engagement in a global community.

Transfer Learning

Twenty Years of Language Resource Development and Distribution: A Progress Report on LDC Activities

no code implementations LREC 2012 Christopher Cieri, Marian Reed, Denise DiPersio, Mark Liberman

On the Linguistic Data Consortium's (LDC) 20th anniversary, this paper describes the changes to the language resource landscape over the past two decades, how LDC has adjusted its practice to adapt to them and how the business model continues to grow.

Annotation graphs as a framework for multidimensional linguistic data analysis

1 code implementation 5 Jul 1999 Steven Bird, Mark Liberman

In recent work we have presented a formal framework for linguistic annotation based on labeled acyclic digraphs.
