Search Results for author: Christopher Cieri

Found 27 papers, 2 papers with code

Reflections on 30 Years of Language Resource Development and Sharing

no code implementations LREC 2022 Christopher Cieri, Mark Liberman, Sunghye Cho, Stephanie Strassel, James Fiumara, Jonathan Wright

The Linguistic Data Consortium was founded in 1992 to solve the problem that limitations in access to shareable data were impeding progress in Human Language Technology research and development.

Management Open-Ended Question Answering

The NIEUW Project: Developing Language Resources through Novel Incentives

no code implementations NIDCP (LREC) 2022 James Fiumara, Christopher Cieri, Mark Liberman, Chris Callison-Burch, Jonathan Wright, Robert Parker

NIEUW leverages the power of novel incentives to elicit linguistic data and annotations from a wide variety of contributors including citizen scientists, game players, and language students and professionals.

Using Mixed Incentives to Document Xi’an Guanzhong

no code implementations NIDCP (LREC) 2022 Juhong Zhan, Yue Jiang, Christopher Cieri, Mark Liberman, Jiahong Yuan, Yiya Chen, Odette Scharenborg

This paper describes our use of mixed incentives and the citizen science portal LanguageARC to prepare, collect and quality control a large corpus of object namings for the purpose of providing speech data to document the under-represented Guanzhong dialect of Chinese spoken in the Shaanxi province in the environs of Xi’an.

The Third DIHARD Diarization Challenge

3 code implementations 2 Dec 2020 Neville Ryant, Prachi Singh, Venkat Krishnamohan, Rajat Varma, Kenneth Church, Christopher Cieri, Jun Du, Sriram Ganapathy, Mark Liberman

DIHARD III was the third in a series of speaker diarization challenges intended to improve the robustness of diarization systems to variability in recording equipment, noise conditions, and conversational domain.

Speaker Diarization +1

LanguageARC - a tutorial

no code implementations LREC 2020 Christopher Cieri, James Fiumara

LanguageARC is a portal that offers citizen linguists opportunities to contribute to language related research.

A Progress Report on Activities at the Linguistic Data Consortium Benefitting the LREC Community

no code implementations LREC 2020 Christopher Cieri, James Fiumara, Stephanie Strassel, Jonathan Wright, Denise DiPersio, Mark Liberman

This latest in a series of Linguistic Data Consortium (LDC) progress reports to the LREC community does not describe any single language resource, evaluation campaign or technology but sketches the activities, since the last report, of a data center devoted to supporting the work of LREC attendees among other research communities.

Related Works in the Linguistic Data Consortium Catalog

no code implementations LREC 2020 Daniel Jaquette, Christopher Cieri, Denise DiPersio

The authors go step-by-step through the development of the Related Works schema, implementation of the software and database changes, and data entry of the relations.

LanguageARC: Developing Language Resources Through Citizen Linguistics

no code implementations LREC 2020 James Fiumara, Christopher Cieri, Jonathan Wright, Mark Liberman

Like other Citizen Science platforms and projects, LanguageARC harnesses the power and efforts of volunteers who are motivated by the incentives of contributing to science, learning and discovery, and belonging to a community dedicated to social improvement.

ARC

The Second DIHARD Diarization Challenge: Dataset, task, and baselines

1 code implementation 18 Jun 2019 Neville Ryant, Kenneth Church, Christopher Cieri, Alejandrina Cristia, Jun Du, Sriram Ganapathy, Mark Liberman

This paper introduces the second DIHARD challenge, the second in a series of speaker diarization challenges intended to improve the robustness of diarization systems to variation in recording equipment, noise conditions, and conversational domain.

Action Detection Activity Detection +5

Data Management Plans and Data Centers

no code implementations LREC 2016 Denise DiPersio, Christopher Cieri, Daniel Jaquette

Data management plans, data sharing plans and the like are now required by funders worldwide as part of research proposals.

Management

Selection Criteria for Low Resource Language Programs

no code implementations LREC 2016 Christopher Cieri, Mike Maxwell, Stephanie Strassel, Jennifer Tracey

This paper documents and describes the criteria used to select languages for study within programs that include low resource languages whether given that label or another similar one.

Management

The Language Application Grid and Galaxy

no code implementations LREC 2016 Nancy Ide, Keith Suderman, James Pustejovsky, Marc Verhagen, Christopher Cieri

The NSF-SI2-funded LAPPS Grid project is a collaborative effort among Brandeis University, Vassar College, Carnegie Mellon University (CMU), and the Linguistic Data Consortium (LDC), which has developed an open, web-based infrastructure through which resources can be easily accessed and within which tailored language services can be efficiently composed, evaluated, disseminated and consumed by researchers, developers, and students across a wide variety of disciplines.

Management

Trends in HLT Research: A Survey of LDC's Data Scholarship Program

no code implementations LREC 2016 Denise DiPersio, Christopher Cieri

Since its inception in 2010, the Linguistic Data Consortium's data scholarship program has awarded no-cost grants in data to 64 recipients from 26 countries.

Survey

Building Language Resources for Exploring Autism Spectrum Disorders

no code implementations LREC 2016 Julia Parish-Morris, Christopher Cieri, Mark Liberman, Leila Bateman, Emily Ferguson, Robert T. Schultz

Autism spectrum disorder (ASD) is a complex neurodevelopmental condition that would benefit from low-cost and reliable improvements to screening and diagnosis.

Facing the Identification Problem in Language-Related Scientific Data Analysis.

no code implementations LREC 2014 Joseph Mariani, Christopher Cieri, Gil Francopoulo, Patrick Paroubek, Marine Delaborde

This paper describes the problems that must be addressed when studying large amounts of data over time which require entity normalization applied not to the usual genres of news or political speech, but to the genre of academic discourse about language resources, technologies and sciences.

Language Identification

The Language Application Grid

no code implementations LREC 2014 Nancy Ide, James Pustejovsky, Christopher Cieri, Eric Nyberg, Di Wang, Keith Suderman, Marc Verhagen, Jonathan Wright

The Language Application (LAPPS) Grid project is establishing a framework that enables language service discovery, composition, and reuse and promotes sustainability, manageability, usability, and interoperability of Natural Language Processing (NLP) components.

Machine Translation Question Answering +1

New Directions for Language Resource Development and Distribution

no code implementations LREC 2014 Christopher Cieri, Denise DiPersio, Mark Liberman, Andrea Mazzucchi, Stephanie Strassel, Jonathan Wright

Despite the growth in the number of linguistic data centers around the world, their accomplishments and expansions and the advances they have helped enable, the language resources that exist are a small fraction of those required to meet the goals of Human Language Technologies (HLT) for the world's languages and the promises they offer: broad access to knowledge, direct communication across language boundaries and engagement in a global community.

Transfer Learning

Twenty Years of Language Resource Development and Distribution: A Progress Report on LDC Activities

no code implementations LREC 2012 Christopher Cieri, Marian Reed, Denise DiPersio, Mark Liberman

On the Linguistic Data Consortium's (LDC) 20th anniversary, this paper describes the changes to the language resource landscape over the past two decades, how LDC has adjusted its practice to adapt to them and how the business model continues to grow.
