Search Results for author: Chris Bryan

Found 9 papers, 2 papers with code

ASAP: Interpretable Analysis and Summarization of AI-generated Image Patterns at Scale

no code implementations • 3 Apr 2024 • Jinbin Huang, Chen Chen, Aditi Mishra, Bum Chul Kwon, Zhicheng Liu, Chris Bryan

Generative image models have emerged as a promising technology to produce realistic images.

Paper
Add Code

InterVLS: Interactive Model Understanding and Improvement with Vision-Language Surrogates

no code implementations • 6 Nov 2023 • Jinbin Huang, Wenbin He, Liang Gou, Liu Ren, Chris Bryan

Deep learning models are widely used in critical applications, highlighting the need for pre-deployment model understanding and improvement.

Paper
Add Code

LINGO : Visually Debiasing Natural Language Instructions to Support Task Diversity

no code implementations • 12 Apr 2023 • Anjana Arunkumar, Shubham Sharma, Rakhi Agrawal, Sriram Chandrasekaran, Chris Bryan

Cross-task generalization is a significant outcome that defines mastery in natural language understanding.

Natural Language Understanding

Paper
Add Code

Real-Time Visual Feedback to Guide Benchmark Creation: A Human-and-Metric-in-the-Loop Workflow

1 code implementation • 9 Feb 2023 • Anjana Arunkumar, Swaroop Mishra, Bhavdeep Sachdeva, Chitta Baral, Chris Bryan

In pursuit of creating better benchmarks, we propose VAIDA, a novel benchmark creation paradigm for NLP, that focuses on guiding crowdworkers, an under-explored facet of addressing benchmark idiosyncrasies.

Paper
Code

Hardness of Samples Need to be Quantified for a Reliable Evaluation System: Exploring Potential Opportunities with a New Task

no code implementations • 14 Oct 2022 • Swaroop Mishra, Anjana Arunkumar, Chris Bryan, Chitta Baral

Evaluation of models on benchmarks is unreliable without knowing the degree of sample hardness; this subsequently overestimates the capability of AI systems and limits their adoption in real world applications.

Semantic Textual Similarity STS

Paper
Add Code

A Survey of Parameters Associated with the Quality of Benchmarks in NLP

no code implementations • 14 Oct 2022 • Swaroop Mishra, Anjana Arunkumar, Chris Bryan, Chitta Baral

Inspired by successful quality indices in several domains such as power, food, and water, we take the first step towards a metric by identifying certain language properties that can represent various possible interactions leading to biases in a benchmark.

Benchmarking

Paper
Add Code

DQI: A Guide to Benchmark Evaluation

no code implementations • 10 Aug 2020 • Swaroop Mishra, Anjana Arunkumar, Bhavdeep Sachdeva, Chris Bryan, Chitta Baral

A `state of the art' model A surpasses humans in a benchmark B, but fails on similar benchmarks C, D, and E. What does B have that the other benchmarks do not?

Paper
Add Code

Our Evaluation Metric Needs an Update to Encourage Generalization

no code implementations • 14 Jul 2020 • Swaroop Mishra, Anjana Arunkumar, Chris Bryan, Chitta Baral

In order to stop the inflation in model performance -- and thus overestimation in AI systems' capabilities -- we propose a simple and novel evaluation metric, WOOD Score, that encourages generalization during evaluation.

Paper
Add Code

DQI: Measuring Data Quality in NLP

1 code implementation • 2 May 2020 • Swaroop Mishra, Anjana Arunkumar, Bhavdeep Sachdeva, Chris Bryan, Chitta Baral

The data creation paradigm consists of several data visualizations to help data creators (i) understand the quality of data and (ii) visualize the impact of the created data instance on the overall quality.

Active Learning Benchmarking

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.