Search Results for author: Bryan Wang

Found 13 papers, 3 papers with code

Cost-Effective Hallucination Detection for LLMs

no code implementations • 31 Jul 2024 • Simon Valentin, Jinmiao Fu, Gianluca Detommaso, Shaoyuan Xu, Giovanni Zappella, Bryan Wang

Large language models (LLMs) can be prone to hallucinations: generating unreliable outputs that are unfaithful to their inputs, contradict external facts, or are internally inconsistent.

Decision Making, Fact Checking +2

Q-Tuning: Queue-based Prompt Tuning for Lifelong Few-shot Language Learning

no code implementations • 22 Apr 2024 • Yanhui Guo, Shaoyuan Xu, Jinmiao Fu, Jia Liu, Chaosheng Dong, Bryan Wang

This paper introduces Q-tuning, a novel approach for continual prompt tuning that enables the lifelong learning of a pre-trained language model.

Language Modelling

LAVE: LLM-Powered Agent Assistance and Language Augmentation for Video Editing

no code implementations • 15 Feb 2024 • Bryan Wang, Yuliang Li, Zhaoyang Lv, Haijun Xia, Yan Xu, Raj Sodhi

Based on these findings, we propose design implications to inform the future development of agent-assisted content editing.

Video Editing

SynthScribe: Deep Multimodal Tools for Synthesizer Sound Retrieval and Exploration

no code implementations • 7 Dec 2023 • Stephen Brade, Bryan Wang, Mauricio Sousa, Gregory Lee Newsome, Sageev Oore, Tovi Grossman

This is achieved with three main features: a multimodal search engine for a large library of synthesizer sounds; a user-centered genetic algorithm by which completely new sounds can be created and selected given the user's preferences; and a sound editing support feature which highlights and gives examples for key control parameters with respect to a text- or audio-based query.

Multimodal Deep Learning, Retrieval

A Zero-Shot Language Agent for Computer Control with Structured Reflection

no code implementations • 12 Oct 2023 • Tao Li, Gang Li, Zhiwei Deng, Bryan Wang, Yang Li

To perform a task, recent works often require a model to learn from trace examples of the task via either supervised learning or few/many-shot prompting.

Management

AdaSelection: Accelerating Deep Learning Training through Data Subsampling

no code implementations • 19 Jun 2023 • Minghe Zhang, Chaosheng Dong, Jinmiao Fu, Tianchen Zhou, Jia Liang, Jia Liu, Bo Liu, Michinari Momma, Bryan Wang, Yan Gao, Yi Sun

In this paper, we introduce AdaSelection, an adaptive sub-sampling method to identify the most informative sub-samples within each minibatch to speed up the training of large-scale deep learning models without sacrificing model performance.

Promptify: Text-to-Image Generation through Interactive Prompt Exploration with Large Language Models

no code implementations • 18 Apr 2023 • Stephen Brade, Bryan Wang, Mauricio Sousa, Sageev Oore, Tovi Grossman

Text-to-image generative models have demonstrated remarkable capabilities in generating high-quality images based on textual prompts.

Text-to-Image Generation

Enabling Conversational Interaction with Mobile UI using Large Language Models

1 code implementation • 18 Sep 2022 • Bryan Wang, Gang Li, Yang Li

This paper investigates the feasibility of enabling versatile conversational interactions with mobile UIs using a single LLM.

CMA-CLIP: Cross-Modality Attention CLIP for Image-Text Classification

no code implementations • 7 Dec 2021 • Huidong Liu, Shaoyuan Xu, Jinmiao Fu, Yang Liu, Ning Xie, Chien-Chih Wang, Bryan Wang, Yi Sun

In this paper, we propose the Cross-Modality Attention Contrastive Language-Image Pre-training (CMA-CLIP), a new framework which unifies two types of cross-modality attentions, sequence-wise attention and modality-wise attention, to effectively fuse information from image and text pairs.

Attribute, Image-text Classification +3

Screen2Words: Automatic Mobile UI Summarization with Multimodal Learning

2 code implementations • 7 Aug 2021 • Bryan Wang, Gang Li, Xin Zhou, Zhourong Chen, Tovi Grossman, Yang Li

Mobile User Interface Summarization generates succinct language descriptions of mobile screens for conveying important contents and functionalities of the screen, which can be useful for many language-based application scenarios.

PerformanceNet: Score-to-Audio Music Generation with Multi-Band Convolutional Residual Network

1 code implementation • 11 Nov 2018 • Bryan Wang, Yi-Hsuan Yang

To build such an AI performer, we propose in this paper a deep convolutional model that learns in an end-to-end manner the score-to-audio mapping between a symbolic representation of music called the piano rolls and an audio representation of music called the spectrograms.

Sound, Multimedia, Audio and Speech Processing
