Search Results for author: Yujie Lu

Found 32 papers, 16 papers with code

VIM: Probing Multimodal Large Language Models for Visual Embedded Instruction Following

no code implementations29 Nov 2023 Yujie Lu, Xiujun Li, William Yang Wang, Yejin Choi

We introduce VISUAL EMBEDDED INSTRUCTION (VIM), a new framework designed to evaluate the visual instruction following capability of Multimodal Large Language Models (MLLMs).

In-Context Learning visual instruction following

GPT-4V(ision) as a Generalist Evaluator for Vision-Language Tasks

no code implementations2 Nov 2023 Xinlu Zhang, Yujie Lu, Weizhi Wang, An Yan, Jun Yan, Lianke Qin, Heng Wang, Xifeng Yan, William Yang Wang, Linda Ruth Petzold

Automatically evaluating vision-language tasks is challenging, especially when it comes to reflecting human judgments due to limitations in accounting for fine-grained details.

Image Generation

Empowering Psychotherapy with Large Language Models: Cognitive Distortion Detection through Diagnosis of Thought Prompting

1 code implementation11 Oct 2023 Zhiyu Chen, Yujie Lu, William Yang Wang

Mental illness remains one of the most critical public health issues of our time, due to the severe scarcity and accessibility limit of professionals.

ImagenHub: Standardizing the evaluation of conditional image generation models

2 code implementations2 Oct 2023 Max Ku, Tianle Li, Kai Zhang, Yujie Lu, Xingyu Fu, Wenwen Zhuang, Wenhu Chen

Recently, a myriad of conditional image generation and editing models have been developed to serve different downstream tasks, including text-to-image generation, text-guided image editing, subject-driven image generation, control-guided image generation, etc.

Conditional Image Generation text-guided-image-editing

Learning Concise and Descriptive Attributes for Visual Recognition

1 code implementation ICCV 2023 An Yan, Yu Wang, Yiwu Zhong, chengyu dong, Zexue He, Yujie Lu, William Wang, Jingbo Shang, Julian McAuley

Recent advances in foundation models present new opportunities for interpretable visual recognition -- one can first query Large Language Models (LLMs) to obtain a set of attributes that describe each class, then apply vision-language models to classify images via these attributes.


Let's Think Frame by Frame with VIP: A Video Infilling and Prediction Dataset for Evaluating Video Chain-of-Thought

1 code implementation23 May 2023 Vaishnavi Himakunthala, Andy Ouyang, Daniel Rose, Ryan He, Alex Mei, Yujie Lu, Chinmay Sonar, Michael Saxon, William Yang Wang

Despite exciting recent results showing vision-language systems' capacity to reason about images using natural language, their capacity for video reasoning remains under-explored.

Descriptive Video Prediction

Collaborative Generative AI: Integrating GPT-k for Efficient Editing in Text-to-Image Generation

no code implementations18 May 2023 Wanrong Zhu, Xinyi Wang, Yujie Lu, Tsu-Jui Fu, Xin Eric Wang, Miguel Eckstein, William Yang Wang

We conduct a series of experiments to compare the common edits made by humans and GPT-k, evaluate the performance of GPT-k in prompting T2I, and examine factors that may influence this process.

Text Generation Text-to-Image Generation

LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation

1 code implementation NeurIPS 2023 Yujie Lu, Xianjun Yang, Xiujun Li, Xin Eric Wang, William Yang Wang

Existing automatic evaluation on text-to-image synthesis can only provide an image-text matching score, without considering the object-level compositionality, which results in poor correlation with human judgments.

Attribute Image Generation +2

Multimodal Procedural Planning via Dual Text-Image Prompting

1 code implementation2 May 2023 Yujie Lu, Pan Lu, Zhiyu Chen, Wanrong Zhu, Xin Eric Wang, William Yang Wang

The key challenges of MPP are to ensure the informativeness, temporal coherence, and accuracy of plans across modalities.

Informativeness Text-to-Image Generation

Language Control Diffusion: Efficiently Scaling through Space, Time, and Tasks

1 code implementation27 Oct 2022 Edwin Zhang, Yujie Lu, William Wang, Amy Zhang

Training generalist agents is difficult across several axes, requiring us to deal with high-dimensional inputs (space), long horizons (time), and generalization to novel tasks.

reinforcement-learning Reinforcement Learning (RL)

WikiWhy: Answering and Explaining Cause-and-Effect Questions

no code implementations21 Oct 2022 Matthew Ho, Aditya Sharma, Justin Chang, Michael Saxon, Sharon Levy, Yujie Lu, William Yang Wang

As large language models (LLMs) grow larger and more sophisticated, assessing their "reasoning" capabilities in natural language grows more challenging.

Question Answering

ULN: Towards Underspecified Vision-and-Language Navigation

1 code implementation18 Oct 2022 Weixi Feng, Tsu-Jui Fu, Yujie Lu, William Yang Wang

Vision-and-Language Navigation (VLN) is a task to guide an embodied agent moving to a target position using language instructions.

Vision and Language Navigation

Structured Knowledge Grounding for Question Answering

no code implementations17 Sep 2022 Yujie Lu, Siqi Ouyang, Kairui Zhou

In this paper, we propose to solely leverage the LMs to combine the language and knowledge for knowledge based question-answering with flexibility, breadth of coverage and structured reasoning.

Knowledge Graphs Open-Ended Question Answering +1

Anticipating the Unseen Discrepancy for Vision and Language Navigation

no code implementations10 Sep 2022 Yujie Lu, Huiliang Zhang, Ping Nie, Weixi Feng, Wenda Xu, Xin Eric Wang, William Yang Wang

In this paper, we propose an Unseen Discrepancy Anticipating Vision and Language Navigation (DAVIS) that learns to generalize to unseen environments via encouraging test-time visual consistency.

Data Augmentation Decision Making +3

Few-Shot Document-Level Event Argument Extraction

1 code implementation6 Sep 2022 Xianjun Yang, Yujie Lu, Linda Petzold

To fill this gap, we present FewDocAE, a Few-Shot Document-Level Event Argument Extraction benchmark, based on the existing document-level event extraction dataset.

Document-level Event Extraction Event Argument Extraction +2

Re4: Learning to Re-contrast, Re-attend, Re-construct for Multi-interest Recommendation

1 code implementation17 Aug 2022 Shengyu Zhang, Lingxiao Yang, Dong Yao, Yujie Lu, Fuli Feng, Zhou Zhao, Tat-Seng Chua, Fei Wu

Specifically, Re4 encapsulates three backward flows, i. e., 1) Re-contrast, which drives each interest embedding to be distinct from other interests using contrastive learning; 2) Re-attend, which ensures the interest-item correlation estimation in the forward flow to be consistent with the criterion used in final recommendation; and 3) Re-construct, which ensures that each interest embedding can semantically reflect the information of representative items that relate to the corresponding interest.

Contrastive Learning Recommendation Systems

Neuro-Symbolic Procedural Planning with Commonsense Prompting

no code implementations6 Jun 2022 Yujie Lu, Weixi Feng, Wanrong Zhu, Wenda Xu, Xin Eric Wang, Miguel Eckstein, William Yang Wang

Procedural planning aims to implement complex high-level goals by decomposition into sequential simpler low-level steps.

Graph Sampling

Imagination-Augmented Natural Language Understanding

1 code implementation NAACL 2022 Yujie Lu, Wanrong Zhu, Xin Eric Wang, Miguel Eckstein, William Yang Wang

Human brains integrate linguistic and perceptual information simultaneously to understand natural language, and hold the critical ability to render imaginations.

Natural Language Understanding

AstBERT: Enabling Language Model for Financial Code Understanding with Abstract Syntax Trees

no code implementations20 Jan 2022 Rong Liang, Tiehua Zhang, Yujie Lu, Yuze Liu, Zhen Huang, Xin Chen

Specifically, we collect a sheer number of source codes (both Java and Python) from the Alipay code repository and incorporate both syntactic and semantic code knowledge into our model through the help of code parsers, in which AST information of the source codes can be interpreted and integrated.

Clone Detection Code Search +2

High-fidelity 3D Model Compression based on Key Spheres

1 code implementation19 Jan 2022 Yuanzhan Li, Yuqi Liu, Yujie Lu, Siyu Zhang, Shen Cai, Yanting Zhang

Compared to previous works, our method achieves the high-fidelity and high-compression 3D object coding and reconstruction.

Model Compression Object +1

MIC: Model-agnostic Integrated Cross-channel Recommenders

no code implementations22 Oct 2021 Yujie Lu, Ping Nie, Shengyu Zhang, Ming Zhao, Ruobing Xie, William Yang Wang, Yi Ren

However, existing work are primarily built upon pre-defined retrieval channels, including User-CF (U2U), Item-CF (I2I), and Embedding-based Retrieval (U2I), thus access to the limited correlation between users and items which solely entail from partial information of latent interactions.

Recommendation Systems Retrieval +2

Federated Natural Language Generation for Personalized Dialogue System

no code implementations13 Oct 2021 Yujie Lu, Chao Huang, Huanli Zhan, Yong Zhuang

FedNLG first pre-trains parameters of standard neural conversational model over a large dialogue corpus, and then fine-tune the model parameters and persona embeddings on specific datasets, in a federated manner.

Text Generation

Multi-trends Enhanced Dynamic Micro-video Recommendation

no code implementations8 Oct 2021 Yujie Lu, Yingxuan Huang, Shengyu Zhang, Wei Han, Hui Chen, Zhou Zhao, Fei Wu

In this paper, we propose the DMR framework to explicitly model dynamic multi-trends of users' current preference and make predictions based on both the history and future potential trends.

Recommendation Systems

RecBole: Towards a Unified, Comprehensive and Efficient Framework for Recommendation Algorithms

1 code implementation3 Nov 2020 Wayne Xin Zhao, Shanlei Mu, Yupeng Hou, Zihan Lin, Yushuo Chen, Xingyu Pan, Kaiyuan Li, Yujie Lu, Hui Wang, Changxin Tian, Yingqian Min, Zhichao Feng, Xinyan Fan, Xu Chen, Pengfei Wang, Wendi Ji, Yaliang Li, Xiaoling Wang, Ji-Rong Wen

In this library, we implement 73 recommendation models on 28 benchmark datasets, covering the categories of general recommendation, sequential recommendation, context-aware recommendation and knowledge-based recommendation.

Collaborative Filtering Sequential Recommendation

Future-Aware Diverse Trends Framework for Recommendation

1 code implementation1 Nov 2020 Yujie Lu, Shengyu Zhang, Yingxuan Huang, Luyao Wang, Xinyao Yu, Zhou Zhao, Fei Wu

By diverse trends, supposing the future preferences can be diversified, we propose the diverse trends extractor and the time-aware mechanism to represent the possible trends of preferences for a given user with multiple vectors.

Representation Learning Sequential Recommendation

CLOUD: Contrastive Learning of Unsupervised Dynamics

no code implementations23 Oct 2020 Jianren Wang, Yujie Lu, Hang Zhao

Developing agents that can perform complex control tasks from high dimensional observations such as pixels is challenging due to difficulties in learning dynamics efficiently.

Contrastive Learning

Cannot find the paper you are looking for? You can Submit a new open access paper.