1 code implementation • 5 Apr 2024 • Michael Saxon, Fatima Jahara, Mahsa Khoshnoodi, Yujie Lu, Aditya Sharma, William Yang Wang
With advances in the quality of text-to-image (T2I) models has come interest in benchmarking their prompt faithfulness: the semantic coherence of generated images with the prompts they were conditioned on.
1 code implementation • 3 Mar 2024 • Yujie Lu, Long Wan, Nayu Ding, Yulong Wang, Shuhan Shen, Shen Cai, Lin Gao
However, common distance-field-based implicit representations, specifically the signed distance field (SDF) for watertight shapes or the unsigned distance field (UDF) for arbitrary shapes, routinely suffer degraded reconstruction accuracy when converted to explicit surface points and meshes.
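The SDF/UDF distinction above can be made concrete with an analytic shape. A minimal sketch (not from the paper) using a sphere: the signed field distinguishes inside from outside, while the unsigned field keeps only the magnitude, which is one reason surface extraction from a UDF is harder:

```python
import math

def sdf_sphere(p, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance to a sphere: negative inside, positive outside."""
    return math.dist(p, center) - radius

def udf_sphere(p, center=(0.0, 0.0, 0.0), radius=1.0):
    """Unsigned distance: magnitude only, so inside/outside is lost."""
    return abs(sdf_sphere(p, center, radius))

# A point inside the unit sphere: the SDF is negative, the UDF is not.
inside = (0.5, 0.0, 0.0)
print(sdf_sphere(inside))  # -0.5
print(udf_sphere(inside))  # 0.5
```

Because the UDF never crosses zero, standard zero-level-set extraction (e.g. marching cubes on the sign change) does not apply directly, which is the accuracy gap the entry above refers to.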
no code implementations • 29 Nov 2023 • Yujie Lu, Xiujun Li, William Yang Wang, Yejin Choi
We introduce VISUAL EMBEDDED INSTRUCTION (VIM), a new framework designed to evaluate the visual instruction following capability of Multimodal Large Language Models (MLLMs).
no code implementations • 2 Nov 2023 • Xinlu Zhang, Yujie Lu, Weizhi Wang, An Yan, Jun Yan, Lianke Qin, Heng Wang, Xifeng Yan, William Yang Wang, Linda Ruth Petzold
Automatically evaluating vision-language tasks is challenging, especially when it comes to reflecting human judgments due to limitations in accounting for fine-grained details.
1 code implementation • 11 Oct 2023 • Zhiyu Chen, Yujie Lu, William Yang Wang
Mental illness remains one of the most critical public health issues of our time, due to the severe scarcity of, and limited access to, professionals.
2 code implementations • 2 Oct 2023 • Max Ku, Tianle Li, Kai Zhang, Yujie Lu, Xingyu Fu, Wenwen Zhuang, Wenhu Chen
Recently, a myriad of conditional image generation and editing models have been developed to serve different downstream tasks, including text-to-image generation, text-guided image editing, subject-driven image generation, control-guided image generation, etc.
1 code implementation • ICCV 2023 • An Yan, Yu Wang, Yiwu Zhong, Chengyu Dong, Zexue He, Yujie Lu, William Wang, Jingbo Shang, Julian McAuley
Recent advances in foundation models present new opportunities for interpretable visual recognition -- one can first query Large Language Models (LLMs) to obtain a set of attributes that describe each class, then apply vision-language models to classify images via these attributes.
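The attribute-based pipeline described above can be sketched abstractly. Here `attr_scores` stands in for the image-attribute similarities a vision-language model would produce, and the attribute lists stand in for LLM-generated class descriptions; all names and values are placeholders, not from the paper:

```python
def classify_by_attributes(attr_scores, class_attributes):
    """Score each class by averaging the image's similarity to that
    class's (LLM-generated) attributes, then pick the best class."""
    best_class, best = None, float("-inf")
    for cls, attrs in class_attributes.items():
        score = sum(attr_scores[a] for a in attrs) / len(attrs)
        if score > best:
            best_class, best = cls, score
    return best_class

# Placeholder attribute lists and similarity scores.
class_attributes = {
    "zebra": ["striped coat", "four legs"],
    "horse": ["solid coat", "four legs"],
}
attr_scores = {"striped coat": 0.9, "solid coat": 0.2, "four legs": 0.8}
print(classify_by_attributes(attr_scores, class_attributes))  # zebra
```

The interpretability claim follows from this structure: the per-attribute scores explain *why* a class won, unlike a single end-to-end logit.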
1 code implementation • 23 May 2023 • Vaishnavi Himakunthala, Andy Ouyang, Daniel Rose, Ryan He, Alex Mei, Yujie Lu, Chinmay Sonar, Michael Saxon, William Yang Wang
Despite exciting recent results showing vision-language systems' capacity to reason about images using natural language, their capacity for video reasoning remains under-explored.
no code implementations • 18 May 2023 • Wanrong Zhu, Xinyi Wang, Yujie Lu, Tsu-Jui Fu, Xin Eric Wang, Miguel Eckstein, William Yang Wang
We conduct a series of experiments to compare the common edits made by humans and GPT-k, evaluate the performance of GPT-k in prompting T2I, and examine factors that may influence this process.
1 code implementation • NeurIPS 2023 • Yujie Lu, Xianjun Yang, Xiujun Li, Xin Eric Wang, William Yang Wang
Existing automatic evaluation of text-to-image synthesis provides only an image-text matching score and ignores object-level compositionality, which results in poor correlation with human judgments.
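For reference, the single matching score being criticized typically reduces to a cosine similarity between global image and text embeddings, as in CLIP-style metrics. A minimal sketch with placeholder vectors (not real model outputs):

```python
import math

def cosine_score(img_emb, txt_emb):
    """One scalar per image-prompt pair: cosine similarity of embeddings."""
    dot = sum(a * b for a, b in zip(img_emb, txt_emb))
    norm = (math.sqrt(sum(a * a for a in img_emb))
            * math.sqrt(sum(b * b for b in txt_emb)))
    return dot / norm

# Toy embeddings: the score collapses the whole image and prompt into
# one number, with no per-object breakdown.
img = [0.9, 0.1, 0.3]
txt = [0.8, 0.2, 0.4]
print(round(cosine_score(img, txt), 3))
```

The limitation is visible in the interface itself: a single scalar cannot say which object or attribute in the prompt was missed, which motivates object-level evaluation.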
no code implementations • 3 May 2023 • Daniel Rose, Vaishnavi Himakunthala, Andy Ouyang, Ryan He, Alex Mei, Yujie Lu, Michael Saxon, Chinmay Sonar, Diba Mirza, William Yang Wang
Recent advances in large language models elicit reasoning in a chain-of-thought that allows models to decompose problems in a human-like fashion.
1 code implementation • 2 May 2023 • Yujie Lu, Pan Lu, Zhiyu Chen, Wanrong Zhu, Xin Eric Wang, William Yang Wang
The key challenges of multimodal procedural planning (MPP) are to ensure the informativeness, temporal coherence, and accuracy of plans across modalities.
1 code implementation • 27 Oct 2022 • Edwin Zhang, Yujie Lu, William Wang, Amy Zhang
Training generalist agents is difficult across several axes, requiring us to deal with high-dimensional inputs (space), long horizons (time), and generalization to novel tasks.
no code implementations • 21 Oct 2022 • Matthew Ho, Aditya Sharma, Justin Chang, Michael Saxon, Sharon Levy, Yujie Lu, William Yang Wang
As large language models (LLMs) grow larger and more sophisticated, assessing their "reasoning" capabilities in natural language grows more challenging.
1 code implementation • 18 Oct 2022 • Weixi Feng, Tsu-Jui Fu, Yujie Lu, William Yang Wang
Vision-and-Language Navigation (VLN) is the task of guiding an embodied agent to a target position using language instructions.
no code implementations • 11 Oct 2022 • An Yan, Jiacheng Li, Wanrong Zhu, Yujie Lu, William Yang Wang, Julian McAuley
However, the application of its text encoder solely for text understanding has been less explored.
1 code implementation • 10 Oct 2022 • Wenda Xu, Yi-Lin Tuan, Yujie Lu, Michael Saxon, Lei Li, William Yang Wang
Is it possible to build a general and automatic natural language generation (NLG) evaluation metric?
1 code implementation • 7 Oct 2022 • Wanrong Zhu, An Yan, Yujie Lu, Wenda Xu, Xin Eric Wang, Miguel Eckstein, William Yang Wang
Recent advances in text-to-image synthesis make it possible to visualize machine imaginations for a given context.
no code implementations • 17 Sep 2022 • Yujie Lu, Siqi Ouyang, Kairui Zhou
In this paper, we propose to leverage language models alone to combine language and knowledge for knowledge-based question answering, with flexibility, breadth of coverage, and structured reasoning.
no code implementations • 10 Sep 2022 • Yujie Lu, Huiliang Zhang, Ping Nie, Weixi Feng, Wenda Xu, Xin Eric Wang, William Yang Wang
In this paper, we propose Unseen Discrepancy Anticipating Vision and Language Navigation (DAVIS), which learns to generalize to unseen environments by encouraging test-time visual consistency.
1 code implementation • 6 Sep 2022 • Xianjun Yang, Yujie Lu, Linda Petzold
To fill this gap, we present FewDocAE, a Few-Shot Document-Level Event Argument Extraction benchmark, based on the existing document-level event extraction dataset.
1 code implementation • 17 Aug 2022 • Shengyu Zhang, Lingxiao Yang, Dong Yao, Yujie Lu, Fuli Feng, Zhou Zhao, Tat-Seng Chua, Fei Wu
Specifically, Re4 encapsulates three backward flows: 1) Re-contrast, which drives each interest embedding to be distinct from the other interests using contrastive learning; 2) Re-attend, which ensures that the interest-item correlation estimated in the forward flow is consistent with the criterion used in the final recommendation; and 3) Re-construct, which ensures that each interest embedding semantically reflects the information of the representative items related to the corresponding interest.
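As a rough illustration of the Re-contrast idea, a generic InfoNCE-style objective penalizes similarity between different interest embeddings. This is a simplified sketch under the assumption that each interest acts as its own positive; it is not the paper's actual loss:

```python
import math

def cos(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u))
                  * math.sqrt(sum(b * b for b in v)))

def re_contrast_loss(interests, temperature=0.5):
    """Generic contrastive objective: each interest embedding should be
    dissimilar from every other interest (the negatives); each interest
    serves as its own positive, so the negatives drive the loss."""
    loss = 0.0
    for i, anchor in enumerate(interests):
        pos = math.exp(cos(anchor, anchor) / temperature)  # self-sim = 1
        neg = sum(math.exp(cos(anchor, other) / temperature)
                  for j, other in enumerate(interests) if j != i)
        loss += -math.log(pos / (pos + neg))
    return loss / len(interests)

# Well-separated interests incur a lower loss than near-duplicate ones.
separated = [[1.0, 0.0], [0.0, 1.0]]
collapsed = [[1.0, 0.0], [0.99, 0.1]]
print(re_contrast_loss(separated) < re_contrast_loss(collapsed))  # True
```

Minimizing this pushes the interest embeddings apart, which is the distinctness property Re-contrast targets.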
no code implementations • 6 Jun 2022 • Yujie Lu, Weixi Feng, Wanrong Zhu, Wenda Xu, Xin Eric Wang, Miguel Eckstein, William Yang Wang
Procedural planning aims to implement complex high-level goals by decomposition into sequential simpler low-level steps.
1 code implementation • NAACL 2022 • Yujie Lu, Wanrong Zhu, Xin Eric Wang, Miguel Eckstein, William Yang Wang
Human brains integrate linguistic and perceptual information simultaneously to understand natural language, and hold the critical ability to render imaginations.
no code implementations • 20 Jan 2022 • Rong Liang, Tiehua Zhang, Yujie Lu, Yuze Liu, Zhen Huang, Xin Chen
Specifically, we collect a large volume of source code (both Java and Python) from the Alipay code repository and incorporate both syntactic and semantic code knowledge into our model with the help of code parsers, through which the AST information of the source code can be interpreted and integrated.
1 code implementation • 19 Jan 2022 • Yuanzhan Li, Yuqi Liu, Yujie Lu, Siyu Zhang, Shen Cai, Yanting Zhang
Compared to previous works, our method achieves high-fidelity, high-compression 3D object coding and reconstruction.
no code implementations • 22 Oct 2021 • Yujie Lu, Ping Nie, Shengyu Zhang, Ming Zhao, Ruobing Xie, William Yang Wang, Yi Ren
However, existing work is primarily built upon pre-defined retrieval channels, including User-CF (U2U), Item-CF (I2I), and Embedding-based Retrieval (U2I), and thus captures only limited user-item correlations derived from partial information about latent interactions.
no code implementations • 13 Oct 2021 • Yujie Lu, Chao Huang, Huanli Zhan, Yong Zhuang
FedNLG first pre-trains the parameters of a standard neural conversational model on a large dialogue corpus, and then fine-tunes the model parameters and persona embeddings on specific datasets in a federated manner.
no code implementations • 8 Oct 2021 • Yujie Lu, Yingxuan Huang, Shengyu Zhang, Wei Han, Hui Chen, Zhou Zhao, Fei Wu
In this paper, we propose the DMR framework to explicitly model the dynamic multi-trends of users' current preferences and make predictions based on both the history and potential future trends.
1 code implementation • 3 Nov 2020 • Wayne Xin Zhao, Shanlei Mu, Yupeng Hou, Zihan Lin, Yushuo Chen, Xingyu Pan, Kaiyuan Li, Yujie Lu, Hui Wang, Changxin Tian, Yingqian Min, Zhichao Feng, Xinyan Fan, Xu Chen, Pengfei Wang, Wendi Ji, Yaliang Li, Xiaoling Wang, Ji-Rong Wen
In this library, we implement 73 recommendation models on 28 benchmark datasets, covering the categories of general recommendation, sequential recommendation, context-aware recommendation and knowledge-based recommendation.
1 code implementation • 1 Nov 2020 • Yujie Lu, Shengyu Zhang, Yingxuan Huang, Luyao Wang, Xinyao Yu, Zhou Zhao, Fei Wu
Assuming that future preferences can be diversified (i.e., diverse trends), we propose a diverse trends extractor and a time-aware mechanism to represent the possible preference trends of a given user with multiple vectors.
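A common way to exploit multiple preference vectors at inference time, sketched generically: score an item against every trend vector and keep the best match. This is standard multi-interest scoring, not the paper's time-aware mechanism:

```python
def multi_vector_score(trend_vectors, item_vector):
    """Score an item against several user trend vectors, taking the best
    match, so that distinct vectors can cover divergent preferences."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))
    return max(dot(t, item_vector) for t in trend_vectors)

trends = [[1.0, 0.0], [0.0, 1.0]]  # two divergent preference trends
print(multi_vector_score(trends, [0.0, 0.9]))  # 0.9, via the second trend
```

A single averaged user vector would blur the two trends together; the max over vectors lets either trend claim items on its own.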
no code implementations • 23 Oct 2020 • Jianren Wang, Yujie Lu, Hang Zhao
Developing agents that can perform complex control tasks from high dimensional observations such as pixels is challenging due to difficulties in learning dynamics efficiently.
no code implementations • PACLIC 2018 • Yujie Lu, Tatsunori Mori
In this paper, we propose a new deep learning paradigm to assimilate the differences between languages for MSA.