no code implementations • 6 Feb 2024 • Fudan Zheng, Jindong Cao, Weijiang Yu, Zhiguang Chen, Nong Xiao, Yutong Lu
The weakly supervised prompt learning model uses only the image classes in the dataset to guide the learning of the class-specific vector in the prompt, while the remaining context vectors in the prompt are learned without any manual annotation.
no code implementations • 6 Feb 2024 • Fudan Zheng, Mengfei Li, Ying Wang, Weijiang Yu, Ruixuan Wang, Zhiguang Chen, Nong Xiao, Yutong Lu
Given the above limitation in feature extraction, we propose a Globally-intensive Attention (GIA) module in the medical image encoder to simulate and integrate multi-view vision perception.
1 code implementation • 31 Aug 2023 • Yupan Huang, Zaiqiao Meng, Fangyu Liu, Yixuan Su, Nigel Collier, Yutong Lu
Our experiments validate the effectiveness of SparklesChat in understanding and reasoning across multiple images and dialogue turns.
1 code implementation • NeurIPS 2023 • Rui Jiao, Wenbing Huang, Peijia Lin, Jiaqi Han, Pin Chen, Yutong Lu, Yang Liu
To be specific, DiffCSP jointly generates the lattice and atom coordinates for each crystal by employing a periodic-E(3)-equivariant denoising model to better capture the crystal geometry.
no code implementations • 7 Apr 2023 • Gaojie Wu, Wei-Shi Zheng, Yutong Lu, Qi Tian
In this work, we propose a ladder self-attention block with multiple branches and a progressive shift mechanism to develop a lightweight transformer backbone that requires fewer computing resources (e.g., a relatively small number of parameters and FLOPs), termed Progressive Shift Ladder Transformer (PSLT).
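The paper defines the full ladder self-attention block; as a rough, hypothetical illustration of the general idea behind a progressive shift, which rotates the token sequence by a growing offset before window partitioning so that attention windows overlap across successive blocks (all names here are illustrative, not from the paper):

```python
def shift_windows(tokens, window, offset):
    # Rotate the token sequence by `offset`, then partition into
    # fixed-size attention windows. A nonzero offset lets tokens that
    # sat at a window boundary in the previous block attend together.
    rotated = tokens[offset:] + tokens[:offset]
    return [rotated[i:i + window] for i in range(0, len(rotated), window)]

tokens = list(range(8))
print(shift_windows(tokens, 4, 0))  # [[0, 1, 2, 3], [4, 5, 6, 7]]
print(shift_windows(tokens, 4, 2))  # [[2, 3, 4, 5], [6, 7, 0, 1]]
```

Stacking blocks with increasing offsets gives every token a chance to exchange information across window boundaries without global attention.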
no code implementations • 1 Apr 2023 • Yanci Zhang, Yutong Lu, Haitao Mao, Jiawei Huang, Cien Zhang, Xinyi Li, Rui Dai
Based on the output from our system, we construct a knowledge graph with more than 700 nodes and 1200 edges.
no code implementations • 18 Feb 2023 • Yanci Zhang, Mengjia Xia, Mingyang Li, Haitao Mao, Yutong Lu, Yupeng Lan, Jinlin Ye, Rui Dai
With the segmented Item sections, NLP techniques can be applied directly to the Item sections relevant to downstream tasks.
no code implementations • 18 Feb 2023 • Yutong Lu, Gesine Reinert, Mihai Cucuringu
The time proximity of trades across stocks reveals interesting topological structures of the equity market in the United States.
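The paper builds its market networks from the time proximity of trades; a toy sketch of the underlying idea, connecting two stocks when any pair of their trades falls within a tolerance window (function and data names are hypothetical, and this is not the paper's actual construction):

```python
def co_trading_edges(trades, delta):
    # trades: dict mapping stock symbol -> list of trade timestamps
    # Returns the set of stock pairs with at least one pair of trades
    # occurring within `delta` seconds of each other.
    edges = set()
    names = sorted(trades)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if any(abs(ta - tb) <= delta
                   for ta in trades[a] for tb in trades[b]):
                edges.add((a, b))
    return edges

trades = {"AAA": [0.0, 5.0], "BBB": [0.1], "CCC": [9.0]}
print(co_trading_edges(trades, 0.5))  # {('AAA', 'BBB')}
```

The resulting edge set defines a graph whose topology can then be analyzed with standard network tools.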
no code implementations • 7 Dec 2022 • Jiangsu Du, Dongsheng Li, Yingpeng Wen, Jiazhi Jiang, Dan Huang, Xiangke Liao, Yutong Lu
In this paper, we propose a scalable evaluation methodology (SAIH) for analyzing the AI performance trend of HPC systems by scaling the problem sizes of customized AI applications.
no code implementations • 21 Sep 2022 • Yutong Lu, Gesine Reinert, Mihai Cucuringu
The time proximity of high-frequency trades can contain a salient signal.
no code implementations • 6 Sep 2022 • Jiangsu Du, Ziming Liu, Jiarui Fang, Shenggui Li, Yongbin Li, Yutong Lu, Yang You
Although the AI community has expanded model scale to the trillion-parameter level, practical deployment of 10-100 billion parameter models remains uncertain due to latency, throughput, and memory constraints.
2 code implementations • 18 Apr 2022 • Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei
In this paper, we propose \textbf{LayoutLMv3} to pre-train multimodal Transformers for Document AI with unified text and image masking.
Ranked #1 on Key Information Extraction on EPHOIE
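The pre-training objective pairs masked language modeling over word tokens with masked image modeling over image-patch ids under one scheme; a minimal, hypothetical sketch of such unified masking (the helper names and mask symbols are illustrative, not LayoutLMv3's actual code):

```python
MASK_TEXT, MASK_PATCH = "[MASK]", -1

def apply_masks(text_tokens, patch_ids, text_pos, patch_pos):
    # Replace the selected word tokens and image-patch ids with mask
    # symbols, recording the originals as reconstruction targets so a
    # single model can be trained to predict both modalities.
    t = [MASK_TEXT if i in text_pos else tok
         for i, tok in enumerate(text_tokens)]
    p = [MASK_PATCH if i in patch_pos else pid
         for i, pid in enumerate(patch_ids)]
    targets = {("text", i): text_tokens[i] for i in text_pos}
    targets.update({("patch", i): patch_ids[i] for i in patch_pos})
    return t, p, targets

t, p, targets = apply_masks(["total", ":", "$42"], [10, 11], {2}, {0})
print(t, p)  # ['total', ':', '[MASK]'] [-1, 11]
```

Training then minimizes the reconstruction loss over `targets` for both the masked text positions and the masked patch positions.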
1 code implementation • 19 Oct 2021 • Yupan Huang, Bei Liu, Jianlong Fu, Yutong Lu
In this work, we demonstrate such an AI creation system to produce both diverse captions and rich images.
1 code implementation • 19 Oct 2021 • Yupan Huang, Hongwei Xue, Bei Liu, Yutong Lu
We adopt Transformer as our unified architecture for its strong performance and task-agnostic design.
no code implementations • ICCV 2021 • Yi Zhu, Yue Weng, Fengda Zhu, Xiaodan Liang, Qixiang Ye, Yutong Lu, Jianbin Jiao
Vision-Dialog Navigation (VDN) requires an agent to ask questions and navigate following the human responses to find target objects.
1 code implementation • 16 Apr 2019 • Yupan Huang, Qi Dai, Yutong Lu
Each branch produces a set of action anchor layers by applying deconvolution to the feature maps of the main stream.
Ranked #26 on Temporal Action Localization on THUMOS’14
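Deconvolution (transposed convolution) expands a coarse temporal feature map into a finer one, from which anchor layers at multiple resolutions can be built; a pure-Python toy sketch of the 1-D operation itself (illustrative only, not the paper's implementation):

```python
def deconv1d(x, kernel, stride=2):
    # Transposed 1-D convolution: each input element scatters a scaled
    # copy of the kernel into the output, `stride` positions apart,
    # so the temporal length grows roughly by the stride factor.
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, v in enumerate(x):
        for j, k in enumerate(kernel):
            out[i * stride + j] += v * k
    return out

print(deconv1d([1.0, 2.0], [1.0, 1.0], stride=2))  # [1.0, 1.0, 2.0, 2.0]
```

With stride 2, a feature map of length T becomes roughly length 2T, giving each anchor layer a finer temporal granularity than the one before it.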