Search Results for author: Yutong Lu

Found 16 papers, 6 papers with code

Exploring Low-Resource Medical Image Classification with Weakly Supervised Prompt Learning

no code implementations • 6 Feb 2024 • Fudan Zheng, Jindong Cao, Weijiang Yu, Zhiguang Chen, Nong Xiao, Yutong Lu

The weakly supervised prompt learning model only utilizes the classes of images in the dataset to guide the learning of the specific class vector in the prompt, while the learning of other context vectors in the prompt requires no manual annotations for guidance.

Few-Shot Learning, Image Classification, +3

Intensive Vision-guided Network for Radiology Report Generation

no code implementations • 6 Feb 2024 • Fudan Zheng, Mengfei Li, Ying Wang, Weijiang Yu, Ruixuan Wang, Zhiguang Chen, Nong Xiao, Yutong Lu

Given the above limitation in feature extraction, we propose a Globally-intensive Attention (GIA) module in the medical image encoder to simulate and integrate multi-view vision perception.

Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models

1 code implementation • 31 Aug 2023 • Yupan Huang, Zaiqiao Meng, Fangyu Liu, Yixuan Su, Nigel Collier, Yutong Lu

Our experiments validate the effectiveness of SparklesChat in understanding and reasoning across multiple images and dialogue turns.

Instruction Following, Visual Reasoning

Crystal Structure Prediction by Joint Equivariant Diffusion

1 code implementation • NeurIPS 2023 • Rui Jiao, Wenbing Huang, Peijia Lin, Jiaqi Han, Pin Chen, Yutong Lu, Yang Liu

To be specific, DiffCSP jointly generates the lattice and atom coordinates for each crystal by employing a periodic-E(3)-equivariant denoising model, to better model the crystal geometry.

Denoising

PSLT: A Light-weight Vision Transformer with Ladder Self-Attention and Progressive Shift

no code implementations • 7 Apr 2023 • Gaojie Wu, Wei-Shi Zheng, Yutong Lu, Qi Tian

In this work, we propose a ladder self-attention block with multiple branches and a progressive shift mechanism to develop a light-weight transformer backbone that requires fewer computing resources (e.g., a relatively small number of parameters and FLOPs), termed Progressive Shift Ladder Transformer (PSLT).

Image Classification, Person Re-Identification

Company Competition Graph

no code implementations • 1 Apr 2023 • Yanci Zhang, Yutong Lu, Haitao Mao, Jiawei Huang, Cien Zhang, Xinyi Li, Rui Dai

Based on the output from our system, we construct a knowledge graph with more than 700 nodes and 1200 edges.

Knowledge Graphs

Form 10-K Itemization

no code implementations • 18 Feb 2023 • Yanci Zhang, Mengjia Xia, Mingyang Li, Haitao Mao, Yutong Lu, Yupeng Lan, Jinlin Ye, Rui Dai

With the segmented Item sections, NLP techniques can be applied directly to the Item sections relevant to downstream tasks.

Retrieval

Co-trading networks for modeling dynamic interdependency structures and estimating high-dimensional covariances in US equity markets

no code implementations • 18 Feb 2023 • Yutong Lu, Gesine Reinert, Mihai Cucuringu

The time proximity of trades across stocks reveals interesting topological structures of the equity market in the United States.

SAIH: A Scalable Evaluation Methodology for Understanding AI Performance Trend on HPC Systems

no code implementations • 7 Dec 2022 • Jiangsu Du, Dongsheng Li, Yingpeng Wen, Jiazhi Jiang, Dan Huang, Xiangke Liao, Yutong Lu

In this paper, we propose a scalable evaluation methodology (SAIH) for analyzing the AI performance trend of HPC systems by scaling the problem sizes of customized AI applications.

EnergonAI: An Inference System for 10-100 Billion Parameter Transformer Models

no code implementations • 6 Sep 2022 • Jiangsu Du, Ziming Liu, Jiarui Fang, Shenggui Li, Yongbin Li, Yutong Lu, Yang You

Although the AI community has expanded the model scale to the trillion-parameter level, the practical deployment of 10-100 billion parameter models is still uncertain due to latency, throughput, and memory constraints.

Blocking

LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking

2 code implementations • 18 Apr 2022 • Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei

In this paper, we propose LayoutLMv3 to pre-train multimodal Transformers for Document AI with unified text and image masking.

Document AI, Document Image Classification, +10

A Picture is Worth a Thousand Words: A Unified System for Diverse Captions and Rich Images Generation

1 code implementation • 19 Oct 2021 • Yupan Huang, Bei Liu, Jianlong Fu, Yutong Lu

In this work, we demonstrate such an AI creation system to produce both diverse captions and rich images.

Unifying Multimodal Transformer for Bi-directional Image and Text Generation

1 code implementation • 19 Oct 2021 • Yupan Huang, Hongwei Xue, Bei Liu, Yutong Lu

We adopt Transformer as our unified architecture for its strong performance and task-agnostic design.

Text Generation, Text-to-Image Generation

Self-Motivated Communication Agent for Real-World Vision-Dialog Navigation

no code implementations • ICCV 2021 • Yi Zhu, Yue Weng, Fengda Zhu, Xiaodan Liang, Qixiang Ye, Yutong Lu, Jianbin Jiao

Vision-Dialog Navigation (VDN) requires an agent to ask questions and navigate following the human responses to find target objects.

Imitation Learning, Navigate

Decoupling Localization and Classification in Single Shot Temporal Action Detection

1 code implementation • 16 Apr 2019 • Yupan Huang, Qi Dai, Yutong Lu

Each branch produces a set of action anchor layers by applying deconvolution to the feature maps of the main stream.

Action Detection, Classification, +2
