1 code implementation • 10 Jan 2024 • Yucheng Han, Na Zhao, Weiling Chen, Keng Teck Ma, Hanwang Zhang
Our DPKE enriches the knowledge of limited training data, particularly unlabeled data, from two perspectives: the data perspective and the feature perspective.
no code implementations • 21 Dec 2023 • Chi Zhang, Zhao Yang, Jiaxuan Liu, Yucheng Han, Xin Chen, Zebiao Huang, Bin Fu, Gang Yu
Recent advancements in large language models (LLMs) have led to the creation of intelligent agents capable of performing complex tasks.
1 code implementation • 15 Dec 2023 • Yingzhe Peng, Xu Yang, Haoxuan Ma, Shuo Xu, Chi Zhang, Yucheng Han, Hanwang Zhang
Moreover, during data construction, we use the LVLM intended for ICL implementation to validate the strength of each ICD sequence, resulting in a model-specific dataset; consequently, the ICD-LM trained on this dataset is also model-specific.
no code implementations • 27 Nov 2023 • Yucheng Han, Chi Zhang, Xin Chen, Xu Yang, Zhibin Wang, Gang Yu, Bin Fu, Hanwang Zhang
Next, we introduce ChartLlama, a multi-modal large language model trained on our created dataset.
1 code implementation • ICCV 2023 • Beier Zhu, Yulei Niu, Yucheng Han, Yue Wu, Hanwang Zhang
Thanks to large pre-trained vision-language models (VLMs) like CLIP, we can craft a zero-shot classifier by "prompt", e.g., the confidence score of an image being "[CLASS]" can be obtained by using the VLM-provided similarity measure between the image and the prompt sentence "a photo of a [CLASS]".
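The prompt-based zero-shot scoring described above can be sketched in a few lines. This is a minimal illustration of the scoring mechanism only: the embeddings below are toy vectors standing in for the outputs of CLIP's image and text encoders, and the temperature value is an assumption.

```python
import numpy as np

def zero_shot_scores(image_emb, text_embs, temperature=100.0):
    """CLIP-style zero-shot classification scores.

    Computes cosine similarity between an L2-normalized image
    embedding and one prompt embedding per class (e.g. the encoding
    of "a photo of a [CLASS]"), then scales and softmaxes the
    similarities into class probabilities.
    """
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = temperature * (txt @ img)     # one logit per class
    exp = np.exp(logits - logits.max())    # numerically stable softmax
    return exp / exp.sum()

# Toy 4-d embeddings (real CLIP embeddings are 512-d or larger).
image = np.array([1.0, 0.0, 0.0, 0.0])
prompts = np.array([
    [0.9, 0.1, 0.0, 0.0],   # prompt embedding close to the image
    [0.0, 1.0, 0.0, 0.0],   # orthogonal prompt embedding
])
probs = zero_shot_scores(image, prompts)   # first class dominates
```

In an actual pipeline the two inputs would come from the VLM's image and text encoders; only the similarity-plus-softmax scoring is shown here.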
1 code implementation • ICLR 2022 • Jieru Mei, Yucheng Han, Yutong Bai, Yixiao Zhang, Yingwei Li, Xianhang Li, Alan Yuille, Cihang Xie
Specifically, our modifications in Fast AdvProp are guided by the hypothesis that disentangled learning with adversarial examples is the key to performance improvements, while other training recipes (e.g., paired clean and adversarial training samples, multi-step adversarial attackers) could be largely simplified.
no code implementations • IEEE Transactions on Image Processing 2022 • Wencheng Zhu, Yucheng Han, Jiwen Lu, Jie Zhou
Then, we construct a temporal graph by using the aggregated representations of spatial graphs.
Ranked #1 on Video Summarization on TvSum (using extra training data)
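The spatial-to-temporal construction described above can be sketched as follows. This is a hedged illustration only: the abstract states that the temporal graph is built from aggregated representations of per-frame spatial graphs, but the mean-pooling aggregator and the cosine-similarity edge weights below are assumed choices, not the paper's exact design.

```python
import numpy as np

def temporal_graph(frame_node_feats):
    """Build a temporal graph from per-frame spatial graphs.

    Each spatial graph (an array of node features for one frame) is
    aggregated into a single frame-level vector via mean pooling
    (assumed choice). Frames become temporal nodes, and edge weights
    are cosine similarities between the aggregated representations.
    """
    nodes = np.stack([f.mean(axis=0) for f in frame_node_feats])
    unit = nodes / np.linalg.norm(nodes, axis=1, keepdims=True)
    adj = unit @ unit.T           # pairwise cosine similarity
    np.fill_diagonal(adj, 0.0)    # drop self-loops
    return nodes, adj

# Toy input: 3 frames whose spatial graphs have 5, 4, and 6 nodes
# with 8-d features each.
frames = [np.ones((5, 8)), np.ones((4, 8)), -np.ones((6, 8))]
nodes, adj = temporal_graph(frames)
```

The resulting adjacency matrix could then feed a graph network operating over time; that downstream step is not shown.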