Search Results for author: Kun Yan

Found 13 papers, 4 papers with code

Voila-A: Aligning Vision-Language Models with User's Gaze Attention

no code implementations • 22 Dec 2023 • Kun Yan, Lei Ji, Zeyu Wang, Yuntao Wang, Nan Duan, Shuai Ma

In this paper, we introduce gaze information, feasibly collected by AR or VR devices, as a proxy for human attention to guide VLMs and propose a novel approach, Voila-A, for gaze alignment to enhance the interpretability and effectiveness of these models in real-world applications.

KU-DMIS-MSRA at RadSum23: Pre-trained Vision-Language Model for Radiology Report Summarization

no code implementations • 10 Jul 2023 • Gangwoo Kim, Hajung Kim, Lei Ji, Seongsu Bae, Chanhwi Kim, Mujeen Sung, Hyunjae Kim, Kun Yan, Eric Chang, Jaewoo Kang

In this paper, we introduce CheXOFA, a new pre-trained vision-language model (VLM) for the chest X-ray domain.

Language Modelling

GroundNLQ @ Ego4D Natural Language Queries Challenge 2023

1 code implementation • 27 Jun 2023 • Zhijian Hou, Lei Ji, Difei Gao, Wanjun Zhong, Kun Yan, Chao Li, Wing-Kwong Chan, Chong-Wah Ngo, Nan Duan, Mike Zheng Shou

Motivated by this, we leverage a two-stage pre-training strategy to train egocentric feature extractors and the grounding model on video narrations, and further fine-tune the model on annotated data.

Natural Language Queries

Two-shot Video Object Segmentation

1 code implementation • CVPR 2023 • Kun Yan, Xiao Li, Fangyun Wei, Jinglu Wang, Chenbin Zhang, Ping Wang, Yan Lu

The underlying idea is to generate pseudo labels for unlabeled frames during training and to optimize the model on the combination of labeled and pseudo-labeled data.

Object Pseudo Label +5

An Efficient COarse-to-fiNE Alignment Framework @ Ego4D Natural Language Queries Challenge 2022

no code implementations • 16 Nov 2022 • Zhijian Hou, Wanjun Zhong, Lei Ji, Difei Gao, Kun Yan, Wing-Kwong Chan, Chong-Wah Ngo, Zheng Shou, Nan Duan

This technical report describes the CONE approach for the Ego4D Natural Language Queries (NLQ) Challenge at ECCV 2022.

Contrastive Learning Natural Language Queries

HORIZON: High-Resolution Semantically Controlled Panorama Synthesis

no code implementations • 10 Oct 2022 • Kun Yan, Lei Ji, Chenfei Wu, Jian Liang, Ming Zhou, Nan Duan, Shuai Ma

Panorama synthesis endeavors to craft captivating 360-degree visual landscapes, immersing users in the heart of virtual worlds.

Vocal Bursts Intensity Prediction

CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding

1 code implementation • 22 Sep 2022 • Zhijian Hou, Wanjun Zhong, Lei Ji, Difei Gao, Kun Yan, Wing-Kwong Chan, Chong-Wah Ngo, Zheng Shou, Nan Duan

This paper tackles an emerging and challenging problem of long video temporal grounding (VTG) that localizes video moments related to a natural language (NL) query.

Contrastive Learning Video Grounding

Inferring Prototypes for Multi-Label Few-Shot Image Classification with Word Vector Guided Attention

no code implementations • 2 Dec 2021 • Kun Yan, Chenbin Zhang, Jun Hou, Ping Wang, Zied Bouraoui, Shoaib Jameel, Steven Schockaert

A key feature of the multi-label setting is that images often have multiple labels, which typically refer to different regions of the image.

Descriptive Few-Shot Image Classification +1

Control Image Captioning Spatially and Temporally

no code implementations • ACL 2021 • Kun Yan, Lei Ji, Huaishao Luo, Ming Zhou, Nan Duan, Shuai Ma

Moreover, the controllability and explainability of LoopCAG are validated by analyzing spatial and temporal sensitivity during the generation process.

Ranked #1 on Image Captioning on Localized Narratives

Contrastive Learning Image Captioning +1

CETransformer: Casual Effect Estimation via Transformer Based Representation Learning

no code implementations • 19 Jul 2021 • Zhenyu Guo, Shuai Zheng, Zhizhe Liu, Kun Yan, Zhenfeng Zhu

Treatment effect estimation, which refers to the estimation of causal effects and aims to measure the strength of the causal relationship, is of great importance in many fields but is a challenging problem in practice.

Counterfactual Representation Learning +1

Aligning Visual Prototypes with BERT Embeddings for Few-Shot Learning

no code implementations • 21 May 2021 • Kun Yan, Zied Bouraoui, Ping Wang, Shoaib Jameel, Steven Schockaert

While the use of class names has already been explored in previous work, our approach differs in two key aspects.

Cross-Lingual Word Embeddings Few-Shot Learning +2

Few-shot Image Classification with Multi-Facet Prototypes

no code implementations • 1 Feb 2021 • Kun Yan, Zied Bouraoui, Ping Wang, Shoaib Jameel, Steven Schockaert

The aim of few-shot learning (FSL) is to learn how to recognize image categories from a small number of training examples.

Classification Few-Shot Image Classification +2

Gate Decorator: Global Filter Pruning Method for Accelerating Deep Convolutional Neural Networks

2 code implementations • NeurIPS 2019 • Zhonghui You, Kun Yan, Jinmian Ye, Meng Ma, Ping Wang

When the scaling factor is set to zero, it is equivalent to removing the corresponding filter.
