Search Results for author: Yong Rui

Found 25 papers, 5 papers with code

Structure Matters: Tackling the Semantic Discrepancy in Diffusion Models for Image Inpainting

1 code implementation 29 Mar 2024 Haipeng Liu, Yang Wang, Biao Qian, Meng Wang, Yong Rui

Denoising diffusion probabilistic models for image inpainting add noise to the image texture during the forward process and recover the masked regions from the unmasked texture via the reverse denoising process. Although they generate meaningful semantics, existing methods suffer from a semantic discrepancy between masked and unmasked regions: the semantically dense unmasked texture is not completely degraded, while the masked regions turn to pure noise during the diffusion process, leading to a large discrepancy between them.

Denoising Image Inpainting

DomainVerse: A Benchmark Towards Real-World Distribution Shifts For Tuning-Free Adaptive Domain Generalization

no code implementations 5 Mar 2024 Feng Hou, Jin Yuan, Ying Yang, Yang Liu, Yang Zhang, Cheng Zhong, Zhongchao Shi, Jianping Fan, Yong Rui, Zhiqiang He

With the recent advance of vision-language models (VLMs), viewed as natural source models, the cross-domain task changes to directly adapting the pre-trained source model to arbitrary target domains equipped with prior domain knowledge; we name this task Adaptive Domain Generalization (ADG).

Domain Generalization

A Survey on Video Moment Localization

no code implementations 13 Jun 2023 Meng Liu, Liqiang Nie, Yunxiao Wang, Meng Wang, Yong Rui

Video moment localization, also known as video moment retrieval, aims to find a target segment within a video that is described by a given natural language query.

Moment Retrieval Retrieval +1

Epistemic Graph: A Plug-And-Play Module For Hybrid Representation Learning

no code implementations 30 May 2023 Jin Yuan, Yang Zhang, Yangzhou Du, Zhongchao Shi, Xin Geng, Jianping Fan, Yong Rui

In this paper, a novel Epistemic Graph Layer (EGLayer) is introduced to enable hybrid learning, enhancing the exchange of information between deep features and a structured knowledge graph.

Few-Shot Learning Knowledge Graphs +1

Learning cross space mapping via DNN using large scale click-through logs

no code implementations 26 Feb 2023 Wei Yu, Kuiyuan Yang, Yalong Bai, Hongxun Yao, Yong Rui

The image and query are mapped to a common vector space via these two parts respectively, and image-query similarity is naturally defined as an inner product of their mappings in the space.

Image Classification Image Retrieval +1

Delving Globally into Texture and Structure for Image Inpainting

1 code implementation 17 Sep 2022 Haipeng Liu, Yang Wang, Meng Wang, Yong Rui

Our model is orthogonal to prevailing approaches, such as Convolutional Neural Networks (CNNs), attention, and Transformer models, from the perspective of texture and structure information for image inpainting.

Image Inpainting

Self-Supervised Graph Neural Network for Multi-Source Domain Adaptation

1 code implementation 8 Apr 2022 Jin Yuan, Feng Hou, Yangzhou Du, Zhongchao Shi, Xin Geng, Jianping Fan, Yong Rui

Domain adaptation (DA) tackles scenarios in which the test data does not fully follow the same distribution as the training data, and multi-source domain adaptation (MSDA) is very attractive for real-world applications.

Domain Adaptation Self-Supervised Learning +1

Graph Attention Transformer Network for Multi-Label Image Classification

1 code implementation 8 Mar 2022 Jin Yuan, Shikai Chen, Yao Zhang, Zhongchao Shi, Xin Geng, Jianping Fan, Yong Rui

Subsequently, we design the graph attention transformer layer to transfer this adjacency matrix to adapt to the current domain.

Classification Graph Attention +2

A Survey on Food Computing

no code implementations 22 Aug 2018 Weiqing Min, Shuqiang Jiang, Linhu Liu, Yong Rui, Ramesh Jain

This is the first comprehensive survey that targets the study of computing technology for the food area and also offers a collection of research studies and technologies to benefit researchers and practitioners working in different food-related fields.

Computers and Society Multimedia

AI Oriented Large-Scale Video Management for Smart City: Technologies, Standards and Beyond

no code implementations 5 Dec 2017 Ling-Yu Duan, Yihang Lou, Shiqi Wang, Wen Gao, Yong Rui

To make deep neural network models practical for large-scale video analysis, unprecedented challenges in large-scale video data management still remain.

Management

Multi-Level Attention Networks for Visual Question Answering

no code implementations CVPR 2017 Dongfei Yu, Jianlong Fu, Tao Mei, Yong Rui

To solve the challenges, we propose a multi-level attention network for visual question answering that can simultaneously reduce the semantic gap by semantic attention and benefit fine-grained spatial inference by visual attention.

Question Answering Visual Question Answering

Enhancing Person Re-identification in a Self-trained Subspace

1 code implementation 20 Apr 2017 Xun Yang, Meng Wang, Richang Hong, Qi Tian, Yong Rui

To address this problem, in this paper, we propose a self-trained subspace learning paradigm for person re-ID which effectively utilizes both labeled and unlabeled data to learn a discriminative subspace where person images across disjoint camera views can be easily matched.

Person Re-Identification

MSR-VTT: A Large Video Description Dataset for Bridging Video and Language

no code implementations CVPR 2016 Jun Xu, Tao Mei, Ting Yao, Yong Rui

In this paper we present MSR-VTT (standing for "MSR-Video to Text"), a new large-scale video benchmark for video understanding, especially the emerging task of translating video to text.

Image Captioning Sentence +2

Joint Multiview Segmentation and Localization of RGB-D Images Using Depth-Induced Silhouette Consistency

no code implementations CVPR 2016 Chi Zhang, Zhiwei Li, Rui Cai, Hongyang Chao, Yong Rui

In this paper, we propose an RGB-D camera localization approach which takes an effective geometry constraint, i.e., silhouette consistency, into consideration.

Camera Localization Image Segmentation +2

Highlight Detection With Pairwise Deep Ranking for First-Person Video Summarization

no code implementations CVPR 2016 Ting Yao, Tao Mei, Yong Rui

The emergence of wearable devices such as portable cameras and smart glasses makes it possible to record life logging first-person videos.

Highlight Detection Video Summarization

Network Morphism

no code implementations 5 Mar 2016 Tao Wei, Changhu Wang, Yong Rui, Chang Wen Chen

The second requirement for this network morphism is its ability to deal with non-linearity in a network.

MORPH

Relaxing From Vocabulary: Robust Weakly-Supervised Deep Learning for Vocabulary-Free Image Tagging

no code implementations ICCV 2015 Jianlong Fu, Yue Wu, Tao Mei, Jinqiao Wang, Hanqing Lu, Yong Rui

The development of deep learning has given machines a capability comparable to that of human beings in recognizing a limited set of image categories.

Query Adaptive Similarity Measure for RGB-D Object Recognition

no code implementations ICCV 2015 Yanhua Cheng, Rui Cai, Chi Zhang, Zhiwei Li, Xin Zhao, Kaiqi Huang, Yong Rui

The reasons are twofold: (1) existing similarity measures are sensitive to object pose and scale changes, as well as intra-class variations; and (2) effectively fusing RGB and depth cues is still an open problem.

Object Object Recognition

Jointly Modeling Embedding and Translation to Bridge Video and Language

no code implementations CVPR 2016 Yingwei Pan, Tao Mei, Ting Yao, Houqiang Li, Yong Rui

Our proposed LSTM-E consists of three components: a 2-D and/or 3-D deep convolutional neural network for learning a powerful video representation, a deep RNN for generating sentences, and a joint embedding model for exploring the relationships between visual content and sentence semantics.

Sentence Translation

Visualizing and Comparing Convolutional Neural Networks

no code implementations 20 Dec 2014 Wei Yu, Kuiyuan Yang, Yalong Bai, Hongxun Yao, Yong Rui

Convolutional Neural Networks (CNNs) have achieved error rates comparable to those of well-trained humans on the ILSVRC2014 image classification task.

Classification General Classification +1
