no code implementations • 22 Dec 2023 • Zhenyang Li, Fan Liu, Yinwei Wei, Zhiyong Cheng, Liqiang Nie, Mohan Kankanhalli
To obtain robust and independent representations for each factor associated with a specific attribute, we first disentangle the representations of features both within and across different modalities.
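As a rough illustration of within- and cross-modality disentanglement (a minimal sketch, not the paper's implementation; all module names, dimensions, and the penalty form are hypothetical), one can project each modality's features into attribute-specific factor subspaces and penalize correlation between factors:

```python
import torch
import torch.nn as nn


class FactorDisentangler(nn.Module):
    """Projects a modality's features into per-attribute factor subspaces."""

    def __init__(self, in_dim: int, factor_dim: int, num_factors: int):
        super().__init__()
        # One linear head per attribute factor (a hypothetical design choice).
        self.heads = nn.ModuleList(
            nn.Linear(in_dim, factor_dim) for _ in range(num_factors)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_dim) -> (batch, num_factors, factor_dim)
        return torch.stack([h(x) for h in self.heads], dim=1)


def independence_penalty(factors: torch.Tensor) -> torch.Tensor:
    """Pushes factors toward pairwise decorrelation within a batch."""
    b, k, d = factors.shape
    flat = factors.reshape(b, k * d)
    flat = flat - flat.mean(dim=0, keepdim=True)
    cov = (flat.T @ flat) / max(b - 1, 1)          # empirical covariance
    off_diag = ~torch.eye(k * d, dtype=torch.bool)
    return cov[off_diag].pow(2).mean()             # penalize cross-correlations


visual, textual = (FactorDisentangler(512, 64, 4) for _ in range(2))
v = visual(torch.randn(32, 512))                   # within-modality factors
t = textual(torch.randn(32, 512))
# Within-modality plus cross-modality independence terms:
loss = independence_penalty(v) + independence_penalty(torch.cat([v, t], dim=1))
```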
no code implementations • 29 Mar 2023 • Mingqing Wang, Jiawei Li, Zhenyang Li, Chengxiao Luo, Bin Chen, Shu-Tao Xia, Zhi Wang
In this work, the VQVAE focuses on feature extraction and reconstruction of images, while the transformers fit the manifold and locate anomalies in the latent space.
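A minimal sketch of this two-stage idea, under the assumption of a simple nearest-codebook quantizer and a small transformer scorer (module names and hyperparameters here are illustrative, not the paper's): tokens whose observed code has high negative log-likelihood under the transformer are flagged as anomalous.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class VectorQuantizer(nn.Module):
    """Maps encoder features to nearest-codebook indices (sketch only)."""

    def __init__(self, num_codes: int = 256, dim: int = 64):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, tokens, dim) -> (batch, tokens) codebook indices.
        book = self.codebook.weight.unsqueeze(0).expand(z.size(0), -1, -1)
        return torch.cdist(z, book).argmin(dim=-1)


class LatentScorer(nn.Module):
    """Transformer that scores how likely each latent token is."""

    def __init__(self, num_codes: int = 256, dim: int = 64):
        super().__init__()
        self.embed = nn.Embedding(num_codes, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, num_codes)

    def anomaly_score(self, idx: torch.Tensor) -> torch.Tensor:
        logits = self.head(self.encoder(self.embed(idx)))
        # Negative log-likelihood of each observed code; high = anomalous.
        return F.cross_entropy(logits.transpose(1, 2), idx, reduction="none")


vq, scorer = VectorQuantizer(), LatentScorer()
feats = torch.randn(2, 49, 64)    # e.g., a flattened 7x7 encoder feature map
scores = scorer.anomaly_score(vq(feats))    # (2, 49) per-token scores
```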
no code implementations • 4 Feb 2023 • Zhenyang Li, Yangyang Guo, Kejie Wang, Fan Liu, Liqiang Nie, Mohan Kankanhalli
Visual Commonsense Reasoning (VCR) remains a significant yet challenging research problem in the realm of visual reasoning.
no code implementations • 30 Sep 2022 • Yizhou Zhao, Zhenyang Li, Xun Guo, Yan Lu
Temporal modeling is crucial for various video learning tasks.
no code implementations • 14 Jul 2022 • Boming Zhao, Bangbang Yang, Zhenyang Li, Zuoyue Li, Guofeng Zhang, Jiashu Zhao, Dawei Yin, Zhaopeng Cui, Hujun Bao
Expanding an existing tourist photo from a partially captured scene to a full scene is one of the desired experiences for photography applications.
1 code implementation • 21 Jun 2022 • Yikang Ding, Zhenyang Li, Dihe Huang, Zhiheng Li, Kai Zhang
Learning-based multi-view stereo (MVS) methods have made impressive progress and surpassed traditional methods in recent years.
1 code implementation • 25 Feb 2022 • Zhenyang Li, Yangyang Guo, Kejie Wang, Yinwei Wei, Liqiang Nie, Mohan Kankanhalli
Given that our framework is model-agnostic, we apply it to existing popular baselines and validate its effectiveness on the benchmark dataset.
1 code implementation • Findings (ACL) 2021 • Weidong Guo, Mingjun Zhao, Lusheng Zhang, Di Niu, Jinwen Luo, Zhenhua Liu, Zhenyang Li, Jianbo Tang
Language model pre-training based on large corpora has achieved tremendous success in terms of constructing enriched contextual representations and has led to significant performance gains on a diverse range of Natural Language Understanding (NLU) tasks.
no code implementations • 29 Jun 2020 • Long Chen, Lei Tong, Feixiang Zhou, Zheheng Jiang, Zhenyang Li, Jialin Lv, Junyu Dong, Huiyu Zhou
To investigate how underwater image enhancement methods influence subsequent underwater object detection tasks, we provide in this paper a large-scale underwater object detection dataset, named the OUC dataset, with both bounding box annotations and high-quality reference images.
1 code implementation • CVPR 2018 • Kirill Gavrilyuk, Amir Ghodrati, Zhenyang Li, Cees G. M. Snoek
This paper strives for pixel-level segmentation of actors and their actions in video content.
Ranked #13 on Referring Expression Segmentation on J-HMDB
no code implementations • CVPR 2017 • Zhenyang Li, Ran Tao, Efstratios Gavves, Cees G. M. Snoek, Arnold W. M. Smeulders
This paper strives to track a target object in a video.
Ranked #17 on Referring Expression Segmentation on J-HMDB
1 code implementation • 6 Jul 2016 • Zhenyang Li, Efstratios Gavves, Mihir Jain, Cees G. M. Snoek
We present a new architecture for end-to-end sequence learning of actions in video, which we call VideoLSTM.
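For intuition, a bare-bones LSTM over per-frame CNN features looks as follows; note this is a generic sequence-learning baseline, not the VideoLSTM architecture itself (which adds convolutional structure and attention), and the dimensions are illustrative:

```python
import torch
import torch.nn as nn


class FrameSequenceClassifier(nn.Module):
    """Generic LSTM over per-frame features; not VideoLSTM itself."""

    def __init__(self, feat_dim: int = 512, hidden: int = 256,
                 num_actions: int = 101):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, num_actions)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, feat_dim) per-frame CNN features.
        out, _ = self.lstm(frames)
        # Classify from the last time step (one design choice among many).
        return self.classifier(out[:, -1])


logits = FrameSequenceClassifier()(torch.randn(4, 30, 512))  # 4 clips, 30 frames
```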
no code implementations • 21 Apr 2016 • Roeland De Geest, Efstratios Gavves, Amir Ghodrati, Zhenyang Li, Cees Snoek, Tinne Tuytelaars
Third, the start of the action is unknown, so it is unclear over what time window the information should be integrated.
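One common way around the unknown start is to score every frame causally rather than committing to a fixed integration window; the sketch below (our illustration, not the paper's method, with hypothetical names and sizes) emits per-frame action logits from a recurrent model:

```python
import torch
import torch.nn as nn


class OnlineActionScorer(nn.Module):
    """Emits an action score at every frame; no fixed window needed."""

    def __init__(self, feat_dim: int = 512, hidden: int = 128,
                 num_actions: int = 10):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_actions)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, feat_dim); the GRU is causal, so the
        # score at step t depends only on frames up to t.
        out, _ = self.gru(frames)
        return self.head(out)    # (batch, time, num_actions)


scores = OnlineActionScorer()(torch.randn(2, 100, 512))  # per-frame logits
```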