Search Results for author: Lichen Zhao

Found 7 papers, 5 papers with code

Distortion-aware Transformer in 360° Salient Object Detection

1 code implementation · 7 Aug 2023 · Yinjie Zhao, Lichen Zhao, Qian Yu, Jing Zhang, Lu Sheng, Dong Xu

The first is a Distortion Mapping Module, which guides the model to pre-adapt to distorted features globally.


VL-SAT: Visual-Linguistic Semantics Assisted Training for 3D Semantic Scene Graph Prediction in Point Cloud

1 code implementation · CVPR 2023 · Ziqin Wang, Bowen Cheng, Lichen Zhao, Dong Xu, Yang Tang, Lu Sheng

Since 2D images provide rich semantics and scene graphs are naturally coupled with language, in this study, we propose a Visual-Linguistic Semantics Assisted Training (VL-SAT) scheme that can significantly empower 3DSSG prediction models to discriminate long-tailed and ambiguous semantic relations.

Ranked #1 on 3D scene graph generation on 3DSSG (using extra training data)


Democratizing Contrastive Language-Image Pre-training: A CLIP Benchmark of Data, Model, and Supervision

1 code implementation · 11 Mar 2022 · Yufeng Cui, Lichen Zhao, Feng Liang, Yangguang Li, Jing Shao

This is because researchers do not adopt consistent training recipes and often use different data, hampering fair comparison between methods.

3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds

no code implementations · CVPR 2022 · Daigang Cai, Lichen Zhao, Jing Zhang, Lu Sheng, Dong Xu

Observing that the 3D captioning task and the 3D grounding task naturally contain both shared and complementary information, in this work, we propose a unified framework to jointly solve these two distinct but closely related tasks in a synergistic fashion, consisting of both shared task-agnostic modules and lightweight task-specific modules.


Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm

3 code implementations · ICLR 2022 · Yangguang Li, Feng Liang, Lichen Zhao, Yufeng Cui, Wanli Ouyang, Jing Shao, Fengwei Yu, Junjie Yan

Recently, large-scale Contrastive Language-Image Pre-training (CLIP) has attracted unprecedented attention for its impressive zero-shot recognition ability and excellent transferability to downstream tasks.


3DVG-Transformer: Relation Modeling for Visual Grounding on Point Clouds

no code implementations · ICCV 2021 · Lichen Zhao, Daigang Cai, Lu Sheng, Dong Xu

Visual grounding on 3D point clouds is an emerging vision-and-language task that benefits various applications in understanding the 3D visual world.

