Search Results for author: Kaicheng Yu

Found 32 papers, 16 papers with code

BioKGBench: A Knowledge Graph Checking Benchmark of AI Agent for Biomedical Science

1 code implementation 29 Jun 2024 Xinna Lin, Siqi Ma, Junjie Shan, Xiaojing Zhang, Shell Xu Hu, Tiannan Guo, Stan Z. Li, Kaicheng Yu

On a popular, widely used knowledge graph, we discover over 90 factual errors, which provide scenarios for agents to make discoveries and demonstrate the effectiveness of our approach.

AI Agent, Claim Verification, +4

Gentle-CLIP: Exploring Aligned Semantic In Low-Quality Multimodal Data With Soft Alignment

no code implementations 9 Jun 2024 Zijia Song, Zelin Zang, Yelin Wang, Guozheng Yang, Jiangbin Zheng, Kaicheng Yu, Wanyu Chen, Stan Z. Li

Multimodal fusion breaks through the barriers between diverse modalities and has already yielded numerous impressive results.

M3GIA: A Cognition Inspired Multilingual and Multimodal General Intelligence Ability Benchmark

1 code implementation 8 Jun 2024 Wei Song, Yadong Li, Jianhua Xu, Guowei Wu, Lingfeng Ming, Kexin Yi, Weihua Luo, Houyi Li, Yi Du, Fangda Guo, Kaicheng Yu

As recent multi-modality large language models (MLLMs) have shown formidable proficiency on various complex tasks, there has been increasing attention on debating whether these models could eventually mirror human intelligence.

Unleashing Generalization of End-to-End Autonomous Driving with Controllable Long Video Generation

no code implementations 3 Jun 2024 Enhui Ma, Lijun Zhou, Tao Tang, Zhan Zhang, Dong Han, Junpeng Jiang, Kun Zhan, Peng Jia, Xianpeng Lang, Haiyang Sun, Di Lin, Kaicheng Yu

Instead of randomly generating new data, we further design a sampling policy to let Delphi generate new data that are similar to those failure cases to improve the sample efficiency.

Autonomous Driving, Video Generation

AlignMiF: Geometry-Aligned Multimodal Implicit Field for LiDAR-Camera Joint Synthesis

1 code implementation CVPR 2024 Tao Tang, Guangrun Wang, Yixing Lao, Peng Chen, Jie Liu, Liang Lin, Kaicheng Yu, Xiaodan Liang

Through extensive experiments across various datasets and scenes, we demonstrate the effectiveness of our approach in facilitating better interaction between LiDAR and camera modalities within a unified neural field.

Novel View Synthesis

OpenSight: A Simple Open-Vocabulary Framework for LiDAR-Based Object Detection

no code implementations 12 Dec 2023 Hu Zhang, Jianhua Xu, Tao Tang, Haiyang Sun, Xin Yu, Zi Huang, Kaicheng Yu

OpenSight utilizes 2D-3D geometric priors for the initial discernment and localization of generic objects, followed by a more specific semantic interpretation of the detected objects.

cross-modal alignment, object-detection, +1

BEVHeight++: Toward Robust Visual Centric 3D Object Detection

no code implementations 28 Sep 2023 Lei Yang, Tao Tang, Jun Li, Peng Chen, Kun Yuan, Li Wang, Yi Huang, Xinyu Zhang, Kaicheng Yu

In essence, we regress the height to the ground to achieve a distance-agnostic formulation to ease the optimization process of camera-only perception methods.

3D Object Detection, Autonomous Driving, +2

Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training

1 code implementation CVPR 2024 Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao

In contrast, such privilege has not yet fully benefited 3D deep learning, mainly due to the limited availability of large-scale 3D datasets.

Ranked #3 on 3D Semantic Segmentation on SemanticKITTI (val mIoU metric, using extra training data)

3D Semantic Segmentation, LIDAR Semantic Segmentation, +1

FusionAD: Multi-modality Fusion for Prediction and Planning Tasks of Autonomous Driving

1 code implementation 2 Aug 2023 Tengju Ye, Wei Jing, Chunyong Hu, Shikun Huang, Lingping Gao, Fangzhen Li, Jingke Wang, Ke Guo, Wencong Xiao, Weibo Mao, Hang Zheng, Kun Li, Junbo Chen, Kaicheng Yu

Building a multi-modality, multi-task neural network for accurate and robust performance is a de facto standard in the perception tasks of autonomous driving.

Autonomous Driving

Dyn-E: Local Appearance Editing of Dynamic Neural Radiance Fields

no code implementations 24 Jul 2023 Shangzhan Zhang, Sida Peng, Yinji ShenTu, Qing Shuai, Tianrun Chen, Kaicheng Yu, Hujun Bao, Xiaowei Zhou

We extensively evaluate our approach on various scenes and show that our approach achieves spatially and temporally consistent editing results.

LiDAR-NeRF: Novel LiDAR View Synthesis via Neural Radiance Fields

1 code implementation 20 Apr 2023 Tang Tao, Longfei Gao, Guangrun Wang, Yixing Lao, Peng Chen, Hengshuang Zhao, Dayang Hao, Xiaodan Liang, Mathieu Salzmann, Kaicheng Yu

We address this challenge by formulating, to the best of our knowledge, the first differentiable end-to-end LiDAR rendering framework, LiDAR-NeRF, leveraging a neural radiance field (NeRF) to facilitate the joint learning of geometry and the attributes of 3D points.

3D Reconstruction, Novel LiDAR View Synthesis, +1

BEVHeight: A Robust Framework for Vision-based Roadside 3D Object Detection

1 code implementation CVPR 2023 Lei Yang, Kaicheng Yu, Tao Tang, Jun Li, Kun Yuan, Li Wang, Xinyu Zhang, Peng Chen

In essence, instead of predicting the pixel-wise depth, we regress the height to the ground to achieve a distance-agnostic formulation to ease the optimization process of camera-only perception methods.

3D Object Detection, Autonomous Driving, +1

Painting 3D Nature in 2D: View Synthesis of Natural Scenes from a Single Semantic Mask

no code implementations CVPR 2023 Shangzhan Zhang, Sida Peng, Tianrun Chen, Linzhan Mou, Haotong Lin, Kaicheng Yu, Yiyi Liao, Xiaowei Zhou

We introduce a novel approach that takes a single semantic mask as input to synthesize multi-view consistent color images of natural scenes, trained with a collection of single images from the Internet.

3D-Aware Image Synthesis

Learning Self-Regularized Adversarial Views for Self-Supervised Vision Transformers

1 code implementation 16 Oct 2022 Tao Tang, Changlin Li, Guangrun Wang, Kaicheng Yu, Xiaojun Chang, Xiaodan Liang

Despite the success, its development and application on self-supervised vision transformers have been hindered by several barriers, including the high search cost, the lack of supervision, and the unsuitable search space.

Data Augmentation, Image Retrieval, +3

NAS-Bench-Suite: NAS Evaluation is (Now) Surprisingly Easy

1 code implementation ICLR 2022 Yash Mehta, Colin White, Arber Zela, Arjun Krishnakumar, Guri Zabergja, Shakiba Moradian, Mahmoud Safari, Kaicheng Yu, Frank Hutter

The release of tabular benchmarks, such as NAS-Bench-101 and NAS-Bench-201, has significantly lowered the computational overhead for conducting scientific research in neural architecture search (NAS).

Image Classification, Neural Architecture Search, +4

An Analysis of Super-Net Heuristics in Weight-Sharing NAS

no code implementations 4 Oct 2021 Kaicheng Yu, René Ranftl, Mathieu Salzmann

Weight sharing promises to make neural architecture search (NAS) tractable even on commodity hardware.

Neural Architecture Search

Landmark Regularization: Ranking Guided Super-Net Training in Neural Architecture Search

1 code implementation CVPR 2021 Kaicheng Yu, Rene Ranftl, Mathieu Salzmann

Weight sharing has become a de facto standard in neural architecture search because it enables the search to be done on commodity hardware.

Neural Architecture Search

How to Train Your Super-Net: An Analysis of Training Heuristics in Weight-Sharing NAS

no code implementations 9 Mar 2020 Kaicheng Yu, Rene Ranftl, Mathieu Salzmann

Weight sharing promises to make neural architecture search (NAS) tractable even on commodity hardware.

Neural Architecture Search

Recurrent U-Net for Resource-Constrained Segmentation

no code implementations ICCV 2019 Wei Wang, Kaicheng Yu, Joachim Hugonot, Pascal Fua, Mathieu Salzmann

State-of-the-art segmentation methods rely on very deep networks that are not always easy to train without very large training datasets and tend to be relatively slow to run on standard GPUs.

Hand Segmentation, Road Segmentation, +1

Overcoming Multi-Model Forgetting

no code implementations ICLR 2019 Yassine Benyahia, Kaicheng Yu, Kamil Bennani-Smires, Martin Jaggi, Anthony Davison, Mathieu Salzmann, Claudiu Musat

We identify a phenomenon, which we refer to as multi-model forgetting, that occurs when sequentially training multiple deep networks with partially-shared parameters; the performance of previously-trained models degrades as one optimizes a subsequent one, due to the overwriting of shared parameters.

Neural Architecture Search

Beyond One Glance: Gated Recurrent Architecture for Hand Segmentation

no code implementations 27 Nov 2018 Wei Wang, Kaicheng Yu, Joachim Hugonot, Pascal Fua, Mathieu Salzmann

As evidenced by our results on standard hand segmentation benchmarks and on our own dataset, our approach outperforms these other, simpler recurrent segmentation techniques, as well as the state-of-the-art hand segmentation one.

Decoder, Hand Segmentation, +3

Statistically-motivated Second-order Pooling

1 code implementation ECCV 2018 Kaicheng Yu, Mathieu Salzmann

We then propose to make use of a square-root normalization, which makes the distribution of the resulting representation converge to a Gaussian, with which most classifiers of recent first-order networks comply.

Statistically Motivated Second Order Pooling

1 code implementation 23 Jan 2018 Kaicheng Yu, Mathieu Salzmann

Our approach is motivated by a statistical analysis of the network's activations, relying on operations that lead to a Gaussian-distributed final representation, as inherently used by first-order deep networks.

Second-order Convolutional Neural Networks

no code implementations 20 Mar 2017 Kaicheng Yu, Mathieu Salzmann

By performing linear combinations and element-wise nonlinear operations, these networks can be thought of as extracting solely first-order information from an input image.

Image Classification
