no code implementations • 19 Dec 2019 • Weisong Wen, Yiyang Zhou, Guohao Zhang, Saman Fahandezh-Saadi, Xiwei Bai, Wei Zhan, Masayoshi Tomizuka, Li-Ta Hsu
Mapping and localization is a critical module of autonomous driving, and significant achievements have been reached in this field.
no code implementations • 7 Mar 2020 • Zining Wang, Di Feng, Yiyang Zhou, Lars Rosenbaum, Fabian Timm, Klaus Dietmayer, Masayoshi Tomizuka, Wei Zhan
Based on the spatial distribution, we further propose an extension of IoU, called the Jaccard IoU (JIoU), as a new evaluation metric that incorporates label uncertainty.
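The JIoU's exact label-uncertainty weighting is defined in the paper; the standard IoU it generalizes, however, is simple to state. A minimal sketch for axis-aligned 2D boxes (function and argument names are illustrative):

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Intersection rectangle corners
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```

JIoU replaces these hard box areas with spatial label-uncertainty distributions, so the comparison degrades gracefully when ground-truth boxes are ambiguous.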
no code implementations • 18 Dec 2020 • Di Feng, Zining Wang, Yiyang Zhou, Lars Rosenbaum, Fabian Timm, Klaus Dietmayer, Masayoshi Tomizuka, Wei Zhan
As a result, an in-depth evaluation among different object detection methods remains challenging, and the training process of object detectors is sub-optimal, especially in probabilistic object detection.
no code implementations • 8 Jul 2022 • Akio Kodaira, Yiyang Zhou, Pengwei Zang, Wei Zhan, Masayoshi Tomizuka
With information from multiple input modalities, sensor fusion-based algorithms usually outperform their single-modality counterparts in robotics.
no code implementations • 26 Sep 2022 • Philip Jacobson, Yiyang Zhou, Wei Zhan, Masayoshi Tomizuka, Ming C. Wu
In this work, we propose a novel approach Center Feature Fusion (CFF), in which we leverage center-based detection networks in both the camera and LiDAR streams to identify relevant object locations.
no code implementations • 26 Feb 2023 • Yiyang Zhou, Qinghai Zheng, Wenbiao Yan, Yifei Wang, Pengcheng Shi, Jihua Zhu
Further, we design a multi-level consistency collaboration strategy, which uses the consistent information of the semantic space as a self-supervised signal to collaborate with the cluster assignments in the feature space.
Ranked #1 on Multiview Clustering on Fashion-MNIST
no code implementations • 28 Feb 2023 • Wenbiao Yan, Jihua Zhu, Yiyang Zhou, Yifei Wang, Qinghai Zheng
In this way, the semantic consistency learned from multi-view data improves the information bottleneck, allowing it to distinguish consistent information more precisely and to learn a unified feature representation with more discriminative consistent information for clustering.
no code implementations • 8 Mar 2023 • Yiyang Zhou, Qinghai Zheng, Shunshun Bai, Jihua Zhu
In this work, we devote ourselves to the challenging task of Unsupervised Multi-view Representation Learning (UMRL), which requires learning a unified feature representation from multiple views in an unsupervised manner.
no code implementations • 16 May 2023 • Pengcheng Shi, Haozhe Cheng, Xu Han, Yiyang Zhou, Jihua Zhu
To tackle these challenges, we propose an information interaction-based generative network for point cloud completion ($\mathbf{DualGenerator}$).
no code implementations • 16 May 2023 • Yifei Wang, Yiyang Zhou, Jihua Zhu, Xinyuan Liu, Wenbiao Yan, Zhiqiang Tian
Label distribution learning (LDL) is a new machine learning paradigm for solving label ambiguity.
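In LDL, each instance is annotated with a full distribution over labels rather than a single class, and a model is trained to match that distribution; a common training signal is the KL divergence between the annotated and predicted distributions. A generic sketch (not this paper's specific objective):

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete label distributions.

    p: annotated label distribution (sums to 1)
    q: predicted label distribution (sums to 1)
    eps guards against log(0) for zero-probability labels.
    """
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))
```

Minimizing this loss drives the predicted distribution toward the annotation, which is what lets LDL express label ambiguity instead of forcing a single hard label.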
no code implementations • 18 Aug 2023 • Pengcheng Shi, Jie Zhang, Haozhe Cheng, Junyang Wang, Yiyang Zhou, Chenlin Zhao, Jihua Zhu
Specifically, we propose a plug-and-play Overlap Bias Matching Module (OBMM) comprising two integral components, overlap sampling module and bias prediction module.
no code implementations • 19 Aug 2023 • Jie Zhang, Pengcheng Shi, Zaiwang Gu, Yiyang Zhou, Zhi Wang
In this paper, we present Semantic-Human, a novel method that achieves both photorealistic details and viewpoint-consistent human parsing for the neural rendering of humans.
1 code implementation • 29 Aug 2023 • Junyang Wang, Yiyang Zhou, Guohai Xu, Pengcheng Shi, Chenlin Zhao, Haiyang Xu, Qinghao Ye, Ming Yan, Ji Zhang, Jihua Zhu, Jitao Sang, Haoyu Tang
In this paper, we propose Hallucination Evaluation based on Large Language Models (HaELM), an LLM-based hallucination evaluation framework.
1 code implementation • 17 Mar 2022 • Jinhyung Park, Chenfeng Xu, Yiyang Zhou, Masayoshi Tomizuka, Wei Zhan
While numerous 3D detection works leverage the complementary relationship between RGB images and point clouds, developments in the broader framework of semi-supervised object recognition remain uninfluenced by multi-modal fusion.
1 code implementation • 18 Feb 2024 • Yiyang Zhou, Chenhang Cui, Rafael Rafailov, Chelsea Finn, Huaxiu Yao
This procedure is not perfect and can cause the model to hallucinate, i.e., provide answers that do not accurately reflect the image, even when the core LLM is highly factual and the vision backbone has sufficiently complete representations.
1 code implementation • 19 Jul 2022 • Guangming Wang, Yunzhe Hu, Zhe Liu, Yiyang Zhou, Masayoshi Tomizuka, Wei Zhan, Hesheng Wang
Our proposed model surpasses all existing methods by at least 38.2% on the FlyingThings3D dataset and 24.7% on the KITTI Scene Flow dataset for the EPE3D metric.
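EPE3D (3D end-point error) is the standard scene-flow metric: the mean Euclidean distance between predicted and ground-truth 3D flow vectors over all points. A minimal sketch (function name is illustrative):

```python
import math

def epe3d(pred_flows, gt_flows):
    """Mean Euclidean distance between predicted and ground-truth 3D flow vectors.

    pred_flows, gt_flows: equal-length sequences of (dx, dy, dz) tuples,
    one flow vector per point in the cloud.
    """
    errors = [math.dist(p, g) for p, g in zip(pred_flows, gt_flows)]
    return sum(errors) / len(errors)
```

A lower EPE3D means the predicted per-point motion is closer to the ground truth, so the percentages above are relative reductions in this error.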
1 code implementation • 27 Nov 2023 • Haoqin Tu, Chenhang Cui, Zijun Wang, Yiyang Zhou, Bingchen Zhao, Junlin Han, Wangchunshu Zhou, Huaxiu Yao, Cihang Xie
Different from prior studies, we shift our focus from evaluating standard performance to introducing a comprehensive safety evaluation suite, covering both out-of-distribution (OOD) generalization and adversarial robustness.
1 code implementation • 6 Nov 2023 • Chenhang Cui, Yiyang Zhou, Xinyu Yang, Shirley Wu, Linjun Zhang, James Zou, Huaxiu Yao
To bridge this gap, we introduce a new benchmark, namely, the Bias and Interference Challenges in Visual Language Models (Bingo).
1 code implementation • 6 Mar 2021 • Di Feng, Yiyang Zhou, Chenfeng Xu, Masayoshi Tomizuka, Wei Zhan
Detecting dynamic objects and predicting static road information such as drivable areas and ground heights are crucial for safe autonomous driving.
1 code implementation • 1 Oct 2023 • Yiyang Zhou, Chenhang Cui, Jaehong Yoon, Linjun Zhang, Zhun Deng, Chelsea Finn, Mohit Bansal, Huaxiu Yao
Large vision-language models (LVLMs) have shown remarkable abilities in understanding visual information with human languages.
1 code implementation • 27 Apr 2023 • Qinghao Ye, Haiyang Xu, Guohai Xu, Jiabo Ye, Ming Yan, Yiyang Zhou, Junyang Wang, Anwen Hu, Pengcheng Shi, Yaya Shi, Chenliang Li, Yuanhong Xu, Hehong Chen, Junfeng Tian, Qi Qian, Ji Zhang, Fei Huang, Jingren Zhou
Our code, pre-trained model, instruction-tuned models, and evaluation set are available at https://github.com/X-PLUG/mPLUG-Owl.
Ranked #3 on Visual Question Answering (VQA) on HallusionBench
Visual Question Answering (VQA) • Zero-Shot Video Question Answering