Search Results for author: Guang Chen

Found 75 papers, 38 papers with code

Genesis: Multimodal Driving Scene Generation with Spatio-Temporal and Cross-Modal Consistency

no code implementations 9 Jun 2025 Xiangyu Guo, Zhanqian Wu, Kaixin Xiong, Ziyang Xu, Lijun Zhou, Gangwei Xu, Shaoqing Xu, Haiyang Sun, Bing Wang, Guang Chen, Hangjun Ye, Wenyu Liu, Xinggang Wang

We present Genesis, a unified framework for joint generation of multi-view driving videos and LiDAR sequences with spatio-temporal and cross-modal consistency.

NeRF Scene Generation

ReCogDrive: A Reinforced Cognitive Framework for End-to-End Autonomous Driving

no code implementations 9 Jun 2025 Yongkang Li, Kaixin Xiong, Xiangyu Guo, Fang Li, Sixu Yan, Gangwei Xu, Lijun Zhou, Long Chen, Haiyang Sun, Bing Wang, Guang Chen, Hangjun Ye, Wenyu Liu, Xinggang Wang

Recent approaches attempt to address this challenge by leveraging the rich world knowledge of Vision-Language Models (VLMs), but these methods suffer from several limitations: (1) a significant domain gap between the pre-training data of VLMs and real-world driving data, (2) a dimensionality mismatch between the discrete language space and the continuous action space, and (3) the tendency of imitation learning to capture the average behavior present in the dataset, which may be suboptimal or even dangerous.

Imitation Learning NavSim +2

UrbanCraft: Urban View Extrapolation via Hierarchical Sem-Geometric Priors

no code implementations 29 May 2025 Tianhang Wang, Fan Lu, Sanqing Qu, Guo Yu, Shihang Du, Ya Wu, Yuan Huang, Guang Chen

Existing neural rendering-based urban scene reconstruction methods mainly focus on the Interpolated View Synthesis (IVS) setting, which synthesizes views close to the training camera trajectory.

Neural Rendering

AgentThink: A Unified Framework for Tool-Augmented Chain-of-Thought Reasoning in Vision-Language Models for Autonomous Driving

no code implementations 21 May 2025 Kangan Qian, Sicong Jiang, Yang Zhong, Ziang Luo, Zilin Huang, Tianze Zhu, Kun Jiang, Mengmeng Yang, Zheng Fu, Jinyu Miao, Yining Shi, He Zhe Lim, Li Liu, Tianbao Zhou, Huang Yu, Yifei Hu, Guang Li, Guang Chen, Hao Ye, Lijun Sun, Diange Yang

Vision-Language Models (VLMs) show promise for autonomous driving, yet their struggle with hallucinations, inefficient reasoning, and limited real-world validation hinders accurate perception and robust step-by-step reasoning.

Autonomous Driving

Beyond Intermediate States: Explaining Visual Redundancy through Language

1 code implementation 26 Mar 2025 Dingchen Yang, Bowen Cao, Anran Zhang, Weibo Gu, Winston Hu, Guang Chen

Multi-modal Large Language Models (MLLMs) often process thousands of visual tokens, which consume a significant portion of the context window and impose a substantial computational burden.

ChatBEV: A Visual Language Model that Understands BEV Maps

no code implementations 18 Mar 2025 Qingyao Xu, Siheng Chen, Guang Chen, Yanfeng Wang, Ya Zhang

Traffic scene understanding is essential for intelligent transportation systems and autonomous driving, ensuring safe and efficient vehicle operation.

Autonomous Driving Language Modeling +4

EmoDiffusion: Enhancing Emotional 3D Facial Animation with Latent Diffusion Models

no code implementations 14 Mar 2025 Yixuan Zhang, Qing Chang, Yuxi Wang, Guang Chen, Zhaoxiang Zhang, Junran Peng

Speech-driven 3D facial animation seeks to produce lifelike facial expressions that are synchronized with the speech content and its emotional nuances, finding applications in various multimedia fields.

Generative Multi-Agent Collaboration in Embodied AI: A Systematic Review

no code implementations 17 Feb 2025 Di Wu, Xian Wei, Guang Chen, Hao Shen, Xiangfeng Wang, Wenhao Li, Bo Jin

Embodied multi-agent systems (EMAS) have attracted growing attention for their potential to address complex, real-world challenges in areas such as logistics and robotics.

Range and Bird's Eye View Fused Cross-Modal Visual Place Recognition

1 code implementation 17 Feb 2025 Jianyi Peng, Fan Lu, Bin Li, Yuan Huang, Sanqing Qu, Guang Chen

Compared to single-modal VPR, this approach benefits from the widespread availability of RGB cameras and the robustness of point clouds in providing accurate spatial geometry and distance information.

Re-Ranking Triplet +1

RCP-Bench: Benchmarking Robustness for Collaborative Perception Under Diverse Corruptions

1 code implementation CVPR 2025 Shihang Du, Sanqing Qu, Tianhang Wang, Xudong Zhang, Yunwei Zhu, Jian Mao, Fan Lu, Qiao Lin, Guang Chen

Extensive experiments on 10 leading collaborative perception models reveal that, while these models perform well under ideal conditions, they are significantly affected by corruptions.

Benchmarking

Towards Low-Resource Harmful Meme Detection with LMM Agents

1 code implementation 8 Nov 2024 Jianzhao Huang, Hongzhan Lin, Ziyan Liu, Ziyang Luo, Guang Chen, Jing Ma

The proliferation of Internet memes in the age of social media necessitates effective identification of harmful ones.

Multimodal Reasoning

AIPatient: Simulating Patients with EHRs and LLM Powered Agentic Workflow

no code implementations 27 Sep 2024 Huizi Yu, Jiayan Zhou, Lingyao Li, Shan Chen, Jack Gallifant, Anye Shi, Xiang Li, Wenyue Hua, Mingyu Jin, Guang Chen, Yang Zhou, Zhao Li, Trisha Gupte, Ming-Li Chen, Zahra Azizi, Yongfeng Zhang, Themistocles L. Assimes, Xin Ma, Danielle S. Bitterman, Lin Lu, Lizhou Fan

Here, we developed AIPatient, an advanced simulated patient system with AIPatient Knowledge Graph (AIPatient KG) as the input and the Reasoning Retrieval-Augmented Generation (Reasoning RAG) agentic workflow as the generation backbone.

Question Answering RAG +2

Multi-Agent Reinforcement Learning for Autonomous Driving: A Survey

2 code implementations 19 Aug 2024 Ruiqi Zhang, Jing Hou, Florian Walter, Shangding Gu, Jiayi Guan, Florian Röhrbein, Yali Du, Panpan Cai, Guang Chen, Alois Knoll

Reinforcement Learning (RL) is a potent tool for sequential decision-making and has achieved performance surpassing human capabilities across many challenging real-world tasks.

Autonomous Driving Decision Making +6

WPN: An Unlearning Method Based on N-pair Contrastive Learning in Language Models

no code implementations 18 Aug 2024 Guitao Chen, Yunshen Wang, Hongye Sun, Guang Chen

Generative language models (LMs) offer numerous advantages but may produce inappropriate or harmful outputs due to the harmful knowledge acquired during pre-training.

Contrastive Learning

HGL: Hierarchical Geometry Learning for Test-time Adaptation in 3D Point Cloud Segmentation

1 code implementation 17 Jul 2024 Tianpei Zou, Sanqing Qu, Zhijun Li, Alois Knoll, Lianghua He, Guang Chen, Changjun Jiang

HGL comprises three complementary modules from local, global to temporal learning in a bottom-up manner. Technically, we first construct a local geometry learning module for pseudo-label generation.

Point Cloud Segmentation Pseudo Label +1

Embracing Events and Frames with Hierarchical Feature Refinement Network for Object Detection

1 code implementation 17 Jul 2024 Hu Cao, Zehua Zhang, Yan Xia, Xinyi Li, Jiahao Xia, Guang Chen, Alois Knoll

The core concept is the design of the coarse-to-fine fusion module, denoted as the cross-modality adaptive feature refinement (CAFR) module.

object-detection Object Detection

GeoNLF: Geometry guided Pose-Free Neural LiDAR Fields

no code implementations 8 Jul 2024 Weiyi Xue, Zehan Zheng, Fan Lu, Haiyun Wei, Guang Chen, Changjun Jiang

Based on this, we propose Geometry guided Neural LiDAR Fields (GeoNLF), a hybrid framework that alternates between global neural reconstruction and pure geometric pose optimization.

NeRF Novel View Synthesis +2

MFC-Bench: Benchmarking Multimodal Fact-Checking with Large Vision-Language Models

1 code implementation 17 Jun 2024 Shengkang Wang, Hongzhan Lin, Ziyang Luo, Zhen Ye, Guang Chen, Jing Ma

Large vision-language models (LVLMs) have significantly improved multimodal reasoning tasks, such as visual question answering and image captioning.

Benchmarking Fact Checking +5

Beyond Raw Videos: Understanding Edited Videos with Large Multimodal Model

1 code implementation 15 Jun 2024 Lu Xu, Sijie Zhu, Chunyuan Li, Chia-Wen Kuo, Fan Chen, Xinyao Wang, Guang Chen, Dawei Du, Ye Yuan, Longyin Wen

However, a large portion of videos in real-world applications are edited videos, e.g., users usually cut and add effects/modifications to the raw video before publishing it on social media platforms.

Question Answering Video Understanding +1

RCDN: Towards Robust Camera-Insensitivity Collaborative Perception via Dynamic Feature-based 3D Neural Modeling

no code implementations 27 May 2024 Tianhang Wang, Fan Lu, Zehan Zheng, Guang Chen, Changjun Jiang

To address the above problems, we propose RCDN, a Robust Camera-insensitivity collaborative perception method with a novel Dynamic feature-based 3D Neural modeling mechanism.

Neural Rendering

Risk Assessment for Nonlinear Cyber-Physical Systems under Stealth Attacks

no code implementations 4 May 2024 Guang Chen, Zhicong Sun, Yulong Ding, Shuang-Hua Yang

To comprehensively quantify these risks, we propose a framework that considers both the reachability of a system and the risk distribution of a scenario.

CofiPara: A Coarse-to-fine Paradigm for Multimodal Sarcasm Target Identification with Large Multimodal Models

1 code implementation 1 May 2024 Hongzhan Lin, Zixin Chen, Ziyang Luo, Mingfei Cheng, Jing Ma, Guang Chen

Current methods for Multimodal Sarcasm Target Identification (MSTI) predominantly focus on superficial indicators in an end-to-end manner, overlooking the nuanced understanding of multimodal sarcasm conveyed through both the text and image.

Language Modeling Language Modelling +3

Urban Architect: Steerable 3D Urban Scene Generation with Layout Prior

1 code implementation 10 Apr 2024 Fan Lu, Kwan-Yee Lin, Yan Xu, Hongsheng Li, Guang Chen, Changjun Jiang

(2) To handle the unbounded nature of urban scenes, we represent the 3D scene with a Scalable Hash Grid structure, incrementally adapting to the growing scale of urban scenes.

3D Generation Model Optimization +2

Pensieve: Retrospect-then-Compare Mitigates Visual Hallucination

1 code implementation 21 Mar 2024 Dingchen Yang, Bowen Cao, Guang Chen, Changjun Jiang

Multi-modal Large Language Models (MLLMs) demonstrate remarkable success across various vision-language tasks.

Hallucination MME +1

MAP: MAsk-Pruning for Source-Free Model Intellectual Property Protection

1 code implementation CVPR 2024 Boyang Peng, Sanqing Qu, Yong Wu, Tianpei Zou, Lianghua He, Alois Knoll, Guang Chen, Changjun Jiang

In this paper, we target a practical setting where only a well-trained source model is available and investigate how we can realize IP protection.

PCDepth: Pattern-based Complementary Learning for Monocular Depth Estimation by Best of Both Worlds

no code implementations 29 Feb 2024 Haotian Liu, Sanqing Qu, Fan Lu, Zongtao Bu, Florian Roehrbein, Alois Knoll, Guang Chen

Therefore, existing complementary learning approaches for MDE fuse intensity information from images and scene details from event data for better scene understanding.

Depth Prediction Monocular Depth Estimation +2

GarchingSim: An Autonomous Driving Simulator with Photorealistic Scenes and Minimalist Workflow

1 code implementation 28 Jan 2024 Liguo Zhou, Yinglei Song, Yichao Gao, Zhou Yu, Michael Sodamin, Hongshen Liu, Liang Ma, Lian Liu, Hao Liu, Yang Liu, Haichuan Li, Guang Chen, Alois Knoll

However, the availability of free and open-source simulators is limited, and the installation and configuration process can be daunting for beginners and interdisciplinary researchers.

Autonomous Driving

Spreeze: High-Throughput Parallel Reinforcement Learning Framework

no code implementations 11 Dec 2023 Jing Hou, Guang Chen, Ruiqi Zhang, Zhijun Li, Shangding Gu, Changjun Jiang

While existing parallel RL frameworks encompass a variety of RL algorithms and parallelization techniques, the excessively burdensome communication frameworks hinder the attainment of the hardware's limit for final throughput and training effects on a single desktop.

reinforcement-learning Reinforcement Learning +1

HDMNet: A Hierarchical Matching Network with Double Attention for Large-scale Outdoor LiDAR Point Cloud Registration

no code implementations 29 Oct 2023 Weiyi Xue, Fan Lu, Guang Chen

Specifically, a novel feature consistency enhanced double-soft matching network is introduced to achieve two-stage matching with high flexibility while enlarging the receptive field with high efficiency in a patch-to-patch manner, which significantly improves the registration performance.

Point Cloud Registration Pose Estimation

Dual-Scale Interest Extraction Framework with Self-Supervision for Sequential Recommendation

no code implementations 16 Oct 2023 Liangliang Chen, Hongzhan Lin, Jinshan Ma, Guang Chen

Nevertheless, existing approaches extract each interest independently from its corresponding sub-sequence while ignoring the global correlation of the entire interaction sequence. As a result, they may fail to capture the user's inherent preferences and generalize to potential interests, and they unavoidably make the recommended items homogeneous with historical behaviors.

Contrastive Learning Sequential Recommendation

Urban Radiance Field Representation with Deformable Neural Mesh Primitives

2 code implementations ICCV 2023 Fan Lu, Yan Xu, Guang Chen, Hongsheng Li, Kwan-Yee Lin, Changjun Jiang

To construct urban-level radiance fields efficiently, we design Deformable Neural Mesh Primitive (DNMP), and propose to parameterize the entire scene with such primitives.

Image Generation Novel View Synthesis

NeuralPCI: Spatio-temporal Neural Field for 3D Point Cloud Multi-frame Non-linear Interpolation

1 code implementation CVPR 2023 Zehan Zheng, Danni Wu, Ruisi Lu, Fan Lu, Guang Chen, Changjun Jiang

In light of these issues, we present NeuralPCI: an end-to-end 4D spatio-temporal Neural field for 3D Point Cloud Interpolation, which implicitly integrates multi-frame information to handle nonlinear large motions for both indoor and outdoor scenarios.

3D Point Cloud Interpolation Autonomous Driving

Text with Knowledge Graph Augmented Transformer for Video Captioning

no code implementations CVPR 2023 Xin Gu, Guang Chen, YuFei Wang, Libo Zhang, Tiejian Luo, Longyin Wen

Meanwhile, the internal stream is designed to exploit the multi-modality information in videos (e.g., the appearance of video frames, speech transcripts, and video captions) to ensure the quality of caption results.

Video Captioning

TMA: Temporal Motion Aggregation for Event-based Optical Flow

1 code implementation ICCV 2023 Haotian Liu, Guang Chen, Sanqing Qu, Yanping Zhang, Zhijun Li, Alois Knoll, Changjun Jiang

In this paper, we argue that temporal continuity is a vital element of event-based optical flow and propose a novel Temporal Motion Aggregation (TMA) approach to unlock its potential.

Event-based Optical Flow Optical Flow Estimation

Upcycling Models under Domain and Category Shift

3 code implementations CVPR 2023 Sanqing Qu, Tianpei Zou, Florian Roehrbein, Cewu Lu, Guang Chen, DaCheng Tao, Changjun Jiang

We examine the superiority of our GLC on multiple benchmarks with different category shift scenarios, including partial-set, open-set, and open-partial-set DA.

Clustering Source-Free Domain Adaptation +2

Modality-Agnostic Debiasing for Single Domain Generalization

no code implementations CVPR 2023 Sanqing Qu, Yingwei Pan, Guang Chen, Ting Yao, Changjun Jiang, Tao Mei

We validate the superiority of our MAD in a variety of single-DG scenarios with different modalities, including recognition on 1D texts, 2D images, 3D point clouds, and semantic segmentation on 2D images.

Data Augmentation Domain Generalization +1

SUPS: A Simulated Underground Parking Scenario Dataset for Autonomous Driving

1 code implementation 25 Feb 2023 Jiawei Hou, Qi Chen, Yurong Cheng, Guang Chen, Xiangyang Xue, Taiping Zeng, Jian Pu

However, there is a lack of underground parking scenario datasets with multiple sensors and well-labeled images that support both SLAM tasks and perception tasks, such as semantic segmentation and parking slot detection.

3D Reconstruction Autonomous Driving +4

A Human-Centered Safe Robot Reinforcement Learning Framework with Interactive Behaviors

no code implementations 25 Feb 2023 Shangding Gu, Alap Kshirsagar, Yali Du, Guang Chen, Jan Peters, Alois Knoll

Deployment of Reinforcement Learning (RL) algorithms for robotics applications in the real world requires ensuring the safety of the robot and its environment.

reinforcement-learning Reinforcement Learning (RL) +1

Dual-Stream Transformer for Generic Event Boundary Captioning

1 code implementation 7 Jul 2022 Xin Gu, Hanhua Ye, Guang Chen, YuFei Wang, Libo Zhang, Longyin Wen

This paper describes our champion solution for the CVPR 2022 Generic Event Boundary Captioning (GEBC) competition.

Boundary Captioning

A Review of Safe Reinforcement Learning: Methods, Theory and Applications

1 code implementation 20 May 2022 Shangding Gu, Long Yang, Yali Du, Guang Chen, Florian Walter, Jun Wang, Alois Knoll

To establish a good foundation for future safe RL research, in this paper, we provide a review of safe RL from the perspectives of methods, theories, and applications.

Autonomous Driving Decision Making +4

BMD: A General Class-balanced Multicentric Dynamic Prototype Strategy for Source-free Domain Adaptation

1 code implementation 6 Apr 2022 Sanqing Qu, Guang Chen, Jing Zhang, Zhijun Li, Wei He, DaCheng Tao

Source-free Domain Adaptation (SFDA) aims to adapt a pre-trained source model to the unlabeled target domain without accessing the well-labeled source data, which is a much more practical setting given data privacy, security, and transmission issues.

Clustering Pseudo Label +1

Unsupervised Domain Adaptation for Nighttime Aerial Tracking

2 code implementations CVPR 2022 Junjie Ye, Changhong Fu, Guangze Zheng, Danda Pani Paudel, Guang Chen

Previous advances in object tracking mostly reported on favorable illumination circumstances while neglecting performance at nighttime, which significantly impeded the development of related aerial robot applications.

Object Discovery Object Tracking +1

HRegNet: A Hierarchical Network for Large-scale Outdoor LiDAR Point Cloud Registration

1 code implementation ICCV 2021 Fan Lu, Guang Chen, Yinlong Liu, Lijun Zhang, Sanqing Qu, Shu Liu, Rongqi Gu

Extensive experiments are conducted on two large-scale outdoor LiDAR point cloud datasets to demonstrate the high accuracy and efficiency of the proposed HRegNet.

Point Cloud Registration

DMInet: An Accurate and Highly Flexible Deep Learning Framework for Drug Membrane Interaction with Membrane Selectivity

no code implementations 27 May 2021 Guang Chen

Inheriting from the coarse-grained Martini representation of organic molecules and combined with deep learning, DMInet has the potential for more accelerated high-throughput screening in drug discovery across a much larger chemical space than can be explored by physics-based simulations alone.

Drug Discovery

ACM-Net: Action Context Modeling Network for Weakly-Supervised Temporal Action Localization

2 code implementations 7 Apr 2021 Sanqing Qu, Guang Chen, Zhijun Li, Lijun Zhang, Fan Lu, Alois Knoll

Traditional methods mainly focus on separating foreground and background frames with only a single attention branch and class activation sequence.

Weakly Supervised Action Localization

Data Augmentation for Object Detection via Differentiable Neural Rendering

1 code implementation 4 Mar 2021 Guanghan Ning, Guang Chen, Chaowei Tan, Si Luo, Liefeng Bo, Heng Huang

We propose a new offline data augmentation method for object detection, which semantically interpolates the training data with novel views.

Data Augmentation Neural Rendering +4

NAST: Non-Autoregressive Spatial-Temporal Transformer for Time Series Forecasting

1 code implementation 10 Feb 2021 Kai Chen, Guang Chen, Dan Xu, Lijun Zhang, Yuyao Huang, Alois Knoll

Although the Transformer has achieved breakthrough success in many domains, especially in Natural Language Processing (NLP), applying it to time series forecasting is still a great challenge.

Time Series Time Series Forecasting

Lightweight Convolutional Neural Network with Gaussian-based Grasping Representation for Robotic Grasping Detection

no code implementations 25 Jan 2021 Hu Cao, Guang Chen, Zhijun Li, Jianjie Lin, Alois Knoll

Extensive experiments on two public grasping datasets, Cornell and Jacquard, demonstrate the state-of-the-art performance of our method in balancing accuracy and inference speed.

object-detection Robotic Grasping

PointINet: Point Cloud Frame Interpolation Network

1 code implementation 18 Dec 2020 Fan Lu, Guang Chen, Sanqing Qu, Zhijun Li, Yinlong Liu, Alois Knoll

Generally, the frame rates of mechanical LiDAR sensors are 10 to 20 Hz, which is much lower than other commonly used sensors like cameras.

3D Point Cloud Interpolation

MoNet: Motion-based Point Cloud Prediction Network

no code implementations 21 Nov 2020 Fan Lu, Guang Chen, Yinlong Liu, Zhijun Li, Sanqing Qu, Tianpei Zou

3D point clouds accurately model 3D information of the surrounding environment and are crucial for intelligent vehicles to perceive the scene.

Autonomous Driving Prediction

LAP-Net: Adaptive Features Sampling via Learning Action Progression for Online Action Detection

no code implementations 16 Nov 2020 Sanqing Qu, Guang Chen, Dan Xu, Jinhu Dong, Fan Lu, Alois Knoll

At each time step, this sampling strategy first estimates the current action progression and then decides what temporal ranges should be used to aggregate the optimal supplementary features.

Online Action Detection

RSKDD-Net: Random Sample-based Keypoint Detector and Descriptor

1 code implementation NeurIPS 2020 Fan Lu, Guang Chen, Yinlong Liu, Zhongnan Qu, Alois Knoll

To tackle the information loss of random sampling, we exploit a novel random dilation cluster strategy to enlarge the receptive field of each sampled point and an attention mechanism to aggregate the positions and features of neighbor points.

Point Cloud Registration Saliency Prediction

Efficient Pig Counting in Crowds with Keypoints Tracking and Spatial-aware Temporal Response Filtering

no code implementations 27 May 2020 Guang Chen, Shiwen Shen, Longyin Wen, Si Luo, Liefeng Bo

Existing methods focus only on pig counting from a single image, and their accuracy is challenged by several factors, including pig movements, occlusion, and overlapping.

Edge-computing

Gen-LaneNet: A Generalized and Scalable Approach for 3D Lane Detection

1 code implementation ECCV 2020 Yuliang Guo, Guang Chen, Peitao Zhao, Weide Zhang, Jinghao Miao, Jingao Wang, Tae Eun Choe

The method, inspired by the latest state-of-the-art 3D-LaneNet, is a unified framework solving image encoding, spatial transform of features and 3D lane prediction in a single network.

3D Lane Detection Image Segmentation +1

Indirect and Direct Training of Spiking Neural Networks for End-to-End Control of a Lane-Keeping Vehicle

no code implementations 10 Mar 2020 Zhenshan Bing, Claus Meschede, Guang Chen, Alois Knoll, Kai Huang

Building spiking neural networks (SNNs) based on biological synaptic plasticities holds promising potential for accomplishing fast and energy-efficient computing, which is beneficial to mobile robotic applications.

Q-Learning Reinforcement Learning

OVC-Net: Object-Oriented Video Captioning with Temporal Graph and Detail Enhancement

no code implementations 8 Mar 2020 Fangyi Zhu, Jenq-Neng Hwang, Zhanyu Ma, Guang Chen, Jun Guo

Thereafter, we construct a new dataset, providing consistent object-sentence pairs, to facilitate effective cross-modal learning.

Object Sentence +1

Mechanisms underlying the response of mouse cortical networks to optogenetic manipulation

no code implementations 1 Jul 2019 Alexandre Mahrach, Guang Chen, Nuo Li, Carl van Vreeswijk, David Hansel

Networks with three inhibitory populations and V1-like architecture account for the data in ALM layer 2/3.

Globally optimal vertical direction estimation in Atlanta World

1 code implementation 29 Apr 2019 Yinlong Liu, Alois Knoll, Guang Chen

Accordingly, we propose a vertical direction estimation method by considering the relationship between the vertical frame and horizontal frames.

A Novel Method for the Absolute Pose Problem with Pairwise Constraints

no code implementations 25 Mar 2019 Yinlong Liu, Xuechen Li, Manning Wang, Guang Chen, Zhijian Song, Alois Knoll

In this paper, we consider pairwise constraints and propose a globally optimal algorithm for solving the absolute pose estimation problem.

parameter estimation Pose Estimation +1

Salience Biased Loss for Object Detection in Aerial Images

no code implementations 18 Oct 2018 Peng Sun, Guang Chen, Guerdan Luke, Yi Shang

Experimental results show that our proposed loss function with the RetinaNet architecture outperformed other state-of-the-art object detection models by at least 4.31 mAP, and RetinaNet by 2.26 mAP at the same inference speed as RetinaNet.

Object object-detection +1

Deep Anticipation: Light Weight Intelligent Mobile Sensing in IoT by Recurrent Architecture

no code implementations 6 Dec 2017 Guang Chen, Shu Liu, Kejia Ren, Zhongnan Qu, Changhong Fu, Gereon Hinz, Alois Knoll

However, mobile sensing perception brings new challenges in how to efficiently analyze and intelligently interpret the deluge of IoT data in mission-critical services.

Hierarchical Latent Semantic Mapping for Automated Topic Generation

no code implementations 11 Nov 2015 Guorui Zhou, Guang Chen

Inspired by these algorithms, in this paper, we propose a novel method named Hierarchical Latent Semantic Mapping (HLSM), which automatically generates topics from a corpus.

Community Detection

Large-Scale Visual Font Recognition

no code implementations CVPR 2014 Guang Chen, Jianchao Yang, Hailin Jin, Jonathan Brandt, Eli Shechtman, Aseem Agarwala, Tony X. Han

This paper addresses the large-scale visual font recognition (VFR) problem, which aims at automatic identification of the typeface, weight, and slope of the text in an image or photo without any knowledge of content.

Font Recognition Image Categorization +1

Detection Evolution with Multi-order Contextual Co-occurrence

no code implementations CVPR 2013 Guang Chen, Yuanyuan Ding, Jing Xiao, Tony X. Han

The so-called (1st-order) context feature is computed as a set of randomized binary comparisons on the response map of the baseline object detector.

Object object-detection +1
