Search Results for author: Haotian Liu

Found 33 papers, 14 papers with code

Generalizable Face Landmarking Guided by Conditional Face Warping

1 code implementation 18 Apr 2024 Jiayi Liang, Haotian Liu, Hongteng Xu, Dixin Luo

Given a pair of real and stylized facial images, the conditional face warper predicts a warping field from the real face to the stylized one, in which the face landmarker predicts the ending points of the warping field and provides us with high-quality pseudo landmarks for the corresponding stylized facial images.

Domain Adaptation

Deep Cooperation in ISAC System: Resource, Node and Infrastructure Perspectives

no code implementations 5 Mar 2024 Zhiqing Wei, Haotian Liu, Zhiyong Feng, Huici Wu, Fan Liu, Qixun Zhang

With the mobile communication system evolving into the 6th generation (6G), the Internet of Everything (IoE) is becoming a reality, connecting humans, big data, and intelligent machines to support intelligent decision making and reconfiguring traditional industries and human life.

PCDepth: Pattern-based Complementary Learning for Monocular Depth Estimation by Best of Both Worlds

no code implementations 29 Feb 2024 Haotian Liu, Sanqing Qu, Fan Lu, Zongtao Bu, Florian Roehrbein, Alois Knoll, Guang Chen

Therefore, existing complementary learning approaches for MDE fuse intensity information from images and scene details from event data for better scene understanding.

Depth Prediction Monocular Depth Estimation +2

Edit One for All: Interactive Batch Image Editing

no code implementations 18 Jan 2024 Thao Nguyen, Utkarsh Ojha, Yuheng Li, Haotian Liu, Yong Jae Lee

With increased human control, it is now possible to edit an image in a plethora of ways; from specifying in text what we want to change, to straight up dragging the contents of the image in an interactive point-based manner.

Making Large Multimodal Models Understand Arbitrary Visual Prompts

no code implementations 1 Dec 2023 Mu Cai, Haotian Liu, Siva Karthik Mustikovela, Gregory P. Meyer, Yuning Chai, Dennis Park, Yong Jae Lee

Furthermore, we present ViP-Bench, a comprehensive benchmark to assess the capability of models in understanding visual prompts across multiple dimensions, enabling future research in this domain.

Visual Commonsense Reasoning Visual Prompting

Integrated Sensing and Communication Signal Processing Based on Compressed Sensing Over Unlicensed Spectrum Bands

no code implementations 4 Oct 2023 Haotian Liu, Zhiqing Wei, Fengyun Li, Yuewei Lin, Hanyang Qu, Huici Wu, Zhiyong Feng

The ISAC-enabled mobile communication system regularly operates in non-continuous spectrum bands due to crowded licensed frequency bands.

Carrier Aggregation Enabled Integrated Sensing and Communication Signal Design and Processing

no code implementations 25 Sep 2023 Zhiqing Wei, Haotian Liu, Xinyi Yang, Wangjun Jiang, Huici Wu, Xingwang Li, Zhiyong Feng

The future mobile communication systems will support intelligent applications such as Internet of Vehicles (IoV) and Extended Reality (XR).

Aligning Large Multimodal Models with Factually Augmented RLHF

no code implementations 25 Sep 2023 Zhiqing Sun, Sheng Shen, Shengcao Cao, Haotian Liu, Chunyuan Li, Yikang Shen, Chuang Gan, Liang-Yan Gui, Yu-Xiong Wang, Yiming Yang, Kurt Keutzer, Trevor Darrell

Large Multimodal Models (LMM) are built across modalities and the misalignment between two modalities can result in "hallucination", generating textual outputs that are not grounded by the multimodal information in context.

Hallucination Image Captioning +1

An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models

1 code implementation 18 Sep 2023 Yadong Lu, Chunyuan Li, Haotian Liu, Jianwei Yang, Jianfeng Gao, Yelong Shen

We find that scaling LMM consistently enhances model performance and improves language capabilities, and that the performance of LoRA/QLoRA tuning of LMM is comparable to that of full-model fine-tuning.

Visual Question Answering

Benchmarking and Analyzing Generative Data for Visual Recognition

no code implementations 25 Jul 2023 Bo Li, Haotian Liu, Liangyu Chen, Yong Jae Lee, Chunyuan Li, Ziwei Liu

Advancements in large pre-trained generative models have expanded their potential as effective data generators in visual recognition.

Benchmarking Retrieval

Generate Anything Anywhere in Any Scene

no code implementations 29 Jun 2023 Yuheng Li, Haotian Liu, Yangming Wen, Yong Jae Lee

Text-to-image diffusion models have attracted considerable interest due to their wide applicability across diverse fields.

Data Augmentation Object

LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day

no code implementations NeurIPS 2023 Chunyuan Li, Cliff Wong, Sheng Zhang, Naoto Usuyama, Haotian Liu, Jianwei Yang, Tristan Naumann, Hoifung Poon, Jianfeng Gao

In this paper, we propose a cost-efficient approach for training a vision-language conversational assistant that can answer open-ended research questions about biomedical images.

Instruction Following Language Modelling +2

Visual Instruction Tuning

9 code implementations NeurIPS 2023 Haotian Liu, Chunyuan Li, Qingyang Wu, Yong Jae Lee

Instruction tuning large language models (LLMs) using machine-generated instruction-following data has improved zero-shot capabilities on new tasks, but the idea is less explored in the multimodal field.

Video Question Answering visual instruction following +2

Data-Efficient Image Quality Assessment with Attention-Panel Decoder

1 code implementation 11 Apr 2023 Guanyi Qin, Runze Hu, Yutao Liu, Xiawu Zheng, Haotian Liu, Xiu Li, Yan Zhang

Blind Image Quality Assessment (BIQA) is a fundamental task in computer vision that remains unresolved due to complex distortion conditions and diversified image contents.

Blind Image Quality Assessment

TMA: Temporal Motion Aggregation for Event-based Optical Flow

1 code implementation ICCV 2023 Haotian Liu, Guang Chen, Sanqing Qu, Yanping Zhang, Zhijun Li, Alois Knoll, Changjun Jiang

In this paper, we argue that temporal continuity is a vital element of event-based optical flow and propose a novel Temporal Motion Aggregation (TMA) approach to unlock its potential.

Event-based Optical Flow Optical Flow Estimation

Reducing Action Space: Reference-Model-Assisted Deep Reinforcement Learning for Inverter-based Volt-Var Control

no code implementations 10 Oct 2022 Qiong Liu, Ye Guo, Lirong Deng, Haotian Liu, Dongyu Li, Hongbin Sun

We find that a large action space increases the learning difficulty of DRL and degrades optimization performance during data generation and neural network training.

End-to-End Instance Edge Detection

no code implementations 6 Apr 2022 Xueyan Zou, Haotian Liu, Yong Jae Lee

We demonstrate highly competitive instance edge detection performance compared to state-of-the-art baselines, and also show that the proposed task and loss are complementary to instance segmentation and object detection.

Edge Detection Instance Segmentation +5

Reducing Learning Difficulties: One-Step Two-Critic Deep Reinforcement Learning for Inverter-based Volt-Var Control

no code implementations 30 Mar 2022 Qiong Liu, Ye Guo, Lirong Deng, Haotian Liu, Dongyu Li, Hongbin Sun, Wenqi Huang

We then design a one-step actor-critic DRL scheme, a simplified version of recent DRL algorithms, which successfully avoids the issue of Q-value overestimation.

Masked Discrimination for Self-Supervised Learning on Point Clouds

1 code implementation 21 Mar 2022 Haotian Liu, Mu Cai, Yong Jae Lee

Masked autoencoding has achieved great success for self-supervised learning in the image and language domains.

3D Shape Classification Binary Classification +4

M2MRF: Many-to-Many Reassembly of Features for Tiny Lesion Segmentation in Fundus Images

1 code implementation 30 Oct 2021 Qing Liu, Haotian Liu, Wei Ke, Yixiong Liang

It reassembles features in a dimension-reduced feature space and simultaneously aggregates multiple features inside a large predefined region into multiple target features.

Lesion Segmentation Segmentation

Bi-level Off-policy Reinforcement Learning for Volt/VAR Control Involving Continuous and Discrete Devices

no code implementations 13 Apr 2021 Haotian Liu, Wenchuan Wu

Such VCC is formulated as a two-timescale optimization problem to jointly optimize FTCDs and STDDs in ADNs.

Reinforcement Learning (RL)

YolactEdge: Real-time Instance Segmentation on the Edge

2 code implementations 22 Dec 2020 Haotian Liu, Rafael A. Rivera Soto, Fanyi Xiao, Yong Jae Lee

We propose YolactEdge, the first competitive instance segmentation approach that runs on small edge devices at real-time speeds.

Real-time Instance Segmentation Semantic Segmentation

Dual-Branch Network with Dual-Sampling Modulated Dice Loss for Hard Exudate Segmentation from Colour Fundus Images

no code implementations 3 Dec 2020 Qing Liu, Haotian Liu, Yixiong Liang

In detail, for the first branch, we use a uniform sampler to sample pixels from the predicted segmentation mask for Dice loss calculation, which naturally biases this branch in favour of large hard exudates, as Dice loss incurs a larger cost for misidentifying large hard exudates than small ones.

Online Multi-agent Reinforcement Learning for Decentralized Inverter-based Volt-VAR Control

no code implementations 23 Jun 2020 Haotian Liu, Wenchuan Wu

In this paper, we propose an online multi-agent reinforcement learning and decentralized control framework (OLDC) for VVC.

Multi-agent Reinforcement Learning reinforcement-learning +1

Universal time delay in static spherically symmetric spacetimes for null and timelike signals

no code implementations 5 Jun 2020 Haotian Liu, Junji Jia

A perturbative method to compute the total travel time of both null and timelike rays in arbitrary static spherically symmetric spacetimes in the weak field limit is proposed.

General Relativity and Quantum Cosmology

Two-stage Deep Reinforcement Learning for Inverter-based Volt-VAR Control in Active Distribution Networks

no code implementations 20 May 2020 Haotian Liu, Wenchuan Wu

In the sequential online stage, we safely transfer the offline agent to serve as the online agent, which continues learning and control online with significantly improved safety and efficiency.

reinforcement-learning Reinforcement Learning (RL)

A Constructive Algorithm for Decomposing a Tensor into a Finite Sum of Orthonormal Rank-1 Terms

1 code implementation 7 Jul 2014 Kim Batselier, Haotian Liu, Ngai Wong

We propose a constructive algorithm that decomposes an arbitrary real tensor into a finite sum of orthonormal rank-1 outer products.

Numerical Analysis
