Search Results for author: Haoran Wang

Found 99 papers, 34 papers with code

Supervised Edge Attention Network for Accurate Image Instance Segmentation

1 code implementation ECCV 2020 Xier Chen, Yanchao Lian, Licheng Jiao, Haoran Wang, YanJie Gao, Shi Lingling

In this task, many works segment instances based on a bounding box from the box head, which means the quality of the detection also affects the completeness of the mask.

Instance Segmentation Segmentation +1

GroverGPT-2: Simulating Grover's Algorithm via Chain-of-Thought Reasoning and Quantum-Native Tokenization

no code implementations 8 May 2025 Min Chen, Jinglei Cheng, Pingzhi Li, Haoran Wang, Tianlong Chen, Junyu Liu

Our results show that GroverGPT-2 can learn and internalize quantum circuit logic through efficient processing of quantum-native tokens, providing direct evidence that classical models like LLMs can capture the structure of quantum algorithms.

Visual enhancement and 3D representation for underwater scenes: a review

no code implementations 3 May 2025 Guoxi Huang, Haoran Wang, Brett Seymour, Evan Kovacs, John Ellerbrock, Dave Blackham, Nantheera Anantrasirichai

Underwater visual enhancement (UVE) and underwater 3D reconstruction pose significant challenges in computer vision and AI-based tasks due to complex imaging conditions in aquatic environments.

3D Reconstruction

LLGS: Unsupervised Gaussian Splatting for Image Enhancement and Reconstruction in Pure Dark Environment

no code implementations 24 Mar 2025 Haoran Wang, Jingwei Huang, Lu Yang, Tianchen Deng, Gaojing Zhang, Mingrui Li

Simply using enhanced images as inputs would lead to issues with multi-view consistency, and current single-view enhancement systems rely on pre-trained data, lacking scene generalization.

Image Enhancement

Understanding the Generalization of In-Context Learning in Transformers: An Empirical Study

1 code implementation 19 Mar 2025 Xingxuan Zhang, Haoran Wang, Jiansheng Li, Yuan Xue, Shikai Guan, Renzhe Xu, Hao Zou, Han Yu, Peng Cui

Large language models (LLMs) like GPT-4 and LLaMA-3 utilize the powerful in-context learning (ICL) capability of Transformer architecture to learn on the fly from limited examples.

Diversity In-Context Learning

Understanding Driver Cognition and Decision-Making Behaviors in High-Risk Scenarios: A Drift Diffusion Perspective

no code implementations 16 Mar 2025 Heye Huang, Zheng Li, Hao Cheng, Haoran Wang, Junkai Jiang, Xiaopeng Li, Arkady Zgonnikov

First, a risk sensitivity model based on a multivariate Gaussian distribution is developed to characterize individual differences in risk cognition.

Autonomous Vehicles Decision Making

WeakMedSAM: Weakly-Supervised Medical Image Segmentation via SAM with Sub-Class Exploration and Prompt Affinity Mining

1 code implementation 6 Mar 2025 Haoran Wang, Lian Huai, Wenbin Li, Lei Qi, Xingqun Jiang, Yinghuan Shi

Several recent works have used the Segment Anything Model (SAM) to boost segmentation performance on medical images; most of them focus on training an adaptor to fine-tune on a large amount of pixel-wise annotated medical images in a fully supervised manner.

Image Segmentation Medical Image Segmentation +1

Can Multimodal LLMs Perform Time Series Anomaly Detection?

1 code implementation 25 Feb 2025 Xiongxiao Xu, Haoran Wang, Yueqing Liang, Philip S. Yu, Yue Zhao, Kai Shu

Starting with the univariate case (point- and range-wise anomalies), we extend our evaluation to more practical scenarios, including multivariate and irregular time series scenarios, and variate-wise anomalies.

Anomaly Detection Irregular Time Series +2

Benchmarking LLMs for Political Science: A United Nations Perspective

1 code implementation 19 Feb 2025 Yueqing Liang, Liangwei Yang, Chen Wang, Congying Xia, Rui Meng, Xiongxiao Xu, Haoran Wang, Ali Payani, Kai Shu

Large Language Models (LLMs) have achieved significant advances in natural language processing, yet their potential for high-stakes political decision-making remains largely unexplored.

Benchmarking Decision Making

PersonaHOI: Effortlessly Improving Personalized Face with Human-Object Interaction Generation

1 code implementation 10 Jan 2025 Xinting Hu, Haoran Wang, Jan Eric Lenssen, Bernt Schiele

We introduce PersonaHOI, a training- and tuning-free framework that fuses a general StableDiffusion model with a personalized face diffusion (PFD) model to generate identity-consistent human-object interaction (HOI) images.

Human-Object Interaction Detection Human-Object Interaction Generation

GroverGPT: A Large Language Model with 8 Billion Parameters for Quantum Searching

no code implementations 30 Dec 2024 Haoran Wang, Pingzhi Li, Min Chen, Jinglei Cheng, Junyu Liu, Tianlong Chen

In this work, we explore the potential of leveraging Large Language Models (LLMs) to simulate the output of a quantum Turing machine using Grover's quantum circuits, known to provide quadratic speedups over classical counterparts.

Language Modeling Language Modelling +1

Why Do Speech Language Models Fail to Generate Semantically Coherent Outputs? A Modality Evolving Perspective

no code implementations 22 Dec 2024 Hankun Wang, Haoran Wang, Yiwei Guo, Zhihan Li, Chenpeng Du, Xie Chen, Kai Yu

Although text-based large language models exhibit human-level writing ability and remarkable intelligence, speech language models (SLMs) still struggle to generate semantically coherent outputs.

text-to-speech Text to Speech

Automated Driving with Evolution Capability: A Reinforcement Learning Method with Monotonic Performance Enhancement

no code implementations 14 Dec 2024 Jia Hu, Xuerun Yan, Tian Xu, Haoran Wang

Hence, the proposed HCPI-RL planner has the following features: i) evolutionary automated driving with monotonic performance enhancement; ii) the capability to handle emergency scenarios; iii) enhanced decision-making optimality.

Decision Making reinforcement-learning +2

MV-Adapter: Multi-view Consistent Image Generation Made Easy

1 code implementation 4 Dec 2024 Zehuan Huang, Yuan-Chen Guo, Haoran Wang, Ran Yi, Lizhuang Ma, Yan-Pei Cao, Lu Sheng

To efficiently model the 3D geometric knowledge within the adapter, we introduce innovative designs that include duplicated self-attention layers and parallel attention architecture, enabling the adapter to inherit the powerful priors of the pre-trained models to model the novel 3D knowledge.

3D Generation

Gated Parametric Neuron for Spike-based Audio Recognition

no code implementations 2 Dec 2024 Haoran Wang, Herui Zhang, Siyang Li, Dongrui Wu

Compared with the LIF neuron, the GPN has two distinguishing advantages: 1) it copes well with the vanishing gradients by improving the flow of gradient propagation; and, 2) it learns spatio-temporal heterogeneous neuronal parameters automatically.

Boosting 3D Object Generation through PBR Materials

no code implementations 25 Nov 2024 Yitong Wang, Xudong Xu, Li Ma, Haoran Wang, Bo Dai

By analyzing the components of PBR materials, we choose to consider albedo, roughness, metalness, and bump maps.

Object

Piecing It All Together: Verifying Multi-Hop Multimodal Claims

no code implementations 14 Nov 2024 Haoran Wang, Aman Rangapur, Xiongxiao Xu, Yueqing Liang, Haroon Gharwi, Carl Yang, Kai Shu

Existing claim verification datasets often do not require systems to perform complex reasoning or effectively interpret multimodal evidence.

16k All +1

UNSCT-HRNet: Modeling Anatomical Uncertainty for Landmark Detection in Total Hip Arthroplasty

no code implementations 13 Nov 2024 Jiaxin Wan, Lin Liu, Haoran Wang, Liangwei Li, Wei Li, Shuheng Kou, Runtian Li, Jiayi Tang, Juanxiu Liu, Jing Zhang, Xiaohui Du, Ruqian Hao

Total hip arthroplasty (THA) relies on accurate landmark detection from radiographic images, but unstructured data caused by irregular patient postures or occluded anatomical markers pose significant challenges for existing methods.

HG2P: Hippocampus-inspired High-reward Graph and Model-Free Q-Gradient Penalty for Path Planning and Motion Control

2 code implementations 12 Oct 2024 Haoran Wang, Yaoru Sun, Zeshen Tang, Haibo Shi, Chenyuan Jiao

Goal-conditioned hierarchical reinforcement learning (HRL) decomposes complex reaching tasks into a sequence of simple subgoal-conditioned tasks, showing significant promise for addressing long-horizon planning in large-scale environments.

Hierarchical Reinforcement Learning Hippocampus

UW-GS: Distractor-Aware 3D Gaussian Splatting for Enhanced Underwater Scene Reconstruction

1 code implementation 2 Oct 2024 Haoran Wang, Nantheera Anantrasirichai, Fan Zhang, David Bull

3D Gaussian splatting (3DGS) offers the capability to achieve real-time high quality 3D scene rendering.

3DGS

Hi-EF: Benchmarking Emotion Forecasting in Human-interaction

no code implementations 23 Jul 2024 Haoran Wang, Xinji Mai, Zeng Tao, Yan Wang, Jiawen Yu, Ziheng Zhou, Xuan Tong, Shaoqi Yan, Qing Zhao, Shuyong Gao, Wenqiang Zhang

We propose a novel Emotion Forecasting (EF) task grounded in the theory that an individual's emotions are easily influenced by the emotions or other information conveyed during interactions with another person.

Benchmarking

All rivers run into the sea: Unified Modality Brain-like Emotional Central Mechanism

no code implementations 22 Jul 2024 Xinji Mai, Junxiong Lin, Haoran Wang, Zeng Tao, Yan Wang, Shaoqi Yan, Xuan Tong, Jiawen Yu, Boyang Wang, Ziheng Zhou, Qing Zhao, Shuyong Gao, Wenqiang Zhang

In the field of affective computing, fully leveraging information from a variety of sensory modalities is essential for the comprehensive understanding and processing of human emotions.

All Dynamic Facial Expression Recognition +2

EMPL: A novel Efficient Meta Prompt Learning Framework for Few-shot Unsupervised Domain Adaptation

no code implementations 4 Jul 2024 Wanqi Yang, Haoran Wang, Lei Wang, Ge Song, Yang Gao

However, current FS-UDA methods still suffer from two issues: 1) the data from different domains cannot be effectively aligned by few-shot labeled data due to the large domain gaps, and 2) it is unstable and time-consuming to generalize to new FS-UDA tasks. To address these issues, we put forward a novel Efficient Meta Prompt Learning Framework for FS-UDA.

Bilevel Optimization Meta-Learning +2

The Factuality Tax of Diversity-Intervened Text-to-Image Generation: Benchmark and Fact-Augmented Intervention

1 code implementation 29 Jun 2024 Yixin Wan, Di Wu, Haoran Wang, Kai-Wei Chang

In this work, we propose DemOgraphic FActualIty Representation (DoFaiR), a benchmark to systematically quantify the trade-off between using diversity interventions and preserving demographic factuality in T2I models.

Diversity Language Modeling +3

Suppressing Uncertainties in Degradation Estimation for Blind Super-Resolution

no code implementations 24 Jun 2024 Junxiong Lin, Zeng Tao, Xuan Tong, Xinji Mai, Haoran Wang, Boyang Wang, Yan Wang, Qing Zhao, Jiawen Yu, Yuxuan Lin, Shaoqi Yan, Shuyong Gao, Wenqiang Zhang

To extract Uncertainty-based Degradation Representation from LR images, the AUDE utilizes the Self-supervised Uncertainty Contrast module with Uncertainty Suppression Loss to suppress the inherent model uncertainty of the Degradation Extractor.

Blind Super-Resolution Image Super-Resolution +1

D2SP: Dynamic Dual-Stage Purification Framework for Dual Noise Mitigation in Vision-based Affective Recognition

no code implementations 24 Jun 2024 Haoran Wang, Xinji Mai, Zeng Tao, Xuan Tong, Junxiong Lin, Yan Wang, Jiawen Yu, Boyang Wang, Shaoqi Yan, Qing Zhao, Ziheng Zhou, Shuyong Gao, Wenqiang Zhang

The contemporary state-of-the-art of Dynamic Facial Expression Recognition (DFER) technology facilitates remarkable progress by deriving emotional mappings of facial expressions from video content, underpinned by training on voluminous datasets.

Dynamic Facial Expression Recognition Facial Expression Recognition

FaceGPT: Self-supervised Learning to Chat about 3D Human Faces

no code implementations 11 Jun 2024 Haoran Wang, Mohit Mendiratta, Christian Theobalt, Adam Kortylewski

We introduce FaceGPT, a self-supervised learning framework for Large Vision-Language Models (VLMs) to reason about 3D human faces from images and text.

3D Face Reconstruction Face Model +2

VIP: Versatile Image Outpainting Empowered by Multimodal Large Language Model

1 code implementation 3 Jun 2024 Jinze Yang, Haoran Wang, Zining Zhu, Chenglong Liu, Meng Wymond Wu, Mingming Sun

In this paper, we focus on resolving the problem of image outpainting, which aims to extrapolate the surrounding parts given the center contents of an image.

Image Outpainting Language Modeling +3

OUS: Scene-Guided Dynamic Facial Expression Recognition

no code implementations 29 May 2024 Xinji Mai, Haoran Wang, Zeng Tao, Junxiong Lin, Shaoqi Yan, Yan Wang, Jing Liu, Jiawen Yu, Xuan Tong, YaTing Li, Wenqiang Zhang

By analyzing the Rigid Cognitive Problem, OUS successfully understands the complex relationship between scene context and emotional expression, closely aligning with human emotional understanding in real-world scenarios.

Dynamic Facial Expression Recognition Facial Expression Recognition

Accelerating the Evolution of Personalized Automated Lane Change through Lesson Learning

no code implementations 13 May 2024 Jia Hu, Mingyue Lei, Duo Li, Zhenning Li, Jaehyun So, Haoran Wang

Guided by the objective of optimizing rewards within the constraints of the driving zone, this approach employs model predictive control for trajectory planning.

Computational Efficiency Model Predictive Control +1

From Friendship Networks to Classroom Dynamics: Leveraging Neural Networks, Instrumental Variable and Genetic Algorithms for Optimal Educational Outcomes

no code implementations 3 Apr 2024 Lei Bill Wang, Om Prakash Bedant, Zhenbang Jiao, Haoran Wang

Though the result is much more efficient (i.e., more positive average peer effect) than random classroom assignment (i.e., the current practice in most Chinese middle schools), the GA policy is highly inequitable: a small number of students are predicted to experience severely negative peer effects.

counterfactual Fairness

MIMIR: A Streamlined Platform for Personalized Agent Tuning in Domain Expertise

no code implementations 3 Apr 2024 Chunyuan Deng, Xiangru Tang, Yilun Zhao, Hanming Wang, Haoran Wang, Wangchunshu Zhou, Arman Cohan, Mark Gerstein

Recently, large language models (LLMs) have evolved into interactive agents, proficient in planning, tool use, and task execution across a wide variety of tasks.

SGD: Street View Synthesis with Gaussian Splatting and Diffusion Prior

no code implementations 29 Mar 2024 Zhongrui Yu, Haoran Wang, Jinze Yang, Hanzhang Wang, Zeke Xie, Yunfeng Cai, Jiale Cao, Zhong Ji, Mingming Sun

To tackle this problem, we propose a novel approach that enhances the capacity of 3DGS by leveraging prior from a Diffusion Model along with complementary multi-modal data.

3DGS Autonomous Driving +3

CACA Agent: Capability Collaboration based AI Agent

no code implementations 22 Mar 2024 Peng Xu, Haoran Wang, Chuang Wang, Xu Liu

As AI agents based on Large Language Models (LLMs) have shown potential in practical applications across various fields, how to quickly deploy an AI agent and how to conveniently expand the application scenarios of AI agents have become challenges.

AI Agent

Safeguarding Medical Image Segmentation Datasets against Unauthorized Training via Contour- and Texture-Aware Perturbations

no code implementations 21 Mar 2024 Xun Lin, Yi Yu, Song Xia, Jue Jiang, Haoran Wang, Zitong Yu, Yizhong Liu, Ying Fu, Shuai Wang, Wenzhong Tang, Alex Kot

This is particularly true for medical image segmentation (MIS) datasets, where the processes of collection and fine-grained annotation are time-intensive and laborious.

Image Classification Image Generation +4

A$^{3}$lign-DFER: Pioneering Comprehensive Dynamic Affective Alignment for Dynamic Facial Expression Recognition with CLIP

no code implementations 7 Mar 2024 Zeng Tao, Yan Wang, Junxiong Lin, Haoran Wang, Xinji Mai, Jiawen Yu, Xuan Tong, Ziheng Zhou, Shaoqi Yan, Qing Zhao, Liyuan Han, Wenqiang Zhang

Specifically, our A$^{3}$lign-DFER method is designed with multiple modules that work together to obtain the most suitable expanded-dimensional embeddings for classification and to achieve alignment in three key aspects: affective, dynamic, and bidirectional.

Dynamic Facial Expression Recognition Facial Expression Recognition

Neural Field Classifiers via Target Encoding and Classification Loss

no code implementations 2 Mar 2024 Xindi Yang, Zeke Xie, Xiong Zhou, Boyu Liu, Buhua Liu, Yi Liu, Haoran Wang, Yunfeng Cai, Mingming Sun

We propose a novel Neural Field Classifier (NFC) framework, which formulates existing neural field methods as classification tasks rather than regression tasks.

Classification Multi-Label Classification +4

HiCAST: Highly Customized Arbitrary Style Transfer with Adapter Enhanced Diffusion Models

no code implementations 11 Jan 2024 Hanzhang Wang, Haoran Wang, Jinze Yang, Zhongrui Yu, Zeke Xie, Lei Tian, Xinyan Xiao, Junjun Jiang, Xianming Liu, Mingming Sun

Specifically, our model is built on the Latent Diffusion Model (LDM) and is carefully designed to take content and style instances as conditions of the LDM.

Style Transfer

Parallel Ranking of Ads and Creatives in Real-Time Advertising Systems

no code implementations 20 Dec 2023 Zhiguang Yang, Lu Wang, Chun Gan, Liufang Sang, Haoran Wang, Wenlong Chen, Jie He, Changping Peng, Zhangang Lin, Jingping Shao

In this paper, we propose for the first time a novel architecture for online parallel estimation of ads and creatives ranking, as well as the corresponding offline joint optimization model.

Marketing

RankDVQA-mini: Knowledge Distillation-Driven Deep Video Quality Assessment

no code implementations 14 Dec 2023 Chen Feng, Duolikun Danier, Haoran Wang, Fan Zhang, Benoit Vallade, Alex Mackin, David Bull

Deep learning-based video quality assessment (deep VQA) has demonstrated significant potential in surpassing conventional metrics, with promising improvements in terms of correlation with human perception.

Knowledge Distillation Model Compression +2

Trojan Activation Attack: Red-Teaming Large Language Models using Activation Steering for Safety-Alignment

1 code implementation 15 Nov 2023 Haoran Wang, Kai Shu

To ensure AI safety, instruction-tuned Large Language Models (LLMs) are specifically trained to ensure alignment, which refers to making models behave in accordance with human intentions.

Red Teaming Safety Alignment

A comprehensive survey on deep active learning in medical image analysis

1 code implementation 22 Oct 2023 Haoran Wang, Qiuye Jin, Shiman Li, Siyu Liu, Manning Wang, Zhijian Song

Deep learning has achieved widespread success in medical image analysis, leading to an increasing demand for large-scale expert-annotated medical image datasets.

Active Learning Informativeness +2

Explainable Claim Verification via Knowledge-Grounded Reasoning with Large Language Models

1 code implementation 8 Oct 2023 Haoran Wang, Kai Shu

While existing works on claim verification have shown promising results, a crucial piece of the puzzle that remains unsolved is to understand how to verify claims without relying on human-annotated data, which is expensive to create at a large scale.

Claim Verification Decision Making +2

Exposing Image Splicing Traces in Scientific Publications via Uncertainty-guided Refinement

1 code implementation 28 Sep 2023 Xun Lin, Wenzhong Tang, Haoran Wang, Yizhong Liu, Yakun Ju, Shuai Wang, Zitong Yu

Compared to image duplication and synthesis, image splicing detection is more challenging due to the lack of reference images and the typically small tampered areas.

Image Forensics Image Manipulation

Guided Cooperation in Hierarchical Reinforcement Learning via Model-based Rollout

1 code implementation 24 Sep 2023 Haoran Wang, Zeshen Tang, Leya Yang, Yaoru Sun, Fang Wang, Siyu Zhang, Yeming Chen

Here, we propose a goal-conditioned HRL framework named Guided Cooperation via Model-based Rollout (GCMR), aiming to bridge inter-layer information synchronization and cooperation by exploiting forward dynamics.

Hierarchical Reinforcement Learning reinforcement-learning +1

Investigating Online Financial Misinformation and Its Consequences: A Computational Perspective

no code implementations 6 Sep 2023 Aman Rangapur, Haoran Wang, Kai Shu

In conclusion, this research paper sheds light on the pervasive issue of online financial misinformation and its wide-ranging consequences.

Misinformation

Artificial-Spiking Hierarchical Networks for Vision-Language Representation Learning

no code implementations 18 Aug 2023 Yeming Chen, Siyu Zhang, Yaoru Sun, Weijian Liang, Haoran Wang

In this work, we propose an efficient computation framework for multimodal alignment by introducing a novel visual semantic module to further improve the performance of the VL tasks.

Computational Efficiency Contrastive Learning +2

S3IM: Stochastic Structural SIMilarity and Its Unreasonable Effectiveness for Neural Fields

1 code implementation ICCV 2023 Zeke Xie, Xindi Yang, Yujie Yang, Qi Sun, Yixiang Jiang, Haoran Wang, Yunfeng Cai, Mingming Sun

Recently, Neural Radiance Field (NeRF) has shown great success in rendering novel-view images of a given scene by learning an implicit representation with only posed RGB images.

NeRF Novel View Synthesis +1

LOIS: Looking Out of Instance Semantics for Visual Question Answering

no code implementations 26 Jul 2023 Siyu Zhang, Yeming Chen, Yaoru Sun, Fang Wang, Haibo Shi, Haoran Wang

Visual question answering (VQA) has been intensively studied as a multimodal task that requires effort in bridging vision and language to infer answers correctly.

Question Answering Visual Question Answering +1

Free-Form Composition Networks for Egocentric Action Recognition

no code implementations 13 Jul 2023 Haoran Wang, Qinghua Cheng, Baosheng Yu, Yibing Zhan, Dapeng Tao, Liang Ding, Haibin Ling

We evaluated our method on three popular egocentric action recognition datasets, Something-Something V2, H2O, and EPIC-KITCHENS-100, and the experimental results demonstrate the effectiveness of the proposed method for handling data scarcity problems, including long-tailed and few-shot egocentric action recognition.

Action Recognition Form +1

Hierarchical Matching and Reasoning for Multi-Query Image Retrieval

1 code implementation 26 Jun 2023 Zhong Ji, Zhihao Li, Yan Zhang, Haoran Wang, Yanwei Pang, Xuelong Li

Afterwards, the VR module is developed to excavate the potential semantic correlations among multiple region-query pairs, which further explores the high-level reasoning similarity.

Image Retrieval Retrieval

Structural and Statistical Texture Knowledge Distillation for Semantic Segmentation

no code implementations CVPR 2022 Deyi Ji, Haoran Wang, Mingyuan Tao, Jianqiang Huang, Xian-Sheng Hua, Hongtao Lu

Existing knowledge distillation works for semantic segmentation mainly focus on transferring high-level contextual knowledge from teacher to student.

Knowledge Distillation Quantization +1

ChatGPT-Crawler: Find out if ChatGPT really knows what it's talking about

no code implementations 6 Apr 2023 Aman Rangapur, Haoran Wang

Large language models have gained considerable interest for their impressive performance on various tasks.

Natural Language Inference

Multi-organ segmentation: a progressive exploration of learning paradigms under scarce annotation

no code implementations 7 Feb 2023 Shiman Li, Haoran Wang, Yucong Meng, Chenxi Zhang, Zhijian Song

Precise delineation of multiple organs or abnormal regions in the human body from medical images plays an essential role in computer-aided diagnosis, surgical simulation, image-guided interventions, and especially in radiotherapy treatment planning.

Organ Segmentation Partially Labeled Datasets +2

EndoBoost: a plug-and-play module for false positive suppression during computer-aided polyp detection in real-world colonoscopy (with dataset)

no code implementations 23 Dec 2022 Haoran Wang, Yan Zhu, Wenzheng Qin, Yizhe Zhang, Pinghong Zhou, QuanLin Li, Shuo Wang, Zhijian Song

In addition, the released dataset can be used to perform 'stress' tests on established detection systems and encourages further research toward robust and reliable computer-aided endoscopic image analysis.

Anomaly Detection Density Estimation

MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation

1 code implementation CVPR 2023 Lukas Hoyer, Dengxin Dai, Haoran Wang, Luc van Gool

MIC significantly improves the state-of-the-art performance across the different recognition tasks for synthetic-to-real, day-to-nighttime, and clear-to-adverse-weather UDA.

Image Classification object-detection +4

Combating Health Misinformation in Social Media: Characterization, Detection, Intervention, and Open Issues

no code implementations 10 Nov 2022 Canyu Chen, Haoran Wang, Matthew Shapiro, Yunyu Xiao, Fei Wang, Kai Shu

Because of the uniqueness and importance of combating health misinformation in social media, we conduct this survey to further facilitate interdisciplinary research on this problem.

Misinformation Survey

CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval

no code implementations 21 Aug 2022 Haoran Wang, Dongliang He, Wenhao Wu, Boyang Xia, Min Yang, Fu Li, Yunlong Yu, Zhong Ji, Errui Ding, Jingdong Wang

We introduce dynamic dictionaries for both modalities to enlarge the scale of image-text pairs, and diversity-sensitiveness is achieved by adaptive negative pair weighting.

Clustering Contrastive Learning +5

SoMoFormer: Social-Aware Motion Transformer for Multi-Person Motion Prediction

no code implementations 19 Aug 2022 Xiaogang Peng, Yaodi Shen, Haoran Wang, Binling Nie, Yigang Wang, Zizhao Wu

Most prior methods only involve learning local pose dynamics for individual motion (without global body trajectory) and also struggle to capture complex interaction dependencies for social interactions.

motion prediction Representation Learning

Positively transitioned sentiment dialogue corpus for developing emotion-affective open-domain chatbots

no code implementations 9 Aug 2022 Weixuan Wang, Wei Peng, Chong Hsuan Huang, Haoran Wang

In this paper, we describe a data enhancement method for developing Emily, an emotion-affective open-domain chatbot.

Chatbot

Boosting Video-Text Retrieval with Explicit High-Level Semantics

no code implementations 8 Aug 2022 Haoran Wang, Di Xu, Dongliang He, Fu Li, Zhong Ji, Jungong Han, Errui Ding

Video-text retrieval (VTR) is an attractive yet challenging task for multi-modal understanding, which aims to search for the relevant video (text) given a text (video) query.

Text Retrieval Video Captioning +2

Temporal Saliency Query Network for Efficient Video Recognition

no code implementations 21 Jul 2022 Boyang Xia, Zhihao Wang, Wenhao Wu, Haoran Wang, Jungong Han

For each category, its common pattern is employed as a query, and the most salient frames respond to it.

Action Recognition Video Recognition

NSNet: Non-saliency Suppression Sampler for Efficient Video Recognition

no code implementations 21 Jul 2022 Boyang Xia, Wenhao Wu, Haoran Wang, Rui Su, Dongliang He, Haosen Yang, Xiaoran Fan, Wanli Ouyang

On the video level, a temporal attention module is learned under dual video-level supervisions on both the salient and the non-salient representations.

Action Recognition Video Classification +1

Can multi-label classification networks know what they don't know?

1 code implementation NeurIPS 2021 Haoran Wang, Weitang Liu, Alex Bocchieri, Yixuan Li

Our results show consistent improvement over previous methods that are based on the maximum-valued scores, which fail to capture joint information from multiple labels.

Multi-class Classification Multi-Label Classification +2

Efficient and Systematic Partitioning of Large and Deep Neural Networks for Parallelization

2 code implementations Part of the Lecture Notes in Computer Science book series 2021 Haoran Wang, Chong Li, Thibaut Tachon, Hongxing Wang, Sheng Yang, Sébastien Limet, Sophie Robert

We propose the Flex-Edge Recursive Graph and the Double Recursive Algorithm, successfully limiting our parallelization strategy generation to a linear complexity with a good quality of parallelization strategy.

Freeing Hybrid Distributed AI Training Configuration

2 code implementations Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering 2021 Haoran Wang

Hybrid Parallelism (HP), which applies different parallel strategies on different parts of DNNs, is more efficient but requires advanced configurations.

CONet: Channel Optimization for Convolutional Neural Networks

1 code implementation 15 Aug 2021 Mahdi S. Hosseini, Jia Shu Zhang, Zhe Liu, Andre Fu, Jingxuan Su, Mathieu Tuli, Sepehr Hosseini, Arsh Kadakia, Haoran Wang, Konstantinos N. Plataniotis

To solve this, we introduce an efficient dynamic scaling algorithm -- CONet -- that automatically optimizes channel sizes across network layers for a given CNN.

Neural Architecture Search

Step-Wise Hierarchical Alignment Network for Image-Text Matching

no code implementations 11 Jun 2021 Zhong Ji, Kexin Chen, Haoran Wang

Image-text matching plays a central role in bridging the semantic gap between vision and language.

Image-text matching Text Matching

Robo-Advising: Enhancing Investment with Inverse Optimization and Deep Reinforcement Learning

no code implementations 19 May 2021 Haoran Wang, Shi Yu

Machine Learning (ML) has been embraced as a powerful tool by the financial industry, with notable applications spreading in various domains including investment management.

Deep Reinforcement Learning Management +3

Hierarchical Cost Analysis for Distributed DL

1 code implementation IEEE International Parallel and Distributed Processing Symposium Workshops 2021 Haoran Wang

In order to formalize the behaviors of the HP in distributed DL and quantitatively evaluate the cost caused by HP, we are studying Bridging DL composed by a double-level execution model associated with a symbolic cost model.

Code Generation

Strictly Decentralized Adaptive Estimation of External Fields using Reproducing Kernels

no code implementations 23 Mar 2021 Jia Guo, Michael E. Kepler, Sai Tej Paruchuri, Haoran Wang, Andrew J. Kurdila, Daniel J. Stilwell

Approximations of the evolution of the ideal local estimate $\hat{g}^i_t$ of agent $i$ are constructed solely using observations made by agent $i$ on a fine time scale.

Unity

Energy-based Out-of-distribution Detection for Multi-label Classification

no code implementations 1 Jan 2021 Haoran Wang, Weitang Liu, Alex Bocchieri, Yixuan Li

Our results show consistent improvement over previous methods that are based on the maximum-valued scores, which fail to capture joint information from multiple labels.

General Classification Multi-class Classification +4

Exploring Fluent Query Reformulations with Text-to-Text Transformers and Reinforcement Learning

no code implementations 18 Dec 2020 Jerry Zikun Chen, Shi Yu, Haoran Wang

Query reformulation aims to alter noisy or ambiguous text sequences into coherent ones closer to natural language questions.

intent-classification Intent Classification +4

Context-Aware Graph Convolution Network for Target Re-identification

no code implementations 8 Dec 2020 Deyi Ji, Haoran Wang, Hanzhe Hu, Weihao Gan, Wei Wu, Junjie Yan

Most existing re-identification methods focus on learning robust and discriminative features with deep convolution networks.

Vehicle Re-Identification

Frame Aggregation and Multi-Modal Fusion Framework for Video-Based Person Recognition

no code implementations 19 Oct 2020 Fangtao Li, Wenzhe Wang, Zihe Liu, Haoran Wang, Chenghao Yan, Bin Wu

To tackle the challenges above, we propose a novel Frame Aggregation and Multi-Modal Fusion (FAMF) framework for video-based person recognition, which aggregates face features and incorporates them with multi-modal information to identify persons in videos.

Person Recognition

Learning Risk Preferences from Investment Portfolios Using Inverse Optimization

no code implementations 4 Oct 2020 Shi Yu, Haoran Wang, Chaosheng Dong

Our approach allows the learner to continuously estimate real-time risk preferences using concurrent observed portfolios and market price data.

Decision Making Management

Classes Matter: A Fine-grained Adversarial Approach to Cross-domain Semantic Segmentation

1 code implementation ECCV 2020 Haoran Wang, Tong Shen, Wei Zhang, Ling-Yu Duan, Tao Mei

To fully exploit the supervision in the source domain, we propose a fine-grained adversarial learning strategy for class-level feature alignment while preserving the internal structure of semantics across domains.

Domain Adaptation Semantic Segmentation +1

Consensus-Aware Visual-Semantic Embedding for Image-Text Matching

1 code implementation ECCV 2020 Haoran Wang, Ying Zhang, Zhong Ji, Yanwei Pang, Lin Ma

In this paper, we propose a Consensus-aware Visual-Semantic Embedding (CVSE) model to incorporate the consensus information, namely the commonsense knowledge shared between both modalities, into image-text matching.

Image Captioning Image-text matching +2

Large scale continuous-time mean-variance portfolio allocation via reinforcement learning

no code implementations 26 Jul 2019 Haoran Wang

We propose to solve the large-scale Markowitz mean-variance (MV) portfolio allocation problem using reinforcement learning (RL).

reinforcement-learning Reinforcement Learning +1

Continuous-Time Mean-Variance Portfolio Selection: A Reinforcement Learning Framework

1 code implementation 25 Apr 2019 Haoran Wang, Xun Yu Zhou

We approach the continuous-time mean-variance (MV) portfolio selection with reinforcement learning (RL).

Continuous Control Portfolio Optimization +3

Saliency-Guided Attention Network for Image-Sentence Matching

no code implementations ICCV 2019 Zhong Ji, Haoran Wang, Jungong Han, Yanwei Pang

Concretely, the saliency detector provides the visual saliency information as the guidance for the two attention modules.

Sentence

Enhancing Topic Modeling for Short Texts with Auxiliary Word Embeddings

no code implementations 22 Dec 2018 Chenliang Li, Yu Duan, Haoran Wang, Zhiqian Zhang, Aixin Sun, Zongyang Ma

Recent studies show that the Dirichlet Multinomial Mixture (DMM) model is effective for topic inference over short texts by assuming that each piece of short text is generated by a single topic.

text-classification Topic Models +1

Exploration versus exploitation in reinforcement learning: a stochastic control approach

no code implementations 4 Dec 2018 Haoran Wang, Thaleia Zariphopoulou, Xunyu Zhou

We carry out a complete analysis of the problem in the linear-quadratic (LQ) setting and deduce that the optimal feedback control distribution for balancing exploitation and exploration is Gaussian.

reinforcement-learning Reinforcement Learning +1

Parameter-Free Spatial Attention Network for Person Re-Identification

3 code implementations 29 Nov 2018 Haoran Wang, Yue Fan, Zexin Wang, Licheng Jiao, Bernt Schiele

We propose a novel architecture for Person Re-Identification, based on a novel parameter-free spatial attention layer introducing spatial relations among the feature map activations back to the model.

Person Re-Identification

Multi-task Sparse Learning with Beta Process Prior for Action Recognition

no code implementations CVPR 2013 Chunfeng Yuan, Weiming Hu, Guodong Tian, Shuang Yang, Haoran Wang

In this paper, we formulate human action recognition as a novel Multi-Task Sparse Learning (MTSL) framework which aims to construct a test sample with multiple features from as few bases as possible.

Action Recognition Sparse Learning +1
