Search Results for author: Haoran Tang

Found 24 papers, 11 papers with code

Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models

no code implementations • 24 Mar 2025 • Meng Cao, Pengfei Hu, Yingyao Wang, Jihao Gu, Haoran Tang, Haoze Zhao, Jiahua Dong, Wangbo Yu, Ge Zhang, Ian Reid, Xiaodan Liang

Recent advancements in Large Video Language Models (LVLMs) have highlighted their potential for multi-modal understanding, yet evaluating their factual grounding in video contexts remains a critical unsolved challenge.

Retrieval-augmented Generation

Dual Mutual Learning Network with Global-local Awareness for RGB-D Salient Object Detection

1 code implementation • 3 Jan 2025 • Kang Yi, Haoran Tang, Yumeng Li, Jing Xu, Jun Zhang

RGB-D salient object detection (SOD), which aims to highlight prominent regions of a given scene by jointly modeling RGB and depth information, is a challenging pixel-level prediction task.

object-detection • RGB-D Salient Object Detection • +1

PhysGame: Uncovering Physical Commonsense Violations in Gameplay Videos

1 code implementation • 2 Dec 2024 • Meng Cao, Haoran Tang, Haoze Zhao, Hangyu Guo, Jiaheng Liu, Ge Zhang, Ruyang Liu, Qiang Sun, Ian Reid, Xiaodan Liang

In this paper, we propose PhysGame as a pioneering benchmark to evaluate physical commonsense violations in gameplay videos.

Question Answering • Video Understanding

A Peaceman-Rachford Splitting Approach with Deep Equilibrium Network for Channel Estimation

1 code implementation • 31 Oct 2024 • Dingli Yuan, Shitong Wu, Haoran Tang, Lu Yang, Chenghui Peng

The main idea is to construct a fixed-point equation for channel estimation, which can be implemented as a deep equilibrium (DEQ) model with a fixed network.
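
The fixed-point idea above lends itself to a compact illustration: a deep equilibrium layer repeatedly applies one update map to the channel estimate until it stops changing, and the converged point is taken as the output. The update function, damping, and tolerance in the sketch below are hypothetical stand-ins, not the paper's learned network.

```python
import numpy as np

def deq_fixed_point(f_theta, y, x0, tol=1e-6, max_iter=200):
    """Generic deep-equilibrium-style solver: iterate x <- f_theta(x, y) until
    convergence and return the fixed point as the channel estimate.
    f_theta stands in for the paper's learned update (hypothetical here)."""
    x = x0
    for _ in range(max_iter):
        x_next = f_theta(x, y)
        if np.linalg.norm(x_next - x) <= tol * (np.linalg.norm(x) + 1e-12):
            return x_next
        x = x_next
    return x

# Toy usage: a contractive update whose fixed point is the observation itself.
rng = np.random.default_rng(0)
y = rng.standard_normal(8)                      # stand-in "received signal"
f = lambda x, y: 0.5 * x + 0.5 * y              # contraction, fixed point = y
estimate = deq_fixed_point(f, y, x0=np.zeros(8))
print(np.allclose(estimate, y, atol=1e-4))      # True
```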

MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval

1 code implementation • 20 Aug 2024 • Haoran Tang, Meng Cao, Jinfa Huang, Ruyang Liu, Peng Jin, Ge Li, Xiaodan Liang

Text-Video Retrieval (TVR) aims to align and associate relevant video content with corresponding natural language queries.

Mamba • Natural Language Queries • +2

Toward Structure Fairness in Dynamic Graph Embedding: A Trend-aware Dual Debiasing Approach

1 code implementation • 19 Jun 2024 • Yicong Li, Yu Yang, Jiannong Cao, Shuaiqi Liu, Haoran Tang, Guandong Xu

We first identify biased structural evolutions in a dynamic graph based on the evolving trend of vertex degree and then propose FairDGE, the first structurally Fair Dynamic Graph Embedding algorithm.

Dynamic graph embedding • Fairness • +1
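
One way to picture the "evolving trend of vertex degree" signal used by FairDGE above: fit a per-vertex trend over degree snapshots and bucket vertices into growing, shrinking, or stable groups that a debiasing objective can then treat differently. The threshold and labels below are illustrative assumptions, not the paper's exact grouping.

```python
import numpy as np

def degree_trend_labels(degree_snapshots, eps=0.5):
    """degree_snapshots: array of shape (T, N) with each vertex's degree at T
    time steps. Returns a coarse trend label per vertex (hypothetical stand-in
    for the structural-evolution grouping described in FairDGE)."""
    slopes = np.polyfit(np.arange(degree_snapshots.shape[0]),
                        degree_snapshots, deg=1)[0]   # per-vertex linear trend
    return np.where(slopes > eps, "growing",
           np.where(slopes < -eps, "shrinking", "stable"))

# Toy usage: 4 snapshots, 3 vertices.
snapshots = np.array([[1, 5, 3],
                      [2, 4, 3],
                      [4, 3, 3],
                      [7, 2, 3]])
print(degree_trend_labels(snapshots))  # ['growing' 'shrinking' 'stable']
```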

Expert-Guided Extinction of Toxic Tokens for Debiased Generation

no code implementations • 29 May 2024 • Xueyao Sun, Kaize Shi, Haoran Tang, Guandong Xu, Qing Li

Large language models (LLMs) can exhibit social bias during generation, especially when performing inference on toxic prompts.

Fairness • Retrieval

ST-LLM: Large Language Models Are Effective Temporal Learners

1 code implementation • 30 Mar 2024 • Ruyang Liu, Chen Li, Haoran Tang, Yixiao Ge, Ying Shan, Ge Li

In this paper, we investigate a straightforward yet unexplored question: Can we feed all spatial-temporal tokens into the LLM, thus delegating the task of video sequence modeling to the LLM?

MVBench • Reading Comprehension • +3
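
The question posed by ST-LLM above can be illustrated with a minimal shape-level sketch: instead of pooling over time, every frame's visual tokens are projected and flattened into one long sequence that is prepended to the text tokens before the language model. The dimensions and projection below are assumptions for illustration, not the paper's configuration.

```python
import torch
import torch.nn as nn

# Hypothetical sizes: 8 frames, 16 visual tokens per frame, LLM hidden size 4096.
T, N, d_vis, d_llm = 8, 16, 1024, 4096

visual_tokens = torch.randn(1, T, N, d_vis)              # (batch, frames, tokens, dim)
text_embeds = torch.randn(1, 32, d_llm)                  # already-embedded prompt tokens

proj = nn.Linear(d_vis, d_llm)                           # map visual features to LLM width
st_tokens = proj(visual_tokens).flatten(1, 2)            # (1, T*N, d_llm): all spatio-temporal tokens
llm_inputs = torch.cat([st_tokens, text_embeds], dim=1)  # hand the whole sequence to the LLM
print(llm_inputs.shape)                                  # torch.Size([1, 160, 4096])
```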

Context Matters: Data-Efficient Augmentation of Large Language Models for Scientific Applications

2 code implementations • 12 Dec 2023 • Xiang Li, Haoran Tang, Siyu Chen, Ziwei Wang, Anurag Maravi, Marcin Abram

In this paper, we explore the challenges inherent to Large Language Models (LLMs) like GPT-4, particularly their propensity for hallucinations, logic mistakes, and incorrect conclusions when tasked with answering complex questions.

Weighted Joint Maximum Mean Discrepancy Enabled Multi-Source-Multi-Target Unsupervised Domain Adaptation Fault Diagnosis

no code implementations • 20 Oct 2023 • Zixuan Wang, Haoran Tang, Haibo Wang, Bo Qin, Mark D. Butala, Weiming Shen, Hongwei Wang

Despite the remarkable results achievable with data-driven intelligent fault diagnosis techniques, these methods presuppose that training and test data share the same distribution and that sufficient labeled data are available.

Fault Diagnosis • Unsupervised Domain Adaptation

Contrastive Learning Relies More on Spatial Inductive Bias Than Supervised Learning: An Empirical Study

no code implementations • ICCV 2023 • Yuanyi Zhong, Haoran Tang, Jun-Kun Chen, Yu-Xiong Wang

Though self-supervised contrastive learning (CL) has shown its potential to achieve state-of-the-art accuracy without any supervision, its behavior still remains under-investigated by academia.

Contrastive Learning • Inductive Bias

HashEncoding: Autoencoding with Multiscale Coordinate Hashing

no code implementations • 29 Nov 2022 • Lukas Zhornyak, Zhengjie Xu, Haoran Tang, Jianbo Shi

We present HashEncoding, a novel autoencoding architecture that leverages a non-parametric multiscale coordinate hash function to facilitate a per-pixel decoder without convolutions.

Decoder • Optical Flow Estimation
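
For readers unfamiliar with coordinate hashing, here is a generic multiscale sketch: quantize a pixel coordinate at several resolutions, hash each cell into a small feature table, and concatenate the looked-up features as input to a per-pixel decoder. The table sizes, hash primes, and random features are illustrative assumptions; the paper's non-parametric hash function differs in detail.

```python
import numpy as np

def hash_encode(xy, levels=4, table_size=2**12, feat_dim=2, seed=0):
    """Encode a 2D coordinate by hashing its grid cell at several resolutions and
    concatenating the looked-up features. Generic multiscale coordinate-hashing
    sketch; constants and random tables are illustrative, not HashEncoding's."""
    rng = np.random.default_rng(seed)
    feats = []
    for level in range(levels):
        table = rng.standard_normal((table_size, feat_dim))  # stand-in feature table
        res = 16 * 2 ** level                                # finer grid per level
        ix, iy = (np.asarray(xy) * res).astype(int)
        h = (ix * 73856093 ^ iy * 19349663) % table_size     # classic spatial hash
        feats.append(table[h])
    return np.concatenate(feats)                             # per-pixel decoder input

print(hash_encode((0.3, 0.7)).shape)                         # (8,)
```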

WEKA-Based: Key Features and Classifier for French of Five Countries

no code implementations • 10 Nov 2022 • Zeqian Li, Keyu Qiu, Chenxu Jiao, Wen Zhu, Haoran Tang

This paper describes a French dialect recognition system that distinguishes between different regional French dialects.

Shuffle Augmentation of Features from Unlabeled Data for Unsupervised Domain Adaptation

no code implementations • 28 Jan 2022 • Changwei Xu, Jianfei Yang, Haoran Tang, Han Zou, Cheng Lu, Tianshuo Zhang

Unsupervised Domain Adaptation (UDA), a branch of transfer learning where labels for target samples are unavailable, has been widely researched and developed in recent years with the help of adversarially trained models.

Transfer Learning • Unsupervised Domain Adaptation

Reinforcement Learning with Deep Energy-Based Policies

4 code implementations • ICML 2017 • Tuomas Haarnoja, Haoran Tang, Pieter Abbeel, Sergey Levine

We propose a method for learning expressive energy-based policies for continuous states and actions, which has previously been feasible only in tabular domains.

Q-Learning • reinforcement-learning • +2
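
The "energy-based policy" phrasing above corresponds to treating the soft Q-function as a negative energy, so that actions are drawn with probability proportional to exp(Q(s, a) / alpha). A toy discretized illustration follows; the temperature and Q-values are assumptions, and the paper's continuous-action method relies on a learned sampler rather than this enumeration.

```python
import numpy as np

def energy_based_policy(q_values, alpha=1.0):
    """Return pi(a|s) proportional to exp(Q(s, a) / alpha) over a discrete action set.
    Toy illustration of an energy-based policy; the paper handles continuous
    actions with an amortized sampler instead of enumerating actions."""
    logits = np.asarray(q_values) / alpha
    probs = np.exp(logits - logits.max())           # subtract max for numerical stability
    return probs / probs.sum()

q = [1.0, 2.0, 0.5]                                 # hypothetical Q(s, a) for 3 actions
print(energy_based_policy(q, alpha=0.5))            # higher-Q actions receive more mass
```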

#Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning

3 code implementations • NeurIPS 2017 • Haoran Tang, Rein Houthooft, Davis Foote, Adam Stooke, Xi Chen, Yan Duan, John Schulman, Filip De Turck, Pieter Abbeel

In this work, we describe a surprising finding: a simple generalization of the classic count-based approach can reach near state-of-the-art performance on various high-dimensional and/or continuous deep RL benchmarks.

Atari Games • continuous-control • +4
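
The generalization referred to above counts visits to hashed versions of high-dimensional states and pays an exploration bonus that decays with the count. A small SimHash-style sketch follows, where the projection size and bonus scale beta are illustrative choices rather than the paper's tuned settings.

```python
import numpy as np
from collections import defaultdict

class CountExplorationBonus:
    """Count-based exploration via state hashing: hash the continuous state with
    random projections (SimHash-style) and add a bonus beta / sqrt(n(hash(s)))
    to the reward. Projection size and beta below are illustrative."""

    def __init__(self, state_dim, n_bits=16, beta=0.01, seed=0):
        rng = np.random.default_rng(seed)
        self.A = rng.standard_normal((n_bits, state_dim))  # random projection matrix
        self.beta = beta
        self.counts = defaultdict(int)

    def bonus(self, state):
        code = tuple((self.A @ np.asarray(state) > 0).astype(int))  # binary hash code
        self.counts[code] += 1
        return self.beta / np.sqrt(self.counts[code])

explorer = CountExplorationBonus(state_dim=4)
print(explorer.bonus(np.array([0.1, -0.3, 0.7, 0.0])))  # large bonus for a rarely seen state
```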
