Search Results for author: WenBo Hu

Found 45 papers, 20 papers with code

Revisiting PCA for time series reduction in temporal dimension

1 code implementation 27 Dec 2024 Jiaxin Gao, WenBo Hu, Yuntian Chen

Deep learning has significantly advanced time series analysis (TSA), enabling the extraction of complex patterns for tasks like classification, forecasting, and regression.

Computational Efficiency Dimensionality Reduction +2

SATA: A Paradigm for LLM Jailbreak via Simple Assistive Task Linkage

no code implementations 19 Dec 2024 Xiaoning Dong, WenBo Hu, Wei Xu, Tianxing He

It then employs a simple assistive task such as a masked language model task or an element lookup by position task to encode the semantics of the masked keywords.

Language Modeling Language Modelling +3

Unveiling Uncertainty: A Deep Dive into Calibration and Performance of Multimodal Large Language Models

1 code implementation 19 Dec 2024 Zijun Chen, WenBo Hu, Guande He, Zhijie Deng, Zheng Zhang, Richang Hong

This paper investigates representative MLLMs, focusing on their calibration across various scenarios, including before and after visual fine-tuning, as well as before and after multimodal training of the base LLMs.

Autonomous Driving Image Captioning +2

Verbalized Representation Learning for Interpretable Few-Shot Generalization

1 code implementation 27 Nov 2024 Cheng-Fu Yang, Da Yin, WenBo Hu, Nanyun Peng, Bolei Zhou, Kai-Wei Chang

Humans recognize objects after observing only a few examples, a remarkable capability enabled by their inherent language understanding of the real-world environment.

Language Modeling Language Modelling +2

TimeFormer: Capturing Temporal Relationships of Deformable 3D Gaussians for Robust Reconstruction

no code implementations 18 Nov 2024 Dadong Jiang, Zhihui Ke, Xiaobo Zhou, Zhi Hou, Xianghui Yang, WenBo Hu, Tie Qiu, Chunchao Guo

To address the above issue, we design a plug-and-play module called TimeFormer to enable existing deformable 3D Gaussians reconstruction methods with the ability to implicitly model motion patterns from a learning perspective.

MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models

no code implementations 10 Oct 2024 WenBo Hu, Jia-Chen Gu, Zi-Yi Dou, Mohsen Fayyaz, Pan Lu, Kai-Wei Chang, Nanyun Peng

In this paper, we introduce a multimodal retrieval-augmented generation benchmark, MRAG-Bench, in which we systematically identify and categorize scenarios where visually augmented knowledge is better than textual knowledge, for instance, more images from varying viewpoints.

Multiple-choice Question Answering +1

StereoCrafter: Diffusion-based Generation of Long and High-fidelity Stereoscopic 3D from Monocular Videos

no code implementations 11 Sep 2024 Sijie Zhao, WenBo Hu, Xiaodong Cun, Yong Zhang, Xiaoyu Li, Zhe Kong, Xiangjun Gao, Muyao Niu, Ying Shan

This paper presents a novel framework for converting 2D videos to immersive stereoscopic 3D, addressing the growing demand for 3D content in immersive experience.

Video Inpainting

ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis

1 code implementation 3 Sep 2024 Wangbo Yu, Jinbo Xing, Li Yuan, WenBo Hu, Xiaoyu Li, Zhipeng Huang, Xiangjun Gao, Tien-Tsin Wong, Ying Shan, Yonghong Tian

Our method takes advantage of the powerful generation capabilities of video diffusion model and the coarse 3D clues offered by point-based representation to generate high-quality video frames with precise camera pose control.

3D Generation 3D Reconstruction +3

DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos

1 code implementation 3 Sep 2024 WenBo Hu, Xiangjun Gao, Xiaoyu Li, Sijie Zhao, Xiaodong Cun, Yong Zhang, Long Quan, Ying Shan

Our training approach enables the model to generate depth sequences with variable lengths at one time, up to 110 frames, and harvest both precise depth details and rich content diversity from realistic and synthetic datasets.

Diversity Monocular Depth Estimation +2

CV-VAE: A Compatible Video VAE for Latent Generative Video Models

1 code implementation 30 May 2024 Sijie Zhao, Yong Zhang, Xiaodong Cun, Shaoshu Yang, Muyao Niu, Xiaoyu Li, WenBo Hu, Ying Shan

Moreover, since current diffusion-based approaches are often implemented using pre-trained text-to-image (T2I) models, directly training a video VAE without considering the compatibility with existing T2I models will result in a latent space gap between them, which will take huge computational resources for training to bridge the gap even with the T2I models as initialization.

Quantization

Matryoshka Query Transformer for Large Vision-Language Models

1 code implementation 29 May 2024 WenBo Hu, Zi-Yi Dou, Liunian Harold Li, Amita Kamath, Nanyun Peng, Kai-Wei Chang

This raises the question: can we achieve flexibility in the number of visual tokens to suit different tasks and computational resources?

Language Modelling Representation Learning

Mani-GS: Gaussian Splatting Manipulation with Triangular Mesh

no code implementations 28 May 2024 Xiangjun Gao, Xiaoyu Li, Yiyu Zhuang, Qi Zhang, WenBo Hu, Chaopeng Zhang, Yao Yao, Ying Shan, Long Quan

This approach reduces the need to design various algorithms for different types of Gaussian manipulation.

Novel View Synthesis

Rip-NeRF: Anti-aliasing Radiance Fields with Ripmap-Encoded Platonic Solids

1 code implementation 3 May 2024 Junchen Liu, WenBo Hu, Zhuo Yang, Jianteng Chen, Guoliang Wang, Xiaoxue Chen, Yantong Cai, Huan-ang Gao, Hao Zhao

Despite significant advancements in Neural Radiance Fields (NeRFs), the renderings may still suffer from aliasing and blurring artifacts, since it remains a fundamental challenge to effectively and efficiently characterize anisotropic areas induced by the cone-casting procedure.

VALOR-EVAL: Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models

1 code implementation 22 Apr 2024 Haoyi Qiu, WenBo Hu, Zi-Yi Dou, Nanyun Peng

Our work also highlights the critical balance between faithfulness and coverage of model outputs, and encourages future works to address hallucinations in LVLMs while keeping their outputs informative.

Hallucination Informativeness +3

Inverse Rendering of Glossy Objects via the Neural Plenoptic Function and Radiance Fields

no code implementations CVPR 2024 Haoyuan Wang, WenBo Hu, Lei Zhu, Rynson W. H. Lau

Our method has two stages: the geometry of the target object and the pre-filtered environmental radiance fields are reconstructed in the first stage, and materials of the target object are estimated in the second stage with the proposed NeP and material-aware cone sampling strategy.

Inverse Rendering Object

Pixel-GS: Density Control with Pixel-aware Gradient for 3D Gaussian Splatting

no code implementations 22 Mar 2024 Zheng Zhang, WenBo Hu, Yixing Lao, Tong He, Hengshuang Zhao

3D Gaussian Splatting (3DGS) has demonstrated impressive novel view synthesis results while advancing real-time rendering performance.

Novel View Synthesis

Analytic-Splatting: Anti-Aliased 3D Gaussian Splatting via Analytic Integration

no code implementations 17 Mar 2024 Zhihao Liang, Qi Zhang, WenBo Hu, Ying Feng, Lei Zhu, Kui Jia

This is because 3DGS treats each pixel as an isolated, single point rather than as an area, causing insensitivity to changes in the footprints of pixels.

Texture-GS: Disentangling the Geometry and Texture for 3D Gaussian Splatting Editing

no code implementations 15 Mar 2024 Tian-Xing Xu, WenBo Hu, Yu-Kun Lai, Ying Shan, Song-Hai Zhang

3D Gaussian splatting, emerging as a groundbreaking approach, has drawn increasing attention for its capabilities of high-fidelity reconstruction and real-time rendering.

Disentanglement

Deep Ensembles Meets Quantile Regression: Uncertainty-aware Imputation for Time Series

no code implementations 3 Dec 2023 Ying Liu, Peng Cui, WenBo Hu, Richang Hong

Our method not only produces accurate imputations that are robust to high missing rates, but is also computationally efficient due to the fast training of its non-generative model.

Imputation Missing Values +2

DreamSpace: Dreaming Your Room Space with Text-Driven Panoramic Texture Propagation

no code implementations 19 Oct 2023 Bangbang Yang, Wenqi Dong, Lin Ma, WenBo Hu, Xiao Liu, Zhaopeng Cui, Yuewen Ma

To ensure that textures are meaningful and aligned to the scene, we develop a novel coarse-to-fine panoramic texture generation approach with dual texture alignment, which considers both the geometry and texture cues of the captured scenes.

3D geometry Texture Synthesis

Investigating Uncertainty Calibration of Aligned Language Models under the Multiple-Choice Setting

no code implementations 18 Oct 2023 Guande He, Peng Cui, Jianfei Chen, WenBo Hu, Jun Zhu

Despite the significant progress made in practical applications of aligned language models (LMs), they tend to be overconfident in output answers compared to the corresponding pre-trained LMs.

Multiple-choice

HiFi-123: Towards High-fidelity One Image to 3D Content Generation

no code implementations 10 Oct 2023 Wangbo Yu, Li Yuan, Yan-Pei Cao, Xiangjun Gao, Xiaoyu Li, WenBo Hu, Long Quan, Ying Shan, Yonghong Tian

Our contributions are twofold: First, we propose a Reference-Guided Novel View Enhancement (RGNV) technique that significantly improves the fidelity of diffusion-based zero-shot novel view synthesis methods.

3D Generation Image to 3D +1

Tri-MipRF: Tri-Mip Representation for Efficient Anti-Aliasing Neural Radiance Fields

no code implementations ICCV 2023 WenBo Hu, Yuling Wang, Lin Ma, Bangbang Yang, Lin Gao, Xiao Liu, Yuewen Ma

Despite the tremendous progress in neural radiance fields (NeRF), we still face a dilemma of the trade-off between quality and efficiency, e.g., MipNeRF presents fine-detailed and anti-aliased renderings but takes days for training, while Instant-ngp can accomplish the reconstruction in a few minutes but suffers from blurring or aliasing when rendering at various distances or resolutions due to ignoring the sampling area.

Client: Cross-variable Linear Integrated Enhanced Transformer for Multivariate Long-Term Time Series Forecasting

1 code implementation 30 May 2023 Jiaxin Gao, WenBo Hu, Yuntian Chen

Long-term time series forecasting (LTSF) is a crucial aspect of modern society, playing a pivotal role in facilitating long-term planning and developing early warning systems.

Decoder Time Series +1

Iterative Adversarial Attack on Image-guided Story Ending Generation

no code implementations 16 May 2023 Youze Wang, WenBo Hu, Richang Hong

Multimodal learning involves developing models that can integrate information from various sources like images and texts.

Adversarial Robustness Adversarial Text +4

VGTS: Visually Guided Text Spotting for Novel Categories in Historical Manuscripts

no code implementations 3 Apr 2023 WenBo Hu, Hongjian Zhan, Xinchen Ma, Cong Liu, Bing Yin, Yue Lu

In the field of historical manuscript research, scholars frequently encounter novel symbols in ancient texts, investing considerable effort in their identification and documentation.

Geometric Matching Metric Learning +4

Uncertainty Calibration for Counterfactual Propensity Estimation in Recommendation

no code implementations 23 Mar 2023 WenBo Hu, Xin Sun, Qiang Liu, Le Wu, Liang Wang

To address this, we evaluate the quality of propensity scores from the perspective of uncertainty calibration, proposing the use of expected calibration error (ECE) as a measure of propensity-score quality.

counterfactual Generalization Bounds +3

Disentangled Image Colorization via Global Anchors

1 code implementation SIGGRAPH 2022 Menghan Xia, WenBo Hu, Tien-Tsin Wong, Jue Wang

Our key insight is that several carefully located anchors could approximately represent the color distribution of an image, and conditioned on the anchor colors, we can predict the image color in a deterministic manner by utilizing internal correlation.

Colorization Image Colorization

TgDLF2.0: Theory-guided deep-learning for electrical load forecasting via Transformer and transfer learning

no code implementations 5 Oct 2022 Jiaxin Gao, WenBo Hu, Dongxiao Zhang, Yuntian Chen

Accurate electrical load forecasting is beneficial for better scheduling of electricity generation and saving electrical energy.

Deep Learning Load Forecasting +2

SUES-200: A Multi-height Multi-scene Cross-view Image Benchmark Across Drone and Satellite

1 code implementation 22 Apr 2022 Runzhe Zhu, Ling Yin, Mingze Yang, Fei Wu, Yuncheng Yang, WenBo Hu

However, existing public datasets do not include images obtained by drones at different heights, and the types of scenes are relatively homogeneous, which yields issues in assessing a model's capability to adapt to complex and changing scenes.

geo-localization

Scale-arbitrary Invertible Image Downscaling

no code implementations 29 Jan 2022 Jinbo Xing, WenBo Hu, Tien-Tsin Wong

In this paper, we propose a scale-Arbitrary Invertible image Downscaling Network (AIDN), to natively downscale HR images with arbitrary scale factors.

Super-Resolution

Conditional Directed Graph Convolution for 3D Human Pose Estimation

1 code implementation 16 Jul 2021 WenBo Hu, Changgong Zhang, Fangneng Zhan, Lei Zhang, Tien-Tsin Wong

Based on this representation, we further propose a spatial-temporal conditional directed graph convolution to leverage varying non-local dependence for different poses by conditioning the graph topology on input poses.

3D Human Pose Estimation

Sparse Needlets for Lighting Estimation with Spherical Transport Loss

no code implementations ICCV 2021 Fangneng Zhan, Changgong Zhang, WenBo Hu, Shijian Lu, Feiying Ma, Xuansong Xie, Ling Shao

Accurate lighting estimation is challenging yet critical to many computer vision and computer graphics tasks such as high-dynamic-range (HDR) relighting.

Lighting Estimation

Physics-Guided Discovery of Highly Nonlinear Parametric Partial Differential Equations

1 code implementation 2 Jun 2021 Yingtao Luo, Qiang Liu, Yuntian Chen, WenBo Hu, Tian Tian, Jun Zhu

Especially, the discovery of PDEs with highly nonlinear coefficients from low-quality data remains largely under-addressed.

Density Estimation Model Optimization

StackVAE-G: An efficient and interpretable model for time series anomaly detection

1 code implementation 18 May 2021 Wenkai Li, WenBo Hu, Ting Chen, Ning Chen, Cheng Feng

We also leverage a graph learning module to learn a sparse adjacency matrix to explicitly capture the stable interrelation structure among multiple time series channels for the interpretable pattern reconstruction of interrelated channels.

Anomaly Detection Graph Learning +3

Accurate and Reliable Forecasting using Stochastic Differential Equations

no code implementations 28 Mar 2021 Peng Cui, Zhijie Deng, WenBo Hu, Jun Zhu

It is critical yet challenging for deep learning models to properly characterize uncertainty that is pervasive in real-world environments.

Prediction Intervals Uncertainty Quantification

Bidirectional Projection Network for Cross Dimension Scene Understanding

1 code implementation CVPR 2021 WenBo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia, Tien-Tsin Wong

Via the BPM, complementary 2D and 3D information can interact with each other in multiple architectural levels, such that advantages in these two visual domains can be combined for better scene recognition.

2D Semantic Segmentation 3D Semantic Segmentation +3

GMLight: Lighting Estimation via Geometric Distribution Approximation

1 code implementation 20 Feb 2021 Fangneng Zhan, Yingchen Yu, Changgong Zhang, Rongliang Wu, WenBo Hu, Shijian Lu, Feiying Ma, Xuansong Xie, Ling Shao

This paper presents Geometric Mover's Light (GMLight), a lighting estimation framework that employs a regression network and a generative projector for effective illumination estimation.

Lighting Estimation regression

Deep Halftoning With Reversible Binary Pattern

1 code implementation ICCV 2021 Menghan Xia, WenBo Hu, Xueting Liu, Tien-Tsin Wong

Existing halftoning algorithms usually drop colors and fine details when dithering color images with binary dot patterns, which makes it extremely difficult to recover the original information.

Series Saliency: Temporal Interpretation for Multivariate Time Series Forecasting

no code implementations 16 Dec 2020 Qingyi Pan, WenBo Hu, Jun Zhu

Though deep learning methods have recently been developed to give superior forecasting results, it is crucial to improve the interpretability of time series models.

Data Augmentation Multivariate Time Series Forecasting +1

Mononizing Binocular Videos

1 code implementation 3 Sep 2020 Wenbo Hu, Menghan Xia, Chi-Wing Fu, Tien-Tsin Wong

This paper presents the idea of mono-nizing binocular videos and a framework to effectively realize it.

Image and Video Processing Graphics

SAM: Semantic Attribute Modulation for Language Modeling and Style Variation

no code implementations 1 Jul 2017 Wenbo Hu, Lifeng Hua, Lei LI, Hang Su, Tian Wang, Ning Chen, Bo Zhang

This paper presents a Semantic Attribute Modulation (SAM) for language modeling and style variation.

Attribute Language Modeling +1

Fast Sampling for Bayesian Max-Margin Models

no code implementations 27 Apr 2015 Wenbo Hu, Jun Zhu, Bo Zhang

Bayesian max-margin models have shown superiority in various practical applications, such as text categorization, collaborative prediction, social network link prediction and crowdsourcing, and they conjoin the flexibility of Bayesian modeling and predictive strengths of max-margin learning.

Link Prediction Text Categorization
