Search Results for author: Yujin Wang

Found 16 papers, 3 papers with code

SRCB at SemEval-2022 Task 5: Pretraining Based Image to Text Late Sequential Fusion System for Multimodal Misogynous Meme Identification

no code implementations SemEval (NAACL) 2022 Jing Zhang, Yujin Wang

Online misogyny meme detection is an image/text multimodal classification task; the complicated relation between image and text challenges an intelligent system’s modality fusion learning capability.

Image to text

S2R-HDR: A Large-Scale Rendered Dataset for HDR Fusion

no code implementations 10 Apr 2025 Yujin Wang, Jiarui Wu, Yichen Bian, Fan Zhang, Tianfan Xue

The generalization of learning-based high dynamic range (HDR) fusion is often limited by the availability of training data, as collecting large-scale HDR images from dynamic scenes is both costly and technically challenging.

Domain Adaptation HDR Reconstruction

EGVD: Event-Guided Video Diffusion Model for Physically Realistic Large-Motion Frame Interpolation

1 code implementation 26 Mar 2025 Ziran Zhang, Xiaohui Li, Yihao Liu, Yujin Wang, Yueting Chen, Tianfan Xue, Shi Guo

Video frame interpolation (VFI) in scenarios with large motion remains challenging due to motion ambiguity between frames.

Video Frame Interpolation

RAD: Retrieval-Augmented Decision-Making of Meta-Actions with Vision-Language Models in Autonomous Driving

no code implementations 18 Mar 2025 Yujin Wang, Quanfeng Liu, Zhengxin Jiang, Tianyi Wang, Junfeng Jiao, Hongqing Chu, Bingzhao Gao, Hong Chen

Additionally, we fine-tune VLMs on a specifically curated dataset derived from the NuScenes dataset to enhance their spatial perception and bird's-eye view image comprehension capabilities.

Autonomous Driving Decision Making +5

Goal Conditioned Reinforcement Learning for Photo Finishing Tuning

no code implementations 10 Mar 2025 Jiarui Wu, Yujin Wang, Lingen Li, Zhang Fan, Tianfan Xue

To overcome these challenges, we propose a novel goal-conditioned reinforcement learning framework for efficiently tuning parameters using a goal image as a condition.

reinforcement-learning Reinforcement Learning

KAPPA: A Generic Patent Analysis Framework with Keyphrase-Based Portraits

no code implementations 18 Feb 2025 Xin Xia, Yujin Wang, Jun Zhou, Guisheng Zhong, Linning Cai, Chen Zhang

In this paper, we introduce KAPPA, an integrated framework designed to construct keyphrase-based patent portraits and enhance patent analysis.

Keyphrase Generation

UltraFusion: Ultra High Dynamic Imaging using Exposure Fusion

no code implementations CVPR 2025 Zixuan Chen, Yujin Wang, Xin Cai, Zhiyuan You, Zheming Lu, Fan Zhang, Shi Guo, Tianfan Xue

In this work, we propose UltraFusion, the first exposure fusion technique that can merge inputs with a 9-stop exposure difference.

Tone Mapping

RAC3: Retrieval-Augmented Corner Case Comprehension for Autonomous Driving with Vision-Language Models

no code implementations 15 Dec 2024 Yujin Wang, Quanfeng Liu, Jiaqi Fan, Jinlong Hong, Hongqing Chu, Mengjian Tian, Bingzhao Gao, Hong Chen

We evaluate RAC3 through extensive experiments using a curated dataset of corner case scenarios, demonstrating its ability to enhance semantic alignment, improve hallucination mitigation, and achieve superior performance metrics, such as Cosine Similarity and ROUGE-L scores.

Autonomous Driving Contrastive Learning +5

AdaptiveISP: Learning an Adaptive Image Signal Processor for Object Detection

no code implementations 30 Oct 2024 Yujin Wang, Tianyi Xu, Fan Zhang, Tianfan Xue, Jinwei Gu

Based on this, AdaptiveISP utilizes deep reinforcement learning to automatically generate an optimal ISP pipeline and the associated ISP parameters to maximize the detection performance.

Deep Reinforcement Learning object-detection +1

DualDn: Dual-domain Denoising via Differentiable ISP

1 code implementation 27 Sep 2024 Ruikang Li, Yujin Wang, Shiqi Chen, Fan Zhang, Jinwei Gu, Tianfan Xue

The raw domain denoising adapts to sensor-specific noise as well as spatially varying noise levels, while the sRGB domain denoising adapts to ISP variations and removes residual noise amplified by the ISP.

Image Denoising

HDRFlow: Real-Time HDR Video Reconstruction with Large Motions

no code implementations CVPR 2024 Gangwei Xu, Yujin Wang, Jinwei Gu, Tianfan Xue, Xin Yang

HDRFlow has three novel designs: an HDR-domain alignment loss (HALoss), an efficient flow network with a multi-size large kernel (MLK), and a new HDR flow training scheme.

Optical Flow Estimation Video Reconstruction

Reconstruct-and-Generate Diffusion Model for Detail-Preserving Image Denoising

no code implementations 19 Sep 2023 Yujin Wang, Lingen Li, Tianfan Xue, Jinwei Gu

To address the trade-off between visual appeal and fidelity of high-frequency details in denoising tasks, we propose a novel approach called the Reconstruct-and-Generate Diffusion Model (RnG).

Image Denoising Image Restoration

Front-End Adapter: Adapting Front-End Input of Speech based Self-Supervised Learning for Speech Recognition

no code implementations 18 Feb 2023 Xie Chen, Ziyang Ma, Changli Tang, Yujin Wang, Zhisheng Zheng

However, the training of SSL models is computationally expensive and a common practice is to fine-tune a released SSL model on the specific task.

Self-Supervised Learning speech-recognition +1
