Search Results for author: Hanting Chen

Found 58 papers, 25 papers with code

Pangu DeepDiver: Adaptive Search Intensity Scaling via Open-Web Reinforcement Learning

no code implementations 30 May 2025 Wenxuan Shi, Haochen Tan, Chuqiao Kuang, Xiaoguang Li, Xiaozhe Ren, Chen Zhang, Hanting Chen, Yasheng Wang, Lifeng Shang, Fisher Yu, Yunhe Wang

Building on this dataset, we propose DeepDiver, a Reinforcement Learning (RL) framework that promotes Search Intensity Scaling (SIS) by encouraging adaptive search policies through exploration in a real-world open-web environment.

Question Answering Reinforcement Learning (RL)

Pangu Embedded: An Efficient Dual-system LLM Reasoner with Metacognition

no code implementations 28 May 2025 Hanting Chen, Yasheng Wang, Kai Han, Dong Li, Lin Li, Zhenni Bi, Jinpeng Li, Haoyu Wang, Fei Mi, Mingjian Zhu, Bin Wang, Kaikai Song, Yifei Fu, Xu He, Yu Luo, Chong Zhu, Quan He, Xueyu Wu, wei he, Hailin Hu, Yehui Tang, DaCheng Tao, Xinghao Chen, Yunhe Wang

This work presents Pangu Embedded, an efficient Large Language Model (LLM) reasoner developed on Ascend Neural Processing Units (NPUs), featuring flexible fast and slow thinking capabilities.

Large Language Model

Pangu Light: Weight Re-Initialization for Pruning and Accelerating LLMs

no code implementations 26 May 2025 Hanting Chen, Jiarui Qin, Jialong Guo, Tao Yuan, Yichun Yin, HuiLing Zhen, Yasheng Wang, Jinpeng Li, Xiaojun Meng, Meng Zhang, Rongju Ruan, Zheyuan Bai, Yehui Tang, Can Chen, Xinghao Chen, Fisher Yu, Ruiming Tang, Yunhe Wang

While structured pruning offers a promising avenue for model compression, existing methods often struggle with the detrimental effects of aggressive, simultaneous width and depth reductions, leading to substantial performance degradation.

Model Compression

DSPO: Direct Semantic Preference Optimization for Real-World Image Super-Resolution

no code implementations 21 Apr 2025 Miaomiao Cai, Simiao Li, Wei Li, Xudong Huang, Hanting Chen, Jie Hu, Yunhe Wang

Recent advances in diffusion models have improved Real-World Image Super-Resolution (Real-ISR), but existing methods lack human feedback integration, risking misalignment with human preferences and potentially leading to artifacts, hallucinations, and harmful content generation.

Image Super-Resolution

A Physics-guided Multimodal Transformer Path to Weather and Climate Sciences

no code implementations 19 Apr 2025 Jing Han, Hanting Chen, Kai Han, Xiaomeng Huang, Yongyun Hu, Wenjun Xu, DaCheng Tao, Ping Zhang

With the rapid development of machine learning in recent years, many problems in meteorology can now be addressed using AI models.

Transferable text data distillation by trajectory matching

no code implementations 14 Apr 2025 Rong Yao, Hailin Hu, Yifei Fu, Hanting Chen, Wenyi Fang, Fanyi Du, Kai Han, Yunhe Wang

In the realm of large language models (LLMs), increasing model size also brings higher training costs.

ARC Large Language Model +2

U-REPA: Aligning Diffusion U-Nets to ViTs

no code implementations 24 Mar 2025 Yuchuan Tian, Hanting Chen, Mengyu Zheng, Yuchen Liang, Chao Xu, Yunhe Wang

Representation Alignment (REPA) that aligns Diffusion Transformer (DiT) hidden-states with ViT visual encoders has proven highly effective in DiT training, demonstrating superior convergence properties, but it has not been validated on the canonical diffusion U-Net architecture that shows faster convergence compared to DiTs.

Image Generation

Autoregressive Image Generation Guided by Chains of Thought

no code implementations 24 Feb 2025 Miaomiao Cai, Guanjie Wang, Wei Li, Zhijun Tu, Hanting Chen, Shaohui Lin, Jie Hu

In the field of autoregressive (AR) image generation, models based on the 'next-token prediction' paradigm of LLMs have shown comparable performance to diffusion models by reducing inductive biases.

Image Generation Logical Reasoning

Unshackling Context Length: An Efficient Selective Attention Approach through Query-Key Compression

no code implementations 20 Feb 2025 Haoyu Wang, Tong Teng, Tianyu Guo, An Xiao, Duyu Tang, Hanting Chen, Yunhe Wang

Handling long-context sequences efficiently remains a significant challenge in large language models (LLMs).

8k

Learning Quantized Adaptive Conditions for Diffusion Models

no code implementations 26 Sep 2024 Yuchen Liang, Yuchuan Tian, Lei Yu, Huao Tang, Jie Hu, Xiangzhong Fang, Hanting Chen

The curvature of ODE trajectories in diffusion models hinders their ability to generate high-quality images in a small number of function evaluations (NFE).

One Step Diffusion-based Super-Resolution with Time-Aware Distillation

1 code implementation 14 Aug 2024 Xiao He, Huaao Tang, Zhijun Tu, Junchao Zhang, Kun Cheng, Hanting Chen, Yong Guo, Mingrui Zhu, Nannan Wang, Xinbo Gao, Jie Hu

Specifically, we introduce a novel score distillation strategy to align the data distribution between the outputs of the student and teacher models after minor noise perturbation.

Image Super-Resolution Knowledge Distillation

Omni-Dimensional Frequency Learner for General Time Series Analysis

no code implementations 15 Jul 2024 Xianing Chen, Hanting Chen, Hailin Hu

Frequency-domain representations of time series offer a concise way to handle real-world time series data with their inherent complexity and dynamic nature.

Anomaly Detection Diversity +3

Instruct-IPT: All-in-One Image Processing Transformer via Weight Modulation

1 code implementation 30 Jun 2024 Yuchuan Tian, Jianhong Han, Hanting Chen, Yuanyuan Xi, Ning Ding, Jie Hu, Chao Xu, Yunhe Wang

Due to the unaffordable size and intensive computation costs of low-level vision models, All-in-One models designed to address several low-level vision tasks simultaneously have become popular.

All Deblurring +3

GIM: A Million-scale Benchmark for Generative Image Manipulation Detection and Localization

1 code implementation 24 Jun 2024 Yirui Chen, Xudong Huang, Quan Zhang, Wei Li, Mingjian Zhu, Qiangyu Yan, Simiao Li, Hanting Chen, Hailin Hu, Jie Yang, Wei Liu, Jie Hu

The extraordinary ability of generative models has driven a new trend in image editing and realistic image generation, posing a serious threat to the trustworthiness of multimedia data and driving research on image manipulation detection and localization (IMDL).

Image Manipulation Image Manipulation Detection

Collaboration of Teachers for Semi-supervised Object Detection

no code implementations 22 May 2024 Liyu Chen, Huaao Tang, Yi Wen, Hanting Chen, Wei Li, Junchao Liu, Jie Hu

To address these issues, we propose the Collaboration of Teachers Framework (CTF), which consists of multiple pairs of teacher and student models for training.

Object object-detection +2

U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers

1 code implementation 4 May 2024 Yuchuan Tian, Zhijun Tu, Hanting Chen, Jie Hu, Chao Xu, Yunhe Wang

Diffusion Transformers (DiTs) introduce the transformer architecture to diffusion tasks for latent-space image generation.

Image Generation Inductive Bias

LIPT: Latency-aware Image Processing Transformer

1 code implementation 9 Apr 2024 Junbo Qiao, Wei Li, Haizhen Xie, Hanting Chen, Yunshuai Zhou, Zhijun Tu, Jie Hu, Shaohui Lin

Extensive experiments on multiple image processing tasks (e.g., image super-resolution (SR), JPEG artifact reduction, and image denoising) demonstrate the superiority of LIPT in terms of both latency and PSNR.

Image Denoising Image Super-Resolution

Knowledge Distillation with Multi-granularity Mixture of Priors for Image Super-Resolution

no code implementations 3 Apr 2024 Simiao Li, Yun Zhang, Wei Li, Hanting Chen, Wenjia Wang, BingYi Jing, Shaohui Lin, Jie Hu

Knowledge distillation (KD) is a promising yet challenging model compression technique that transfers rich learning representations from a well-performing but cumbersome teacher model to a compact student model.

Image Super-Resolution Knowledge Distillation +1
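
For readers unfamiliar with the technique, a minimal logit-distillation sketch follows; it illustrates only the generic Hinton-style teacher-to-student transfer with softened targets, not the multi-granularity mixture-of-priors method of this paper, and the function name kd_loss and its hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.5):
    # Generic logit distillation: match the teacher's softened distribution
    # (KL term scaled by T^2) and keep the usual hard-label cross-entropy.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1 - alpha) * hard
```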

IPT-V2: Efficient Image Processing Transformer using Hierarchical Attentions

no code implementations 31 Mar 2024 Zhijun Tu, Kunpeng Du, Hanting Chen, Hailing Wang, Wei Li, Jie Hu, Yunhe Wang

Recent advances have demonstrated the powerful capability of transformer architecture in image restoration.

Deblurring Denoising +3

DiJiang: Efficient Large Language Models through Compact Kernelization

1 code implementation 29 Mar 2024 Hanting Chen, Zhicheng Liu, Xutao Wang, Yuchuan Tian, Yunhe Wang

In an effort to reduce the computational load of Transformers, research on linear attention has gained significant momentum.
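
As background on the linear-attention line of work mentioned above, here is a minimal sketch of generic kernelized linear attention; the ELU+1 feature map follows common practice and is an assumption for illustration, not DiJiang's kernelization. It shows how associativity turns the quadratic softmax(QK^T)V cost into one that is linear in sequence length.

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    # q, k: (B, N, D); v: (B, N, E). Apply a positive feature map, then use
    # phi(Q) (phi(K)^T V) so the N x N attention matrix is never formed.
    phi = lambda x: F.elu(x) + 1.0
    q, k = phi(q), phi(k)
    kv = torch.einsum('bnd,bne->bde', k, v)                 # (B, D, E), linear in N
    z = 1.0 / (torch.einsum('bnd,bd->bn', q, k.sum(dim=1)) + eps)  # normalizer
    return torch.einsum('bnd,bde,bn->bne', q, kv, z)        # (B, N, E)
```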

Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

1 code implementation 6 Feb 2024 Jianyuan Guo, Hanting Chen, Chengcheng Wang, Kai Han, Chang Xu, Yunhe Wang

Recent advancements in large language models have sparked interest in their extraordinary and near-superhuman capabilities, leading researchers to explore methods for evaluating and optimizing these abilities, which is called superalignment.

Few-Shot Learning Knowledge Distillation +1

Image Processing GNN: Breaking Rigidity in Super-Resolution

1 code implementation CVPR 2024 Yuchuan Tian, Hanting Chen, Chao Xu, Yunhe Wang

Alternatively, we leverage the flexibility of graphs and propose the Image Processing GNN (IPG) model to break the rigidity that dominates previous SR methods.

Image Super-Resolution

PanGu-$\pi$: Enhancing Language Model Architectures via Nonlinearity Compensation

no code implementations 27 Dec 2023 Yunhe Wang, Hanting Chen, Yehui Tang, Tianyu Guo, Kai Han, Ying Nie, Xutao Wang, Hailin Hu, Zheyuan Bai, Yun Wang, Fangcheng Liu, Zhicheng Liu, Jianyuan Guo, Sinan Zeng, Yinchen Zhang, Qinghua Xu, Qun Liu, Jun Yao, Chao Xu, DaCheng Tao

We then demonstrate that the proposed approach is significantly effective for enhancing the model nonlinearity through carefully designed ablations; thus, we present a new efficient model architecture for establishing modern LLMs, namely PanGu-$\pi$.

Language Modeling Language Modelling

CBQ: Cross-Block Quantization for Large Language Models

no code implementations 13 Dec 2023 Xin Ding, Xiaoyu Liu, Zhijun Tu, Yun Zhang, Wei Li, Jie Hu, Hanting Chen, Yehui Tang, Zhiwei Xiong, Baoqun Yin, Yunhe Wang

Post-training quantization (PTQ) has played a key role in compressing large language models (LLMs) with ultra-low costs.

Quantization

GenDet: Towards Good Generalizations for AI-Generated Image Detection

1 code implementation 12 Dec 2023 Mingjian Zhu, Hanting Chen, Mouxiao Huang, Wei Li, Hailin Hu, Jie Hu, Yunhe Wang

The misuse of AI imagery can have harmful societal effects, prompting the creation of detectors to combat issues like the spread of fake news.

Anomaly Detection

Towards Higher Ranks via Adversarial Weight Pruning

1 code implementation NeurIPS 2023 Yuchuan Tian, Hanting Chen, Tianyu Guo, Chao Xu, Yunhe Wang

To this end, we propose a Rank-based PruninG (RPG) method to maintain the ranks of sparse weights in an adversarial manner.

Model Compression Network Pruning

IFT: Image Fusion Transformer for Ghost-free High Dynamic Range Imaging

no code implementations 26 Sep 2023 Hailing Wang, Wei Li, Yuanyuan Xi, Jie Hu, Hanting Chen, Longyu Li, Yunhe Wang

By matching similar patches between frames, objects with large motion ranges in dynamic scenes can be aligned, which can effectively alleviate the generation of artifacts.

Data Upcycling Knowledge Distillation for Image Super-Resolution

1 code implementation 25 Sep 2023 Yun Zhang, Wei Li, Simiao Li, Hanting Chen, Zhijun Tu, Wenjia Wang, BingYi Jing, Shaohui Lin, Jie Hu

Knowledge distillation (KD) compresses deep neural networks by transferring task-related knowledge from cumbersome pre-trained teacher models to compact student models.

Image Super-Resolution Knowledge Distillation +1

Multiscale Positive-Unlabeled Detection of AI-Generated Texts

3 code implementations 29 May 2023 Yuchuan Tian, Hanting Chen, Xutao Wang, Zheyuan Bai, Qinghua Zhang, Ruifeng Li, Chao Xu, Yunhe Wang

Recent releases of Large Language Models (LLMs), e.g., ChatGPT, are astonishingly good at generating human-like texts, but they may impact the authenticity of texts.

Language Modelling text-classification +2

VanillaNet: the Power of Minimalism in Deep Learning

5 code implementations NeurIPS 2023 Hanting Chen, Yunhe Wang, Jianyuan Guo, DaCheng Tao

In this study, we introduce VanillaNet, a neural network architecture that embraces elegance in design.

Deep Learning Philosophy

RefSR-NeRF: Towards High Fidelity and Super Resolution View Synthesis

1 code implementation CVPR 2023 Xudong Huang, Wei Li, Jie Hu, Hanting Chen, Yunhe Wang

We present Reference-guided Super-Resolution Neural Radiance Field (RefSR-NeRF) that extends NeRF to super resolution and photorealistic novel view synthesis.

NeRF Neural Rendering +2

Toward Accurate Post-Training Quantization for Image Super Resolution

2 code implementations CVPR 2023 Zhijun Tu, Jie Hu, Hanting Chen, Yunhe Wang

In this paper, we study post-training quantization (PTQ) for image super-resolution using only a few unlabeled calibration images.

Image Super-Resolution Quantization
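
For orientation only, the sketch below shows the simplest form of post-training quantization calibration, a min-max uniform quantizer fitted on a handful of unlabeled activations; it is an assumed baseline for illustration, not the calibration scheme proposed in this paper.

```python
import numpy as np

def calibrate_uniform_quantizer(samples, num_bits=8):
    # Fit scale and zero point from the observed range of a few calibration samples.
    lo, hi = float(samples.min()), float(samples.max())
    qmax = 2 ** num_bits - 1
    scale = max(hi - lo, 1e-8) / qmax
    zero_point = int(round(-lo / scale))
    return scale, zero_point

def fake_quantize(x, scale, zero_point, num_bits=8):
    # Round to the integer grid, clip, then dequantize back to float.
    q = np.clip(np.round(x / scale) + zero_point, 0, 2 ** num_bits - 1)
    return (q - zero_point) * scale
```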

Brain-inspired Multilayer Perceptron with Spiking Neurons

4 code implementations CVPR 2022 Wenshuo Li, Hanting Chen, Jianyuan Guo, Ziyang Zhang, Yunhe Wang

However, due to the simplicity of their structures, their performance depends heavily on the local feature communication mechanism.

Inductive Bias

Adder Attention for Vision Transformer

4 code implementations NeurIPS 2021 Han Shu, Jiahao Wang, Hanting Chen, Lin Li, Yujiu Yang, Yunhe Wang

With the new operation, vision transformers constructed using additions can also provide powerful feature representations.

Diversity

Positive and Unlabeled Federated Learning

no code implementations 29 Sep 2021 Lin Xinyang, Hanting Chen, Yixing Xu, Chao Xu, Xiaolin Gui, Yiping Deng, Yunhe Wang

We study the problem of learning from positive and unlabeled (PU) data in the federated setting, where each client only labels a small portion of their dataset due to limited resources and time.

Federated Learning

Federated Learning with Positive and Unlabeled Data

1 code implementation 21 Jun 2021 Xinyang Lin, Hanting Chen, Yixing Xu, Chao Xu, Xiaolin Gui, Yiping Deng, Yunhe Wang

We study the problem of learning from positive and unlabeled (PU) data in the federated setting, where each client only labels a small portion of their dataset due to limited resources and time.

Federated Learning
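
As background on the PU setting described above, here is a minimal sketch of the standard non-negative PU risk estimator used in centralized training; the `prior` argument and the softplus loss are illustrative assumptions, and the paper's federated objective is not reproduced here.

```python
import torch
import torch.nn.functional as F

def nn_pu_risk(scores_pos, scores_unl, prior, loss=F.softplus):
    # Non-negative PU risk: pi * R_p^+ + max(0, R_u^- - pi * R_p^-),
    # where prior = P(y = +1) and only positive and unlabeled scores are available.
    r_pos = loss(-scores_pos).mean()        # positives treated as label +1
    r_pos_as_neg = loss(scores_pos).mean()  # positives treated as label -1
    r_unl_as_neg = loss(scores_unl).mean()  # unlabeled treated as label -1
    neg_risk = r_unl_as_neg - prior * r_pos_as_neg
    return prior * r_pos + torch.clamp(neg_risk, min=0.0)
```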

Learning Student Networks in the Wild

1 code implementation CVPR 2021 Hanting Chen, Tianyu Guo, Chang Xu, Wenshuo Li, Chunjing Xu, Chao Xu, Yunhe Wang

Experiments on various datasets demonstrate that the student networks learned by the proposed method can achieve comparable performance with those using the original dataset.

Knowledge Distillation Model Compression

Data-Free Knowledge Distillation for Image Super-Resolution

no code implementations CVPR 2021 Yiman Zhang, Hanting Chen, Xinghao Chen, Yiping Deng, Chunjing Xu, Yunhe Wang

Experiments on various datasets and architectures demonstrate that the proposed method can effectively learn portable student networks without the original data, e.g., with a 0.16dB PSNR drop on Set5 for x2 super-resolution.

Data-free Knowledge Distillation Image Super-Resolution +1

Universal Adder Neural Networks

no code implementations 29 May 2021 Hanting Chen, Yunhe Wang, Chang Xu, Chao Xu, Chunjing Xu, Tong Zhang

The widely used convolutions in deep neural networks are essentially cross-correlations that measure the similarity between input features and convolution filters, which involves massive multiplications between floating-point values.

Winograd Algorithm for AdderNet

no code implementations 12 May 2021 Wenshuo Li, Hanting Chen, Mingqiang Huang, Xinghao Chen, Chunjing Xu, Yunhe Wang

Adder neural network (AdderNet) is a new kind of deep model that replaces the original massive multiplications in convolutions by additions while preserving the high performance.

valid

AdderNet and its Minimalist Hardware Design for Energy-Efficient Artificial Intelligence

no code implementations 25 Jan 2021 Yunhe Wang, Mingqiang Huang, Kai Han, Hanting Chen, Wei zhang, Chunjing Xu, DaCheng Tao

Through a comprehensive comparison of performance, power consumption, hardware resource consumption, and network generalization capability, we conclude that AdderNet surpasses all other competitors, including the classical CNN, the novel memristor network, XNOR-Net, and the shift-kernel-based network, indicating its great potential for future high-performance and energy-efficient artificial intelligence applications.

Quantization

A Survey on Visual Transformer

no code implementations 23 Dec 2020 Kai Han, Yunhe Wang, Hanting Chen, Xinghao Chen, Jianyuan Guo, Zhenhua Liu, Yehui Tang, An Xiao, Chunjing Xu, Yixing Xu, Zhaohui Yang, Yiman Zhang, DaCheng Tao

Transformer, first applied to the field of natural language processing, is a type of deep neural network mainly based on the self-attention mechanism.

Image Classification Inductive Bias +1

Pre-Trained Image Processing Transformer

6 code implementations CVPR 2021 Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Chao Xu, Wen Gao

To maximally exploit the capability of the transformer, we propose to utilize the well-known ImageNet benchmark to generate a large number of corrupted image pairs.

 Ranked #1 on Single Image Deraining on Rain100L (using extra training data)

Color Image Denoising Contrastive Learning +2

AdderSR: Towards Energy Efficient Image Super-Resolution

no code implementations CVPR 2021 Dehua Song, Yunhe Wang, Hanting Chen, Chang Xu, Chunjing Xu, DaCheng Tao

To this end, we thoroughly analyze the relationship between an adder operation and the identity mapping and insert shortcuts to enhance the performance of SR models using adder networks.

image-classification Image Classification +1

A Semi-Supervised Assessor of Neural Architectures

no code implementations CVPR 2020 Yehui Tang, Yunhe Wang, Yixing Xu, Hanting Chen, Chunjing Xu, Boxin Shi, Chao Xu, Qi Tian, Chang Xu

A graph convolutional neural network is introduced to predict the performance of architectures based on the learned representations and their relation modeled by the graph.

Neural Architecture Search

Distilling portable Generative Adversarial Networks for Image Translation

no code implementations 7 Mar 2020 Hanting Chen, Yunhe Wang, Han Shu, Changyuan Wen, Chunjing Xu, Boxin Shi, Chao Xu, Chang Xu

To promote the capability of the student generator, we include a student discriminator to measure the distances between real images and images generated by the student and teacher generators.

Image-to-Image Translation Knowledge Distillation +1

Widening and Squeezing: Towards Accurate and Efficient QNNs

no code implementations 3 Feb 2020 Chuanjian Liu, Kai Han, Yunhe Wang, Hanting Chen, Qi Tian, Chunjing Xu

Quantization neural networks (QNNs) are very attractive to industry because of their extremely cheap computation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.

Quantization

AdderNet: Do We Really Need Multiplications in Deep Learning?

7 code implementations CVPR 2020 Hanting Chen, Yunhe Wang, Chunjing Xu, Boxin Shi, Chao Xu, Qi Tian, Chang Xu

The widely used convolutions in deep neural networks are essentially cross-correlations that measure the similarity between input features and convolution filters, which involves massive multiplications between floating-point values.

Deep Learning
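
A minimal sketch of the addition-based alternative described in this entry follows: the output of an adder layer is the negative L1 distance between input patches and filters, so the core operation needs no multiplications. This naive unfold-based version is written for clarity rather than efficiency and omits the paper's normalization and gradient details.

```python
import torch
import torch.nn.functional as F

def adder_conv2d(x, weight, stride=1, padding=0):
    # x: (B, C_in, H, W); weight: (C_out, C_in, kh, kw).
    # Similarity = negative sum of |patch - filter| over each patch.
    b = x.shape[0]
    c_out, _, kh, kw = weight.shape
    patches = F.unfold(x, (kh, kw), stride=stride, padding=padding)  # (B, C_in*kh*kw, L)
    w = weight.view(c_out, -1)                                       # (C_out, C_in*kh*kw)
    out = -(patches.unsqueeze(1) - w.unsqueeze(0).unsqueeze(-1)).abs().sum(dim=2)  # (B, C_out, L)
    h_out = (x.shape[2] + 2 * padding - kh) // stride + 1
    w_out = (x.shape[3] + 2 * padding - kw) // stride + 1
    return out.view(b, c_out, h_out, w_out)
```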

Positive-Unlabeled Compression on the Cloud

2 code implementations NeurIPS 2019 Yixing Xu, Yunhe Wang, Hanting Chen, Kai Han, Chunjing Xu, DaCheng Tao, Chang Xu

In practice, only a small portion of the original training set is required as positive examples, and more useful training examples can be obtained from the massive unlabeled data on the cloud through a PU classifier with an attention-based multi-scale feature extractor.

Knowledge Distillation

Co-Evolutionary Compression for Unpaired Image Translation

2 code implementations ICCV 2019 Han Shu, Yunhe Wang, Xu Jia, Kai Han, Hanting Chen, Chunjing Xu, Qi Tian, Chang Xu

Generative adversarial networks (GANs) have been successfully used for numerous computer vision tasks, especially image-to-image translation.

Image-to-Image Translation Translation

Data-Free Learning of Student Networks

3 code implementations ICCV 2019 Hanting Chen, Yunhe Wang, Chang Xu, Zhaohui Yang, Chuanjian Liu, Boxin Shi, Chunjing Xu, Chao Xu, Qi Tian

Learning portable neural networks is essential for computer vision, so that pre-trained heavy deep models can be deployed on edge devices such as mobile phones and micro sensors.

Neural Network Compression

Learning Student Networks via Feature Embedding

no code implementations 17 Dec 2018 Hanting Chen, Yunhe Wang, Chang Xu, Chao Xu, DaCheng Tao

Experiments on benchmark datasets and well-trained networks suggest that the proposed algorithm is superior to state-of-the-art teacher-student learning methods in terms of computational and storage complexity.

Knowledge Distillation
