Search Results for author: Ming Sun

Found 59 papers, 13 papers with code

Compact Generalized Non-local Network

2 code implementations • NeurIPS 2018 • Kaiyu Yue, Ming Sun, Yuchen Yuan, Feng Zhou, Errui Ding, Fuxin Xu

The non-local module is designed for capturing long-range spatio-temporal dependencies in images and videos.

Inception Convolution with Efficient Dilation Search

1 code implementation • CVPR 2021 • Jie Liu, Chuming Li, Feng Liang, Chen Lin, Ming Sun, Junjie Yan, Wanli Ouyang, Dong Xu

To develop a practical method for learning complex inception convolution based on the data, a simple but effective search algorithm, referred to as efficient dilation optimization (EDO), is developed.

Human Detection Instance Segmentation +4

Multi-Attention Multi-Class Constraint for Fine-grained Image Recognition

1 code implementation • ECCV 2018 • Ming Sun, Yuchen Yuan, Feng Zhou, Errui Ding

Attention-based learning for fine-grained image recognition remains a challenging task, where most of the existing methods treat each object part in isolation, while neglecting the correlations among them.

Ranked #59 on Fine-Grained Image Classification on Stanford Cars

Fine-Grained Image Recognition Metric Learning

Improving Auto-Augment via Augmentation-Wise Weight Sharing

1 code implementation • NeurIPS 2020 • Keyu Tian, Chen Lin, Ming Sun, Luping Zhou, Junjie Yan, Wanli Ouyang

On CIFAR-10, we achieve a top-1 error rate of 1.24%, which is currently the best performing single model without extra training data.

GLiT: Neural Architecture Search for Global and Local Image Transformer

2 code implementations • ICCV 2021 • BoYu Chen, Peixia Li, Chuming Li, Baopu Li, Lei Bai, Chen Lin, Ming Sun, Junjie Yan, Wanli Ouyang

We introduce the first Neural Architecture Search (NAS) method to find a better transformer architecture for image recognition.

Ranked #498 on Image Classification on ImageNet

Image Classification Neural Architecture Search

DETR for Crowd Pedestrian Detection

1 code implementation • 12 Dec 2020 • Matthieu Lin, Chuming Li, Xingyuan Bu, Ming Sun, Chen Lin, Junjie Yan, Wanli Ouyang, Zhidong Deng

Furthermore, the bipartite match of ED harms the training efficiency due to the large ground truth number in crowd scenes.

Pedestrian Detection

BN-NAS: Neural Architecture Search with Batch Normalization

1 code implementation • ICCV 2021 • BoYu Chen, Peixia Li, Baopu Li, Chen Lin, Chuming Li, Ming Sun, Junjie Yan, Wanli Ouyang

We present BN-NAS, neural architecture search with Batch Normalization (BN-NAS), to accelerate neural architecture search (NAS).

Neural Architecture Search

Zoom-VQA: Patches, Frames and Clips Integration for Video Quality Assessment

1 code implementation • 13 Apr 2023 • Kai Zhao, Kun Yuan, Ming Sun, Xing Wen

Video quality assessment (VQA) aims to simulate the human perception of video quality, which is influenced by factors ranging from low-level color and texture details to high-level semantic content.

Video Quality Assessment Visual Question Answering (VQA)

NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment: Methods and Results

1 code implementation • 17 Apr 2024 • Xin Li, Kun Yuan, Yajing Pei, Yiting Lu, Ming Sun, Chao Zhou, Zhibo Chen, Radu Timofte, Wei Sun, HaoNing Wu, ZiCheng Zhang, Jun Jia, Zhichao Zhang, Linhan Cao, Qiubo Chen, Xiongkuo Min, Weisi Lin, Guangtao Zhai, Jianhui Sun, Tianyi Wang, Lei Li, Han Kong, Wenxuan Wang, Bing Li, Cheng Luo, Haiqiang Wang, Xiangguang Chen, Wenhui Meng, Xiang Pan, Huiying Shi, Han Zhu, Xiaozhong Xu, Lei Sun, Zhenzhong Chen, Shan Liu, Fangyuan Kong, Haotian Fan, Yifang Xu, Haoran Xu, Mengduo Yang, Jie Zhou, Jiaze Li, Shijie Wen, Mai Xu, Da Li, Shunyu Yao, Jiazhi Du, WangMeng Zuo, Zhibo Li, Shuai He, Anlong Ming, Huiyuan Fu, Huadong Ma, Yong Wu, Fie Xue, Guozhi Zhao, Lina Du, Jie Guo, Yu Zhang, Huimin Zheng, JunHao Chen, Yue Liu, Dulan Zhou, Kele Xu, Qisheng Xu, Tao Sun, Zhixiang Ding, Yuhang Hu

This paper reviews the NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment (S-UGC VQA), where various excellent solutions are submitted and evaluated on the collected dataset KVQ from a popular short-form video platform, i.e., Kuaishou/Kwai Platform.

valid Video Quality Assessment +1

Reconstructed Convolution Module Based Look-Up Tables for Efficient Image Super-Resolution

1 code implementation • ICCV 2023 • Guandu Liu, Yukang Ding, Mading Li, Ming Sun, Xing Wen, Bin Wang

To enlarge RF with contained LUT sizes, we propose a novel Reconstructed Convolution (RC) module, which decouples channel-wise and spatial calculation.

Image Super-Resolution

Evolving Search Space for Neural Architecture Search

1 code implementation • ICCV 2021 • Yuanzheng Ci, Chen Lin, Ming Sun, BoYu Chen, Hongwen Zhang, Wanli Ouyang

The automation of neural architecture design has been a coveted alternative to human experts.

Knowledge Distillation Neural Architecture Search

Acoustic scene analysis with multi-head attention networks

1 code implementation • 16 Sep 2019 • Weimin Wang, Weiran Wang, Ming Sun, Chao Wang

Acoustic Scene Classification (ASC) is a challenging task, as a single scene may involve multiple events that contain complex sound patterns.

Acoustic Scene Classification General Classification +1

A Galaxy-Scale Fountain of Cold Molecular Gas Pumped by a Black Hole

1 code implementation • 1 Aug 2018 • Grant R. Tremblay, Françoise Combes, J. B. Raymond Oonk, Helen R. Russell, Michael A. McDonald, Massimo Gaspari, Bernd Husemann, Paul E. J. Nulsen, Brian R. McNamara, Stephen L. Hamer, Christopher P. O'Dea, Stefi A. Baum, Timothy A. Davis, Megan Donahue, G. Mark Voit, Alastair C. Edge, Elizabeth L. Blanton, Malcolm N. Bremer, Esra Bulbul, Tracy E. Clarke, Laurence P. David, Louise O. V. Edwards, Dominic A. Eggerman, Andrew C. Fabian, William R. Forman, Christine Jones, Nathaniel Kerman, Ralph P. Kraft, Yuan Li, Meredith C. Powell, Scott W. Randall, Philippe Salomé, Aurora Simionescu, Yuanyuan Su, Ming Sun, C. Megan Urry, Adrian N. Vantyghem, Belinda J. Wilkes, John A. ZuHone

The entire scenario is therefore consistent with a galaxy-spanning "fountain", wherein cold gas clouds drain into the black hole accretion reservoir, powering jets and bubbles that uplift a cooling plume of low-entropy multiphase gas, which may stimulate additional cooling and accretion as part of a self-regulating feedback loop.

Astrophysics of Galaxies

Max-Pooling Loss Training of Long Short-Term Memory Networks for Small-Footprint Keyword Spotting

no code implementations • 5 May 2017 • Ming Sun, Anirudh Raju, George Tucker, Sankaran Panchapagesan, Geng-Shen Fu, Arindam Mandal, Spyros Matsoukas, Nikko Strom, Shiv Vitaladevuni

Finally, the max-pooling loss trained LSTM initialized with a cross-entropy pre-trained network shows the best performance, which yields 67.6% relative reduction compared to baseline feed-forward DNN in Area Under the Curve (AUC) measure.

Small-Footprint Keyword Spotting

An Empirical Evaluation of Zero Resource Acoustic Unit Discovery

no code implementations • 5 Feb 2017 • Chunxi Liu, Jinyi Yang, Ming Sun, Santosh Kesiraju, Alena Rott, Lucas Ondel, Pegah Ghahremani, Najim Dehak, Lukas Burget, Sanjeev Khudanpur

Acoustic unit discovery (AUD) is a process of automatically identifying a categorical acoustic unit inventory from speech and producing corresponding acoustic unit tokenizations.

Acoustic Unit Discovery

Generalized Canonical Correlation Analysis for Classification

no code implementations • 30 Apr 2013 • Cencheng Shen, Ming Sun, Minh Tang, Carey E. Priebe

For multiple multivariate data sets, we derive conditions under which Generalized Canonical Correlation Analysis (GCCA) improves classification performance of the projected datasets, compared to standard Canonical Correlation Analysis (CCA) using only two data sets.

Classification General Classification

Conversational Strategies for Robustly Managing Dialog in Public Spaces

no code implementations • WS 2014 • Aasish Pappu, Ming Sun, Seshadri Sridharan, Alexander Rudnicky

Semi-supervised Acoustic Event Detection based on tri-training

no code implementations • 29 Apr 2019 • Bowen Shi, Ming Sun, Chieh-Chi Kao, Viktor Rozgic, Spyros Matsoukas, Chao Wang

This paper presents our work of training acoustic event detection (AED) models using unlabeled dataset.

Event Detection Knowledge Distillation

Compression of Acoustic Event Detection Models with Low-rank Matrix Factorization and Quantization Training

no code implementations • NIPS Workshop CDNNRIA 2018 • Bowen Shi, Ming Sun, Chieh-Chi Kao, Viktor Rozgic, Spyros Matsoukas, Chao Wang

In this paper, we present a compression approach based on the combination of low-rank matrix factorization and quantization training, to reduce complexity for neural network based acoustic event detection (AED) models.

Event Detection Quantization

Neural Text Normalization with Subword Units

no code implementations • NAACL 2019 • Courtney Mansfield, Ming Sun, Yuzong Liu, Ankur Gandhe, Björn Hoffmeister

We find subword models with additional linguistic features yield the best performance (with a word error rate of 0.17%).

Machine Translation Natural Language Understanding +5

AppDialogue: Multi-App Dialogues for Intelligent Assistants

no code implementations • LREC 2016 • Ming Sun, Yun-Nung Chen, Zhenhao Hua, Yulian Tamres-Rudnicky, Arnab Dash, Alexander Rudnicky

Users will interact with an individual app on smart devices (e.g., phone, TV, car) to fulfill a specific goal (e.g., find a photographer), but users may also pursue more complex tasks that will span multiple domains and apps (e.g., plan a wedding ceremony).

Compression of Acoustic Event Detection Models With Quantized Distillation

no code implementations • 1 Jul 2019 • Bowen Shi, Ming Sun, Chieh-Chi Kao, Viktor Rozgic, Spyros Matsoukas, Chao Wang

Acoustic Event Detection (AED), aiming at detecting categories of events based on audio signals, has found application in many intelligent systems.

Event Detection Knowledge Distillation +1

Efficient Neural Architecture Transformation Search in Channel-Level for Object Detection

no code implementations • 5 Sep 2019 • Junran Peng, Ming Sun, Zhao-Xiang Zhang, Tieniu Tan, Junjie Yan

With the combination of these two designs, an architecture transformation scheme could be discovered to adapt a network designed for image classification to task of object detection.

Image Classification Neural Architecture Search +3

POD: Practical Object Detection with Scale-Sensitive Network

no code implementations • ICCV 2019 • Junran Peng, Ming Sun, Zhao-Xiang Zhang, Tieniu Tan, Junjie Yan

Scale-sensitive object detection remains a challenging task, where most of the existing methods could not learn it explicitly and are not robust to scale variance.

Object object-detection +1

Improving One-shot NAS by Suppressing the Posterior Fading

no code implementations • CVPR 2020 • Xiang Li, Chen Lin, Chuming Li, Ming Sun, Wei Wu, Junjie Yan, Wanli Ouyang

In this paper, we analyse existing weight sharing one-shot NAS approaches from a Bayesian point of view and identify the posterior fading problem, which compromises the effectiveness of shared weights.

Neural Architecture Search object-detection +2

Efficient Neural Architecture Transformation Search in Channel-Level for Object Detection

no code implementations • NeurIPS 2019 • Junran Peng, Ming Sun, Zhao-Xiang Zhang, Tieniu Tan, Junjie Yan

Instead of searching and constructing an entire network, NATS explores the architecture space on the base of existing network and reusing its weights.

Image Classification Neural Architecture Search +3

Computation Reallocation for Object Detection

no code implementations • ICLR 2020 • Feng Liang, Chen Lin, Ronghao Guo, Ming Sun, Wei Wu, Junjie Yan, Wanli Ouyang

However, classification allocation pattern is usually adopted directly to object detector, which is proved to be sub-optimal.

Instance Segmentation Neural Architecture Search +4

Few-shot acoustic event detection via meta-learning

no code implementations • 21 Feb 2020 • Bowen Shi, Ming Sun, Krishna C. Puvvada, Chieh-Chi Kao, Spyros Matsoukas, Chao Wang

We study few-shot acoustic event detection (AED) in this paper.

Event Detection Few-Shot Learning

Large-Scale Object Detection in the Wild from Imbalanced Multi-Labels

no code implementations • CVPR 2020 • Junran Peng, Xingyuan Bu, Ming Sun, Zhao-Xiang Zhang, Tieniu Tan, Junjie Yan

Training with more data has always been the most stable and effective way of improving performance in deep learning era.

Long-tail Learning Object +2

Powering One-shot Topological NAS with Stabilized Share-parameter Proxy

no code implementations • ECCV 2020 • Ronghao Guo, Chen Lin, Chuming Li, Keyu Tian, Ming Sun, Lu Sheng, Junjie Yan

Specifically, the difficulties for architecture searching in such a complex space has been eliminated by the proposed stabilized share-parameter proxy, which employs Stochastic Gradient Langevin Dynamics to enable fast shared parameter sampling, so as to achieve stabilized measurement of architecture performance even in search space with complex topological structures.

Neural Architecture Search

Atacama Compact Array Measurements of the Molecular Mass in the NGC 5044 Cooling Flow Group

no code implementations • 3 Apr 2020 • Gerrit Schellenberger, Laurence P. David, Jan Vrtilek, Ewan O'Sullivan, Jeremy Lim, William Forman, Ming Sun, Francoise Combes, Philippe Salome, Christine Jones, Simona Giacintucci, Alastair Edge, Fabio Gastaldello, Pasquale Temi, Fabrizio Brighenti, Sandro Bardelli

This indicates that the two giant molecular clouds seen in absorption are most likely within the sphere of influence of the supermassive black hole.

Astrophysics of Galaxies

On Front-end Gain Invariant Modeling for Wake Word Spotting

no code implementations • 13 Oct 2020 • Yixin Gao, Noah D. Stein, Chieh-Chi Kao, Yunliang Cai, Ming Sun, Tao Zhang, Shiv Vitaladevuni

Since the WW model is trained with the AFE-processed audio data, its performance is sensitive to AFE variations, such as gain changes.

Adaptive Gradient Method with Resilience and Momentum

no code implementations • 21 Oct 2020 • Jie Liu, Chen Lin, Chuming Li, Lu Sheng, Ming Sun, Junjie Yan, Wanli Ouyang

Several variants of stochastic gradient descent (SGD) have been proposed to improve the learning effectiveness and efficiency when training deep neural networks, among which some recent influential attempts would like to adaptively control the parameter-wise learning rate (e.g., Adam and RMSProp).

Multi-Task Self-Supervised Pre-Training for Music Classification

no code implementations • 5 Feb 2021 • Ho-Hsiang Wu, Chieh-Chi Kao, Qingming Tang, Ming Sun, Brian McFee, Juan Pablo Bello, Chao Wang

Deep learning is very data hungry, and supervised learning especially requires massive labeled data to work well.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +7

Efficient Transfer Learning via Joint Adaptation of Network Architecture and Weight

no code implementations • ECCV 2020 • Ming Sun, Haoxuan Dou, Junjie Yan

Transfer learning can boost the performance on the target task by leveraging the knowledge of the source domain.

Neural Architecture Search object-detection +2

AutoSampling: Search for Effective Data Sampling Schedules

no code implementations • 28 May 2021 • Ming Sun, Haoxuan Dou, Baopu Li, Lei Cui, Junjie Yan, Wanli Ouyang

Data sampling acts as a pivotal role in training deep learning models.

Image Classification

PSViT: Better Vision Transformer via Token Pooling and Attention Sharing

no code implementations • 7 Aug 2021 • BoYu Chen, Peixia Li, Baopu Li, Chuming Li, Lei Bai, Chen Lin, Ming Sun, Junjie Yan, Wanli Ouyang

Then, a compact set of the possible combinations for different token pooling and attention sharing mechanisms are constructed.

Once Quantized for All: Progressively Searching for Quantized Compact Models

no code implementations • 28 Sep 2020 • Mingzhu Shen, Feng Liang, Chuming Li, Chen Lin, Ming Sun, Junjie Yan, Wanli Ouyang

Automatic search of Quantized Neural Networks (QNN) has attracted a lot of attention.

Neural Architecture Search Quantization

Low-bit quantization and quantization-aware training for small-footprint keyword spotting

no code implementations • 19 Oct 2018 • Yuriy Mishchenko, Yusuf Goren, Ming Sun, Chris Beauchene, Spyros Matsoukas, Oleg Rybakov, Shiv Naga Prasad Vitaladevuni

We investigate low-bit quantization to reduce computational cost of deep neural network (DNN) based keyword spotting (KWS).

Quantization Small-Footprint Keyword Spotting

Federated Self-Supervised Learning for Acoustic Event Classification

no code implementations • 22 Mar 2022 • Meng Feng, Chieh-Chi Kao, Qingming Tang, Ming Sun, Viktor Rozgic, Spyros Matsoukas, Chao Wang

Standard acoustic event classification (AEC) solutions require large-scale collection of data from client devices for model optimization.

Classification Continual Learning +3

NTIRE 2022 Challenge on High Dynamic Range Imaging: Methods and Results

no code implementations • 25 May 2022 • Eduardo Pérez-Pellitero, Sibi Catley-Chandar, Richard Shaw, Aleš Leonardis, Radu Timofte, Zexin Zhang, Cen Liu, Yunbo Peng, Yue Lin, Gaocheng Yu, Jin Zhang, Zhe Ma, Hongbin Wang, Xiangyu Chen, Xintao Wang, Haiwei Wu, Lin Liu, Chao Dong, Jiantao Zhou, Qingsen Yan, Song Zhang, Weiye Chen, Yuhang Liu, Zhen Zhang, Yanning Zhang, Javen Qinfeng Shi, Dong Gong, Dan Zhu, Mengdi Sun, Guannan Chen, Yang Hu, Haowei Li, Baozhu Zou, Zhen Liu, Wenjie Lin, Ting Jiang, Chengzhi Jiang, Xinpeng Li, Mingyan Han, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Juan Marín-Vega, Michael Sloth, Peter Schneider-Kamp, Richard Röttger, Chunyang Li, Long Bao, Gang He, Ziyao Xu, Li Xu, Gen Zhan, Ming Sun, Xing Wen, Junlin Li, Shuang Feng, Fei Lei, Rui Liu, Junxiang Ruan, Tianhong Dai, Wei Li, Zhan Lu, Hengyan Liu, Peian Huang, Guangyu Ren, Yonglin Luo, Chang Liu, Qiang Tu, Fangya Li, Ruipeng Gang, Chenghua Li, Jinjing Li, Sai Ma, Chenming Liu, Yizhen Cao, Steven Tel, Barthelemy Heyrman, Dominique Ginhac, Chul Lee, Gahyeon Kim, Seonghyun Park, An Gia Vien, Truong Thanh Nhat Mai, Howoon Yoon, Tu Vo, Alexander Holston, Sheir Zaheer, Chan Y. Park

The challenge is composed of two tracks with an emphasis on fidelity and complexity constraints: In Track 1, participants are asked to optimize objective fidelity scores while imposing a low-complexity constraint (i.e., solutions cannot exceed a given number of operations).

Image Restoration Vocal Bursts Intensity Prediction

Impact of Acoustic Event Tagging on Scene Classification in a Multi-Task Learning Framework

no code implementations • 27 Jun 2022 • Rahil Parikh, Harshavardhan Sundar, Ming Sun, Chao Wang, Spyros Matsoukas

We conclude that this improvement in ASC performance comes from the regularization effect of using AET and not from the network's improved ability to discern between acoustic events.

Acoustic Scene Classification Multi-Task Learning +1

Global Priors Guided Modulation Network for Joint Super-Resolution and Inverse Tone-Mapping

no code implementations • 14 Aug 2022 • Gang He, Shaoyi Long, Li Xu, Chang Wu, Jinjia Zhou, Ming Sun, Xing Wen, Yurong Dai

Joint super-resolution and inverse tone-mapping (SR-ITM) aims to enhance the visual quality of videos that have quality deficiencies in resolution and dynamic range.

4k inverse tone mapping +3

SDRTV-to-HDRTV Conversion via Spatial-Temporal Feature Fusion

no code implementations • 4 Nov 2022 • Kepeng Xu, Li Xu, Gang He, Chang Wu, Zijia Ma, Ming Sun, Yu-Wing Tai

To evaluate the performance of the proposed method, we construct a corresponding multi-frame dataset using HDR video of the HDR10 standard to conduct a comprehensive evaluation of different methods.

LiCo-Net: Linearized Convolution Network for Hardware-efficient Keyword Spotting

no code implementations • 9 Nov 2022 • Haichuan Yang, Zhaojun Yang, Li Wan, Biqiao Zhang, Yangyang Shi, Yiteng Huang, Ivaylo Enchev, Limin Tang, Raziel Alvarez, Ming Sun, Xin Lei, Raghuraman Krishnamoorthi, Vikas Chandra

This paper proposes a hardware-efficient architecture, Linearized Convolution Network (LiCo-Net) for keyword spotting.

Keyword Spotting

Handling the Alignment for Wake Word Detection: A Comparison Between Alignment-Based, Alignment-Free and Hybrid Approaches

no code implementations • 17 Feb 2023 • Vinicius Ribeiro, Yiteng Huang, Yuan Shangguan, Zhaojun Yang, Li Wan, Ming Sun

The third, proposed by us, is a hybrid solution in which the model is trained with a small set of aligned data and then tuned with a sizeable unaligned dataset.

Quality-aware Pre-trained Models for Blind Image Quality Assessment

no code implementations • CVPR 2023 • Kai Zhao, Kun Yuan, Ming Sun, Mading Li, Xing Wen

Blind image quality assessment (BIQA) aims to automatically evaluate the perceived quality of a single image, whose performance has been improved by deep learning-based methods in recent years.

Blind Image Quality Assessment Self-Supervised Learning

NTIRE 2023 Quality Assessment of Video Enhancement Challenge

no code implementations • 19 Jul 2023 • Xiaohong Liu, Xiongkuo Min, Wei Sun, Yulun Zhang, Kai Zhang, Radu Timofte, Guangtao Zhai, Yixuan Gao, Yuqin Cao, Tengchuan Kou, Yunlong Dong, Ziheng Jia, Yilin Li, Wei Wu, Shuming Hu, Sibin Deng, Pengxiang Xiao, Ying Chen, Kai Li, Kai Zhao, Kun Yuan, Ming Sun, Heng Cong, Hao Wang, Lingzhi Fu, Yusheng Zhang, Rongyu Zhang, Hang Shi, Qihang Xu, Longan Xiao, Zhiliang Ma, Mirko Agarla, Luigi Celona, Claudio Rota, Raimondo Schettini, Zhiwei Huang, Yanan Li, Xiaotao Wang, Lei Lei, Hongye Liu, Wei Hong, Ironhead Chuang, Allen Lin, Drake Guan, Iris Chen, Kae Lou, Willy Huang, Yachun Tasi, Yvonne Kao, Haotian Fan, Fangyuan Kong, Shiqi Zhou, Hao Liu, Yu Lai, Shanshan Chen, Wenqi Wang, HaoNing Wu, Chaofeng Chen, Chunzheng Zhu, Zekun Guo, Shiling Zhao, Haibing Yin, Hongkui Wang, Hanene Brachemi Meftah, Sid Ahmed Fezza, Wassim Hamidouche, Olivier Déforges, Tengfei Shi, Azadeh Mansouri, Hossein Motamednia, Amir Hossein Bakhtiari, Ahmad Mahmoudi Aznaveh

61 participating teams submitted their prediction results during the development phase, with a total of 3168 submissions.

Deblurring Image Restoration +3

Capturing Co-existing Distortions in User-Generated Content for No-reference Video Quality Assessment

no code implementations • 31 Jul 2023 • Kun Yuan, Zishang Kong, Chuanchuan Zheng, Ming Sun, Xing Wen

Second, the perceptual quality of a video exhibits a multi-distortion distribution, due to the differences in the duration and probability of occurrence for various distortions.

Action Recognition Blocking +2

Ada-DQA: Adaptive Diverse Quality-aware Feature Acquisition for Video Quality Assessment

no code implementations • 1 Aug 2023 • Hongbo Liu, Mingda Wu, Kun Yuan, Ming Sun, Yansong Tang, Chuanchuan Zheng, Xing Wen, Xiu Li

Video quality assessment (VQA) has attracted growing attention in recent years.

Knowledge Distillation Video Quality Assessment +1

Blind Image Super-resolution with Rich Texture-Aware Codebooks

no code implementations • 26 Oct 2023 • Rui Qin, Ming Sun, Fangyuan Zhang, Xing Wen, Bin Wang

However, we find that a codebook based on HR reconstruction may not effectively capture the complex correlations between low-resolution (LR) and HR images.

Blind Super-Resolution Image Super-Resolution

FADI-AEC: Fast Score Based Diffusion Model Guided by Far-end Signal for Acoustic Echo Cancellation

no code implementations • 8 Jan 2024 • Yang Liu, Li Wan, Yun Li, Yiteng Huang, Ming Sun, James Luan, Yangyang Shi, Xin Lei

Despite the potential of diffusion models in speech enhancement, their deployment in Acoustic Echo Cancellation (AEC) has been restricted.

Acoustic echo cancellation Speech Enhancement

AGADIR: Towards Array-Geometry Agnostic Directional Speech Recognition

no code implementations • 18 Jan 2024 • Ju Lin, Niko Moritz, Yiteng Huang, Ruiming Xie, Ming Sun, Christian Fuegen, Frank Seide

Wearable devices like smart glasses are approaching the compute capability to seamlessly generate real-time closed captions for live conversations.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

KVQ: Kwai Video Quality Assessment for Short-form Videos

no code implementations • 11 Feb 2024 • Yiting Lu, Xin Li, Yajing Pei, Kun Yuan, Qizhi Xie, Yunpeng Qu, Ming Sun, Chao Zhou, Zhibo Chen

Short-form UGC video platforms, like Kwai and TikTok, have been an emerging and irreplaceable mainstream media form, thriving on user-friendly engagement, and kaleidoscope creation, etc.

Video Quality Assessment Visual Question Answering (VQA)

XPSR: Cross-modal Priors for Diffusion-based Image Super-Resolution

no code implementations • 8 Mar 2024 • Yunpeng Qu, Kun Yuan, Kai Zhao, Qizhi Xie, Jinhua Hao, Ming Sun, Chao Zhou

Diffusion-based methods, endowed with a formidable generative prior, have received increasing attention in Image Super-Resolution (ISR) recently.

Image Super-Resolution

CPGA: Coding Priors-Guided Aggregation Network for Compressed Video Quality Enhancement

no code implementations • 15 Mar 2024 • Qiang Zhu, Jinhua Hao, Yukang Ding, Yu Liu, Qiao Mo, Ming Sun, Chao Zhou, Shuyuan Zhu

Specifically, the ITA module aggregates temporal information from consecutive frames and coding priors, while the MNA module globally captures spatial information guided by residual frames.