Search Results for author: Xinyu Zhou

Found 40 papers, 13 papers with code

ICDAR 2015 Text Reading in the Wild Competition

no code implementations 10 Jun 2015 Xinyu Zhou, Shuchang Zhou, Cong Yao, Zhimin Cao, Qi Yin

Recently, text detection and recognition in natural scenes have become increasingly popular in the computer vision community as well as the document analysis community.

Text Detection

Incidental Scene Text Understanding: Recent Progresses on ICDAR 2015 Robust Reading Competition Challenge 4

no code implementations 30 Nov 2015 Cong Yao, Jia-Nan Wu, Xinyu Zhou, Chi Zhang, Shuchang Zhou, Zhimin Cao, Qi Yin

Different from focused texts present in natural images, which are captured with user's intention and intervention, incidental texts usually exhibit much more diversity, variability and complexity, thus posing significant difficulties and challenges for scene text detection and recognition algorithms.

Scene Text Detection Text Detection

Exploiting Local Structures with the Kronecker Layer in Convolutional Networks

no code implementations 31 Dec 2015 Shuchang Zhou, Jia-Nan Wu, Yuxin Wu, Xinyu Zhou

In this paper, we propose and study a technique to reduce the number of parameters and computation time in convolutional neural networks.

Scene Text Recognition

DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients

12 code implementations 20 Jun 2016 Shuchang Zhou, Yuxin Wu, Zekun Ni, Xinyu Zhou, He Wen, Yuheng Zou

We propose DoReFa-Net, a method to train convolutional neural networks that have low bitwidth weights and activations using low bitwidth parameter gradients.

Quantization

Scene Text Detection via Holistic, Multi-Channel Prediction

no code implementations 29 Jun 2016 Cong Yao, Xiang Bai, Nong Sang, Xinyu Zhou, Shuchang Zhou, Zhimin Cao

Recently, scene text detection has become an active research topic in computer vision and document analysis, because of its great importance and significant challenge.

Scene Text Detection Semantic Segmentation +1

Effective Quantization Methods for Recurrent Neural Networks

2 code implementations 30 Nov 2016 Qinyao He, He Wen, Shuchang Zhou, Yuxin Wu, Cong Yao, Xinyu Zhou, Yuheng Zou

In addition, we propose balanced quantization methods for weights to further reduce performance degradation.

Quantization

Training Bit Fully Convolutional Network for Fast Semantic Segmentation

no code implementations 1 Dec 2016 He Wen, Shuchang Zhou, Zhe Liang, Yuxiang Zhang, Dieqiao Feng, Xinyu Zhou, Cong Yao

Fully convolutional neural networks give accurate, per-pixel prediction for input images and have applications like semantic segmentation.

Segmentation Semantic Segmentation

ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices

37 code implementations CVPR 2018 Xiangyu Zhang, Xinyu Zhou, Mengxiao Lin, Jian Sun

We introduce an extremely computation-efficient CNN architecture named ShuffleNet, which is designed specially for mobile devices with very limited computing power (e.g., 10-150 MFLOPs).

General Classification Image Classification +2

Learning to Run with Actor-Critic Ensemble

2 code implementations 25 Dec 2017 Zhewei Huang, Shuchang Zhou, BoEr Zhuang, Xinyu Zhou

We introduce an Actor-Critic Ensemble (ACE) method for improving the performance of the Deep Deterministic Policy Gradient (DDPG) algorithm.

Learning Delicate Local Representations for Multi-Person Pose Estimation

4 code implementations ECCV 2020 Yuanhao Cai, Zhicheng Wang, Zhengxiong Luo, Binyi Yin, Angang Du, Haoqian Wang, Xiangyu Zhang, Xinyu Zhou, Erjin Zhou, Jian Sun

To tackle this problem, we propose an efficient attention mechanism, the Pose Refine Machine (PRM), to make a trade-off between local and global representations in output features and further refine the keypoint locations.

Keypoint Detection Multi-Person Pose Estimation

DPGN: Distribution Propagation Graph Network for Few-shot Learning

1 code implementation CVPR 2020 Ling Yang, Liangliang Li, Zilun Zhang, Xinyu Zhou, Erjin Zhou, Yu Liu

To combine the distribution-level relations and instance-level relations for all examples, we construct a dual complete graph network which consists of a point graph and a distribution graph with each node standing for an example.

Few-Shot Learning Relation

Component-wise Adaptive Trimming For Robust Mixture Regression

no code implementations 23 May 2020 Wennan Chang, Xinyu Zhou, Yong Zang, Chi Zhang, Sha Cao

Existing robust mixture regression methods suffer from outliers as they either conduct parameter estimation in the presence of outliers, or rely on prior knowledge of the level of outlier contamination.

Outlier Detection regression

A Review of Automated Diagnosis of COVID-19 Based on Scanning Images

no code implementations 9 Jun 2020 Delong Chen, Shunhui Ji, Fan Liu, Zewen Li, Xinyu Zhou

The pandemic of COVID-19 has caused millions of infections, which has led to a great loss all over the world, socially and economically.

Computed Tomography (CT) Domain Adaptation +1

RPPLNS: Pay-per-last-N-shares with a Randomised Twist

no code implementations 15 Feb 2021 Jonathan Katz, Philip Lazos, Francisco J. Marmolejo-Cossío, Xinyu Zhou

"Pay-per-last-$N$-shares" (PPLNS) is one of the most common payout strategies used by mining pools in Proof-of-Work (PoW) cryptocurrencies.

Fairness Computer Science and Game Theory Cryptography and Security

EventZoom: Learning To Denoise and Super Resolve Neuromorphic Events

no code implementations CVPR 2021 Peiqi Duan, Zihao W. Wang, Xinyu Zhou, Yi Ma, Boxin Shi

EventZoom is trained in a noise-to-noise fashion where the two ends of the network are unfiltered noisy events, enforcing noise-free event restoration.

Denoising Image Reconstruction +1

Nanorobot queue: Cooperative treatment of cancer based on team member communication and image processing

no code implementations 22 Nov 2021 Xinyu Zhou

Nanorobots have been used clinically for tasks such as gastroscopy, photoacoustic tomography has been proposed for controlling nanorobots to deliver drugs at designated delivery points in real time, and there are cases of nanorobots eliminating "superbacteria" in blood. Most of these technologies are nevertheless immature: they have low efficiency or low accuracy, or they cannot be mass produced. The most effective way to treat cancer at this stage is therefore still chemotherapy and radiotherapy.

Image Classification

EvUnroll: Neuromorphic Events Based Rolling Shutter Image Correction

1 code implementation CVPR 2022 Xinyu Zhou, Peiqi Duan, Yi Ma, Boxin Shi

This paper proposes to use neuromorphic events for correcting rolling shutter (RS) images as consecutive global shutter (GS) frames.

A New Learning Paradigm for Stochastic Configuration Network: SCN+

no code implementations 11 Mar 2022 Yanshuang Ao, Xinyu Zhou, Wei Dai

This novel algorithm can leverage privileged information into SCN in the training stage, which provides a new method to train SCN.

Incremental Learning

Injecting Image Details into CLIP's Feature Space

no code implementations 31 Aug 2022 Zilun Zhang, Cuifeng Shen, Yuan Shen, Huixin Xiong, Xinyu Zhou

Although CLIP-like Visual Language Models provide a functional joint feature space for image and text, due to the limitation of the CLIP-like model's image input size (e.g., 224), subtle details are lost in the feature representation if we input high-resolution images (e.g., 2240).

Retrieval

Joint Optimization of Energy Consumption and Completion Time in Federated Learning

no code implementations 29 Sep 2022 Xinyu Zhou, Jun Zhao, Huimei Han, Claude Guet

Federated Learning (FL) is an intriguing distributed machine learning approach due to its privacy-preserving characteristics.

Federated Learning Privacy Preserving +1

Mobile Augmented Reality with Federated Learning in the Metaverse

no code implementations 16 Dec 2022 Xinyu Zhou, Jun Zhao

The Metaverse is deemed the next evolution of the Internet and has received much attention recently.

Federated Learning object-detection +2

NBMOD: Find It and Grasp It in Noisy Background

1 code implementation 17 Jun 2023 Boyuan Cao, Xinyu Zhou, Congmin Guo, Baohua Zhang, Yuchen Liu, Qianqiu Tan

In the past few years, researchers have proposed many methods to address the above-mentioned issues and achieved very good results on publicly available datasets such as the Cornell dataset and the Jacquard dataset.

Ranked #1 on Robotic Grasping on NBMOD (using extra training data)

Robotic Grasping

Few shot font generation via transferring similarity guided global style and quantization local style

1 code implementation ICCV 2023 Wei Pan, Anna Zhu, Xinyu Zhou, Brian Kenji Iwana, Shilin Li

To better capture the local styles, a cross-attention-based style transfer module is adopted to transfer the styles of reference glyphs to the components, where the components are self-learned discrete latent codes through vector quantization without manual definition.

Disentanglement Font Generation +2

Towards Joint Modeling of Dialogue Response and Speech Synthesis based on Large Language Model

1 code implementation 20 Sep 2023 Xinyu Zhou, Delong Chen, Yudong Chen

This paper explores the potential of constructing an AI spoken dialogue system that "thinks how to respond" and "thinks how to speak" simultaneously, which more closely aligns with the human speech production process compared to the current cascade pipeline of independent chatbot and Text-to-Speech (TTS) modules.

Chatbot Language Modelling +3

Logical Bias Learning for Object Relation Prediction

no code implementations 1 Oct 2023 Xinyu Zhou, Zihan Ji, Anna Zhu

Scene graph generation (SGG) aims to automatically map an image into a semantic structural graph for better scene understanding.

Causal Inference Decision Making +5

Resource Allocation for Semantic Communication under Physical-layer Security

no code implementations 7 Dec 2023 Yang Li, Xinyu Zhou, Jun Zhao

The secrecy rate is the communication rate at which no information is disclosed to an eavesdropper.

EvPlug: Learn a Plug-and-Play Module for Event and Image Fusion

no code implementations 28 Dec 2023 Jianping Jiang, Xinyu Zhou, Peiqi Duan, Boxin Shi

The learned fusion module integrates event streams with image features as a plug-in, making the RGB-based model robust to HDR and fast-motion scenes while enabling high-temporal-resolution inference.

3D Hand Pose Estimation object-detection +2

Me LLaMA: Foundation Large Language Models for Medical Applications

1 code implementation 20 Feb 2024 Qianqian Xie, Qingyu Chen, Aokun Chen, Cheng Peng, Yan Hu, Fongci Lin, Xueqing Peng, Jimin Huang, Jeffrey Zhang, Vipina Keloth, Xinyu Zhou, Huan He, Lucila Ohno-Machado, Yonghui Wu, Hua Xu, Jiang Bian

In response to this challenge, this study introduces Me-LLaMA, a novel medical LLM family that includes the foundation models Me-LLaMA 13/70B and their chat-enhanced versions Me-LLaMA 13/70B-chat, developed through continual pre-training and instruction tuning of LLaMA2 using large medical datasets.

Few-Shot Learning

Learning to Deblur Polarized Images

no code implementations 28 Feb 2024 Chu Zhou, Minggui Teng, Xinyu Zhou, Chao Xu, Boxin Shi

However, since the on-chip micro-polarizers block part of the light so that the sensor often requires a longer exposure time, the captured polarized images are prone to motion blur caused by camera shakes, leading to noticeable degradation in the computed DoP and AoP.

Deblurring Image Deblurring +2

Differentially Private Worst-group Risk Minimization

no code implementations 29 Feb 2024 Xinyu Zhou, Raef Bassily

We first present a new algorithm that achieves excess worst-group population risk of $\tilde{O}(\frac{p\sqrt{d}}{K\epsilon} + \sqrt{\frac{p}{K}})$, where $K$ is the total number of samples drawn from all groups and $d$ is the problem dimension.

ClickVOS: Click Video Object Segmentation

no code implementations 10 Mar 2024 Pinxue Guo, Lingyi Hong, Xinyu Zhou, Shuyong Gao, Wanyun Li, Jinglun Li, Zhaoyu Chen, Xiaoqiang Li, Wei Zhang, Wenqiang Zhang

To address these limitations, we propose the setting named Click Video Object Segmentation (ClickVOS) which segments objects of interest across the whole video according to a single click per object in the first frame.

Object Segmentation +3

Complementing Event Streams and RGB Frames for Hand Mesh Reconstruction

no code implementations 12 Mar 2024 Jianping Jiang, Xinyu Zhou, Bingxuan Wang, Xiaoming Deng, Chao Xu, Boxin Shi

Experiments on real-world data demonstrate that EvRGBHand can effectively solve the challenging issues when using either type of camera alone via retaining the merits of both, and shows the potential of generalization to outdoor scenes and another type of event camera.

OneVOS: Unifying Video Object Segmentation with All-in-One Transformer Framework

no code implementations 13 Mar 2024 Wanyun Li, Pinxue Guo, Xinyu Zhou, Lingyi Hong, Yangji He, Xiangyu Zheng, Wei Zhang, Wenqiang Zhang

Contemporary Video Object Segmentation (VOS) approaches typically consist of stages for feature extraction, matching, memory management, and multiple-object aggregation.

Management Semantic Segmentation +2

OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning

no code implementations 14 Mar 2024 Lingyi Hong, Shilin Yan, Renrui Zhang, Wanyun Li, Xinyu Zhou, Pinxue Guo, Kaixun Jiang, Yiting Chen, Jinglun Li, Zhaoyu Chen, Wenqiang Zhang

To evaluate the effectiveness of our general framework OneTracker, which consists of Foundation Tracker and Prompt Tracker, we conduct extensive experiments on 6 popular tracking tasks across 11 benchmarks; OneTracker outperforms other models and achieves state-of-the-art performance.

Object Visual Object Tracking
