Search Results for author: Xinyu Zhou

Found 40 papers, 13 papers with code

OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning

no code implementations • 14 Mar 2024 • Lingyi Hong, Shilin Yan, Renrui Zhang, Wanyun Li, Xinyu Zhou, Pinxue Guo, Kaixun Jiang, Yiting Chen, Jinglun Li, Zhaoyu Chen, Wenqiang Zhang

To evaluate the effectiveness of our general framework OneTracker, which consists of Foundation Tracker and Prompt Tracker, we conduct extensive experiments on 6 popular tracking tasks across 11 benchmarks; OneTracker outperforms other models and achieves state-of-the-art performance.

Object Visual Object Tracking

OneVOS: Unifying Video Object Segmentation with All-in-One Transformer Framework

no code implementations • 13 Mar 2024 • Wanyun Li, Pinxue Guo, Xinyu Zhou, Lingyi Hong, Yangji He, Xiangyu Zheng, Wei Zhang, Wenqiang Zhang

Contemporary Video Object Segmentation (VOS) approaches typically consist of stages of feature extraction, matching, memory management, and multiple-object aggregation.

Management Semantic Segmentation +2

Complementing Event Streams and RGB Frames for Hand Mesh Reconstruction

no code implementations • 12 Mar 2024 • Jianping Jiang, Xinyu Zhou, Bingxuan Wang, Xiaoming Deng, Chao Xu, Boxin Shi

Experiments on real-world data demonstrate that EvRGBHand effectively addresses the issues that arise when using either type of camera alone by retaining the merits of both, and shows potential for generalization to outdoor scenes and to another type of event camera.

ClickVOS: Click Video Object Segmentation

no code implementations • 10 Mar 2024 • Pinxue Guo, Lingyi Hong, Xinyu Zhou, Shuyong Gao, Wanyun Li, Jinglun Li, Zhaoyu Chen, Xiaoqiang Li, Wei Zhang, Wenqiang Zhang

To address these limitations, we propose the setting named Click Video Object Segmentation (ClickVOS) which segments objects of interest across the whole video according to a single click per object in the first frame.

Object Segmentation +3

Differentially Private Worst-group Risk Minimization

no code implementations • 29 Feb 2024 • Xinyu Zhou, Raef Bassily

We first present a new algorithm that achieves excess worst-group population risk of $\tilde{O}(\frac{p\sqrt{d}}{K\epsilon} + \sqrt{\frac{p}{K}})$, where $K$ is the total number of samples drawn from all groups and $d$ is the problem dimension.
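
As a rough illustration of how the two terms in this bound trade off, the sketch below evaluates them numerically. It assumes $p$ denotes the number of groups (the excerpt does not define it) and ignores the constants and logarithmic factors hidden by $\tilde{O}$; the function name and the sample values are hypothetical.

```python
import math


def worst_group_risk_terms(p, d, K, epsilon):
    """Return the (privacy, statistical) terms of the bound, up to constants.

    Assumption: p is the number of groups. The O-tilde notation hides
    constants and log factors, so these values are orders of magnitude only.
    """
    privacy_term = p * math.sqrt(d) / (K * epsilon)
    statistical_term = math.sqrt(p / K)
    return privacy_term, statistical_term


# Hypothetical setting: 4 groups, dimension 100, 10,000 samples, epsilon = 1.
privacy, statistical = worst_group_risk_terms(p=4, d=100, K=10_000, epsilon=1.0)
print(privacy, statistical)
```

In this hypothetical setting the statistical term dominates; shrinking `epsilon` (stronger privacy) inflates only the first term.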

Learning to Deblur Polarized Images

no code implementations • 28 Feb 2024 • Chu Zhou, Minggui Teng, Xinyu Zhou, Chao Xu, Boxin Shi

However, because the on-chip micro-polarizers block part of the light, the sensor often requires a longer exposure time, so the captured polarized images are prone to motion blur caused by camera shake, leading to noticeable degradation in the computed DoP and AoP.

Deblurring Image Deblurring +2

Reading Relevant Feature from Global Representation Memory for Visual Object Tracking

no code implementations • NeurIPS 2023 • Xinyu Zhou, Pinxue Guo, Lingyi Hong, Jinglun Li, Wei Zhang, Weifeng Ge, Wenqiang Zhang

Therefore, using all features in the template and memory can lead to redundancy and impair tracking performance.

Visual Object Tracking

Me LLaMA: Foundation Large Language Models for Medical Applications

1 code implementation • 20 Feb 2024 • Qianqian Xie, Qingyu Chen, Aokun Chen, Cheng Peng, Yan Hu, Fongci Lin, Xueqing Peng, Jimin Huang, Jeffrey Zhang, Vipina Keloth, Xinyu Zhou, Huan He, Lucila Ohno-Machado, Yonghui Wu, Hua Xu, Jiang Bian

In response to this challenge, this study introduces Me-LLaMA, a novel medical LLM family that includes foundation models - Me-LLaMA 13/70B, along with their chat-enhanced versions - Me-LLaMA 13/70B-chat, developed through continual pre-training and instruction tuning of LLaMA2 using large medical datasets.

Few-Shot Learning

EvPlug: Learn a Plug-and-Play Module for Event and Image Fusion

no code implementations • 28 Dec 2023 • Jianping Jiang, Xinyu Zhou, Peiqi Duan, Boxin Shi

The learned fusion module integrates event streams with image features in the form of a plug-in, making the RGB-based model robust to HDR and fast-motion scenes while enabling high temporal resolution inference.

3D Hand Pose Estimation object-detection +2

Resource Allocation for Semantic Communication under Physical-layer Security

no code implementations • 7 Dec 2023 • Yang Li, Xinyu Zhou, Jun Zhao

The secrecy rate is the communication rate at which no information is disclosed to an eavesdropper.
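
For intuition, a minimal sketch of how a secrecy rate is commonly computed, assuming a Gaussian wiretap channel in which the rate is the non-negative gap between the legitimate link's capacity and the eavesdropper's. This channel model, the function name, and the SNR values are assumptions for illustration and are not taken from the paper.

```python
import math


def secrecy_rate(snr_legitimate, snr_eavesdropper):
    """Secrecy rate in bits per channel use (assumed Gaussian wiretap model).

    Both arguments are linear (not dB) signal-to-noise ratios.
    """
    legitimate_capacity = math.log2(1.0 + snr_legitimate)
    eavesdropper_capacity = math.log2(1.0 + snr_eavesdropper)
    # The rate is clipped at zero: no secret communication is possible
    # when the eavesdropper's channel is at least as good.
    return max(0.0, legitimate_capacity - eavesdropper_capacity)


print(secrecy_rate(3.0, 1.0))
print(secrecy_rate(1.0, 3.0))
```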

Logical Bias Learning for Object Relation Prediction

no code implementations • 1 Oct 2023 • Xinyu Zhou, Zihan Ji, Anna Zhu

Scene graph generation (SGG) aims to automatically map an image into a semantic structural graph for better scene understanding.

Causal Inference Decision Making +5

Towards Joint Modeling of Dialogue Response and Speech Synthesis based on Large Language Model

1 code implementation • 20 Sep 2023 • Xinyu Zhou, Delong Chen, Yudong Chen

This paper explores the potential of constructing an AI spoken dialogue system that "thinks how to respond" and "thinks how to speak" simultaneously, which more closely aligns with the human speech production process compared to the current cascade pipeline of independent chatbot and Text-to-Speech (TTS) modules.

Chatbot Language Modelling +3

Few shot font generation via transferring similarity guided global style and quantization local style

1 code implementation • ICCV 2023 • Wei Pan, Anna Zhu, Xinyu Zhou, Brian Kenji Iwana, Shilin Li

To better capture the local styles, a cross-attention-based style transfer module is adopted to transfer the styles of reference glyphs to the components, where the components are self-learned discrete latent codes through vector quantization without manual definition.

Disentanglement Font Generation +2

113

NBMOD: Find It and Grasp It in Noisy Background

1 code implementation • 17 Jun 2023 • Boyuan Cao, Xinyu Zhou, Congmin Guo, Baohua Zhang, Yuchen Liu, Qianqiu Tan

In the past few years, researchers have proposed many methods to address the above-mentioned issues and achieved very good results on publicly available datasets such as the Cornell dataset and the Jacquard dataset.

Ranked #1 on Robotic Grasping on NBMOD (using extra training data)

Robotic Grasping

Hierarchical Visual Categories Modeling: A Joint Representation Learning and Density Estimation Framework for Out-of-Distribution Detection

no code implementations • ICCV 2023 • Jinglun Li, Xinyu Zhou, Pinxue Guo, Yixuan Sun, Yiwen Huang, Weifeng Ge, Wenqiang Zhang

We use one fold as the in-distribution dataset and the others as out-of-distribution datasets to evaluate the proposed method.

Density Estimation Out-of-Distribution Detection +1

Mobile Augmented Reality with Federated Learning in the Metaverse

no code implementations • 16 Dec 2022 • Xinyu Zhou, Jun Zhao

The Metaverse is deemed the next evolution of the Internet and has received much attention recently.

Federated Learning object-detection +2

Resource Allocation of Federated Learning for the Metaverse with Mobile Augmented Reality

no code implementations • 16 Nov 2022 • Xinyu Zhou, Chang Liu, Jun Zhao

The Metaverse has received much attention recently.

Federated Learning object-detection +3

Joint Optimization of Energy Consumption and Completion Time in Federated Learning

no code implementations • 29 Sep 2022 • Xinyu Zhou, Jun Zhao, Huimei Han, Claude Guet

Federated Learning (FL) is an intriguing distributed machine learning approach due to its privacy-preserving characteristics.

Federated Learning Privacy Preserving +1

Injecting Image Details into CLIP's Feature Space

no code implementations • 31 Aug 2022 • Zilun Zhang, Cuifeng Shen, Yuan Shen, Huixin Xiong, Xinyu Zhou

Although CLIP-like Visual Language Models provide a functional joint feature space for image and text, due to the limitation of the CLIP-like model's image input size (e.g., 224), subtle details are lost in the feature representation if we input high-resolution images (e.g., 2240).

Retrieval

A New Learning Paradigm for Stochastic Configuration Network: SCN+

no code implementations • 11 Mar 2022 • Yanshuang Ao, Xinyu Zhou, Wei Dai

This novel algorithm can leverage privileged information into SCN in the training stage, which provides a new method to train SCN.

Incremental Learning

EvUnroll: Neuromorphic Events Based Rolling Shutter Image Correction

1 code implementation • CVPR 2022 • Xinyu Zhou, Peiqi Duan, Yi Ma, Boxin Shi

This paper proposes to use neuromorphic events for correcting rolling shutter (RS) images as consecutive global shutter (GS) frames.

Nanorobot queue: Cooperative treatment of cancer based on team member communication and image processing

no code implementations • 22 Nov 2021 • Xinyu Zhou

Nanorobots have been used clinically for tasks such as gastroscopy, photoacoustic tomography has been proposed for controlling nanorobots to deliver drugs at designated points in real time, and there are cases of nanorobots eliminating "superbacteria" in blood. However, most of these technologies are immature: they have low efficiency or low accuracy, or they cannot be mass-produced. As a result, the most effective way to treat cancer at this stage remains chemotherapy and radiotherapy.

Image Classification

Machine Learning Applications in Forecasting of COVID-19 Based on Patients' Individual Symptoms

no code implementations • 29 Sep 2021 • Zhanyang Sun, Rui Ding, Xinyu Zhou

Predicting the COVID-19 outbreak has been studied by many researchers in recent years.

BIG-bench Machine Learning

EventZoom: Learning To Denoise and Super Resolve Neuromorphic Events

no code implementations • CVPR 2021 • Peiqi Duan, Zihao W. Wang, Xinyu Zhou, Yi Ma, Boxin Shi

EventZoom is trained in a noise-to-noise fashion where the two ends of the network are unfiltered noisy events, enforcing noise-free event restoration.

Denoising Image Reconstruction +1

RPPLNS: Pay-per-last-N-shares with a Randomised Twist

no code implementations • 15 Feb 2021 • Jonathan Katz, Philip Lazos, Francisco J. Marmolejo-Cossío, Xinyu Zhou

"Pay-per-last-$N$-shares" (PPLNS) is one of the most common payout strategies used by mining pools in Proof-of-Work (PoW) cryptocurrencies.
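
The basic PPLNS rule can be sketched as follows, assuming the usual formulation in which a block reward is split among the miners who submitted the last $N$ shares, in proportion to their share counts in that window. The function and miner names are hypothetical, and the sketch covers plain PPLNS only, not the randomised variant the paper proposes.

```python
from collections import Counter


def pplns_payout(share_log, n, block_reward):
    """Split block_reward over the last n shares in share_log.

    share_log lists miner ids in submission order, most recent last.
    Returns a dict mapping each miner in the window to its payout.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    window = share_log[-n:]  # last n shares, or all shares if fewer exist
    counts = Counter(window)
    return {
        miner: block_reward * count / len(window)
        for miner, count in counts.items()
    }


shares = ["alice", "bob", "alice", "carol", "alice", "bob"]
print(pplns_payout(shares, n=4, block_reward=8.0))
```

With `n=4` only the last four shares count, so the two earliest shares in the log earn nothing from this block.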

Fairness Computer Science and Game Theory Cryptography and Security

Feature Space Singularity for Out-of-Distribution Detection

1 code implementation • 30 Nov 2020 • Haiwen Huang, Zhihan Li, Lulu Wang, Sishuo Chen, Bin Dong, Xinyu Zhou

Our analysis of the phenomenon reveals why our algorithm works.

Ranked #1 on Out-of-Distribution Detection on MS-1M vs. IJB-C

Out-of-Distribution Detection Out of Distribution (OOD) Detection

A Review of Automated Diagnosis of COVID-19 Based on Scanning Images

no code implementations • 9 Jun 2020 • Delong Chen, Shunhui Ji, Fan Liu, Zewen Li, Xinyu Zhou

The pandemic of COVID-19 has caused millions of infections, which has led to a great loss all over the world, socially and economically.

Computed Tomography (CT) Domain Adaptation +1

Component-wise Adaptive Trimming For Robust Mixture Regression

no code implementations • 23 May 2020 • Wennan Chang, Xinyu Zhou, Yong Zang, Chi Zhang, Sha Cao

Existing robust mixture regression methods suffer from outliers as they either conduct parameter estimation in the presence of outliers, or rely on prior knowledge of the level of outlier contamination.

Outlier Detection regression

DPGN: Distribution Propagation Graph Network for Few-shot Learning

1 code implementation • CVPR 2020 • Ling Yang, Liangliang Li, Zilun Zhang, Xinyu Zhou, Erjin Zhou, Yu Liu

To combine the distribution-level relations and instance-level relations for all examples, we construct a dual complete graph network which consists of a point graph and a distribution graph with each node standing for an example.

Ranked #2 on Few-Shot Learning on Mini-ImageNet - 1-Shot Learning

Few-Shot Learning Relation

175

Learning Delicate Local Representations for Multi-Person Pose Estimation

4 code implementations • ECCV 2020 • Yuanhao Cai, Zhicheng Wang, Zhengxiong Luo, Binyi Yin, Angang Du, Haoqian Wang, Xiangyu Zhang, Xinyu Zhou, Erjin Zhou, Jian Sun

To tackle this problem, we propose an efficient attention mechanism - Pose Refine Machine (PRM) to make a trade-off between local and global representations in output features and further refine the keypoint locations.

Ranked #1 on Keypoint Detection on COCO test-challenge

Keypoint Detection Multi-Person Pose Estimation

5,006

Learning to Run with Actor-Critic Ensemble

2 code implementations • 25 Dec 2017 • Zhewei Huang, Shuchang Zhou, BoEr Zhuang, Xinyu Zhou

We introduce an Actor-Critic Ensemble (ACE) method for improving the performance of the Deep Deterministic Policy Gradient (DDPG) algorithm.

124

ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices

37 code implementations • CVPR 2018 • Xiangyu Zhang, Xinyu Zhou, Mengxiao Lin, Jian Sun

We introduce an extremely computation-efficient CNN architecture named ShuffleNet, which is designed specially for mobile devices with very limited computing power (e.g., 10-150 MFLOPs).

Ranked #79 on Person Re-Identification on DukeMTMC-reID

General Classification Image Classification +2

6,298

EAST: An Efficient and Accurate Scene Text Detector

32 code implementations • CVPR 2017 • Xinyu Zhou, Cong Yao, He Wen, Yuzhi Wang, Shuchang Zhou, Weiran He, Jiajun Liang

Previous approaches for scene text detection have already achieved promising performances across various benchmarks.

Ranked #3 on Scene Text Detection on COCO-Text

Curved Text Detection Optical Character Recognition (OCR) +1

38,490

Training Bit Fully Convolutional Network for Fast Semantic Segmentation

no code implementations • 1 Dec 2016 • He Wen, Shuchang Zhou, Zhe Liang, Yuxiang Zhang, Dieqiao Feng, Xinyu Zhou, Cong Yao

Fully convolutional neural networks give accurate, per-pixel prediction for input images and have applications like semantic segmentation.

Segmentation Semantic Segmentation

Effective Quantization Methods for Recurrent Neural Networks

2 code implementations • 30 Nov 2016 • Qinyao He, He Wen, Shuchang Zhou, Yuxin Wu, Cong Yao, Xinyu Zhou, Yuheng Zou

In addition, we propose balanced quantization methods for weights to further reduce performance degradation.

Quantization

Scene Text Detection via Holistic, Multi-Channel Prediction

no code implementations • 29 Jun 2016 • Cong Yao, Xiang Bai, Nong Sang, Xinyu Zhou, Shuchang Zhou, Zhimin Cao

Recently, scene text detection has become an active research topic in computer vision and document analysis, because of its great importance and significant challenge.

Ranked #6 on Scene Text Detection on COCO-Text

Scene Text Detection Semantic Segmentation +1

DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients

12 code implementations • 20 Jun 2016 • Shuchang Zhou, Yuxin Wu, Zekun Ni, Xinyu Zhou, He Wen, Yuheng Zou

We propose DoReFa-Net, a method to train convolutional neural networks that have low bitwidth weights and activations using low bitwidth parameter gradients.
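
To make "low bitwidth" concrete, here is a minimal sketch of a uniform k-bit quantizer of the kind used in low-bitwidth training: it maps a value in [0, 1] to the nearest of 2^k evenly spaced levels. The function name, the round-half-up choice, and the sample inputs are assumptions for illustration; the paper's full scheme for weights, activations, and gradients is not reproduced here.

```python
import math


def quantize_k(r, k):
    """Quantize r in [0, 1] to the nearest of 2**k evenly spaced levels."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1]")
    steps = 2 ** k - 1  # number of intervals between the 2**k levels
    # floor(x + 0.5) rounds halves up, avoiding Python's round-half-to-even.
    return math.floor(r * steps + 0.5) / steps


# With k = 2 the representable values are 0, 1/3, 2/3, and 1.
print([quantize_k(x, 2) for x in (0.0, 0.3, 0.6, 1.0)])
```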

Quantization

6,298

Exploiting Local Structures with the Kronecker Layer in Convolutional Networks

no code implementations • 31 Dec 2015 • Shuchang Zhou, Jia-Nan Wu, Yuxin Wu, Xinyu Zhou

In this paper, we propose and study a technique to reduce the number of parameters and computation time in convolutional neural networks.

Scene Text Recognition

Incidental Scene Text Understanding: Recent Progresses on ICDAR 2015 Robust Reading Competition Challenge 4

no code implementations • 30 Nov 2015 • Cong Yao, Jia-Nan Wu, Xinyu Zhou, Chi Zhang, Shuchang Zhou, Zhimin Cao, Qi Yin

Different from focused texts present in natural images, which are captured with user's intention and intervention, incidental texts usually exhibit much more diversity, variability and complexity, thus posing significant difficulties and challenges for scene text detection and recognition algorithms.

Scene Text Detection Text Detection

ICDAR 2015 Text Reading in the Wild Competition

no code implementations • 10 Jun 2015 • Xinyu Zhou, Shuchang Zhou, Cong Yao, Zhimin Cao, Qi Yin

Recently, text detection and recognition in natural scenes have become increasingly popular in the computer vision community as well as the document analysis community.

Text Detection
