Search Results for author: Jun Tang

Found 17 papers, 6 papers with code

OmniParser V2: Structured-Points-of-Thought for Unified Visual Text Parsing and Its Generality to Multimodal Large Language Models

1 code implementation 22 Feb 2025 Wenwen Yu, Zhibo Yang, Jianqiang Wan, Sibo Song, Jun Tang, Wenqing Cheng, Yuliang Liu, Xiang Bai

In this paper, we introduce OmniParser V2, a universal model that unifies typical visually situated text parsing (VsTP) tasks, including text spotting, key information extraction, table recognition, and layout analysis, into a single framework.

document understanding, Key Information Extraction +4

Cooperative ISAC-empowered Low-Altitude Economy

no code implementations 29 Dec 2024 Jun Tang, Yiming Yu, Cunhua Pan, Hong Ren, Dongming Wang, Jiangzhou Wang, Xiaohu You

This paper proposes a cooperative integrated sensing and communication (ISAC) scheme for the low-altitude sensing scenario, aiming at estimating the parameters of the unmanned aerial vehicles (UAVs) and enhancing the sensing performance via cooperation.

Integrated sensing and communication, ISAC +2

An Event-centric Framework for Predicting Crime Hotspots with Flexible Time Intervals

no code implementations 2 Nov 2024 Jiahui Jin, Yi Hong, Guandong Xu, Jinghui Zhang, Jun Tang, Hancheng Wang

Furthermore, we introduce a type-aware spatiotemporal point process that learns crime-evolving features, measuring the risk of specific crime types at a given time and location by considering the frequency of past crime events.

VL-Reader: Vision and Language Reconstructor is an Effective Scene Text Recognizer

no code implementations 18 Sep 2024 Humen Zhong, Zhibo Yang, Zhaohai Li, Peng Wang, Jun Tang, Wenqing Cheng, Cong Yao

Text recognition is an inherent integration of vision and language, encompassing the visual texture in stroke patterns and the semantic context among the character sequences.

Decoder, Scene Text Recognition

Platypus: A Generalized Specialist Model for Reading Text in Various Forms

1 code implementation 27 Aug 2024 Peng Wang, Zhaohai Li, Jun Tang, Humen Zhong, Fei Huang, Zhibo Yang, Cong Yao

Recently, generalist models (such as GPT-4V), trained on tremendous data in a unified way, have shown enormous potential in reading text in various scenarios, but with the drawbacks of limited accuracy and low efficiency.

Handwritten Text Recognition, Scene Text Recognition

A dataset of primary nasopharyngeal carcinoma MRI with multi-modalities segmentation

no code implementations 4 Apr 2024 Yin Li, Qi Chen, Kai Wang, Meige Li, Liping Si, Yingwei Guo, Yu Xiong, Qixing Wang, Yang Qin, Ling Xu, Patrick van der Smagt, Jun Tang, Nutan Chen

Multi-modality magnetic resonance imaging data with various sequences facilitate the early diagnosis, tumor segmentation, and disease staging in the management of nasopharyngeal carcinoma (NPC).

Management, Tumor Segmentation

Sequential Model for Predicting Patient Adherence in Subcutaneous Immunotherapy for Allergic Rhinitis

1 code implementation 21 Jan 2024 Yin Li, Yu Xiong, Wenxin Fan, Kai Wang, Qingqing Yu, Liping Si, Patrick van der Smagt, Jun Tang, Nutan Chen

How to enhance the adherence of patients to maximize the benefit of allergen immunotherapy (AIT) plays a crucial role in the management of AIT.

Management, Prediction

PTSEFormer: Progressive Temporal-Spatial Enhanced TransFormer Towards Video Object Detection

1 code implementation 6 Sep 2022 Han Wang, Jun Tang, Xiaodong Liu, Shanyan Guan, Rong Xie, Li Song

The temporal information is introduced by the temporal feature aggregation model (TFAM), by conducting an attention mechanism between the context frames and the target frame (i.e., the frame to be detected).

object-detection, Video Object Detection

Vision-Language Pre-Training for Boosting Scene Text Detectors

2 code implementations CVPR 2022 Sibo Song, Jianqiang Wan, Zhibo Yang, Jun Tang, Wenqing Cheng, Xiang Bai, Cong Yao

In this paper, we specifically adapt vision-language joint learning for scene text detection, a task that intrinsically involves cross-modal interaction between the two modalities: vision and language, since text is the written form of language.

Contrastive Learning, Language Modeling +5

MOST: A Multi-Oriented Scene Text Detector with Localization Refinement

no code implementations CVPR 2021 Minghang He, Minghui Liao, Zhibo Yang, Humen Zhong, Jun Tang, Wenqing Cheng, Cong Yao, Yongpan Wang, Xiang Bai

Over the past few years, the field of scene text detection has progressed so rapidly that modern text detectors are able to hunt text in various challenging scenarios.

Scene Text Detection, Text Detection

Temporally Object-based Video Co-Segmentation

no code implementations 9 Feb 2018 Michael Ying Yang, Matthias Reso, Jun Tang, Wentong Liao, Bodo Rosenhahn

Therefore, we formulate a graphical model to select a proposal stream for each object in which the pairwise potentials consist of the appearance dissimilarity between different streams in the same video and also the similarity between the streams in different videos.

Object Segmentation

Privacy Loss in Apple's Implementation of Differential Privacy on MacOS 10.12

no code implementations 8 Sep 2017 Jun Tang, Aleksandra Korolova, Xiaolong Bai, Xueqiang Wang, Xiao-Feng Wang

We discover and describe Apple's set-up for differentially private data processing, including the overall data pipeline, the parameters used for differentially private perturbation of each piece of data, and the frequency with which such data is sent to Apple's servers.

Management
