Search Results for author: Wangmeng Xiang

Found 18 papers, 13 papers with code

WordArt Designer API: User-Driven Artistic Typography Synthesis with Large Language Models on ModelScope

no code implementations 3 Jan 2024 Jun-Yan He, Zhi-Qi Cheng, Chenyang Li, Jingdong Sun, Wangmeng Xiang, Yusen Hu, Xianhui Lin, Xiaoyang Kang, Zengke Jin, Bin Luo, Yifeng Geng, Xuansong Xie, Jingren Zhou

This paper introduces the WordArt Designer API, a novel framework for user-driven artistic typography synthesis utilizing Large Language Models (LLMs) on ModelScope.

AnyText: Multilingual Visual Text Generation And Editing

1 code implementation 6 Nov 2023 Yuxiang Tuo, Wangmeng Xiang, Jun-Yan He, Yifeng Geng, Xuansong Xie

Based on AnyWord-3M dataset, we propose AnyText-benchmark for the evaluation of visual text generation accuracy and quality.

Optical Character Recognition (OCR), Text Generation

A Benchmark for Chinese-English Scene Text Image Super-resolution

1 code implementation ICCV 2023 Jianqi Ma, Zhetong Liang, Wangmeng Xiang, Xi Yang, Lei Zhang

Scene Text Image Super-resolution (STISR) aims to recover high-resolution (HR) scene text images with visually pleasant and readable text content from the given low-resolution (LR) input.

Image Super-Resolution

DAMO-StreamNet: Optimizing Streaming Perception in Autonomous Driving

1 code implementation 30 Mar 2023 Jun-Yan He, Zhi-Qi Cheng, Chenyang Li, Wangmeng Xiang, Binghui Chen, Bin Luo, Yifeng Geng, Xuansong Xie

Real-time perception, or streaming perception, is a crucial aspect of autonomous driving that has yet to be thoroughly explored in existing research.

Autonomous Driving

MDQE: Mining Discriminative Query Embeddings to Segment Occluded Instances on Challenging Videos

1 code implementation CVPR 2023 Minghan Li, Shuai Li, Wangmeng Xiang, Lei Zhang

The proposed MDQE is the first VIS method with per-clip input that achieves state-of-the-art results on challenging videos and competitive performance on simple videos.

Instance Segmentation, Semantic Segmentation, +1

Generative Action Description Prompts for Skeleton-based Action Recognition

3 code implementations ICCV 2023 Wangmeng Xiang, Chao Li, Yuxuan Zhou, Biao Wang, Lei Zhang

More specifically, we employ a pre-trained large-scale language model as the knowledge engine to automatically generate text descriptions for body parts movements of actions, and propose a multi-modal training scheme by utilizing the text encoder to generate feature vectors for different body parts and supervise the skeleton encoder for action representation learning.

Action Recognition, Language Modelling, +2

Spatiotemporal Self-attention Modeling with Temporal Patch Shift for Action Recognition

1 code implementation 27 Jul 2022 Wangmeng Xiang, Chao Li, Biao Wang, Xihan Wei, Xian-Sheng Hua, Lei Zhang

For 3D video-based tasks such as action recognition, however, directly applying spatiotemporal transformers on video data will bring heavy computation and memory burdens due to the largely increased number of patches and the quadratic complexity of self-attention computation.

Action Classification, Action Recognition

SP-ViT: Learning 2D Spatial Priors for Vision Transformers

1 code implementation 15 Jun 2022 Yuxuan Zhou, Wangmeng Xiang, Chao Li, Biao Wang, Xihan Wei, Lei Zhang, Margret Keuper, Xiansheng Hua

Unlike convolutional inductive biases, which are forced to focus exclusively on hard-coded local regions, our proposed SPs are learned by the model itself and take a variety of spatial relations into account.

Image Classification

Real-World Video Super-Resolution: A Benchmark Dataset and a Decomposition Based Learning Scheme

1 code implementation ICCV 2021 Xi Yang, Wangmeng Xiang, Hui Zeng, Lei Zhang

Existing VSR methods are mostly trained and evaluated on synthetic datasets, where the LR videos are uniformly downsampled from their high-resolution (HR) counterparts by some simple operators (e.g., bicubic downsampling).

Video Super-Resolution

Homocentric Hypersphere Feature Embedding for Person Re-identification

no code implementations 24 Apr 2018 Wangmeng Xiang, Jianqiang Huang, Xianbiao Qi, Xian-Sheng Hua, Lei Zhang

Person re-identification (Person ReID) is a challenging task due to the large variations in camera viewpoint, lighting, resolution, and human pose.

Person Re-Identification
