no code implementations • 4 Dec 2024 • Andreas Steiner, André Susano Pinto, Michael Tschannen, Daniel Keysers, Xiao Wang, Yonatan Bitton, Alexey Gritsenko, Matthias Minderer, Anthony Sherbondy, Shangbang Long, Siyang Qin, Reeve Ingle, Emanuele Bugliarello, Sahar Kazemzadeh, Thomas Mesnard, Ibrahim Alabdulmohsin, Lucas Beyer, Xiaohua Zhai
PaliGemma 2 is an upgrade of the PaliGemma open Vision-Language Model (VLM) based on the Gemma 2 family of language models.
no code implementations • 6 Jun 2024 • JieLin Qiu, William Han, Xuandong Zhao, Shangbang Long, Christos Faloutsos, Lei Li
With the development of large models, watermarks are increasingly employed to assert copyright, verify authenticity, or monitor content distribution.
1 code implementation • 25 Oct 2023 • Shangbang Long, Siyang Qin, Yasuhisa Fujii, Alessandro Bissacco, Michalis Raptis
We propose Hierarchical Text Spotter (HTS), a novel method for the joint task of word-level text spotting and geometric layout analysis.
1 code implementation • 16 May 2023 • Shangbang Long, Siyang Qin, Dmitry Panteleev, Alessandro Bissacco, Yasuhisa Fujii, Michalis Raptis
We organize a competition on hierarchical text detection and recognition.
no code implementations • 4 May 2023 • Chen-Yu Lee, Chun-Liang Li, Hao Zhang, Timothy Dozat, Vincent Perot, Guolong Su, Xiang Zhang, Kihyuk Sohn, Nikolai Glushnev, Renshen Wang, Joshua Ainslie, Shangbang Long, Siyang Qin, Yasuhisa Fujii, Nan Hua, Tomas Pfister
In FormNetV2, we introduce a centralized multimodal graph contrastive learning strategy to unify self-supervised pre-training for all modalities in one loss.
2 code implementations • CVPR 2022 • Shangbang Long, Siyang Qin, Dmitry Panteleev, Alessandro Bissacco, Yasuhisa Fujii, Michalis Raptis
In this paper, we bring them together and introduce the task of unified scene text detection and layout analysis.
3 code implementations • CVPR 2020 • Shangbang Long, Cong Yao
Synthetic data has been a critical tool for training scene text detection and recognition models.
no code implementations • 10 Feb 2020 • Shangbang Long, Yushuo Guan, Kaigui Bian, Cong Yao
Irregular scene text recognition has attracted much attention from the research community, mainly due to the complexity of text shapes in natural scenes.
1 code implementation • 30 Aug 2019 • Shangbang Long, Yushuo Guan, Bingxuan Wang, Kaigui Bian, Cong Yao
Reading text from natural images is challenging due to the great variety in text font, color, and size, as well as complex backgrounds.
1 code implementation • 13 Jul 2019 • Minghui Liao, Boyu Song, Shangbang Long, Minghang He, Cong Yao, Xiang Bai
Unlike previous methods, which paste rendered text onto static 2D images, our method renders the 3D virtual scene and the text instances as a whole.
1 code implementation • 10 Nov 2018 • Shangbang Long, Xin He, Cong Yao
As an important research area in computer vision, scene text detection and recognition has been inescapably influenced by this wave of revolution, consequentially entering the era of deep learning.
no code implementations • 18 Sep 2018 • Shangbang Long, Cunchao Tu, Zhiyuan Liu, Maosong Sun
It has been studied for several decades mainly by lawyers and judges, considered as a novel and prospective application of artificial intelligence techniques in the legal field.
3 code implementations • ECCV 2018 • Shangbang Long, Jiaqiang Ruan, Wenjie Zhang, Xin He, Wenhao Wu, Cong Yao
Driven by deep neural networks and large scale datasets, scene text detection methods have progressed substantially over the past years, continuously refreshing the performance records on various standard benchmarks.
Ranked #2 on Curved Text Detection on SCUT-CTW1500