Search Results for author: Weilin Huang

Found 48 papers, 23 papers with code

UniFL: Improve Stable Diffusion via Unified Feedback Learning

no code implementations 8 Apr 2024 Jiacheng Zhang, Jie Wu, Yuxi Ren, Xin Xia, Huafeng Kuang, Pan Xie, Jiashi Li, Xuefeng Xiao, Weilin Huang, Min Zheng, Lean Fu, Guanbin Li

Diffusion models have revolutionized the field of image generation, leading to the proliferation of high-quality models and diverse downstream applications.

Image Generation

Turbo: Informativity-Driven Acceleration Plug-In for Vision-Language Models

no code implementations 12 Dec 2023 Chen Ju, Haicheng Wang, Zeqian Li, Xu Chen, Zhonghua Zhai, Weilin Huang, Shuai Xiao

Vision-Language Large Models (VLMs) have become a primary backbone of AI due to their impressive performance.

Enhancing Cross-domain Click-Through Rate Prediction via Explicit Feature Augmentation

no code implementations 30 Nov 2023 Xu Chen, Zida Cheng, Jiangchao Yao, Chen Ju, Weilin Huang, Jinsong Lan, Xiaoyi Zeng, Shuai Xiao

Later the augmentation network employs the explicit cross-domain knowledge as augmented information to boost the target domain CTR prediction.

Click-Through Rate Prediction Transfer Learning

Forgedit: Text Guided Image Editing via Learning and Forgetting

1 code implementation 19 Sep 2023 Shiwen Zhang, Shuai Xiao, Weilin Huang

Text-guided image editing on real or synthetic images, given only the original image itself and the target text prompt as inputs, is a very general and challenging task.


Cross-domain Augmentation Networks for Click-Through Rate Prediction

no code implementations 6 May 2023 Xu Chen, Zida Cheng, Shuai Xiao, Xiaoyi Zeng, Weilin Huang

The translation network is able to compute features from two domains with heterogeneous inputs separately by designing two independent branches, and then learn meaningful cross-domain knowledge using a designed cross-supervised feature translator.

Click-Through Rate Prediction Transfer Learning +1

Image to Multi-Modal Retrieval for Industrial Scenarios

no code implementations 6 May 2023 Zida Cheng, Chen Ju, Xu Chen, Zhonghua Zhai, Shuai Xiao, Xiaoyi Zeng, Weilin Huang

We formally define a novel valuable information retrieval task: image-to-multi-modal-retrieval (IMMR), where the query is an image and the doc is an entity with both image and textual description.

Cross-Modal Retrieval Information Retrieval +2

Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding

no code implementations 28 Sep 2022 Fengyuan Shi, Ruopeng Gao, Weilin Huang, LiMin Wang

The sampling module aims to select these informative patches by predicting the offsets with respect to a reference point, while the decoding module works for extracting the grounded object information by performing cross attention between image features and text features.

Visual Grounding

Cross-Architecture Self-supervised Video Representation Learning

1 code implementation CVPR 2022 Sheng Guo, Zihua Xiong, Yujie Zhong, LiMin Wang, Xiaobo Guo, Bing Han, Weilin Huang

In this paper, we present a new cross-architecture contrastive learning (CACL) framework for self-supervised video representation learning.

Action Recognition Contrastive Learning +4

InsCLR: Improving Instance Retrieval with Self-Supervision

1 code implementation 2 Dec 2021 Zelu Deng, Yujie Zhong, Sheng Guo, Weilin Huang

This work aims at improving instance retrieval with self-supervision.


End-to-End Dense Video Grounding via Parallel Regression

no code implementations 23 Sep 2021 Fengyuan Shi, Weilin Huang, LiMin Wang

In this paper, we tackle a new problem of dense video grounding, by simultaneously localizing multiple moments with a paragraph as input.

regression Sentence +1

TOOD: Task-aligned One-stage Object Detection

5 code implementations ICCV 2021 Chengjian Feng, Yujie Zhong, Yu Gao, Matthew R. Scott, Weilin Huang

One-stage object detection is commonly implemented by optimizing two sub-tasks: object classification and localization, using heads with two parallel branches, which might lead to a certain level of spatial misalignment in predictions between the two tasks.

Object object-detection +1

Exploring Classification Equilibrium in Long-Tailed Object Detection

1 code implementation ICCV 2021 Chengjian Feng, Yujie Zhong, Weilin Huang

Specifically, EBL increases the intensity of the adjustment of the decision boundary for the weak classes by a designed score-guided loss margin between any two classes.

Classification imbalanced classification +5

Mutually-aware Sub-Graphs Differentiable Architecture Search

no code implementations 9 Jul 2021 Haoxian Tan, Sheng Guo, Yujie Zhong, Matthew R. Scott, Weilin Huang

In this paper, we propose a conceptually simple yet efficient method to bridge these two paradigms, referred to as Mutually-aware Sub-Graphs Differentiable Architecture Search (MSG-DAS).

Brain Image Synthesis With Unsupervised Multivariate Canonical CSC$\ell_4$Net

no code implementations CVPR 2021 Yawen Huang, Feng Zheng, Danyang Wang, Weilin Huang, Matthew R. Scott, Ling Shao

Recent advances in neuroscience have highlighted the effectiveness of multi-modal medical data for investigating certain pathologies and understanding human cognition.

Image Generation

Rethinking Deep Contrastive Learning with Embedding Memory

no code implementations 25 Mar 2021 Haozhi Zhang, Xun Wang, Weilin Huang, Matthew R. Scott

Pair-wise loss functions have been extensively studied and shown to continuously improve the performance of deep metric learning (DML).

Contrastive Learning Metric Learning +1

Brain Image Synthesis with Unsupervised Multivariate Canonical CSC$\ell_4$Net

no code implementations 22 Mar 2021 Yawen Huang, Feng Zheng, Danyang Wang, Weilin Huang, Matthew R. Scott, Ling Shao

Recent advances in neuroscience have highlighted the effectiveness of multi-modal medical data for investigating certain pathologies and understanding human cognition.

Image Generation

Unchain the Search Space with Hierarchical Differentiable Architecture Search

1 code implementation 11 Jan 2021 Guanting Liu, Yujie Zhong, Sheng Guo, Matthew R. Scott, Weilin Huang

To overcome this limitation, in this paper, we propose a Hierarchical Differentiable Architecture Search (H-DAS) that performs architecture search both at the cell level and at the stage level.

V4D: 4D Convolutional Neural Networks for Video-level Representation Learning

no code implementations ICLR 2020 Shiwen Zhang, Sheng Guo, Weilin Huang, Matthew R. Scott, Li-Min Wang

Most existing 3D CNN structures for video representation learning are clip-based methods, and do not consider video-level temporal evolution of spatio-temporal features.

Representation Learning Video Recognition

Deformable Siamese Attention Networks for Visual Object Tracking

1 code implementation CVPR 2020 Yuechen Yu, Yilei Xiong, Weilin Huang, Matthew R. Scott

In this paper, we propose Deformable Siamese Attention Networks, referred to as SiamAttn, by introducing a new Siamese attention mechanism that computes deformable self-attention and cross-attention.

Object Visual Object Tracking

Channel Interaction Networks for Fine-Grained Image Categorization

no code implementations AAAI-2020 2020 Yu Gao, Xintong Han, Xun Wang, Weilin Huang, Matthew R. Scott

Fine-grained image categorization is challenging due to the subtle inter-class differences. We posit that exploiting the rich relationships between channels can help capture such differences since different channels correspond to different semantics.

Image Categorization Metric Learning

iFAN: Image-Instance Full Alignment Networks for Adaptive Object Detection

no code implementations 9 Mar 2020 Chenfan Zhuang, Xintong Han, Weilin Huang, Matthew R. Scott

We propose Image-Instance Full Alignment Networks (iFAN) to tackle this problem by precisely aligning feature distributions on both image and instance levels: 1) Image-level alignment: multi-scale features are roughly aligned by training adversarial domain classifiers in a hierarchically-nested fashion.

Domain Adaptation Metric Learning +3

V4D: 4D Convolutional Neural Networks for Video-level Representation Learning

1 code implementation 18 Feb 2020 Shiwen Zhang, Sheng Guo, Weilin Huang, Matthew R. Scott, Li-Min Wang

Most existing 3D CNNs for video representation learning are clip-based methods, and thus do not consider video-level temporal evolution of spatio-temporal features.

Long-range modeling Representation Learning +1

Knowledge Integration Networks for Action Recognition

no code implementations 18 Feb 2020 Shiwen Zhang, Sheng Guo, Li-Min Wang, Weilin Huang, Matthew R. Scott

We design a three-branch architecture consisting of a main branch for action recognition, and two auxiliary branches for human parsing and scene recognition which allow the model to encode the knowledge of human and scene for action recognition.

Action Recognition Human Parsing +2

Cross-Batch Memory for Embedding Learning

5 code implementations CVPR 2020 Xun Wang, Haozhi Zhang, Weilin Huang, Matthew R. Scott

This suggests that the features of instances computed at preceding iterations can be used to considerably approximate their features extracted by the current model.

Image Retrieval Metric Learning +1

Convolutional Character Networks

1 code implementation ICCV 2019 Linjie Xing, Zhi Tian, Weilin Huang, Matthew R. Scott

We evaluate CharNet on three standard benchmarks, where it consistently outperforms the state-of-the-art approaches [25, 24] by a large margin, e.g., with improvements of 65.33%->71.08% (with generic lexicon) on ICDAR 2015, and 54.0%->69.23% on Total-Text, on end-to-end text recognition.

Scene Text Detection Text Detection

Dual-Stream Pyramid Registration Network

2 code implementations 26 Sep 2019 Miao Kang, Xiaojun Hu, Weilin Huang, Matthew R. Scott, Mauricio Reyes

We propose a Dual-Stream Pyramid Registration Network (referred to as Dual-PRNet) for unsupervised 3D medical image registration.

Image Registration Medical Image Registration

The iMaterialist Fashion Attribute Dataset

1 code implementation 13 Jun 2019 Sheng Guo, Weilin Huang, Xiao Zhang, Prasanna Srikhanta, Yin Cui, Yuan Li, Matthew R. Scott, Hartwig Adam, Serge Belongie

The dataset is constructed from over one million fashion images with a label space that includes 8 groups of 228 fine-grained attributes in total.

Attribute General Classification +2

Multi-Similarity Loss with General Pair Weighting for Deep Metric Learning

2 code implementations CVPR 2019 Xun Wang, Xintong Han, Weilin Huang, Dengke Dong, Matthew R. Scott

A family of loss functions built on pair-based computation have been proposed in the literature which provide a myriad of solutions for deep metric learning.

Image Retrieval Metric Learning +1

Compatible and Diverse Fashion Image Inpainting

no code implementations 4 Feb 2019 Xintong Han, Zuxuan Wu, Weilin Huang, Matthew R. Scott, Larry S. Davis

The latent representations are jointly optimized with the corresponding generation network to condition the synthesis process, encouraging a diverse set of generated results that are visually compatible with existing fashion garments.

Fashion Synthesis Image Inpainting

Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge

1 code implementation5 Nov 2018 Spyridon Bakas, Mauricio Reyes, Andras Jakab, Stefan Bauer, Markus Rempfler, Alessandro Crimi, Russell Takeshi Shinohara, Christoph Berger, Sung Min Ha, Martin Rozycki, Marcel Prastawa, Esther Alberts, Jana Lipkova, John Freymann, Justin Kirby, Michel Bilello, Hassan Fathallah-Shaykh, Roland Wiest, Jan Kirschke, Benedikt Wiestler, Rivka Colen, Aikaterini Kotrotsou, Pamela Lamontagne, Daniel Marcus, Mikhail Milchenko, Arash Nazeri, Marc-Andre Weber, Abhishek Mahajan, Ujjwal Baid, Elizabeth Gerstner, Dongjin Kwon, Gagan Acharya, Manu Agarwal, Mahbubul Alam, Alberto Albiol, Antonio Albiol, Francisco J. Albiol, Varghese Alex, Nigel Allinson, Pedro H. A. Amorim, Abhijit Amrutkar, Ganesh Anand, Simon Andermatt, Tal Arbel, Pablo Arbelaez, Aaron Avery, Muneeza Azmat, Pranjal B., W Bai, Subhashis Banerjee, Bill Barth, Thomas Batchelder, Kayhan Batmanghelich, Enzo Battistella, Andrew Beers, Mikhail Belyaev, Martin Bendszus, Eze Benson, Jose Bernal, Halandur Nagaraja Bharath, George Biros, Sotirios Bisdas, James Brown, Mariano Cabezas, Shilei Cao, Jorge M. Cardoso, Eric N Carver, Adrià Casamitjana, Laura Silvana Castillo, Marcel Catà, Philippe Cattin, Albert Cerigues, Vinicius S. Chagas, Siddhartha Chandra, Yi-Ju Chang, Shiyu Chang, Ken Chang, Joseph Chazalon, Shengcong Chen, Wei Chen, Jefferson W. Chen, Zhaolin Chen, Kun Cheng, Ahana Roy Choudhury, Roger Chylla, Albert Clérigues, Steven Colleman, Ramiro German Rodriguez Colmeiro, Marc Combalia, Anthony Costa, Xiaomeng Cui, Zhenzhen Dai, Lutao Dai, Laura Alexandra Daza, Eric Deutsch, Changxing Ding, Chao Dong, Shidu Dong, Wojciech Dudzik, Zach Eaton-Rosen, Gary Egan, Guilherme Escudero, Théo Estienne, Richard Everson, Jonathan Fabrizio, Yong Fan, Longwei Fang, Xue Feng, Enzo Ferrante, Lucas Fidon, Martin Fischer, Andrew P. French, Naomi Fridman, Huan Fu, David Fuentes, Yaozong Gao, Evan Gates, David Gering, Amir Gholami, Willi Gierke, Ben Glocker, Mingming Gong, Sandra González-Villá, T. 
Grosges, Yuanfang Guan, Sheng Guo, Sudeep Gupta, Woo-Sup Han, Il Song Han, Konstantin Harmuth, Huiguang He, Aura Hernández-Sabaté, Evelyn Herrmann, Naveen Himthani, Winston Hsu, Cheyu Hsu, Xiaojun Hu, Xiaobin Hu, Yan Hu, Yifan Hu, Rui Hua, Teng-Yi Huang, Weilin Huang, Sabine Van Huffel, Quan Huo, Vivek HV, Khan M. Iftekharuddin, Fabian Isensee, Mobarakol Islam, Aaron S. Jackson, Sachin R. Jambawalikar, Andrew Jesson, Weijian Jian, Peter Jin, V Jeya Maria Jose, Alain Jungo, B Kainz, Konstantinos Kamnitsas, Po-Yu Kao, Ayush Karnawat, Thomas Kellermeier, Adel Kermi, Kurt Keutzer, Mohamed Tarek Khadir, Mahendra Khened, Philipp Kickingereder, Geena Kim, Nik King, Haley Knapp, Urspeter Knecht, Lisa Kohli, Deren Kong, Xiangmao Kong, Simon Koppers, Avinash Kori, Ganapathy Krishnamurthi, Egor Krivov, Piyush Kumar, Kaisar Kushibar, Dmitrii Lachinov, Tryphon Lambrou, Joon Lee, Chengen Lee, Yuehchou Lee, M Lee, Szidonia Lefkovits, Laszlo Lefkovits, James Levitt, Tengfei Li, Hongwei Li, Hongyang Li, Xiaochuan Li, Yuexiang Li, Heng Li, Zhenye Li, Xiaoyu Li, Zeju Li, Xiaogang Li, Wenqi Li, Zheng-Shen Lin, Fengming Lin, Pietro Lio, Chang Liu, Boqiang Liu, Xiang Liu, Mingyuan Liu, Ju Liu, Luyan Liu, Xavier Llado, Marc Moreno Lopez, Pablo Ribalta Lorenzo, Zhentai Lu, Lin Luo, Zhigang Luo, Jun Ma, Kai Ma, Thomas Mackie, Anant Madabushi, Issam Mahmoudi, Klaus H. Maier-Hein, Pradipta Maji, CP Mammen, Andreas Mang, B. S. Manjunath, Michal Marcinkiewicz, S McDonagh, Stephen McKenna, Richard McKinley, Miriam Mehl, Sachin Mehta, Raghav Mehta, Raphael Meier, Christoph Meinel, Dorit Merhof, Craig Meyer, Robert Miller, Sushmita Mitra, Aliasgar Moiyadi, David Molina-Garcia, Miguel A. B. Monteiro, Grzegorz Mrukwa, Andriy Myronenko, Jakub Nalepa, Thuyen Ngo, Dong Nie, Holly Ning, Chen Niu, Nicholas K Nuechterlein, Eric Oermann, Arlindo Oliveira, Diego D. C. Oliveira, Arnau Oliver, Alexander F. I. Osman, Yu-Nian Ou, Sebastien Ourselin, Nikos Paragios, Moo Sung Park, Brad Paschke, J. 
Gregory Pauloski, Kamlesh Pawar, Nick Pawlowski, Linmin Pei, Suting Peng, Silvio M. Pereira, Julian Perez-Beteta, Victor M. Perez-Garcia, Simon Pezold, Bao Pham, Ashish Phophalia, Gemma Piella, G. N. Pillai, Marie Piraud, Maxim Pisov, Anmol Popli, Michael P. Pound, Reza Pourreza, Prateek Prasanna, Vesna Prkovska, Tony P. Pridmore, Santi Puch, Élodie Puybareau, Buyue Qian, Xu Qiao, Martin Rajchl, Swapnil Rane, Michael Rebsamen, Hongliang Ren, Xuhua Ren, Karthik Revanuru, Mina Rezaei, Oliver Rippel, Luis Carlos Rivera, Charlotte Robert, Bruce Rosen, Daniel Rueckert, Mohammed Safwan, Mostafa Salem, Joaquim Salvi, Irina Sanchez, Irina Sánchez, Heitor M. Santos, Emmett Sartor, Dawid Schellingerhout, Klaudius Scheufele, Matthew R. Scott, Artur A. Scussel, Sara Sedlar, Juan Pablo Serrano-Rubio, N. Jon Shah, Nameetha Shah, Mazhar Shaikh, B. Uma Shankar, Zeina Shboul, Haipeng Shen, Dinggang Shen, Linlin Shen, Haocheng Shen, Varun Shenoy, Feng Shi, Hyung Eun Shin, Hai Shu, Diana Sima, M Sinclair, Orjan Smedby, James M. Snyder, Mohammadreza Soltaninejad, Guidong Song, Mehul Soni, Jean Stawiaski, Shashank Subramanian, Li Sun, Roger Sun, Jiawei Sun, Kay Sun, Yu Sun, Guoxia Sun, Shuang Sun, Yannick R Suter, Laszlo Szilagyi, Sanjay Talbar, DaCheng Tao, Zhongzhao Teng, Siddhesh Thakur, Meenakshi H Thakur, Sameer Tharakan, Pallavi Tiwari, Guillaume Tochon, Tuan Tran, Yuhsiang M. Tsai, Kuan-Lun Tseng, Tran Anh Tuan, Vadim Turlapov, Nicholas Tustison, Maria Vakalopoulou, Sergi Valverde, Rami Vanguri, Evgeny Vasiliev, Jonathan Ventura, Luis Vera, Tom Vercauteren, C. A. Verrastro, Lasitha Vidyaratne, Veronica Vilaplana, Ajeet Vivekanandan, Qian Wang, Chiatse J. 
Wang, Wei-Chung Wang, Duo Wang, Ruixuan Wang, Yuanyuan Wang, Chunliang Wang, Guotai Wang, Ning Wen, Xin Wen, Leon Weninger, Wolfgang Wick, Shaocheng Wu, Qiang Wu, Yihong Wu, Yong Xia, Yanwu Xu, Xiaowen Xu, Peiyuan Xu, Tsai-Ling Yang, Xiaoping Yang, Hao-Yu Yang, Junlin Yang, Haojin Yang, Guang Yang, Hongdou Yao, Xujiong Ye, Changchang Yin, Brett Young-Moxon, Jinhua Yu, Xiangyu Yue, Songtao Zhang, Angela Zhang, Kun Zhang, Xue-jie Zhang, Lichi Zhang, Xiaoyue Zhang, Yazhuo Zhang, Lei Zhang, Jian-Guo Zhang, Xiang Zhang, Tianhao Zhang, Sicheng Zhao, Yu Zhao, Xiaomei Zhao, Liang Zhao, Yefeng Zheng, Liming Zhong, Chenhong Zhou, Xiaobing Zhou, Fan Zhou, Hongtu Zhu, Jin Zhu, Ying Zhuge, Weiwei Zong, Jayashree Kalpathy-Cramer, Keyvan Farahani, Christos Davatzikos, Koen van Leemput, Bjoern Menze

This study assesses the state-of-the-art machine learning (ML) methods used for brain tumor image analysis in mpMRI scans, during the last seven instances of the International Brain Tumor Segmentation (BraTS) challenge, i.e., 2012-2018.

Brain Tumor Segmentation Survival Prediction +1

CurriculumNet: Weakly Supervised Learning from Large-Scale Web Images

2 code implementations ECCV 2018 Sheng Guo, Weilin Huang, Haozhi Zhang, Chenfan Zhuang, Dengke Dong, Matthew R. Scott, Dinglong Huang

We present a simple yet efficient approach capable of training deep neural networks on large-scale weakly-supervised web images, which are crawled raw from the Internet by using text queries, without any human annotation.

Ranked #1 on Image Classification on Clothing1M (using clean data) (using extra training data)

Image Classification Weakly-supervised Learning

An end-to-end TextSpotter with Explicit Alignment and Attention

2 code implementations CVPR 2018 Tong He, Zhi Tian, Weilin Huang, Chunhua Shen, Yu Qiao, Changming Sun

This allows the two tasks to work collaboratively by sharing convolutional features, which is critical to identify challenging text instances.

Text Detection

Single Shot Text Detector with Regional Attention

1 code implementation ICCV 2017 Pan He, Weilin Huang, Tong He, Qile Zhu, Yu Qiao, Xiaolin Li

Our text detector achieves an F-measure of 77% on the ICDAR 2015 benchmark, advancing the state-of-the-art results in [18, 28].

Scene Text Detection

Temporal HeartNet: Towards Human-Level Automatic Analysis of Fetal Cardiac Screening Video

no code implementations 3 Jul 2017 Weilin Huang, Christopher P. Bridge, J. Alison Noble, Andrew Zisserman

We present an automatic method to describe clinically useful information about scanning, and to guide image interpretation in ultrasound (US) videos of the fetal heart.

Knowledge Guided Disambiguation for Large-Scale Scene Classification with Multi-Resolution CNNs

2 code implementations 4 Oct 2016 Limin Wang, Sheng Guo, Weilin Huang, Yuanjun Xiong, Yu Qiao

Convolutional Neural Networks (CNNs) have made remarkable progress on scene recognition, partially due to recent large-scale scene datasets such as Places and Places2.

General Classification Scene Classification +1

Detecting Text in Natural Image with Connectionist Text Proposal Network

27 code implementations 12 Sep 2016 Zhi Tian, Weilin Huang, Tong He, Pan He, Yu Qiao

We propose a novel Connectionist Text Proposal Network (CTPN) that accurately localizes text lines in natural images.

Scene Text Detection

Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network

1 code implementation 31 Mar 2016 Tong He, Weilin Huang, Yu Qiao, Jian Yao

We propose a novel Cascaded Convolutional Text Network (CCTN) that joins two customized convolutional networks for coarse-to-fine text localization.

Scene Text Detection Text Detection

Locally-Supervised Deep Hybrid Model for Scene Recognition

no code implementations 27 Jan 2016 Sheng Guo, Weilin Huang, Li-Min Wang, Yu Qiao

Secondly, we propose a new Local Convolutional Supervision (LCS) layer to enhance the local structure of the image by directly propagating the label information to the convolutional layers.

General Classification Image Classification +1

Text-Attentional Convolutional Neural Networks for Scene Text Detection

no code implementations 12 Oct 2015 Tong He, Weilin Huang, Yu Qiao, Jian Yao

The rich supervision information enables the Text-CNN with a strong capability for discriminating ambiguous texts, and also increases its robustness against complicated background components.

Multi-Task Learning Scene Text Detection +3

Local Multi-Grouped Binary Descriptor with Ring-based Pooling Configuration and Optimization

no code implementations 22 Sep 2015 Yongqiang Gao, Weilin Huang, Yu Qiao

The performance of RMGD was evaluated on a number of publicly available benchmarks, where the RMGD outperforms the state-of-the-art binary descriptors significantly.

Places205-VGGNet Models for Scene Recognition

2 code implementations 7 Aug 2015 Limin Wang, Sheng Guo, Weilin Huang, Yu Qiao

We verify the performance of trained Places205-VGGNet models on three datasets: MIT67, SUN397, and Places205.

Computational Efficiency Object Recognition +1

Local Color Contrastive Descriptor for Image Classification

no code implementations 3 Aug 2015 Sheng Guo, Weilin Huang, Yu Qiao

Our descriptor enriches local image representation with both color and contrast information.

Classification General Classification +2

Reading Scene Text in Deep Convolutional Sequences

1 code implementation 14 Jun 2015 Pan He, Weilin Huang, Yu Qiao, Chen Change Loy, Xiaoou Tang

We develop a Deep-Text Recurrent Network (DTRN) that regards scene text reading as a sequence labelling problem.

Scene Text Recognition

Robust Face Recognition with Structural Binary Gradient Patterns

no code implementations 1 Jun 2015 Weilin Huang, Hujun Yin

To discover underlying local structures in the gradient domain, we compute image gradients from multiple directions and simplify them into a set of binary strings.

Face Recognition Robust Face Recognition
