Search Results for author: Shi-Yang Yan

Found 7 papers, 0 papers with code

Off-Policy Self-Critical Training for Transformer in Visual Paragraph Generation

no code implementations • 21 Jun 2020 • Shi-Yang Yan, Yang Hua, Neil M. Robertson

We tackle this problem by proposing an off-policy RL learning algorithm where a behaviour policy represented by GRUs performs the sampling.

Image Captioning • Reinforcement Learning (RL) • +1

ParaCNN: Visual Paragraph Generation via Adversarial Twin Contextual CNNs

no code implementations • 21 Apr 2020 • Shi-Yang Yan, Yang Hua, Neil Robertson

Furthermore, to enable ParaCNN to model paragraphs comprehensively, we also propose an adversarial twin net training scheme.

Image Captioning • Image Retrieval • +2

HorNet: A Hierarchical Offshoot Recurrent Network for Improving Person Re-ID via Image Captioning

no code implementations • 14 Aug 2019 • Shi-Yang Yan, Jun Xu, Yuai Liu, Lin Xu

Then the proposed HorNet can learn the visual and language representation from both the images and captions jointly, and thus enhance the performance of person re-ID.

Generative Adversarial Network • Image Captioning • +1

Image Captioning Based on a Hierarchical Attention Mechanism and Policy Gradient Optimization

no code implementations • 13 Nov 2018 • Shi-Yang Yan, Yuan Xie, Fang-Yu Wu, Jeremy S. Smith, Wenjin Lu, Bai-Ling Zhang

Automatically generating the descriptions of an image, i.e., image captioning, is an important and fundamental topic in artificial intelligence, which bridges the gap between computer vision and natural language processing.

Generative Adversarial Network • Image Captioning • +1

Hierarchical Multi-scale Attention Networks for Action Recognition

no code implementations • 25 Aug 2017 • Shi-Yang Yan, Jeremy S. Smith, Wenjin Lu, Bai-Ling Zhang

Through visualization of what has been learnt by the networks, it can be observed that both the attention regions of images and the hierarchical temporal structure can be captured by HM-AN.

Action Recognition • Hard Attention • +1

Traffic scene recognition based on deep cnn and vlad spatial pyramids

no code implementations • 24 Jul 2017 • Fang-Yu Wu, Shi-Yang Yan, Jeremy S. Smith, Bai-Ling Zhang

In this paper, we attempted to solve the traffic scene recognition problem by combining the feature representation capabilities of CNNs with the VLAD encoding scheme.

Region Proposal • Scene Classification • +1

CHAM: action recognition using convolutional hierarchical attention model

no code implementations • 9 May 2017 • Shi-Yang Yan, Jeremy S. Smith, Wenjin Lu, Bai-Ling Zhang

This paper presents improvements to the soft attention model by combining a convolutional LSTM with a hierarchical system architecture to recognize action categories in videos.

Action Recognition • Image Captioning • +1
