Search Results for author: Yuan-Fang Wang

Found 16 papers, 6 papers with code

Multimodal Transfer: A Hierarchical Deep Convolutional Neural Network for Fast Artistic Style Transfer

2 code implementations • CVPR 2017 • Xin Wang, Geoffrey Oxholm, Da Zhang, Yuan-Fang Wang

That is, our scheme can generate results that are visually pleasing and more similar to multiple desired artistic styles with color and texture cues at multiple scales.

Style Transfer

Deep Reinforcement Learning for Visual Object Tracking in Videos

no code implementations • 31 Jan 2017 • Da Zhang, Hamid Maei, Xin Wang, Yuan-Fang Wang

In this paper we introduce a fully end-to-end approach for visual tracking in videos that learns to predict the bounding box locations of a target object at every frame.

Decision Making, Object, +4

Video Captioning via Hierarchical Reinforcement Learning

no code implementations • CVPR 2018 • Xin Wang, Wenhu Chen, Jiawei Wu, Yuan-Fang Wang, William Yang Wang

Video captioning is the task of automatically generating a textual description of the actions in a video.

Hierarchical Reinforcement Learning, reinforcement-learning, +2

Watch, Listen, and Describe: Globally and Locally Aligned Cross-Modal Attentions for Video Captioning

1 code implementation • NAACL 2018 • Xin Wang, Yuan-Fang Wang, William Yang Wang

Furthermore, for the first time, we validate the superior performance of the deep audio features on the video captioning task.

Video Captioning, Video Understanding

No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling

2 code implementations • ACL 2018 • Xin Wang, Wenhu Chen, Yuan-Fang Wang, William Yang Wang

Though impressive results have been achieved in visual captioning, the task of generating abstract stories from photo streams is still a little-tapped problem.

Ranked #13 on Visual Storytelling on VIST

Image Captioning, Visual Storytelling

S3D: Single Shot multi-Span Detector via Fully 3D Convolutional Networks

1 code implementation • 21 Jul 2018 • Da Zhang, Xiyang Dai, Xin Wang, Yuan-Fang Wang

In this paper, we present a novel Single Shot multi-Span Detector for temporal activity detection in long, untrimmed videos using a simple end-to-end fully three-dimensional convolutional (Conv3D) network.

Action Detection, Activity Detection

Dynamic Temporal Pyramid Network: A Closer Look at Multi-Scale Modeling for Activity Detection

no code implementations • 7 Aug 2018 • Da Zhang, Xiyang Dai, Yuan-Fang Wang

(3) We further exploit the temporal context of activities by appropriately fusing multi-scale feature maps, and demonstrate that both local and global temporal contexts are important.

Action Detection, Activity Detection, +2

Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation

no code implementations • CVPR 2019 • Xin Wang, Qiuyuan Huang, Asli Celikyilmaz, Jianfeng Gao, Dinghan Shen, Yuan-Fang Wang, William Yang Wang, Lei Zhang

Vision-language navigation (VLN) is the task of navigating an embodied agent to carry out natural language instructions inside real 3D environments.

Ranked #2 on Vision-Language Navigation on Room2Room

Imitation Learning, Reinforcement Learning (RL), +2

MAN: Moment Alignment Network for Natural Language Moment Retrieval via Iterative Graph Adjustment

no code implementations • CVPR 2019 • Da Zhang, Xiyang Dai, Xin Wang, Yuan-Fang Wang, Larry S. Davis

In this paper, we present Moment Alignment Network (MAN), a novel framework that unifies the candidate moment encoding and temporal structural reasoning in a single-shot feed-forward network.

Moment Retrieval, Natural Language Moment Retrieval, +1

VATEX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research

2 code implementations • ICCV 2019 • Xin Wang, Jiawei Wu, Junkun Chen, Lei Li, Yuan-Fang Wang, William Yang Wang

We also introduce two tasks for video-and-language research based on VATEX: (1) Multilingual Video Captioning, aimed at describing a video in various languages with a compact unified captioning model, and (2) Video-guided Machine Translation, to translate a source language description into the target language using the video information as additional spatiotemporal context.

Machine Translation, Translation, +3

Multi-View Non-negative Matrix Factorization Discriminant Learning via Cross Entropy Loss

no code implementations • 8 Jan 2022 • Jian-wei Liu, Yuan-Fang Wang, Run-kun Lu, Xionglin Luo

But not all of this information is useful for classification tasks.

Classification, MULTI-VIEW LEARNING

Auto-Encoder based Co-Training Multi-View Representation Learning

no code implementations • 9 Jan 2022 • Run-kun Lu, Jian-wei Liu, Yuan-Fang Wang, Hao-jie Xie, Xin Zuo

As is well known, the auto-encoder is a deep learning method that learns the latent features of raw data by reconstructing the input. Building on this, we propose a novel algorithm called Auto-encoder based Co-training Multi-View Learning (ACMVL), which utilizes both complementarity and consistency to find a joint latent feature representation of multiple views.

MULTI-VIEW LEARNING, Representation Learning

VREN: Volleyball Rally Dataset with Expression Notation Language

no code implementations • 28 Sep 2022 • Haotian Xia, Rhys Tracy, Yun Zhao, Erwan Fraisse, Yuan-Fang Wang, Linda Petzold

The second goal is to introduce a volleyball descriptive language to fully describe the rally processes in the games and apply the language to our dataset.

Decision Making, Descriptive, +1

Graph Encoding and Neural Network Approaches for Volleyball Analytics: From Game Outcome to Individual Play Predictions

no code implementations • 22 Aug 2023 • Rhys Tracy, Haotian Xia, Alex Rasla, Yuan-Fang Wang, Ambuj Singh

Our results show that the use of GNNs with our graph encoding yields a much more advanced analysis of the data, which noticeably improves prediction results overall.

Type prediction

Advanced Volleyball Stats for All Levels: Automatic Setting Tactic Detection and Classification with a Single Camera

1 code implementation • 26 Sep 2023 • Haotian Xia, Rhys Tracy, Yun Zhao, Yuqing Wang, Yuan-Fang Wang, Weining Shen

Our frameworks combine setting ball trajectory recognition with a novel set trajectory classifier to generate comprehensive and advanced statistical data.

Computational Efficiency, Pathfinder

SportQA: A Benchmark for Sports Understanding in Large Language Models

no code implementations • 24 Feb 2024 • Haotian Xia, Zhengbang Yang, Yuqing Wang, Rhys Tracy, Yun Zhao, Dongdong Huang, Zezhi Chen, Yan Zhu, Yuan-Fang Wang, Weining Shen

A deep understanding of sports, a field rich in strategic and dynamic content, is crucial for advancing Natural Language Processing (NLP).

Few-Shot Learning, Multiple-choice, +1
