Search Results for author: Wang Zhu

Found 10 papers, 3 papers with code

Toward Interpretable Deep Reinforcement Learning with Linear Model U-Trees

no code implementations 16 Jul 2018 Guiliang Liu, Oliver Schulte, Wang Zhu, Qingcan Li

An LMUT is learned using a novel on-line algorithm that is well-suited for an active play setting, where the mimic learner observes an ongoing interaction between the neural net and the environment.

reinforcement-learning Reinforcement Learning (RL)

BabyWalk: Going Farther in Vision-and-Language Navigation by Taking Baby Steps

1 code implementation ACL 2020 Wang Zhu, Hexiang Hu, Jiacheng Chen, Zhiwei Deng, Vihan Jain, Eugene Ie, Fei Sha

To this end, we propose BabyWalk, a new VLN agent that learns to navigate by decomposing long instructions into shorter ones (BabySteps) and completing them sequentially.

Imitation Learning Navigate +1

Iterative Vision-and-Language Navigation

no code implementations CVPR 2023 Jacob Krantz, Shurjo Banerjee, Wang Zhu, Jason Corso, Peter Anderson, Stefan Lee, Jesse Thomason

We present Iterative Vision-and-Language Navigation (IVLN), a paradigm for evaluating language-guided agents navigating in a persistent environment over time.

Instruction Following Vision and Language Navigation

Navigating Memory Construction by Global Pseudo-Task Simulation for Continual Learning

1 code implementation 16 Oct 2022 Yejia Liu, Wang Zhu, Shaolei Ren

To provide an approximate solution to this problem in the online continual learning setting, we further propose the Global Pseudo-task Simulation (GPS), which mimics future catastrophic forgetting of the current task by permutation.

Combinatorial Optimization Continual Learning

Generalization Differences between End-to-End and Neuro-Symbolic Vision-Language Reasoning Systems

no code implementations 26 Oct 2022 Wang Zhu, Jesse Thomason, Robin Jia

For vision-and-language reasoning tasks, both fully connectionist, end-to-end methods and hybrid, neuro-symbolic methods have achieved high in-distribution performance.

Question Answering Visual Question Answering

Chain-of-Questions Training with Latent Answers for Robust Multistep Question Answering

no code implementations 24 May 2023 Wang Zhu, Jesse Thomason, Robin Jia

We train a language model (LM) to robustly answer multistep questions by generating and answering sub-questions.

Language Modelling Question Answering

Efficient End-to-End Visual Document Understanding with Rationale Distillation

no code implementations 16 Nov 2023 Wang Zhu, Alekh Agarwal, Mandar Joshi, Robin Jia, Jesse Thomason, Kristina Toutanova

Pre-processing tools, such as optical character recognition (OCR), can map document image inputs to textual tokens, after which large language models (LLMs) can reason over the text.

document understanding Optical Character Recognition +1

Does VLN Pretraining Work with Nonsensical or Irrelevant Instructions?

no code implementations 28 Nov 2023 Wang Zhu, Ishika Singh, Yuan Huang, Robin Jia, Jesse Thomason

Data augmentation via back-translation is common when pretraining Vision-and-Language Navigation (VLN) models, even though the generated instructions are noisy.

Data Augmentation Translation +1

Hybrid Transformer and Spatial-Temporal Self-Supervised Learning for Long-term Traffic Prediction

no code implementations 29 Jan 2024 Wang Zhu, Doudou Zhang, Baichao Long, Jianli Xiao

Long-term traffic prediction has always been a challenging task due to its dynamic temporal dependencies and complex spatial dependencies.

Data Augmentation Self-Supervised Learning +1
