Search Results for author: Maria Attarian

Found 5 papers, 1 paper with code

Combining Learned Lyrical Structures and Vocabulary for Improved Lyric Generation

no code implementations • 12 Nov 2018 • Pablo Samuel Castro, Maria Attarian

The use of language models for generating lyrics and poetry has received an increased interest in the last few years.

Transforming Neural Network Visual Representations to Predict Human Judgments of Similarity

no code implementations • 13 Oct 2020 • Maria Attarian, Brett D. Roads, Michael C. Mozer

Deep-learning vision models have shown intriguing similarities and differences with respect to human vision.

See, Plan, Predict: Language-guided Cognitive Planning with Video Prediction

no code implementations • 7 Oct 2022 • Maria Attarian, Advaya Gupta, Ziyi Zhou, Wei Yu, Igor Gilitschenski, Animesh Garg

Cognitive planning is the structural decomposition of complex tasks into a sequence of future behaviors.

Video Generation, Video Prediction

Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers

no code implementations • 19 Mar 2024 • Vidhi Jain, Maria Attarian, Nikhil J Joshi, Ayzaan Wahid, Danny Driess, Quan Vuong, Pannag R Sanketi, Pierre Sermanet, Stefan Welker, Christine Chan, Igor Gilitschenski, Yonatan Bisk, Debidatta Dwibedi

Given a video demonstration of a manipulation task and current visual observations, Vid2Robot directly produces robot actions.

Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language

1 code implementation • 1 Apr 2022 • Andy Zeng, Maria Attarian, Brian Ichter, Krzysztof Choromanski, Adrian Wong, Stefan Welker, Federico Tombari, Aveek Purohit, Michael Ryoo, Vikas Sindhwani, Johnny Lee, Vincent Vanhoucke, Pete Florence

Large pretrained (e.g., "foundation") models exhibit distinct capabilities depending on the domain of data they are trained on.

Ranked #21 on Video Retrieval on MSR-VTT-1kA (video-to-text R@1 metric)

Image Captioning, Multimodal Reasoning, +5
