Search Results for author: Alireza Zareian

Found 17 papers, 9 papers with code

GOCA: Guided Online Cluster Assignment for Self-Supervised Video Representation Learning

1 code implementation · 20 Jul 2022 · Huseyin Coskun, Alireza Zareian, Joshua L. Moore, Federico Tombari, Chen Wang

Specifically, we outperform the state of the art by 7% on UCF and 4% on HMDB for video retrieval, and by 5% on UCF and 6% on HMDB for video classification.

Action Recognition · Clustering · +6

SGEITL: Scene Graph Enhanced Image-Text Learning for Visual Commonsense Reasoning

no code implementations · 16 Dec 2021 · Zhecan Wang, Haoxuan You, Liunian Harold Li, Alireza Zareian, Suji Park, Yiqing Liang, Kai-Wei Chang, Shih-Fu Chang

For pre-training, a scene-graph-aware method is proposed that leverages structural knowledge extracted from the visual scene graph.

Visual Commonsense Reasoning

Analogical Reasoning for Visually Grounded Compositional Generalization

no code implementations · 1 Jan 2021 · Bo Wu, Haoyu Qin, Alireza Zareian, Carl Vondrick, Shih-Fu Chang

Children acquire language subconsciously by observing the surrounding world and listening to descriptions.

Language Acquisition

Open-Vocabulary Object Detection Using Captions

1 code implementation · CVPR 2021 · Alireza Zareian, Kevin Dela Rosa, Derek Hao Hu, Shih-Fu Chang

Weakly supervised and zero-shot learning techniques have been explored to scale object detectors to more categories with less supervision, but they have not been as successful and widely adopted as supervised models.

Object · Object Detection · +2

Analogical Reasoning for Visually Grounded Language Acquisition

no code implementations · 22 Jul 2020 · Bo Wu, Haoyu Qin, Alireza Zareian, Carl Vondrick, Shih-Fu Chang

Children acquire language subconsciously by observing the surrounding world and listening to descriptions.

Language Acquisition

GAIA: A Fine-grained Multimedia Knowledge Extraction System

no code implementations · ACL 2020 · Manling Li, Alireza Zareian, Ying Lin, Xiaoman Pan, Spencer Whitehead, Brian Chen, Bo Wu, Heng Ji, Shih-Fu Chang, Clare Voss, Daniel Napierski, Marjorie Freedman

We present the first comprehensive, open-source multimedia knowledge extraction system. It takes a massive stream of unstructured, heterogeneous multimedia data from various sources and languages as input, and creates a coherent, structured knowledge base, indexing entities, relations, and events according to a rich, fine-grained ontology.

Learning Visual Commonsense for Robust Scene Graph Generation

2 code implementations · ECCV 2020 · Alireza Zareian, Zhecan Wang, Haoxuan You, Shih-Fu Chang

Scene graph generation models understand the scene through object and predicate recognition, but are prone to mistakes due to the challenges of perception in the wild.

Graph Generation · Scene Graph Generation · +1

Cross-media Structured Common Space for Multimedia Event Extraction

no code implementations · ACL 2020 · Manling Li, Alireza Zareian, Qi Zeng, Spencer Whitehead, Di Lu, Heng Ji, Shih-Fu Chang

We introduce a new task, MultiMedia Event Extraction (M2E2), which aims to extract events and their arguments from multimedia documents.

Event Extraction

Weakly Supervised Visual Semantic Parsing

1 code implementation · CVPR 2020 · Alireza Zareian, Svebor Karaman, Shih-Fu Chang

Scene Graph Generation (SGG) aims to extract entities, predicates and their semantic structure from images, enabling deep understanding of visual content, with many applications such as visual reasoning and image retrieval.

Graph Generation · Image Retrieval · +5

Bridging Knowledge Graphs to Generate Scene Graphs

1 code implementation · ECCV 2020 · Alireza Zareian, Svebor Karaman, Shih-Fu Chang

Scene graphs are powerful representations that parse images into their abstract semantic elements, i.e., objects and their interactions, facilitating visual comprehension and explainable reasoning.

Graph Generation · Knowledge Graphs · +1

General Partial Label Learning via Dual Bipartite Graph Autoencoder

no code implementations · 5 Jan 2020 · Brian Chen, Bo Wu, Alireza Zareian, Hanwang Zhang, Shih-Fu Chang

Compared to the traditional Partial Label Learning (PLL) problem, General Partial Label Learning (GPLL) relaxes the supervision assumption from instance-level -- a label set partially labels an instance -- to group-level: 1) a label set partially labels a group of instances, where the within-group instance-label link annotations are missing, and 2) cross-group links are allowed -- instances in a group may be partially linked to the label set from another group.

Partial Label Learning

Cross-Dimensional Self-Attention for Multivariate, Geo-tagged Time Series Imputation

no code implementations · ICLR 2020 · Jiawei Ma*, Zheng Shou*, Alireza Zareian, Hassan Mansour, Anthony Vetro, Shih-Fu Chang

To impute the missing values, state-of-the-art methods are built on Recurrent Neural Networks (RNNs), which process each time stamp sequentially, prohibiting the direct modeling of relationships between distant time stamps.

Imputation · Machine Translation · +2

CDSA: Cross-Dimensional Self-Attention for Multivariate, Geo-tagged Time Series Imputation

2 code implementations · 23 May 2019 · Jiawei Ma, Zheng Shou, Alireza Zareian, Hassan Mansour, Anthony Vetro, Shih-Fu Chang

To jointly capture self-attention across multiple dimensions (time, location, and the sensor measurements) while maintaining low computational complexity, we propose a novel approach called Cross-Dimensional Self-Attention (CDSA), which processes each dimension sequentially, yet in an order-independent manner.

Imputation · Machine Translation · +2
