Multimodal Association

3 papers with code • 0 benchmarks • 1 dataset

Multimodal association is the task of linking multiple modalities, or types of data, in time series analysis. A time series application may collect several modalities at once, such as sensor data, images, audio, and text. Multimodal association aims to integrate these data types to improve understanding and prediction of the time series.

For example, in a smart home application, sensor data from temperature, humidity, and motion sensors can be combined with images from cameras to monitor the activities of residents. By analyzing the multimodal data together, the system can detect anomalies or patterns that may not be visible in individual modalities alone.
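The smart home example can be illustrated with a minimal sketch. It assumes two already time-aligned streams, a binary motion-sensor reading and a per-frame person count from a camera; the function name, variable names, and data below are hypothetical. Each reading is unremarkable on its own (motion is common, and so are empty frames), but motion with nobody in view is only visible when both modalities are checked together.

```python
def cross_modal_anomalies(motion, people_in_frame):
    """Return the time steps where the motion sensor fired
    but the camera detected no person (hypothetical rule)."""
    return [
        t
        for t, (moved, count) in enumerate(zip(motion, people_in_frame))
        if moved and count == 0
    ]


# Illustrative, made-up readings sampled at the same time steps.
motion = [0, 1, 1, 0, 1, 0]   # 1 = motion sensor triggered
people = [0, 1, 2, 0, 0, 0]   # persons detected in the camera frame

flagged = cross_modal_anomalies(motion, people)
print(flagged)  # [4]
```

A real system would likely replace the fixed rule with a learned model, but the disagreement between modalities is the signal either way.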

Multimodal association can be achieved using various techniques, including deep learning models, statistical models, and graph-based models. These models can be trained on the multimodal data to learn the associations and dependencies between the different types of data.
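As a rough illustration of a simple statistical approach, the sketch below pairs camera tracks with phone signals by correlating their time series and choosing the one-to-one assignment with the highest total correlation. It is not the method of any paper listed on this page; the identifiers, the speed-like signals, and the data are all hypothetical, and it assumes there are at least as many phones as tracks. The brute-force search suits only a handful of subjects; for larger groups an assignment solver such as SciPy's `linear_sum_assignment` would be the usual choice.

```python
from itertools import permutations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0 or std_y == 0:
        return 0.0  # a constant signal carries no association evidence
    return cov / (std_x * std_y)


def associate(camera_tracks, phone_signals):
    """Map each track id to a distinct phone id, maximising the
    summed correlation (brute force over all permutations)."""
    track_ids = list(camera_tracks)
    phone_ids = list(phone_signals)
    best_score, best_map = float("-inf"), {}
    for perm in permutations(phone_ids, len(track_ids)):
        score = sum(
            pearson(camera_tracks[t], phone_signals[p])
            for t, p in zip(track_ids, perm)
        )
        if score > best_score:
            best_score, best_map = score, dict(zip(track_ids, perm))
    return best_map


# Made-up speed profiles: one estimated from video, one from phone sensors.
camera_tracks = {
    "track_A": [0.0, 0.5, 1.0, 1.0, 0.2],
    "track_B": [1.0, 1.0, 0.1, 0.0, 0.0],
}
phone_signals = {
    "phone_1": [0.9, 1.1, 0.2, 0.1, 0.0],
    "phone_2": [0.1, 0.4, 1.1, 0.9, 0.3],
}

matches = associate(camera_tracks, phone_signals)
print(matches)  # {'track_A': 'phone_2', 'track_B': 'phone_1'}
```

Deep learning and graph-based models replace the hand-written correlation score with a learned similarity, while the final matching step often stays the same.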

Benchmarks


These leaderboards are used to track progress in Multimodal Association

You can find evaluation results in the subtasks. You can also submit evaluation metrics for this task.

Datasets

Vi-Fi Multi-modal Dataset

Subtasks

multimodal generation

Most implemented papers


Vi-Fi: Associating Moving Subjects across Vision and Wireless Sensors

vifi2021/Vi-Fi • ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN) 2022

In this paper, we present Vi-Fi, a multi-modal system that leverages a user’s smartphone WiFi Fine Timing Measurements (FTM) and inertial measurement unit (IMU) sensor data to associate the user detected in camera footage with their corresponding smartphone identifier (e.g., WiFi MAC address).


WinoGAViL: Gamified Association Benchmark to Challenge Vision-and-Language Models

winogavil/winogavil-experiments • 25 Jul 2022

While vision-and-language models perform well on tasks such as visual question answering, they struggle when it comes to basic human commonsense reasoning skills.


ViTag: Online WiFi Fine Time Measurements Aided Vision-Motion Identity Association in Multi-person Environments

bryanbocao/vitag • IEEE International Conference on Sensing, Communication, and Networking 2022

ViTag associates a sequence of bounding boxes generated by a vision tracker with Inertial Measurement Unit (IMU) data and Wi-Fi Fine Time Measurements (FTM) from smartphones.

