We sampled, modified, and recorded 2,541 dialogues from the open-domain dialogue dataset DailyDialog that are long enough to represent the context of each dialogue.
This paper tackles the problem of human motion prediction, which consists of forecasting future body poses from historically observed sequences.
In our model, medical text annotation is introduced to compensate for the quality deficiency in image data.
Ranked #1 on Medical Image Segmentation on MoNuSeg
We introduce the problem of disentangling time-lapse sequences in a way that allows separate, after-the-fact control of overall trends, cyclic effects, and random effects in the images, and describe a technique based on data-driven generative models that achieves this goal.
In this work, we present a text-driven controllable framework, Text2Human, for high-quality and diverse human generation.
The goal of multi-object tracking (MOT) is to detect and track all the objects in a scene while keeping a unique identifier for each object.
Ranked #1 on Multi-Object Tracking on MOT20 (using extra training data)
We test language models on our forecasting task and find that performance is far below a human expert baseline.