Search Results for author: Jean Oh

Found 41 papers, 14 papers with code

StyleCLIPDraw: Coupling Content and Style in Text-to-Drawing Synthesis

1 code implementation • 4 Nov 2021 • Peter Schaldenbrand, Zhixuan Liu, Jean Oh

Generating images that fit a given text description using machine learning has improved greatly with the release of technologies such as the CLIP image-text encoder model; however, current methods lack artistic control of the style of image to be generated.

Style Transfer
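Both StyleCLIPDraw entries describe steering image generation with the CLIP image-text encoder. The core of such guidance is a cosine-similarity loss between the image embedding and the text-prompt embedding. A minimal sketch using mock embeddings in place of CLIP's actual encoders (which are not reproduced here):

```python
import numpy as np

def clip_guidance_loss(image_emb, text_emb):
    """Negative cosine similarity between image and text embeddings.

    Minimizing this loss pushes the generated image's embedding
    toward the text prompt's embedding, which is the basic idea
    behind CLIP-guided synthesis.
    """
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_emb / np.linalg.norm(text_emb)
    return 1.0 - float(np.dot(img, txt))

# Mock 512-d embeddings standing in for CLIP encoder outputs.
rng = np.random.default_rng(0)
image_emb = rng.normal(size=512)
text_emb = rng.normal(size=512)
loss = clip_guidance_loss(image_emb, text_emb)
```

In the actual papers this loss is backpropagated through a differentiable drawing renderer to optimize stroke parameters; the snippet only shows the objective.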

StyleCLIPDraw: Coupling Content and Style in Text-to-Drawing Translation

1 code implementation • 24 Feb 2022 • Peter Schaldenbrand, Zhixuan Liu, Jean Oh

Generating images that fit a given text description using machine learning has improved greatly with the release of technologies such as the CLIP image-text encoder model; however, current methods lack artistic control of the style of image to be generated.

Style Transfer Translation

Social Attention: Modeling Attention in Human Crowds

2 code implementations • 12 Oct 2017 • Anirudh Vemula, Katharina Muelling, Jean Oh

In this work, we propose Social Attention, a novel trajectory prediction model that captures the relative importance of each person when navigating in the crowd, irrespective of their proximity.

Navigate Trajectory Prediction
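The Social Attention abstract emphasizes weighting each person by learned relevance rather than raw proximity. The soft-attention step at the heart of such models can be sketched in numpy (the relevance scores here come from a stand-in dot product, not the paper's learned interaction features):

```python
import numpy as np

def social_attention(query, neighbor_feats):
    """Softmax attention over neighboring pedestrians' features.

    query:          (d,)   feature of the pedestrian being predicted
    neighbor_feats: (n, d) features of surrounding pedestrians
    Returns the attention-weighted context vector and the weights.
    """
    scores = neighbor_feats @ query / np.sqrt(query.size)
    scores -= scores.max()                       # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    context = weights @ neighbor_feats
    return context, weights

rng = np.random.default_rng(1)
q = rng.normal(size=16)
neigh = rng.normal(size=(5, 16))
context, w = social_attention(q, neigh)
```

Because the weights are a softmax over all neighbors, a distant but fast-approaching person can outweigh a nearby stationary one, matching the "irrespective of their proximity" claim.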

EigenTrajectory: Low-Rank Descriptors for Multi-Modal Trajectory Forecasting

1 code implementation • ICCV 2023 • Inhwan Bae, Jean Oh, Hae-Gon Jeon

In this paper, we present EigenTrajectory ($\mathbb{ET}$), a trajectory prediction approach that uses a novel trajectory descriptor to form a compact space, known here as $\mathbb{ET}$ space, in place of Euclidean space, for representing pedestrian movements.

Human Dynamics Trajectory Forecasting
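EigenTrajectory represents pedestrian motion with a compact low-rank descriptor instead of raw Euclidean coordinates. One way to sketch the idea (the paper's exact basis construction may differ) is an SVD over flattened training trajectories:

```python
import numpy as np

def fit_trajectory_basis(trajs, k):
    """Learn a rank-k linear basis from flattened (x, y) trajectories.

    trajs: (N, T, 2) array of N trajectories of length T.
    Returns a (k, 2T) basis; projecting a trajectory onto it
    yields a k-dimensional descriptor.
    """
    flat = trajs.reshape(len(trajs), -1)           # (N, 2T)
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    return vt[:k]                                  # top-k right singular vectors

def encode(traj, basis):
    return basis @ traj.reshape(-1)                # (k,) descriptor

def decode(coeffs, basis, T):
    return (basis.T @ coeffs).reshape(T, 2)        # back to (T, 2) positions

rng = np.random.default_rng(2)
trajs = rng.normal(size=(100, 12, 2))
basis = fit_trajectory_basis(trajs, k=6)
code = encode(trajs[0], basis)
recon = decode(code, basis, T=12)
```

Forecasting then operates in the k-dimensional descriptor space rather than over 2T raw coordinates, which is what makes the representation compact.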

Complementary Random Masking for RGB-Thermal Semantic Segmentation

1 code implementation • 30 Mar 2023 • Ukcheol Shin, Kyunghyun Lee, In So Kweon, Jean Oh

Also, the proposed self-distillation loss encourages the network to extract complementary and meaningful representations from a single modality or complementary masked modalities.

Scene Understanding Semantic Segmentation +1

Trajformer: Trajectory Prediction with Local Self-Attentive Contexts for Autonomous Driving

2 code implementations • 30 Nov 2020 • Manoj Bhat, Jonathan Francis, Jean Oh

Effective feature-extraction is critical to models' contextual understanding, particularly for applications to robotics and autonomous driving, such as multimodal trajectory prediction.

Autonomous Driving Trajectory Prediction

Towards Real-Time Text2Video via CLIP-Guided, Pixel-Level Optimization

1 code implementation • 23 Oct 2022 • Peter Schaldenbrand, Zhixuan Liu, Jean Oh

We introduce an approach to generating videos based on a series of given language descriptions.

Path Planning in Dynamic Environments with Adaptive Dimensionality

1 code implementation • 22 May 2016 • Anirudh Vemula, Katharina Muelling, Jean Oh

In this paper, we apply the idea of adaptive dimensionality to speed up path planning in dynamic environments for a robot with no assumptions on its dynamic model.

Robotics

Content Masked Loss: Human-Like Brush Stroke Planning in a Reinforcement Learning Painting Agent

1 code implementation • 18 Dec 2020 • Peter Schaldenbrand, Jean Oh

The objective of most Reinforcement Learning painting agents is to minimize the loss between a target image and the paint canvas.

Object Detection +1
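The Content Masked Loss abstract describes the standard objective, minimizing the loss between target image and canvas; the paper's contribution is to reweight that loss toward content regions. A hedged numpy sketch, with a generic hand-set saliency mask standing in for the paper's learned content mask:

```python
import numpy as np

def content_masked_loss(canvas, target, mask):
    """Per-pixel L2 loss weighted by a content mask.

    canvas, target: (H, W) grayscale images in [0, 1]
    mask:           (H, W) nonnegative weights; content-salient
                    regions get larger weights than background.
    """
    weights = mask / mask.sum()
    return float((weights * (canvas - target) ** 2).sum())

rng = np.random.default_rng(3)
target = rng.random((8, 8))
canvas = rng.random((8, 8))
mask = np.ones((8, 8))
mask[2:6, 2:6] = 5.0      # emphasize a central "content" region
loss = content_masked_loss(canvas, target, mask)
```

With a uniform mask this reduces to plain mean-squared error; the weighting is what pushes the painting agent to reproduce human-salient regions first.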

T2FPV: Dataset and Method for Correcting First-Person View Errors in Pedestrian Trajectory Prediction

1 code implementation • 22 Sep 2022 • Benjamin Stoler, Meghdeep Jana, Soonmin Hwang, Jean Oh

To support first-person view trajectory prediction research, we present T2FPV, a method for constructing high-fidelity first-person view (FPV) datasets given a real-world, top-down trajectory dataset; we showcase our approach on the ETH/UCY pedestrian dataset to generate the egocentric visual data of all interacting pedestrians, creating the T2FPV-ETH dataset.

Imputation Pedestrian Trajectory Prediction +1

Learning Lexical Entries for Robotic Commands using Crowdsourcing

no code implementations • 8 Sep 2016 • Junjie Hu, Jean Oh, Anatole Gershman

Robotic commands in natural language usually contain various spatial descriptions that are semantically similar but syntactically different.

Machine Translation Translation

Explainable Semantic Mapping for First Responders

no code implementations • 15 Oct 2019 • Jean Oh, Martial Hebert, Hae-Gon Jeon, Xavier Perez, Chia Dai, Yeeho Song

One of the key challenges in the semantic mapping problem in postdisaster environments is how to analyze a large amount of data efficiently with minimal supervision.

Semantic Segmentation

Following Social Groups: Socially Compliant Autonomous Navigation in Dense Crowds

no code implementations • 27 Nov 2019 • Xinjie Yao, Ji Zhang, Jean Oh

The underlying system incorporates a deep neural network to track social groups and join the flow of a social group in facilitating the navigation.

Autonomous Navigation Collision Avoidance +1

A Multimodal Dialogue System for Conversational Image Editing

no code implementations • 16 Feb 2020 • Tzu-Hsiang Lin, Trung Bui, Doo Soon Kim, Jean Oh

In this paper, we present a multimodal dialogue system for Conversational Image Editing.

Noticing Motion Patterns: Temporal CNN with a Novel Convolution Operator for Human Trajectory Prediction

no code implementations • 2 Jul 2020 • Dapeng Zhao, Jean Oh

We propose a Convolutional Neural Network-based approach to learn, detect, and extract patterns in sequential trajectory data, known here as Social Pattern Extraction Convolution (Social-PEC).

Decision Making Trajectory Prediction
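Social-PEC extracts motion patterns by convolving along the time axis of trajectory sequences. The paper introduces a novel convolution operator; as a minimal illustration of the general idea, here is a plain 1D temporal convolution over (x, y) coordinates using a finite-difference kernel:

```python
import numpy as np

def temporal_conv(traj, kernel):
    """Valid-mode 1D convolution along time, applied per coordinate.

    traj:   (T, 2) sequence of (x, y) positions
    kernel: (k,)   temporal filter, e.g. a difference filter
    Returns a (T - k + 1, 2) feature sequence.
    """
    return np.stack(
        [np.convolve(traj[:, d], kernel, mode="valid") for d in range(2)],
        axis=1,
    )

# A straight-line walk: constant 0.5 m step in both x and y.
traj = np.cumsum(np.ones((10, 2)) * 0.5, axis=0)
velocity_kernel = np.array([1.0, -1.0])   # finite difference -> velocity
feat = temporal_conv(traj, velocity_kernel)
```

On the constant-speed trajectory above, the difference filter recovers a constant velocity at every step; learned kernels in a temporal CNN respond to richer patterns such as turns or stops.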

Image Captioning with Compositional Neural Module Networks

no code implementations • 10 Jul 2020 • Junjiao Tian, Jean Oh

In image captioning, where fluency is an important evaluation factor (e.g., $n$-gram metrics), sequential models are commonly used; however, sequential models generally produce overgeneralized expressions that lack the details present in an input image.

Image Captioning Question Answering +2

Anytime 3D Object Reconstruction using Multi-modal Variational Autoencoder

no code implementations • 25 Jan 2021 • Hyeonwoo Yu, Jean Oh

In this context, we propose a method for imputation of latent variables whose elements are partially lost.

3D Object Reconstruction 3D Shape Reconstruction +4

Anchor Distance for 3D Multi-Object Distance Estimation from 2D Single Shot

no code implementations • 25 Jan 2021 • Hyeonwoo Yu, Jean Oh

Given a 2D Bounding Box (BBox) and object parameters, a 3D distance to the object can be calculated directly using 3D reprojection; however, such methods are prone to significant errors because an error from the 2D detection can be amplified in 3D.

Autonomous Driving Object +4
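The Anchor Distance abstract notes that reprojecting a 2D bounding box directly into 3D amplifies detection error. A pinhole-camera example makes this concrete: distance is estimated as d = f·H/h from the box height, so a small pixel error in h shifts the estimate substantially. The focal length and object height below are illustrative values, not from the paper:

```python
def distance_from_bbox(f_px, obj_height_m, box_height_px):
    """Pinhole-model distance estimate: d = f * H / h.

    f_px:          focal length in pixels (illustrative value)
    obj_height_m:  assumed real-world object height in meters
    box_height_px: detected 2D bounding-box height in pixels
    """
    return f_px * obj_height_m / box_height_px

f, H = 1000.0, 1.5                        # hypothetical camera and object
d_true = distance_from_bbox(f, H, 50.0)   # 30.0 m at the true box height
d_err = distance_from_bbox(f, H, 45.0)    # same object, 5 px detection error
amplification = d_err - d_true            # the 2D error grows to meters in 3D
```

A 10% error in the detected box height here produces a roughly 3.3 m shift in the distance estimate, which is the amplification effect the anchor-based method is designed to mitigate.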

Domain Adaptive Monocular Depth Estimation With Semantic Information

no code implementations • 12 Apr 2021 • Fei Lu, Hyeonwoo Yu, Jean Oh

The advent of deep learning has brought impressive advances to monocular depth estimation; e.g., supervised monocular depth estimation has been thoroughly investigated.

Image Classification Monocular Depth Estimation

Self-supervised Learning of 3D Object Understanding by Data Association and Landmark Estimation for Image Sequence

no code implementations • 14 Apr 2021 • Hyeonwoo Yu, Jean Oh

Therefore, we propose a strategy to exploit multiple observations of the object in the image sequence in order to surpass the self-performance: first, the landmarks for the global object map are estimated through network prediction and data association, and the corrected annotation for a single frame is obtained.

Object Pose Estimation +1

Core Challenges in Embodied Vision-Language Planning

no code implementations • 26 Jun 2021 • Jonathan Francis, Nariaki Kitamura, Felix Labelle, Xiaopeng Lu, Ingrid Navarro, Jean Oh

Recent advances in the areas of multimodal machine learning and artificial intelligence (AI) have led to the development of challenging tasks at the intersection of Computer Vision, Natural Language Processing, and Embodied AI.

Localize, Group, and Select: Boosting Text-VQA by Scene Text Modeling

no code implementations • 20 Aug 2021 • Xiaopeng Lu, Zhen Fan, Yansen Wang, Jean Oh, Carolyn P. Rose

LOGOS leverages two grounding tasks to better localize the key information of the image, utilizes scene text clustering to group individual OCR tokens, and learns to select the best answer from different sources of OCR (Optical Character Recognition) texts.

Data Ablation Optical Character Recognition +4

Unsupervised Domain Adaptation Via Pseudo-labels And Objectness Constraints

no code implementations • 29 Sep 2021 • Rajshekhar Das, Jonathan Francis, Sanket Vaibhav Mehta, Jean Oh, Emma Strubell, Jose Moura

Crucially, the objectness constraint is agnostic to the ground-truth semantic segmentation labels and, therefore, remains appropriate for unsupervised adaptation settings.

Object Pseudo Label +4

Translating Robot Skills: Learning Unsupervised Skill Correspondences Across Robots

no code implementations • 29 Sep 2021 • Tanmay Shankar, Yixin Lin, Aravind Rajeswaran, Vikash Kumar, Stuart Anderson, Jean Oh

In this paper, we explore how we can endow robots with the ability to learn correspondences between their own skills, and those of morphologically different robots in different domains, in an entirely unsupervised manner.

Translation Unsupervised Machine Translation

Safe Autonomous Racing via Approximate Reachability on Ego-vision

no code implementations • 14 Oct 2021 • Bingqing Chen, Jonathan Francis, Jean Oh, Eric Nyberg, Sylvia L. Herbert

Given the nature of the task, autonomous agents need to be able to 1) identify and avoid unsafe scenarios under complex vehicle dynamics, and 2) make sub-second decisions in a fast-changing environment.

Autonomous Driving Reinforcement Learning (RL) +1

RCA: Ride Comfort-Aware Visual Navigation via Self-Supervised Learning

no code implementations • 29 Jul 2022 • Xinjie Yao, Ji Zhang, Jean Oh

Under shared autonomy, wheelchair users expect vehicles to provide safe and comfortable rides while following users' high-level navigation plans.

Self-Supervised Learning Visual Navigation

Distribution-aware Goal Prediction and Conformant Model-based Planning for Safe Autonomous Driving

no code implementations • 16 Dec 2022 • Jonathan Francis, Bingqing Chen, Weiran Yao, Eric Nyberg, Jean Oh

The feasibility of collecting a large amount of expert demonstrations has inspired growing research interests in learning-to-drive settings, where models learn by imitating the driving behaviour from experts.

Autonomous Driving Density Estimation +1

Knowledge-driven Scene Priors for Semantic Audio-Visual Embodied Navigation

no code implementations • 21 Dec 2022 • Gyan Tatiya, Jonathan Francis, Luca Bondi, Ingrid Navarro, Eric Nyberg, Jivko Sinapov, Jean Oh

We also define a new audio-visual navigation sub-task, where agents are evaluated on novel sounding objects, as opposed to unheard clips of known objects.

Visual Navigation

Learned Tree Search for Long-Horizon Social Robot Navigation in Shared Airspace

no code implementations • 4 Apr 2023 • Ingrid Navarro, Jay Patrikar, Joao P. A. Dantas, Rohan Baijal, Ian Higgins, Sebastian Scherer, Jean Oh

In this work, we propose Social Robot Tree Search (SoRTS), an algorithm for the safe navigation of mobile robots in social domains.

Navigate Social Navigation +1

Core Challenges in Embodied Vision-Language Planning

no code implementations • 5 Apr 2023 • Jonathan Francis, Nariaki Kitamura, Felix Labelle, Xiaopeng Lu, Ingrid Navarro, Jean Oh

Recent advances in the areas of Multimodal Machine Learning and Artificial Intelligence (AI) have led to the development of challenging tasks at the intersection of Computer Vision, Natural Language Processing, and Robotics.

Regularizing Self-training for Unsupervised Domain Adaptation via Structural Constraints

no code implementations • 29 Apr 2023 • Rajshekhar Das, Jonathan Francis, Sanket Vaibhav Mehta, Jean Oh, Emma Strubell, Jose Moura

Self-training based on pseudo-labels has emerged as a dominant approach for addressing conditional distribution shifts in unsupervised domain adaptation (UDA) for semantic segmentation problems.

Object Semantic Segmentation +1

FishRecGAN: An End to End GAN Based Network for Fisheye Rectification and Calibration

no code implementations • 9 May 2023 • Xin Shen, Kyungdon Joo, Jean Oh

We propose an end-to-end deep learning approach to rectify fisheye images and simultaneously calibrate camera intrinsic and distortion parameters.

Towards Human-Centered Construction Robotics: An RL-Driven Companion Robot For Contextually Assisting Carpentry Workers

no code implementations • 27 Mar 2024 • Yuning Wu, Jiaying Wei, Jean Oh, Daniel Cardoso Llach

In the dynamic construction industry, traditional robotic integration has primarily focused on automating specific tasks, often overlooking the complexity and variability of human aspects in construction workflows.

Reinforcement Learning (RL)
