Search Results for author: Fan-Yun Sun

Found 17 papers, 5 papers with code

Symmetrical Visual Contrastive Optimization: Aligning Vision-Language Models with Minimal Contrastive Images

no code implementations 19 Feb 2025 Shengguang Wu, Fan-Yun Sun, Kaiyue Wen, Nick Haber

Recent studies have shown that Large Vision-Language Models (VLMs) tend to neglect image content and over-rely on language-model priors, resulting in errors in visually grounded tasks and hallucinations.

counterfactual

LayoutVLM: Differentiable Optimization of 3D Layout via Vision-Language Models

no code implementations 3 Dec 2024 Fan-Yun Sun, Weiyu Liu, Siyi Gu, Dylan Lim, Goutam Bhat, Federico Tombari, Manling Li, Nick Haber, Jiajun Wu

We introduce LayoutVLM, a framework and scene layout representation that exploits the semantic knowledge of Vision-Language Models (VLMs) and supports differentiable optimization to ensure physical plausibility.

Layout Generation

GRS: Generating Robotic Simulation Tasks from Real-World Images

no code implementations 20 Oct 2024 Alex Zook, Fan-Yun Sun, Josef Spjut, Valts Blukis, Stan Birchfield, Jonathan Tremblay

We introduce GRS (Generating Robotic Simulation tasks), a system addressing real-to-sim for robotic simulations.

Object Semantic Segmentation

FactorSim: Generative Simulation via Factorized Representation

no code implementations 26 Sep 2024 Fan-Yun Sun, S. I. Harini, Angela Yi, Yihan Zhou, Alex Zook, Jonathan Tremblay, Logan Cross, Jiajun Wu, Nick Haber

Generating simulations to train intelligent agents in game-playing and robotics from natural language input, such as user input or task documentation, remains an open-ended challenge.

Task Arithmetic can Mitigate Synthetic-to-Real Gap in Automatic Speech Recognition

no code implementations 5 Jun 2024 Hsuan Su, Hua Farn, Fan-Yun Sun, Shang-Tse Chen, Hung-Yi Lee

Synthetic data is widely used in speech recognition due to the availability of text-to-speech models, which facilitate adapting models to previously unseen text domains.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +4

Partial-View Object View Synthesis via Filtered Inversion

no code implementations 3 Apr 2023 Fan-Yun Sun, Jonathan Tremblay, Valts Blukis, Kevin Lin, Danfei Xu, Boris Ivanovic, Peter Karkus, Stan Birchfield, Dieter Fox, Ruohan Zhang, Yunzhu Li, Jiajun Wu, Marco Pavone, Nick Haber

At inference, given one or more views of a novel real-world object, FINV first finds a set of latent codes for the object by inverting the generative model from multiple initial seeds.

Object

Equivariant Neural Network for Factor Graphs

no code implementations 29 Sep 2021 Fan-Yun Sun, Jonathan Kuck, Hao Tang, Stefano Ermon

Several indices used in a factor graph data structure can be permuted without changing the underlying probability distribution.

Inductive Bias

Organ At Risk Segmentation with Multiple Modality

no code implementations 17 Oct 2019 Kuan-Lun Tseng, Winston Hsu, Chun-ting Wu, Ya-Fang Shih, Fan-Yun Sun

To better leverage different modalities, we have collected a large dataset consisting of 136 cases, with CT and MR images, of patients diagnosed with nasopharyngeal cancer.

Brain Tumor Segmentation, Computed Tomography (CT), +5

InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization

5 code implementations ICLR 2020 Fan-Yun Sun, Jordan Hoffmann, Vikas Verma, Jian Tang

There are also some recent methods based on language models (e.g., graph2vec), but they tend to only consider certain substructures (e.g., subtrees) as graph representatives.

Graph Classification, Molecular Property Prediction, +2

vGraph: A Generative Model for Joint Community Detection and Node Representation Learning

1 code implementation NeurIPS 2019 Fan-Yun Sun, Meng Qu, Jordan Hoffmann, Chin-wei Huang, Jian Tang

Experimental results on multiple real-world graphs show that vGraph is very effective in both community detection and node representation learning, outperforming many competitive baselines in both tasks.

Community Detection, Representation Learning, +1

A Regulation Enforcement Solution for Multi-agent Reinforcement Learning

no code implementations 29 Jan 2019 Fan-Yun Sun, Yen-Yu Chang, Yueh-Hua Wu, Shou-De Lin

If artificially intelligent (AI) agents make decisions on behalf of human beings, we would hope they can also follow established regulations while interacting with humans or other AI agents.

AI Agent Management +4

A Memory-Network Based Solution for Multivariate Time-Series Forecasting

2 code implementations 6 Sep 2018 Yen-Yu Chang, Fan-Yun Sun, Yueh-Hua Wu, Shou-De Lin

Inspired by the Memory Network proposed for the question-answering task, we propose a deep-learning-based model named Memory Time-series Network (MTNet) for time series forecasting.

Multivariate Time Series Forecasting, Question Answering, +1
