no code implementations • 19 Feb 2025 • Shengguang Wu, Fan-Yun Sun, Kaiyue Wen, Nick Haber
Recent studies have shown that Large Vision-Language Models (VLMs) tend to neglect image content and over-rely on language-model priors, resulting in errors in visually grounded tasks and hallucinations.
no code implementations • 3 Dec 2024 • Fan-Yun Sun, Weiyu Liu, Siyi Gu, Dylan Lim, Goutam Bhat, Federico Tombari, Manling Li, Nick Haber, Jiajun Wu
We introduce LayoutVLM, a framework and scene layout representation that exploits the semantic knowledge of Vision-Language Models (VLMs) and supports differentiable optimization to ensure physical plausibility.
no code implementations • 20 Oct 2024 • Alex Zook, Fan-Yun Sun, Josef Spjut, Valts Blukis, Stan Birchfield, Jonathan Tremblay
We introduce GRS (Generating Robotic Simulation tasks), a system addressing real-to-sim for robotic simulations.
no code implementations • 26 Sep 2024 • Fan-Yun Sun, S. I. Harini, Angela Yi, Yihan Zhou, Alex Zook, Jonathan Tremblay, Logan Cross, Jiajun Wu, Nick Haber
Generating simulations to train intelligent agents in game-playing and robotics from natural language input, whether user input or task documentation, remains an open-ended challenge.
no code implementations • 5 Jun 2024 • Hsuan Su, Hua Farn, Fan-Yun Sun, Shang-Tse Chen, Hung-Yi Lee
Synthetic data is widely used in speech recognition due to the availability of text-to-speech models, which facilitate adapting models to previously unseen text domains.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
1 code implementation • CVPR 2024 • Yue Yang, Fan-Yun Sun, Luca Weihs, Eli VanderBilt, Alvaro Herrasti, Winson Han, Jiajun Wu, Nick Haber, Ranjay Krishna, Lingjie Liu, Chris Callison-Burch, Mark Yatskar, Aniruddha Kembhavi, Christopher Clark
3D simulated environments play a critical role in Embodied AI, but their creation requires expertise and extensive manual effort, restricting their diversity and scope.
no code implementations • 3 Apr 2023 • Fan-Yun Sun, Jonathan Tremblay, Valts Blukis, Kevin Lin, Danfei Xu, Boris Ivanovic, Peter Karkus, Stan Birchfield, Dieter Fox, Ruohan Zhang, Yunzhu Li, Jiajun Wu, Marco Pavone, Nick Haber
At inference, given one or more views of a novel real-world object, FINV first finds a set of latent codes for the object by inverting the generative model from multiple initial seeds.
no code implementations • 23 Aug 2022 • Fan-Yun Sun, Isaac Kauvar, Ruohan Zhang, Jiachen Li, Mykel Kochenderfer, Jiajun Wu, Nick Haber
Modeling multi-agent systems requires understanding how agents interact.
no code implementations • 29 Sep 2021 • Fan-Yun Sun, Jonathan Kuck, Hao Tang, Stefano Ermon
Several indices used in a factor graph data structure can be permuted without changing the underlying probability distribution.
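The invariance described above can be checked on a toy example. The following is a minimal sketch, not the paper's method: it assumes binary variables and assumes that the permutable indices are the order of the factors and the order of variables inside each factor. The helper names `joint` and `permute_factor` are hypothetical and introduced only for illustration.

```python
import itertools
import random


def joint(num_vars, factors):
    """Return the normalized joint distribution over binary variables.

    Each factor is a (scope, table) pair: scope is a tuple of variable
    indices, and table maps a tuple of values (ordered like scope) to a
    non-negative weight.
    """
    weights = {}
    for assignment in itertools.product((0, 1), repeat=num_vars):
        w = 1.0
        for scope, table in factors:
            w *= table[tuple(assignment[v] for v in scope)]
        weights[assignment] = w
    z = sum(weights.values())
    return {a: w / z for a, w in weights.items()}


def permute_factor(scope, table, perm):
    """Reorder the variables inside one factor and reindex its table to match."""
    new_scope = tuple(scope[i] for i in perm)
    new_table = {tuple(vals[i] for i in perm): w for vals, w in table.items()}
    return new_scope, new_table


rng = random.Random(0)


def random_table(arity):
    """Build a random positive table for a factor over `arity` binary variables."""
    return {
        vals: rng.uniform(0.1, 1.0)
        for vals in itertools.product((0, 1), repeat=arity)
    }


# A small factor graph over three binary variables (hypothetical example).
factors = [
    ((0, 1), random_table(2)),
    ((1, 2), random_table(2)),
    ((0, 1, 2), random_table(3)),
]

# Permute the variable order inside every factor, then reverse the factor order.
perms = [(1, 0), (1, 0), (2, 0, 1)]
permuted = [permute_factor(s, t, p) for (s, t), p in zip(factors, perms)]
permuted.reverse()

original = joint(3, factors)
shuffled = joint(3, permuted)
max_diff = max(abs(original[a] - shuffled[a]) for a in original)
print(max_diff < 1e-12)
```

Both representations define the same distribution up to floating-point rounding, even though the stored index arrays differ.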
3 code implementations • 15 Jun 2021 • Daniel M. Bear, Elias Wang, Damian Mrowca, Felix J. Binder, Hsiao-Yu Fish Tung, R. T. Pramod, Cameron Holdaway, Sirui Tao, Kevin Smith, Fan-Yun Sun, Li Fei-Fei, Nancy Kanwisher, Joshua B. Tenenbaum, Daniel L. K. Yamins, Judith E. Fan
While current vision algorithms excel at many challenging tasks, it is unclear how well they understand the physical dynamics of real-world environments.
no code implementations • NAACL 2021 • Hsuan Su, Jiun-Hao Jhan, Fan-Yun Sun, Saurav Sahay, Hung-Yi Lee
Our framework includes a guiding chatbot and an interlocutor model that plays the role of humans.
no code implementations • 17 Oct 2019 • Kuan-Lun Tseng, Winston Hsu, Chun-ting Wu, Ya-Fang Shih, Fan-Yun Sun
To better leverage different modalities, we have collected a large dataset consisting of 136 cases with CT and MR images, all diagnosed with nasopharyngeal cancer.
5 code implementations • ICLR 2020 • Fan-Yun Sun, Jordan Hoffmann, Vikas Verma, Jian Tang
There are also some recent methods based on language models (e.g., graph2vec), but they tend to only consider certain substructures (e.g., subtrees) as graph representatives.
Ranked #26 on Graph Classification on IMDb-M
1 code implementation • NeurIPS 2019 • Fan-Yun Sun, Meng Qu, Jordan Hoffmann, Chin-wei Huang, Jian Tang
Experimental results on multiple real-world graphs show that vGraph is very effective in both community detection and node representation learning, outperforming many competitive baselines in both tasks.
no code implementations • 29 Jan 2019 • Fan-Yun Sun, Yen-Yu Chang, Yueh-Hua Wu, Shou-De Lin
If artificially intelligent (AI) agents make decisions on behalf of human beings, we would hope they can also follow established regulations while interacting with humans or other AI agents.
no code implementations • 6 Sep 2018 • Yueh-Hua Wu, Fan-Yun Sun, Yen-Yu Chang, Shou-De Lin
This work provides a thorough study of how reward scaling can affect the performance of deep reinforcement learning agents.
2 code implementations • 6 Sep 2018 • Yen-Yu Chang, Fan-Yun Sun, Yueh-Hua Wu, Shou-De Lin
Inspired by the Memory Network proposed for solving the question-answering task, we propose a deep-learning-based model named Memory Time-series network (MTNet) for time series forecasting.