Search Results for author: Hiroki Furuta

Found 18 papers, 9 papers with code

Inference-Time Text-to-Video Alignment with Diffusion Latent Beam Search

no code implementations • 31 Jan 2025 • Yuta Oshima, Masahiro Suzuki, Yutaka Matsuo, Hiroki Furuta

The remarkable progress in text-to-video diffusion models enables photorealistic generation, although generated videos often contain unnatural movement or deformation, reverse playback, and motionless scenes.

Denoising Video Alignment +1

Rethinking Evaluation of Sparse Autoencoders through the Representation of Polysemous Words

1 code implementation • 9 Jan 2025 • Gouki Minegishi, Hiroki Furuta, Yusuke Iwasawa, Yutaka Matsuo

Sparse autoencoders (SAEs) have gained a lot of attention as a promising tool to improve the interpretability of large language models (LLMs) by mapping the complex superposition of polysemantic neurons into monosemantic features and composing a sparse dictionary of words.
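To make the SAE setup concrete, here is a minimal, hypothetical sketch of the forward pass such a dictionary-learning model uses (all names and the toy weights are illustrative, not this paper's implementation): activations are encoded into a wider feature dictionary, a ReLU keeps the codes non-negative and sparse, and the decoder reconstructs the input from the active features.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, v, b):
    # Plain matrix-vector product with bias, kept dependency-free.
    return [sum(wi * vi for wi, vi in zip(row, v)) + bi
            for row, bi in zip(W, b)]

def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    """Forward pass of a minimal sparse autoencoder.

    x is an activation vector; the encoder maps it into a wider feature
    dictionary, ReLU enforces non-negative sparse codes, and the decoder
    reconstructs x from the few active features. Training would minimize
    ||x - x_hat||^2 + l1 * sum(f), pushing each input to activate only a
    handful of (ideally monosemantic) features.
    """
    f = relu(matvec(W_enc, x, b_enc))   # sparse feature activations
    x_hat = matvec(W_dec, f, b_dec)     # reconstruction from the dictionary
    return f, x_hat
```

With a toy 2-to-4 dictionary whose features split each coordinate into positive and negative directions, only two of the four features fire per input and the reconstruction is exact.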

Geometric-Averaged Preference Optimization for Soft Preference Labels

no code implementations • 10 Sep 2024 • Hiroki Furuta, Kuang-Huei Lee, Shixiang Shane Gu, Yutaka Matsuo, Aleksandra Faust, Heiga Zen, Izzeddin Gur

In this work, we introduce the distributional soft preference labels and improve Direct Preference Optimization (DPO) with a weighted geometric average of the LLM output likelihood in the loss function.
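As a rough sketch of the idea, a weighted geometric mean of output likelihoods becomes a weighted arithmetic mean of log-likelihoods, so a soft label can scale the usual DPO margin. The function below is a hypothetical simplification (names and the exact weighting are assumptions, not the paper's formulation):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def soft_dpo_loss(logratio_w, logratio_l, p, beta=0.1):
    """DPO-style loss with a distributional soft preference label p.

    logratio_*: log pi(y|x) - log pi_ref(y|x) for the nominally preferred
    and dispreferred responses. A p-weighted geometric mean of the two
    likelihoods corresponds to a (2p - 1)-weighted margin in log space:
    p = 1 recovers the standard hard-label DPO margin, while p = 0.5
    makes the pair uninformative (zero margin).
    """
    margin = (2.0 * p - 1.0) * (logratio_w - logratio_l)
    return -math.log(sigmoid(beta * margin))
```

Note how an ambiguous pair (p near 0.5) contributes almost no gradient, which is the intuitive appeal of soft labels over forcing a hard 0/1 preference.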

Towards Empirical Interpretation of Internal Circuits and Properties in Grokked Transformers on Modular Polynomials

1 code implementation • 26 Feb 2024 • Hiroki Furuta, Gouki Minegishi, Yusuke Iwasawa, Yutaka Matsuo

Grokking on modular addition has been known to implement Fourier representation and its calculation circuits with trigonometric identities in Transformers.
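A small sketch can show why such a Fourier circuit computes modular addition at all (the function below is illustrative, not the paper's probe code): for each candidate answer c, summing cos(2πk(a + b − c)/p) over frequencies k peaks exactly at c = (a + b) mod p, and the angle-addition identities let the model express that sum using only sin/cos features of a, b, and c separately.

```python
import math

def modadd_via_fourier(a, b, p):
    """Recover (a + b) mod p using only trigonometric identities, in the
    spirit of the Fourier circuit grokked Transformers are reported to learn.

    sum_{k=1..p-1} cos(2*pi*k*(a + b - c)/p) equals p - 1 when
    c == (a + b) mod p and -1 otherwise, so an argmax over c decodes the
    answer. The cos of the sum is expanded with angle-addition identities
    so only per-token sin/cos features appear.
    """
    best_c, best_score = 0, -float("inf")
    for c in range(p):
        score = 0.0
        for k in range(1, p):
            w = 2.0 * math.pi * k / p
            # cos(w*(a+b)) and sin(w*(a+b)) via angle-addition identities
            cos_ab = math.cos(w * a) * math.cos(w * b) - math.sin(w * a) * math.sin(w * b)
            sin_ab = math.sin(w * a) * math.cos(w * b) + math.cos(w * a) * math.sin(w * b)
            # cos(w*(a+b-c)) expanded once more against the candidate c
            score += cos_ab * math.cos(w * c) + sin_ab * math.sin(w * c)
        if score > best_score:
            best_c, best_score = c, score
    return best_c
```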

A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts

no code implementations • 15 Feb 2024 • Kuang-Huei Lee, Xinyun Chen, Hiroki Furuta, John Canny, Ian Fischer

Current Large Language Models (LLMs) are not only limited by a maximum context length but also unable to robustly consume long inputs.

Reading Comprehension Retrieval

Exposing Limitations of Language Model Agents in Sequential-Task Compositions on the Web

1 code implementation • 30 Nov 2023 • Hiroki Furuta, Yutaka Matsuo, Aleksandra Faust, Izzeddin Gur

We show that while existing prompted LMAs (gpt-3.5-turbo or gpt-4) achieve a 94.0% average success rate on base tasks, their performance degrades to a 24.9% success rate on compositional tasks.

Decision Making Language Modeling +1

A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis

no code implementations • 24 Jul 2023 • Izzeddin Gur, Hiroki Furuta, Austin Huang, Mustafa Safdari, Yutaka Matsuo, Douglas Eck, Aleksandra Faust

Pre-trained large language models (LLMs) have recently achieved better generalization and sample efficiency in autonomous web automation.

Ranked #1 on Mind2Web

Code Generation Denoising +4

Multimodal Web Navigation with Instruction-Finetuned Foundation Models

no code implementations • 19 May 2023 • Hiroki Furuta, Kuang-Huei Lee, Ofir Nachum, Yutaka Matsuo, Aleksandra Faust, Shixiang Shane Gu, Izzeddin Gur

The progress of autonomous web navigation has been hindered by the dependence on billions of exploratory interactions via online reinforcement learning, and domain-specific model designs that make it difficult to leverage generalization from rich out-of-domain data.

Autonomous Web Navigation Instruction Following +1

Collective Intelligence for 2D Push Manipulations with Mobile Robots

1 code implementation • 28 Nov 2022 • So Kuroki, Tatsuya Matsushima, Jumpei Arima, Hiroki Furuta, Yutaka Matsuo, Shixiang Shane Gu, Yujin Tang

While natural systems often present collective intelligence that allows them to self-organize and adapt to changes, the equivalent is missing in most artificial systems.

Robot Manipulation

A System for Morphology-Task Generalization via Unified Representation and Behavior Distillation

1 code implementation • 25 Nov 2022 • Hiroki Furuta, Yusuke Iwasawa, Yutaka Matsuo, Shixiang Shane Gu

The rise of generalist large-scale models in natural language and vision has made us expect that a massive data-driven approach could achieve broader generalization in other domains such as continuous control.

Continuous Control +1

Generalized Decision Transformer for Offline Hindsight Information Matching

1 code implementation • 19 Nov 2021 • Hiroki Furuta, Yutaka Matsuo, Shixiang Shane Gu

We present Generalized Decision Transformer (GDT) for solving any HIM problem, and show how different choices for the feature function and the anti-causal aggregator not only recover DT as a special case, but also lead to novel Categorical DT (CDT) and Bi-directional DT (BDT) for matching different statistics of the future.

Continuous Control +3

Distributional Decision Transformer for Hindsight Information Matching

no code implementations • ICLR 2022 • Hiroki Furuta, Yutaka Matsuo, Shixiang Shane Gu

Inspired by the distributional and state-marginal matching literature in RL, we demonstrate that all these approaches are essentially doing hindsight information matching (HIM) -- training policies that can output the rest of a trajectory that matches given future state information statistics.

Continuous Control +4

Co-Adaptation of Algorithmic and Implementational Innovations in Inference-based Deep Reinforcement Learning

1 code implementation • NeurIPS 2021 • Hiroki Furuta, Tadashi Kozuno, Tatsuya Matsushima, Yutaka Matsuo, Shixiang Shane Gu

These results show which implementation or code details are co-adapted and co-evolved with algorithms, and which are transferable across algorithms: as examples, we identified that the tanh Gaussian policy and network sizes are highly adapted to algorithmic types, while layer normalization and ELU are critical for MPO's performance but also transfer to noticeable gains in SAC.

Deep Reinforcement Learning reinforcement-learning +1
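For readers unfamiliar with the tanh Gaussian policy the abstract singles out, here is a hypothetical minimal sketch (names and the clamping constant are illustrative): a pre-squash sample is drawn from a Gaussian, squashed into (-1, 1) by tanh, and the log-density picks up a change-of-variables correction.

```python
import math
import random

def tanh_gaussian_sample(mean, log_std):
    """Sample an action from a tanh-squashed Gaussian policy.

    The pre-tanh sample u ~ N(mean, std^2) is squashed to a = tanh(u),
    bounding the action in (-1, 1). By the change-of-variables formula the
    log-density of a subtracts log(1 - tanh(u)^2) from the Gaussian
    log-density of u; a small epsilon guards against log(0).
    """
    std = math.exp(log_std)
    u = random.gauss(mean, std)
    a = math.tanh(u)
    logp = (-0.5 * ((u - mean) / std) ** 2
            - log_std - 0.5 * math.log(2.0 * math.pi)  # Gaussian log-density
            - math.log(max(1.0 - a * a, 1e-6)))        # tanh correction
    return a, logp
```

The squashing is what keeps actions inside bounded control ranges without clipping, which is one reason this parameterization is so tightly coupled to algorithms like SAC.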

Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization

4 code implementations • ICLR 2021 • Tatsuya Matsushima, Hiroki Furuta, Yutaka Matsuo, Ofir Nachum, Shixiang Gu

We propose a novel model-based algorithm, Behavior-Regularized Model-ENsemble (BREMEN), that can effectively optimize a policy offline using 10-20 times less data than prior works.

Offline RL reinforcement-learning +2
