no code implementations • 29 Dec 2024 • Core Francisco Park, Andrew Lee, Ekdeep Singh Lubana, Yongyi Yang, Maya Okawa, Kento Nishi, Martin Wattenberg, Hidenori Tanaka
Specifically, if we provide in-context exemplars wherein a concept plays a different role than what the pretraining data suggests, do models reorganize their representations in accordance with these novel semantics?
no code implementations • 22 Oct 2024 • Kento Nishi, Maya Okawa, Rahul Ramesh, Mikail Khona, Hidenori Tanaka, Ekdeep Singh Lubana
We call this phenomenon representation shattering and demonstrate that it results in degradation of factual recall and reasoning performance more broadly.
no code implementations • 10 Oct 2024 • Yongyi Yang, Core Francisco Park, Ekdeep Singh Lubana, Maya Okawa, Wei Hu, Hidenori Tanaka
We mathematically analyze the learning dynamics of neural networks trained on this SIM task and show that, despite its simplicity, SIM's learning dynamics capture and help explain key empirical observations on compositional generalization with diffusion models identified in prior work.
1 code implementation • 27 Jun 2024 • Core Francisco Park, Maya Okawa, Andrew Lee, Hidenori Tanaka, Ekdeep Singh Lubana
Modern generative models demonstrate impressive capabilities, likely stemming from an ability to identify and manipulate abstract concepts underlying their training data.
no code implementations • 12 Feb 2024 • Mikail Khona, Maya Okawa, Jan Hula, Rahul Ramesh, Kento Nishi, Robert Dick, Ekdeep Singh Lubana, Hidenori Tanaka
Stepwise inference protocols, such as scratchpads and chain-of-thought, help language models solve complex problems by decomposing them into a sequence of simpler subproblems.
no code implementations • 29 Jan 2024 • Yoshiaki Takimoto, Yusuke Tanaka, Tomoharu Iwata, Maya Okawa, Hideaki Kim, Hiroyuki Toda, Takeshi Kurashima
The point process is widely used in many applications to predict such events related to human activities.
1 code implementation • NeurIPS 2023 • Maya Okawa, Ekdeep Singh Lubana, Robert P. Dick, Hidenori Tanaka
Motivated by this, we perform a controlled study for understanding compositional generalization in conditional diffusion models in a synthetic setting, varying different attributes of the training data and measuring the model's ability to generate samples out-of-distribution.
1 code implementation • 7 Jul 2022 • Maya Okawa, Tomoharu Iwata
Traditionally, theoretical models of opinion dynamics have been proposed to describe the interactions between individuals (i.e., social interaction) and their impact on the evolution of collective opinions.
no code implementations • 24 Jun 2022 • Yusuke Tanaka, Toshiyuki Tanaka, Tomoharu Iwata, Takeshi Kurashima, Maya Okawa, Yasunori Akagi, Hiroyuki Toda
Since the supports may have various granularities depending on attributes (e.g., poverty rate and crime rate), modeling such data is not straightforward.
no code implementations • 24 May 2021 • Maya Okawa, Tomoharu Iwata, Yusuke Tanaka, Hiroyuki Toda, Takeshi Kurashima, Hisashi Kashima
Hawkes processes offer a central tool for modeling diffusion processes, in which the influence of past events is described by the triggering kernel.
no code implementations • NeurIPS 2019 • Yusuke Tanaka, Toshiyuki Tanaka, Tomoharu Iwata, Takeshi Kurashima, Maya Okawa, Yasunori Akagi, Hiroyuki Toda
By deriving the posterior GP, we can predict the data value at any location point by considering the spatial correlations and the dependences between areal data sets, simultaneously.
no code implementations • 21 Jun 2019 • Maya Okawa, Tomoharu Iwata, Takeshi Kurashima, Yusuke Tanaka, Hiroyuki Toda, Naonori Ueda
Though many point processes have been proposed to model events in a continuous spatio-temporal space, none of them allow for the consideration of the rich contextual factors that affect event occurrence, such as weather, social activities, geographical characteristics, and traffic.
no code implementations • 21 Sep 2018 • Yusuke Tanaka, Tomoharu Iwata, Toshiyuki Tanaka, Takeshi Kurashima, Maya Okawa, Hiroyuki Toda
With the proposed model, a distribution for each auxiliary data set on the continuous space is modeled using a Gaussian process, where the representation of uncertainty considers the levels of granularity.