no code implementations • 10 Dec 2024 • Eric Bigelow, Ari Holtzman, Hidenori Tanaka, Tomer Ullman
Estimating uncertainty in Large Language Models (LLMs) is important for properly evaluating LLMs and for ensuring user safety.
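The snippet does not specify the paper's method; as a minimal sketch, one common uncertainty proxy is average token surprisal computed from the per-token log-probabilities an LLM API returns (the values below are hypothetical):

```python
# A minimal sketch, assuming token log-probabilities are available from the
# model API; this is a standard baseline proxy, not the paper's method.

def average_surprisal(token_logprobs: list[float]) -> float:
    """Mean negative log-probability of the generated tokens, in nats.
    Higher values indicate a less confident generation."""
    return -sum(token_logprobs) / len(token_logprobs)

# Hypothetical per-token log-probs for a short generation.
logprobs = [-0.05, -1.2, -0.3, -2.7]
print(f"uncertainty proxy: {average_surprisal(logprobs):.3f} nats")
```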
no code implementations • 7 Dec 2024 • Tomer Ullman
Here, I invert the standard use of perceptual illusions to examine basic processing errors in current vision-language models.
1 code implementation • 26 Nov 2024 • Colin Conwell, Rupert Tawiah-Quashie, Tomer Ullman
We asked human respondents (N=178 in total) to evaluate images generated by a state-of-the-art image-generating AI (DALL-E 3) prompted with these 'logical probes', and found that none reliably produced human agreement scores greater than 50%.
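For illustration, a toy version of such an agreement score, assuming binary per-respondent judgments (the exact scoring scheme is an assumption here):

```python
# Toy sketch of the agreement metric; the real scoring scheme is assumed.

def agreement_score(judgments: list[bool]) -> float:
    """Fraction of respondents who judged the image as matching the probe."""
    return sum(judgments) / len(judgments)

# A probe is reliable only if agreement exceeds the 50% threshold.
judgments = [True] * 80 + [False] * 98  # hypothetical N=178 responses
print(agreement_score(judgments) > 0.5)  # False
```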
no code implementations • 7 Nov 2024 • Sonia K. Murthy, Tomer Ullman, Jennifer Hu
Researchers in social science and psychology have recently proposed using large language models (LLMs) as replacements for humans in behavioral research.
1 code implementation • 16 Jan 2024 • Chuanyang Jin, Yutong Wu, Jing Cao, Jiannan Xiang, Yen-Ling Kuo, Zhiting Hu, Tomer Ullman, Antonio Torralba, Joshua B. Tenenbaum, Tianmin Shu
To engineer multimodal Theory of Mind (ToM) capacity, we propose a novel method, BIP-ALM (Bayesian Inverse Planning Accelerated by Language Models).
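The name fixes only the overall structure; below is a minimal sketch of generic Bayesian inverse planning, not the authors' BIP-ALM implementation, where `lm_likelihood` is a hypothetical stand-in for a language model scoring P(action | state, goal):

```python
# Minimal sketch of Bayesian inverse planning: infer a posterior over goals
# from observed state-action pairs. `lm_likelihood` is a hypothetical
# stand-in for a language model scoring P(action | state, goal).

def infer_goal(goals, prior, states, actions, lm_likelihood):
    """Return P(goal | actions) ∝ P(goal) · Π_t P(action_t | state_t, goal)."""
    posterior = {}
    for g in goals:
        p = prior[g]
        for s, a in zip(states, actions):
            p *= lm_likelihood(a, s, g)
        posterior[g] = p
    z = sum(posterior.values())
    return {g: p / z for g, p in posterior.items()}
```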
no code implementations • 16 Feb 2023 • Tomer Ullman
Intuitive psychology is a pillar of common-sense reasoning.
no code implementations • 4 Oct 2022 • Felix A. Sosa, Tomer Ullman
Humans can generate reasonable answers to novel queries (Schulz, 2012): if I asked you what kind of food you wanted to eat for lunch, you would respond with a food, not a time.
no code implementations • 29 Jul 2022 • Colin Conwell, Tomer Ullman
Relations are basic building blocks of human cognition.
no code implementations • NeurIPS 2021 • Kai Xu, Akash Srivastava, Dan Gutfreund, Felix Sosa, Tomer Ullman, Josh Tenenbaum, Charles Sutton
In this paper, we propose a Bayesian-symbolic framework (BSP) for physical reasoning and learning that approaches human-level sample efficiency and accuracy.
no code implementations • 1 Jan 2021 • Kai Xu, Akash Srivastava, Dan Gutfreund, Felix Sosa, Tomer Ullman, Joshua B. Tenenbaum, Charles Sutton
As such, learning the laws reduces to symbolic regression, and Bayesian inference is used to obtain the distribution over unobserved properties.
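An illustrative sketch of the two pieces under assumed details (not the paper's code): symbolic regression over a tiny hypothesis space of force laws, plus grid-based Bayesian inference over an unobserved mass:

```python
# Sketch under assumed details: law learning as symbolic regression over a
# small hypothesis space, and inference over an unobserved property (mass).
import numpy as np

CANDIDATE_LAWS = {
    "spring": lambda x, k: -k * x,             # F = -kx
    "friction": lambda x, k: -k * np.sign(x),  # constant-magnitude opposition
}

def fit_law(xs, forces):
    """Return the (law name, constant k) minimizing squared error."""
    best = None
    for name, law in CANDIDATE_LAWS.items():
        for k in np.linspace(0.1, 10.0, 100):
            err = float(np.mean((forces - law(xs, k)) ** 2))
            if best is None or err < best[2]:
                best = (name, k, err)
    return best[:2]

def mass_posterior(forces, accels, masses, sigma=0.1):
    """P(m | data) ∝ Π_t N(a_t; F_t / m, σ²), evaluated on a mass grid."""
    logp = np.array([-0.5 * np.sum((accels - forces / m) ** 2) / sigma**2
                     for m in masses])
    p = np.exp(logp - logp.max())
    return p / p.sum()
```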
no code implementations • ICLR 2021 • Yilun Du, Kevin A. Smith, Tomer Ullman, Joshua B. Tenenbaum, Jiajun Wu
We study the problem of unsupervised physical object discovery.
no code implementations • 1 Jan 2021 • Jiayuan Mao, Zhezheng Luo, Chuang Gan, Joshua B. Tenenbaum, Jiajun Wu, Leslie Pack Kaelbling, Tomer Ullman
We aim to learn generalizable representations for complex activities by quantifying over both entities and time, as in “the kicker is behind all the other players,” or “the player controls the ball until it moves toward the goal.” Such a structural inductive bias of object relations, object quantification, and temporal orders will enable the learned representation to generalize to situations with varying numbers of agents, objects, and time courses.
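One minimal sketch of such quantified spatio-temporal predicates, under an assumed trajectory representation (frames as dicts of named entities; this is not the authors' model):

```python
# Sketch of quantified relational predicates over object trajectories, e.g.
# "the kicker is behind all the other players" (universal quantification)
# and "the player controls the ball until it moves toward the goal" (until).

def behind(a, b):
    return a["x"] < b["x"]  # toy 1-D notion of "behind"

def forall_others(frame, subject, relation):
    """Universal quantification over all entities except the subject."""
    return all(relation(frame[subject], frame[o])
               for o in frame if o != subject)

def until(trajectory, hold, release):
    """hold(frame) is true at every step before the first release(frame)."""
    for frame in trajectory:
        if release(frame):
            return True
        if not hold(frame):
            return False
    return False
```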
1 code implementation • NeurIPS 2019 • Kevin Smith, Lingjie Mei, Shunyu Yao, Jiajun Wu, Elizabeth Spelke, Josh Tenenbaum, Tomer Ullman
We also present a new test set for measuring violations of physical expectations, using a range of scenarios derived from developmental psychology.
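A common formulation of violation-of-expectation scoring, assumed here for illustration (the paper's exact metric may differ), treats surprise as negative log-likelihood under a learned dynamics model:

```python
# Assumed formulation: surprise = negative log-likelihood of observed object
# states under a dynamics model's Gaussian predictions.
import numpy as np

def surprisal(observed, pred_mean, pred_std):
    """Per-frame Gaussian surprisal; spikes mark physically surprising events."""
    z = (observed - pred_mean) / pred_std
    return 0.5 * z**2 + np.log(pred_std * np.sqrt(2 * np.pi))
```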
1 code implementation • 1 Dec 2016 • Michael B. Chang, Tomer Ullman, Antonio Torralba, Joshua B. Tenenbaum
By comparing to less structured architectures, we show that the NPE's compositional representation of the structure in physical interactions improves its ability to predict movement, generalize across variable object count and different scene configurations, and infer latent properties of objects such as mass.
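A simplified sketch of such a compositional prediction step, following the pairwise encode-sum-decode idea the abstract describes (layer sizes and the state format are illustrative assumptions):

```python
# NPE-style compositional step (simplified): encode each (focus, context)
# object pair, sum the pairwise effects, decode the focus object's next
# velocity. Dimensions here are illustrative assumptions.
import torch
import torch.nn as nn

class PairwisePredictor(nn.Module):
    def __init__(self, state_dim=4, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(2 * state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden))
        self.decoder = nn.Sequential(
            nn.Linear(hidden + state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2))  # next (vx, vy) of the focus object

    def forward(self, focus, context):
        # focus: (state_dim,); context: (num_others, state_dim)
        pairs = torch.cat(
            [focus.expand(context.shape[0], -1), context], dim=-1)
        effect = self.encoder(pairs).sum(dim=0)  # summed pairwise effects
        return self.decoder(torch.cat([effect, focus]))
```

Summing the per-pair effects is what makes the representation invariant to object count, which is the generalization property the entry highlights.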
no code implementations • NeurIPS 2009 • Tomer Ullman, Chris Baker, Owen Macindoe, Owain Evans, Noah Goodman, Joshua B. Tenenbaum
Everyday social interactions are heavily influenced by our snap judgments about others' goals.