no code implementations • RaPID (LREC) 2022 • Ruihao Pan, Ziming Liu, Fengpei Yuan, Maryam Zare, Xiaopeng Zhao, Rebecca Jane Passonneau
An assistive robot Pepper has been designed to administer Referential Communication Tasks (RCTs) to human subjects without dementia as a step towards an agent to administer RCTs to dementia patients, potentially for earlier diagnosis.
no code implementations • 21 Jan 2025 • Yizhou Liu, Ziming Liu, Jeff Gore
Large language models (LLMs) demonstrate remarkable performance, and improving their pre-training process appears to be key to enhancing their capabilities further.
1 code implementation • 21 Jan 2025 • Ziming Liu, Yizhou Liu, Eric J. Michaud, Jeff Gore, Max Tegmark
We aim to understand the physics of skill learning, i.e., how skills are learned in neural networks during training.
no code implementations • 16 Jan 2025 • Wei Bu, Uri Kol, Ziming Liu
The dynamical evolution of a neural network during training has been an incredibly fascinating subject of study.
no code implementations • 22 Nov 2024 • Yiran Qiao, Yateng Tang, Xiang Ao, Qi Yuan, Ziming Liu, Chen Shen, Xuehao Zheng
We evaluate LBSF on the financial risk assessment task using a large-scale real-world dataset.
no code implementations • 2 Oct 2024 • YiXuan Wang, Jonathan W. Siegel, Ziming Liu, Thomas Y. Hou
This shows that the approximation and representation capabilities of KANs are at least as good as those of MLPs.
no code implementations • 23 Aug 2024 • Qiyao Liang, Ziming Liu, Mitchell Ostrow, Ila Fiete
Diffusion models are capable of generating photo-realistic images that combine elements which likely do not appear together in the training set, demonstrating the ability to compositionally generalize.
no code implementations • 22 Aug 2024 • Ziming Liu, Jingcai Guo, Song Guo, Xiaocheng Lu
This paper investigates a challenging problem of zero-shot learning in the multi-label scenario (MLZSL), wherein the model is trained to recognize multiple unseen classes within a sample (e.g., an image) based on seen classes and auxiliary knowledge, e.g., semantic information.
1 code implementation • 19 Aug 2024 • Ziming Liu, Pingchuan Ma, YiXuan Wang, Wojciech Matusik, Max Tegmark
The synergy is bidirectional: science to KAN (incorporating scientific knowledge into KANs), and KAN to science (extracting scientific insights from KANs).
1 code implementation • 27 May 2024 • Xiaoman Delores Ding, Zifan Carl Guo, Eric J. Michaud, Ziming Liu, Max Tegmark
To investigate this Survival of the Fittest hypothesis, we conduct a case study on neural networks performing modular addition, and find that these networks' multiple circular representations at different Fourier frequencies undergo such competitive dynamics, with only a few circles surviving at the end.
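For context on what a "circle" means here, a minimal sketch based on the standard picture from work on modular addition (not necessarily this paper's exact parameterization): for modulus $p$ and Fourier frequency $k$, such a representation embeds a token $a$ approximately as
$$E_k(a) \propto \begin{pmatrix} \cos(2\pi k a / p) \\ \sin(2\pi k a / p) \end{pmatrix},$$
up to rotation and scaling, so that modular addition corresponds to angle addition on the circle.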
no code implementations • 26 May 2024 • Tyler Morris, Ziming Liu, Longjian Liu, Xiaopeng Zhao
Altogether, this combination of a convolutional neural network and an explainable AI algorithm creates a system that can be used in the medical field not only to aid in the proper classification of dementia but also to let everyone involved visualize and understand the results.
no code implementations • 23 May 2024 • Subhash Kantamneni, Ziming Liu, Max Tegmark
We develop four criteria for the use of a method within the simple testbed of linear regression, where our method is $y = wx$ and our intermediate is $w$: (1) Can the intermediate be predicted from hidden states?
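To make criterion (1) concrete, here is a minimal probing sketch in the spirit of that testbed; the hidden states and ground-truth $w$ values below are synthetic placeholder arrays, not outputs of the paper's models:

```python
import numpy as np

# Placeholder data: hidden states from some model (n_examples x d_hidden)
# and the ground-truth intermediate w for each example (synthetic here).
rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(1000, 64))
w_true = hidden_states @ rng.normal(size=64) + 0.1 * rng.normal(size=1000)

# Criterion (1): can the intermediate be predicted from hidden states?
# Fit a linear probe by least squares and report R^2 on held-out data.
n_train = 800
X_tr, X_te = hidden_states[:n_train], hidden_states[n_train:]
y_tr, y_te = w_true[:n_train], w_true[n_train:]

coef, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)
pred = X_te @ coef
r2 = 1.0 - np.sum((y_te - pred) ** 2) / np.sum((y_te - y_te.mean()) ** 2)
print(f"probe R^2 = {r2:.3f}")
```

A high held-out R^2 would satisfy the first criterion; the remaining criteria probe whether the model actually uses the intermediate, not merely encodes it.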
no code implementations • 7 May 2024 • Subhash Kantamneni, Ziming Liu, Max Tegmark
Integrable partial differential equation (PDE) systems are of great interest in natural science, but are exceedingly rare and difficult to discover.
23 code implementations • 30 Apr 2024 • Ziming Liu, YiXuan Wang, Sachin Vaidya, Fabian Ruehle, James Halverson, Marin Soljačić, Thomas Y. Hou, Max Tegmark
Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs).
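A minimal, illustrative sketch of the KAN idea (learnable univariate functions on edges, summation at nodes). The edge functions are parameterized here by a small Gaussian basis purely for illustration; the released implementation uses B-splines plus a base activation, so treat this as a toy forward pass rather than the paper's code:

```python
import numpy as np

def edge_function(x, coeffs, centers, width=0.5):
    """A univariate function phi(x) as a linear combination of fixed
    Gaussian bumps; coeffs are the trainable parameters."""
    basis = np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))
    return basis @ coeffs  # shape: (batch,)

def kan_layer(x, coeffs, centers):
    """One KAN layer: output_j = sum_i phi_{j,i}(x_i).
    x: (batch, d_in), coeffs: (d_out, d_in, n_basis)."""
    batch, d_in = x.shape
    d_out = coeffs.shape[0]
    out = np.zeros((batch, d_out))
    for j in range(d_out):
        for i in range(d_in):
            out[:, j] += edge_function(x[:, i], coeffs[j, i], centers)
    return out

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(8, 3))         # batch of 3-dim inputs
centers = np.linspace(-1, 1, 5)             # basis-function centers
coeffs = rng.normal(size=(2, 3, 5))         # a 3 -> 2 KAN layer
print(kan_layer(x, coeffs, centers).shape)  # (8, 2)
```

The contrast with an MLP is that the nonlinearity lives on the edges (one learnable function per input-output pair) rather than as a fixed activation applied after a linear map.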
2 code implementations • 15 Mar 2024 • Xuanlei Zhao, Shenggan Cheng, Chang Chen, Zangwei Zheng, Ziming Liu, Zheming Yang, Yang You
Scaling multi-dimensional transformers to long sequences is indispensable across various domains.
no code implementations • 8 Feb 2024 • David D. Baek, Ziming Liu, Max Tegmark
We present GenEFT: an effective theory framework for shedding light on the statics and dynamics of neural network generalization, and illustrate it with graph learning examples.
no code implementations • 7 Feb 2024 • Jinyeop Song, Ziming Liu, Max Tegmark, Jeff Gore
A task is usually composite and can hence be decomposed into many subtasks, which compete for resources (measured by the number of neurons allocated to each subtask).
1 code implementation • 7 Feb 2024 • Eric J. Michaud, Isaac Liao, Vedang Lad, Ziming Liu, Anish Mudide, Chloe Loughridge, Zifan Carl Guo, Tara Rezaei Kheirkhah, Mateja Vukelić, Max Tegmark
We present MIPS, a novel method for program synthesis based on automated mechanistic interpretability of neural networks trained to perform the desired task, auto-distilling the learned algorithm into Python code.
no code implementations • 5 Feb 2024 • Qiyao Liang, Ziming Liu, Ila Fiete
Corresponding to each of these phases, we identify qualitatively different generation behaviors: 1) multiple bumps are generated, 2) one bump is generated but at inaccurate $x$ and $y$ locations, 3) a bump is generated at the correct $x$ and $y$ location.
no code implementations • 19 Jan 2024 • Xuanlei Zhao, Shenggan Cheng, Guangyang Lu, Jiarui Fang, Haotian Zhou, Bin Jia, Ziming Liu, Yang You
The experiments demonstrate that AutoChunk can reduce activation memory by over 80% while keeping the speed loss within 10%, extend the maximum sequence length by 3.2x to 11.7x, and outperform state-of-the-art methods by a large margin.
no code implementations • 15 Dec 2023 • Jingcai Guo, Qihua Zhou, Ruibing Li, Xiaocheng Lu, Ziming Liu, Junyang Chen, Xin Xie, Jie Zhang
Then, to facilitate the generalization of local linearities, we construct a maximal-margin geometry on the learned features by enforcing low-rank constraints on intra-class samples and high-rank constraints on inter-class samples, resulting in orthogonal subspaces for different classes, with each subspace lying on a compact manifold.
no code implementations • 5 Dec 2023 • Isaac Liao, Ziming Liu, Max Tegmark
The hypernetwork is carefully designed such that it can control network complexity, leading to a diverse family of interpretable algorithms ranked by their complexity.
no code implementations • 11 Oct 2023 • Ziming Liu, Mikail Khona, Ila R. Fiete, Max Tegmark
Recurrent neural networks (RNNs) trained on compositional tasks can exhibit functional modularity, in which neurons can be clustered by activity similarity and participation in shared computational subtasks.
no code implementations • 9 Oct 2023 • Ziming Liu, Ziqian Zhong, Max Tegmark
To do so, we define the linear mapping number (LMN) to measure network complexity, which is a generalized version of the linear region number for ReLU networks.
no code implementations • 3 Oct 2023 • Ziming Liu, Max Tegmark
Neural scaling laws (NSL) refer to the phenomenon where model performance improves with scale.
no code implementations • 2 Sep 2023 • Ziming Liu, Jingcai Guo, Xiaocheng Lu, Song Guo, Peiran Dong, Jiewei Zhang
That is, in the process of inferring unseen classes, global features represent the principal direction of the image in the feature space, while local features should maintain uniqueness within a certain range.
1 code implementation • NeurIPS 2023 • Ziqian Zhong, Ziming Liu, Max Tegmark, Jacob Andreas
Do neural networks, trained on well-understood algorithmic tasks, reliably rediscover known algorithms for solving those tasks?
2 code implementations • NeurIPS 2023 • Yilun Xu, Mingyang Deng, Xiang Cheng, Yonglong Tian, Ziming Liu, Tommi Jaakkola
Restart not only outperforms the previous best SDE results, but also accelerates sampling by 10-fold on CIFAR-10 and 2-fold on ImageNet $64 \times 64$.
1 code implementation • 31 May 2023 • Ziming Liu, Patrick Obin Sturm, Saketh Bharadwaj, Sam Silva, Max Tegmark
Discovering conservation laws for a given dynamical system is important but challenging.
1 code implementation • 4 May 2023 • Ziming Liu, Eric Gan, Max Tegmark
We introduce Brain-Inspired Modular Training (BIMT), a method for making neural networks more modular and interpretable.
no code implementations • 2 May 2023 • Xiaocheng Lu, Ziming Liu, Song Guo, Jingcai Guo, Fushuo Huo, Sikai Bai, Tao Han
Compositional Zero-shot Learning (CZSL) aims to recognize novel concepts composed of known knowledge without training samples.
no code implementations • 5 Apr 2023 • Ziming Liu, Di Luo, Yilun Xu, Tommi Jaakkola, Max Tegmark
We introduce a general family, Generative Models from Physical Processes (GenPhys), where we translate partial differential equations (PDEs) describing physical processes to generative models.
1 code implementation • NeurIPS 2023 • Eric J. Michaud, Ziming Liu, Uzay Girit, Max Tegmark
We tentatively find that the frequency at which these quanta are used in the training distribution roughly follows a power law corresponding with the empirical scaling exponent for language models, a prediction of our theory.
1 code implementation • 8 Feb 2023 • Yilun Xu, Ziming Liu, Yonglong Tian, Shangyuan Tong, Max Tegmark, Tommi Jaakkola
The new models reduce to PFGM when $D{=}1$ and to diffusion models when $D{\to}\infty$.
Ranked #1 on Image Generation on FFHQ 64x64 - 4x upscaling
no code implementations • CVPR 2023 • Ziming Liu, Song Guo, Xiaocheng Lu, Jingcai Guo, Jiewei Zhang, Yue Zeng, Fushuo Huo
Recent studies usually approach multi-label zero-shot learning (MLZSL) with visual-semantic mapping on spatial-class correlation, which can be computationally costly, and worse still, fails to capture fine-grained class-specific semantics.
1 code implementation • CVPR 2023 • Xiaocheng Lu, Ziming Liu, Song Guo, Jingcai Guo
Existing methods either learn a combined state-object representation, which challenges generalization to unseen compositions, or design two classifiers to identify the state and the object separately from image features, ignoring the intrinsic relationship between them.
no code implementations • 19 Nov 2022 • Fushuo Huo, Wenchao Xu, Song Guo, Jingcai Guo, Haozhao Wang, Ziming Liu, Xiaocheng Lu
Open-World Compositional Zero-shot Learning (OW-CZSL) aims to recognize novel compositions of state and object primitives in images with no priors on the compositional space, which induces a tremendously large output space containing all possible state-object compositions.
1 code implementation • 24 Oct 2022 • Eric J. Michaud, Ziming Liu, Max Tegmark
We explore unique considerations involved in fitting ML models to data with very high precision, as is often required for science applications.
2 code implementations • 3 Oct 2022 • Ziming Liu, Eric J. Michaud, Max Tegmark
Grokking, the unusual phenomenon for algorithmic datasets where generalization happens long after overfitting the training data, has remained elusive.
1 code implementation • 22 Sep 2022 • Yilun Xu, Ziming Liu, Max Tegmark, Tommi Jaakkola
We interpret the data points as electrical charges on the $z=0$ hyperplane in a space augmented with an additional dimension $z$, generating a high-dimensional electric field (the gradient of the solution to Poisson equation).
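As a rough illustration of that construction, here is a simplified sketch of the unnormalized empirical field on toy 2D data (not the paper's training or sampling code): each data point becomes a charge at $z=0$ in the augmented space, and the field at an augmented query point is a sum of inverse-power-law contributions.

```python
import numpy as np

def empirical_poisson_field(x_aug, data, eps=1e-8):
    """Unnormalized empirical field at the augmented point x_aug = (x, z).
    data: (n, d) points placed as charges at z = 0, i.e. (y_i, 0).
    In N = d + 1 dimensions a point charge's field falls off as 1/r^(N-1),
    so each contribution is (x_aug - y_aug) / ||x_aug - y_aug||^N."""
    n, d = data.shape
    N = d + 1
    data_aug = np.concatenate([data, np.zeros((n, 1))], axis=1)  # charges at z = 0
    diff = x_aug[None, :] - data_aug                             # (n, N)
    r = np.linalg.norm(diff, axis=1, keepdims=True) + eps
    return (diff / r ** N).mean(axis=0)                          # (N,)

rng = np.random.default_rng(0)
toy_data = rng.normal(size=(500, 2))   # 2D toy "dataset"
query = np.array([0.5, -0.3, 2.0])     # a query point at z = 2 above the data plane
print(empirical_poisson_field(query, toy_data))
```

Sampling in the actual method follows the learned field backward from a large-$z$ shell toward the $z=0$ plane, which this sketch does not implement.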
no code implementations • 6 Sep 2022 • Jiangsu Du, Ziming Liu, Jiarui Fang, Shenggui Li, Yongbin Li, Yutong Lu, Yang You
Although the AI community has expanded the model scale to the trillion-parameter level, the practical deployment of 10-100 billion parameter models is still uncertain due to latency, throughput, and memory constraints.
no code implementations • 21 Aug 2022 • Jingcai Guo, Song Guo, Jie Zhang, Ziming Liu
Concretely, we maintain an edge-agnostic hidden model in the cloud server to estimate a less accurate but direction-aware inversion of the global model.
no code implementations • 9 Aug 2022 • Ziming Liu, Andrew M. Stuart, YiXuan Wang
We propose a sampling method based on an ensemble approximation of second order Langevin dynamics.
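For orientation, here is the second-order (underdamped) Langevin dynamics underlying that sampler, discretized with a plain Euler-Maruyama step for a single particle; the paper's contribution is an ensemble approximation of this dynamics, which this toy sketch does not implement:

```python
import numpy as np

def underdamped_langevin(grad_U, x0, steps=5000, dt=1e-2, gamma=1.0, seed=0):
    """Simulate dx = v dt, dv = (-grad_U(x) - gamma v) dt + sqrt(2 gamma) dW."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    v = np.zeros_like(x)
    samples = []
    for _ in range(steps):
        noise = rng.normal(size=x.shape)
        v += (-grad_U(x) - gamma * v) * dt + np.sqrt(2 * gamma * dt) * noise
        x += v * dt
        samples.append(x.copy())
    return np.array(samples)

# Toy target: standard Gaussian, U(x) = ||x||^2 / 2, so grad_U(x) = x.
samples = underdamped_langevin(lambda x: x, x0=[3.0, -3.0])
tail = samples[len(samples) // 2:]
print(tail.mean(axis=0), tail.std(axis=0))  # roughly mean 0, std 1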
1 code implementation • 13 Jun 2022 • Feijie Wu, Song Guo, Zhihao Qu, Shiqi He, Ziming Liu, Jing Gao
The lack of inactive clients' updates in partial client participation makes it more likely for the model aggregation to deviate from the aggregation based on full client participation.
1 code implementation • 20 May 2022 • Ziming Liu, Ouail Kitouni, Niklas Nolte, Eric J. Michaud, Max Tegmark, Mike Williams
We aim to understand grokking, a phenomenon where models generalize long after overfitting their training set.
no code implementations • 23 Mar 2022 • Ziming Liu, Varun Madhavan, Max Tegmark
We present a machine learning algorithm that discovers conservation laws from differential equations, both numerically (parametrized as neural networks) and symbolically, ensuring their functional independence (a non-linear generalization of linear independence).
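A small numerical illustration of the underlying criterion (not the paper's discovery algorithm): a quantity $H(x)$ is conserved by the dynamics $\dot{x} = f(x)$ exactly when $\nabla H(x) \cdot f(x) = 0$ everywhere, which can be checked at sampled states.

```python
import numpy as np

def conservation_residual(H, f, xs, h=1e-5):
    """Mean |grad H(x) . f(x)| over sample states xs, using finite differences.
    Zero (up to numerical error) means H is conserved by dx/dt = f(x)."""
    residuals = []
    for x in xs:
        grad = np.array([(H(x + h * e) - H(x - h * e)) / (2 * h)
                         for e in np.eye(len(x))])
        residuals.append(abs(grad @ f(x)))
    return float(np.mean(residuals))

# Example: harmonic oscillator x = (q, p) with f = (p, -q); energy H = (q^2 + p^2)/2.
f = lambda x: np.array([x[1], -x[0]])
H = lambda x: 0.5 * (x[0] ** 2 + x[1] ** 2)
xs = np.random.default_rng(0).normal(size=(100, 2))
print(conservation_residual(H, f, xs))                # ~0: energy is conserved
print(conservation_residual(lambda x: x[0], f, xs))   # not ~0: q alone is not conserved
```

The paper's method additionally parameterizes candidate conserved quantities (numerically as neural networks, or symbolically) and enforces functional independence among them, which this check alone does not do.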
no code implementations • 7 Mar 2022 • Ziming Liu, Song Guo, Jingcai Guo, Yuanyuan Xu, Fushuo Huo
We argue that disregarding the connection between the major and minor classes, which correspond to global and local information respectively, is the cause of the problem.
1 code implementation • 17 Dec 2021 • Feijie Wu, Song Guo, Haozhao Wang, Zhihao Qu, Haobo Zhang, Jie Zhang, Ziming Liu
In the setting of federated optimization, where a global model is aggregated periodically, step asynchronism occurs when participants conduct model training by efficiently utilizing their computational resources.
no code implementations • NeurIPS Workshop AI4Scien 2021 • Ziming Liu, Yunyue Chen, Yuanqi Du, Max Tegmark
Integrating physical inductive biases into machine learning can improve model generalizability.
no code implementations • 20 Sep 2021 • Ziming Liu, Max Tegmark
We present an automated method for finding hidden symmetries, defined as symmetries that become manifest only in a new coordinate system that must be discovered.
no code implementations • 31 May 2021 • Ziming Liu, Bohan Wang, Qi Meng, Wei Chen, Max Tegmark, Tie-Yan Liu
Energy conservation is a basic physics principle, the breakdown of which often implies new physics.
no code implementations • 9 Nov 2020 • Ziming Liu, Max Tegmark
We present AI Poincaré, a machine learning algorithm for auto-discovering conserved quantities using trajectory data from unknown dynamical systems.
no code implementations • 13 Jun 2020 • Ziming Liu, Guangyu Gao, Lin Sun, Zhiyuan Fang
By extracting various features from high to low resolutions, the MD-IPN is able to improve the performance of small object detection while maintaining the performance on medium and large objects.
no code implementations • 13 Jun 2020 • Ziming Liu, Guangyu Gao, A. K. Qin, Jinyang Li
Finally, the DTG-Net is evaluated in two ways: (i) the self-supervised DTG-Net to pre-train the supervised action recognition models with only unlabeled videos; (ii) the supervised DTG-Net to be jointly trained with the supervised action networks in an end-to-end way.
no code implementations • 8 Jun 2020 • Ziming Liu, Sitian Qian, Yi-Xuan Wang, Yuxuan Yan, Tianyi Yang
Counterintuitively, by drawing a connection between PCA and the Schrödinger equation, we can not only attack the undersampling challenge but also compute efficiently and in a decoupled way with the proposed algorithm, called Schrödinger PCA.
no code implementations • 6 Dec 2019 • Ziming Liu, Yi-Xuan Wang, Zizhao Han, Dian Wu
Finally, both the original model and the perturbed model are tested on regional examples, as validations of our models.
1 code implementation • 4 Dec 2019 • Ziming Liu, Zheng Zhang
Hamiltonian Monte Carlo (HMC) is an efficient Bayesian sampling method that can make distant proposals in the parameter space by simulating a Hamiltonian dynamical system.
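As background for readers unfamiliar with HMC, here is a bare-bones sketch of one HMC transition with leapfrog integration on a toy Gaussian target (a generic textbook implementation, not the code accompanying that paper):

```python
import numpy as np

def hmc_step(x, log_prob, log_prob_grad, step_size=0.1, n_leapfrog=20, rng=None):
    """One HMC transition: sample a momentum, simulate Hamiltonian dynamics
    with the leapfrog integrator, then accept/reject (Metropolis)."""
    rng = rng or np.random.default_rng()
    p = rng.normal(size=x.shape)
    x_new, p_new = x.copy(), p.copy()
    # Leapfrog integration of H(x, p) = -log_prob(x) + ||p||^2 / 2.
    p_new += 0.5 * step_size * log_prob_grad(x_new)
    for _ in range(n_leapfrog - 1):
        x_new += step_size * p_new
        p_new += step_size * log_prob_grad(x_new)
    x_new += step_size * p_new
    p_new += 0.5 * step_size * log_prob_grad(x_new)
    # Metropolis correction for discretization error.
    log_accept = (log_prob(x_new) - 0.5 * p_new @ p_new) - (log_prob(x) - 0.5 * p @ p)
    return x_new if np.log(rng.uniform()) < log_accept else x

# Toy target: standard 2D Gaussian.
log_prob = lambda x: -0.5 * x @ x
log_prob_grad = lambda x: -x
rng = np.random.default_rng(0)
x = np.array([3.0, 3.0])
chain = []
for _ in range(2000):
    x = hmc_step(x, log_prob, log_prob_grad, rng=rng)
    chain.append(x)
print(np.mean(chain[500:], axis=0), np.std(chain[500:], axis=0))
```

The distant, low-rejection proposals come from simulating the Hamiltonian dynamics for many leapfrog steps before the single accept/reject decision.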
no code implementations • 2 Dec 2019 • Ziming Liu, Guangyu Gao, Lin Sun, Li Fang
In this paper, in addition to the top-down combination of information for shallow layers, we propose a novel network called the Image Pyramid Guidance Network (IPG-Net) to ensure that both the spatial information and the semantic information are abundant for each layer.
no code implementations • 21 Nov 2019 • Ziming Liu, Xiaobo Liu
Traditional PCA fault detection methods depend entirely on the training data.