Inference Optimization
18 papers with code • 0 benchmarks • 0 datasets
Most implemented papers
Input Convex Neural Networks
We show that many existing neural network architectures can be made input-convex with a minor modification, and develop specialized optimization algorithms tailored to this setting.
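A minimal sketch of the core construction, assuming a PyTorch-style fully input-convex network: pass-through maps from the input are unconstrained, while the weights acting on previous hidden layers are kept non-negative and the activation is convex and non-decreasing (ReLU). The layer sizes and names are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICNN(nn.Module):
    """Fully input-convex network f(x): convex in x by construction."""

    def __init__(self, in_dim, hidden=64, depth=3):
        super().__init__()
        # Unconstrained "pass-through" maps from the input to every layer.
        self.Wx = nn.ModuleList([nn.Linear(in_dim, hidden) for _ in range(depth)])
        self.Wx_out = nn.Linear(in_dim, 1)
        # Maps between hidden layers; their weights must stay non-negative.
        self.Wz = nn.ModuleList([nn.Linear(hidden, hidden, bias=False) for _ in range(depth - 1)])
        self.Wz_out = nn.Linear(hidden, 1, bias=False)

    def forward(self, x):
        # Non-negative Wz weights plus a convex, non-decreasing activation (ReLU)
        # make each layer a convex, non-decreasing function of the previous one,
        # so the composition stays convex in the input x.
        z = F.relu(self.Wx[0](x))
        for Wx, Wz in zip(self.Wx[1:], self.Wz):
            z = F.relu(Wx(x) + F.linear(z, Wz.weight.clamp(min=0)))
        return self.Wx_out(x) + F.linear(z, self.Wz_out.weight.clamp(min=0))

# Usage: f = ICNN(in_dim=5); y = f(torch.randn(8, 5))  # y is convex in each input row
```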
Enhanced graph-learning schemes driven by similar distributions of motifs
Guided by this, we first assume that we have a reference graph related to the sought graph (in the sense of having similar motif densities), and then exploit this relation by incorporating a similarity constraint and a regularization term into the network topology inference optimization problem.
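A minimal sketch of a topology-inference problem of this flavour, assuming the common smooth-signal graph-learning objective and replacing the paper's motif-similarity constraint with a simple Frobenius penalty toward a reference adjacency W_ref (a crude stand-in for the authors' formulation); cvxpy and all variable names are assumptions.

```python
import cvxpy as cp
import numpy as np

def infer_topology(X, W_ref, alpha=1.0, beta=0.5):
    """Learn a weighted adjacency W from node signals X (nodes x samples),
    regularized toward a reference graph W_ref."""
    n = X.shape[0]
    # Pairwise squared distances between node signals: smooth signals
    # should place edge weight between nodes with similar signals.
    Z = np.square(X[:, None, :] - X[None, :, :]).sum(axis=2)

    W = cp.Variable((n, n), symmetric=True)
    objective = cp.Minimize(
        cp.sum(cp.multiply(W, Z))            # data fit (signal smoothness)
        + alpha * cp.sum_squares(W)          # spread weights, avoid degenerate graphs
        + beta * cp.sum_squares(W - W_ref)   # stay close to the reference graph
    )
    constraints = [W >= 0, cp.diag(W) == 0, cp.sum(W) == n]  # valid graph, fixed scale
    cp.Problem(objective, constraints).solve()
    return W.value
```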
Representing Edge Flows on Graphs via Sparse Cell Complexes
In this paper, we generalize this approach to cellular complexes and introduce the flow representation learning problem, i.e., the problem of augmenting the observed graph by a set of cells, such that the eigenvectors of the associated Hodge Laplacian provide a sparse, interpretable representation of the observed edge flows on the graph.
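A minimal sketch of the object being used here, not the paper's cell-selection algorithm: assemble the Hodge 1-Laplacian from node-edge and edge-cell incidence matrices and expand an edge flow in its eigenvectors. The tiny triangle example and variable names are purely illustrative.

```python
import numpy as np

# Toy complex: 3 nodes, 3 oriented edges (0->1, 1->2, 0->2), one 2-cell (the triangle).
B1 = np.array([[-1,  0, -1],   # node-to-edge incidence (nodes x edges)
               [ 1, -1,  0],
               [ 0,  1,  1]])
B2 = np.array([[ 1],           # edge-to-cell incidence (edges x cells)
               [ 1],
               [-1]])

# Hodge 1-Laplacian: lower part from the graph, upper part from the added cell.
L1 = B1.T @ B1 + B2 @ B2.T

# Its eigenvectors form a basis for edge flows; choosing which cells to add
# is what makes an observed flow sparse in this basis.
eigvals, eigvecs = np.linalg.eigh(L1)

flow = np.array([1.0, 1.0, -1.0])   # a circulation around the triangle
coeffs = eigvecs.T @ flow           # representation of the flow in the Hodge basis
print(np.round(coeffs, 3))
```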
Patched MOA: optimizing inference for diverse software development tasks
This paper introduces Patched MOA (Mixture of Agents), an inference optimization technique that significantly enhances the performance of large language models (LLMs) across diverse software development tasks.
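A minimal sketch of a mixture-of-agents inference pass, assuming a hypothetical call_llm(model, prompt) helper; the proposer/aggregator split and the synthesis prompt are illustrative and not Patched MOA's actual pipeline.

```python
def call_llm(model: str, prompt: str) -> str:
    """Hypothetical helper that sends `prompt` to `model` and returns its reply."""
    raise NotImplementedError  # plug in your provider's client here

def mixture_of_agents(task: str, proposers: list[str], aggregator: str) -> str:
    # 1. Several "proposer" models (or one model with varied prompts)
    #    each draft a candidate answer to the task.
    candidates = [call_llm(m, task) for m in proposers]

    # 2. An "aggregator" model reads all drafts and synthesizes a final answer,
    #    which is where the gain over a single inference pass comes from.
    synthesis_prompt = (
        f"Task:\n{task}\n\nCandidate answers:\n"
        + "\n---\n".join(candidates)
        + "\n\nCombine the strongest parts of the candidates into one final answer."
    )
    return call_llm(aggregator, synthesis_prompt)
```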
CycleBNN: Cyclic Precision Training in Binary Neural Networks
This paper works on Binary Neural Networks (BNNs), a promising avenue for efficient deep learning, offering significant reductions in computational overhead and memory footprint compared to full-precision networks.
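A minimal sketch of the two ingredients the title names, assuming PyTorch: sign-binarized weights trained with a straight-through estimator, plus a cosine schedule that cycles the training bit-width. The schedule shape is illustrative, not the paper's exact recipe, and here only the schedule itself is sketched rather than the quantizer it would drive.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class SignSTE(torch.autograd.Function):
    """Binarize to {-1, +1} in the forward pass, pass gradients straight through."""
    @staticmethod
    def forward(ctx, w):
        return torch.where(w >= 0, torch.ones_like(w), -torch.ones_like(w))

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output

class BinaryLinear(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)

    def forward(self, x):
        return F.linear(x, SignSTE.apply(self.weight))

def cyclic_precision(step, cycle_len=1000, low_bits=2, high_bits=8):
    """Cosine-cycled bit-width: precision rises and falls every `cycle_len` steps,
    alternating cheap low-precision and more accurate high-precision phases."""
    t = (step % cycle_len) / cycle_len
    return round(low_bits + 0.5 * (high_bits - low_bits) * (1 - math.cos(2 * math.pi * t)))
```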
Iterative Amortized Inference
The failure of these models to reach fully optimized approximate posterior estimates results in an amortization gap.
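The excerpt names the problem (the amortization gap); the paper's remedy is to refine the approximate posterior over several steps instead of a single encoder pass. A minimal sketch of that idea, assuming PyTorch, a Gaussian posterior, and a learned update network; all module names and sizes are illustrative.

```python
import torch
import torch.nn as nn

class IterativeInference(nn.Module):
    """Refine variational parameters (mu, logvar) with a learned update
    driven by gradients of the ELBO, rather than a single encoder pass."""

    def __init__(self, latent_dim, hidden=128):
        super().__init__()
        # Update network: takes current params and their ELBO gradients,
        # outputs an additive correction.
        self.update = nn.Sequential(
            nn.Linear(4 * latent_dim, hidden), nn.ELU(),
            nn.Linear(hidden, 2 * latent_dim),
        )

    def forward(self, elbo_fn, mu, logvar, n_steps=5):
        for _ in range(n_steps):
            mu = mu.detach().requires_grad_(True)
            logvar = logvar.detach().requires_grad_(True)
            loss = -elbo_fn(mu, logvar)                    # negative ELBO for this datapoint
            g_mu, g_lv = torch.autograd.grad(loss.sum(), (mu, logvar))
            delta = self.update(torch.cat([mu, logvar, g_mu, g_lv], dim=-1))
            d_mu, d_lv = delta.chunk(2, dim=-1)
            mu, logvar = mu + d_mu, logvar + d_lv          # refined posterior estimate
        return mu, logvar
```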
A General Method for Amortizing Variational Filtering
We introduce the variational filtering EM algorithm, a simple, general-purpose method for performing variational inference in dynamical latent variable models using information from only past and present variables, i.e., filtering.
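A minimal sketch of a filtering-style variational loop over a sequence, assuming a hypothetical model interface (initial_state, prior, inference_update, transition); the interface and step structure are illustrative, not the paper's exact algorithm.

```python
def variational_filtering_em(model, observations, n_inf_steps=5):
    """At each time step, initialize the approximate posterior from the model's
    prior prediction and refine it using only past and present data,
    which is the filtering analogue of the variational E-step."""
    posteriors, state = [], model.initial_state()        # hypothetical model API
    for x_t in observations:
        mu, logvar = model.prior(state)                   # predict latent from the past
        for _ in range(n_inf_steps):                      # refine q(z_t | x_1..t)
            mu, logvar = model.inference_update(x_t, mu, logvar)
        posteriors.append((mu, logvar))
        state = model.transition(state, mu)               # carry the filtered estimate forward
    return posteriors
```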
Easy and Efficient Transformer: Scalable Inference Solution For large NLP model
To fill such a gap, we introduce a scalable inference solution: Easy and Efficient Transformer (EET), including a series of transformer inference optimizations at the algorithm and implementation levels.
A Novel 1D State Space for Efficient Music Rhythmic Analysis
Inferring music time structures has a broad range of applications in music production, processing and analysis.
ADJUST: A Dictionary-Based Joint Reconstruction and Unmixing Method for Spectral Tomography
However, these methods inherently suffer from the ill-posedness of the joint reconstruction problem.