1 code implementation • 4 Feb 2025 • Hoang M. Nguyen, Satya N. Shukla, Qiang Zhang, Hanchao Yu, Sreya D. Roy, Taipeng Tian, Lingjiong Zhu, Yuchen Liu
To address these limitations, we introduce BRIDLE (Bidirectional Residual Quantization Interleaved Discrete Learning Encoder), a self-supervised encoder pretraining framework that incorporates residual quantization (RQ) into the bidirectional training process and generalizes to pretraining on audio, image, and video data.
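Residual quantization itself is easy to illustrate: each stage quantizes the residual left over by the previous stage, so the codes refine the reconstruction coarse-to-fine. Below is a minimal sketch in this spirit; the `rq_encode` helper, codebook shapes, and parameters are hypothetical illustrations, not BRIDLE's actual implementation.

```python
import numpy as np

def rq_encode(x, codebooks):
    """Residual quantization: quantize x in stages, each stage
    encoding the residual left by the previous one.
    x: (d,) vector; codebooks: list of (K, d) arrays."""
    residual = x.copy()
    codes, quantized = [], np.zeros_like(x)
    for C in codebooks:
        # pick the codeword closest to the current residual
        k = np.argmin(np.linalg.norm(C - residual, axis=1))
        codes.append(k)
        quantized += C[k]
        residual -= C[k]          # the next stage sees what is left over
    return codes, quantized

# toy usage: 3 stages of 256 codes each over 16-dim features
rng = np.random.default_rng(0)
books = [rng.standard_normal((256, 16)) for _ in range(3)]
codes, xq = rq_encode(rng.standard_normal(16), books)
```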
no code implementations • 2 Feb 2025 • Thanh Dang, Melih Barsbey, A K M Rokonuzzaman Sonet, Mert Gurbuzbalaban, Umut Simsekli, Lingjiong Zhu
In this work, we establish generalization bounds for SGD with momentum (SGDm) under heavy-tailed gradient noise.
no code implementations • 20 Jan 2025 • Hengrong Du, Qi Feng, Changwei Tu, Xiaoyu Wang, Lingjiong Zhu
Based on the discretization of SRNLD, we propose skew-reflected non-reversible Langevin Monte Carlo (SRNLMC), and obtain non-asymptotic discretization error bounds relative to SRNLD, together with convergence guarantees to the target distribution in the 1-Wasserstein distance.
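As a rough picture of the two ingredients, the sketch below combines a non-reversible Langevin step (an antisymmetric perturbation $J$ added to the drift) with reflection back into a box. It uses ordinary mirror reflection for simplicity, whereas the skew reflection in SRNLD/SRNLMC also carries a tangential component, so this is a hedged caricature, not the paper's scheme.

```python
import numpy as np

def srnlmc_sketch(grad_f, x0, lo, hi, eta=1e-2, n_iter=10_000, seed=0):
    """Caricature of skew-reflected non-reversible Langevin Monte Carlo:
    non-reversible drift (I + J) grad f, then reflection into [lo, hi]^d.
    NOTE: real skew reflection is oblique; this uses plain mirror reflection."""
    rng = np.random.default_rng(seed)
    d = x0.size
    J = np.zeros((d, d))
    J[0, 1], J[1, 0] = 1.0, -1.0       # antisymmetric: J^T = -J
    x = x0.copy()
    samples = []
    for _ in range(n_iter):
        noise = rng.standard_normal(d)
        x = x - eta * (np.eye(d) + J) @ grad_f(x) + np.sqrt(2 * eta) * noise
        # mirror-reflect any coordinate that left the box
        x = np.where(x < lo, 2 * lo - x, x)
        x = np.where(x > hi, 2 * hi - x, x)
        samples.append(x.copy())
    return np.array(samples)

# target: standard Gaussian restricted to the box [-1, 1]^2
samples = srnlmc_sketch(lambda x: x, np.zeros(2), -1.0, 1.0)
```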
no code implementations • 11 Jan 2025 • Dan Pirjol, Lingjiong Zhu
As a remedy, we propose a capped volatility process by capping the drift and diffusion terms in the $v_{t}$ process such that it becomes non-explosive and well-behaved, and study the short-maturity asymptotics for the pricing of VIX options.
no code implementations • 2 Dec 2024 • Mert Gurbuzbalaban, Mohammad Rafiqul Islam, Xiaoyu Wang, Lingjiong Zhu
Motivated by the EXTRA algorithm and its generalizations for decentralized optimization, we propose the generalized EXTRA stochastic gradient Langevin dynamics, which eliminates this bias in the full-batch setting.
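For context, the sketch below shows plain decentralized SGLD, in which each node gossips with its neighbors through a doubly stochastic mixing matrix $W$, takes a local gradient step, and adds Gaussian noise; with a constant step-size this scheme carries an asymptotic bias, which is what an EXTRA-style correction is designed to remove. The network, potentials, and parameters here are hypothetical illustrations.

```python
import numpy as np

def de_sgld(grads, W, x0, eta=1e-2, n_iter=5_000, seed=0):
    """Plain decentralized SGLD over n nodes (no EXTRA correction):
    x_i <- sum_j W[i, j] x_j - eta * grad_i(x_i) + sqrt(2 * eta) * xi.
    grads: list of per-node gradient functions; W: doubly stochastic mixing matrix."""
    rng = np.random.default_rng(seed)
    n, d = x0.shape
    x = x0.copy()
    for _ in range(n_iter):
        mixed = W @ x                             # gossip-averaging step
        for i in range(n):
            x[i] = (mixed[i] - eta * grads[i](x[i])
                    + np.sqrt(2 * eta) * rng.standard_normal(d))
    return x

# toy ring of 4 nodes, each holding a shifted quadratic potential
n, d = 4, 2
W = 0.5 * np.eye(n) + 0.25 * (np.roll(np.eye(n), 1, 0) + np.roll(np.eye(n), -1, 0))
grads = [(lambda c: (lambda x: x - c))(c) for c in np.linspace(-1, 1, n)]
x_final = de_sgld(grads, W, np.zeros((n, d)))
```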
no code implementations • 4 Nov 2024 • Dan Pirjol, Xiaoyu Wang, Lingjiong Zhu
We derive the short-maturity asymptotics for prices of options on realized variance in local-stochastic volatility models.
no code implementations • 12 Sep 2024 • Dan Pirjol, Lingjiong Zhu
Using methods from large deviations theory, the asymptotics for the out-of-the-money (OTM) options are expressed in terms of a rate function, which is given by a two-dimensional variational problem.
no code implementations • 23 Jul 2024 • Dan Pirjol, Xiaoyu Wang, Lingjiong Zhu
We give explicit results for two classes of local-stochastic volatility models relevant in practice, with Heston-type and SABR-type stochastic volatility.
no code implementations • 4 Mar 2024 • Umut Şimşekli, Mert Gürbüzbalaban, Sinan Yildirim, Lingjiong Zhu
Injecting heavy-tailed noise into the iterates of stochastic gradient descent (SGD) has received increasing attention over the past few years.
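A minimal version of such an injection replaces the Gaussian perturbation in a noisy gradient step with symmetric $\alpha$-stable noise, which has infinite variance for $\alpha < 2$. The objective and all parameters below are illustrative only:

```python
import numpy as np
from scipy.stats import levy_stable

def sgd_with_stable_noise(grad_f, x0, alpha=1.8, eta=1e-3, n_iter=2_000, seed=0):
    """Gradient descent perturbed by symmetric alpha-stable noise.
    alpha = 2 recovers Gaussian noise; alpha < 2 gives heavy (infinite-variance) tails."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(n_iter):
        noise = levy_stable.rvs(alpha, beta=0.0, size=x.size, random_state=rng)
        # eta**(1/alpha) is the natural scaling for alpha-stable increments
        x = x - eta * grad_f(x) + eta ** (1.0 / alpha) * noise
    return x

# illustrative run on the quadratic f(x) = ||x||^2 / 2
x_out = sgd_with_stable_noise(lambda x: x, np.ones(10))
```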
no code implementations • 21 Feb 2024 • Dan Pirjol, Lingjiong Zhu
We derive the short-maturity asymptotics for option prices in the local volatility model in a new short-maturity limit $T\to 0$ at fixed $\rho = (r-q) T$, where $r$ is the interest rate and $q$ is the dividend yield.
no code implementations • 13 Feb 2024 • Shaeke Salman, Md Montasir Bin Shams, Xiuwen Liu, Lingjiong Zhu
Transformer-based models have dominated natural language processing and other areas in the last few years due to their superior (zero-shot) performance on benchmark datasets.
no code implementations • 31 Jan 2024 • Xuefeng Gao, Lingjiong Zhu
Score-based generative modeling with probability flow ordinary differential equations (ODEs) has achieved remarkable success in a variety of applications.
no code implementations • 18 Nov 2023 • Xuefeng Gao, Hoang M. Nguyen, Lingjiong Zhu
We find that the experimental results are in good agreement with our theoretical predictions on the iteration complexity, and the models with our newly proposed forward processes can outperform existing models.
no code implementations • 30 Aug 2023 • Dan Pirjol, Lingjiong Zhu
We present a study of the short-maturity asymptotics for Asian options in a jump-diffusion model with a local volatility component, where the jumps are modeled as a compound Poisson process.
no code implementations • 15 Jun 2023 • Dan Pirjol, Lingjiong Zhu
We present an asymptotic result for the Laplace transform of the time integral of the geometric Brownian motion $F(\theta, T) = \mathbb{E}[e^{-\theta X_T}]$ with $X_T = \int_0^T e^{\sigma W_s + (a - \frac{1}{2}\sigma^2)s}\, ds$, which is exact in the limit $\sigma^2 T \to 0$ at fixed $\sigma^2 \theta T^2$ and $aT$.
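This quantity is straightforward to estimate by Monte Carlo, which gives a simple sanity check for the asymptotic formula; the discretization grid and parameters below are arbitrary choices:

```python
import numpy as np

def laplace_transform_mc(theta, T, sigma, a, n_paths=20_000, n_steps=200, seed=0):
    """Monte Carlo estimate of F(theta, T) = E[exp(-theta * X_T)], where
    X_T = int_0^T exp(sigma * W_s + (a - sigma^2/2) s) ds (time-integrated GBM)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    dW = np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
    W = np.cumsum(dW, axis=1)                            # Brownian paths on the grid
    t = dt * np.arange(1, n_steps + 1)
    gbm = np.exp(sigma * W + (a - 0.5 * sigma**2) * t)   # e^{sigma W_s + (a - sigma^2/2) s}
    X_T = gbm.sum(axis=1) * dt                           # Riemann-sum approximation of the integral
    return np.exp(-theta * X_T).mean()

print(laplace_transform_mc(theta=1.0, T=1.0, sigma=0.2, a=0.05))
```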
no code implementations • 10 Feb 2023 • Mert Gürbüzbalaban, Yuanhan Hu, Umut Şimşekli, Lingjiong Zhu
Our results bring a new understanding of the benefits of cyclic and randomized stepsizes compared to constant stepsize in terms of the tail behavior.
no code implementations • 27 Jan 2023 • Anant Raj, Lingjiong Zhu, Mert Gürbüzbalaban, Umut Şimşekli
Very recently, new generalization bounds have been proven, indicating a non-monotonic relationship between the generalization error and heavy tails, which is more consistent with the reported empirical observations.
no code implementations • 16 Jan 2023 • Dan Pirjol, Lingjiong Zhu
We propose analytical approximations for the sensitivities (Greeks) of Asian options in the Black-Scholes model. These follow from a small-maturity/small-volatility approximation for the option prices that reproduces the exact short-maturity limit obtained using large deviations theory.
no code implementations • 16 Jan 2023 • Lingjiong Zhu
In this paper, we study a dual risk model with delays in the spirit of Dassios-Zhao.
no code implementations • 29 Nov 2022 • Mert Gürbüzbalaban, Yuanhan Hu, Lingjiong Zhu
When $f$ is smooth and gradients are available, we get $\tilde{\mathcal{O}}(d/\varepsilon^{10})$ iteration complexity for PLD to sample the target up to an $\varepsilon$-error, where the error is measured in the total variation (TV) distance and $\tilde{\mathcal{O}}(\cdot)$ hides logarithmic factors.
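Assuming PLD here denotes a penalized Langevin dynamics scheme, in which sampling constrained to a set is handled by adding a penalty for constraint violation to $f$ rather than by projecting, a minimal sketch for a Euclidean-ball constraint might look as follows (the quadratic penalty form, $\lambda$, and all parameters are illustrative assumptions, not the paper's algorithm):

```python
import numpy as np

def penalized_langevin(grad_f, x0, radius=1.0, lam=50.0, eta=1e-3, n_iter=20_000, seed=0):
    """Langevin dynamics on the penalized potential
    f(x) + lam * max(0, ||x|| - radius)^2: the penalty replaces a projection."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(n_iter):
        r = np.linalg.norm(x)
        excess = max(0.0, r - radius)
        # gradient of the quadratic penalty: zero inside the ball, radial outside
        pen_grad = 2.0 * lam * excess * x / r if excess > 0 else 0.0
        x = x - eta * (grad_f(x) + pen_grad) + np.sqrt(2 * eta) * rng.standard_normal(x.size)
    return x

# sample approximately from exp(-||x||^2 / 2) softly confined to the unit ball
x = penalized_langevin(lambda x: x, np.zeros(5))
```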
no code implementations • 2 Jun 2022 • Anant Raj, Melih Barsbey, Mert Gürbüzbalaban, Lingjiong Zhu, Umut Şimşekli
Recent studies have shown that heavy tails can emerge in stochastic optimization and that the heaviness of the tails has links to the generalization error.
no code implementations • 13 May 2022 • Mert Gurbuzbalaban, Yuanhan Hu, Umut Simsekli, Kun Yuan, Lingjiong Zhu
To have a more explicit control on the tail exponent, we then consider the case where the loss at each node is a quadratic, and show that the tail-index can be estimated as a function of the step-size, batch-size, and the topological properties of the network of the computational nodes.
no code implementations • NeurIPS 2021 • Alexander Camuto, George Deligiannidis, Murat A. Erdogdu, Mert Gürbüzbalaban, Umut Şimşekli, Lingjiong Zhu
As our main contribution, we prove that the generalization error of a stochastic optimization algorithm can be bounded based on the 'complexity' of the fractal structure that underlies its invariant measure.
no code implementations • NeurIPS 2021 • Hongjian Wang, Mert Gürbüzbalaban, Lingjiong Zhu, Umut Şimşekli, Murat A. Erdogdu
In this paper, we provide convergence guarantees for SGD under a state-dependent and heavy-tailed noise with a potentially infinite variance, for a class of strongly convex objectives.
1 code implementation • 13 Feb 2021 • Alexander Camuto, Xiaoyu Wang, Lingjiong Zhu, Chris Holmes, Mert Gürbüzbalaban, Umut Şimşekli
In this paper, we focus on the so-called 'implicit effect' of Gaussian noise injections (GNIs), which is the effect of the injected noise on the dynamics of SGD.
no code implementations • 1 Jul 2020 • Mert Gürbüzbalaban, Xuefeng Gao, Yuanhan Hu, Lingjiong Zhu
Stochastic gradient Langevin dynamics (SGLD) and stochastic gradient Hamiltonian Monte Carlo (SGHMC) are two popular Markov Chain Monte Carlo (MCMC) algorithms for Bayesian inference that can scale to large datasets, allowing one to sample from the posterior distribution of the parameters of a statistical model given the input data and the prior distribution over the model parameters.
1 code implementation • 8 Jun 2020 • Mert Gurbuzbalaban, Umut Şimşekli, Lingjiong Zhu
We claim that, depending on the structure of the Hessian of the loss at the minimum and the choices of the algorithm parameters $\eta$ (step-size) and $b$ (batch-size), the SGD iterates will converge to a heavy-tailed stationary distribution.
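One way to see this phenomenon empirically is to run SGD on a simple least-squares problem and apply a Hill-type estimator to the magnitudes of the stationary iterates. The following is a hypothetical illustration along these lines, not the paper's experiment; a tail-index estimate well below 2 is consistent with heavy tails.

```python
import numpy as np

def hill_estimator(samples, k=100):
    """Hill estimator of the tail index from the k largest magnitudes."""
    mags = np.sort(np.abs(samples))[::-1]
    logs = np.log(mags[:k]) - np.log(mags[k])
    return 1.0 / logs.mean()

def sgd_quadratic_tail(eta=0.9, b=1, d=1, n_data=1_000, n_iter=100_000, seed=0):
    """SGD on least squares with Gaussian data; large eta / small b
    tends to produce heavier-tailed stationary iterates."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n_data, d))
    y = rng.standard_normal(n_data)
    x = np.zeros(d)
    iterates = np.empty(n_iter)
    for t in range(n_iter):
        idx = rng.integers(0, n_data, size=b)        # minibatch of size b
        g = A[idx].T @ (A[idx] @ x - y[idx]) / b     # stochastic gradient
        x = x - eta * g
        iterates[t] = np.linalg.norm(x)
    return hill_estimator(iterates[n_iter // 2:])    # discard burn-in

print(sgd_quadratic_tail())
```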
no code implementations • 6 Apr 2020 • Yuanhan Hu, Xiaoyu Wang, Xuefeng Gao, Mert Gurbuzbalaban, Lingjiong Zhu
In this paper, we study the non-reversible stochastic gradient Langevin dynamics (NSGLD), which is based on a discretization of the non-reversible Langevin diffusion.
1 code implementation • ICML 2020 • Umut Şimşekli, Lingjiong Zhu, Yee Whye Teh, Mert Gürbüzbalaban
Stochastic gradient descent with momentum (SGDm) is one of the most popular optimization algorithms in deep learning.
no code implementations • 27 Jan 2020 • Dan Pirjol, Lingjiong Zhu
We derive an almost sure limit and a large deviations result for the log-asset price in the limit of a large number of time steps.
no code implementations • 19 Oct 2019 • Alireza Fallah, Mert Gurbuzbalaban, Asuman Ozdaglar, Umut Simsekli, Lingjiong Zhu
When gradients do not contain noise, we also prove that distributed accelerated methods can achieve acceleration, requiring $\mathcal{O}(\sqrt{\kappa} \log(1/\varepsilon))$ gradient evaluations and $\mathcal{O}(\sqrt{\kappa} \log(1/\varepsilon))$ communications to converge to the same fixed point as the non-accelerated variant, where $\kappa$ is the condition number and $\varepsilon$ is the target accuracy.
no code implementations • 22 Jan 2019 • Bugra Can, Mert Gurbuzbalaban, Lingjiong Zhu
In the special case of strongly convex quadratic objectives, we can show accelerated linear rates in the $p$-Wasserstein metric for any $p\geq 1$ with improved sensitivity to noise for both AG and HB through a non-asymptotic analysis under some additional assumptions on the noise structure.
no code implementations • 19 Dec 2018 • Xuefeng Gao, Mert Gurbuzbalaban, Lingjiong Zhu
We study two variants that are based on non-reversible Langevin diffusions: the underdamped Langevin dynamics (ULD) and the Langevin dynamics with a non-symmetric drift (NLD).
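For reference, ULD augments the state with a velocity variable. Below is a rough Euler-Maruyama sketch of it, with arbitrary friction and step-size, not the paper's exact discretization:

```python
import numpy as np

def uld_sketch(grad_f, x0, gamma=2.0, eta=1e-2, n_iter=10_000, seed=0):
    """Euler-Maruyama discretization of underdamped Langevin dynamics:
    dv = -(gamma * v + grad f(x)) dt + sqrt(2 * gamma) dW,  dx = v dt."""
    rng = np.random.default_rng(seed)
    x, v = x0.copy(), np.zeros_like(x0)
    for _ in range(n_iter):
        v = (v - eta * (gamma * v + grad_f(x))
             + np.sqrt(2 * gamma * eta) * rng.standard_normal(x.size))
        x = x + eta * v
    return x

x = uld_sketch(lambda x: x, np.ones(3))   # target: exp(-||x||^2 / 2)
```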
no code implementations • 12 Sep 2018 • Xuefeng Gao, Mert Gürbüzbalaban, Lingjiong Zhu
We provide finite-time performance bounds for the global convergence of both SGHMC variants for solving stochastic non-convex optimization problems with explicit constants.
no code implementations • 13 Jan 2016 • Arash Fahim, Lingjiong Zhu
The dual risk model is a popular model in finance and insurance, often used to model the wealth process of a venture capital firm or a high-tech company.
no code implementations • 16 Oct 2015 • Arash Fahim, Lingjiong Zhu
In this paper, we study the optimal investment strategy in research and development for dual risk models, so as to minimize the ruin probability of the underlying company.
no code implementations • 13 Oct 2015 • Lingjiong Zhu
In a dual risk model, the premiums are considered as the costs and the claims are regarded as the profits.
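Concretely, in the simplest compound Poisson dual risk model the surplus decreases at a constant expense rate and jumps upward at random profit-arrival times, and the ruin probability can be estimated by simulation. A toy Monte Carlo sketch with illustrative parameters:

```python
import numpy as np

def dual_ruin_prob(x0=5.0, c=1.0, lam=0.8, mean_gain=1.5,
                   horizon=200.0, n_paths=20_000, seed=0):
    """Monte Carlo ruin probability for a compound Poisson dual risk model:
    X_t = x0 - c*t + sum of exponential(mean_gain) gains at Poisson(lam) times.
    Ruin occurs if the surplus hits zero between gains."""
    rng = np.random.default_rng(seed)
    ruined = 0
    for _ in range(n_paths):
        x, t = x0, 0.0
        while t < horizon:
            wait = rng.exponential(1.0 / lam)        # time until the next profit jump
            if x - c * wait <= 0:                    # surplus drained before it arrives
                ruined += 1
                break
            t += wait
            x += rng.exponential(mean_gain) - c * wait
        # surviving to the horizon counts as no ruin (finite-horizon estimate)
    return ruined / n_paths

print(dual_ruin_prob())
```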