Search Results for author: Xiang Cheng

Found 30 papers, 3 papers with code

Efficient Sampling on Riemannian Manifolds via Langevin MCMC

no code implementations 15 Feb 2024 Xiang Cheng, Jingzhao Zhang, Suvrit Sra

We study the task of efficiently sampling from a Gibbs distribution $d\pi^* = e^{-h}\, d\mathrm{vol}_g$ over a Riemannian manifold $M$ via (geometric) Langevin MCMC; this algorithm involves computing exponential maps in random Gaussian directions and is efficiently implementable in practice.

Transformers Implement Functional Gradient Descent to Learn Non-Linear Functions In Context

no code implementations 11 Dec 2023 Xiang Cheng, Yuxin Chen, Suvrit Sra

Many neural network architectures are known to be Turing Complete, and can thus, in principle, implement arbitrary algorithms.

In-Context Learning

Novel 3D Geometry-Based Stochastic Models for Non-Isotropic MIMO Vehicle-to-Vehicle Channels

no code implementations 1 Dec 2023 Yi Yuan, Cheng-Xiang Wang, Xiang Cheng, Bo Ai, David I. Laurenson

Moreover, a novel parameter computation method is proposed for jointly calculating the azimuth and elevation angles in the SoS channel simulator.


Integrated Sensing and Communications Towards Proactive Beamforming in mmWave V2I via Multi-Modal Feature Fusion (MMFF)

no code implementations 4 Oct 2023 Haotian Zhang, Shijian Gao, Xiang Cheng, Liuqing Yang

The future of vehicular communication networks relies on mmWave massive multi-input-multi-output antenna arrays for intensive data transfer and massive vehicle access.

Linear attention is (maybe) all you need (to understand transformer optimization)

1 code implementation 2 Oct 2023 Kwangjun Ahn, Xiang Cheng, Minhak Song, Chulhee Yun, Ali Jadbabaie, Suvrit Sra

Transformer training is notoriously difficult, requiring a careful design of optimizers and use of various heuristics.

Restart Sampling for Improving Generative Processes

1 code implementation NeurIPS 2023 Yilun Xu, Mingyang Deng, Xiang Cheng, Yonglong Tian, Ziming Liu, Tommi Jaakkola

Restart not only outperforms the previous best SDE results, but also accelerates the sampling speed by 10-fold / 2-fold on CIFAR-10 / ImageNet $64 \times 64$.


Intelligent Multi-Modal Sensing-Communication Integration: Synesthesia of Machines

no code implementations 25 Jun 2023 Xiang Cheng, Haotian Zhang, Jianan Zhang, Shijian Gao, Sijiang Li, Ziwei Huang, Lu Bai, Zonghui Yang, Xinhu Zheng, Liuqing Yang

Currently, some research efforts have been devoted to exploring multi-modal sensing-communication integration but still lack a comprehensive review.

Fast Conditional Mixing of MCMC Algorithms for Non-log-concave Distributions

no code implementations NeurIPS 2023 Xiang Cheng, Bohan Wang, Jingzhao Zhang, Yusong Zhu

However, on the theory side, MCMC algorithms suffer from slow mixing rate when $\pi(x)$ is non-log-concave.

Toward 6G with Terahertz Communications: Understanding the Propagation Channels

no code implementations 16 Sep 2022 Xuesong Cai, Xiang Cheng, Fredrik Tufvesson

This article aims at providing insights for a comprehensive understanding of terahertz (THz) propagation channels.

Population-coding and Dynamic-neurons improved Spiking Actor Network for Reinforcement Learning

no code implementations 15 Jun 2021 Duzhen Zhang, Tielin Zhang, Shuncheng Jia, Xiang Cheng, Bo Xu

Based on a hybrid learning framework, where a spike actor-network infers actions from states and a deep critic network evaluates the actor, we propose a Population-coding and Dynamic-neurons improved Spiking Actor Network (PDSAN) for efficient state representation from two different scales: input coding and neuronal coding.

OpenAI Gym, Reinforcement Learning

Optimal dimension dependence of the Metropolis-Adjusted Langevin Algorithm

no code implementations 23 Dec 2020 Sinho Chewi, Chen Lu, Kwangjun Ahn, Xiang Cheng, Thibaut Le Gouic, Philippe Rigollet

Conventional wisdom in the sampling literature, backed by a popular diffusion scaling limit, suggests that the mixing time of the Metropolis-Adjusted Langevin Algorithm (MALA) scales as $O(d^{1/3})$, where $d$ is the dimension.

KHOVID: Interoperable Privacy Preserving Digital Contact Tracing

no code implementations 17 Dec 2020 Xiang Cheng, Hanchao Yang, Archanaa S Krishnan, Patrick Schaumont, Yaling Yang

To accelerate the laborious manual contact tracing process, digital contact tracing (DCT) tools can track contact events transparently and privately by using the sensing and signaling capabilities of the ubiquitous cell phone.

Cryptography and Security, Computers and Society

An End-to-End Solution for Named Entity Recognition in eCommerce Search

no code implementations 11 Dec 2020 Xiang Cheng, Mitchell Bowden, Bhushan Ramesh Bhange, Priyanka Goyal, Thomas Packer, Faizan Javed

Beyond our application, this TripleLearn framework, as well as the end-to-end process, is model-independent and problem-independent, so it can be generalized to more industrial applications, especially to the eCommerce industry which has similar data foundations and problems.

Named Entity Recognition

Tuning Convolutional Spiking Neural Network with Biologically-plausible Reward Propagation

1 code implementation 9 Oct 2020 Tielin Zhang, Shuncheng Jia, Xiang Cheng, Bo Xu

The performance of the proposed BRP-SNN is further verified on the spatial (including MNIST and CIFAR-10) and temporal (including TIDigits and DvsGesture) tasks, where the SNN using BRP has reached a similar accuracy compared to other state-of-the-art BP-based SNNs and saved 50% more computational cost than ANNs.

Finite Meta-Dynamic Neurons in Spiking Neural Networks for Spatio-temporal Learning

no code implementations 7 Oct 2020 Xiang Cheng, Tielin Zhang, Shuncheng Jia, Bo Xu

Spiking Neural Networks (SNNs) have incorporated more biologically-plausible structures and learning principles, hence are playing critical roles in bridging the gap between artificial and natural neural networks.

Cooperative Reasoning on Knowledge Graph and Corpus: A Multi-agent Reinforcement Learning Approach

no code implementations 4 Dec 2019 Yunan Zhang, Xiang Cheng, Heting Gao, ChengXiang Zhai

We model the question answering on KG as a cooperative task between two agents, a knowledge graph reasoning agent and an information extraction agent.

Question Answering

Stochastic Gradient and Langevin Processes

no code implementations ICML 2020 Xiang Cheng, Dong Yin, Peter L. Bartlett, Michael I. Jordan

We prove quantitative convergence rates at which discrete Langevin-like processes converge to the invariant distribution of a related stochastic differential equation.

Is There an Analog of Nesterov Acceleration for MCMC?

no code implementations 4 Feb 2019 Yi-An Ma, Niladri Chatterji, Xiang Cheng, Nicolas Flammarion, Peter Bartlett, Michael I. Jordan

We formulate gradient-based Markov chain Monte Carlo (MCMC) sampling as optimization on the space of probability measures, with Kullback-Leibler (KL) divergence as the objective functional.

Quantitative Weak Convergence for Discrete Stochastic Processes

no code implementations 3 Feb 2019 Xiang Cheng, Peter L. Bartlett, Michael I. Jordan

In this paper, we establish quantitative convergence in $W_2$ for a family of Langevin-like stochastic processes that includes stochastic gradient descent and related gradient-based algorithms.

Sharp convergence rates for Langevin dynamics in the nonconvex setting

no code implementations 4 May 2018 Xiang Cheng, Niladri S. Chatterji, Yasin Abbasi-Yadkori, Peter L. Bartlett, Michael I. Jordan

We study the problem of sampling from a distribution $p^*(x) \propto \exp\left(-U(x)\right)$, where the function $U$ is $L$-smooth everywhere and $m$-strongly convex outside a ball of radius $R$, but potentially nonconvex inside this ball.

Underdamped Langevin MCMC: A non-asymptotic analysis

no code implementations 12 Jul 2017 Xiang Cheng, Niladri S. Chatterji, Peter L. Bartlett, Michael I. Jordan

We study the underdamped Langevin diffusion when the log of the target distribution is smooth and strongly concave.

Convergence of Langevin MCMC in KL-divergence

no code implementations 25 May 2017 Xiang Cheng, Peter Bartlett

Langevin diffusion is a commonly used tool for sampling from a given distribution.

FLAG n' FLARE: Fast Linearly-Coupled Adaptive Gradient Methods

no code implementations 26 May 2016 Xiang Cheng, Farbod Roosta-Khorasani, Stefan Palombo, Peter L. Bartlett, Michael W. Mahoney

We consider first order gradient methods for effectively optimizing a composite objective in the form of a sum of smooth and, potentially, non-smooth functions.

Asymptotic behavior of $\ell_p$-based Laplacian regularization in semi-supervised learning

no code implementations 2 Mar 2016 Ahmed El Alaoui, Xiang Cheng, Aaditya Ramdas, Martin J. Wainwright, Michael I. Jordan

Together, these properties show that $p = d+1$ is an optimal choice, yielding a function estimate $\hat{f}$ that is both smooth and non-degenerate, while remaining maximally sensitive to $P$.
