no code implementations • 11 Jun 2025 • Zhenqiao Song, Ramith Hettiarachchi, Chuan Li, Jianwen Xie, Lei LI
Given a textual description of the desired function and a ligand formula in SMILES, InstructPro generates protein sequences that are functionally consistent with the specified instructions.
no code implementations • 3 Jun 2025 • Yide Ran, Wentao Guo, Jingwei Sun, Yanzhou Pan, Xiaodong Yu, Hao Wang, Jianwen Xie, Yiran Chen, Denghui Zhang, Zhaozhuo Xu
Experiments confirm that Meerkat and Meerkat-vp significantly improve the efficiency and effectiveness of ZO federated LLM fine-tuning.
no code implementations • 6 May 2025 • Donghun Noh, Deqian Kong, Minglu Zhao, Andrew Lizarraga, Jianwen Xie, Ying Nian Wu, Dennis Hong
This paper presents Latent Adaptive Planner (LAP), a novel approach for dynamic nonprehensile manipulation tasks that formulates planning as latent space inference, effectively learned from human demonstration videos.
no code implementations • 28 Apr 2025 • Wufei Ma, Yu-Cheng Chou, Qihao Liu, Xingrui Wang, Celso de Melo, Jianwen Xie, Alan Yuille
Despite recent advances on multi-modal models, 3D spatial reasoning remains a challenging task for state-of-the-art open-source and proprietary models.
1 code implementation • 20 Apr 2025 • Enxin Song, Wenhao Chai, Weili Xu, Jianwen Xie, Yuxuan Liu, Gaoang Wang
Recent advancements in language multimodal models (LMMs) for video have demonstrated their potential for understanding video content, yet the task of comprehending multi-discipline lectures remains largely unexplored.
no code implementations • 25 Feb 2025 • Keqiang Yan, Montgomery Bohde, Andrii Kryvenko, Ziyu Xiang, Kaiji Zhao, Siya Zhu, Saagar Kolachina, Doğuhan Sarıtürk, Jianwen Xie, Raymundo Arroyave, Xiaoning Qian, Xiaofeng Qian, Shuiwang Ji
Machine learning interatomic potentials (MLIPs) can predict energy, force, and stress of materials and enable a wide range of downstream discovery tasks.
no code implementations • 3 Feb 2025 • Deqian Kong, Minglu Zhao, Dehong Xu, Bo Pang, Shu Wang, Edouardo Honig, Zhangzhang Si, Chuan Li, Jianwen Xie, Sirui Xie, Ying Nian Wu
We propose a novel family of language models, Latent-Thought Language Models (LTMs), which incorporate explicit latent thought vectors that follow an explicit prior model in latent space.
no code implementations • 5 Sep 2024 • Sheng Cheng, Deqian Kong, Jianwen Xie, Kookjin Lee, Ying Nian Wu, Yezhou Yang
This family of models generates each data point in the time series by a neural emission model, which is a non-linear transformation of a latent state vector.
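As a sketch of this generative mechanism, the code below uses a hypothetical latent transition and a tiny one-hidden-unit network as a stand-in for the learned neural emission model:

```python
import math, random

def transition(z):
    # hypothetical latent-state transition: damped linear map plus noise
    return 0.9 * z + random.gauss(0.0, 0.1)

def emission(z):
    # stand-in for the non-linear neural emission model:
    # a single tanh unit with illustrative weights
    h = math.tanh(2.0 * z + 0.1)
    return 1.5 * h - 0.3

def generate_series(T=50, z0=0.0):
    # each observed data point is a non-linear transformation
    # of the evolving latent state vector
    z, xs = z0, []
    for _ in range(T):
        z = transition(z)
        xs.append(emission(z))
    return xs
```

The emission output is bounded by the tanh non-linearity, so the generated series stays in a fixed range regardless of the latent trajectory.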
no code implementations • 27 May 2024 • Peiyu Yu, Dinghuai Zhang, Hengzhi He, Xiaojian Ma, Ruiyao Miao, Yifan Lu, Yasi Zhang, Deqian Kong, Ruiqi Gao, Jianwen Xie, Guang Cheng, Ying Nian Wu
To this end, we formulate an learnable energy-based latent space, and propose Noise-intensified Telescoping density-Ratio Estimation (NTRE) scheme for variational learning of an accurate latent space model without costly Markov Chain Monte Carlo.
no code implementations • 27 Feb 2024 • Deqian Kong, Yuhao Huang, Jianwen Xie, Edouardo Honig, Ming Xu, Shuanghong Xue, Pei Lin, Sanping Zhou, Sheng Zhong, Nanning Zheng, Ying Nian Wu
We propose the Latent Prompt Transformer (LPT), a novel generative model comprising three components: (1) a latent vector with a learnable prior distribution modeled by a neural transformation of Gaussian white noise; (2) a molecule generation model based on a causal Transformer, which uses the latent vector as a prompt; and (3) a property prediction model that predicts a molecule's target properties and/or constraint values using the latent prompt.
1 code implementation • 7 Feb 2024 • Deqian Kong, Dehong Xu, Minglu Zhao, Bo Pang, Jianwen Xie, Andrew Lizarraga, Yuhao Huang, Sirui Xie, Ying Nian Wu
We introduce the Latent Plan Transformer (LPT), a novel model that leverages a latent variable to connect a Transformer-based trajectory generator and the final return.
no code implementations • 19 Oct 2023 • Belhal Karimi, Jianwen Xie, Ping Li
In this paper, we propose STANLEY, a STochastic gradient ANisotropic LangEvin dYnamics method for sampling high-dimensional data.
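A generic anisotropic Langevin update can be sketched as follows; the RMSProp-style per-coordinate preconditioner and the toy ill-conditioned energy are illustrative assumptions, not the paper's exact scheme:

```python
import math, random

def grad_energy(x):
    # toy ill-conditioned Gaussian energy E(x) = x0^2/2 + x1^2/200
    return [x[0], x[1] / 100.0]

def anisotropic_langevin(x, steps=500, base_step=0.1, eps=1e-6):
    # running squared gradients give each coordinate its own step size,
    # so flat directions take larger steps than steep ones
    v = [1.0, 1.0]
    for _ in range(steps):
        g = grad_energy(x)
        for i in range(2):
            v[i] = 0.9 * v[i] + 0.1 * g[i] * g[i]
            s = min(base_step / (math.sqrt(v[i]) + eps), 1.0)  # capped step
            # Langevin update: drift down the energy plus scaled noise
            x[i] += -0.5 * s * g[i] + math.sqrt(s) * random.gauss(0.0, 1.0)
    return x
```

An isotropic sampler would use a single scalar step for both coordinates and mix slowly along the flat direction; the per-coordinate preconditioner is what makes the dynamics anisotropic.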
no code implementations • 5 Oct 2023 • Deqian Kong, Yuhao Huang, Jianwen Xie, Ying Nian Wu
This paper proposes a latent prompt Transformer model for solving challenging optimization problems such as molecule design, where the goal is to find molecules with optimal values of a target chemical or biological property that can be computed by an existing software.
1 code implementation • 10 Sep 2023 • Yaxuan Zhu, Jianwen Xie, Ying Nian Wu, Ruiqi Gao
Training energy-based models (EBMs) on high-dimensional data can be both challenging and time-consuming, and there exists a noticeable gap in sample quality between EBMs and other generative frameworks like GANs and diffusion models.
no code implementations • 26 Jun 2023 • Weinan Song, Yaxuan Zhu, Lei He, Ying Nian Wu, Jianwen Xie
The translator, style encoder, and style generator together constitute a diversified image generator.
no code implementations • 16 Apr 2023 • Yaxuan Zhu, Jianwen Xie, Ping Li
We propose the NeRF-LEBM, a likelihood-based top-down 3D-aware 2D image generative model that incorporates 3D representation via Neural Radiance Fields (NeRF) and 2D imaging process via differentiable volume rendering.
no code implementations • 21 Mar 2023 • Yang Zhao, Jianwen Xie, Ping Li
The proposed algorithm consists of two learning stages: (i) Cooperative initialization stage: The discriminator of GAN is treated as an energy-based model (EBM) and is optimized via maximum likelihood estimation (MLE), with the help of the GAN's generator to provide synthetic data to approximate the learning gradients.
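For a toy linear energy, the MLE gradient in this stage reduces to a contrast between real data and generator-synthesized samples; the scalar energy and the Gaussian stand-in for the GAN's generator below are illustrative assumptions:

```python
import random

# Toy EBM with energy E_theta(x) = -theta * x on scalar data, for which the
# log-likelihood gradient is  mean(x_data) - mean(x_model).  In the cooperative
# scheme the intractable model expectation is approximated with samples from
# the GAN's generator.

def generator_sample(n):
    # illustrative stand-in for the generator providing synthetic data
    return [random.gauss(0.0, 1.0) for _ in range(n)]

def ebm_mle_gradient(data, n_synth=20000):
    synth = generator_sample(n_synth)
    positive = sum(data) / len(data)    # lowers energy on real data
    negative = sum(synth) / len(synth)  # raises energy on synthesized data
    return positive - negative
```

With real data centered at 2 and generator samples centered at 0, the estimated gradient is close to 2, pulling the energy model toward the data distribution.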
no code implementations • 23 Jan 2023 • Jianwen Xie, Yaxuan Zhu, Yifei Xu, Dingcheng Li, Ping Li
We study a normalizing flow in the latent space of a top-down generator model, in which the normalizing flow model plays the role of the informative prior model of the generator.
no code implementations • 9 Oct 2022 • Khoa D. Doan, Jianwen Xie, Yaxuan Zhu, Yang Zhao, Ping Li
Leveraging supervised information can lead to superior retrieval performance in the image hashing domain, but performance degrades significantly without enough labeled data.
no code implementations • ICLR 2022 • Jianwen Xie, Yaxuan Zhu, Jun Li, Ping Li
Under the short-run non-mixing MCMC scenario, the estimation of the energy-based model is shown to follow the perturbation of maximum likelihood, and the short-run Langevin flow and the normalizing flow form a two-flow generator that we call CoopFlow.
1 code implementation • 19 Apr 2022 • Jing Zhang, Jianwen Xie, Nick Barnes, Ping Li
We propose a novel generative saliency prediction framework that adopts an informative energy-based model as a prior distribution.
no code implementations • NeurIPS 2021 • Jing Zhang, Jianwen Xie, Nick Barnes, Ping Li
In this paper, we take a step further by proposing a novel generative vision transformer with latent variables following an informative energy-based prior for salient object detection.
no code implementations • 29 Sep 2021 • Nanqing Dong, Jianwen Xie, Ping Li
We present a simple yet robust noise synthesis framework based on unsupervised contrastive learning.
1 code implementation • CVPR 2022 • Zongsheng Yue, Qian Zhao, Jianwen Xie, Lei Zhang, Deyu Meng, Kwan-Yee K. Wong
To address the above issues, this paper proposes a model-based blind SISR method under the probabilistic framework, which elaborately models image degradation from the perspectives of noise and blur kernel.
1 code implementation • 25 Jun 2021 • Jing Zhang, Jianwen Xie, Zilong Zheng, Nick Barnes
In this paper, to model the uncertainty of visual saliency, we study the saliency prediction problem from the perspective of generative models by learning a conditional probability distribution over the saliency map given an input image, and treating the saliency prediction as a sampling process from the learned distribution.
no code implementations • CVPR 2021 • Dongsheng An, Jianwen Xie, Ping Li
Learning latent variable models with deep top-down architectures typically requires inferring the latent variables for each training example based on the posterior distribution of these latent variables.
no code implementations • CVPR 2021 • Zilong Zheng, Jianwen Xie, Ping Li
Exploiting internal statistics of a single natural image has long been recognized as a significant research paradigm where the goal is to learn the distribution of patches within the image without relying on external training data.
1 code implementation • CVPR 2021 • Zongsheng Yue, Jianwen Xie, Qian Zhao, Deyu Meng
First, most existing methods do not sufficiently model the characteristics of the rain layers in rainy videos.
no code implementations • 7 Mar 2021 • Jianwen Xie, Zilong Zheng, Xiaolin Fang, Song-Chun Zhu, Ying Nian Wu
This paper studies the unsupervised cross-domain translation problem by proposing a generative framework, in which the probability distribution of each domain is represented by a generative cooperative network that consists of an energy-based model and a latent variable model.
no code implementations • ICLR 2021 • Yang Zhao, Jianwen Xie, Ping Li
Energy-based models (EBMs) for generative modeling parametrize a single net and can be directly trained by maximum likelihood estimation.
no code implementations • 29 Dec 2020 • Jianwen Xie, Zilong Zheng, Ping Li
In this paper, we propose to learn a variational auto-encoder (VAE) to initialize the finite-step MCMC, such as Langevin dynamics that is derived from the energy function, for efficient amortized sampling of the EBM.
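A minimal sketch of this amortized sampling scheme, using a toy quadratic energy and a hypothetical stand-in for the learned VAE decoder:

```python
import math, random

def energy_grad(x):
    # gradient of a toy energy E(x) = (x - 3)^2 / 2, mode at x = 3
    return x - 3.0

def vae_init():
    # stand-in for decoding a Gaussian latent through a trained VAE decoder;
    # a good amortized initializer already lands near the model's modes
    return 3.0 + random.gauss(0.0, 0.5)

def amortized_langevin_sample(K=20, step=0.1):
    x = vae_init()                          # VAE initialization
    for _ in range(K):                      # finite-step Langevin refinement
        x += -0.5 * step * energy_grad(x) + math.sqrt(step) * random.gauss(0.0, 1.0)
    return x
```

Starting the chain from the VAE output rather than from noise is what makes the finite-step MCMC effective: only a short refinement is needed instead of a long burn-in.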
no code implementations • 25 Dec 2020 • Jianwen Xie, Zilong Zheng, Ruiqi Gao, Wenguan Wang, Song-Chun Zhu, Ying Nian Wu
3D data that contains rich geometry information of objects and scenes is valuable for understanding the 3D physical world.
no code implementations • 28 Sep 2020 • Ruiqi Gao, Jianwen Xie, Xue-Xin Wei, Song-Chun Zhu, Ying Nian Wu
The grid cells in the mammalian medial entorhinal cortex exhibit striking hexagon firing patterns when the agent navigates in the open field.
no code implementations • ECCV 2020 • Jing Zhang, Jianwen Xie, Nick Barnes
The proposed model consists of two sub-models parameterized by neural networks: (1) a saliency predictor that maps input images to clean saliency maps, and (2) a noise generator, which is a latent variable model that produces noises from Gaussian latent vectors.
1 code implementation • NeurIPS 2021 • Ruiqi Gao, Jianwen Xie, Xue-Xin Wei, Song-Chun Zhu, Ying Nian Wu
In this paper, we conduct theoretical analysis of a general representation model of path integration by grid cells, where the 2D self-position is encoded as a higher dimensional vector, and the 2D self-motion is represented by a general transformation of the vector.
1 code implementation • CVPR 2021 • Jianwen Xie, Yifei Xu, Zilong Zheng, Song-Chun Zhu, Ying Nian Wu
We propose a generative model of unordered point sets, such as point clouds, in the form of an energy-based model, where the energy function is parameterized by an input-permutation-invariant bottom-up neural network.
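The key property is that sum-pooling per-point features before the final score makes the energy invariant to the ordering of the input points; the per-point features and top-layer weights below are hypothetical stand-ins for the learned bottom-up network:

```python
def point_feature(p):
    # hypothetical per-point feature map, shared across all points
    x, y = p
    return [x * x, y * y, x * y]

def set_energy(points):
    # summing features over points (DeepSets-style pooling) makes the
    # energy input-permutation-invariant
    pooled = [0.0, 0.0, 0.0]
    for p in points:
        f = point_feature(p)
        for i in range(3):
            pooled[i] += f[i]
    w = [0.5, -0.2, 1.0]  # illustrative top-layer weights
    return sum(wi * fi for wi, fi in zip(w, pooled))
```

Reordering the points permutes the summands but not the pooled sum, so the energy is unchanged.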
no code implementations • 26 Nov 2019 • Jianwen Xie, Ruiqi Gao, Zilong Zheng, Song-Chun Zhu, Ying Nian Wu
To model the motions explicitly, it is natural for the model to be based on the motions or the displacement fields of the pixels.
no code implementations • 26 Nov 2019 • Jianwen Xie, Ruiqi Gao, Erik Nijkamp, Song-Chun Zhu, Ying Nian Wu
Learning representations of data is an important problem in statistics and machine learning.
no code implementations • 26 Sep 2019 • Jianwen Xie, Song-Chun Zhu, Ying Nian Wu
We show that an energy-based spatial-temporal generative ConvNet can be used to model and synthesize dynamic patterns.
no code implementations • IEEE Transactions on Pattern Analysis and Machine Intelligence 2019 • Yuanlu Xu, Wenguan Wang, Xiaobai Liu, Jianwen Xie, Song-Chun Zhu
In this paper, we propose a pose grammar to tackle the problem of 3D human pose estimation from a monocular RGB image.
Ranked #15 on 3D Human Pose Estimation on HumanEva-I
1 code implementation • ICCV 2019 • Yizhe Zhu, Jianwen Xie, Bingchen Liu, Ahmed Elgammal
We investigate learning feature-to-feature translator networks by alternating back-propagation as a general-purpose solution to zero-shot learning (ZSL) problems.
no code implementations • 10 Apr 2019 • Yifei Xu, Jianwen Xie, Tianyang Zhao, Chris Baker, Yibiao Zhao, Ying Nian Wu
The problem of continuous inverse optimal control (over finite time horizon) is to learn the unknown cost function over the sequence of continuous control variables from expert demonstrations.
no code implementations • NeurIPS 2019 • Yizhe Zhu, Jianwen Xie, Zhiqiang Tang, Xi Peng, Ahmed Elgammal
Zero-shot learning extends the conventional object classification to the unseen class recognition by introducing semantic representations of classes.
no code implementations • 7 Feb 2019 • Jianwen Xie, Zilong Zheng, Xiaolin Fang, Song-Chun Zhu, Ying Nian Wu
This paper studies the problem of learning the conditional distribution of a high-dimensional output given an input, where the output and input may belong to two different domains, e.g., the output is a photo image and the input is a sketch image.
no code implementations • 24 Jan 2019 • Ruiqi Gao, Jianwen Xie, Siyuan Huang, Yufan Ren, Song-Chun Zhu, Ying Nian Wu
This paper proposes a representational model for image pairs such as consecutive video frames that are related by local pixel displacements, in the hope that the model may shed light on motion perception in primary visual cortex (V1).
no code implementations • 27 Dec 2018 • Jianwen Xie, Ruiqi Gao, Zilong Zheng, Song-Chun Zhu, Ying Nian Wu
The non-linear transformation of this transition model can be parametrized by a feedforward neural network.
no code implementations • 19 Nov 2018 • Yunlu Xu, Chengwei Zhang, Zhanzhan Cheng, Jianwen Xie, Yi Niu, ShiLiang Pu, Fei Wu
Finally, we transform the output of recurrent neural network into the corresponding action distribution.
1 code implementation • ICLR 2019 • Ruiqi Gao, Jianwen Xie, Song-Chun Zhu, Ying Nian Wu
In this model, the 2D self-position of the agent is represented by a high-dimensional vector, and the 2D self-motion or displacement of the agent is represented by a matrix that transforms the vector.
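In one frequency module with a 1D position, this representation reduces to a unit vector rotated by a motion-dependent matrix; the sketch below illustrates that structure (the scale parameter is an illustrative assumption, and the full model stacks many higher-dimensional modules):

```python
import math

def encode(pos, scale=0.1):
    # encode a 1D self-position as a 2D unit vector (one frequency module)
    theta = pos * scale
    return [math.cos(theta), math.sin(theta)]

def motion_matrix(dpos, scale=0.1):
    # self-motion acts on the position code as a rotation matrix M(dpos)
    a = dpos * scale
    return [[math.cos(a), -math.sin(a)],
            [math.sin(a),  math.cos(a)]]

def apply(M, v):
    # path integration: v(pos + dpos) = M(dpos) v(pos)
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]
```

Because rotations compose additively in angle, applying the motion matrix for a displacement of 2 to the code for position 5 lands exactly on the code for position 7.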
1 code implementation • CVPR 2018 • Hao-Shu Fang, Guansong Lu, Xiaolin Fang, Jianwen Xie, Yu-Wing Tai, Cewu Lu
In this paper, we present a novel method to generate synthetic human part segmentation data using easily-obtained human keypoint annotations.
Ranked #4 on Human Part Segmentation on PASCAL-Part (using extra training data)
1 code implementation • CVPR 2018 • Jianwen Xie, Zilong Zheng, Ruiqi Gao, Wenguan Wang, Song-Chun Zhu, Ying Nian Wu
This paper proposes a 3D shape descriptor network, which is a deep convolutional energy-based model, for modeling volumetric shape patterns.
no code implementations • CVPR 2018 • Yuanlu Xu, Lei Qin, Xiaobai Liu, Jianwen Xie, Song-Chun Zhu
We introduce a Causal And-Or Graph (C-AOG) to represent the causal-effect relations between an object's visibility fluent and its activities, and develop a probabilistic graph model to jointly reason the visibility fluent change (e.g., from visible to invisible) and track humans in videos.
no code implementations • CVPR 2017 • Jianwen Xie, Yifei Xu, Erik Nijkamp, Ying Nian Wu, Song-Chun Zhu
This paper proposes a method for generative learning of hierarchical random field models.
no code implementations • ICCV 2017 • Wenguan Wang, Jianbing Shen, Jianwen Xie, Fatih Porikli
We introduce a novel semi-supervised video segmentation approach based on an efficient video representation, called as "super-trajectory".
no code implementations • 29 Sep 2016 • Jianwen Xie, Yang Lu, Ruiqi Gao, Song-Chun Zhu, Ying Nian Wu
Specifically, within each iteration of the cooperative learning algorithm, the generator model generates initial synthesized examples to initialize a finite-step MCMC that samples and trains the energy-based descriptor model.
no code implementations • 1 Jul 2016 • Jianwen Xie, Pamela K. Douglas, Ying Nian Wu, Arthur L. Brody, Ariana E. Anderson
Spatial sparse coding algorithms ($L1$ Regularized Learning and K-SVD) would impose local specialization and a discouragement of multitasking, where the total observed activity in a single voxel originates from a restricted number of possible brain networks.
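A minimal ISTA solver shows how the $L1$ penalty enforces this restriction: the soft-threshold step zeroes out weak coefficients, so each voxel's activity is explained by only a few networks. The tiny dictionary in the test is illustrative:

```python
def soft_threshold(v, lam):
    # proximal operator of the L1 penalty: shrinks small coefficients to
    # exactly zero, which imposes the "local specialization" behavior
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0

def ista(D, x, lam=0.1, step=0.1, iters=200):
    # minimal ISTA for  min_a 0.5 * ||x - D a||^2 + lam * ||a||_1
    # D: list of dictionary columns (candidate network maps)
    # x: observed voxel activity vector
    n = len(D)
    a = [0.0] * n
    for _ in range(iters):
        # residual r = x - D a, shared by all coordinates this iteration
        r = [xi - sum(D[j][i] * a[j] for j in range(n)) for i, xi in enumerate(x)]
        for j in range(n):
            g = sum(D[j][i] * r[i] for i in range(len(x)))  # gradient step
            a[j] = soft_threshold(a[j] + step * g, step * lam)
    return a
```

With an orthonormal two-column dictionary and a signal aligned with the first column, the recovered code puts all weight (shrunk by the penalty) on that column and leaves the second exactly zero.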
no code implementations • CVPR 2017 • Jianwen Xie, Song-Chun Zhu, Ying Nian Wu
We show that a spatial-temporal generative ConvNet can be used to model and synthesize dynamic patterns.
no code implementations • 10 Feb 2016 • Jianwen Xie, Yang Lu, Song-Chun Zhu, Ying Nian Wu
If we further assume that the non-linearity in the ConvNet is Rectified Linear Unit (ReLU) and the reference distribution is Gaussian white noise, then we obtain a generative ConvNet model that is unique among energy-based models: The model is piecewise Gaussian, and the means of the Gaussian pieces are defined by an auto-encoder, where the filters in the bottom-up encoding become the basis functions in the top-down decoding, and the binary activation variables detected by the filters in the bottom-up convolution process become the coefficients of the basis functions in the top-down deconvolution process.
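The piecewise Gaussian structure follows from a short calculation. Writing the model as the reference distribution $q(x)=\mathcal{N}(0,\sigma^2 I)$ reweighted by $e^{f_\theta(x)}$, and using the fact that a ReLU network is piecewise linear, $f_\theta(x)=a_k+b_k^\top x$ on each activation-pattern cell $A_k$, completing the square gives

$$p_\theta(x)\;\propto\;e^{f_\theta(x)}\,q(x)\;\propto\;\exp\!\Big(-\tfrac{1}{2\sigma^2}\big\|x-\sigma^2 b_k\big\|^2\Big),\qquad x\in A_k,$$

i.e., a Gaussian with mean $\sigma^2 b_k$ on each cell, where $b_k$ is the gradient of the network on that cell. This is the sense in which the Gaussian means are computed by a bottom-up/top-down auto-encoder, as described above.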
no code implementations • CVPR 2014 • Jianwen Xie, Wenze Hu, Song-Chun Zhu, Ying Nian Wu
We investigate an inhomogeneous version of the FRAME (Filters, Random field, And Maximum Entropy) model and apply it to modeling object patterns.