no code implementations • 7 Feb 2024 • Deqian Kong, Dehong Xu, Minglu Zhao, Bo Pang, Jianwen Xie, Andrew Lizarraga, Yuhao Huang, Sirui Xie, Ying Nian Wu
We introduce the Latent Plan Transformer (LPT), a novel model that leverages a latent space to connect a Transformer-based trajectory generator and the final return.
no code implementations • 1 Dec 2023 • Jialin Wu, Xia Hu, Yaqing Wang, Bo Pang, Radu Soricut
Large multi-modal models (LMMs) exhibit remarkable performance across numerous tasks.
Ranked #1 on Visual Question Answering (VQA) on A-OKVQA (using extra training data)
no code implementations • 3 Nov 2023 • Tian Yun, Zilai Zeng, Kunal Handa, Ashish V. Thapliyal, Bo Pang, Ellie Pavlick, Chen Sun
Decision making via sequence modeling aims to mimic the success of language models, where actions taken by an embodied agent are modeled as tokens to predict.
no code implementations • 18 Oct 2023 • Yaqing Wang, Jialin Wu, Tanmaya Dabral, Jiageng Zhang, Geoff Brown, Chun-Ta Lu, Frederick Liu, Yi Liang, Bo Pang, Michael Bendersky, Radu Soricut
Intrusive PEFT techniques directly change a model's internal architecture.
1 code implementation • 11 Sep 2023 • Bo Pang, Zhongtian Zheng, Guoping Wang, Peng-Shuai Wang
Then, we can compute the geodesic distance between a pair of points using our decoding function, which requires only several matrix multiplications and can be massively parallelized on GPUs.
1 code implementation • 7 Sep 2023 • Erik Nijkamp, Tian Xie, Hiroaki Hayashi, Bo Pang, Congying Xia, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause, Senthil Purushwalkam, Tong Niu, Wojciech Kryściński, Lidiya Murakhovs'ka, Prafulla Kumar Choubey, Alex Fabbri, Ye Liu, Rui Meng, Lifu Tu, Meghana Bhat, Chien-Sheng Wu, Silvio Savarese, Yingbo Zhou, Shafiq Joty, Caiming Xiong
Most open-source LLMs, on the other hand, are limited in their ability to support longer sequence lengths, which is a key requirement for many tasks that require inference over an input context.
1 code implementation • 9 Jun 2023 • Deqian Kong, Bo Pang, Tian Han, Ying Nian Wu
To search for molecules with desired properties, we propose a sampling with gradual distribution shifting (SGDS) algorithm, so that after learning the model initially on the training data of existing molecules and their properties, the proposed algorithm gradually shifts the model distribution towards the region supported by molecules with desired values of properties.
1 code implementation • 1 Jun 2023 • Yan Xu, Deqian Kong, Dehong Xu, Ziwei Ji, Bo Pang, Pascale Fung, Ying Nian Wu
The capability to generate responses with diversity and faithfulness using factual knowledge is paramount for creating a human-like, trustworthy dialogue system.
2 code implementations • 29 May 2023 • Xi Chen, Josip Djolonga, Piotr Padlewski, Basil Mustafa, Soravit Changpinyo, Jialin Wu, Carlos Riquelme Ruiz, Sebastian Goodman, Xiao Wang, Yi Tay, Siamak Shakeri, Mostafa Dehghani, Daniel Salz, Mario Lucic, Michael Tschannen, Arsha Nagrani, Hexiang Hu, Mandar Joshi, Bo Pang, Ceslee Montgomery, Paulina Pietrzyk, Marvin Ritter, AJ Piergiovanni, Matthias Minderer, Filip Pavetic, Austin Waters, Gang Li, Ibrahim Alabdulmohsin, Lucas Beyer, Julien Amelot, Kenton Lee, Andreas Peter Steiner, Yang Li, Daniel Keysers, Anurag Arnab, Yuanzhong Xu, Keran Rong, Alexander Kolesnikov, Mojtaba Seyedhosseini, Anelia Angelova, Xiaohua Zhai, Neil Houlsby, Radu Soricut
We present the training recipe and results of scaling up PaLI-X, a multilingual vision and language model, both in terms of size of the components and the breadth of its training task mixture.
Ranked #1 on Fine-Grained Image Recognition on OVEN
no code implementations • 23 May 2023 • Srijan Bansal, Semih Yavuz, Bo Pang, Meghana Bhat, Yingbo Zhou
Question-answering (QA) tasks often investigate specific question types, knowledge domains, or reasoning skills, leading to specialized models catering to specific categories of QA tasks.
no code implementations • CVPR 2023 • Bo Pang, Hongchi Xia, Cewu Lu
In this paper, we design the Triangle Constrained Contrast (TriCC) framework tailored for autonomous driving scenes which learns 3D unsupervised representations through both the multimodal information and dynamic of temporal sequences.
1 code implementation • 29 Oct 2022 • Mitch Hill, Erik Nijkamp, Jonathan Mitchell, Bo Pang, Song-Chun Zhu
This work proposes a method for using any generator network as the foundation of an Energy-Based Model (EBM).
no code implementations • 1 Oct 2022 • Leilei Cui, Bo Pang, Zhong-Ping Jiang
This paper studies the adaptive optimal control problem for a class of linear time-delay systems described by delay differential equations (DDEs).
no code implementations • 21 Jul 2022 • Paul Kassianik, Erik Nijkamp, Bo Pang, Yingbo Zhou, Caiming Xiong
As machine learning tools progress, the inevitable question arises: How can machine learning help us write better code?
1 code implementation • 13 Jul 2022 • Bo Pang, Yifan Zhang, Yaoyi Li, Jia Cai, Cewu Lu
In this paper, we propose a genuine group-level contrastive visual representation learning method whose linear evaluation performance on ImageNet surpasses the vanilla supervised learning.
Ranked #38 on Self-Supervised Image Classification on ImageNet
2 code implementations • 13 Jun 2022 • Peiyu Yu, Sirui Xie, Xiaojian Ma, Baoxiong Jia, Bo Pang, Ruiqi Gao, Yixin Zhu, Song-Chun Zhu, Ying Nian Wu
Latent space Energy-Based Models (EBMs), also known as energy-based priors, have drawn growing interest in generative modeling.
no code implementations • COLING 2022 • Wanrong Zhu, Bo Pang, Ashish V. Thapliyal, William Yang Wang, Radu Soricut
Dense video captioning aims to identify the events of interest in an input video, and generate descriptive captions for each event.
Ranked #3 on Dense Video Captioning on ViTT (CIDEr metric, using extra training data)
1 code implementation • CVPR 2022 • Yifan Zhang, Bo Pang, Cewu Lu
Typical vision backbones manipulate structured features.
5 code implementations • 25 Mar 2022 • Erik Nijkamp, Bo Pang, Hiroaki Hayashi, Lifu Tu, Huan Wang, Yingbo Zhou, Silvio Savarese, Caiming Xiong
To democratize this, we train and release a family of large language models up to 16.1B parameters, called CODEGEN, on natural language and programming language data, and open source the training library JAXFORMER.
Ranked #79 on Code Generation on HumanEval
no code implementations • 15 Mar 2022 • Bo Pang, Erik Nijkamp, Wojciech Kryściński, Silvio Savarese, Yingbo Zhou, Caiming Xiong
Critical to the success of a summarization model is the faithful inference of latent representations of words or tokens in the source documents.
Ranked #1 on Text Summarization on Pubmed
no code implementations • 19 Oct 2021 • Bo Pang, Yongquan Fu, Siyuan Ren, Ye Wang, Qing Liao, Yan Jia
Extensive evaluations over real-world traffic data sets, including normal, encrypted and malicious labels, show that CGNN improves the prediction accuracy by 23% to 29% for application classification, by 2% to 37% for malicious traffic classification, and reaches the same accuracy level for encrypted traffic classification.
no code implementations • 5th Workshop on Meta-Learning at NeurIPS 2021 • Deqian Kong, Bo Pang, Ying Nian Wu
We propose to learn an energy-based model (EBM) in the latent space of a top-down generative model such that the EBM in the low dimensional latent space is able to be learned efficiently and adapt to each task rapidly.
no code implementations • 29 Sep 2021 • Bo Pang, Erik Nijkamp, Wojciech Maciej Kryscinski, Silvio Savarese, Yingbo Zhou, Caiming Xiong
Critical to the success of a summarization model is the faithful inference of latent representations of words or tokens in the source documents.
no code implementations • ICLR 2022 • Erik Nijkamp, Ruiqi Gao, Pavel Sountsov, Srinivas Vasudevan, Bo Pang, Song-Chun Zhu, Ying Nian Wu
However, MCMC sampling of EBMs in high-dimensional data space generally does not mix well, because the energy function, which is usually parametrized by a deep network, is highly multi-modal in the data space.
1 code implementation • 26 Aug 2021 • Bo Pang, Ying Nian Wu
The energy term of the prior model couples a continuous latent vector and a symbolic one-hot vector, so that discrete category can be inferred from the observed example based on the continuous latent vector.
no code implementations • ACL 2021 • Wenjuan Han, Bo Pang, YingNian Wu
Transfer learning with large pretrained transformer-based language models like BERT has become a dominating approach for most NLP tasks.
3 code implementations • ICCV 2021 • Jiefeng Li, Siyuan Bian, Ailing Zeng, Can Wang, Bo Pang, Wentao Liu, Cewu Lu
In light of this, we propose a novel regression paradigm with Residual Log-likelihood Estimation (RLE) to capture the underlying output distribution.
Ranked #59 on 3D Human Pose Estimation on Human3.6M
2 code implementations • 16 Jul 2021 • Bo Pang, Zhong-Ping Jiang
This paper studies the adaptive optimal stationary control of continuous-time linear stochastic systems with both additive and multiplicative noises, using reinforcement learning techniques.
no code implementations • NAACL 2021 • Erik Nijkamp, Bo Pang, Ying Nian Wu, Caiming Xiong
We introduce Self-CRItic Pretraining Transformers (SCRIPT) for representation learning of text.
1 code implementation • EACL 2021 • Bo Pang, Erik Nijkamp, Tian Han, Ying Nian Wu
It is initialized from the prior distribution of the latent variable and then runs a small number (e.g., 20) of Langevin dynamics steps guided by its posterior distribution.
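The short-run procedure described in this snippet can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy linear-Gaussian model (prior z ~ N(0, I), observation x = Wz + noise with standard deviation `sigma`), and the names `grad_log_posterior`, `short_run_langevin`, `W`, `sigma`, and `step_size` are all hypothetical choices made for illustration.

```python
import numpy as np

def grad_log_posterior(z, x, W, sigma):
    # Gradient in z of: -||z||^2 / 2 - ||x - W z||^2 / (2 sigma^2)
    # (toy linear-Gaussian model, assumed for illustration only).
    return -z + W.T @ (x - W @ z) / sigma**2

def short_run_langevin(x, W, sigma=0.5, n_steps=20, step_size=0.1, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    # Initialize the latent vector from the prior N(0, I).
    z = rng.standard_normal(W.shape[1])
    for _ in range(n_steps):
        noise = rng.standard_normal(z.shape)
        # One Langevin step: gradient drift toward the posterior plus Gaussian noise.
        z = z + 0.5 * step_size**2 * grad_log_posterior(z, x, W, sigma) + step_size * noise
    return z

# Toy data: 8-dimensional observation generated from a 2-dimensional latent.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 2))
x = W @ np.array([1.0, -1.0]) + 0.5 * rng.standard_normal(8)
z_sample = short_run_langevin(x, W, rng=rng)
```

The step count defaults to 20 to match the number quoted in the snippet; the step size is an arbitrary value that keeps the toy chain stable.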
1 code implementation • CVPR 2021 • Bo Pang, Tianyang Zhao, Xu Xie, Ying Nian Wu
Sampling from or optimizing the learned LB-EBM yields a belief vector which is used to make a path plan, which then in turn helps to predict a long-range trajectory.
1 code implementation • CVPR 2021 • Bo Pang, Gao Peng, Yizhuo Li, Cewu Lu
This progressive training (PGT) method is able to train long videos end-to-end with limited resources and ensures the effective transmission of information.
no code implementations • 14 Dec 2020 • Bo Pang, Yizhuo Li, Jiefeng Li, Muchen Li, Hanwen Cao, Cewu Lu
Such spatial and attention features are nested deeply; therefore, the proposed framework works in a mixed top-down and bottom-up manner.
1 code implementation • CoNLL (EMNLP) 2021 • Edwin G. Ng, Bo Pang, Piyush Sharma, Radu Soricut
Image captioning models generally lack the capability to take into account user interest, and usually default to global descriptions that try to balance readability, informativeness, and information overload.
1 code implementation • Asian Chapter of the Association for Computational Linguistics 2020 • Gabriel Huang, Bo Pang, Zhenhai Zhu, Clara Rivera, Radu Soricut
First, we construct and release a new dense video captioning dataset, Video Timeline Tags (ViTT), featuring a variety of instructional videos together with time-stamped annotations.
Ranked #1 on Dense Video Captioning on YouCook2 (ROUGE-L metric, using extra training data)
no code implementations • 19 Oct 2020 • Bo Pang, Tian Han, Ying Nian Wu
Deep generative models have recently been applied to molecule design.
no code implementations • NeurIPS Workshop ICBINB 2020 • Bo Pang, Erik Nijkamp, Jiali Cui, Tian Han, Ying Nian Wu
This paper proposes a latent space energy-based prior model for semi-supervised learning.
no code implementations • 15 Oct 2020 • Bo Pang, Deming Zhai, Junjun Jiang, Xianming Liu
In this work, we propose a novel selective contrastive learning framework for unsupervised feature learning.
no code implementations • 25 Aug 2020 • Bo Pang, Zhong-Ping Jiang
This paper studies the robustness of reinforcement learning algorithms to errors in the learning process.
1 code implementation • 12 Aug 2020 • Hanwen Cao, Yongyi Lu, Cewu Lu, Bo Pang, Gongshen Liu, Alan Yuille
In this paper, we further improve spatio-temporal point cloud feature learning with a flexible module called ASAP considering both attention and structure information across frames, which we find as two important factors for successful segmentation in dynamic point clouds.
1 code implementation • ACL 2020 • Bo Pang, Erik Nijkamp, Wenjuan Han, Linqi Zhou, Yixian Liu, Kewei Tu
Open-domain dialogue generation has gained increasing attention in Natural Language Processing.
1 code implementation • NeurIPS 2020 • Bo Pang, Tian Han, Erik Nijkamp, Song-Chun Zhu, Ying Nian Wu
Due to the low dimensionality of the latent space and the expressiveness of the top-down network, a simple EBM in latent space can capture regularities in the data effectively, and MCMC sampling in latent space is efficient and mixes well.
no code implementations • 12 Jun 2020 • Erik Nijkamp, Ruiqi Gao, Pavel Sountsov, Srinivas Vasudevan, Bo Pang, Song-Chun Zhu, Ying Nian Wu
Learning an energy-based model (EBM) requires MCMC sampling of the learned model as an inner loop of the learning algorithm.
1 code implementation • CVPR 2020 • Bo Pang, Yizhuo Li, Yifan Zhang, Muchen Li, Cewu Lu
As deep learning brings excellent performances to object detection algorithms, Tracking by Detection (TBD) has become the mainstream tracking framework.
no code implementations • CVPR 2020 • Tian Han, Erik Nijkamp, Linqi Zhou, Bo Pang, Song-Chun Zhu, Ying Nian Wu
This paper proposes a joint training method to learn both the variational auto-encoder (VAE) and the latent energy-based model (EBM).
no code implementations • 9 Jun 2020 • Bo Pang, Deming Zhai, Junjun Jiang, Xian-Ming Liu
Image enhancement from degradation of rainy artifacts plays a critical role in outdoor visual computing systems.
1 code implementation • 30 May 2020 • Bo Pang, Kaiwen Zha, Hanwen Cao, Jiajun Tang, Minghui Yu, Cewu Lu
Understanding sequential information is a fundamental task for artificial intelligence.
no code implementations • 19 May 2020 • Bo Pang, Tao Bian, Zhong-Ping Jiang
This paper studies the robustness of policy iteration in the context of the continuous-time infinite-horizon linear quadratic regulation (LQR) problem.
Systems and Control, Numerical Analysis, Optimization and Control
no code implementations • 5 May 2020 • Dario Fuoli, Zhiwu Huang, Martin Danelljan, Radu Timofte, Hua Wang, Longcun Jin, Dewei Su, Jing Liu, Jaehoon Lee, Michal Kudelski, Lukasz Bala, Dmitry Hrybov, Marcin Mozejko, Muchen Li, Si-Yao Li, Bo Pang, Cewu Lu, Chao Li, Dongliang He, Fu Li, Shilei Wen
For track 2, some existing methods are evaluated, showing promising solutions to the weakly-supervised video quality mapping problem.
1 code implementation • EMNLP 2020 • Jack Hessel, Zhenhai Zhu, Bo Pang, Radu Soricut
Pretraining from unlabelled web videos has quickly become the de-facto means of achieving high performance on many video understanding tasks.
Automatic Speech Recognition (ASR)
2 code implementations • ECCV 2020 • Jiajun Tang, Jin Xia, Xinzhi Mu, Bo Pang, Cewu Lu
We propose the Asynchronous Interaction Aggregation network (AIA) that leverages different interactions to boost action detection.
no code implementations • ECCV 2020 • Erik Nijkamp, Bo Pang, Tian Han, Linqi Zhou, Song-Chun Zhu, Ying Nian Wu
Learning such a generative model requires inferring the latent variables for each training example based on the posterior distribution of these latent variables.
no code implementations • CONLL 2019 • Jack Hessel, Bo Pang, Zhenhai Zhu, Radu Soricut
Instructional videos get high traffic on video sharing platforms, and prior work suggests that providing time-stamped, subtask annotations (e.g., "heat the oil in the pan") improves user experiences.
Automatic Speech Recognition (ASR)
3 code implementations • IJCNLP 2019 • Tao Yu, Rui Zhang, He Yang Er, Suyi Li, Eric Xue, Bo Pang, Xi Victoria Lin, Yi Chern Tan, Tianze Shi, Zihan Li, Youxuan Jiang, Michihiro Yasunaga, Sungrok Shim, Tao Chen, Alexander Fabbri, Zifan Li, Luyao Chen, Yuwen Zhang, Shreya Dixit, Vincent Zhang, Caiming Xiong, Richard Socher, Walter S. Lasecki, Dragomir Radev
We present CoSQL, a corpus for building cross-domain, general-purpose database (DB) querying dialogue systems.
Ranked #8 on Dialogue State Tracking on CoSQL
no code implementations • IJCNLP 2019 • Soravit Changpinyo, Bo Pang, Piyush Sharma, Radu Soricut
Object detection plays an important role in current solutions to vision and language tasks like image captioning and visual question answering.
Ranked #4 on Visual Question Answering (VQA) on VizWiz 2018
no code implementations • ACL 2019 • Yuanchao Liu, Bo Pang, Bingquan Liu
Although the proper use of idioms can enhance the elegance of writing, the active use of various expressions is a challenge because remembering idioms is difficult.
4 code implementations • ACL 2019 • Tao Yu, Rui Zhang, Michihiro Yasunaga, Yi Chern Tan, Xi Victoria Lin, Suyi Li, Heyang Er, Irene Li, Bo Pang, Tao Chen, Emily Ji, Shreya Dixit, David Proctor, Sungrok Shim, Jonathan Kraft, Vincent Zhang, Caiming Xiong, Richard Socher, Dragomir Radev
The best model obtains an exact match accuracy of 20.2% over all questions and less than 10% over all interaction sequences, indicating that the cross-domain setting and the contextual phenomena of the dataset present significant challenges for future research.
1 code implementation • CVPR 2019 • Bo Pang, Kaiwen Zha, Hanwen Cao, Chen Shi, Cewu Lu
There are mainly two novel designs in our deep RNN framework: one is a new RNN module called Context Bridge Module (CBM) which splits the information flowing along the sequence (temporal direction) and along depth (spatial representation direction), making it easier to train when building deep by balancing these two directions; the other is the Overlap Coherence Training Scheme that reduces the training complexity for long visual sequential tasks on account of the limitation of computing resources.
no code implementations • 4 Feb 2018 • Bo Pang, Kaiwen Zha, Cewu Lu
We introduce the first benchmark for a new problem --- recognizing human action adverbs (HAA): "Adverbs Describing Human Actions" (ADHA).
no code implementations • ACL 2014 • Chenhao Tan, Lillian Lee, Bo Pang
Consider a person trying to spread an important message on a social network.
no code implementations • 28 May 2002 • Bo Pang, Lillian Lee, Shivakumar Vaithyanathan
We consider the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative.