L3Ms -- Lagrange Large Language Models

no code implementations 28 Oct 2024 Guneet S. Dhillon, Xingjian Shi, Yee Whye Teh, Alex Smola

Supervised fine-tuning (SFT) and alignment of large language models (LLMs) are key steps in providing a good user experience.

Bridging Remote Sensors with Multisensor Geospatial Foundation Models

1 code implementation CVPR 2024 Boran Han, Shuai Zhang, Xingjian Shi, Markus Reichstein

A key discovery of our research is that representations derived from natural images are not always compatible with the distinct characteristics of geospatial remote sensors, underscoring the limitations of existing representations in this field.

Cloud Removal Diversity +1

PreDiff: Precipitation Nowcasting with Latent Diffusion Models

1 code implementation NeurIPS 2023 Zhihan Gao, Xingjian Shi, Boran Han, Hao Wang, Xiaoyong Jin, Danielle Maddix, Yi Zhu, Mu Li, Yuyang Wang

We conduct empirical studies on two datasets: N-body MNIST, a synthetic dataset with chaotic behavior, and SEVIR, a real-world precipitation nowcasting dataset.

Denoising Earth Observation

Automated Few-shot Classification with Instruction-Finetuned Language Models

1 code implementation 21 May 2023 Rami Aly, Xingjian Shi, Kaixiang Lin, Aston Zhang, Andrew Gordon Wilson

We observe, in the context of classification tasks, that instruction finetuned language models exhibit remarkable prompt robustness, and we subsequently propose a simple method to eliminate the need for handcrafted prompts, named AuT-Few.

Classification Few-Shot Learning +1

Tailoring Instructions to Student's Learning Levels Boosts Knowledge Distillation

1 code implementation 16 May 2023 Yuxin Ren, Zihan Zhong, Xingjian Shi, Yi Zhu, Chun Yuan, Mu Li

It has been commonly observed that a teacher model with superior performance does not necessarily result in a stronger student, highlighting a discrepancy between current teacher training practices and effective knowledge transfer.

Knowledge Distillation text-classification +2

XTab: Cross-table Pretraining for Tabular Transformers

1 code implementation 10 May 2023 Bingzhao Zhu, Xingjian Shi, Nick Erickson, Mu Li, George Karypis, Mahsa Shoaran

The success of self-supervised learning in computer vision and natural language processing has motivated pretraining methods on tabular data.

AutoML Federated Learning +1

LayoutDiffuse: Adapting Foundational Diffusion Models for Layout-to-Image Generation

no code implementations 16 Feb 2023 Jiaxin Cheng, Xiao Liang, Xingjian Shi, Tong He, Tianjun Xiao, Mu Li

Layout-to-image generation refers to the task of synthesizing photo-realistic images based on semantic layouts.

Layout-to-Image Generation

Towards Geospatial Foundation Models via Continual Pretraining

2 code implementations ICCV 2023 Matias Mendieta, Boran Han, Xingjian Shi, Yi Zhu, Chen Chen

Geospatial technologies are becoming increasingly essential in our world for a wide range of applications, including agriculture, urban planning, and disaster response.

Change Detection Continual Pretraining +6

Parameter-Efficient Fine-Tuning Design Spaces

no code implementations 4 Jan 2023 Jiaao Chen, Aston Zhang, Xingjian Shi, Mu Li, Alex Smola, Diyi Yang

We discover the following design patterns: (i) group layers in a spindle pattern; (ii) allocate the number of trainable parameters to layers uniformly; (iii) tune all the groups; (iv) assign proper tuning strategies to different groups.

parameter-efficient fine-tuning

Learning Multimodal Data Augmentation in Feature Space

1 code implementation 29 Dec 2022 Zichang Liu, Zhiqiang Tang, Xingjian Shi, Aston Zhang, Mu Li, Anshumali Shrivastava, Andrew Gordon Wilson

The ability to jointly learn from multiple modalities, such as text, audio, and visual data, is a defining feature of intelligent systems.

Data Augmentation Image Classification +1

First De-Trend then Attend: Rethinking Attention for Time-Series Forecasting

1 code implementation 15 Dec 2022 Xiyuan Zhang, Xiaoyong Jin, Karthick Gopalswamy, Gaurav Gupta, Youngsuk Park, Xingjian Shi, Hao Wang, Danielle C. Maddix, Yuyang Wang

Transformer-based models have gained large popularity and demonstrated promising results in long-term time-series forecasting in recent years.

Time Series Time Series Forecasting

Earthformer: Exploring Space-Time Transformers for Earth System Forecasting

2 code implementations 12 Jul 2022 Zhihan Gao, Xingjian Shi, Hao Wang, Yi Zhu, Yuyang Wang, Mu Li, Dit-yan Yeung

With the explosive growth of the spatiotemporal Earth observation data in the past decade, data-driven models that apply Deep Learning (DL) are demonstrating impressive potential for various Earth system forecasting tasks.

Earth Observation Earth Surface Forecasting +1

Removing Batch Normalization Boosts Adversarial Training

1 code implementation 4 Jul 2022 Haotao Wang, Aston Zhang, Shuai Zheng, Xingjian Shi, Mu Li, Zhangyang Wang

In addition, NoFrost achieves a $23.56\%$ adversarial robustness against PGD attack, which improves the $13.57\%$ robustness in BN-based AT.

Adversarial Robustness

Fix Bugs with Transformer through a Neural-Symbolic Edit Grammar

no code implementations 13 Apr 2022 Yaojie Hu, Xingjian Shi, Qiang Zhou, Lee Pike

We introduce NSEdit (neural-symbolic edit), a novel Transformer-based code repair method.

Code Repair

Benchmarking Multimodal AutoML for Tabular Data with Text Fields

2 code implementations 4 Nov 2021 Xingjian Shi, Jonas Mueller, Nick Erickson, Mu Li, Alexander J. Smola

We consider the use of automated supervised learning systems for data tables that not only contain numeric/categorical columns, but one or more text fields as well.

AutoML Benchmarking +1

Distiller: A Systematic Study of Model Distillation Methods in Natural Language Processing

no code implementations EMNLP (sustainlp) 2021 Haoyu He, Xingjian Shi, Jonas Mueller, Sheng Zha, Mu Li, George Karypis

We aim to identify how different components in the KD pipeline affect the resulting performance and how much the optimal KD pipeline varies across different datasets/tasks, such as the data augmentation policy, the loss function, and the intermediate representation for transferring the knowledge between teacher and student.

Data Augmentation Hyperparameter Optimization

Multimodal AutoML on Structured Tables with Text Fields

1 code implementation ICML Workshop AutoML 2021 Xingjian Shi, Jonas Mueller, Nick Erickson, Mu Li, Alex Smola

We design automated supervised learning systems for data tables that not only contain numeric/categorical columns, but text fields as well.


GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing

3 code implementations 9 Jul 2019 Jian Guo, He He, Tong He, Leonard Lausen, Mu Li, Haibin Lin, Xingjian Shi, Chenguang Wang, Junyuan Xie, Sheng Zha, Aston Zhang, Hang Zhang, Zhi Zhang, Zhongyue Zhang, Shuai Zheng, Yi Zhu

We present GluonCV and GluonNLP, the deep learning toolkits for computer vision and natural language processing based on Apache MXNet (incubating).

Deep Learning

STAR-GCN: Stacked and Reconstructed Graph Convolutional Networks for Recommender Systems

no code implementations 27 May 2019 Jiani Zhang, Xingjian Shi, Shenglin Zhao, Irwin King

We propose a new STAcked and Reconstructed Graph Convolutional Networks (STAR-GCN) architecture to learn node representations for boosting the performance in recommender systems, especially in the cold start scenario.

Link Prediction Matrix Completion +1

Machine Learning for Spatiotemporal Sequence Forecasting: A Survey

no code implementations 21 Aug 2018 Xingjian Shi, Dit-yan Yeung

Forecasting the multi-step future of these spatiotemporal systems based on the past observations, or, Spatiotemporal Sequence Forecasting (STSF), is a significant and challenging problem.

BIG-bench Machine Learning Survey +1

Spatiotemporal Modeling for Crowd Counting in Videos

no code implementations ICCV 2017 Feng Xiong, Xingjian Shi, Dit-yan Yeung

To exploit the otherwise very useful temporal information in video sequences, we propose a variant of a recent deep learning model called convolutional LSTM (ConvLSTM) for crowd counting.

Crowd Counting Transfer Learning

Dynamic Key-Value Memory Networks for Knowledge Tracing

1 code implementation 24 Nov 2016 Jiani Zhang, Xingjian Shi, Irwin King, Dit-yan Yeung

Knowledge Tracing (KT) is a task of tracing evolving knowledge state of students with respect to one or more concepts as they engage in a sequence of learning activities.

Knowledge Tracing

Natural-Parameter Networks: A Class of Probabilistic Neural Networks

1 code implementation NeurIPS 2016 Hao Wang, Xingjian Shi, Dit-yan Yeung

Another shortcoming of NN is the lack of flexibility to customize different distributions for the weights and neurons according to the data, as is often done in probabilistic graphical models.

Decision Making Under Uncertainty Link Prediction

Collaborative Recurrent Autoencoder: Recommend while Learning to Fill in the Blanks

no code implementations NeurIPS 2016 Hao Wang, Xingjian Shi, Dit-yan Yeung

To address this problem, we develop a collaborative recurrent autoencoder (CRAE) which is a denoising recurrent autoencoder (DRAE) that models the generation of content sequences in the collaborative filtering (CF) setting.

Collaborative Filtering Denoising +1

