no code implementations • 28 Oct 2024 • Guneet S. Dhillon, Xingjian Shi, Yee Whye Teh, Alex Smola
Supervised fine-tuning (SFT) and alignment of large language models (LLMs) are key steps in providing a good user experience.
no code implementations • 27 May 2024 • Hao Wu, Xingjian Shi, Ziyue Huang, Penghao Zhao, Wei Xiong, Jinbao Xue, Yangyu Tao, Xiaomeng Huang, Weiyan Wang
Data-driven deep learning has emerged as the new paradigm to model complex physical space-time systems.
1 code implementation • CVPR 2024 • Boran Han, Shuai Zhang, Xingjian Shi, Markus Reichstein
A key discovery of our research is that representations derived from natural images are not always compatible with the distinct characteristics of geospatial remote sensors, underscoring the limitations of existing representations in this field.
1 code implementation • NeurIPS 2023 • Zhihan Gao, Xingjian Shi, Boran Han, Hao Wang, Xiaoyong Jin, Danielle Maddix, Yi Zhu, Mu Li, Yuyang Wang
We conduct empirical studies on two datasets: N-body MNIST, a synthetic dataset with chaotic behavior, and SEVIR, a real-world precipitation nowcasting dataset.
1 code implementation • 21 May 2023 • Rami Aly, Xingjian Shi, Kaixiang Lin, Aston Zhang, Andrew Gordon Wilson
We observe, in the context of classification tasks, that instruction finetuned language models exhibit remarkable prompt robustness, and we subsequently propose a simple method to eliminate the need for handcrafted prompts, named AuT-Few.
1 code implementation • 16 May 2023 • Yuxin Ren, Zihan Zhong, Xingjian Shi, Yi Zhu, Chun Yuan, Mu Li
It has been commonly observed that a teacher model with superior performance does not necessarily result in a stronger student, highlighting a discrepancy between current teacher training practices and effective knowledge transfer.
1 code implementation • 10 May 2023 • Bingzhao Zhu, Xingjian Shi, Nick Erickson, Mu Li, George Karypis, Mahsa Shoaran
The success of self-supervised learning in computer vision and natural language processing has motivated pretraining methods on tabular data.
no code implementations • 16 Feb 2023 • Jiaxin Cheng, Xiao Liang, Xingjian Shi, Tong He, Tianjun Xiao, Mu Li
Layout-to-image generation refers to the task of synthesizing photo-realistic images based on semantic layouts.
2 code implementations • ICCV 2023 • Matias Mendieta, Boran Han, Xingjian Shi, Yi Zhu, Chen Chen
Geospatial technologies are becoming increasingly essential in our world for a wide range of applications, including agriculture, urban planning, and disaster response.
no code implementations • 4 Jan 2023 • Jiaao Chen, Aston Zhang, Xingjian Shi, Mu Li, Alex Smola, Diyi Yang
We discover the following design patterns: (i) group layers in a spindle pattern; (ii) allocate the number of trainable parameters to layers uniformly; (iii) tune all the groups; (iv) assign proper tuning strategies to different groups.
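To make patterns (i) and (ii) concrete, here is a minimal Python sketch. The 1:2 outer-to-middle group weighting and the helper names are illustrative assumptions, not the paper's exact recipe:

```python
def spindle_group_sizes(n_layers, n_groups):
    """Split n_layers into contiguous groups whose sizes form a spindle
    shape: narrow outer groups, wide middle groups.  The 1:2 weighting
    is an illustrative assumption."""
    weights = [1 if g in (0, n_groups - 1) else 2 for g in range(n_groups)]
    total = sum(weights)
    sizes = [n_layers * w // total for w in weights]
    g = n_groups // 2
    while sum(sizes) < n_layers:  # hand leftover layers to middle groups
        sizes[g] += 1
        g = (g + 1) % n_groups
    return sizes

def uniform_param_budget(n_layers, total_trainable):
    """Pattern (ii): spread the trainable-parameter budget evenly
    across layers, differing by at most one parameter."""
    base, rem = divmod(total_trainable, n_layers)
    return [base + (1 if i < rem else 0) for i in range(n_layers)]
```

For a 12-layer model split into 4 groups, `spindle_group_sizes(12, 4)` yields sizes `[2, 4, 4, 2]` — thin at both ends, thick in the middle.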
1 code implementation • 29 Dec 2022 • Zichang Liu, Zhiqiang Tang, Xingjian Shi, Aston Zhang, Mu Li, Anshumali Shrivastava, Andrew Gordon Wilson
The ability to jointly learn from multiple modalities, such as text, audio, and visual data, is a defining feature of intelligent systems.
no code implementations • 21 Dec 2022 • M Saiful Bari, Aston Zhang, Shuai Zheng, Xingjian Shi, Yi Zhu, Shafiq Joty, Mu Li
Pre-trained large language models can efficiently interpolate human-written prompts in a natural way.
1 code implementation • 15 Dec 2022 • Xiyuan Zhang, Xiaoyong Jin, Karthick Gopalswamy, Gaurav Gupta, Youngsuk Park, Xingjian Shi, Hao Wang, Danielle C. Maddix, Yuyang Wang
Transformer-based models have gained large popularity and demonstrated promising results in long-term time-series forecasting in recent years.
no code implementations • 15 Dec 2022 • JieLin Qiu, Yi Zhu, Xingjian Shi, Florian Wenzel, Zhiqiang Tang, Ding Zhao, Bo Li, Mu Li
Multimodal image-text models have shown remarkable performance in the past few years.
no code implementations • 4 Nov 2022 • Wenting Ye, Hongfei Yang, Shuai Zhao, Haoyang Fang, Xingjian Shi, Naveen Neppalli
Substitute-based recommendation is widely used in e-commerce to provide customers with better alternatives.
no code implementations • 10 Oct 2022 • Yunhe Gao, Xingjian Shi, Yi Zhu, Hao Wang, Zhiqiang Tang, Xiong Zhou, Mu Li, Dimitris N. Metaxas
First, DePT plugs visual prompts into the vision Transformer and only tunes these source-initialized prompts during adaptation.
Ranked #6 on Domain Adaptation on VisDA2017
2 code implementations • 12 Jul 2022 • Zhihan Gao, Xingjian Shi, Hao Wang, Yi Zhu, Yuyang Wang, Mu Li, Dit-yan Yeung
With the explosive growth of spatiotemporal Earth observation data in the past decade, data-driven models that apply deep learning (DL) are demonstrating impressive potential for various Earth system forecasting tasks.
Ranked #1 on Earth Surface Forecasting on EarthNet2021 OOD Track
1 code implementation • 4 Jul 2022 • Haotao Wang, Aston Zhang, Shuai Zheng, Xingjian Shi, Mu Li, Zhangyang Wang
In addition, NoFrost achieves $23.56\%$ adversarial robustness against the PGD attack, improving on the $13.57\%$ robustness of BN-based AT.
no code implementations • 13 Apr 2022 • Yaojie Hu, Xingjian Shi, Qiang Zhou, Lee Pike
We introduce NSEdit (neural-symbolic edit), a novel Transformer-based code repair method.
Ranked #1 on Code Repair on CodeXGLUE - Bugs2Fix
2 code implementations • 4 Nov 2021 • Xingjian Shi, Jonas Mueller, Nick Erickson, Mu Li, Alexander J. Smola
We consider the use of automated supervised learning systems for data tables that not only contain numeric/categorical columns, but one or more text fields as well.
Ranked #2 on Binary Classification on kickstarter
no code implementations • EMNLP (sustainlp) 2021 • Haoyu He, Xingjian Shi, Jonas Mueller, Sheng Zha, Mu Li, George Karypis
We aim to identify how different components in the KD pipeline affect the resulting performance and how much the optimal KD pipeline varies across different datasets/tasks, such as the data augmentation policy, the loss function, and the intermediate representation for transferring the knowledge between teacher and student.
1 code implementation • ICML Workshop AutoML 2021 • Xingjian Shi, Jonas Mueller, Nick Erickson, Mu Li, Alex Smola
We design automated supervised learning systems for data tables that not only contain numeric/categorical columns, but text fields as well.
no code implementations • 25 Sep 2019 • Mufei Li, Hao Zhang, Xingjian Shi, Minjie Wang, Yixing Guan, Zheng Zhang
Does attention matter and, if so, when and how?
3 code implementations • 9 Jul 2019 • Jian Guo, He He, Tong He, Leonard Lausen, Mu Li, Haibin Lin, Xingjian Shi, Chenguang Wang, Junyuan Xie, Sheng Zha, Aston Zhang, Hang Zhang, Zhi Zhang, Zhongyue Zhang, Shuai Zheng, Yi Zhu
We present GluonCV and GluonNLP, the deep learning toolkits for computer vision and natural language processing based on Apache MXNet (incubating).
no code implementations • 27 May 2019 • Jiani Zhang, Xingjian Shi, Shenglin Zhao, Irwin King
We propose a new STAcked and Reconstructed Graph Convolutional Networks (STAR-GCN) architecture to learn node representations for boosting the performance in recommender systems, especially in the cold start scenario.
no code implementations • 21 Aug 2018 • Xingjian Shi, Dit-yan Yeung
Forecasting the multi-step future of these spatiotemporal systems based on the past observations, or, Spatiotemporal Sequence Forecasting (STSF), is a significant and challenging problem.
1 code implementation • 20 Mar 2018 • Jiani Zhang, Xingjian Shi, Junyuan Xie, Hao Ma, Irwin King, Dit-yan Yeung
We propose a new network architecture, Gated Attention Networks (GaAN), for learning on graphs.
Ranked #1 on Node Property Prediction on ogbn-proteins
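A minimal NumPy sketch of gated multi-head neighbor aggregation in the spirit of GaAN: each attention head's aggregated message is scaled by a per-head gate in [0, 1]. Here the gate values are passed in rather than computed by GaAN's gate network, and all shapes and names are illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gated_multihead_aggregate(h_center, h_neighbors, W_q, W_k, W_v, gates):
    """One gated aggregation step for a single node.
    h_center:    (D,)       -- the node's own features
    h_neighbors: (M, D)     -- features of its M neighbors
    W_q/W_k/W_v: (K, D, d)  -- per-head projection matrices
    gates:       (K,)       -- per-head gate values in [0, 1]
    Returns the concatenation of the K gated head outputs, (K * d,)."""
    outs = []
    for k in range(W_q.shape[0]):
        q = h_center @ W_q[k]                            # (d,)
        keys = h_neighbors @ W_k[k]                      # (M, d)
        vals = h_neighbors @ W_v[k]                      # (M, d)
        attn = softmax(keys @ q / np.sqrt(q.shape[0]))   # (M,)
        outs.append(gates[k] * (attn @ vals))            # gate scales the head
    return np.concatenate(outs)
```

Setting a head's gate to zero silences that head entirely, which is how the gates let the model trade off the heads per node.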
no code implementations • ICCV 2017 • Feng Xiong, Xingjian Shi, Dit-yan Yeung
To exploit the otherwise very useful temporal information in video sequences, we propose a variant of a recent deep learning model called convolutional LSTM (ConvLSTM) for crowd counting.
4 code implementations • NeurIPS 2017 • Xingjian Shi, Zhihan Gao, Leonard Lausen, Hao Wang, Dit-yan Yeung, Wai-kin Wong, Wang-chun Woo
To address these problems, we propose both a new model and a benchmark for precipitation nowcasting.
Ranked #1 on Video Prediction on KTH (Cond metric)
1 code implementation • 24 Nov 2016 • Jiani Zhang, Xingjian Shi, Irwin King, Dit-yan Yeung
Knowledge Tracing (KT) is the task of tracing the evolving knowledge states of students with respect to one or more concepts as they engage in a sequence of learning activities.
1 code implementation • NeurIPS 2016 • Hao Wang, Xingjian Shi, Dit-yan Yeung
Another shortcoming of neural networks is their lack of flexibility to customize different distributions for the weights and neurons according to the data, as is often done in probabilistic graphical models.
no code implementations • NeurIPS 2016 • Hao Wang, Xingjian Shi, Dit-yan Yeung
To address this problem, we develop a collaborative recurrent autoencoder (CRAE) which is a denoising recurrent autoencoder (DRAE) that models the generation of content sequences in the collaborative filtering (CF) setting.
18 code implementations • NeurIPS 2015 • Xingjian Shi, Zhourong Chen, Hao Wang, Dit-yan Yeung, Wai-kin Wong, Wang-chun Woo
The goal of precipitation nowcasting is to predict the future rainfall intensity in a local region over a relatively short period of time.
Ranked #1 on Video Prediction on KTH (Cond metric)
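For reference, the ConvLSTM cell replaces the fully connected state transitions of an ordinary LSTM with convolutions ($*$), keeping Hadamard products ($\circ$) for the peephole connections, so that the hidden state $H_t$ and cell state $C_t$ remain spatial tensors:

```latex
\begin{aligned}
i_t &= \sigma(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \circ C_{t-1} + b_i)\\
f_t &= \sigma(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \circ C_{t-1} + b_f)\\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c)\\
o_t &= \sigma(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \circ C_t + b_o)\\
H_t &= o_t \circ \tanh(C_t)
\end{aligned}
```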