Search Results for author: Xiaomeng Yang

Found 10 papers, 3 papers with code

Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length

1 code implementation · 12 Apr 2024 · Xuezhe Ma, Xiaomeng Yang, Wenhan Xiong, Beidi Chen, Lili Yu, Hao Zhang, Jonathan May, Luke Zettlemoyer, Omer Levy, Chunting Zhou

The quadratic complexity and weak length extrapolation of Transformers limit their ability to scale to long sequences, and while sub-quadratic solutions like linear attention and state space models exist, they empirically underperform Transformers in pretraining efficiency and downstream task accuracy.

End-to-end Story Plot Generator

no code implementations · 13 Oct 2023 · Hanlin Zhu, Andrew Cohen, Danqing Wang, Kevin Yang, Xiaomeng Yang, Jiantao Jiao, Yuandong Tian

Story plots, while short, carry most of the essential information of a full story that may contain tens of thousands of words.

Blocking

Learning Personalized Story Evaluation

no code implementations · 5 Oct 2023 · Danqing Wang, Kevin Yang, Hanlin Zhu, Xiaomeng Yang, Andrew Cohen, Lei LI, Yuandong Tian

We further develop a personalized story evaluation model PERSE to infer reviewer preferences and provide a personalized evaluation.

Retrieval · Text Generation

TorchRL: A data-driven decision-making library for PyTorch

2 code implementations · 1 Jun 2023 · Albert Bou, Matteo Bettini, Sebastian Dittert, Vikash Kumar, Shagun Sodhani, Xiaomeng Yang, Gianni de Fabritiis, Vincent Moens

PyTorch has ascended as a premier machine learning framework, yet it lacks a native and comprehensive library for decision and control tasks suitable for large development teams dealing with complex real-world data and environments.

Computational Efficiency · Decision Making · +1

Masked and Permuted Implicit Context Learning for Scene Text Recognition

no code implementations · 25 May 2023 · Xiaomeng Yang, Zhi Qiao, Jin Wei, Dongbao Yang, Yu Zhou

We utilize the training procedure of PLM, and to integrate MLM, we incorporate word length information into the decoding process and replace the undetermined characters with mask tokens.

Language Modelling · Masked Language Modeling · +1

Sample-efficient Surrogate Model for Frequency Response of Linear PDEs using Self-Attentive Complex Polynomials

no code implementations · 6 Jan 2023 · Andrew Cohen, Weiping Dou, Jiang Zhu, Slawomir Koziel, Peter Renner, Jan-Ove Mattsson, Xiaomeng Yang, Beidi Chen, Kevin Stone, Yuandong Tian

Linear Partial Differential Equations (PDEs) govern the spatial-temporal dynamics of physical systems that are essential to building modern technology.

Nocturne: a scalable driving benchmark for bringing multi-agent learning one step closer to the real world

1 code implementation · 20 Jun 2022 · Eugene Vinitsky, Nathan Lichtlé, Xiaomeng Yang, Brandon Amos, Jakob Foerster

We introduce Nocturne, a new 2D driving simulator for investigating multi-agent coordination under partial observability.

Imitation Learning
