no code implementations • 1 Apr 2024 • Yadong Zhang, Shaoguang Mao, Tao Ge, Xun Wang, Adrian de Wynter, Yan Xia, Wenshan Wu, Ting Song, Man Lan, Furu Wei
This paper presents a comprehensive survey of the current status and opportunities for Large Language Models (LLMs) in strategic reasoning, a sophisticated form of reasoning that necessitates understanding and predicting adversary actions in multi-agent settings while adjusting strategies accordingly.
no code implementations • 8 Mar 2024 • Adrian de Wynter
We show that GPT-4's reasoning and planning capabilities extend to the 1993 first-person shooter Doom.
2 code implementations • 11 Dec 2023 • Adrian de Wynter, Xun Wang, Qilong Gu, Si-Qing Chen
We call these approaches meta-prompting, or prompting to obtain prompts.
1 code implementation • 29 Sep 2023 • Adrian de Wynter, Tommy Yuan
We frame our experiments in terms of the argument mining (AM) and argument pair extraction (APE) tasks, and evaluate their ability to perform reasoning at increasing levels of abstraction in the input and output representations (e.g., arbitrary label sets, semantic graphs).
no code implementations • 14 Sep 2023 • Rishav Hada, Varun Gumma, Adrian de Wynter, Harshita Diddee, Mohamed Ahmed, Monojit Choudhury, Kalika Bali, Sunayana Sitaram
Large Language Models (LLMs) excel in various Natural Language Processing (NLP) tasks, yet their evaluation, particularly in languages beyond the top $20$, remains inadequate due to the limitations of existing benchmarks and metrics.
1 code implementation • 15 Aug 2023 • Adrian de Wynter, Anthony Hevia, Si-Qing Chen
We present an evaluation of text simplification (TS) in Spanish for a production system, by means of two corpora focused on both complex-sentence and complex-word identification.
no code implementations • 17 Apr 2023 • Adrian de Wynter, Xun Wang, Alex Sokolov, Qilong Gu, Si-Qing Chen
We present an empirical evaluation of various outputs generated by nine of the most widely available large language models (LLMs).
no code implementations • 29 Apr 2021 • Adrian de Wynter
We prove that three strategy video games from the Sid Meier's Civilization series (Sid Meier's Civilization: Beyond Earth, Sid Meier's Civilization V, and Sid Meier's Civilization VI) are Turing complete.
3 code implementations • 20 Oct 2020 • Adrian de Wynter, Daniel J. Perry
We extract an optimal subset of architectural parameters for the BERT architecture from Devlin et al. (2018) by applying recent breakthroughs in algorithms for neural architecture search.
no code implementations • 16 Oct 2020 • Adrian de Wynter
Our findings show that the presence of Mischief-generated adversarial samples in the test set significantly degrades (by up to $20\%$) the performance of these models with respect to their reported baselines.
2 code implementations • 16 Oct 2020 • Adrian de Wynter
We consider the problem of finding the set of architectural parameters for a chosen deep neural network which is optimal under three metrics: parameter size, inference speed, and error rate.
1 code implementation • 15 Oct 2020 • Adrian de Wynter
In the former case we obtain an asymptotic bound of $O\left(|\Theta^2|\left(\log{|\Theta|} + |\theta^2| + T_f\left(| D|\right)\right) + \bar{S}|\Theta||{E}|\right)$, where $|{\Theta}|$ is the cardinality of the set of hyperparameters $\theta$ to be searched; $|{E}|$ and $|{D}|$ are the sizes of the evaluation and training datasets, respectively; $\bar{S}$ and $\bar{f}$ are the inference times for the trained model and the candidate model; and $T_f({|{D}|})$ is a polynomial on $|{D}|$ and $\bar{f}$.
no code implementations • 26 Aug 2019 • Adrian de Wynter
For this, we first reformulate the function approximation problem in terms of sequences of functions, and we call it the Function Approximation (FA) problem; then we show that it is computationally infeasible to devise a procedure that solves FA for all functions to zero error, regardless of the search space.
no code implementations • 26 Aug 2019 • Adrian de Wynter, Lambert Mathias
This network projects the slot into an attribute space derived from the KB, and, by leveraging similarities in this space, we propose candidate slot keys and values to the dialogue state tracker.