no code implementations • 27 Aug 2024 • Adrian de Wynter
We perform a critical examination of the scientific methodology behind contemporary large language model (LLM) research.
1 code implementation • 22 Apr 2024 • Adrian de Wynter, Ishaan Watts, Nektar Ege Altıntoprak, Tua Wongsangaroonsri, Minghui Zhang, Noura Farra, Lena Baur, Samantha Claudet, Pavel Gajdusek, Can Gören, Qilong Gu, Anna Kaminska, Tomasz Kaminski, Ruby Kuo, Akiko Kyuba, Jongho Lee, Kartik Mathur, Petter Merok, Ivana Milovanović, Nani Paananen, Vesa-Matti Paananen, Anna Pavlenko, Bruno Pereira Vidal, Luciano Strika, Yueh Tsao, Davide Turcato, Oleksandr Vakhno, Judit Velcsov, Anna Vickers, Stéphanie Visser, Herdyan Widarmanto, Andrey Zaikin, Si-Qing Chen
Large language models (LLMs) and small language models (SLMs) are being adopted at remarkable speed, although their safety remains a serious concern.
no code implementations • 1 Apr 2024 • Yadong Zhang, Shaoguang Mao, Tao Ge, Xun Wang, Adrian de Wynter, Yan Xia, Wenshan Wu, Ting Song, Man Lan, Furu Wei
This paper presents a comprehensive survey of the current status and opportunities for Large Language Models (LLMs) in strategic reasoning, a sophisticated form of reasoning that necessitates understanding and predicting adversary actions in multi-agent settings while adjusting strategies accordingly.
no code implementations • 8 Mar 2024 • Adrian de Wynter
We show that GPT-4's reasoning and planning capabilities extend to the 1993 first-person shooter Doom.
2 code implementations • 11 Dec 2023 • Adrian de Wynter, Xun Wang, Qilong Gu, Si-Qing Chen
We call these approaches meta-prompting, or prompting to obtain prompts.
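As a rough illustration of the idea of "prompting to obtain prompts", the sketch below first asks a model to write a prompt for a task and then applies that generated prompt to the actual input. The `complete` function is a hypothetical stand-in for any LLM completion call, not part of the original work.

```python
# Minimal sketch of meta-prompting: ask the model for a task-specific prompt,
# then use that generated prompt on the real input. `complete` is a
# hypothetical placeholder for whatever LLM completion API is available.

def complete(prompt: str) -> str:
    """Placeholder for a call to an LLM completion endpoint."""
    raise NotImplementedError

def meta_prompt(task_description: str, task_input: str) -> str:
    # Step 1: prompt the model to write a prompt for the task.
    generated_prompt = complete(
        f"Write an effective prompt that instructs a language model to: {task_description}"
    )
    # Step 2: apply the generated prompt to the actual input.
    return complete(f"{generated_prompt}\n\nInput: {task_input}")
```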
1 code implementation • 29 Sep 2023 • Adrian de Wynter, Tangming Yuan
We evaluate the ability of two large language models (LLMs) to perform argumentative reasoning.
no code implementations • 14 Sep 2023 • Rishav Hada, Varun Gumma, Adrian de Wynter, Harshita Diddee, Mohamed Ahmed, Monojit Choudhury, Kalika Bali, Sunayana Sitaram
Large Language Models (LLMs) excel in various Natural Language Processing (NLP) tasks, yet their evaluation, particularly in languages beyond the top $20$, remains inadequate due to the limitations of existing benchmarks and metrics.
1 code implementation • 15 Aug 2023 • Adrian de Wynter, Anthony Hevia, Si-Qing Chen
We present an evaluation of text simplification (TS) in Spanish for a production system, by means of two corpora focused on complex-sentence and complex-word identification.
no code implementations • 17 Apr 2023 • Adrian de Wynter, Xun Wang, Alex Sokolov, Qilong Gu, Si-Qing Chen
We present an empirical evaluation of various outputs generated by nine of the most widely available large language models (LLMs).
no code implementations • 29 Apr 2021 • Adrian de Wynter
We prove that three strategy video games from the Sid Meier's Civilization series (Sid Meier's Civilization: Beyond Earth, Sid Meier's Civilization V, and Sid Meier's Civilization VI) are Turing complete.
3 code implementations • 20 Oct 2020 • Adrian de Wynter, Daniel J. Perry
We extract an optimal subset of architectural parameters for the BERT architecture from Devlin et al. (2018) by applying recent breakthroughs in algorithms for neural architecture search.
no code implementations • 16 Oct 2020 • Adrian de Wynter
Our findings show that the presence of Mischief-generated adversarial samples in the test set significantly degrades (by up to $20\%$) the performance of these models with respect to their reported baselines.
2 code implementations • 16 Oct 2020 • Adrian de Wynter
We consider the problem of finding the set of architectural parameters for a chosen deep neural network which is optimal under three metrics: parameter size, inference speed, and error rate.
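As a hedged illustration of this multi-metric selection problem (not the paper's algorithm), the sketch below keeps only the candidate architectures that are Pareto-optimal under the three metrics named above; the `Candidate` tuple and its fields are assumptions introduced for the example.

```python
# Illustrative sketch: retain candidate architectures that are Pareto-optimal
# under parameter size, inference speed, and error rate, where lower is better
# for all three metrics.

from typing import NamedTuple

class Candidate(NamedTuple):
    name: str
    params: int      # parameter count
    latency: float   # inference time, e.g. ms per example
    error: float     # error rate on a validation set

def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is at least as good as `b` on every metric and strictly better on one."""
    no_worse = a.params <= b.params and a.latency <= b.latency and a.error <= b.error
    better = a.params < b.params or a.latency < b.latency or a.error < b.error
    return no_worse and better

def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]
```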
1 code implementation • 15 Oct 2020 • Adrian de Wynter
In the former case we obtain an asymptotic bound of $O\left(|\Theta^2|\left(\log{|\Theta|} + |\theta^2| + T_f\left(|D|\right)\right) + \bar{S}|\Theta||E|\right)$, where $|\Theta|$ is the cardinality of the set of hyperparameters $\theta$ to be searched; $|E|$ and $|D|$ are the sizes of the evaluation and training datasets, respectively; $\bar{S}$ and $\bar{f}$ are the inference times for the trained model and the candidate model; and $T_f(|D|)$ is a polynomial on $|D|$ and $\bar{f}$.
no code implementations • 26 Aug 2019 • Adrian de Wynter
To do so, we first reformulate the function approximation problem in terms of sequences of functions, which we call the Function Approximation (FA) problem; we then show that it is computationally infeasible to devise a procedure that solves FA for all functions to zero error, regardless of the search space.
no code implementations • 26 Aug 2019 • Adrian de Wynter, Lambert Mathias
This network projects the slot into an attribute space derived from the KB and, by leveraging similarities in this space, proposes candidate slot keys and values to the dialogue state tracker.
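As a rough sketch of the general idea (not the paper's exact model), one can embed a slot mention in the same vector space as the KB attributes and rank attributes by cosine similarity to propose candidates. The embedding inputs and function names here are assumptions introduced for illustration.

```python
# Sketch: rank KB attributes by cosine similarity to a projected slot vector
# and return the top-k as candidate slot keys for the state tracker.
# The embedding matrices are assumed inputs, not part of the original work.

import numpy as np

def propose_candidates(slot_vec: np.ndarray,
                       kb_attr_vecs: np.ndarray,
                       kb_attr_names: list[str],
                       top_k: int = 5) -> list[str]:
    # Normalize so dot products are cosine similarities.
    slot = slot_vec / np.linalg.norm(slot_vec)
    attrs = kb_attr_vecs / np.linalg.norm(kb_attr_vecs, axis=1, keepdims=True)
    scores = attrs @ slot
    top = np.argsort(-scores)[:top_k]
    return [kb_attr_names[i] for i in top]
```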