no code implementations • 23 May 2025 • Tijmen de Haan, Yuan-Sen Ting, Tirthankar Ghosal, Tuan Dung Nguyen, Alberto Accomazzi, Emily Herron, Vanessa Lama, Rui Pan, Azton Wells, Nesar Ramachandra
General-purpose large language models, despite their broad capabilities, often struggle with specialized domain knowledge, a limitation particularly pronounced in more accessible, lower-parameter versions.
no code implementations • 15 May 2025 • Tuan Dung Nguyen, Duncan J. Watts, Mark E. Whiting
Commonsense intelligence in machines is often assessed by static benchmarks that compare a model's output against human-prescribed correct labels.
no code implementations • 29 Sep 2024 • Rui Pan, Tuan Dung Nguyen, Hardik Arora, Alberto Accomazzi, Tirthankar Ghosal, Yuan-Sen Ting
We find that the previously released AstroLLaMA series, based on LLaMA-2-7B, underperforms its base model.
no code implementations • 15 Jul 2024 • Yuan-Sen Ting, Tuan Dung Nguyen, Tirthankar Ghosal, Rui Pan, Hardik Arora, Zechang Sun, Tijmen de Haan, Nesar Ramachandra, Azton Wells, Sandeep Madireddy, Alberto Accomazzi
This dataset comprises 4,425 multiple-choice questions curated from the Annual Review of Astronomy and Astrophysics, covering a broad range of astrophysical topics.
1 code implementation • 10 Jul 2024 • Tung-Anh Nguyen, Long Tan Le, Tuan Dung Nguyen, Wei Bao, Suranga Seneviratne, Choong Seon Hong, Nguyen H. Tran
Experimental results on the UNSW-NB15 and TON-IoT datasets show that our proposed methods match nonlinear baselines in anomaly-detection performance while delivering significant improvements in communication and memory efficiency, underscoring their potential for securing IoT networks.
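The snippet does not spell out the detector; as a generic, hedged illustration of the kind of linear anomaly detection compared against nonlinear baselines here, the sketch below scores network-traffic samples by PCA reconstruction error. The synthetic data, component count, and threshold are placeholders, not the paper's pipeline.

```python
# Hypothetical sketch: linear anomaly detection via PCA reconstruction error.
# Not the paper's method; data, rank, and threshold are illustrative placeholders.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 20))                     # stand-in for benign traffic features
X_test = np.vstack([rng.normal(size=(50, 20)),
                    rng.normal(loc=4.0, size=(5, 20))])   # last rows mimic attacks

pca = PCA(n_components=5).fit(X_train)                    # low-rank model of normal behavior
X_hat = pca.inverse_transform(pca.transform(X_test))
scores = np.linalg.norm(X_test - X_hat, axis=1)           # reconstruction error per sample

# Calibrate a threshold on benign training data (placeholder percentile).
train_hat = pca.inverse_transform(pca.transform(X_train))
threshold = np.percentile(np.linalg.norm(X_train - train_hat, axis=1), 99)
print("flagged anomalies:", np.where(scores > threshold)[0])
```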
no code implementations • 3 Jan 2024 • Ernest Perkowski, Rui Pan, Tuan Dung Nguyen, Yuan-Sen Ting, Sandor Kruk, Tong Zhang, Charlie O'Neill, Maja Jablonska, Zechang Sun, Michael J. Smith, Huiling Liu, Kevin Schawinski, Kartheik Iyer, Ioana Ciucă for UniverseTBD
We explore the potential of enhancing LLM performance in astronomy-focused question-answering through targeted, continual pre-training.
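As a rough sketch of what targeted continual pre-training on a domain corpus looks like in practice (the checkpoint name, corpus file, and hyperparameters below are illustrative assumptions, not the paper's setup):

```python
# Hypothetical sketch of continual pre-training on an astronomy text corpus.
# Model/corpus names and hyperparameters are placeholders, not the paper's configuration.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import load_dataset

base = "meta-llama/Llama-2-7b-hf"                 # assumed base checkpoint
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token                     # LLaMA tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(base)

corpus = load_dataset("text", data_files={"train": "astro_abstracts.txt"})  # placeholder corpus

def tokenize(batch):
    return tok(batch["text"], truncation=True, max_length=1024)

train = corpus["train"].map(tokenize, batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tok, mlm=False)  # standard causal-LM objective

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ckpt", learning_rate=1e-5,
                           per_device_train_batch_size=1, num_train_epochs=1),
    train_dataset=train,
    data_collator=collator,
)
trainer.train()
```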
1 code implementation • 21 Dec 2023 • Anh Duc Nguyen, Tuan Dung Nguyen, Quang Minh Nguyen, Hoang H. Nguyen, Lam M. Nguyen, Kim-Chuan Toh
This paper studies the Partial Optimal Transport (POT) problem between two unbalanced measures with at most $n$ supports and its applications in various AI tasks such as color transfer or domain adaptation.
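POT transports only a prescribed fraction of the total mass between two unbalanced measures. A minimal numeric example using the Python Optimal Transport library's partial Wasserstein solver follows; the point clouds and mass fraction are made up for illustration.

```python
# Minimal partial optimal transport example with the POT library (pip install pot).
# Point clouds and the transported-mass fraction m are illustrative only.
import numpy as np
import ot

rng = np.random.default_rng(0)
xs = rng.normal(size=(8, 2))              # source supports
xt = rng.normal(loc=1.0, size=(10, 2))    # target supports

a = np.full(8, 1 / 8)                     # uniform source weights
b = np.full(10, 1 / 10)                   # uniform target weights
M = ot.dist(xs, xt)                       # pairwise squared-Euclidean cost

# Transport only 60% of the mass; the plan's entries sum to m.
T = ot.partial.partial_wasserstein(a, b, M, m=0.6)
print("transported mass:", T.sum(), "cost:", (T * M).sum())
```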
no code implementations • 27 Sep 2023 • Long Tan Le, Tuan Dung Nguyen, Tung-Anh Nguyen, Choong Seon Hong, Suranga Seneviratne, Wei Bao, Nguyen H. Tran
Federated Learning (FL) has emerged as a groundbreaking distributed learning paradigm enabling clients to train a global model collaboratively without exchanging data.
no code implementations • 12 Sep 2023 • Tuan Dung Nguyen, Yuan-Sen Ting, Ioana Ciucă, Charlie O'Neill, Ze-Chang Sun, Maja Jabłońska, Sandor Kruk, Ernest Perkowski, Jack Miller, Jason Li, Josh Peek, Kartheik Iyer, Tomasz Różański, Pranav Khetarpal, Sharaf Zaman, David Brodrick, Sergio J. Rodríguez Méndez, Thang Bui, Alyssa Goodman, Alberto Accomazzi, Jill Naiman, Jesse Cranney, Kevin Schawinski, UniverseTBD
Large language models excel in many human-language tasks but often falter in highly specialized domains like scholarly astronomy.
no code implementations • 3 Jun 2022 • Tung-Anh Nguyen, Tuan Dung Nguyen, Long Tan Le, Canh T. Dinh, Nguyen H. Tran
We show that WAFL's robustness guarantee is more general than those of related approaches, and that its generalization bound holds for all adversarial distributions inside the Wasserstein ball (ambiguity set).
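In standard distributionally robust notation (ours, not necessarily the paper's), the ambiguity set is a Wasserstein ball of radius $\epsilon$ around the empirical distribution $\hat{P}$, and the robust objective guards against every distribution inside it:

```latex
% Standard Wasserstein distributionally robust formulation (notation ours).
\mathcal{B}_\epsilon(\hat{P}) = \{\, Q : W_p(Q, \hat{P}) \le \epsilon \,\},
\qquad
\min_{\theta} \; \sup_{Q \in \mathcal{B}_\epsilon(\hat{P})} \;
\mathbb{E}_{z \sim Q}\big[\ell(\theta; z)\big].
```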
2 code implementations • 10 Dec 2020 • Canh T. Dinh, Nguyen H. Tran, Tuan Dung Nguyen, Wei Bao, Amir Rezaei Balef, Bing B. Zhou, Albert Y. Zomaya
In this work, we propose DONE, a distributed approximate Newton-type algorithm with a fast convergence rate for communication-efficient federated edge learning.
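DONE is described as an approximate Newton-type method; as a generic illustration of one standard way to approximate the Newton direction without inverting the Hessian, the sketch below runs Richardson iteration on H d = g. The synthetic problem, step size, and iteration count are assumptions for illustration, not the paper's exact algorithm.

```python
# Sketch: approximating the Newton direction d = H^{-1} g by Richardson iteration,
# avoiding an explicit Hessian inverse. Synthetic H, g; alpha and steps are illustrative.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(20, 20))
H = A @ A.T + 20 * np.eye(20)         # symmetric positive definite stand-in Hessian
g = rng.normal(size=20)               # stand-in gradient

alpha = 1.0 / np.linalg.norm(H, 2)    # any 0 < alpha < 2 / lambda_max(H) converges
d = np.zeros(20)
for _ in range(200):
    d = d + alpha * (g - H @ d)       # Richardson update toward solving H d = g

print("residual:", np.linalg.norm(H @ d - g))
```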
4 code implementations • NeurIPS 2020 • Canh T. Dinh, Nguyen H. Tran, Tuan Dung Nguyen
Federated learning (FL) is a decentralized and privacy-preserving machine learning technique in which a group of clients collaborate with a server to learn a global model without sharing clients' data.
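The entry describes the general FL setup; as a minimal illustration of server-client collaboration without data sharing (a FedAvg-style sketch, not this paper's personalized algorithm), the server below averages locally trained linear-model weights each round.

```python
# Minimal FedAvg-style sketch: clients fit locally, the server averages weights.
# Illustrates the FL setup only; this is not the paper's personalization method.
import numpy as np

rng = np.random.default_rng(0)
w_true = rng.normal(size=5)

def client_update(w_global, n=100, lr=0.1, steps=20):
    X = rng.normal(size=(n, 5))
    y = X @ w_true + 0.1 * rng.normal(size=n)    # private local data, never shared
    w = w_global.copy()
    for _ in range(steps):
        w -= lr * X.T @ (X @ w - y) / n          # local gradient descent on squared loss
    return w

w_global = np.zeros(5)
for _ in range(10):                              # communication rounds
    local = [client_update(w_global) for _ in range(4)]  # 4 clients train in parallel
    w_global = np.mean(local, axis=0)                    # server averages the updates

print("error vs true weights:", np.linalg.norm(w_global - w_true))
```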