Search Results for author: Markel Sanz Ausin

Found 4 papers, 1 paper with code

NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment

1 code implementation • 2 May 2024 • Gerald Shen, Zhilin Wang, Olivier Delalleau, Jiaqi Zeng, Yi Dong, Daniel Egert, Shengyang Sun, Jimmy Zhang, Sahil Jain, Ali Taghibakhshi, Markel Sanz Ausin, Ashwath Aithal, Oleksii Kuchaiev

Building efficient tools to perform alignment can be challenging, especially for the largest and most competent LLMs, which often contain tens or hundreds of billions of parameters.

HOPE: Human-Centric Off-Policy Evaluation for E-Learning and Healthcare

no code implementations • 18 Feb 2023 • Ge Gao, Song Ju, Markel Sanz Ausin, Min Chi

Reinforcement learning (RL) has been extensively researched for enhancing human-environment interactions in various human-centric tasks, including e-learning and healthcare.

Off-policy evaluation
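
To make the tagged task concrete, here is a minimal sketch of ordinary importance-sampling off-policy evaluation, the general problem this paper addresses. It illustrates the task only, not the HOPE estimator itself; the trajectory format and policy-probability functions are hypothetical.

from typing import Callable, List, Tuple

import numpy as np

Step = Tuple[int, int, float]  # one logged transition: (state, action, reward)

def importance_sampling_ope(
    trajectories: List[List[Step]],
    behavior_prob: Callable[[int, int], float],  # pi_b(a | s) of the logging policy (assumed known)
    target_prob: Callable[[int, int], float],    # pi_e(a | s) of the policy being evaluated
    gamma: float = 0.99,
) -> float:
    # Estimate the target policy's value from trajectories collected by the behavior policy.
    estimates = []
    for traj in trajectories:
        weight, ret = 1.0, 0.0
        for t, (s, a, r) in enumerate(traj):
            weight *= target_prob(s, a) / behavior_prob(s, a)  # cumulative importance weight
            ret += (gamma ** t) * r                            # discounted return of the trajectory
        estimates.append(weight * ret)                         # ordinary (unweighted) IS estimate
    return float(np.mean(estimates))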

InferNet for Delayed Reinforcement Tasks: Addressing the Temporal Credit Assignment Problem

no code implementations • 2 May 2021 • Markel Sanz Ausin, Hamoon Azizsoltani, Song Ju, Yeo Jin Kim, Min Chi

Overall, our results show that the effectiveness of InferNet is robust against noisy reward functions and that it is an effective add-on mechanism for solving temporal CAP in a wide range of RL tasks, from classic RL simulation environments to a real-world RL problem, and for both online and offline learning.

Atari Games • Offline RL • +1
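
As an illustration of the temporal credit assignment setting this paper targets, the sketch below redistributes a single delayed episodic return into inferred per-step rewards with a small network. It follows the general idea suggested by the title; the architecture, loss, and training step are assumptions, not the authors' exact InferNet formulation.

import torch
import torch.nn as nn

class RewardInferenceNet(nn.Module):
    # Small MLP that predicts an immediate reward for each (state, action) pair.
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)  # per-step rewards, shape (T,)

def train_step(model, optimizer, obs, act, episode_return):
    # One update: push the sum of inferred per-step rewards toward the delayed
    # episodic return, so a downstream RL agent can learn from dense rewards.
    predicted = model(obs, act)
    loss = (predicted.sum() - episode_return) ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()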
