Search Results for author: Zhilin Wang

Found 33 papers, 12 papers with code

Uncovering Surprising Event Boundaries in Narratives

no code implementations NAACL (WNU) 2022 Zhilin Wang, Anna Jafarpour, Maarten Sap

It is important to define meaningful and interpretable automatic evaluation metrics for open-domain dialog research.

Open-Domain Dialog, Text Generation

How to be Helpful on Online Support Forums?

no code implementations NAACL (WNU) 2022 Zhilin Wang, Pablo E. Torres

Internet forums such as Reddit offer people a platform to ask for advice when they encounter various issues at work, school or in relationships.

Will AI Tell Lies to Save Sick Children? Litmus-Testing AI Values Prioritization with AIRiskDilemmas

1 code implementation 20 May 2025 Yu Ying Chiu, Zhilin Wang, Sharan Maiya, Yejin Choi, Kyle Fish, Sydney Levine, Evan Hubinger

Detecting AI risks becomes more challenging as stronger models emerge and find novel methods such as Alignment Faking to circumvent these detection attempts.

HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages

no code implementations 16 May 2025 Zhilin Wang, Jiaqi Zeng, Olivier Delalleau, Hoo-chang Shin, Felipe Soares, Alexander Bukharin, Ellie Evans, Yi Dong, Oleksii Kuchaiev

To address this need, we introduce HelpSteer3-Preference, a permissively licensed (CC-BY-4.0), high-quality, human-annotated preference dataset comprising over 40,000 samples.

Diversity, Instruction Following

Llama-Nemotron: Efficient Reasoning Models

no code implementations 2 May 2025 Akhiad Bercovich, Itay Levy, Izik Golan, Mohammad Dabbah, Ran El-Yaniv, Omri Puny, Ido Galil, Zach Moshe, Tomer Ronen, Najeeb Nabwani, Ido Shahaf, Oren Tropp, Ehud Karpas, Ran Zilberstein, Jiaqi Zeng, Soumye Singhal, Alexander Bukharin, Yian Zhang, Tugrul Konuk, Gerald Shen, Ameya Sunil Mahabaleshwarkar, Bilal Kartal, Yoshi Suhara, Olivier Delalleau, Zijia Chen, Zhilin Wang, David Mosallanezhad, Adi Renduchintala, Haifeng Qian, Dima Rekesh, Fei Jia, Somshubra Majumdar, Vahid Noroozi, Wasi Uddin Ahmad, Sean Narenthiran, Aleksander Ficek, Mehrzad Samadi, Jocelyn Huang, Siddhartha Jain, Igor Gitman, Ivan Moshkov, Wei Du, Shubham Toshniwal, George Armstrong, Branislav Kisacanin, Matvei Novikov, Daria Gitman, Evelina Bakhturina, Jane Polak Scowcroft, John Kamalu, Dan Su, Kezhi Kong, Markus Kliegl, Rabeeh Karimi, Ying Lin, Sanjeev Satheesh, Jupinder Parmar, Pritam Gundecha, Brandon Norick, Joseph Jennings, Shrimai Prabhumoye, Syeda Nahida Akter, Mostofa Patwary, Abhinav Khattar, Deepak Narayanan, Roger Waleffe, Jimmy Zhang, Bor-Yiing Su, Guyue Huang, Terry Kong, Parth Chadha, Sahil Jain, Christine Harvey, Elad Segal, Jining Huang, Sergey Kashirsky, Robert McQueen, Izzy Putterman, George Lam, Arun Venkatesan, Sherry Wu, Vinh Nguyen, Manoj Kilaru, Andrew Wang, Anna Warno, Abhilash Somasamudramath, Sandip Bhaskar, Maka Dong, Nave Assaf, Shahar Mor, Omer Ullman Argov, Scot Junkin, Oleksandr Romanenko, Pedro Larroy, Marco Rovinelli, Viji Balas, Nicholas Edelman, Anahita Bhiwandiwalla, Muthu Subramaniam, Smita Ithape, Karthik Ramamoorthy, Yuting Wu, Suguna Varshini Velury, Omri Almog, Joyjit Daw, Denys Fridman, Erick Galinkin, Michael Evans, Shaona Ghosh, Katherine Luna, Leon Derczynski, Nikki Pope, Eileen Long, Seth Schneider, Guillermo Siman, Tomasz Grzegorzek, Pablo Ribalta, Monika Katariya, Chris Alexiuk, Joey Conway, Trisha Saar, Ann Guan, Krzysztof Pawelec, Shyamala Prayaga, Oleksii Kuchaiev, Boris Ginsburg, Oluwatobi Olabiyi, Kari Briski, Jonathan Cohen, Bryan Catanzaro, Jonah Alben, Yonatan Geifman, Eric Chung

We introduce the Llama-Nemotron series of models, an open family of heterogeneous reasoning models that deliver exceptional reasoning capabilities, inference efficiency, and an open license for enterprise use.

Knowledge Distillation, Neural Architecture Search

SEE: Continual Fine-tuning with Sequential Ensemble of Experts

1 code implementation 9 Apr 2025 Zhilin Wang, Yafu Li, Xiaoye Qu, Yu Cheng

Some approaches use routers to assign tasks to experts, but in continual learning, they often require retraining for optimal performance.

Continual Learning, Multi-Task Learning
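
For intuition about the router-plus-experts setup the snippet contrasts with, here is a minimal sketch in which a nearest-prototype router dispatches inputs to per-task experts. The class, the routing rule, and the toy experts are hypothetical illustrations, not the SEE method or the paper's code.

    import numpy as np

    class PrototypeRouter:
        """Hypothetical nearest-prototype router: each expert owns a task
        embedding, and inputs are routed to the most similar one."""
        def __init__(self):
            self.prototypes = []   # one embedding per expert/task
            self.experts = []      # callables, e.g. fine-tuned adapters

        def add_task(self, prototype, expert):
            # Adding a task appends a prototype; nothing is retrained.
            self.prototypes.append(prototype)
            self.experts.append(expert)

        def route(self, x):
            sims = [p @ x / (np.linalg.norm(p) * np.linalg.norm(x))
                    for p in self.prototypes]
            return self.experts[int(np.argmax(sims))](x)

    # usage: two toy "experts" that just report which one handled the input
    router = PrototypeRouter()
    router.add_task(np.array([1.0, 0.0]), lambda x: "expert-A")
    router.add_task(np.array([0.0, 1.0]), lambda x: "expert-B")
    print(router.route(np.array([0.9, 0.1])))  # -> expert-A

The property worth noting is that new tasks only append a prototype and an expert; routers that must be retrained for optimal performance lack exactly this.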

HelpSteer3: Human-Annotated Feedback and Edit Data to Empower Inference-Time Scaling in Open-Ended General-Domain Tasks

no code implementations 6 Mar 2025 Zhilin Wang, Jiaqi Zeng, Olivier Delalleau, Daniel Egert, Ellie Evans, Hoo-chang Shin, Felipe Soares, Yi Dong, Oleksii Kuchaiev

To this end, we collect HelpSteer3 data to train dedicated Feedback and Edit Models that are capable of performing inference-time scaling for open-ended general-domain tasks.

Chatbot, Logical Reasoning +1
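
As a rough sketch of how dedicated Feedback and Edit Models can drive inference-time scaling, the loop below drafts a response, critiques it, and revises it. Here generate, feedback, and edit are hypothetical stand-ins for calls to a policy model, a Feedback Model, and an Edit Model, and the fixed round count is an assumption rather than the paper's pipeline.

    def refine(prompt, generate, feedback, edit, rounds=2):
        """Hypothetical inference-time scaling loop: draft a response,
        ask a Feedback Model to critique it, then have an Edit Model
        revise the draft using that critique."""
        draft = generate(prompt)
        for _ in range(rounds):
            critique = feedback(prompt, draft)
            draft = edit(prompt, draft, critique)
        return draft

Spending more rounds trades extra inference compute for higher response quality, which is the sense in which such a loop "scales" at inference time.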

Unveiling Attractor Cycles in Large Language Models: A Dynamical Systems View of Successive Paraphrasing

no code implementations 21 Feb 2025 Zhilin Wang, Yafu Li, Jianhao Yan, Yu Cheng, Yue Zhang

Dynamical systems theory provides a framework for analyzing iterative processes and evolution over time.

Diversity
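
Under the dynamical-systems view, successive paraphrasing is an iterated map whose trajectory may fall into a cycle. This generic sketch detects when iterating a (hypothetical) paraphrase function revisits an earlier output; it illustrates the framing only, not the paper's analysis.

    def find_cycle(text, paraphrase, max_steps=50):
        """Iterate a paraphrase function and report when the trajectory
        revisits an earlier output, i.e. enters an attractor cycle."""
        seen = {text: 0}
        for step in range(1, max_steps + 1):
            text = paraphrase(text)
            if text in seen:
                return seen[text], step - seen[text]  # (cycle start, period)
            seen[text] = step
        return None  # no cycle detected within max_steps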

From Drafts to Answers: Unlocking LLM Potential via Aggregation Fine-Tuning

1 code implementation 21 Jan 2025 Yafu Li, Zhilin Wang, Tingchen Fu, Ganqu Cui, Sen Yang, Yu Cheng

Scaling data and model size has been proven effective for boosting the performance of large language models.

Diverging Preferences: When do Annotators Disagree and do Models Know?

no code implementations 18 Oct 2024 Michael JQ Zhang, Zhilin Wang, Jena D. Hwang, Yi Dong, Olivier Delalleau, Yejin Choi, Eunsol Choi, Xiang Ren, Valentina Pyatkin

We find that the majority of disagreements are in opposition with standard reward modeling approaches, which are designed with the assumption that annotator disagreement is noise.

HelpSteer2-Preference: Complementing Ratings with Preferences

no code implementations 2 Oct 2024 Zhilin Wang, Alexander Bukharin, Olivier Delalleau, Daniel Egert, Gerald Shen, Jiaqi Zeng, Oleksii Kuchaiev, Yi Dong

Reward models are critical for aligning models to follow instructions, and are typically trained following one of two popular paradigms: Bradley-Terry style or Regression style.

Regression
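
To make the two paradigms concrete, here is a minimal PyTorch sketch of both objectives for a scalar reward model; the tensors are toy placeholders standing in for model outputs and annotations, not the paper's training code.

    import torch
    import torch.nn.functional as F

    # r_chosen / r_rejected: scalar rewards for a preference pair;
    # r_pred / rating: predicted and annotated scores for one response.
    r_chosen = torch.tensor([1.3]); r_rejected = torch.tensor([0.2])
    r_pred = torch.tensor([3.1]); rating = torch.tensor([4.0])

    # Bradley-Terry style: maximize the probability that the chosen
    # response outscores the rejected one.
    bt_loss = -F.logsigmoid(r_chosen - r_rejected).mean()

    # Regression style: fit annotated ratings directly, e.g. with MSE.
    reg_loss = F.mse_loss(r_pred, rating)

    print(bt_loss.item(), reg_loss.item())

The key contrast: Bradley-Terry training needs only pairwise preferences, while regression training needs absolute ratings.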

Towards Building a Robust Knowledge Intensive Question Answering Model with Large Language Models

no code implementations 9 Sep 2024 Xingyun Hong, Yan Shao, Zhilin Wang, Manni Duan, Jin Xiongnan

The development of LLMs has greatly enhanced the intelligence and fluency of question answering, while the emergence of retrieval enhancement has enabled models to better utilize external information.

Contrastive Learning, Data Augmentation +2

Data, Data Everywhere: A Guide for Pretraining Dataset Construction

no code implementations 8 Jul 2024 Jupinder Parmar, Shrimai Prabhumoye, Joseph Jennings, Bo Liu, Aastha Jhunjhunwala, Zhilin Wang, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro

The impressive capabilities of recent language models can be largely attributed to the multi-trillion token pretraining datasets that they are trained on.

Attribute

Nemotron-4 340B Technical Report

1 code implementation 17 Jun 2024 Nvidia: Bo Adler, Niket Agarwal, Ashwath Aithal, Dong H. Anh, Pallab Bhattacharya, Annika Brundyn, Jared Casper, Bryan Catanzaro, Sharon Clay, Jonathan Cohen, Sirshak Das, Ayush Dattagupta, Olivier Delalleau, Leon Derczynski, Yi Dong, Daniel Egert, Ellie Evans, Aleksander Ficek, Denys Fridman, Shaona Ghosh, Boris Ginsburg, Igor Gitman, Tomasz Grzegorzek, Robert Hero, Jining Huang, Vibhu Jawa, Joseph Jennings, Aastha Jhunjhunwala, John Kamalu, Sadaf Khan, Oleksii Kuchaiev, Patrick Legresley, Hui Li, Jiwei Liu, Zihan Liu, Eileen Long, Ameya Sunil Mahabaleshwarkar, Somshubra Majumdar, James Maki, Miguel Martinez, Maer Rodrigues de Melo, Ivan Moshkov, Deepak Narayanan, Sean Narenthiran, Jesus Navarro, Phong Nguyen, Osvald Nitski, Vahid Noroozi, Guruprasad Nutheti, Christopher Parisien, Jupinder Parmar, Mostofa Patwary, Krzysztof Pawelec, Wei Ping, Shrimai Prabhumoye, Rajarshi Roy, Trisha Saar, Vasanth Rao Naik Sabavat, Sanjeev Satheesh, Jane Polak Scowcroft, Jason Sewall, Pavel Shamis, Gerald Shen, Mohammad Shoeybi, Dave Sizer, Misha Smelyanskiy, Felipe Soares, Makesh Narsimhan Sreedhar, Dan Su, Sandeep Subramanian, Shengyang Sun, Shubham Toshniwal, Hao Wang, Zhilin Wang, Jiaxuan You, Jiaqi Zeng, Jimmy Zhang, Jing Zhang, Vivienne Zhang, Yian Zhang, Chen Zhu

We release the Nemotron-4 340B model family, including Nemotron-4-340B-Base, Nemotron-4-340B-Instruct, and Nemotron-4-340B-Reward.

Synthetic Data Generation

HelpSteer2: Open-source dataset for training top-performing reward models

1 code implementation 12 Jun 2024 Zhilin Wang, Yi Dong, Olivier Delalleau, Jiaqi Zeng, Gerald Shen, Daniel Egert, Jimmy J. Zhang, Makesh Narsimhan Sreedhar, Oleksii Kuchaiev

Using a powerful internal base model trained on HelpSteer2, we are able to achieve the SOTA score (92.0%) on Reward-Bench's primary dataset, outperforming currently listed open and proprietary models, as of June 12th, 2024.

Attribute

Spotting AI's Touch: Identifying LLM-Paraphrased Spans in Text

1 code implementation 21 May 2024 Yafu Li, Zhilin Wang, Leyang Cui, Wei Bi, Shuming Shi, Yue Zhang

To this end, we propose a novel detection framework, paraphrased text span detection (PTD), aiming to identify paraphrased text spans within a text.

Diversity, Text Detection
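
If PTD is cast as token-level sequence labeling, recovering spans from per-token predictions is straightforward. In this sketch the 0/1 tags are assumed to come from some fine-tuned tagger; it is an illustration of the span-recovery step, not the released PTD framework.

    def paraphrased_spans(tags):
        """Collapse per-token 0/1 predictions (1 = paraphrased) from an
        assumed sequence tagger into contiguous (start, end) token spans."""
        spans, start = [], None
        for i, tag in enumerate(tags):
            if tag and start is None:
                start = i
            elif not tag and start is not None:
                spans.append((start, i - 1))
                start = None
        if start is not None:
            spans.append((start, len(tags) - 1))
        return spans

    print(paraphrased_spans([0, 1, 1, 0, 1]))  # -> [(1, 2), (4, 4)]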

HelpSteer: Multi-attribute Helpfulness Dataset for SteerLM

no code implementations 16 Nov 2023 Zhilin Wang, Yi Dong, Jiaqi Zeng, Virginia Adams, Makesh Narsimhan Sreedhar, Daniel Egert, Olivier Delalleau, Jane Polak Scowcroft, Neel Kant, Aidan Swope, Oleksii Kuchaiev

To alleviate this problem, we collect HelpSteer, a multi-attribute helpfulness dataset annotated for the various aspects that make responses helpful.

Attribute

Can We Trust the Similarity Measurement in Federated Learning?

no code implementations 20 Oct 2023 Zhilin Wang, Qin Hu, Xukai Zou

We first uncover a deficiency of similarity metrics: high-dimensional local models, both benign and poisoned, may be evaluated as having the same similarity while differing significantly in their parameter values.

Federated Learning, Model Poisoning
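
The deficiency described above is easy to reproduce with cosine similarity, which compares directions only: an update scaled by an arbitrary factor keeps exactly the same similarity to a reference. A toy numpy check (synthetic vectors, not the paper's experiments):

    import numpy as np

    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    rng = np.random.default_rng(0)
    reference = rng.normal(size=1000)              # e.g. the global model update
    benign = reference + 0.01 * rng.normal(size=1000)
    poisoned = 100.0 * benign                      # same direction, huge magnitude

    print(cos(reference, benign))    # ~1.0
    print(cos(reference, poisoned))  # also ~1.0, despite 100x larger parameters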

SteerLM: Attribute Conditioned SFT as an (User-Steerable) Alternative to RLHF

no code implementations 9 Oct 2023 Yi Dong, Zhilin Wang, Makesh Narsimhan Sreedhar, Xianchao Wu, Oleksii Kuchaiev

Model alignment with human preferences is an essential step in making Large Language Models (LLMs) helpful and consistent with human values.

Attribute

Humanoid Agents: Platform for Simulating Human-like Generative Agents

1 code implementation 9 Oct 2023 Zhilin Wang, Yu Ying Chiu, Yu Cheung Chiu

Just as computational simulations of atoms, molecules and cells have shaped the way we study the sciences, true-to-life simulations of human-like agents can be valuable tools for studying human behavior.

Unity

FVQA 2.0: Introducing Adversarial Samples into Fact-based Visual Question Answering

no code implementations 19 Mar 2023 Weizhe Lin, Zhilin Wang, Bill Byrne

The widely used Fact-based Visual Question Answering (FVQA) dataset contains visually-grounded questions that require information retrieval using common sense knowledge graphs to answer.

Common Sense Reasoning, Information Retrieval +4

Transformer-Empowered Content-Aware Collaborative Filtering

no code implementations 2 Apr 2022 Weizhe Lin, Linjun Shou, Ming Gong, Pei Jian, Zhilin Wang, Bill Byrne, Daxin Jiang

Knowledge graph (KG) based Collaborative Filtering is an effective approach to personalizing recommendation systems for relatively static domains such as movies and books, by leveraging structured information from KG to enrich both item and user representations.

Collaborative Filtering, Contrastive Learning +1

Incentive Mechanism Design for Joint Resource Allocation in Blockchain-based Federated Learning

no code implementations 18 Feb 2022 Zhilin Wang, Qin Hu, Ruinian Li, Minghui Xu, Zehui Xiong

Since each client has a limited amount of computing resources, the problem of allocating computing resources between training and mining needs to be carefully addressed.

Federated Learning

Blockchain and Federated Edge Learning for Privacy-Preserving Mobile Crowdsensing

no code implementations 16 Oct 2021 Qin Hu, Zhilin Wang, Minghui Xu, Xiuzhen Cheng

Mobile crowdsensing (MCS), which relies on the mobility of massive numbers of workers, helps the requestor accomplish various sensing tasks with more flexibility and lower cost.

Federated Learning, Privacy Preserving

A Systematic Survey of Blockchained Federated Learning

no code implementations 5 Oct 2021 Zhilin Wang, Qin Hu, Minghui Xu, Yan Zhuang, Yawei Wang, Xiuzhen Cheng

Then, we analyze the concrete functions of BCFL from the perspective of mechanism design and illustrate what problems blockchain addresses specifically for FL.

BIG-bench Machine Learning, Federated Learning +1

Extracting and Inferring Personal Attributes from Dialogue

1 code implementation NLP4ConvAI (ACL) 2022 Zhilin Wang, Xuhui Zhou, Rik Koncel-Kedziorski, Alex Marin, Fei Xia

Personal attributes represent structured information about a person, such as their hobbies, pets, family, likes and dislikes.

Attribute, Language Modeling +1

Learning Similarity between Movie Characters and Its Potential Implications on Understanding Human Experiences

no code implementations NAACL (NUSE) 2021 Zhilin Wang, Weizhe Lin, Xiaodong Wu

While many different aspects of human experiences have been studied by the NLP community, none has captured its full richness.

NASNet: A Neuron Attention Stage-by-Stage Net for Single Image Deraining

3 code implementations 6 Dec 2019 Xu Qin, Zhilin Wang

In this paper, we propose a novel end-to-end Neuron Attention Stage-by-Stage Net (NASNet), which can solve all types of rain model tasks efficiently.

Single Image Deraining

FFA-Net: Feature Fusion Attention Network for Single Image Dehazing

4 code implementations 18 Nov 2019 Xu Qin, Zhilin Wang, Yuanchao Bai, Xiaodong Xie, Huizhu Jia

The FFA-Net architecture consists of three key components: 1) A novel Feature Attention (FA) module combines Channel Attention with Pixel Attention mechanism, considering that different channel-wise features contain totally different weighted information and haze distribution is uneven on the different image pixels.

Image Dehazing, Single Image Dehazing
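
Below is a hedged PyTorch sketch of the channel-attention-then-pixel-attention pattern the FA module is described as combining; the reduction ratio and layer sizes are assumptions for illustration, not the official FFA-Net implementation.

    import torch
    import torch.nn as nn

    class FeatureAttention(nn.Module):
        """Channel attention followed by pixel attention, per the FA
        description; the reduction ratio of 8 is an assumption."""
        def __init__(self, channels):
            super().__init__()
            self.channel = nn.Sequential(          # one weight per channel
                nn.AdaptiveAvgPool2d(1),
                nn.Conv2d(channels, channels // 8, 1), nn.ReLU(),
                nn.Conv2d(channels // 8, channels, 1), nn.Sigmoid())
            self.pixel = nn.Sequential(            # one weight per pixel
                nn.Conv2d(channels, channels // 8, 1), nn.ReLU(),
                nn.Conv2d(channels // 8, 1, 1), nn.Sigmoid())

        def forward(self, x):
            x = x * self.channel(x)   # reweight channel-wise features
            return x * self.pixel(x)  # reweight spatial positions

    x = torch.randn(1, 64, 32, 32)
    print(FeatureAttention(64)(x).shape)  # torch.Size([1, 64, 32, 32])

Channel attention captures that different channels carry differently weighted information, while pixel attention captures that haze is unevenly distributed across image positions.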

No, you're not alone: A better way to find people with similar experiences on Reddit

no code implementations WS 2019 Zhilin Wang, Elena Rastorgueva, Weizhe Lin, Xiaodong Wu

This model is built upon the BERT Next Sentence Prediction model and reduces the time complexity for clustering all posts in a corpus from O(n^2) to O(n) with respect to the number of posts.

Clustering, Sentence
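
One way such a complexity drop can work: score each post against a small fixed set of reference texts rather than against every other post, giving n*k scorer calls (O(n) for fixed k) instead of n^2 pairwise comparisons. The sketch below uses a word-overlap stand-in for a BERT NSP scorer; the anchor scheme is an assumed illustration, not the paper's exact procedure.

    def cluster_by_anchors(posts, anchors, score):
        """Assign each post to its best-matching anchor text; `score`
        stands in for a similarity model such as BERT NSP."""
        clusters = {i: [] for i in range(len(anchors))}
        for post in posts:
            best = max(range(len(anchors)),
                       key=lambda i: score(post, anchors[i]))
            clusters[best].append(post)
        return clusters

    # toy usage with shared-word overlap as the stand-in scorer
    score = lambda a, b: len(set(a.split()) & set(b.split()))
    anchors = ["work job career", "love partner family"]
    posts = ["lost my job", "my partner and I argue"]
    print(cluster_by_anchors(posts, anchors, score))
    # {0: ['lost my job'], 1: ['my partner and I argue']}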
