Search Results for author: Chenguang Wang

Found 48 papers, 25 papers with code

基于风格化嵌入的中文文本风格迁移(Chinese text style transfer based on stylized embedding)

no code implementations CCL 2021 Chenguang Wang, Hongfei Lin, Liang Yang

“对话风格能够反映对话者的属性, 例如情感、性别和教育背景等。在对话系统中, 通过理解用户的对话风格, 能够更好地对用户进行建模。同样的, 面对不同背景的用户, 对话机器人也应该使用不同的语言风格与之交流。语言表达风格是文本的内在属性, 然而现有的大多数文本风格迁移研究, 集中在英文领域, 在中文领域则研究较少。本文构建了三个可用于中文文本风格迁移研究的数据集, 并将多种已有的文本风格迁移方法应用于该数据集。同时, 本文提出了基于DeepStyle算法与Transformer的风格迁移模型, 通过预训练可以获得不同风格的隐层向量表示。并基于Transformer构建生成端模型, 在解码阶段, 通过重建源文本的方式, 保留生成文本的内容信息, 并且引入对立风格的嵌入表示, 使得模型能够生成不同风格的文本。实验结果表明, 本文提出的模型在构建的中文数据集上均优于现有模型。”

Style Transfer Text Style Transfer

Humanity's Last Exam

no code implementations24 Jan 2025 Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li, Josephina Hu, Hugh Zhang, Sean Shi, Michael Choi, Anish Agrawal, Arnav Chopra, Adam Khoja, Ryan Kim, Jason Hausenloy, Oliver Zhang, Mantas Mazeika, Daron Anderson, Tung Nguyen, Mobeen Mahmood, Fiona Feng, Steven Y. Feng, Haoran Zhao, Michael Yu, Varun Gangal, Chelsea Zou, Zihan Wang, Jessica P. Wang, Pawan Kumar, Oleksandr Pokutnyi, Robert Gerbicz, Serguei Popov, John-Clark Levin, Mstyslav Kazakov, Johannes Schmitt, Geoff Galgon, Alvaro Sanchez, Yongki Lee, Will Yeadon, Scott Sauers, Marc Roth, Chidozie Agu, Søren Riis, Fabian Giska, Saiteja Utpala, Zachary Giboney, Gashaw M. Goshu, Joan of Arc Xavier, Sarah-Jane Crowson, Mohinder Maheshbhai Naiya, Noah Burns, Lennart Finke, Zerui Cheng, Hyunwoo Park, Francesco Fournier-Facio, John Wydallis, Mark Nandor, Ankit Singh, Tim Gehrunger, Jiaqi Cai, Ben McCarty, Darling Duclosel, Jungbae Nam, Jennifer Zampese, Ryan G. Hoerr, Aras Bacho, Gautier Abou Loume, Abdallah Galal, Hangrui Cao, Alexis C Garretson, Damien Sileo, Qiuyu Ren, Doru Cojoc, Pavel Arkhipov, Usman Qazi, Lianghui Li, Sumeet Motwani, Christian Schroeder de Witt, Edwin Taylor, Johannes Veith, Eric Singer, Taylor D. Hartman, Paolo Rissone, Jaehyeok Jin, Jack Wei Lun Shi, Chris G. Willcocks, Aleksandar Mikov, Ameya Prabhu, Longke Tang, Xavier Alapont, Justine Leon Uro, Kevin Zhou, Emily de Oliveira Santos, Andrey Pupasov Maksimov, Edward Vendrow, Kengo Zenitani, Julien Guillod, Yuqi Li, Joshua Vendrow, Vladyslav Kuchkin, Ng Ze-An, Pierre Marion, Denis Efremov, Jayson Lynch, Kaiqu Liang, Andrew Gritsevskiy, Dakotah Martinez, Ben Pageler, Nick Crispino, Dimitri Zvonkine, Natanael Wildner Fraga, Saeed Soori, Ori Press, Henry Tang, Julian Salazar, Sean R. Green, Lina Brüssel, Moon Twayana, Aymeric Dieuleveut, T. Ryan Rogers, Wenjin Zhang, Bikun Li, Jinzhou Yang, Arun Rao, Gabriel Loiseau, Mikhail Kalinin, Marco Lukas, Ciprian Manolescu, Subrata Mishra, Ariel Ghislain Kemogne Kamdoum, Tobias Kreiman, Tad Hogg, Alvin Jin, Carlo Bosio, Gongbo Sun, Brian P Coppola, Tim Tarver, Haline Heidinger, Rafael Sayous, Stefan Ivanov, Joseph M Cavanagh, Jiawei Shen, Joseph Marvin Imperial, Philippe Schwaller, Shaipranesh Senthilkuma, Andres M Bran, Ali Dehghan, Andres Algaba, Brecht Verbeken, David Noever, Ragavendran P V, Lisa Schut, Ilia Sucholutsky, Evgenii Zheltonozhskii, Derek Lim, Richard Stanley, Shankar Sivarajan, Tong Yang, John Maar, Julian Wykowski, Martí Oller, Jennifer Sandlin, Anmol Sahu, Yuzheng Hu, Sara Fish, Nasser Heydari, Archimedes Apronti, Kaivalya Rawal, Tobias Garcia Vilchis, Yuexuan Zu, Martin Lackner, James Koppel, Jeremy Nguyen, Daniil S. Antonenko, Steffi Chern, Bingchen Zhao, Pierrot Arsene, Alan Goldfarb, Sergey Ivanov, Rafał Poświata, Chenguang Wang, Daofeng Li, Donato Crisostomi, Andrea Achilleos, Benjamin Myklebust, Archan Sen, David Perrella, Nurdin Kaparov, Mark H Inlow, Allen Zang, Elliott Thornley, Daniil Orel, Vladislav Poritski, Shalev Ben-David, Zachary Berger, Parker Whitfill, Michael Foster, Daniel Munro, Linh Ho, Dan Bar Hava, Aleksey Kuchkin, Robert Lauff, David Holmes, Frank Sommerhage, Keith Schneider, Zakayo Kazibwe, Nate Stambaugh, Mukhwinder Singh, Ilias Magoulas, Don Clarke, Dae Hyun Kim, Felipe Meneguitti Dias, Veit Elser, Kanu Priya Agarwal, Victor Efren Guadarrama Vilchis, Immo Klose, Christoph Demian, Ujjwala Anantheswaran, Adam Zweiger, Guglielmo Albani, Jeffery Li, Nicolas Daans, Maksim Radionov, Václav Rozhoň, Ziqiao Ma, Christian Stump, Mohammed Berkani, Jacob Platnick, Volodymyr Nevirkovets, Luke Basler, Marco Piccardo, Ferenc Jeanplong, Niv Cohen, Josef Tkadlec, Paul Rosu, Piotr Padlewski, Stanislaw Barzowski, Kyle Montgomery, Aline Menezes, Arkil Patel, Zixuan Wang, Jamie Tucker-Foltz, Jack Stade, Tom Goertzen, Fereshteh Kazemi, Jeremiah Milbauer, John Arnold Ambay, Abhishek Shukla, Yan Carlos Leyva Labrador, Alan Givré, Hew Wolff, Vivien Rossbach, Muhammad Fayez Aziz, Younesse Kaddar, Yanxu Chen, Robin Zhang, Jiayi Pan, Antonio Terpin, Niklas Muennighoff, Hailey Schoelkopf, Eric Zheng, Avishy Carmi, Adam Jones, Jainam Shah, Ethan D. L. Brown, Kelin Zhu, Max Bartolo, Richard Wheeler, Andrew Ho, Shaul Barkan, Jiaqi Wang, Martin Stehberger, Egor Kretov, Kaustubh Sridhar, Zienab EL-Wasif, Anji Zhang, Daniel Pyda, Joanna Tam, David M. Cunningham, Vladimir Goryachev, Demosthenes Patramanis, Michael Krause, Andrew Redenti, Daniel Bugas, David Aldous, Jesyin Lai, Shannon Coleman, Mohsen Bahaloo, Jiangnan Xu, Sangwon Lee, Sandy Zhao, Ning Tang, Michael K. Cohen, Micah Carroll, Orr Paradise, Jan Hendrik Kirchner, Stefan Steinerberger, Maksym Ovchynnikov, Jason O. Matos, Adithya Shenoy, Benedito Alves de Oliveira Junior, Michael Wang, Yuzhou Nie, Paolo Giordano, Philipp Petersen, Anna Sztyber-Betley, Priti Shukla, Jonathan Crozier, Antonella Pinto, Shreyas Verma, Prashant Joshi, Zheng-Xin Yong, Allison Tee, Jérémy Andréoletti, Orion Weller, Raghav Singhal, Gang Zhang, Alexander Ivanov, Seri Khoury, Hamid Mostaghimi, Kunvar Thaman, Qijia Chen, Tran Quoc Khánh, Jacob Loader, Stefano Cavalleri, Hannah Szlyk, Zachary Brown, Jonathan Roberts, William Alley, Kunyang Sun, Ryan Stendall, Max Lamparth, Anka Reuel, Ting Wang, Hanmeng Xu, Sreenivas Goud Raparthi, Pablo Hernández-Cámara, Freddie Martin, Dmitry Malishev, Thomas Preu, Tomek Korbak, Marcus Abramovitch, Dominic Williamson, Ziye Chen, Biró Bálint, M Saiful Bari, Peyman Kassani, ZiHao Wang, Behzad Ansarinejad, Laxman Prasad Goswami, Yewen Sun, Hossam Elgnainy, Daniel Tordera, George Balabanian, Earth Anderson, Lynna Kvistad, Alejandro José Moyano, Rajat Maheshwari, Ahmad Sakor, Murat Eron, Isaac C. McAlister, Javier Gimenez, Innocent Enyekwe, Andrew Favre D. O., Shailesh Shah, Xiaoxiang Zhou, Firuz Kamalov, Ronald Clark, Sherwin Abdoli, Tim Santens, Khalida Meer, Harrison K Wang, Kalyan Ramakrishnan, Evan Chen, Alessandro Tomasiello, G. Bruno De Luca, Shi-Zhuo Looi, Vinh-Kha Le, Noam Kolt, Niels Mündler, Avi Semler, Emma Rodman, Jacob Drori, Carl J Fossum, Milind Jagota, Ronak Pradeep, Honglu Fan, Tej Shah, Jonathan Eicher, Michael Chen, Kushal Thaman, William Merrill, Carter Harris, Jason Gross, Ilya Gusev, Asankhaya Sharma, Shashank Agnihotri, Pavel Zhelnov, Siranut Usawasutsakorn, Mohammadreza Mofayezi, Sergei Bogdanov, Alexander Piperski, Marc Carauleanu, David K. Zhang, Dylan Ler, Roman Leventov, Ignat Soroko, Thorben Jansen, Pascal Lauer, Joshua Duersch, Vage Taamazyan, Wiktor Morak, Wenjie Ma, William Held, Tran Đuc Huy, Ruicheng Xian, Armel Randy Zebaze, Mohanad Mohamed, Julian Noah Leser, Michelle X Yuan, Laila Yacar, Johannes Lengler, Hossein Shahrtash, Edson Oliveira, Joseph W. Jackson, Daniel Espinosa Gonzalez, Andy Zou, Muthu Chidambaram, Timothy Manik, Hector Haffenden, Dashiell Stander, Ali Dasouqi, Alexander Shen, Emilien Duc, Bita Golshani, David Stap, Mikalai Uzhou, Alina Borisovna Zhidkovskaya, Lukas Lewark, Mátyás Vincze, Dustin Wehr, Colin Tang, Zaki Hossain, Shaun Phillips, Jiang Muzhen, Fredrik Ekström, Angela Hammon, Oam Patel, Nicolas Remy, Faraz Farhidi, George Medley, Forough Mohammadzadeh, Madellene Peñaflor, Haile Kassahun, Alena Friedrich, Claire Sparrow, Taom Sakal, Omkar Dhamane, Ali Khajegili Mirabadi, Eric Hallman, Mike Battaglia, Mohammad Maghsoudimehrabani, Hieu Hoang, Alon Amit, Dave Hulbert, Roberto Pereira, Simon Weber, Stephen Mensah, Nathan Andre, Anton Peristyy, Chris Harjadi, Himanshu Gupta, Stephen Malina, Samuel Albanie, Will Cai, Mustafa Mehkary, Frank Reidegeld, Anna-Katharina Dick, Cary Friday, Jasdeep Sidhu, Wanyoung Kim, Mariana Costa, Hubeyb Gurdogan, Brian Weber, Harsh Kumar, Tong Jiang, Arunim Agarwal, Chiara Ceconello, Warren S. Vaz, Chao Zhuang, Haon Park, Andrew R. Tawfeek, Daattavya Aggarwal, Michael Kirchhof, Linjie Dai, Evan Kim, Johan Ferret, Yuzhou Wang, Minghao Yan, Krzysztof Burdzy, Lixin Zhang, Antonio Franca, Diana T. Pham, Kang Yong Loh, Joshua Robinson, Shreen Gul, Gunjan Chhablani, Zhehang Du, Adrian Cosma, Colin White, Robin Riblet, Prajvi Saxena, Jacob Votava, Vladimir Vinnikov, Ethan Delaney, Shiv Halasyamani, Syed M. Shahid, Jean-Christophe Mourrat, Lavr Vetoshkin, Renas Bacho, Vincent Ginis, Aleksandr Maksapetyan, Florencia de la Rosa, Xiuyu Li, Guillaume Malod, Leon Lang, Julien Laurendeau, Fatimah Adesanya, Julien Portier, Lawrence Hollom, Victor Souza, Yuchen Anna Zhou, Yiğit Yalın, Gbenga Daniel Obikoya, Luca Arnaboldi, Rai, Filippo Bigi, Kaniuar Bacho, Pierre Clavier, Gabriel Recchia, Mara Popescu, Nikita Shulga, Ngefor Mildred Tanwie, Thomas C. H. Lux, Ben Rank, Colin Ni, Alesia Yakimchyk, Huanxu, Liu, Olle Häggström, Emil Verkama, Himanshu Narayan, Hans Gundlach, Leonor Brito-Santana, Brian Amaro, Vivek Vajipey, Rynaa Grover, Yiyang Fan, Gabriel Poesia Reis e Silva, Linwei Xin, Yosi Kratish, Jakub Łucki, Wen-Ding Li, Justin Xu, Kevin Joseph Scaria, Freddie Vargus, Farzad Habibi, Long, Lian, Emanuele Rodolà, Jules Robins, Vincent Cheng, Declan Grabb, Ida Bosio, Tony Fruhauff, Ido Akov, Eve J. Y. Lo, Hao Qi, Xi Jiang, Ben Segev, Jingxuan Fan, Sarah Martinson, Erik Y. Wang, Kaylie Hausknecht, Michael P. Brenner, Mao Mao, Yibo Jiang, Xinyu Zhang, David Avagian, Eshawn Jessica Scipio, Muhammad Rehan Siddiqi, Alon Ragoler, Justin Tan, Deepakkumar Patil, Rebeka Plecnik, Aaron Kirtland, Roselynn Grace Montecillo, Stephane Durand, Omer Faruk Bodur, Zahra Adoul, Mohamed Zekry, Guillaume Douville, Ali Karakoc, Tania C. B. Santos, Samir Shamseldeen, Loukmane Karim, Anna Liakhovitskaia, Nate Resman, Nicholas Farina, Juan Carlos Gonzalez, Gabe Maayan, Sarah Hoback, Rodrigo De Oliveira Pena, Glen Sherman, Hodjat Mariji, Rasoul Pouriamanesh, Wentao Wu, Gözdenur Demir, Sandra Mendoza, Ismail Alarab, Joshua Cole, Danyelle Ferreira, Bryan Johnson, Hsiaoyun Milliron, Mohammad Safdari, Liangti Dai, Siriphan Arthornthurasuk, Alexey Pronin, Jing Fan, Angel Ramirez-Trinidad, Ashley Cartwright, Daphiny Pottmaier, Omid Taheri, David Outevsky, Stanley Stepanic, Samuel Perry, Luke Askew, Raúl Adrián Huerta Rodríguez, Abdelkader Dendane, Sam Ali, Ricardo Lorena, Krishnamurthy Iyer, Sk Md Salauddin, Murat Islam, Juan Gonzalez, Josh Ducey, Russell Campbell, Maja Somrak, Vasilios Mavroudis, Eric Vergo, Juehang Qin, Benjámin Borbás, Eric Chu, Jack Lindsey, Anil Radhakrishnan, Antoine Jallon, I. M. J. McInnis, Alex Hoover, Sören Möller, Song Bian, John Lai, Tejal Patwardhan, Summer Yue, Alexandr Wang, Dan Hendrycks

However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90\% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities.

Humanity's Last Exam Language Modeling +4

MLAN: Language-Based Instruction Tuning Improves Zero-Shot Generalization of Multimodal Large Language Models

1 code implementation15 Nov 2024 Jianhong Tu, Zhuohao Ni, Nicholas Crispino, Zihao Yu, Michael Bendersky, Beliz Gunel, Ruoxi Jia, Xin Liu, Lingjuan Lyu, Dawn Song, Chenguang Wang

With a small number of visual instructions, this emerging language instruction following ability transfers well to the unseen vision datasets, outperforming the state of the art with greater training efficiency.

Instruction Following Zero-shot Generalization

Rethinking the "Heatmap + Monte Carlo Tree Search" Paradigm for Solving Large Scale TSP

1 code implementation14 Nov 2024 Xuanhao Pan, Chenguang Wang, Chaolong Ying, Ye Xue, Tianshu Yu

The Travelling Salesman Problem (TSP) remains a fundamental challenge in combinatorial optimization, inspiring diverse algorithmic strategies.

Combinatorial Optimization

JudgeBench: A Benchmark for Evaluating LLM-based Judges

1 code implementation16 Oct 2024 Sijun Tan, Siyuan Zhuang, Kyle Montgomery, William Y. Tang, Alejandro Cuadron, Chenguang Wang, Raluca Ada Popa, Ion Stoica

To address this, we propose a novel evaluation framework to objectively evaluate LLM-based judges.

Math

An Efficient and Explainable Transformer-Based Few-Shot Learning for Modeling Electricity Consumption Profiles Across Thousands of Domains

1 code implementation15 Aug 2024 Weijie Xia, Gao Peng, Chenguang Wang, Peter Palensky, Eric Pauwels, Pedro P. Vergara

Nevertheless, standard FSL methods, such as those used for images, are unsuitable for ECP modeling because (1) these methods usually assume several source domains with sufficient data and several target domains.

Few-Shot Learning Transfer Learning

RuleR: Improving LLM Controllability by Rule-based Data Recycling

3 code implementations22 Jun 2024 Ming Li, Han Chen, Chenguang Wang, Dang Nguyen, Dianqi Li, Tianyi Zhou

Despite the remarkable advancement of Large language models (LLMs), they still lack delicate controllability under sophisticated constraints, which is critical to enhancing their response quality and the user experience.

Data Augmentation Instruction Following

Mosaic-IT: Free Compositional Data Augmentation Improves Instruction Tuning

3 code implementations22 May 2024 Ming Li, Pei Chen, Chenguang Wang, Hongyu Zhao, Yijun Liang, Yupeng Hou, Fuxiao Liu, Tianyi Zhou

Finetuning large language models with a variety of instruction-response pairs has enhanced their capability to understand and follow instructions.

Data Augmentation Diversity +1

A Flow-Based Model for Conditional and Probabilistic Electricity Consumption Profile Generation and Prediction

1 code implementation3 May 2024 Weijie Xia, Chenguang Wang, Peter Palensky, Pedro P. Vergara

Residential Load Profile (RLP) generation and prediction are critical for the operation and planning of distribution networks, especially as diverse low-carbon technologies (e. g., photovoltaic and electric vehicles) are increasingly adopted.

Load Forecasting Profile Generation

Measuring Social Norms of Large Language Models

no code implementations3 Apr 2024 Ye Yuan, Kexin Tang, Jianhao Shen, Ming Zhang, Chenguang Wang

This enables the direct comparison of the social understanding of large language models to humans, more specifically, elementary students.

Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study

1 code implementation15 Mar 2024 Chenguang Wang, Ruoxi Jia, Xin Liu, Dawn Song

We show that CLIP leads to a significant robustness drop compared to supervised ImageNet models on our benchmark, especially under synthetic distribution shift and adversarial attacks.

Benchmarking

Measuring Vision-Language STEM Skills of Neural Models

1 code implementation27 Feb 2024 Jianhao Shen, Ye Yuan, Srbuhi Mirzoyan, Ming Zhang, Chenguang Wang

Compared to existing datasets that often focus on examining expert-level ability, our dataset includes fundamental skills and questions designed based on the K-12 curriculum.

Multimodal Reasoning

Towards Principled Task Grouping for Multi-Task Learning

no code implementations23 Feb 2024 Chenguang Wang, Xuanhao Pan, Tianshu Yu

This paper presents a novel approach to task grouping in Multitask Learning (MTL), advancing beyond existing methods by addressing key theoretical and practical limitations.

Combinatorial Optimization Multi-Task Learning +1

Preference Poisoning Attacks on Reward Model Learning

no code implementations2 Feb 2024 Junlin Wu, Jiongxiao Wang, Chaowei Xiao, Chenguang Wang, Ning Zhang, Yevgeniy Vorobeychik

In addition, we observe that the simpler and more scalable rank-by-distance approaches are often competitive with, and on occasion significantly outperform, gradient-based methods.

model Recommendation Systems

Near-real-time Earthquake-induced Fatality Estimation using Crowdsourced Data and Large-Language Models

no code implementations4 Dec 2023 Chenguang Wang, Davis Engler, Xuechun Li, James Hou, David J. Wald, Kishor Jaiswal, Susu Xu

Traditional systems for estimating human loss in disasters often depend on manually collected early casualty reports from global media, a process that's labor-intensive and slow with notable time delays.

Few-Shot Learning

Agent Instructs Large Language Models to be General Zero-Shot Reasoners

1 code implementation5 Oct 2023 Nicholas Crispino, Kyle Montgomery, Fankun Zeng, Dawn Song, Chenguang Wang

For instance, our method boosts the performance of state-of-the-art large language models by a large margin, including Vicuna-13b (13. 3%), Llama-2-70b-chat (23. 2%), and GPT-3. 5 Turbo (17. 0%).

Causality-informed Rapid Post-hurricane Building Damage Detection in Large Scale from InSAR Imagery

no code implementations2 Oct 2023 Chenguang Wang, Yepeng Liu, Xiaojian Zhang, Xuechun Li, Vladimir Paramygin, Arthriya Subgranon, Peter Sheng, Xilei Zhao, Susu Xu

We gathered and annotated building damage ground truth data in Lee County, Florida, and compared the introduced method's estimation results with the ground truth and benchmarked it against state-of-the-art models to assess the effectiveness of our proposed method.

Practical Membership Inference Attacks Against Large-Scale Multi-Modal Models: A Pilot Study

1 code implementation ICCV 2023 Myeongseob Ko, Ming Jin, Chenguang Wang, Ruoxi Jia

Furthermore, our enhanced attacks outperform the baseline across multiple models and datasets, with the weakly supervised attack demonstrating an average-case performance improvement of $17\%$ and being at least $7$X more effective at low false-positive rates.

Efficient Training of Multi-task Combinarotial Neural Solver with Multi-armed Bandits

no code implementations10 May 2023 Chenguang Wang, Tianshu Yu

Efficiently training a multi-task neural solver for various combinatorial optimization problems (COPs) has been less studied so far.

Combinatorial Optimization Decoder +1

Targeted Analysis of High-Risk States Using an Oriented Variational Autoencoder

no code implementations20 Mar 2023 Chenguang Wang, Ensieh Sharifnia, Simon H. Tindemans, Peter Palensky

Variational autoencoder (VAE) neural networks can be trained to generate power system states that capture both marginal distribution and multivariate dependencies of historical data.

Vocal Bursts Intensity Prediction

ASP: Learn a Universal Neural Solver!

1 code implementation1 Mar 2023 Chenguang Wang, Zhouliang Yu, Stephen Mcaleer, Tianshu Yu, Yaodong Yang

Applying machine learning to combinatorial optimization problems has the potential to improve both efficiency and accuracy.

Combinatorial Optimization Traveling Salesman Problem

Comprehensive Analysis of Over-smoothing in Graph Neural Networks from Markov Chains Perspective

no code implementations12 Nov 2022 Weichen Zhao, Chenguang Wang, Congying Han, Tiande Guo

Results show that our proposed sufficient condition can effectively improve over-smoothing problem in operator-inconsistent GNN and enhance the performance of the model.

Attribute Graph Neural Network

Benchmarking Language Models for Code Syntax Understanding

1 code implementation26 Oct 2022 Da Shen, Xinyun Chen, Chenguang Wang, Koushik Sen, Dawn Song

Our key observation is that existing language models pretrained on code still lack the understanding of code syntax.

Benchmarking

IELM: An Open Information Extraction Benchmark for Pre-Trained Language Models

no code implementations25 Oct 2022 Chenguang Wang, Xiao Liu, Dawn Song

Instead of focusing on pre-defined relations, we create an OIE benchmark aiming to fully examine the open relational information present in the pre-trained LMs.

Open Information Extraction

Fine-mixing: Mitigating Backdoors in Fine-tuned Language Models

1 code implementation18 Oct 2022 Zhiyuan Zhang, Lingjuan Lyu, Xingjun Ma, Chenguang Wang, Xu sun

In this work, we take the first step to exploit the pre-trained (unfine-tuned) weights to mitigate backdoors in fine-tuned language models.

Language Modelling Sentence +4

Joint Language Semantic and Structure Embedding for Knowledge Graph Completion

1 code implementation COLING 2022 Jianhao Shen, Chenguang Wang, Linyuan Gong, Dawn Song

Unlike previous approaches that rely on either the structures or semantics of the knowledge graphs, we propose to jointly embed the semantics in the natural language description of the knowledge triplets with their structure information.

Link Prediction

Generating Contextual Load Profiles Using a Conditional Variational Autoencoder

no code implementations8 Sep 2022 Chenguang Wang, Simon H. Tindemans, Peter Palensky

Generating power system states that have similar distribution and dependency to the historical ones is essential for the tasks of system planning and security assessment, especially when the historical data is insufficient.

Protecting Intellectual Property of Language Generation APIs with Lexical Watermark

1 code implementation5 Dec 2021 Xuanli He, Qiongkai Xu, Lingjuan Lyu, Fangzhao Wu, Chenguang Wang

Nowadays, due to the breakthrough in natural language generation (NLG), including machine translation, document summarization, image captioning, etc NLG models have been encapsulated in cloud APIs to serve over half a billion people worldwide and process over one hundred billion word generations per day.

Document Summarization Image Captioning +3

A Game-Theoretic Approach for Improving Generalization Ability of TSP Solvers

no code implementations28 Oct 2021 Chenguang Wang, Yaodong Yang, Oliver Slumbers, Congying Han, Tiande Guo, Haifeng Zhang, Jun Wang

In this paper, we introduce a two-player zero-sum framework between a trainable \emph{Solver} and a \emph{Data Generator} to improve the generalization ability of deep learning-based solvers for Traveling Salesman Problem (TSP).

Traveling Salesman Problem

Generating Multivariate Load States Using a Conditional Variational Autoencoder

1 code implementation21 Oct 2021 Chenguang Wang, Ensieh Sharifnia, Zhi Gao, Simon H. Tindemans, Peter Palensky

In this paper, a multivariate load state generating model on the basis of a conditional variational autoencoder (CVAE) neural network is proposed.

Learning Graph Representation by Aggregating Subgraphs via Mutual Information Maximization

no code implementations24 Mar 2021 Chenguang Wang, Ziwen Liu

For this purpose, we propose a universal framework to generate subgraphs in an auto-regressive way and then using these subgraphs to guide the learning of graph representation by Graph Neural Networks.

Attribute Contrastive Learning +2

Language Models are Open Knowledge Graphs

2 code implementations22 Oct 2020 Chenguang Wang, Xiao Liu, Dawn Song

This paper shows how to construct knowledge graphs (KGs) from pre-trained language models (e. g., BERT, GPT-2/3), without human supervision.

Knowledge Graphs

Training Strategies for Autoencoder-based Detection of False Data Injection Attacks

no code implementations14 May 2020 Chenguang Wang, Kaikai Pan, Simon Tindemans, Peter Palensky

The security of energy supply in a power grid critically depends on the ability to accurately estimate the state of the system.

Detection of False Data Injection Attacks Using the Autoencoder Approach

no code implementations4 Mar 2020 Chenguang Wang, Simon Tindemans, Kaikai Pan, Peter Palensky

State estimation is of considerable significance for the power system operation and control.

Transformer on a Diet

1 code implementation14 Feb 2020 Chenguang Wang, Zihao Ye, Aston Zhang, Zheng Zhang, Alexander J. Smola

Transformer has been widely used thanks to its ability to capture sequence information in an efficient way.

Language Modeling Language Modelling

Meta-Path Constrained Random Walk Inference for Large-Scale Heterogeneous Information Networks

no code implementations2 Dec 2019 Chenguang Wang

It is impractical for users to provide the meta-path(s) to support the large scale inference, and biased examples will result in incorrect meta-path based inference, thus limit the power of the meta-path.

PoD: Positional Dependency-Based Word Embedding for Aspect Term Extraction

no code implementations COLING 2020 Yichun Yin, Chenguang Wang, Ming Zhang

Dependency context-based word embedding jointly learns the representations of word and dependency context, and has been proved effective in aspect term extraction.

Aspect Term Extraction and Sentiment Classification POS +2

GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing

3 code implementations9 Jul 2019 Jian Guo, He He, Tong He, Leonard Lausen, Mu Li, Haibin Lin, Xingjian Shi, Chenguang Wang, Junyuan Xie, Sheng Zha, Aston Zhang, Hang Zhang, Zhi Zhang, Zhongyue Zhang, Shuai Zheng, Yi Zhu

We present GluonCV and GluonNLP, the deep learning toolkits for computer vision and natural language processing based on Apache MXNet (incubating).

Deep Learning

Language Models with Transformers

1 code implementation arXiv 2019 Chenguang Wang, Mu Li, Alexander J. Smola

In this paper, we explore effective Transformer architectures for language model, including adding additional LSTM layers to better capture the sequential context while still keeping the computation efficient.

Ranked #2 on Language Modelling on Penn Treebank (Word Level) (using extra training data)

Computational Efficiency Language Modeling +2

World Knowledge as Indirect Supervision for Document Clustering

no code implementations30 Jul 2016 Chenguang Wang, Yangqiu Song, Dan Roth, Ming Zhang, Jiawei Han

We provide three ways to specify the world knowledge to domains by resolving the ambiguity of the entities and their types, and represent the data with world knowledge as a heterogeneous information network.

Clustering World Knowledge

Cannot find the paper you are looking for? You can Submit a new open access paper.