Search Results for author: Jacques Klein

Found 22 papers, 10 papers with code

LuxemBERT: Simple and Practical Data Augmentation in Language Model Pre-Training for Luxembourgish

no code implementations • LREC 2022 • Cedric Lothritz, Bertrand Lebichot, Kevin Allix, Lisa Veiber, Tegawende Bissyande, Jacques Klein, Andrey Boytsov, Clément Lefebvre, Anne Goujon

Pre-trained Language Models such as BERT have become ubiquitous in NLP where they have achieved state-of-the-art performance in most NLP tasks.

Data Augmentation Language Modelling

Revisiting Code Similarity Evaluation with Abstract Syntax Tree Edit Distance

no code implementations • 12 Apr 2024 • Yewei Song, Cedric Lothritz, Daniel Tang, Tegawendé F. Bissyandé, Jacques Klein

This paper revisits recent code similarity evaluation metrics, particularly focusing on the application of Abstract Syntax Tree (AST) editing distance in diverse programming languages.

Open-Source AI-based SE Tools: Opportunities and Challenges of Collaborative Software Learning

no code implementations • 9 Apr 2024 • ZhiHao Lin, Wei Ma, Tao Lin, Yaowen Zheng, Jingquan Ge, Jun Wang, Jacques Klein, Tegawende Bissyande, Yang Liu, Li Li

We introduce a governance framework centered on federated learning (FL), designed to foster the joint development and maintenance of open-source AI code models while safeguarding data privacy and security.

Federated Learning

Soft Prompt Tuning for Cross-Lingual Transfer: When Less is More

1 code implementation • 6 Feb 2024 • Fred Philippy, Siwen Guo, Shohreh Haddadan, Cedric Lothritz, Jacques Klein, Tegawendé F. Bissyandé

Soft Prompt Tuning (SPT) is a parameter-efficient method for adapting pre-trained language models (PLMs) to specific tasks by inserting learnable embeddings, or soft prompts, at the input layer of the PLM, without modifying its parameters.

Cross-Lingual Transfer

Just-in-Time Security Patch Detection -- LLM At the Rescue for Data Augmentation

no code implementations • 2 Dec 2023 • Xunzhu Tang, Zhenghan Chen, Kisub Kim, Haoye Tian, Saad Ezzini, Jacques Klein

To address this pressing issue, we introduce a novel security patch detection system, LLMDA, which capitalizes on Large Language Models (LLMs) and code-text alignment methodologies for patch review, data enhancement, and feature combination.

Contrastive Learning Data Augmentation

LaFiCMIL: Rethinking Large File Classification from the Perspective of Correlated Multiple Instance Learning

no code implementations • 30 Jul 2023 • Tiezhu Sun, Weiguo Pian, Nadia Daoudi, Kevin Allix, Tegawendé F. Bissyandé, Jacques Klein

This efficiency, coupled with its state-of-the-art performance, highlights LaFiCMIL's potential as a groundbreaking approach in the field of large file classification.

Android Malware Detection Defect Detection +5

Is ChatGPT the Ultimate Programming Assistant -- How far is it?

no code implementations • 24 Apr 2023 • Haoye Tian, Weiqi Lu, Tsz On Li, Xunzhu Tang, Shing-Chi Cheung, Jacques Klein, Tegawendé F. Bissyandé

To assess the feasibility of using an LLM as a useful assistant bot for programmers, we must assess its realistic capabilities on unseen problems as well as its capabilities on various tasks.

Code Generation Code Summarization +2

Letz Translate: Low-Resource Machine Translation for Luxembourgish

no code implementations • 2 Mar 2023 • Yewei Song, Saad Ezzini, Jacques Klein, Tegawende Bissyande, Clément Lefebvre, Anne Goujon

We also make use of high-resource languages that are related or share the same linguistic root as the target LRL.

Knowledge Distillation Machine Translation +1

App Review Driven Collaborative Bug Finding

1 code implementation • 7 Jan 2023 • Xunzhu Tang, Haoye Tian, Pingfan Kong, Kui Liu, Jacques Klein, Tegawendé F. Bissyande

Our novelty is that we guide the bug finding process by considering that existing bugs have been hinted within app reviews.

DexBERT: Effective, Task-Agnostic and Fine-grained Representation Learning of Android Bytecode

1 code implementation • 12 Dec 2022 • Tiezhu Sun, Kevin Allix, Kisub Kim, Xin Zhou, Dongsun Kim, David Lo, Tegawendé F. Bissyandé, Jacques Klein

Central to applying ML to software artifacts (like source or executable code) is converting them into forms suitable for learning.

Language Modelling Representation Learning

AI-driven Mobile Apps: an Explorative Study

1 code implementation • 3 Dec 2022 • Yinghua Li, Xueqi Dang, Haoye Tian, Tiezhu Sun, Zhijie Wang, Lei Ma, Jacques Klein, Tegawende F. Bissyande

In this paper, we conduct the most extensive empirical study on 56,682 published AI apps from three perspectives: dataset characteristics, development issues, and user feedback and privacy.

Is this Change the Answer to that Problem? Correlating Descriptions of Bug and Code Changes for Evaluating Patch Correctness

1 code implementation • 8 Aug 2022 • Haoye Tian, Xunzhu Tang, Andrew Habib, Shangwen Wang, Kui Liu, Xin Xia, Jacques Klein, Tegawendé F. Bissyandé

To tackle this problem, our intuition is that natural language processing can provide the necessary representations and models for assessing the semantic correlation between a bug (question) and a patch (answer).

Question Answering

MetaTPTrans: A Meta Learning Approach for Multilingual Code Representation Learning

1 code implementation • 13 Jun 2022 • Weiguo Pian, Hanyu Peng, Xunzhu Tang, Tiezhu Sun, Haoye Tian, Andrew Habib, Jacques Klein, Tegawendé F. Bissyandé

Representation learning of source code is essential for applying machine learning to software engineering tasks.

Code Completion Code Summarization +2

A two-steps approach to improve the performance of Android malware detectors

no code implementations • 17 May 2022 • Nadia Daoudi, Kevin Allix, Tegawendé F. Bissyandé, Jacques Klein

For the subset of "difficult" samples, we rely on GUIDED RETRAINING, which leverages the correct predictions and the errors made by the base malware detector to guide the retraining process.

Android Malware Detection Binary Classification +4

Early Detection of Security-Relevant Bug Reports using Machine Learning: How Far Are We?

no code implementations • 19 Dec 2021 • Arthur D. Sawadogo, Quentin Guimard, Tegawendé F. Bissyandé, Abdoul Kader Kaboré, Jacques Klein, Naouel Moha

Bug reports are common artefacts in software development.

TAG

DexRay: A Simple, yet Effective Deep Learning Approach to Android Malware Detection based on Image Representation of Bytecode

1 code implementation • 5 Sep 2021 • Nadia Daoudi, Jordan Samhi, Abdoul Kader Kabore, Kevin Allix, Tegawendé F. Bissyandé, Jacques Klein

This work-in-progress paper contributes to the domain of Deep Learning based Malware detection by providing a sound, simple, yet effective approach (with available artefacts) that can be the basis to scope the many profound questions that will need to be investigated to fully develop this domain.

Android Malware Detection Malware Detection +1

Predicting Patch Correctness Based on the Similarity of Failing Test Cases

1 code implementation • 28 Jul 2021 • Haoye Tian, Yinghua Li, Weiguo Pian, Abdoul Kader Kaboré, Kui Liu, Andrew Habib, Jacques Klein, Tegawendé F. Bissyande

Then, after collecting a large dataset of 1278 plausible patches (written by developers or generated by some 32 APR tools), we use BATS to predict correctness: BATS achieves an AUC between 0.557 and 0.718 and a recall between 0.562 and 0.854 in identifying correct patches.

Representation Learning

IBIR: Bug Report driven Fault Injection

no code implementations • 11 Dec 2020 • Ahmed Khanfir, Anil Koyuncu, Mike Papadakis, Maxime Cordy, Tegawendé F. Bissyandé, Jacques Klein, Yves Le Traon

It remains indeed challenging to inject few but realistic faults that target a particular functionality in a program.

Fault Detection Program Repair +2 Software Engineering

Evaluating Pretrained Transformer-based Models on the Task of Fine-Grained Named Entity Recognition

no code implementations • COLING 2020 • Cedric Lothritz, Kevin Allix, Lisa Veiber, Tegawendé F. Bissyandé, Jacques Klein

In this paper, we compare three transformer-based models (BERT, RoBERTa, and XLNet) to two non-transformer-based models (CRF and BiLSTM-CNN-CRF).

Named Entity Recognition +1

A First Look at Android Applications in Google Play related to Covid-19

no code implementations • 19 Jun 2020 • Jordan Samhi, Kevin Allix, Tegawendé F. Bissyandé, Jacques Klein

Due to the convenience of access-on-demand to information and business solutions, mobile apps have become an important asset in the digital world.

Software Engineering Computers and Society

iFixR: Bug Report driven Program Repair

2 code implementations • 12 Jul 2019 • Anil Koyuncu, Kui Liu, Tegawendé F. Bissyandé, Dongsun Kim, Martin Monperrus, Jacques Klein, Yves Le Traon

Towards increasing the adoption of patch generation tools by practitioners, we investigate a new repair pipeline, iFixR, driven by bug reports: (1) bug reports are fed to an IR-based fault localizer; (2) patches are generated from fix patterns and validated via regression testing; (3) a prioritized list of generated patches is proposed to developers.

Software Engineering

Rebooting Research on Detecting Repackaged Android Apps: Literature Review and Benchmark

1 code implementation • 20 Nov 2018 • Li Li, Tegawendé Bissyandé, Jacques Klein

Repackaging is a serious threat to the Android ecosystem as it deprives app developers of their benefits, contributes to spreading malware on users' devices, and increases the workload of market maintainers.

Software Engineering
