Search Results for author: Junjie Wang

Found 73 papers, 32 papers with code

MIRTT: Learning Multimodal Interaction Representations from Trilinear Transformers for Visual Question Answering

1 code implementation Findings (EMNLP) 2021 Junjie Wang, Yatai Ji, Jiaqi Sun, Yujiu Yang, Tetsuya Sakai

On the other hand, trilinear models such as the CTI model efficiently utilize the inter-modality information between answers, questions, and images, while ignoring intra-modality information.

multimodal interaction Multiple-choice +2

Joint-GCG: Unified Gradient-Based Poisoning Attacks on Retrieval-Augmented Generation Systems

1 code implementation 6 Jun 2025 Haowei Wang, Rupeng Zhang, Junjie Wang, Mingyang Li, Yuekai Huang, Dandan Wang, Qing Wang

Joint-GCG's innovative unification of gradient-based attacks across retrieval and generation stages fundamentally reshapes our understanding of vulnerabilities within RAG systems.

RAG Retrieval +1

AdInject: Real-World Black-Box Attacks on Web Agents via Advertising Delivery

1 code implementation 27 May 2025 Haowei Wang, Junjie Wang, Xiaojun Jia, Rupeng Zhang, Mingyang Li, Zhe Liu, Yang Liu, Qing Wang

In this paper, we propose AdInject, a novel and real-world black-box attack method that leverages internet advertising delivery to inject malicious content into the Web Agent's environment.

Fast and Accurate Power Load Data Completion via Regularization-optimized Low-Rank Factorization

no code implementations 25 May 2025 Yan Xia, Hao Feng, Hongwei Sun, Junjie Wang, Qicong Hu

Low-rank representation learning has emerged as a powerful tool for recovering missing values in power load data due to its ability to exploit the inherent low-dimensional structures of spatiotemporal measurements.

Computational Efficiency Imputation +2

One Shot Dominance: Knowledge Poisoning Attack on Retrieval-Augmented Generation Systems

no code implementations 15 May 2025 Zhiyuan Chang, Mingyang Li, Xiaojun Jia, Junjie Wang, Yuekai Huang, Ziyou Jiang, Yang Liu, Qing Wang

Large Language Models (LLMs) enhanced with Retrieval-Augmented Generation (RAG) have shown improved performance in generating accurate responses.

RAG Retrieval +1

DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception

1 code implementation CVPR 2025 Junjie Wang, Bin Chen, Yulin Li, Bin Kang, YiChi Chen, Zhuotao Tian

To address this issue, we propose DeCLIP, a novel framework that enhances CLIP by decoupling the self-attention module to obtain "content" and "context" features respectively.

object-detection Object Detection +2

PIV-FlowDiffuser: Transfer-learning-based denoising diffusion models for PIV

1 code implementation 21 Apr 2025 Qianyu Zhu, Junjie Wang, Jeremiah Hu, Jia Ai, Yong Lee

To reduce the special noise step-by-step, we employ a denoising diffusion model (FlowDiffuser) for PIV analysis.

Denoising Optical Flow Estimation +1

Vulnerability of Text-to-Image Models to Prompt Template Stealing: A Differential Evolution Approach

no code implementations 20 Feb 2025 Yurong Wu, Fangwen Mu, Qiuhong Zhang, Jinjing Zhao, Xinrun Xu, Lingrui Mei, Yang Wu, Lin Shi, Junjie Wang, Zhiming Ding, Yiwei Wang

Prompt trading has emerged as a significant intellectual property concern in recent years, where vendors entice users by showcasing sample images before selling prompt templates that can generate similar images.

Explore-Construct-Filter: An Automated Framework for Rich and Reliable API Knowledge Graph Construction

no code implementations 19 Feb 2025 Yanbang Sun, Qing Huang, Xiaoxue Ren, Zhenchang Xing, Xiaohong Li, Junjie Wang

The API Knowledge Graph (API KG) is a structured network that models API entities and their relations, providing essential semantic insights for tasks such as API recommendation, code generation, and API misuse detection.

Code Generation graph construction

Mimicking the Familiar: Dynamic Command Generation for Information Theft Attacks in LLM Tool-Learning System

no code implementations 17 Feb 2025 Ziyou Jiang, Mingyang Li, Guowei Yang, Junjie Wang, Yuekai Huang, Zhiyuan Chang, Qing Wang

Inspired by the concept of mimicking the familiar, AutoCMD is capable of inferring the information utilized by upstream tools in the toolchain through learning on open-source systems and reinforcement with target system examples, thereby generating more targeted commands for information theft.

Comment Generation Large Language Model

OntoTune: Ontology-Driven Self-training for Aligning Large Language Models

1 code implementation 8 Feb 2025 Zhiqiang Liu, Chengtao Gan, Junjie Wang, Yichi Zhang, Zhongpu Bo, Mengshu Sun, Huajun Chen, Wen Zhang

Compared to existing domain LLMs based on newly collected large-scale domain-specific corpora, our OntoTune, which relies on the existing, long-term developed ontology and LLM itself, significantly reduces data maintenance costs and offers improved generalization ability.

Hypernym Discovery In-Context Learning

Chain-of-Reasoning: Towards Unified Mathematical Reasoning in Large Language Models via a Multi-Paradigm Perspective

no code implementations 19 Jan 2025 Yiyao Yu, Yuxiang Zhang, Dongdong Zhang, Xiao Liang, Hengyuan Zhang, Xingxing Zhang, Mahmoud Khademi, Hany Awadalla, Junjie Wang, Yujiu Yang, Furu Wei

Large Language Models (LLMs) have made notable progress in mathematical reasoning, yet often rely on single-paradigm reasoning, limiting their effectiveness across diverse tasks.

Automated Theorem Proving Math +2

Joint Secrecy Rate Achieving and Authentication Enhancement via Tag-based Encoding in Chaotic UAV Communication Environment

no code implementations 6 Jan 2025 Junjie Wang, Fang Fang, Gangtao Han, Ning Wang, Xianbin Wang

To protect legitimate communication in a chaotic UAV environment, where both eavesdropping and jamming become straightforward from multiple adversaries with line-of-sight signal propagation, a new reliable and integrated physical layer security mechanism is proposed in this paper for a massive multiple-input-multiple-output (MIMO) UAV system.

TAG

Identity-Clothing Similarity Modeling for Unsupervised Clothing Change Person Re-Identification

no code implementations CVPR 2025 Zhiqi Pang, Junjie Wang, Lingling Zhao, Chunyu Wang

Clothing change person re-identification (CC-ReID) aims to match different images of the same person, even when the clothing varies across images.

Person Re-Identification Pseudo Label

Understanding Individual Agent Importance in Multi-Agent System via Counterfactual Reasoning

no code implementations 20 Dec 2024 Jianming Chen, Yawen Wang, Junjie Wang, Xiaofei Xie, Jun Hu, Qing Wang, Fanjiang Xu

Inspired by counterfactual reasoning, a larger change in reward caused by the randomized action of an agent indicates its higher importance.

counterfactual Counterfactual Reasoning

What External Knowledge is Preferred by LLMs? Characterizing and Exploring Chain of Evidence in Imperfect Context

no code implementations 17 Dec 2024 Zhiyuan Chang, Mingyang Li, Xiaojun Jia, Junjie Wang, Yuekai Huang, Qing Wang, Yihao Huang, Yang Liu

Incorporating external knowledge into large language models (LLMs) has emerged as a promising approach to mitigate outdated knowledge and hallucination in LLMs.

Hallucination Misinformation +2

Multi-Domain Features Guided Supervised Contrastive Learning for Radar Target Detection

no code implementations 17 Dec 2024 Junjie Wang, Yuze Gao, Dongying Li, Wenxian Yu

In this letter, we propose a multi-domain features guided supervised contrastive learning (MDFG_SCL) method, which integrates statistical features derived from multi-domain differences with deep features obtained through supervised contrastive learning, thereby capturing both low-level domain-specific variations and high-level semantic information.

Contrastive Learning

From Allies to Adversaries: Manipulating LLM Tool-Calling through Adversarial Injection

1 code implementation 13 Dec 2024 Haowei Wang, Rupeng Zhang, Junjie Wang, Mingyang Li, Yuekai Huang, Dandan Wang, Qing Wang

To fill this gap, we present ToolCommander, a novel framework designed to exploit vulnerabilities in LLM tool-calling systems through adversarial tool injection.

Language Modeling Language Modelling +2

CodePurify: Defend Backdoor Attacks on Neural Code Models via Entropy-based Purification

no code implementations 26 Oct 2024 Fangwen Mu, Junjie Wang, Zhuohao Yu, Lin Shi, Song Wang, Mingyang Li, Qing Wang

In this study, we propose CodePurify, a novel defense against backdoor attacks on code models through entropy-based purification.

TorchTitan: One-stop PyTorch native solution for production ready LLM pre-training

3 code implementations 9 Oct 2024 Wanchao Liang, Tianyu Liu, Less Wright, Will Constable, Andrew Gu, Chien-chin Huang, Iris Zhang, Wei Feng, Howard Huang, Junjie Wang, Sanket Purandare, Gokul Nadathur, Stratos Idreos

By stacking training optimizations, we demonstrate accelerations of 65.08% with 1D parallelism at the 128-GPU scale (Llama 3.1 8B), an additional 12.59% with 2D parallelism at the 256-GPU scale (Llama 3.1 70B), and an additional 30% with 3D parallelism at the 512-GPU scale (Llama 3.1 405B) on NVIDIA H100 GPUs over optimized baselines.

The Llama 3 Herd of Models

4 code implementations31 Jul 2024 Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Alex Vaughan, Amy Yang, Angela Fan, Anirudh Goyal, Anthony Hartshorn, Aobo Yang, Archi Mitra, Archie Sravankumar, Artem Korenev, Arthur Hinsvark, Arun Rao, Aston Zhang, Aurelien Rodriguez, Austen Gregerson, Ava Spataru, Baptiste Roziere, Bethany Biron, Binh Tang, Bobbie Chern, Charlotte Caucheteux, Chaya Nayak, Chloe Bi, Chris Marra, Chris McConnell, Christian Keller, Christophe Touret, Chunyang Wu, Corinne Wong, Cristian Canton Ferrer, Cyrus Nikolaidis, Damien Allonsius, Daniel Song, Danielle Pintz, Danny Livshits, Danny Wyatt, David Esiobu, Dhruv Choudhary, Dhruv Mahajan, Diego Garcia-Olano, Diego Perino, Dieuwke Hupkes, Egor Lakomkin, Ehab AlBadawy, Elina Lobanova, Emily Dinan, Eric Michael Smith, Filip Radenovic, Francisco Guzmán, Frank Zhang, Gabriel Synnaeve, Gabrielle Lee, Georgia Lewis Anderson, Govind Thattai, Graeme Nail, Gregoire Mialon, Guan Pang, Guillem Cucurell, Hailey Nguyen, Hannah Korevaar, Hu Xu, Hugo Touvron, Iliyan Zarov, Imanol Arrieta Ibarra, Isabel Kloumann, Ishan Misra, Ivan Evtimov, Jack Zhang, Jade Copet, Jaewon Lee, Jan Geffert, Jana Vranes, Jason Park, Jay Mahadeokar, Jeet Shah, Jelmer Van der Linde, Jennifer Billock, Jenny Hong, Jenya Lee, Jeremy Fu, Jianfeng Chi, Jianyu Huang, Jiawen Liu, Jie Wang, Jiecao Yu, Joanna Bitton, Joe Spisak, Jongsoo Park, Joseph Rocca, Joshua Johnstun, Joshua Saxe, Junteng Jia, Kalyan Vasuden Alwala, Karthik Prasad, Kartikeya Upasani, Kate Plawiak, Ke Li, Kenneth Heafield, Kevin Stone, Khalid El-Arini, Krithika Iyer, Kshitiz Malik, Kuenley Chiu, Kunal Bhalla, Kushal Lakhotia, Lauren Rantala-Yeary, Laurens van der Maaten, Lawrence Chen, Liang Tan, Liz Jenkins, Louis Martin, Lovish Madaan, Lubo Malo, Lukas Blecher, Lukas Landzaat, Luke de Oliveira, Madeline Muzzi, Mahesh Pasupuleti, Mannat Singh, Manohar Paluri, Marcin Kardas, Maria Tsimpoukelli, Mathew Oldham, Mathieu Rita, Maya Pavlova, Melanie Kambadur, Mike Lewis, Min Si, Mitesh Kumar Singh, Mona Hassan, Naman Goyal, Narjes Torabi, Nikolay Bashlykov, Nikolay Bogoychev, Niladri Chatterji, Ning Zhang, Olivier Duchenne, Onur Çelebi, Patrick Alrassy, Pengchuan Zhang, Pengwei Li, Petar Vasic, Peter Weng, Prajjwal Bhargava, Pratik Dubal, Praveen Krishnan, Punit Singh Koura, Puxin Xu, Qing He, Qingxiao Dong, Ragavan Srinivasan, Raj Ganapathy, Ramon Calderer, Ricardo Silveira Cabral, Robert Stojnic, Roberta Raileanu, Rohan Maheswari, Rohit Girdhar, Rohit Patel, Romain Sauvestre, Ronnie Polidoro, Roshan Sumbaly, Ross Taylor, Ruan Silva, Rui Hou, Rui Wang, Saghar Hosseini, Sahana Chennabasappa, Sanjay Singh, Sean Bell, Seohyun Sonia Kim, Sergey Edunov, Shaoliang Nie, Sharan Narang, Sharath Raparthy, Sheng Shen, Shengye Wan, Shruti Bhosale, Shun Zhang, Simon Vandenhende, Soumya Batra, Spencer Whitman, Sten Sootla, Stephane Collot, Suchin Gururangan, Sydney Borodinsky, Tamar Herman, Tara Fowler, Tarek Sheasha, Thomas Georgiou, Thomas Scialom, Tobias Speckbacher, Todor Mihaylov, Tong Xiao, Ujjwal Karn, Vedanuj Goswami, Vibhor Gupta, Vignesh Ramanathan, Viktor Kerkez, Vincent Gonguet, Virginie Do, Vish Vogeti, Vítor Albiero, Vladan Petrovic, Weiwei Chu, Wenhan Xiong, Wenyin Fu, Whitney Meers, Xavier Martinet, Xiaodong Wang, Xiaofang Wang, Xiaoqing Ellen Tan, Xide Xia, Xinfeng Xie, Xuchao Jia, Xuewei Wang, Yaelle Goldschlag, Yashesh Gaur, Yasmine Babaei, Yi Wen, Yiwen Song, Yuchen Zhang, Yue Li, Yuning Mao, Zacharie 
Delpierre Coudert, Zheng Yan, Zhengxing Chen, Zoe Papakipos, Aaditya Singh, Aayushi Srivastava, Abha Jain, Adam Kelsey, Adam Shajnfeld, Adithya Gangidi, Adolfo Victoria, Ahuva Goldstand, Ajay Menon, Ajay Sharma, Alex Boesenberg, Alexei Baevski, Allie Feinstein, Amanda Kallet, Amit Sangani, Amos Teo, Anam Yunus, Andrei Lupu, Andres Alvarado, Andrew Caples, Andrew Gu, Andrew Ho, Andrew Poulton, Andrew Ryan, Ankit Ramchandani, Annie Dong, Annie Franco, Anuj Goyal, Aparajita Saraf, Arkabandhu Chowdhury, Ashley Gabriel, Ashwin Bharambe, Assaf Eisenman, Azadeh Yazdan, Beau James, Ben Maurer, Benjamin Leonhardi, Bernie Huang, Beth Loyd, Beto De Paola, Bhargavi Paranjape, Bing Liu, Bo Wu, Boyu Ni, Braden Hancock, Bram Wasti, Brandon Spence, Brani Stojkovic, Brian Gamido, Britt Montalvo, Carl Parker, Carly Burton, Catalina Mejia, Ce Liu, Changhan Wang, Changkyu Kim, Chao Zhou, Chester Hu, Ching-Hsiang Chu, Chris Cai, Chris Tindal, Christoph Feichtenhofer, Cynthia Gao, Damon Civin, Dana Beaty, Daniel Kreymer, Daniel Li, David Adkins, David Xu, Davide Testuggine, Delia David, Devi Parikh, Diana Liskovich, Didem Foss, Dingkang Wang, Duc Le, Dustin Holland, Edward Dowling, Eissa Jamil, Elaine Montgomery, Eleonora Presani, Emily Hahn, Emily Wood, Eric-Tuan Le, Erik Brinkman, Esteban Arcaute, Evan Dunbar, Evan Smothers, Fei Sun, Felix Kreuk, Feng Tian, Filippos Kokkinos, Firat Ozgenel, Francesco Caggioni, Frank Kanayet, Frank Seide, Gabriela Medina Florez, Gabriella Schwarz, Gada Badeer, Georgia Swee, Gil Halpern, Grant Herman, Grigory Sizov, Guangyi, Zhang, Guna Lakshminarayanan, Hakan Inan, Hamid Shojanazeri, Han Zou, Hannah Wang, Hanwen Zha, Haroun Habeeb, Harrison Rudolph, Helen Suk, Henry Aspegren, Hunter Goldman, Hongyuan Zhan, Ibrahim Damlaj, Igor Molybog, Igor Tufanov, Ilias Leontiadis, Irina-Elena Veliche, Itai Gat, Jake Weissman, James Geboski, James Kohli, Janice Lam, Japhet Asher, Jean-Baptiste Gaya, Jeff Marcus, Jeff Tang, Jennifer Chan, Jenny Zhen, Jeremy Reizenstein, Jeremy Teboul, Jessica Zhong, Jian Jin, Jingyi Yang, Joe Cummings, Jon Carvill, Jon Shepard, Jonathan McPhie, Jonathan Torres, Josh Ginsburg, Junjie Wang, Kai Wu, Kam Hou U, Karan Saxena, Kartikay Khandelwal, Katayoun Zand, Kathy Matosich, Kaushik Veeraraghavan, Kelly Michelena, Keqian Li, Kiran Jagadeesh, Kun Huang, Kunal Chawla, Kyle Huang, Lailin Chen, Lakshya Garg, Lavender A, Leandro Silva, Lee Bell, Lei Zhang, Liangpeng Guo, Licheng Yu, Liron Moshkovich, Luca Wehrstedt, Madian Khabsa, Manav Avalani, Manish Bhatt, Martynas Mankus, Matan Hasson, Matthew Lennie, Matthias Reso, Maxim Groshev, Maxim Naumov, Maya Lathi, Meghan Keneally, Miao Liu, Michael L. 
Seltzer, Michal Valko, Michelle Restrepo, Mihir Patel, Mik Vyatskov, Mikayel Samvelyan, Mike Clark, Mike Macey, Mike Wang, Miquel Jubert Hermoso, Mo Metanat, Mohammad Rastegari, Munish Bansal, Nandhini Santhanam, Natascha Parks, Natasha White, Navyata Bawa, Nayan Singhal, Nick Egebo, Nicolas Usunier, Nikhil Mehta, Nikolay Pavlovich Laptev, Ning Dong, Norman Cheng, Oleg Chernoguz, Olivia Hart, Omkar Salpekar, Ozlem Kalinli, Parkin Kent, Parth Parekh, Paul Saab, Pavan Balaji, Pedro Rittner, Philip Bontrager, Pierre Roux, Piotr Dollar, Polina Zvyagina, Prashant Ratanchandani, Pritish Yuvraj, Qian Liang, Rachad Alao, Rachel Rodriguez, Rafi Ayub, Raghotham Murthy, Raghu Nayani, Rahul Mitra, Rangaprabhu Parthasarathy, Raymond Li, Rebekkah Hogan, Robin Battey, Rocky Wang, Russ Howes, Ruty Rinott, Sachin Mehta, Sachin Siby, Sai Jayesh Bondu, Samyak Datta, Sara Chugh, Sara Hunt, Sargun Dhillon, Sasha Sidorov, Satadru Pan, Saurabh Mahajan, Saurabh Verma, Seiji Yamamoto, Sharadh Ramaswamy, Shaun Lindsay, Sheng Feng, Shenghao Lin, Shengxin Cindy Zha, Shishir Patil, Shiva Shankar, Shuqiang Zhang, Sinong Wang, Sneha Agarwal, Soji Sajuyigbe, Soumith Chintala, Stephanie Max, Stephen Chen, Steve Kehoe, Steve Satterfield, Sudarshan Govindaprasad, Sumit Gupta, Summer Deng, Sungmin Cho, Sunny Virk, Suraj Subramanian, Sy Choudhury, Sydney Goldman, Tal Remez, Tamar Glaser, Tamara Best, Thilo Koehler, Thomas Robinson, Tianhe Li, Tianjun Zhang, Tim Matthews, Timothy Chou, Tzook Shaked, Varun Vontimitta, Victoria Ajayi, Victoria Montanez, Vijai Mohan, Vinay Satish Kumar, Vishal Mangla, Vlad Ionescu, Vlad Poenaru, Vlad Tiberiu Mihailescu, Vladimir Ivanov, Wei Li, Wenchen Wang, WenWen Jiang, Wes Bouaziz, Will Constable, Xiaocheng Tang, Xiaojian Wu, Xiaolan Wang, Xilun Wu, Xinbo Gao, Yaniv Kleinman, Yanjun Chen, Ye Hu, Ye Jia, Ye Qi, Yenda Li, Yilin Zhang, Ying Zhang, Yossi Adi, Youngjin Nam, Yu, Wang, Yu Zhao, Yuchen Hao, Yundi Qian, Yunlu Li, Yuzi He, Zach Rait, Zachary DeVito, Zef Rosnbrick, Zhaoduo Wen, Zhenyu Yang, Zhiwei Zhao, Zhiyu Ma

This paper presents a new set of foundation models, called Llama 3.

answerability prediction Language Modeling +5

TrustUQA: A Trustful Framework for Unified Structured Data Question Answering

1 code implementation 27 Jun 2024 Wen Zhang, Long Jin, Yushan Zhu, Jiaoyan Chen, Zhiwei Huang, Junjie Wang, Yin Hua, Lei Liang, Huajun Chen

Furthermore, we have demonstrated the potential of our method for more general QA tasks, QA over mixed structured data and QA across structured data.

Answer Generation Knowledge Graphs +2

Repairing Catastrophic-Neglect in Text-to-Image Diffusion Models via Attention-Guided Feature Enhancement

1 code implementation 24 Jun 2024 Zhiyuan Chang, Mingyang Li, Junjie Wang, Yi Liu, Qing Wang, Yang Liu

The most prominent issue among these semantic inconsistencies is catastrophic-neglect, where the images generated by T2I DMs miss key objects mentioned in the prompt.

Image Generation

PIN: A Knowledge-Intensive Dataset for Paired and Interleaved Multimodal Documents

no code implementations 20 Jun 2024 Junjie Wang, Yin Zhang, Yatai Ji, Yuxiang Zhang, Chunyang Jiang, YuBo Wang, Kang Zhu, Zekun Wang, Tiezhen Wang, Wenhao Huang, Jie Fu, Bei Chen, Qunshu Lin, Minghao Liu, Ge Zhang, Wenhu Chen

Recent advancements in Large Multimodal Models (LMMs) have leveraged extensive multimodal datasets to enhance capabilities in complex knowledge-driven tasks.

HoLLMwood: Unleashing the Creativity of Large Language Models in Screenwriting via Role Playing

no code implementations 17 Jun 2024 Jing Chen, Xinyu Zhu, Cheng Yang, Chufan Shi, Yadong Xi, Yuxiang Zhang, Junjie Wang, Jiashu Pu, Rongsheng Zhang, Yujiu Yang, Tian Feng

Generative AI has demonstrated unprecedented creativity in the field of computer vision, yet such phenomena have not been observed in natural language processing.

ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation

1 code implementation 14 Jun 2024 Cheng Yang, Chufan Shi, Yaxin Liu, Bo Shui, Junjie Wang, Mohan Jing, Linran Xu, Xinyu Zhu, Siheng Li, Yuxiang Zhang, Gongye Liu, Xiaomei Nie, Deng Cai, Yujiu Yang

We introduce a new benchmark, ChartMimic, aimed at assessing the visually-grounded code generation capabilities of large multimodal models (LMMs).

Code Generation

Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning

1 code implementation 11 Jun 2024 Chenyu Yang, Xizhou Zhu, Jinguo Zhu, Weijie Su, Junjie Wang, Xuan Dong, Wenhai Wang, Lewei Lu, Bin Li, Jie zhou, Yu Qiao, Jifeng Dai

Recently, vision model pre-training has evolved from relying on manually annotated datasets to leveraging large-scale, web-crawled image-text data.

Contrastive Learning

MLAE: Masked LoRA Experts for Visual Parameter-Efficient Fine-Tuning

1 code implementation 29 May 2024 Junjie Wang, Guangjing Yang, Wentao Chen, Huahui Yi, Xiaohu Wu, Zhouchen Lin, Qicheng Lao

To address these issues, a natural idea is to enhance the independence and diversity of the learning process for the low-rank matrices.

parameter-efficient fine-tuning

OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision

1 code implementation 28 May 2024 Junjie Wang, Bin Chen, Bin Kang, Yulin Li, YiChi Chen, Weizhi Xian, Huifeng Chang, Yong Xu

However, existing open-vocabulary detectors trained on base category data tend to assign higher confidence to trained categories and confuse novel categories with the background.

Contrastive Learning Denoising +3

VEglue: Testing Visual Entailment Systems via Object-Aligned Joint Erasing

1 code implementation 5 Mar 2024 Zhiyuan Chang, Mingyang Li, Junjie Wang, Cheng Li, Qing Wang

Visual entailment (VE) is a multimodal reasoning task consisting of image-sentence pairs whereby a premise is defined by an image, and a hypothesis is described by a sentence.

Multimodal Reasoning Sentence +1

Adversarial Testing for Visual Grounding via Image-Aware Property Reduction

no code implementations 2 Mar 2024 Zhiyuan Chang, Mingyang Li, Junjie Wang, Cheng Li, Boyu Wu, Fanjiang Xu, Qing Wang

To this end, we propose PEELING, a text perturbation approach via image-aware property reduction for adversarial testing of the VG model.

Visual Grounding

Decictor: Towards Evaluating the Robustness of Decision-Making in Autonomous Driving Systems

no code implementations 28 Feb 2024 Mingfei Cheng, Yuan Zhou, Xiaofei Xie, Junjie Wang, Guozhu Meng, Kairui Yang

Subsequently, the Consistency Check is applied to determine the presence of non-optimal PPDs by comparing the driving paths in the original and mutated scenarios.

Autonomous Driving Decision Making

StructLM: Towards Building Generalist Models for Structured Knowledge Grounding

no code implementations 26 Feb 2024 Alex Zhuang, Ge Zhang, Tianyu Zheng, Xinrun Du, Junjie Wang, Weiming Ren, Stephen W. Huang, Jie Fu, Xiang Yue, Wenhu Chen

Utilizing this dataset, we train a series of models, referred to as StructLM, based on the Mistral and the CodeLlama model family, ranging from 7B to 34B parameters.

Play Guessing Game with LLM: Indirect Jailbreak Attack with Implicit Clues

no code implementations 14 Feb 2024 Zhiyuan Chang, Mingyang Li, Yi Liu, Junjie Wang, Qing Wang, Yang Liu

With the development of LLMs, the security threats to LLMs are attracting increasing attention.

CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark

1 code implementation 22 Jan 2024 Ge Zhang, Xinrun Du, Bei Chen, Yiming Liang, Tongxu Luo, Tianyu Zheng, Kang Zhu, Yuyang Cheng, Chunpu Xu, Shuyue Guo, Haoran Zhang, Xingwei Qu, Junjie Wang, Ruibin Yuan, Yizhi Li, Zekun Wang, Yudong Liu, Yu-Hsuan Tsai, Fengji Zhang, Chenghua Lin, Wenhao Huang, Jie Fu

We introduce CMMMU, a new Chinese Massive Multi-discipline Multimodal Understanding benchmark designed to evaluate LMMs on tasks demanding college-level subject knowledge and deliberate reasoning in a Chinese context.

Know Your Needs Better: Towards Structured Understanding of Marketer Demands with Analogical Reasoning Augmented LLMs

3 code implementations 9 Jan 2024 Junjie Wang, Dan Yang, Binbin Hu, Yue Shen, Wen Zhang, Jinjie Gu

To stimulate the LLMs' reasoning ability, the chain-of-thought (CoT) prompting method is widely used, but existing methods still have some limitations in our scenario: (1) Previous methods either use simple "Let's think step by step" spells or provide fixed examples in demonstrations without considering compatibility between prompts and concrete questions, making LLMs ineffective when the marketers' demands are abstract and diverse.

Language Modelling Large Language Model

AdapterDistillation: Non-Destructive Task Composition with Knowledge Distillation

no code implementations 26 Dec 2023 Junjie Wang, Yicheng Chen, Wangshu Zhang, Sen Hu, Teng Xu, Jing Zheng

In the second stage, we distill the knowledge from the existing teacher adapters into the student adapter to help its inference.

Knowledge Distillation Retrieval

A Survey on Query-based API Recommendation

no code implementations 17 Dec 2023 Moshi Wei, Nima Shiri Harzevili, Alvine Boaye Belle, Junjie Wang, Lin Shi, Jinqiu Yang, Song Wang, Zhen Ming (Jack) Jiang

We also investigate the typical data extraction procedures and collection approaches employed by the existing approaches.

Survey

From Beginner to Expert: Modeling Medical Knowledge into General LLMs

no code implementations 2 Dec 2023 Qiang Li, Xiaoyan Yang, Haowen Wang, Qin Wang, Lei Liu, Junjie Wang, Yang Zhang, Mingyuan Chu, Sen Hu, Yicheng Chen, Yue Shen, Cong Fan, Wangshu Zhang, Teng Xu, Jinjie Gu, Jing Zheng, Guannan Zhang (Ant Group)

(3) Specifically for multi-choice questions in the medical domain, we propose a novel Verification-of-Choice approach for prompting engineering, which significantly enhances the reasoning ability of LLMs.

Language Modelling Large Language Model +3

GaussianEditor: Editing 3D Gaussians Delicately with Text Instructions

no code implementations CVPR 2024 Junjie Wang, Jiemin Fang, Xiaopeng Zhang, Lingxi Xie, Qi Tian

Specifically, we first extract the region of interest (RoI) corresponding to the text instruction, aligning it to 3D Gaussians.

3D scene Editing

Reliable Academic Conference Question Answering: A Study Based on Large Language Model

1 code implementation 19 Oct 2023 Zhiwei Huang, Juan Li, Long Jin, Junjie Wang, Mingchen Tu, Yin Hua, Zhiqiang Liu, Jiawei Meng, Wen Zhang

Specifically, for each conference, we first organize academic conference data in a tree-structured format through a semi-automated method.

Hallucination Language Modeling +4

EALM: Introducing Multidimensional Ethical Alignment in Conversational Information Retrieval

1 code implementation 2 Oct 2023 Yiyao Yu, Junjie Wang, Yuxiang Zhang, Lin Zhang, Yujiu Yang, Tetsuya Sakai

Artificial intelligence (AI) technologies should adhere to human norms to better serve our society and avoid disseminating harmful or misleading information, particularly in Conversational Information Retrieval (CIR).

Ethics Information Retrieval +1

UniEX: An Effective and Efficient Framework for Unified Information Extraction via a Span-extractive Perspective

no code implementations 17 May 2023 Ping Yang, Junyu Lu, Ruyi Gan, Junjie Wang, Yuxiang Zhang, Jiaxing Zhang, Pingjian Zhang

We propose a new paradigm for universal information extraction (IE) that is compatible with any schema format and applicable to a list of IE tasks, such as named entity recognition, relation extraction, event extraction and sentiment analysis.

Event Extraction named-entity-recognition +3

NER-to-MRC: Named-Entity Recognition Completely Solving as Machine Reading Comprehension

no code implementations 6 May 2023 Yuxiang Zhang, Junjie Wang, Xinyu Zhu, Tetsuya Sakai, Hayato Yamana

Named-entity recognition (NER) detects texts with predefined semantic labels and is an essential building block for natural language processing (NLP).

Machine Reading Comprehension named-entity-recognition +2

Prototypical context-aware dynamics generalization for high-dimensional model-based reinforcement learning

no code implementations 23 Nov 2022 Junjie Wang, Yao Mu, Dong Li, Qichao Zhang, Dongbin Zhao, Yuzheng Zhuang, Ping Luo, Bin Wang, Jianye Hao

The latent world model provides a promising way to learn policies in a compact latent space for tasks with high-dimensional observations; however, its generalization across diverse environments with unseen dynamics remains challenging.

Model-based Reinforcement Learning reinforcement-learning +1

Solving Math Word Problems via Cooperative Reasoning induced Language Models

1 code implementation 28 Oct 2022 Xinyu Zhu, Junjie Wang, Lin Zhang, Yuxiang Zhang, Ruyi Gan, Jiaxing Zhang, Yujiu Yang

This inspires us to develop a cooperative reasoning-induced PLM for solving MWPs, called Cooperative Reasoning (CoRe), resulting in a human-like reasoning architecture with system 1 as the generator and system 2 as the verifier.

Arithmetic Reasoning Math

Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective

1 code implementation 16 Oct 2022 Ping Yang, Junjie Wang, Ruyi Gan, Xinyu Zhu, Lin Zhang, Ziwei Wu, Xinyu Gao, Jiaxing Zhang, Tetsuya Sakai

We propose a new paradigm for zero-shot learners that is format agnostic, i.e., it is compatible with any format and applicable to a list of language tasks, such as text classification, commonsense reasoning, coreference resolution, and sentiment analysis.

Multiple-choice Natural Language Inference +4

TC-SKNet with GridMask for Low-complexity Classification of Acoustic scene

no code implementations 5 Oct 2022 Luyuan Xie, Yan Zhong, Lin Yang, Zhaoyu Yan, Zhonghai Wu, Junjie Wang

In our experiments, the performance gain brought by GridMask is stronger than spectrum augmentation in ASCs.

AutoML Data Augmentation

Im2Oil: Stroke-Based Oil Painting Rendering with Linearly Controllable Fineness Via Adaptive Sampling

1 code implementation 27 Sep 2022 Zhengyan Tong, Xiaohang Wang, Shengchao Yuan, Xuanhong Chen, Junjie Wang, Xiangzhong Fang

Comparison with existing state-of-the-art oil painting techniques shows that our results have higher fidelity and more realistic textures.

Towards No.1 in CLUE Semantic Matching Challenge: Pre-trained Language Model Erlangshen with Propensity-Corrected Loss

1 code implementation 5 Aug 2022 Junjie Wang, Yuxiang Zhang, Ping Yang, Ruyi Gan

This report describes a pre-trained language model Erlangshen with propensity-corrected loss, the No. 1 in CLUE Semantic Matching Challenge.

Language Modeling Language Modelling +1

Change Detection from Synthetic Aperture Radar Images via Graph-Based Knowledge Supplement Network

1 code implementation 22 Jan 2022 Junjie Wang, Feng Gao, Junyu Dong, Shan Zhang, Qian Du

Synthetic aperture radar (SAR) image change detection is a vital yet challenging task in the field of remote sensing image analysis.

Change Detection Feature Correlation

Adaptive DropBlock Enhanced Generative Adversarial Networks for Hyperspectral Image Classification

1 code implementation 22 Jan 2022 Junjie Wang, Feng Gao, Junyu Dong, Qian Du

Second, an adaptive DropBlock (AdapDrop) is proposed as a regularization method employed in the generator and discriminator to alleviate the mode collapse issue.

Classification Hyperspectral Image Classification +1

Low-Latency Online Speaker Diarization with Graph-Based Label Generation

no code implementations 27 Nov 2021 Yucong Zhang, Qinjian Lin, Weiqing Wang, Lin Yang, Xuyang Wang, Junjie Wang, Ming Li

To ensure the low latency in the online setting, we introduce a variant of AHC, namely chkpt-AHC, to cluster the speakers.

Clustering speaker-diarization +1

Benchmarking Lane-changing Decision-making for Deep Reinforcement Learning

no code implementations 22 Sep 2021 Junjie Wang, Qichao Zhang, Dongbin Zhao

We train several state-of-the-art deep reinforcement learning methods in the designed training scenarios and provide the benchmark metrics evaluation results of the trained models in the test scenarios.

Autonomous Driving Benchmarking +5

Change Detection from SAR Images Based on Deformable Residual Convolutional Neural Networks

no code implementations 6 Apr 2021 Junjie Wang, Feng Gao, Junyu Dong

Convolutional neural networks (CNNs) have made great progress in synthetic aperture radar (SAR) image change detection.

Change Detection

TransfoRNN: Capturing the Sequential Information in Self-Attention Representations for Language Modeling

no code implementations 4 Apr 2021 Tze Yuang Chong, Xuyang Wang, Lin Yang, Junjie Wang

Also, the TransfoRNN model was applied on the LibriSpeech speech recognition task and has shown comparable results with the Transformer models.

Language Modeling Language Modelling +2

Skeleton2Mesh: Kinematics Prior Injected Unsupervised Human Mesh Recovery

no code implementations ICCV 2021 Zhenbo Yu, Junjie Wang, Jingwei Xu, Bingbing Ni, Chenglong Zhao, Minsi Wang, Wenjun Zhang

The challenges of the latter task are twofold: (1) pose failure (i.e., pose mismatching -- different skeleton definitions in the dataset and SMPL, and pose ambiguity -- endpoints have arbitrary joint angle configurations for the same 3D joint coordinates).

3D Pose Estimation Human Mesh Recovery

Training Wake Word Detection with Synthesized Speech Data on Confusion Words

no code implementations 3 Nov 2020 Yan Jia, Zexin Cai, Murong Ma, Zeqing Zhao, Xuyang Wang, Junjie Wang, Ming Li

Confusing words are commonly encountered in real-life keyword spotting applications, causing severe performance degradation due to complex spoken terms and various kinds of words that sound similar to the predefined keywords.

Data Augmentation Keyword Spotting +3
