Search Results for author: Ke Li

Found 229 papers, 95 papers with code

C^3KG: A Chinese Commonsense Conversation Knowledge Graph

1 code implementation Findings (ACL) 2022 Dawei Li, Yanran Li, Jiayi Zhang, Ke Li, Chen Wei, Jianwei Cui, Bin Wang

Existing commonsense knowledge bases often organize tuples in an isolated manner, which is deficient for commonsense conversational models to plan the next steps.

Scale Contrastive Learning with Selective Attentions for Blind Image Quality Assessment

no code implementations13 Nov 2024 Zihao Huang, Xudong Li, Bohan Fu, Xiaohui Chu, Ke Li, Yunhang Shen, Yan Zhang

This paper addresses two primary challenges: the significant redundancy of information across different scales, and the confusion caused by combining features from these scales, which may vary widely in quality.

Blind Image Quality Assessment Contrastive Learning

Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

no code implementations1 Nov 2024 Xiong Wang, Yangze Li, Chaoyou Fu, Yunhang Shen, Lei Xie, Ke Li, Xing Sun, Long Ma

Our main contribution is that the speech input and output modalities can be easily connected to a textual LLM while keeping the LLM's parameters frozen throughout the training process.

Show Me What and Where has Changed? Question Answering and Grounding for Remote Sensing Change Detection

1 code implementation31 Oct 2024 Ke Li, Fuyu Dong, Di Wang, Shaofeng Li, Quan Wang, Xinbo Gao, Tat-Seng Chua

Furthermore, we present VisTA, a simple yet effective baseline method that unifies the tasks of question answering and grounding by delivering both visual and textual answers.

Change Detection Question Answering +1

Frozen Large Language Models Can Perceive Paralinguistic Aspects of Speech

no code implementations2 Oct 2024 Wonjune Kang, Junteng Jia, Chunyang Wu, Wei Zhou, Egor Lakomkin, Yashesh Gaur, Leda Sari, Suyoun Kim, Ke Li, Jay Mahadeokar, Ozlem Kalinli

As speech becomes an increasingly common modality for interacting with large language models (LLMs), it is becoming desirable to develop systems where LLMs can take into account users' emotions or speaking styles when providing their responses.

OmniGenBench: Automating Large-scale in-silico Benchmarking for Genomic Foundation Models

1 code implementation2 Oct 2024 Heng Yang, Jack Cole, Ke Li

Recent breakthroughs in GFMs, such as Evo, have attracted significant investment and attention to genomic modeling, as they address long-standing challenges and transform in-silico genomic studies into automated, reliable, and efficient paradigms.

Benchmarking

Implicit Euler Discrete-Time Set-Valued Admittance Control for Impact-Contact Force Control

no code implementations28 Sep 2024 Ke Li, Xiaogang Xiong, Anjia Wang, Ying Qu, Yunjiang Lou

This paper introduces a novel admittance controller that ensures stable force control after impacting unknown stiffness environments by leveraging the differentiability of impact-contact forces.

Rejection Sampling IMLE: Designing Priors for Better Few-Shot Image Synthesis

no code implementations26 Sep 2024 Chirag Vashist, Shichong Peng, Ke Li

An emerging area of research aims to learn deep generative models with limited training data.

Image Generation

Test-time Training for Hyperspectral Image Super-resolution

no code implementations13 Sep 2024 Ke Li, Luc van Gool, Dengxin Dai

In order to better support our test-time training method, we also propose a new network architecture to learn HSI SR without modeling spectral band interaction and propose a new data augmentation method Spectral Mixup to increase the diversity of the training data at test time.

Data Augmentation Hyperspectral Image Super-Resolution +1

Leveraging Open Knowledge for Advancing Task Expertise in Large Language Models

1 code implementation28 Aug 2024 Yuncheng Yang, Yulei Qin, Tong Wu, Zihan Xu, Gang Li, Pengcheng Guo, Hang Shao, Yuchen Shi, Ke Li, Xing Sun, Jie Yang, Yun Gu

For the latter, we highlight the diversity of constituting experts and that of the fine-tuning instructions throughout the model and data selection process.

Diversity

Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models

1 code implementation4 Aug 2024 Yulei Qin, Yuncheng Yang, Pengcheng Guo, Gang Li, Hang Shao, Yuchen Shi, Zihan Xu, Yun Gu, Ke Li, Xing Sun

To bridge this gap, we present a comprehensive review on existing literature of data assessment and selection especially for instruction tuning of LLMs.

The Llama 3 Herd of Models

1 code implementation31 Jul 2024 Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Alex Vaughan, Amy Yang, Angela Fan, Anirudh Goyal, Anthony Hartshorn, Aobo Yang, Archi Mitra, Archie Sravankumar, Artem Korenev, Arthur Hinsvark, Arun Rao, Aston Zhang, Aurelien Rodriguez, Austen Gregerson, Ava Spataru, Baptiste Roziere, Bethany Biron, Binh Tang, Bobbie Chern, Charlotte Caucheteux, Chaya Nayak, Chloe Bi, Chris Marra, Chris McConnell, Christian Keller, Christophe Touret, Chunyang Wu, Corinne Wong, Cristian Canton Ferrer, Cyrus Nikolaidis, Damien Allonsius, Daniel Song, Danielle Pintz, Danny Livshits, Danny Wyatt, David Esiobu, Dhruv Choudhary, Dhruv Mahajan, Diego Garcia-Olano, Diego Perino, Dieuwke Hupkes, Egor Lakomkin, Ehab AlBadawy, Elina Lobanova, Emily Dinan, Eric Michael Smith, Filip Radenovic, Francisco Guzmán, Frank Zhang, Gabriel Synnaeve, Gabrielle Lee, Georgia Lewis Anderson, Govind Thattai, Graeme Nail, Gregoire Mialon, Guan Pang, Guillem Cucurell, Hailey Nguyen, Hannah Korevaar, Hu Xu, Hugo Touvron, Iliyan Zarov, Imanol Arrieta Ibarra, Isabel Kloumann, Ishan Misra, Ivan Evtimov, Jack Zhang, Jade Copet, Jaewon Lee, Jan Geffert, Jana Vranes, Jason Park, Jay Mahadeokar, Jeet Shah, Jelmer Van der Linde, Jennifer Billock, Jenny Hong, Jenya Lee, Jeremy Fu, Jianfeng Chi, Jianyu Huang, Jiawen Liu, Jie Wang, Jiecao Yu, Joanna Bitton, Joe Spisak, Jongsoo Park, Joseph Rocca, Joshua Johnstun, Joshua Saxe, Junteng Jia, Kalyan Vasuden Alwala, Karthik Prasad, Kartikeya Upasani, Kate Plawiak, Ke Li, Kenneth Heafield, Kevin Stone, Khalid El-Arini, Krithika Iyer, Kshitiz Malik, Kuenley Chiu, Kunal Bhalla, Kushal Lakhotia, Lauren Rantala-Yeary, Laurens van der Maaten, Lawrence Chen, Liang Tan, Liz Jenkins, Louis Martin, Lovish Madaan, Lubo Malo, Lukas Blecher, Lukas Landzaat, Luke de Oliveira, Madeline Muzzi, Mahesh Pasupuleti, Mannat Singh, Manohar Paluri, Marcin Kardas, Maria Tsimpoukelli, Mathew Oldham, Mathieu Rita, Maya Pavlova, Melanie Kambadur, Mike Lewis, Min Si, Mitesh Kumar Singh, Mona Hassan, Naman Goyal, Narjes Torabi, Nikolay Bashlykov, Nikolay Bogoychev, Niladri Chatterji, Ning Zhang, Olivier Duchenne, Onur Çelebi, Patrick Alrassy, Pengchuan Zhang, Pengwei Li, Petar Vasic, Peter Weng, Prajjwal Bhargava, Pratik Dubal, Praveen Krishnan, Punit Singh Koura, Puxin Xu, Qing He, Qingxiao Dong, Ragavan Srinivasan, Raj Ganapathy, Ramon Calderer, Ricardo Silveira Cabral, Robert Stojnic, Roberta Raileanu, Rohan Maheswari, Rohit Girdhar, Rohit Patel, Romain Sauvestre, Ronnie Polidoro, Roshan Sumbaly, Ross Taylor, Ruan Silva, Rui Hou, Rui Wang, Saghar Hosseini, Sahana Chennabasappa, Sanjay Singh, Sean Bell, Seohyun Sonia Kim, Sergey Edunov, Shaoliang Nie, Sharan Narang, Sharath Raparthy, Sheng Shen, Shengye Wan, Shruti Bhosale, Shun Zhang, Simon Vandenhende, Soumya Batra, Spencer Whitman, Sten Sootla, Stephane Collot, Suchin Gururangan, Sydney Borodinsky, Tamar Herman, Tara Fowler, Tarek Sheasha, Thomas Georgiou, Thomas Scialom, Tobias Speckbacher, Todor Mihaylov, Tong Xiao, Ujjwal Karn, Vedanuj Goswami, Vibhor Gupta, Vignesh Ramanathan, Viktor Kerkez, Vincent Gonguet, Virginie Do, Vish Vogeti, Vítor Albiero, Vladan Petrovic, Weiwei Chu, Wenhan Xiong, Wenyin Fu, Whitney Meers, Xavier Martinet, Xiaodong Wang, Xiaofang Wang, Xiaoqing Ellen Tan, Xide Xia, Xinfeng Xie, Xuchao Jia, Xuewei Wang, Yaelle Goldschlag, Yashesh Gaur, Yasmine Babaei, Yi Wen, Yiwen Song, Yuchen Zhang, Yue Li, Yuning Mao, Zacharie Delpierre Coudert, Zheng Yan, Zhengxing Chen, Zoe Papakipos, Aaditya Singh, Aayushi Srivastava, Abha Jain, Adam Kelsey, Adam Shajnfeld, Adithya Gangidi, Adolfo Victoria, Ahuva Goldstand, Ajay Menon, Ajay Sharma, Alex Boesenberg, Alexei Baevski, Allie Feinstein, Amanda Kallet, Amit Sangani, Amos Teo, Anam Yunus, Andrei Lupu, Andres Alvarado, Andrew Caples, Andrew Gu, Andrew Ho, Andrew Poulton, Andrew Ryan, Ankit Ramchandani, Annie Dong, Annie Franco, Anuj Goyal, Aparajita Saraf, Arkabandhu Chowdhury, Ashley Gabriel, Ashwin Bharambe, Assaf Eisenman, Azadeh Yazdan, Beau James, Ben Maurer, Benjamin Leonhardi, Bernie Huang, Beth Loyd, Beto De Paola, Bhargavi Paranjape, Bing Liu, Bo Wu, Boyu Ni, Braden Hancock, Bram Wasti, Brandon Spence, Brani Stojkovic, Brian Gamido, Britt Montalvo, Carl Parker, Carly Burton, Catalina Mejia, Ce Liu, Changhan Wang, Changkyu Kim, Chao Zhou, Chester Hu, Ching-Hsiang Chu, Chris Cai, Chris Tindal, Christoph Feichtenhofer, Cynthia Gao, Damon Civin, Dana Beaty, Daniel Kreymer, Daniel Li, David Adkins, David Xu, Davide Testuggine, Delia David, Devi Parikh, Diana Liskovich, Didem Foss, Dingkang Wang, Duc Le, Dustin Holland, Edward Dowling, Eissa Jamil, Elaine Montgomery, Eleonora Presani, Emily Hahn, Emily Wood, Eric-Tuan Le, Erik Brinkman, Esteban Arcaute, Evan Dunbar, Evan Smothers, Fei Sun, Felix Kreuk, Feng Tian, Filippos Kokkinos, Firat Ozgenel, Francesco Caggioni, Frank Kanayet, Frank Seide, Gabriela Medina Florez, Gabriella Schwarz, Gada Badeer, Georgia Swee, Gil Halpern, Grant Herman, Grigory Sizov, Guangyi, Zhang, Guna Lakshminarayanan, Hakan Inan, Hamid Shojanazeri, Han Zou, Hannah Wang, Hanwen Zha, Haroun Habeeb, Harrison Rudolph, Helen Suk, Henry Aspegren, Hunter Goldman, Hongyuan Zhan, Ibrahim Damlaj, Igor Molybog, Igor Tufanov, Ilias Leontiadis, Irina-Elena Veliche, Itai Gat, Jake Weissman, James Geboski, James Kohli, Janice Lam, Japhet Asher, Jean-Baptiste Gaya, Jeff Marcus, Jeff Tang, Jennifer Chan, Jenny Zhen, Jeremy Reizenstein, Jeremy Teboul, Jessica Zhong, Jian Jin, Jingyi Yang, Joe Cummings, Jon Carvill, Jon Shepard, Jonathan McPhie, Jonathan Torres, Josh Ginsburg, Junjie Wang, Kai Wu, Kam Hou U, Karan Saxena, Kartikay Khandelwal, Katayoun Zand, Kathy Matosich, Kaushik Veeraraghavan, Kelly Michelena, Keqian Li, Kiran Jagadeesh, Kun Huang, Kunal Chawla, Kyle Huang, Lailin Chen, Lakshya Garg, Lavender A, Leandro Silva, Lee Bell, Lei Zhang, Liangpeng Guo, Licheng Yu, Liron Moshkovich, Luca Wehrstedt, Madian Khabsa, Manav Avalani, Manish Bhatt, Martynas Mankus, Matan Hasson, Matthew Lennie, Matthias Reso, Maxim Groshev, Maxim Naumov, Maya Lathi, Meghan Keneally, Miao Liu, Michael L. Seltzer, Michal Valko, Michelle Restrepo, Mihir Patel, Mik Vyatskov, Mikayel Samvelyan, Mike Clark, Mike Macey, Mike Wang, Miquel Jubert Hermoso, Mo Metanat, Mohammad Rastegari, Munish Bansal, Nandhini Santhanam, Natascha Parks, Natasha White, Navyata Bawa, Nayan Singhal, Nick Egebo, Nicolas Usunier, Nikhil Mehta, Nikolay Pavlovich Laptev, Ning Dong, Norman Cheng, Oleg Chernoguz, Olivia Hart, Omkar Salpekar, Ozlem Kalinli, Parkin Kent, Parth Parekh, Paul Saab, Pavan Balaji, Pedro Rittner, Philip Bontrager, Pierre Roux, Piotr Dollar, Polina Zvyagina, Prashant Ratanchandani, Pritish Yuvraj, Qian Liang, Rachad Alao, Rachel Rodriguez, Rafi Ayub, Raghotham Murthy, Raghu Nayani, Rahul Mitra, Rangaprabhu Parthasarathy, Raymond Li, Rebekkah Hogan, Robin Battey, Rocky Wang, Russ Howes, Ruty Rinott, Sachin Mehta, Sachin Siby, Sai Jayesh Bondu, Samyak Datta, Sara Chugh, Sara Hunt, Sargun Dhillon, Sasha Sidorov, Satadru Pan, Saurabh Mahajan, Saurabh Verma, Seiji Yamamoto, Sharadh Ramaswamy, Shaun Lindsay, Sheng Feng, Shenghao Lin, Shengxin Cindy Zha, Shishir Patil, Shiva Shankar, Shuqiang Zhang, Sinong Wang, Sneha Agarwal, Soji Sajuyigbe, Soumith Chintala, Stephanie Max, Stephen Chen, Steve Kehoe, Steve Satterfield, Sudarshan Govindaprasad, Sumit Gupta, Summer Deng, Sungmin Cho, Sunny Virk, Suraj Subramanian, Sy Choudhury, Sydney Goldman, Tal Remez, Tamar Glaser, Tamara Best, Thilo Koehler, Thomas Robinson, Tianhe Li, Tianjun Zhang, Tim Matthews, Timothy Chou, Tzook Shaked, Varun Vontimitta, Victoria Ajayi, Victoria Montanez, Vijai Mohan, Vinay Satish Kumar, Vishal Mangla, Vlad Ionescu, Vlad Poenaru, Vlad Tiberiu Mihailescu, Vladimir Ivanov, Wei Li, Wenchen Wang, WenWen Jiang, Wes Bouaziz, Will Constable, Xiaocheng Tang, Xiaojian Wu, Xiaolan Wang, Xilun Wu, Xinbo Gao, Yaniv Kleinman, Yanjun Chen, Ye Hu, Ye Jia, Ye Qi, Yenda Li, Yilin Zhang, Ying Zhang, Yossi Adi, Youngjin Nam, Yu, Wang, Yu Zhao, Yuchen Hao, Yundi Qian, Yunlu Li, Yuzi He, Zach Rait, Zachary DeVito, Zef Rosnbrick, Zhaoduo Wen, Zhenyu Yang, Zhiwei Zhao, Zhiyu Ma

This paper presents a new set of foundation models, called Llama 3.

Language Modelling Multi-task Language Understanding +2

Towards scalable efficient on-device ASR with transfer learning

no code implementations23 Jul 2024 Laxmi Pandey, Ke Li, Jinxi Guo, Debjyoti Paul, Arthur Guo, Jay Mahadeokar, Xuedong Zhang

Multilingual pretraining for transfer learning significantly boosts the robustness of low-resource monolingual ASR models.

Transfer Learning

OmniGenome: Aligning RNA Sequences with Secondary Structures in Genomic Foundation Models

1 code implementation15 Jul 2024 Heng Yang, Ke Li

The alignment between RNA sequences and structures in foundation models (FMs) has yet to be thoroughly investigated.

Image-to-Text Logic Jailbreak: Your Imagination can Help You Do Anything

no code implementations1 Jul 2024 Xiaotian Zou, Ke Li, Yongkang Chen

However, no researchers evaluate whether logic understanding capabilities of VLMs in flowchart can influence jailbreak.

Image to text Language Modelling

Intrinsic PAPR for Point-level 3D Scene Albedo and Shading Editing

no code implementations29 Jun 2024 Alireza Moazeni, Shichong Peng, Ke Li

Recent advancements in neural rendering have excelled at novel view synthesis from multi-view RGB images.

Inverse Rendering Neural Rendering +1

Local Manifold Learning for No-Reference Image Quality Assessment

no code implementations27 Jun 2024 Timin Gao, Wensheng Pan, Yan Zhang, Sicheng Zhao, Shengchuan Zhang, Xiawu Zheng, Ke Li, Liujuan Cao, Rongrong Ji

This crop is then used to cluster other crops from the same image as the positive class, while crops from different images are treated as negative classes to increase inter-class distance.

Contrastive Learning NR-IQA

Adaptive Collaborative Correlation Learning-based Semi-Supervised Multi-Label Feature Selection

no code implementations18 Jun 2024 Yanyong Huang, Li Yang, Dongjie Wang, Ke Li, Xiuwen Yi, Fengmao Lv, Tianrui Li

Then, the instance correlation and label correlation are integrated into the proposed regression model to adaptively learn both the sample similarity graph and the label similarity graph, which mutually enhance feature selection performance.

feature selection Missing Labels +1

CoMT: Chain-of-Medical-Thought Reduces Hallucination in Medical Report Generation

no code implementations17 Jun 2024 Yue Jiang, Jiawei Chen, Dingkang Yang, Mingcheng Li, Shunli Wang, Tong Wu, Ke Li, Lihua Zhang

Automatic medical report generation (MRG), which possesses significant research value as it can aid radiologists in clinical diagnosis and report composition, has garnered increasing attention.

Hallucination Medical Report Generation

GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement

1 code implementation17 Jun 2024 Yifan Yang, Zheshu Song, Jianheng Zhuo, Mingyu Cui, Jinpeng Li, Bo Yang, Yexing Du, Ziyang Ma, Xunying Liu, Ziyuan Wang, Ke Li, Shuai Fan, Kai Yu, Wei-Qiang Zhang, Guoguo Chen, Xie Chen

Notably, ASR models trained on GigaSpeech 2 can reduce the word error rate for Thai, Indonesian, and Vietnamese on our challenging and realistic YouTube test set by 25% to 40% compared to the Whisper large-v3 model, with merely 10% model parameters.

speech-recognition Speech Recognition

Detecting and Evaluating Medical Hallucinations in Large Vision Language Models

no code implementations14 Jun 2024 Jiawei Chen, Dingkang Yang, Tong Wu, Yue Jiang, Xiaolu Hou, Mingcheng Li, Shunli Wang, Dongling Xiao, Ke Li, Lihua Zhang

To bridge this gap, we introduce Med-HallMark, the first benchmark specifically designed for hallucination detection and evaluation within the medical multimodal domain.

Hallucination Medical Visual Question Answering +2

PAPR in Motion: Seamless Point-level 3D Scene Interpolation

no code implementations CVPR 2024 Shichong Peng, Yanshu Zhang, Ke Li

We propose the problem of point-level 3D scene interpolation, which aims to simultaneously reconstruct a 3D scene in two states from multiple views, synthesize smooth point-level interpolations between them, and render the scene from novel viewpoints, all without any supervision between the states.

Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

no code implementations31 May 2024 Chaoyou Fu, Yuhan Dai, Yongdong Luo, Lei LI, Shuhuai Ren, Renrui Zhang, Zihan Wang, Chenyu Zhou, Yunhang Shen, Mengdan Zhang, Peixian Chen, Yanwei Li, Shaohui Lin, Sirui Zhao, Ke Li, Tong Xu, Xiawu Zheng, Enhong Chen, Rongrong Ji, Xing Sun

With Video-MME, we extensively evaluate various state-of-the-art MLLMs, including GPT-4 series and Gemini 1. 5 Pro, as well as open-source image models like InternVL-Chat-V1. 5 and video models like LLaVA-NeXT-Video.

PediatricsGPT: Large Language Models as Chinese Medical Assistants for Pediatric Applications

1 code implementation29 May 2024 Dingkang Yang, Jinjie Wei, Dongling Xiao, Shunli Wang, Tong Wu, Gang Li, Mingcheng Li, Shuaibing Wang, Jiawei Chen, Yue Jiang, Qingyao Xu, Ke Li, Peng Zhai, Lihua Zhang

In the parameter-efficient secondary SFT phase, a mixture of universal-specific experts strategy is presented to resolve the competency conflict between medical generalist and pediatric expertise mastery.

Domain Adaptation

IceFormer: Accelerated Inference with Long-Sequence Transformers on CPUs

no code implementations5 May 2024 Yuzhen Mao, Martin Ester, Ke Li

One limitation of existing Transformer-based models is that they cannot handle very long sequences as input since their self-attention operations exhibit quadratic time and space complexity.

Bi-Mamba+: Bidirectional Mamba for Time Series Forecasting

no code implementations24 Apr 2024 Aobo Liang, Xingguo Jiang, Yan Sun, Xiaohou Shi, Ke Li

To enhance Mamba's ability to preserve historical information in a longer range, we design a novel Mamba+ block by adding a forget gate inside Mamba to selectively combine the new features with the historical features in a complementary manner.

Computational Efficiency Mamba +2

Multi-Modal Prompt Learning on Blind Image Quality Assessment

1 code implementation23 Apr 2024 Wensheng Pan, Timin Gao, Yan Zhang, Runze Hu, Xiawu Zheng, Enwei Zhang, Yuting Gao, Yutao Liu, Yunhang Shen, Ke Li, Shengchuan Zhang, Liujuan Cao, Rongrong Ji

Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly.

A Survey of Decomposition-Based Evolutionary Multi-Objective Optimization: Part II -- A Data Science Perspective

no code implementations22 Apr 2024 Mingyu Huang, Ke Li

This paper presents the second part of the two-part survey series on decomposition-based evolutionary multi-objective optimization where we mainly focus on discussing the literature related to multi-objective evolutionary algorithms based on decomposition (MOEA/D).

Anatomy Descriptive +1

A Survey of Decomposition-Based Evolutionary Multi-Objective Optimization: Part I-Past and Future

no code implementations22 Apr 2024 Ke Li

In the first part, we present a comprehensive survey of the development of MOEA/D from its origin to the current state-of-the-art approaches.

Decision Making Survey

Xiwu: A Basis Flexible and Learnable LLM for High Energy Physics

1 code implementation8 Apr 2024 Zhengde Zhang, Yiyu Zhang, Haodong Yao, Jianwen Luo, Rui Zhao, Bo Huang, Jiameng Zhao, Yipu Liao, Ke Li, Lina Zhao, Jun Cao, Fazhi Qi, Changzheng Yuan

To address this challenge, a sophisticated large language model system named as Xiwu has been developed, allowing you switch between the most advanced foundation models and quickly teach the model domain knowledge.

Code Generation Language Modelling +1

A General and Efficient Training for Transformer via Token Expansion

1 code implementation CVPR 2024 Wenxuan Huang, Yunhang Shen, Jiao Xie, Baochang Zhang, Gaoqi He, Ke Li, Xing Sun, Shaohui Lin

The remarkable performance of Vision Transformers (ViTs) typically requires an extremely large training cost.

Genetic Auto-prompt Learning for Pre-trained Code Intelligence Language Models

no code implementations20 Mar 2024 Chengzhe Feng, Yanan sun, Ke Li, Pan Zhou, Jiancheng Lv, Aojun Lu

We conduct GenAP on three popular code intelligence PLMs with three canonical code intelligence tasks including defect prediction, code summarization, and code translation.

Code Summarization Code Translation

RESTORE: Towards Feature Shift for Vision-Language Prompt Learning

1 code implementation10 Mar 2024 Yuncheng Yang, Chuyan Zhang, Zuopeng Yang, Yuting Gao, Yulei Qin, Ke Li, Xing Sun, Jie Yang, Yun Gu

Prompt learning is effective for fine-tuning foundation models to improve their generalization across a variety of downstream tasks.

Towards Multimodal Sentiment Analysis Debiasing via Bias Purification

no code implementations8 Mar 2024 Dingkang Yang, Mingcheng Li, Dongling Xiao, Yang Liu, Kun Yang, Zhaoyu Chen, Yuzheng Wang, Peng Zhai, Ke Li, Lihua Zhang

In the inference phase, given a factual multimodal input, MCIS imagines two counterfactual scenarios to purify and mitigate these biases.

counterfactual Counterfactual Inference +1

Towards Multimodal Human Intention Understanding Debiasing via Subject-Deconfounding

no code implementations8 Mar 2024 Dingkang Yang, Dongling Xiao, Ke Li, Yuzheng Wang, Zhaoyu Chen, Jinjie Wei, Lihua Zhang

Multimodal intention understanding (MIU) is an indispensable component of human expression analysis (e. g., sentiment or humor) from heterogeneous modalities, including visual postures, linguistic contents, and acoustic behaviors.

Sinkhorn Distance Minimization for Knowledge Distillation

1 code implementation27 Feb 2024 Xiao Cui, Yulei Qin, Yuting Gao, Enwei Zhang, Zihan Xu, Tong Wu, Ke Li, Xing Sun, Wengang Zhou, Houqiang Li

We propose the Sinkhorn Knowledge Distillation (SinKD) that exploits the Sinkhorn distance to ensure a nuanced and precise assessment of the disparity between teacher and student distributions.

Decoder Knowledge Distillation

Is the System Message Really Important to Jailbreaks in Large Language Models?

no code implementations20 Feb 2024 Xiaotian Zou, Yongkang Chen, Ke Li

Furthermore, we propose the System Messages Evolutionary Algorithm (SMEA) to generate system messages that are more resistant to jailbreak prompts, even with minor changes.

Evolutionary Algorithms

Multi-Fidelity Methods for Optimization: A Survey

no code implementations15 Feb 2024 Ke Li, Fan Li

Real-world black-box optimization often involves time-consuming or costly experiments and simulations.

Benchmarking Computational Efficiency +4

ProvNeRF: Modeling per Point Provenance in NeRFs as a Stochastic Field

no code implementations16 Jan 2024 Kiyohiro Nakayama, Mikaela Angelina Uy, Yang You, Ke Li, Leonidas J. Guibas

Neural radiance fields (NeRFs) have gained popularity with multiple works showing promising results across various applications.

Novel View Synthesis

Human-in-the-Loop Policy Optimization for Preference-Based Multi-Objective Reinforcement Learning

no code implementations4 Jan 2024 Ke Li, Han Guo

The learned preference information is used to progressively guide policy optimization towards policies of interest.

Decision Making Management +1

Evolutionary Alternating Direction Method of Multipliers for Constrained Multi-Objective Optimization with Unknown Constraints

no code implementations2 Jan 2024 Shuang Li, Ke Li, Wei Li, Ming Yang

Constrained multi-objective optimization problems (CMOPs) pervade real-world applications in science, engineering, and design.

Solving the Catastrophic Forgetting Problem in Generalized Category Discovery

1 code implementation CVPR 2024 Xinzi Cao, Xiawu Zheng, Guanhong Wang, Weijiang Yu, Yunhang Shen, Ke Li, Yutong Lu, Yonghong Tian

The LER optimizes the distribution of potential known class samples in unlabeled data thus ensuring the preservation of knowledge related to known categories while learning novel classes.

Unleashing Channel Potential: Space-Frequency Selection Convolution for SAR Object Detection

no code implementations CVPR 2024 Ke Li, Di Wang, Zhangyuan Hu, Wenxuan Zhu, Shaofeng Li, Quan Wang

In this paper we propose an efficient convolution module for SAR object detection called SFS-Conv which increases feature diversity within each convolutional layer through a shunt-perceive-select strategy.

feature selection Model Compression +3

How Good Are Deep Generative Models for Solving Inverse Problems?

no code implementations20 Dec 2023 Shichong Peng, Alireza Moazeni, Ke Li

We assess the validity of these models' outputs as solutions to the inverse problems and conduct a thorough analysis of the reliability of the models' estimates of uncertainty over the solution.

Super-Resolution valid

Weakly Supervised Open-Vocabulary Object Detection

no code implementations19 Dec 2023 Jianghang Lin, Yunhang Shen, Bingquan Wang, Shaohui Lin, Ke Li, Liujuan Cao

Despite weakly supervised object detection (WSOD) being a promising step toward evading strong instance-level annotations, its capability is confined to closed-set categories within a single training dataset.

Attribute Novel Concepts +6

SPD-DDPM: Denoising Diffusion Probabilistic Models in the Symmetric Positive Definite Space

1 code implementation13 Dec 2023 Yunchen Li, Zhou Yu, Gaoqi He, Yunhang Shen, Ke Li, Xing Sun, Shaohui Lin

On the other hand, the model unconditionally learns the probability distribution of the data $p(X)$ and generates samples that conform to this distribution.

Denoising Traffic Prediction

MMICT: Boosting Multi-Modal Fine-Tuning with In-Context Examples

1 code implementation11 Dec 2023 Tao Chen, Enwei Zhang, Yuting Gao, Ke Li, Xing Sun, Yan Zhang, Hui Li, Rongrong Ji

Although In-Context Learning (ICL) brings remarkable performance gains to Large Language Models (LLMs), the improvements remain lower than fine-tuning on downstream tasks.

In-Context Learning

Adaptive Feature Selection for No-Reference Image Quality Assessment by Mitigating Semantic Noise Sensitivity

no code implementations11 Dec 2023 Xudong Li, Timin Gao, Runze Hu, Yan Zhang, Shengchuan Zhang, Xiawu Zheng, Jingyuan Zheng, Yunhang Shen, Ke Li, Yutao Liu, Pingyang Dai, Rongrong Ji

Specifically, QFM-IQM enhances the semantic noise distinguish capabilities by matching image pairs with similar quality scores but varying semantic features as adversarial semantic noise and adaptively adjusting the upstream task's features by reducing sensitivity to adversarial noise perturbation.

Contrastive Learning feature selection +1

Constrained Bayesian Optimization Under Partial Observations: Balanced Improvements and Provable Convergence

1 code implementation6 Dec 2023 Shengbo Wang, Ke Li

We endeavor to design an efficient and provable method for expensive POCOPs under the framework of constrained Bayesian optimization.

Bayesian Optimization

Towards the Inferrence of Structural Similarity of Combinatorial Landscapes

no code implementations5 Dec 2023 Mingyu Huang, Ke Li

However, due to the black-box nature of combinatorial optimization, it is far from trivial to infer such similarity in real-world scenarios.

Combinatorial Optimization

Aligning and Prompting Everything All at Once for Universal Visual Perception

2 code implementations CVPR 2024 Yunhang Shen, Chaoyou Fu, Peixian Chen, Mengdan Zhang, Ke Li, Xing Sun, Yunsheng Wu, Shaohui Lin, Rongrong Ji

However, predominant paradigms, driven by casting instance-level tasks as an object-word alignment, bring heavy cross-modality interaction, which is not effective in prompting object detection and visual grounding.

Object object-detection +6

Rethinking Urban Mobility Prediction: A Super-Multivariate Time Series Forecasting Approach

2 code implementations4 Dec 2023 Jinguo Cheng, Ke Li, Yuxuan Liang, Lijun Sun, Junchi Yan, Yuankai Wu

To address this challenge, we present the Super-Multivariate Urban Mobility Transformer (SUMformer), which utilizes a specially designed attention mechanism to calculate temporal and cross-variable correlations and reduce computational costs stemming from a large number of time series.

Multivariate Time Series Forecasting Time Series +1

Less is More: Learning Reference Knowledge Using No-Reference Image Quality Assessment

no code implementations1 Dec 2023 Xudong Li, Jingyuan Zheng, Xiawu Zheng, Runze Hu, Enwei Zhang, Yuting Gao, Yunhang Shen, Ke Li, Yutao Liu, Pingyang Dai, Yan Zhang, Rongrong Ji

Concretely, by innovatively introducing a novel feature distillation method in IQA, we propose a new framework to learn comparative knowledge from non-aligned reference images.

Inductive Bias NR-IQA

On the Hyperparameter Loss Landscapes of Machine Learning Models: An Exploratory Study

no code implementations23 Nov 2023 Mingyu Huang, Ke Li

Previous efforts on hyperparameter optimization (HPO) of machine learning (ML) models predominately focus on algorithmic advances, yet little is known about the topography of the underlying hyperparameter (HP) loss landscape, which plays a fundamental role in governing the search process of HPO.

Hyperparameter Optimization Transfer Learning

AudioChatLlama: Towards General-Purpose Speech Abilities for LLMs

no code implementations12 Nov 2023 Yassir Fathullah, Chunyang Wu, Egor Lakomkin, Ke Li, Junteng Jia, Yuan Shangguan, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer

In this work, we extend the instruction-tuned Llama-2 model with end-to-end general-purpose speech processing and reasoning abilities while maintaining the wide range of original LLM capabilities, without using any carefully curated paired data.

Question Answering

NeRF Revisited: Fixing Quadrature Instability in Volume Rendering

no code implementations NeurIPS 2023 Mikaela Angelina Uy, Kiyohiro Nakayama, Guandao Yang, Rahul Krishna Thomas, Leonidas Guibas, Ke Li

Volume rendering requires evaluating an integral along each ray, which is numerically approximated with a finite sum that corresponds to the exact integral along the ray under piecewise constant volume density.

Woodpecker: Hallucination Correction for Multimodal Large Language Models

1 code implementation24 Oct 2023 Shukang Yin, Chaoyou Fu, Sirui Zhao, Tong Xu, Hao Wang, Dianbo Sui, Yunhang Shen, Ke Li, Xing Sun, Enhong Chen

Hallucination is a big shadow hanging over the rapidly evolving Multimodal Large Language Models (MLLMs), referring to the phenomenon that the generated text is inconsistent with the image content.

Hallucination

Solving Expensive Optimization Problems in Dynamic Environments with Meta-learning

1 code implementation19 Oct 2023 huan zhang, Jinliang Ding, Liang Feng, Kay Chen Tan, Ke Li

Although data-driven evolutionary optimization and Bayesian optimization (BO) approaches have shown promise in solving expensive optimization problems in static environments, the attempts to develop such approaches in dynamic environments remain rarely unexplored.

Bayesian Optimization Meta-Learning

Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model

no code implementations22 Sep 2023 Jiamin Xie, Ke Li, Jinxi Guo, Andros Tjandra, Yuan Shangguan, Leda Sari, Chunyang Wu, Junteng Jia, Jay Mahadeokar, Ozlem Kalinli

In this work, we propose the use of an adaptive masking approach in two scenarios for pruning a multilingual ASR model efficiently, each resulting in sparse monolingual models or a sparse multilingual model (named as Dynamic ASR Pathways).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Masked Autoencoders are Efficient Class Incremental Learners

1 code implementation ICCV 2023 Jiang-Tian Zhai, Xialei Liu, Andrew D. Bagdanov, Ke Li, Ming-Ming Cheng

Moreover, MAEs can reliably reconstruct original input images from randomly selected patches, which we use to store exemplars from past tasks more efficiently for CIL.

class-incremental learning Class Incremental Learning +1

MonoNeRD: NeRF-like Representations for Monocular 3D Object Detection

1 code implementation ICCV 2023 Junkai Xu, Liang Peng, Haoran Cheng, Hao Li, Wei Qian, Ke Li, Wenxiao Wang, Deng Cai

To the best of our knowledge, this work is the first to introduce volume rendering for M3D, and demonstrates the potential of implicit reconstruction for image-based 3D perception.

3D geometry Monocular 3D Object Detection +2

LiDAR Meta Depth Completion

1 code implementation24 Jul 2023 Wolfgang Boettcher, Lukas Hoyer, Ozan Unal, Ke Li, Dengxin Dai

While using a single model, our method yields significantly better results than a non-adaptive baseline trained on different LiDAR patterns.

Depth Completion Monocular Depth Estimation

Prompting Large Language Models with Speech Recognition Abilities

no code implementations21 Jul 2023 Yassir Fathullah, Chunyang Wu, Egor Lakomkin, Junteng Jia, Yuan Shangguan, Ke Li, Jinxi Guo, Wenhan Xiong, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer

Furthermore, we perform ablation studies to investigate whether the LLM can be completely frozen during training to maintain its original capabilities, scaling up the audio encoder, and increasing the audio encoder striding to generate fewer embeddings.

Abstractive Text Summarization Automatic Speech Recognition +3

PAPR: Proximity Attention Point Rendering

1 code implementation NeurIPS 2023 Yanshu Zhang, Shichong Peng, Alireza Moazeni, Ke Li

PAPR effectively learns point cloud positions to represent the correct scene geometry, even when the initialization drastically differs from the target geometry.

Multi-View 3D Reconstruction Neural Rendering +1

Model-Assisted Probabilistic Safe Adaptive Control With Meta-Bayesian Learning

no code implementations3 Jul 2023 Shengbo Wang, Ke Li, Yin Yang, Yuting Cao, TingWen Huang, Shiping Wen

Specifically, with the help of CBF method, we learn the inherent and external uncertainties by a unified adaptive Bayesian linear regression (ABLR) model, which consists of a forward neural network (NN) and a Bayesian output layer.

Meta-Learning Safe Exploration

MEMD-ABSA: A Multi-Element Multi-Domain Dataset for Aspect-Based Sentiment Analysis

1 code implementation29 Jun 2023 Hongjie Cai, Nan Song, Zengzhi Wang, Qiming Xie, Qiankun Zhao, Ke Li, Siwei Wu, Shijie Liu, Jianfei Yu, Rui Xia

Aspect-based sentiment analysis is a long-standing research interest in the field of opinion mining, and in recent years, researchers have gradually shifted their focus from simple ABSA subtasks to end-to-end multi-element ABSA tasks.

Aspect-Based Sentiment Analysis Opinion Mining +1

A Survey on Multimodal Large Language Models

1 code implementation23 Jun 2023 Shukang Yin, Chaoyou Fu, Sirui Zhao, Ke Li, Xing Sun, Tong Xu, Enhong Chen

Recently, Multimodal Large Language Model (MLLM) represented by GPT-4V has been a new rising research hotspot, which uses powerful Large Language Models (LLMs) as a brain to perform multimodal tasks.

Hallucination In-Context Learning +8

MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models

3 code implementations23 Jun 2023 Chaoyou Fu, Peixian Chen, Yunhang Shen, Yulei Qin, Mengdan Zhang, Xu Lin, Jinrui Yang, Xiawu Zheng, Ke Li, Xing Sun, Yunsheng Wu, Rongrong Ji

Multimodal Large Language Model (MLLM) relies on the powerful LLM to perform multimodal tasks, showing amazing emergent abilities in recent studies, such as writing poems based on an image.

Benchmarking Language Modelling +4

Learning from Visual Observation via Offline Pretrained State-to-Go Transformer

no code implementations NeurIPS 2023 Bohan Zhou, Ke Li, Jiechuan Jiang, Zongqing Lu

Learning from visual observation (LfVO), aiming at recovering policies from only visual observation data, is promising yet a challenging problem.

Minecraft reinforcement-learning +1

Multi-modal Queried Object Detection in the Wild

1 code implementation NeurIPS 2023 Yifan Xu, Mengdan Zhang, Chaoyou Fu, Peixian Chen, Xiaoshan Yang, Ke Li, Changsheng Xu

To address the learning inertia problem brought by the frozen detector, a vision conditioned masked language prediction strategy is proposed.

Few-Shot Object Detection Object +2

READ: Recurrent Adaptation of Large Transformers

no code implementations24 May 2023 John Nguyen, Sid Wang, Ke Li, Carole-Jean Wu

However, fine-tuning all pre-trained model parameters becomes impractical as the model size and number of tasks increase.

Transfer Learning

Compressing neural network by tensor network with exponentially fewer variational parameters

no code implementations10 May 2023 Yong Qing, Ke Li, Peng-Fei Zhou, Shi-Ju Ran

In this work, we propose a general compression scheme that significantly reduces the variational parameters of NN by encoding them to deep automatically-differentiable tensor network (ADTN) that contains exponentially-fewer free parameters.

Tensor Networks

The Best Defense is Attack: Repairing Semantics in Textual Adversarial Examples

no code implementations6 May 2023 Heng Yang, Ke Li

Recent studies have revealed the vulnerability of pre-trained language models to adversarial attacks.

Adversarial Attack Adversarial Defense

DiffFacto: Controllable Part-Based 3D Point Cloud Generation with Cross Diffusion

no code implementations ICCV 2023 Kiyohiro Nakayama, Mikaela Angelina Uy, Jiahui Huang, Shi-Min Hu, Ke Li, Leonidas J Guibas

We propose a factorization that models independent part style and part configuration distributions and presents a novel cross-diffusion network that enables us to generate coherent and plausible shapes under our proposed factorization.

Point Cloud Generation

SketchXAI: A First Look at Explainability for Human Sketches

no code implementations CVPR 2023 Zhiyu Qu, Yulia Gryaditskaya, Ke Li, Kaiyue Pang, Tao Xiang, Yi-Zhe Song

Following this, we design a simple explainability-friendly sketch encoder that accommodates the intrinsic properties of strokes: shape, location, and order.

Explainable artificial intelligence Explainable Artificial Intelligence (XAI) +1

Diffusion models with location-scale noise

no code implementations12 Apr 2023 Alexia Jolicoeur-Martineau, Kilian Fatras, Ke Li, Tal Kachman

Diffusion Models (DMs) are powerful generative models that add Gaussian noise to the data and learn to remove it.

HybridFusion: LiDAR and Vision Cross-Source Point Cloud Fusion

no code implementations10 Apr 2023 Yu Wang, Shuhui Bu, Lin Chen, Yifei Dong, Kun Li, Xuefeng Cao, Ke Li

First, the point cloud is divided into small patches, and a matching patch set is selected based on global descriptors and spatial distribution, which constitutes the coarse matching process.

Point Cloud Registration

RFAConv: Innovating Spatial Attention and Standard Convolutional Operation

1 code implementation6 Apr 2023 Xin Zhang, Chen Liu, Degang Yang, Tingting Song, Yichen Ye, Ke Li, Yingze Song

In this paper, we propose a new perspective on the effectiveness of spatial attention, which is that the spatial attention mechanism essentially solves the problem of convolutional kernel parameter sharing.

Classification Object Detection +1

SoftCLIP: Softer Cross-modal Alignment Makes CLIP Stronger

no code implementations30 Mar 2023 Yuting Gao, Jinfeng Liu, Zihan Xu, Tong Wu Enwei Zhang, Wei Liu, Jie Yang, Ke Li, Xing Sun

During the preceding biennium, vision-language pre-training has achieved noteworthy success on several downstream tasks.

cross-modal alignment Zero-Shot Learning

SCADE: NeRFs from Space Carving with Ambiguity-Aware Depth Estimates

no code implementations CVPR 2023 Mikaela Angelina Uy, Ricardo Martin-Brualla, Leonidas Guibas, Ke Li

To address this issue, we introduce SCADE, a novel technique that improves NeRF reconstruction quality on sparse, unconstrained input views for in-the-wild indoor scenes.

3D Reconstruction Monocular Depth Estimation +1

Practical Cross-System Shilling Attacks with Limited Access to Data

1 code implementation14 Feb 2023 Meifang Zeng, Ke Li, Bingchuan Jiang, Liujuan Cao, Hui Li

With the idea of Cross-system Attack, we design a Practical Cross-system Shilling Attack (PC-Attack) framework that requires little information about the victim RS model and the target RS data for conducting attacks.

Recommendation Systems

Quality Indicators for Preference-based Evolutionary Multi-objective Optimization Using a Reference Point: A Review and Analysis

1 code implementation28 Jan 2023 Ryoji Tanabe, Ke Li

Some quality indicators have been proposed for benchmarking preference-based evolutionary multi-objective optimization algorithms using a reference point.

Benchmarking Decision Making

Photo Pre-Training, but for Sketch

1 code implementation CVPR 2023 Ke Li, Kaiyue Pang, Yi-Zhe Song

This lack of sketch data has imposed on the community a few "peculiar" design choices -- the most representative of them all is perhaps the coerced utilisation of photo-based pre-training (i. e., no sketch), for many core tasks that otherwise dictates specific sketch understanding.

Sketch-Based Image Retrieval Triplet

Black-Box Sparse Adversarial Attack via Multi-Objective Optimisation

no code implementations CVPR 2023 Phoenix Neale Williams, Ke Li

However, existing methods often struggle to simultaneously minimize the number of modified pixels and the size of the modifications, often requiring a large number of queries and assuming unrestricted access to the targeted DNN.

Adversarial Attack

Fewer is More: Efficient Object Detection in Large Aerial Images

1 code implementation26 Dec 2022 Xingxing Xie, Gong Cheng, Qingyang Li, Shicheng Miao, Ke Li, Junwei Han

Current mainstream object detection methods for large aerial images usually divide large images into patches and then exhaustively detect the objects of interest on all patches, no matter whether there exist objects or not.

4k Object +2

Task-Adaptive Saliency Guidance for Exemplar-free Class Incremental Learning

1 code implementation CVPR 2024 Xialei Liu, Jiang-Tian Zhai, Andrew D. Bagdanov, Ke Li, Ming-Ming Cheng

EFCIL is of interest because it mitigates concerns about privacy and long-term storage of data, while at the same time alleviating the problem of catastrophic forgetting in incremental learning.

class-incremental learning Class Incremental Learning +1

Improving Fast-slow Encoder based Transducer with Streaming Deliberation

no code implementations15 Dec 2022 Ke Li, Jay Mahadeokar, Jinxi Guo, Yangyang Shi, Gil Keren, Ozlem Kalinli, Michael L. Seltzer, Duc Le

Experiments on Librispeech and in-house data show relative WER reductions (WERRs) from 3% to 5% with a slight increase in model size and negligible extra token emission latency compared with fast-slow encoder based transducer.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

CHIMLE: Conditional Hierarchical IMLE for Multimodal Conditional Image Synthesis

1 code implementation25 Nov 2022 Shichong Peng, Alireza Moazeni, Ke Li

A persistent challenge in conditional image synthesis has been to generate diverse output images from the same input image despite only one output image being observed per input image.

Image Generation Image Super-Resolution

Immersive Neural Graphics Primitives

1 code implementation24 Nov 2022 Ke Li, Tim Rolff, Susanne Schmidt, Reinhard Bacher, Simone Frintrop, Wim Leemans, Frank Steinicke

In this paper, we present and evaluate a NeRF-based framework that is capable of rendering scenes in immersive VR allowing users to freely move their heads to explore complex real-world scenes.

Benchmarking Super-Resolution

A Data-Driven Evolutionary Transfer Optimization for Expensive Problems in Dynamic Environments

no code implementations5 Nov 2022 Ke Li, Renzhi Chen, Xin Yao

Many real-world problems are usually computationally costly and the objective functions evolve over time.

Transfer Learning

Joint Audio/Text Training for Transformer Rescorer of Streaming Speech Recognition

no code implementations31 Oct 2022 Suyoun Kim, Ke Li, Lucas Kabela, Rongqing Huang, Jiedan Zhu, Ozlem Kalinli, Duc Le

In this work, we present our Joint Audio/Text training method for Transformer Rescorer, to leverage unpaired text-only data which is relatively cheaper than paired audio-text data.

speech-recognition Speech Recognition

Micro and Macro Level Graph Modeling for Graph Variational Auto-Encoders

1 code implementation30 Oct 2022 Kiarash Zahirnia, Oliver Schulte, Parmis Naddaf, Ke Li

We utilize the micro-macro objective to improve graph generation with a GraphVAE, a well-established model based on graph-level latent variables, that provides fast training and generation time for medium-sized graphs.

Graph Generation

BootAug: Boosting Text Augmentation via Hybrid Instance Filtering Framework

1 code implementation6 Oct 2022 Heng Yang, Ke Li

Our experimental results on three classification tasks and nine public datasets show that BootAug addresses the performance drop problem and outperforms state-of-the-art text augmentation methods.

Language Modelling Sentence +5

Long-Tailed Class Incremental Learning

1 code implementation1 Oct 2022 Xialei Liu, Yu-Song Hu, Xu-Sheng Cao, Andrew D. Bagdanov, Ke Li, Ming-Ming Cheng

However, conventional CIL methods consider a balanced distribution for each new task, which ignores the prevalence of long-tailed distributions in the real world.

class-incremental learning Class Incremental Learning +1

Defend Data Poisoning Attacks on Voice Authentication

no code implementations9 Sep 2022 Ke Li, Cameron Baird, Dan Lin

With the advances in deep learning, speaker verification has achieved very high accuracy and is gaining popularity as a type of biometric authentication option in many scenes of our daily life, especially the growing market of web services.

Data Poisoning Ensemble Learning +1

LAB-Net: LAB Color-Space Oriented Lightweight Network for Shadow Removal

1 code implementation27 Aug 2022 Hong Yang, Gongrui Nan, Mingbao Lin, Fei Chao, Yunhang Shen, Ke Li, Rongrong Ji

Finally, the LSA modules are further developed to fully use the prior information in non-shadow regions to cleanse the shadow regions.

Shadow Removal

PyABSA: A Modularized Framework for Reproducible Aspect-based Sentiment Analysis

2 code implementations2 Aug 2022 Heng Yang, Chen Zhang, Ke Li

The advancement of aspect-based sentiment analysis (ABSA) has urged the lack of a user-friendly framework that can largely lower the difficulty of reproducing state-of-the-art ABSA performance, especially for beginners.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +5

Learning Best Combination for Efficient N:M Sparsity

1 code implementation14 Jun 2022 Yuxin Zhang, Mingbao Lin, Zhihang Lin, Yiting Luo, Ke Li, Fei Chao, Yongjian Wu, Rongrong Ji

In this paper, we show that the N:M learning can be naturally characterized as a combinatorial problem which searches for the best combination candidate within a finite collection.

Efficient Decoder-free Object Detection with Transformers

2 code implementations14 Jun 2022 Peixian Chen, Mengdan Zhang, Yunhang Shen, Kekai Sheng, Yuting Gao, Xing Sun, Ke Li, Chunhua Shen

A natural usage of ViTs in detection is to replace the CNN-based backbone with a transformer-based backbone, which is straightforward and effective, with the price of bringing considerable computation burden for inference.

Decoder Object +1

Evolutionary Multi-Task Injection Testing on Web Application Firewalls

1 code implementation12 Jun 2022 Ke Li, Heng Yang, Willem Visser

In this paper, we propose DaNuoYi, an automatic injection testing tool that simultaneously generates test inputs for multiple types of injection attacks on a WAF.

Multi-Task Learning Translation

Do We Really Need to Use Constraint Violation in Constrained Evolutionary Multi-Objective Optimization?

no code implementations28 May 2022 Shuang Li, Ke Li, Wei Li

Constraint violation has been a building block to design evolutionary multi-objective optimization algorithms for solving constrained multi-objective optimization problems.

Data-Driven Evolutionary Multi-Objective Optimization Based on Multiple-Gradient Descent for Disconnected Pareto Fronts

no code implementations28 May 2022 Renzhi Chen, Ke Li

Data-driven evolutionary multi-objective optimization (EMO) has been recognized as an effective approach for multi-objective optimization problems with expensive objective functions.

C3KG: A Chinese Commonsense Conversation Knowledge Graph

1 code implementation6 Apr 2022 Dawei Li, Yanran Li, Jiayi Zhang, Ke Li, Chen Wei, Jianwei Cui, Bin Wang

Existing commonsense knowledge bases often organize tuples in an isolated manner, which is deficient for commonsense conversational models to plan the next steps.

Interactive Evolutionary Multi-Objective Optimization via Learning-to-Rank

no code implementations6 Apr 2022 Ke Li, Guiyu Lai, Xin Yao

Bearing this in mind, this paper develops a framework for designing preference-based EMO algorithms to find SOI in an interactive manner.

Decision Making Learning-To-Rank

Training-free Transformer Architecture Search

1 code implementation CVPR 2022 Qinqin Zhou, Kekai Sheng, Xiawu Zheng, Ke Li, Xing Sun, Yonghong Tian, Jie Chen, Rongrong Ji

Recently, Vision Transformer (ViT) has achieved remarkable success in several computer vision tasks.

Diversity

ARM: Any-Time Super-Resolution Method

1 code implementation21 Mar 2022 Bohong Chen, Mingbao Lin, Kekai Sheng, Mengdan Zhang, Peixian Chen, Ke Li, Liujuan Cao, Rongrong Ji

To that effect, we construct an Edge-to-PSNR lookup table that maps the edge score of an image patch to the PSNR performance for each subnet, together with a set of computation costs for the subnets.

Image Super-Resolution

Dynamic Dual Trainable Bounds for Ultra-low Precision Super-Resolution Networks

1 code implementation8 Mar 2022 Yunshan Zhong, Mingbao Lin, Xunchao Li, Ke Li, Yunhang Shen, Fei Chao, Yongjian Wu, Rongrong Ji

However, these methods suffer from severe performance degradation when quantizing the SR models to ultra-low precision (e. g., 2-bit and 3-bit) with the low-cost layer-wise quantizer.

Quantization Super-Resolution

CF-ViT: A General Coarse-to-Fine Method for Vision Transformer

1 code implementation8 Mar 2022 Mengzhao Chen, Mingbao Lin, Ke Li, Yunhang Shen, Yongjian Wu, Fei Chao, Rongrong Ji

Our proposed CF-ViT is motivated by two important observations in modern ViT models: (1) The coarse-grained patch splitting can locate informative regions of an input image.

Art-Attack: Black-Box Adversarial Attack via Evolutionary Art

no code implementations7 Mar 2022 Phoenix Williams, Ke Li

To evaluate the effectiveness of our proposed method, we attack three state-of-the-art image classification models trained on the CIFAR-10 dataset in a targeted manner.

Adversarial Attack Image Classification

Automated Few-Shot Time Series Forecasting based on Bi-level Programming

no code implementations7 Mar 2022 Jiangjiao Xu, Ke Li

One of the critical challenges of time series renewable energy forecasting is the lack of historical data to train an adequate predictive model.

Decision Making Few-Shot Learning +3

Variational Model Inversion Attacks

1 code implementation NeurIPS 2021 Kuan-Chieh Wang, Yan Fu, Ke Li, Ashish Khisti, Richard Zemel, Alireza Makhzani

In this work, we provide a probabilistic interpretation of model inversion attacks, and formulate a variational objective that accounts for both diversity and accuracy.

Diversity

Unlocking the Secrets of Software Configuration Landscapes-Ruggedness, Accessibility, Escapability, and Transferability

no code implementations5 Jan 2022 Mingyu Huang, Peili Mao, Ke Li

Modern software systems are often highly configurable to tailor varied requirements from diverse stakeholders.

RMNet: Equivalently Removing Residual Connection from Networks

1 code implementation1 Nov 2021 Fanxu Meng, Hao Cheng, Jiaxin Zhuang, Ke Li, Xing Sun

In this paper, we aim to remedy this problem and propose to remove the residual connection in a vanilla ResNet equivalently by a reserving and merging (RM) operation on ResBlock.

Network Pruning

Anchor-free Oriented Proposal Generator for Object Detection

1 code implementation5 Oct 2021 Gong Cheng, Jiabao Wang, Ke Li, Xingxing Xie, Chunbo Lang, Yanqing Yao, Junwei Han

Nowadays, oriented detectors mostly use horizontal boxes as intermedium to derive oriented boxes from them.

Object object-detection +2

A Two-Stage Framework to Generate Video Chapter

no code implementations29 Sep 2021 Canyu Le, Zhiyuan Tang, Ke Li, Jiandong Yang

On top of this dataset, we propose a two-stage framework to perform chapter localization and chapter title generation.

Vocal Bursts Valence Prediction