Search Results for author: Yang Li

Found 482 papers, 145 papers with code

Emotion Inference in Multi-Turn Conversations with Addressee-Aware Module and Ensemble Strategy

no code implementations EMNLP 2021 Dayu Li, Xiaodan Zhu, Yang Li, Suge Wang, Deyu Li, Jian Liao, Jianxing Zheng

Emotion inference in multi-turn conversations aims to predict the participant’s emotion in the upcoming turn before the participant’s response is known, and is a necessary step for applications such as dialogue planning.

ACFlow: Flow Models for Arbitrary Conditional Likelihoods

1 code implementation ICML 2020 Yang Li, Shoaib Akbar, Junier Oliva

However, a majority of generative modeling approaches are focused solely on the joint distribution $p(x)$ and utilize models where it is intractable to obtain the conditional distribution of some arbitrary subset of features $x_u$ given the rest of the observed covariates $x_o$: $p(x_u \mid x_o)$.

Imputation
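
For reference, the conditional named in the ACFlow excerpt follows from the joint distribution by the standard definition of conditional probability. The normalizing integral over the unobserved features is what is generally intractable for models that only expose $p(x)$. The primed integration variable $x_u'$ is notation introduced here, not taken from the paper.

$$p(x_u \mid x_o) = \frac{p(x_u, x_o)}{p(x_o)} = \frac{p(x_u, x_o)}{\int p(x_u', x_o)\, dx_u'}$$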

TG-LMM: Enhancing Medical Image Segmentation Accuracy through Text-Guided Large Multi-Modal Model

no code implementations 5 Sep 2024 Yihao Zhao, Enhao Zhong, Cuiyun Yuan, Yang Li, Man Zhao, Chunxia Li, Jun Hu, Chenbin Liu

We propose TG-LMM (Text-Guided Large Multi-Modal Model), a novel approach that leverages textual descriptions of organs to enhance segmentation accuracy in medical images.

Image Segmentation Medical Image Segmentation +2

DS MYOLO: A Reliable Object Detector Based on SSMs for Driving Scenarios

no code implementations 2 Sep 2024 Yang Li, Jianli Xiao

Accurate real-time object detection enhances the safety of advanced driver-assistance systems, making it an essential component in driving scenarios.

Object object-detection +1

Extraction of Typical Operating Scenarios of New Power System Based on Deep Time Series Aggregation

no code implementations 23 Aug 2024 Zhaoyang Qu, Zhenming Zhang, Nan Qu, Yuguang Zhou, Yang Li, Tao Jiang, Min Li, Chao Long

This study proposed a novel deep time series aggregation scheme (DTSAs) to generate typical operational scenarios, considering the large amount of historical operational snapshot data.

Scheduling Time Series

Optimal Dispatch Strategy for a Multi-microgrid Cooperative Alliance Using a Two-Stage Pricing Mechanism

no code implementations 23 Aug 2024 Yonghui Nie, Zhi Li, Jie Zhang, Lei Gao, Yang Li, Hengyu Zhou

To coordinate resources among multi-level stakeholders and enhance the integration of electric vehicles (EVs) into multi-microgrids, this study proposes an optimal dispatch strategy within a multi-microgrid cooperative alliance using a nuanced two-stage pricing mechanism.

Scheduling

P3P: Pseudo-3D Pre-training for Scaling 3D Masked Autoencoders

no code implementations 19 Aug 2024 Xuechao Chen, Ying Chen, Jialin Li, Qiang Nie, Yong Liu, QiXing Huang, Yang Li

Inspired by semi-supervised learning leveraging limited labeled data and a large amount of unlabeled data, in this work, we propose a novel self-supervised pre-training framework utilizing the real 3D data and the pseudo-3D data lifted from images by a large depth estimation model.

3D Classification Depth Estimation +1

OU-CoViT: Copula-Enhanced Bi-Channel Multi-Task Vision Transformers with Dual Adaptation for OU-UWF Images

no code implementations 18 Aug 2024 Yang Li, Jianing Deng, Chong Zhong, Danjuan Yang, Meiyan Li, A. H. Welsh, Aiyi Liu, Xingtao Zhou, Catherine C. Liu, Bo Fu

Furthermore, the novel architecture of OU-CoViT allows generalizability and extensions of our dual adaptation and Copula Loss to various ViT variants and large DL models on small medical datasets.

Coordinated Spectral Efficiency Prediction for Real-World 5G CoMP Systems

no code implementations 14 Aug 2024 Zhixing Chen, Zhaoyu Fan, Yang Li, Yibin Kang, Qi Yan, Qingjiang Shi

However, characterizing the CSE is intractable due to the inherent complexity of the CoMP channel model and the diversity of the 5G dynamic network environment, which poses a great challenge for CSE prediction in real-world 5G CoMP systems.

Diversity Management

Sketch2Scene: Automatic Generation of Interactive 3D Game Scenes from User's Casual Sketches

no code implementations 8 Aug 2024 Yongzhi Xu, Yonhon Ng, Yifu Wang, Inkyu Sa, Yunfei Duan, Yang Li, Pan Ji, Hongdong Li

From the generated isometric image, we use a pre-trained image understanding method to segment the image into meaningful parts, such as off-ground objects, trees, and buildings, and extract the 2D scene layout.

Denoising Unity

A Multi-Source Heterogeneous Knowledge Injected Prompt Learning Method for Legal Charge Prediction

no code implementations 5 Aug 2024 Jingyun Sun, Chi Wei, Yang Li

Legal charge prediction, an essential task in legal AI, seeks to assign accurate charge labels to case descriptions, attracting significant recent interest.

Contrastive Learning

CACE-Net: Co-guidance Attention and Contrastive Enhancement for Effective Audio-Visual Event Localization

1 code implementation 4 Aug 2024 Xiang He, Xiangxi Liu, Yang Li, Dongcheng Zhao, Guobin Shen, Qingqun Kong, Xin Yang, Yi Zeng

Specifically, we have enhanced the model's ability to discern subtle differences between event and background and improved the accuracy of event classification in our model.

audio-visual event localization

FIND: Fine-tuning Initial Noise Distribution with Policy Optimization for Diffusion Models

1 code implementation 28 Jul 2024 Changgu Chen, Libing Yang, Xiaoyan Yang, Lianggangxu Chen, Gaoqi He, Changbo Wang, Yang Li

In this paper, we introduce a Fine-tuning Initial Noise Distribution (FIND) framework with policy optimization, which unleashes the powerful potential of pre-trained diffusion networks by directly optimizing the initial distribution to align the generated contents with user-input prompts.

Denoising Video Generation

Physical Adversarial Attack on Monocular Depth Estimation via Shape-Varying Patches

no code implementations 24 Jul 2024 Chenxing Zhao, Yang Li, Shihao Wu, Wenyi Tan, Shuangju Zhou, Quan Pan

Adversarial attacks against monocular depth estimation (MDE) systems pose significant challenges, particularly in safety-critical applications such as autonomous driving.

Adversarial Attack Autonomous Driving +1

LawLuo: A Chinese Law Firm Co-run by LLM Agents

1 code implementation 23 Jul 2024 Jingyun Sun, Chengxiao Dai, Zhongze Luo, Yangbo Chang, Yang Li

In response to these challenges, we propose a novel legal dialogue framework that leverages the collaborative capabilities of multiple LLM agents, termed LawLuo.

Hallucination Reading Comprehension

Transcranial low-level laser stimulation in near infrared-II region for brain safety and protection

no code implementations 13 Jul 2024 Zhilin Li, Yongheng Zhao, Yiqing Hu, Yang Li, Keyao Zhang, Zhibing Gao, Lirou Tan, Hanli Liu, XiaoLi Li, Aihua Cao, Zaixu Cui, Chenguang Zhao

Background: The use of near-infrared lasers for transcranial photobiomodulation (tPBM) offers a non-invasive method for influencing brain activity and is beneficial for various neurological conditions.

EEG

Handling Distance Constraint in Movable Antenna Aided Systems: A General Optimization Framework

no code implementations 11 Jul 2024 Yichen Jin, Qingfeng Lin, Yang Li, Yik-Chung Wu

The movable antenna (MA) is a promising technology to exploit more spatial degrees of freedom for enhancing wireless system performance.

SEED-Story: Multimodal Long Story Generation with Large Language Model

1 code implementation 11 Jul 2024 Shuai Yang, Yuying Ge, Yang Li, Yukang Chen, Yixiao Ge, Ying Shan, Yingcong Chen

We further propose a multimodal attention sink mechanism to enable the generation of stories with up to 25 sequences (only 10 for training) in a highly efficient autoregressive manner.

Image Generation Language Modelling +3

UICrit: Enhancing Automated Design Evaluation with a UICritique Dataset

1 code implementation 11 Jul 2024 Peitong Duan, Chin-yi Chen, Gang Li, Bjoern Hartmann, Yang Li

We hypothesize that automatic evaluation can be improved by collecting a targeted UI feedback dataset and then using this dataset to enhance the performance of general-purpose LLMs.

Visual Prompting

Dy-mer: An Explainable DNA Sequence Representation Scheme using Sparse Recovery

no code implementations 6 Jul 2024 Zhiyuan Peng, Yuanbo Tang, Yang Li

DNA sequences encode vital genetic and biological information, yet these variable-length sequences cannot serve as the input of common data mining algorithms.

SAM Fewshot Finetuning for Anatomical Segmentation in Medical Images

no code implementations 5 Jul 2024 Weiyi Xie, Nathalie Willems, Shubham Patil, Yang Li, Mayank Kumar

With our method, users can manually segment a few 2D slices offline, and the embeddings of these annotated image regions serve as effective prompts for online segmentation tasks.

Decoder Segmentation

Deciphering interventional dynamical causality from non-intervention systems

no code implementations 29 Jun 2024 Jifan Shi, Yang Li, Juan Zhao, Siyang Leng, Kazuyuki Aihara, Luonan Chen, Wei Lin

Detecting and quantifying causality is a focal topic in the fields of science, engineering, and interdisciplinary studies.

Time Series

Directly Training Temporal Spiking Neural Network with Sparse Surrogate Gradient

no code implementations 28 Jun 2024 Yang Li, Feifei Zhao, Dongcheng Zhao, Yi Zeng

Brain-inspired Spiking Neural Networks (SNNs) have attracted much attention due to their event-based computing and energy-efficient features.

Enhancing Resilience of Power Systems against Typhoon Threats: A Hybrid Data-Model Driven Approach

no code implementations 13 Jun 2024 Yang Li

This chapter addresses the increasing vulnerability of coastal regions to typhoons and the consequent power outages, emphasizing the critical role of power transmission systems in disaster resilience.

Spiking Neural Networks with Consistent Mapping Relations Allow High-Accuracy Inference

no code implementations 8 Jun 2024 Yang Li, Xiang He, Qingqun Kong, Yi Zeng

Spike-based neuromorphic hardware has demonstrated substantial potential in low energy consumption and efficient inference.

object-detection Object Detection

Wings: Learning Multimodal LLMs without Text-only Forgetting

2 code implementations 5 Jun 2024 Yi-Kai Zhang, Shiyin Lu, Yang Li, Yanqing Ma, Qing-Guo Chen, Zhao Xu, Weihua Luo, Kaifu Zhang, De-Chuan Zhan, Han-Jia Ye

Initially, image and text inputs are aligned with visual learners operating alongside the main attention, balancing focus on visual elements.

Question Answering Visual Question Answering

Parrot: Multilingual Visual Instruction Tuning

2 code implementations 4 Jun 2024 Hai-Long Sun, Da-Wei Zhou, Yang Li, Shiyin Lu, Chao Yi, Qing-Guo Chen, Zhao Xu, Weihua Luo, Kaifu Zhang, De-Chuan Zhan, Han-Jia Ye

In this paper, we introduce Parrot, a novel method that utilizes textual guidance to drive visual token alignment at the language level.

Ovis: Structural Embedding Alignment for Multimodal Large Language Model

2 code implementations 31 May 2024 Shiyin Lu, Yang Li, Qing-Guo Chen, Zhao Xu, Weihua Luo, Kaifu Zhang, Han-Jia Ye

However, the misalignment between two embedding strategies in MLLMs -- the structural textual embeddings based on an embedding look-up table and the continuous embeddings generated directly by the vision encoder -- poses challenges for a more seamless fusion of visual and textual information.

Multimodal Large Language Model Visual Question Answering (VQA)

Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning

1 code implementation 30 May 2024 Tenglong Liu, Yang Li, Yixing Lan, Hao Gao, Wei Pan, Xin Xu

Empirically, we conduct a series of experiments on the D4RL benchmark, where A2PR demonstrates state-of-the-art performance.

D4RL reinforcement-learning

EventZoom: A Progressive Approach to Event-Based Data Augmentation for Enhanced Neuromorphic Vision

no code implementations 29 May 2024 Yiting Dong, Xiang He, Guobin Shen, Dongcheng Zhao, Yang Li, Yi Zeng

EventZoom employs a progressive temporal strategy that intelligently blends time and space to enhance the diversity and complexity of the data while maintaining its authenticity.

Data Augmentation Diversity

Revision Matters: Generative Design Guided by Revision Edits

no code implementations 27 May 2024 Tao Li, Chin-Yi Cheng, Amber Xie, Gang Li, Yang Li

In contrast, self-revisions that rely fully on the model's own judgement lead to an echo chamber that prevents iterative improvement and sometimes leads to generative degradation.

Layout Design

Causal-Aware Graph Neural Architecture Search under Distribution Shifts

no code implementations 26 May 2024 Peiwen Li, Xin Wang, Zeyang Zhang, Yijian Qin, Ziwei Zhang, Jialong Wang, Yang Li, Wenwu Zhu

We propose to handle the distribution shifts in the graph architecture search process by discovering and exploiting the causal relationship between graphs and architectures to search for the optimal architectures that can generalize under distribution shifts.

Graph Embedding Neural Architecture Search +1

Devil's Advocate: Anticipatory Reflection for LLM Agents

no code implementations 25 May 2024 Haoyu Wang, Tao Li, Zhiwei Deng, Dan Roth, Yang Li

The experimental results suggest that our introspection-driven approach not only enhances the agent's ability to navigate unanticipated challenges through a robust mechanism of plan execution, but also improves efficiency by reducing the number of trials and plan revisions needed to achieve a task by 45%.

Navigate

Basis Selection: Low-Rank Decomposition of Pretrained Large Language Models for Target Applications

no code implementations 24 May 2024 Yang Li, Changsheng Zhao, Hyungtak Lee, Ernie Chang, Yangyang Shi, Vikas Chandra

Large language models (LLMs) significantly enhance the performance of various applications, but they are computationally intensive and energy-demanding.

Code Generation Low-rank compression +1

VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks

1 code implementation 24 May 2024 Yang Li, Shaobo Han, Shihao Ji

To further reduce stored parameters, we introduce a "divide-and-share" paradigm that breaks the barriers of low-rank decomposition across matrix dimensions, modules and layers by sharing parameters globally via a vector bank.

Natural Language Understanding parameter-efficient fine-tuning +1

LAM3D: Large Image-Point-Cloud Alignment Model for 3D Reconstruction from Single Image

no code implementations 24 May 2024 Ruikai Cui, Xibin Song, Weixuan Sun, Senbo Wang, Weizhe Liu, Shenzhou Chen, Taizhang Shang, Yang Li, Nick Barnes, Hongdong Li, Pan Ji

Large Reconstruction Models have made significant strides in the realm of automated 3D content generation from single or multiple input images.

3D Reconstruction

CoLay: Controllable Layout Generation through Multi-conditional Latent Diffusion

no code implementations 18 May 2024 Chin-Yi Cheng, Ruiqi Gao, Forrest Huang, Yang Li

Layout design generation has recently gained significant attention due to its potential applications in various fields, including UI, graphic, and floor plan design.

Layout Design

Leveraging Human Revisions for Improving Text-to-Layout Models

no code implementations 16 May 2024 Amber Xie, Chin-Yi Cheng, Forrest Huang, Yang Li

Our method, Revision-Aware Reward Models, allows a generative text-to-layout model to produce more modern, designer-aligned layouts, showing the potential for utilizing human revisions and stronger forms of feedback in improving generative models.

ClothPPO: A Proximal Policy Optimization Enhancing Framework for Robotic Cloth Manipulation with Observation-Aligned Action Spaces

no code implementations 5 May 2024 Libing Yang, Yang Li, Long Chen

In this paper, we introduce ClothPPO, a framework that employs a policy gradient algorithm based on an actor-critic architecture to enhance a pre-trained model with huge ($10^6$) action spaces aligned with observation in the task of unfolding clothes.

Language Modelling Large Language Model

Imagine the Unseen: Occluded Pedestrian Detection via Adversarial Feature Completion

no code implementations 2 May 2024 Shanshan Zhang, Mingqian Ji, Yang Li, Jian Yang

From the perspective of reducing intra-class variance, we propose to complete features for occluded regions so as to align the features of pedestrians across different occlusion patterns.

Pedestrian Detection

Probing Unlearned Diffusion Models: A Transferable Adversarial Attack Perspective

1 code implementation 30 Apr 2024 Xiaoxuan Han, Songlin Yang, Wei Wang, Yang Li, Jing Dong

Specifically, we employ an adversarial search strategy to search for the adversarial embedding which can transfer across different unlearned models.

Adversarial Attack

NTIRE 2024 Quality Assessment of AI-Generated Content Challenge

no code implementations 25 Apr 2024 Xiaohong Liu, Xiongkuo Min, Guangtao Zhai, Chunyi Li, Tengchuan Kou, Wei Sun, HaoNing Wu, Yixuan Gao, Yuqin Cao, ZiCheng Zhang, Xiele Wu, Radu Timofte, Fei Peng, Huiyuan Fu, Anlong Ming, Chuanming Wang, Huadong Ma, Shuai He, Zifei Dou, Shu Chen, Huacong Zhang, Haiyi Xie, Chengwei Wang, Baoying Chen, Jishen Zeng, Jianquan Yang, Weigang Wang, Xi Fang, Xiaoxin Lv, Jun Yan, Tianwu Zhi, Yabin Zhang, Yaohui Li, Yang Li, Jingwen Xu, Jianzhao Liu, Yiting Liao, Junlin Li, Zihao Yu, Yiting Lu, Xin Li, Hossein Motamednia, S. Farhad Hosseini-Benvidi, Fengbin Guan, Ahmad Mahmoudi-Aznaveh, Azadeh Mansouri, Ganzorig Gankhuyag, Kihwan Yoon, Yifang Xu, Haotian Fan, Fangyuan Kong, Shiling Zhao, Weifeng Dong, Haibing Yin, Li Zhu, Zhiling Wang, Bingchen Huang, Avinab Saha, Sandeep Mishra, Shashank Gupta, Rajesh Sureddi, Oindrila Saha, Luigi Celona, Simone Bianco, Paolo Napoletano, Raimondo Schettini, Junfeng Yang, Jing Fu, Wei Zhang, Wenzhi Cao, Limei Liu, Han Peng, Weijun Yuan, Zhan Li, Yihang Cheng, Yifan Deng, Haohui Li, Bowen Qu, Yao Li, Shuqing Luo, Shunzhou Wang, Wei Gao, Zihao Lu, Marcos V. Conde, Xinrui Wang, Zhibo Chen, Ruling Liao, Yan Ye, Qiulin Wang, Bing Li, Zhaokun Zhou, Miao Geng, Rui Chen, Xin Tao, Xiaoyu Liang, Shangkun Sun, Xingyuan Ma, Jiaze Li, Mengduo Yang, Haoran Xu, Jie Zhou, Shiding Zhu, Bohan Yu, Pengfei Chen, Xinrui Xu, Jiabin Shen, Zhichao Duan, Erfan Asadi, Jiahe Liu, Qi Yan, Youran Qu, Xiaohui Zeng, Lele Wang, Renjie Liao

A total of 196 participants have registered in the video track.

Image Quality Assessment Image Restoration +2

RealTCD: Temporal Causal Discovery from Interventional Data with Large Language Model

no code implementations 23 Apr 2024 Peiwen Li, Xin Wang, Zeyang Zhang, Yuan Meng, Fang Shen, Yue Li, Jialong Wang, Yang Li, Wenwu Zhu

In the field of Artificial Intelligence for Information Technology Operations, causal discovery is pivotal for operation and maintenance of graph construction, facilitating downstream industrial tasks such as root cause analysis.

Causal Discovery graph construction +2

TextSquare: Scaling up Text-Centric Visual Instruction Tuning

no code implementations 19 Apr 2024 Jingqun Tang, Chunhui Lin, Zhen Zhao, Shu Wei, Binghong Wu, Qi Liu, Hao Feng, Yang Li, Siqi Wang, Lei Liao, Wei Shi, Yuliang Liu, Hao Liu, Yuan Xie, Xiang Bai, Can Huang

Text-centric visual question answering (VQA) has made great strides with the development of Multimodal Large Language Models (MLLMs), yet open-source models still fall short of leading models like GPT4V and Gemini, partly due to a lack of extensive, high-quality instruction tuning data.

Hallucination Hallucination Evaluation +2

Counterfactual Explanations for Face Forgery Detection via Adversarial Removal of Artifacts

1 code implementation 12 Apr 2024 Yang Li, Songlin Yang, Wei Wang, Ziwen He, Bo Peng, Jing Dong

We verify the effectiveness of the proposed explanations from two aspects: (1) Counterfactual Trace Visualization: the enhanced forgery images are useful to reveal artifacts by visually contrasting the original images and two different visualization methods; (2) Transferable Adversarial Attacks: the adversarial forgery images generated by attacking the detection model are able to mislead other detection models, implying the removed artifacts are general.

Adversarial Attack counterfactual

PagPassGPT: Pattern Guided Password Guessing via Generative Pretrained Transformer

1 code implementation 7 Apr 2024 Xingyu Su, Xiaojie Zhu, Yang Li, Yong Li, Chi Chen, Paulo Esteves-Veríssimo

Amidst the surge in deep learning-based password guessing models, challenges of generating high-quality passwords and reducing duplicate passwords persist.

TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios

1 code implementation 28 Mar 2024 Xiaokang Zhang, Jing Zhang, Zeyao Ma, Yang Li, Bohan Zhang, Guanlin Li, Zijun Yao, Kangli Xu, Jinchang Zhou, Daniel Zhang-li, Jifan Yu, Shu Zhao, Juanzi Li, Jie Tang

We introduce TableLLM, a robust large language model (LLM) with 13 billion parameters, purpose-built for proficiently handling tabular data manipulation tasks, whether they are embedded within documents or spreadsheets, catering to real-world office scenarios.

Language Modelling Large Language Model

Frankenstein: Generating Semantic-Compositional 3D Scenes in One Tri-Plane

no code implementations 24 Mar 2024 Han Yan, Yang Li, Zhennan Wu, Shenzhou Chen, Weixuan Sun, Taizhang Shang, Weizhe Liu, Tian Chen, Xiaqiang Dai, Chao Ma, Hongdong Li, Pan Ji

We present Frankenstein, a diffusion-based framework that can generate semantic-compositional 3D scenes in a single pass.

Denoising

FusionINN: Decomposable Image Fusion for Brain Tumor Monitoring

1 code implementation 23 Mar 2024 Nishant Kumar, Ziyan Tao, Jaikirat Singh, Yang Li, Peiwen Sun, Binghui Zhao, Stefan Gumhold

Image fusion typically employs non-invertible neural networks to merge multiple source images into a single fused image.

Denoising Multi-Exposure Image Fusion

OUCopula: Bi-Channel Multi-Label Copula-Enhanced Adapter-Based CNN for Myopia Screening Based on OU-UWF Images

no code implementations 18 Mar 2024 Yang Li, Qiuyi Huang, Chong Zhong, Danjuan Yang, Meiyan Li, A. H. Welsh, Aiyi Liu, Bo Fu, Catherine C. Liu, Xingtao Zhou

Inspired by the complex relationships between OU and the high correlation between the (continuous) outcome labels (Spherical Equivalent and Axial Length), we propose a framework of copula-enhanced adapter convolutional neural network (CNN) learning with OU UWF fundus images (OUCopula) for joint prediction of multiple clinical scores.

Matrix-Transformation Based Low-Rank Adaptation (MTLoRA): A Brain-Inspired Method for Parameter-Efficient Fine-Tuning

no code implementations 12 Mar 2024 Yao Liang, Yuwei Wang, Yang Li, Yi Zeng

In response to this, inspired by the idea that the functions of the brain are shaped by its geometric structure, this paper integrates this idea into LoRA technology and proposes a new matrix transformation-based reparameterization method for efficient fine-tuning, named Matrix-Transformation based Low-Rank Adaptation (MTLoRA).

Natural Language Understanding parameter-efficient fine-tuning +1

Contrastive Continual Learning with Importance Sampling and Prototype-Instance Relation Distillation

1 code implementation 7 Mar 2024 Jiyong Li, Dilshod Azizov, Yang Li, Shangsong Liang

Recently, because of the high-quality representations of contrastive learning methods, rehearsal-based contrastive continual learning has been proposed to explore how to continually learn transferable representation embeddings to avoid the catastrophic forgetting issue in traditional continual settings.

Continual Learning Contrastive Learning +2

Electrical Load Forecasting Model Using Hybrid LSTM Neural Networks with Online Correction

no code implementations 6 Mar 2024 Nan Lu, Quan Ouyang, Yang Li, Changfu Zou

Accurate electrical load forecasting is of great importance for the efficient operation and control of modern power systems.

Load Forecasting Time Series

A Unified Model for Active Battery Equalization Systems

no code implementations 6 Mar 2024 Quan Ouyang, Nourallah Ghaeminezhad, Yang Li, Torsten Wik, Changfu Zou

Lithium-ion battery packs demand effective active equalization systems to enhance their usable capacity and lifetime.

Pyramid Feature Attention Network for Monocular Depth Prediction

no code implementations 3 Mar 2024 Yifang Xu, Chenglei Peng, Ming Li, Yang Li, Sidan Du

Deep convolutional neural networks (DCNNs) have achieved great success in monocular depth estimation (MDE).

Depth Prediction Monocular Depth Estimation

Enhancing Continuous Domain Adaptation with Multi-Path Transfer Curriculum

no code implementations 26 Feb 2024 Hanbing Liu, Jingge Wang, Xuan Zhang, Ye Guo, Yang Li

Specifically, we construct a transfer curriculum over the source and intermediate domains based on Wasserstein distance, motivated by theoretical analysis of CDA.

Capacity Estimation Domain Adaptation +3

ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition

1 code implementation 23 Feb 2024 Lu Ye, Ze Tao, Yong Huang, Yang Li

In this paper, we introduce ChunkAttention, a prefix-aware self-attention module that can detect matching prompt prefixes across multiple requests and share their key/value tensors in memory at runtime to improve the memory utilization of KV cache.
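
The prefix-sharing idea in the ChunkAttention excerpt can be illustrated with a small prefix tree keyed by fixed-size token chunks. This is only a sketch under stated assumptions, not the paper's implementation: `PrefixKVCache`, `Node`, `CHUNK`, and the `compute_kv` callback are hypothetical names, and the "key/value tensors" are stand-in Python objects.

```python
from dataclasses import dataclass, field

CHUNK = 4  # hypothetical chunk size, in tokens


@dataclass
class Node:
    kv: object = None  # stand-in for the key/value tensors of one chunk
    children: dict = field(default_factory=dict)


class PrefixKVCache:
    """Toy prefix tree: requests with the same leading chunks share nodes."""

    def __init__(self, compute_kv):
        self.root = Node()
        self.compute_kv = compute_kv  # hypothetical callback: prefix tokens -> KV
        self.computed_chunks = 0

    def lookup_or_insert(self, tokens):
        node, kvs = self.root, []
        full = len(tokens) - len(tokens) % CHUNK  # ignore a trailing partial chunk
        for i in range(0, full, CHUNK):
            key = tuple(tokens[i:i + CHUNK])
            child = node.children.get(key)
            if child is None:
                # KV for a chunk depends on the whole prefix, which the
                # path from the root to this node identifies.
                child = Node(kv=self.compute_kv(tuple(tokens[:i + CHUNK])))
                node.children[key] = child
                self.computed_chunks += 1
            kvs.append(child.kv)
            node = child
        return kvs
```

Two requests that start with the same chunk reuse the stored object for that chunk, so only the diverging suffix chunks trigger new computation.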

Open Ad Hoc Teamwork with Cooperative Game Theory

1 code implementation 23 Feb 2024 Jianhong Wang, Yang Li, Yuan Zhang, Wei Pan, Samuel Kaski

Ad hoc teamwork poses a challenging problem, requiring the design of an agent to collaborate with teammates without prior coordination or joint training.

Unsupervised Text Style Transfer via LLMs and Attention Masking with Multi-way Interactions

no code implementations 21 Feb 2024 Lei Pan, Yunshi Lan, Yang Li, Weining Qian

Among existing methods for UTST tasks, the attention masking approach and Large Language Models (LLMs) are deemed two pioneering methods.

In-Context Learning Knowledge Distillation +4

Flexible Physical Camouflage Generation Based on a Differential Approach

no code implementations 21 Feb 2024 Yang Li, Wenyi Tan, Chenxing Zhao, Shuangju Zhou, Xinkai Liang, Quan Pan

This involves incorporating a specially designed adversarial loss and covert constraint loss to guarantee the adversarial and covert nature of the camouflage in the physical world.

Neural Rendering

A Geometric Algorithm for Tubular Shape Reconstruction from Skeletal Representation

1 code implementation 20 Feb 2024 Guoqing Zhang, Yang Li

We introduce a novel approach for the reconstruction of tubular shapes from skeletal representations.

Aligning Individual and Collective Objectives in Multi-Agent Cooperation

no code implementations 19 Feb 2024 Yang Li, WenHao Zhang, Jianhong Wang, Shao Zhang, Yali Du, Ying Wen, Wei Pan

Among the research topics in multi-agent learning, mixed-motive cooperation is one of the most prominent challenges, primarily due to the mismatch between individual and collective goals.

SMAC+ Starcraft +1

FGeo-HyperGNet: Geometric Problem Solving Integrating Formal Symbolic System and Hypergraph Neural Network

1 code implementation 18 Feb 2024 Xiaokai Zhang, Na Zhu, Cheng Qin, Yang Li, Zhenbing Zeng, Tuo Leng

The symbolic part is a formal system built on FormalGeo, which can automatically perform geometric relational reasoning and algebraic calculations and organize the solving process into a solution hypertree with conditions as hypernodes and theorems as hyperedges.

Geometry Problem Solving Relational Reasoning

BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data

no code implementations 12 Feb 2024 Mateusz Łajszczak, Guillermo Cámbara, Yang Li, Fatih Beyhan, Arent van Korlaar, Fan Yang, Arnaud Joly, Álvaro Martín-Cortinas, Ammar Abbas, Adam Michalski, Alexis Moinet, Sri Karlapati, Ewa Muszyńska, Haohan Guo, Bartosz Putrycz, Soledad López Gambino, Kayeon Yoo, Elena Sokolova, Thomas Drugman

Echoing the widely reported "emergent abilities" of large language models when trained on increasing volumes of data, we show that BASE TTS variants built with 10K+ hours and 500M+ parameters begin to demonstrate natural prosody on textually complex sentences.

Decoder Disentanglement +1

Beyond Inserting: Learning Identity Embedding for Semantic-Fidelity Personalized Diffusion Generation

no code implementations 31 Jan 2024 Yang Li, Songlin Yang, Wei Wang, Jing Dong

The previous methods either failed to accurately fit the face region or lost the interactive generative ability with other existing concepts in T2I models.

Image Generation Personalized Image Generation

BlockFusion: Expandable 3D Scene Generation using Latent Tri-plane Extrapolation

1 code implementation 30 Jan 2024 Zhennan Wu, Yang Li, Han Yan, Taizhang Shang, Weixuan Sun, Senbo Wang, Ruikai Cui, Weizhe Liu, Hiroyuki Sato, Hongdong Li, Pan Ji

A variational auto-encoder is employed to compress the tri-planes into the latent tri-plane space, on which the denoising diffusion process is performed.

Denoising Scene Generation

Localization of Dummy Data Injection Attacks in Power Systems Considering Incomplete Topological Information: A Spatio-Temporal Graph Wavelet Convolutional Neural Network Approach

no code implementations 27 Jan 2024 Zhaoyang Qu, Yunchang Dong, Yang Li, Siqi Song, Tao Jiang, Min Li, Qiming Wang, Lei Wang, Xiaoyong Bo, Jiye Zang, Qi Xu

Unfortunately, this approach tends to overlook the inherent topological correlations within the non-Euclidean spatial attributes of power grid data, consequently leading to diminished accuracy in attack localization.

A New Method for Vehicle Logo Recognition Based on Swin Transformer

no code implementations 27 Jan 2024 Yang Li, Doudou Zhang, Jianli Xiao

Additionally, the use of a transfer learning strategy enables our method to be on par with state-of-the-art VLR methods.

Logo Recognition Transfer Learning

Label-free detection of exosomes from different cellular sources based on surface-enhanced Raman spectroscopy combined with machine learning models

no code implementations 25 Jan 2024 Yang Li, Xiaoming Lyu, Kuo Zhan, Haoyu Ji, Lei Qin, JianAn Huang

In comparison to other machine learning analyses, our method used a small amount of SERS data to allow simple and rapid exosome detection, which enables a timely subsequent study of cell-cell interactions, communication mechanisms, and disease mechanisms in life sciences.

SCNet: Sparse Compression Network for Music Source Separation

2 code implementations 24 Jan 2024 Weinan Tong, Jiaxu Zhu, Jun Chen, Shiyin Kang, Tao Jiang, Yang Li, Zhiyong Wu, Helen Meng

We use a higher compression ratio on subbands with less information to improve the information density and focus on modeling subbands with more information.

Music Source Separation

Computation Rate Maximization for Wireless Powered Edge Computing With Multi-User Cooperation

1 code implementation 22 Jan 2024 Yang Li, Xing Zhang, Bo Lei, Qianying Zhao, Min Wei, Zheyan Qu, Wenbo Wang

Simulation results show that the performance of the proposed algorithms is comparable to that of the exhaustive search method, and the deep learning-based algorithm significantly reduces the execution time of the algorithm.

Edge-computing

Motion-Zero: Zero-Shot Moving Object Control Framework for Diffusion-Based Video Generation

no code implementations 18 Jan 2024 Changgu Chen, Junwei Shu, Lianggangxu Chen, Gaoqi He, Changbo Wang, Yang Li

However, exerting control over the motion of objects in videos generated by any video diffusion model is a challenging problem.

Denoising Position +1

Machine Learning Insides OptVerse AI Solver: Design Principles and Applications

no code implementations 11 Jan 2024 Xijun Li, Fangzhou Zhu, Hui-Ling Zhen, Weilin Luo, Meng Lu, Yimin Huang, Zhenan Fan, Zirui Zhou, Yufei Kuang, Zhihai Wang, Zijie Geng, Yang Li, Haoyang Liu, Zhiwu An, Muming Yang, Jianshu Li, Jie Wang, Junchi Yan, Defeng Sun, Tao Zhong, Yong Zhang, Jia Zeng, Mingxuan Yuan, Jianye Hao, Jun Yao, Kun Mao

To this end, we present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI Solver, which aims to mitigate the scarcity of real-world mathematical programming instances, and to surpass the capabilities of traditional optimization techniques.

Decision Making Management

GloTSFormer: Global Video Text Spotting Transformer

1 code implementation 8 Jan 2024 Han Wang, Yanjie Wang, Yang Li, Can Huang

In this paper, we propose a novel Global Video Text Spotting Transformer GloTSFormer to model the tracking problem as global associations and utilize the Gaussian Wasserstein distance to guide the morphological correlation between frames.

Text Spotting

Learning Persistent Community Structures in Dynamic Networks via Topological Data Analysis

1 code implementation 6 Jan 2024 Dexu Kong, Anping Zhang, Yang Li

Dynamic community detection methods often lack effective mechanisms to ensure temporal consistency, hindering the analysis of network evolution.

Clustering Community Detection +3

PPBFL: A Privacy Protected Blockchain-based Federated Learning Model

no code implementations 2 Jan 2024 Yang Li, Chunhe Xia, Wanshuang Lin, Tianbo Wang

Therefore, we propose A Privacy Protected Blockchain-based Federated Learning Model (PPBFL) to enhance the security of federated learning and encourage active participation of nodes in model training.

Federated Learning

DOEPatch: Dynamically Optimized Ensemble Model for Adversarial Patches Generation

no code implementations 28 Dec 2023 Wenyi Tan, Yang Li, Chenxing Zhao, ZhunGa Liu, Quan Pan

While ensemble models have proven effective, current research in the field of object detection typically focuses on the simple fusion of the outputs of all models, with limited attention being given to developing general adversarial patches that can function effectively in the physical world.

Autonomous Driving Object +2

Semantic Draw Engineering for Text-to-Image Creation

no code implementations 23 Dec 2023 Yang Li, Huaqiang Jiang, Yangkai Wu

Text-to-image generation is conducted through Generative Adversarial Networks (GANs) or transformer models.

Computational Efficiency Text-to-Image Generation

Multi-Granularity Information Interaction Framework for Incomplete Utterance Rewriting

no code implementations 19 Dec 2023 Haowei Du, Dinghao Zhang, Chen Li, Yang Li, Dongyan Zhao

Recent approaches in Incomplete Utterance Rewriting (IUR) fail to capture the source of important words, which is crucial to edit the incomplete utterance, and introduce words from irrelevant utterances.

Relation-Aware Question Answering for Heterogeneous Knowledge Graphs

1 code implementation 19 Dec 2023 Haowei Du, Quzhe Huang, Chen Li, Chen Zhang, Yang Li, Dongyan Zhao

To address this issue, we construct a dual relation graph where each node denotes a relation in the original KG (the primal entity graph) and edges are constructed between relations sharing the same head or tail entities.

Knowledge Base Question Answering Knowledge Graphs +1

H-ensemble: An Information Theoretic Approach to Reliable Few-Shot Multi-Source-Free Transfer

no code implementations 19 Dec 2023 Yanru Wu, Jianning Wang, Weida Wang, Yang Li

In this work, we adopt an information theoretic perspective on it and propose a framework named H-ensemble, which dynamically learns the optimal linear combination, or ensemble, of source models for the target task, using a generalization of maximal correlation regression.

Transfer Learning

Hypergraph-Enhanced Dual Convolutional Network for Bundle Recommendation

1 code implementation 18 Dec 2023 Kangbo Liu, Yang Li, Yaoxin Wu, Zhaoxuan Wang, Xiaoxu Wang

While previous approaches have demonstrated notable performance, we argue that they may compromise the ternary relationship among users, items, and bundles.

Rich Human Feedback for Text-to-Image Generation

1 code implementation CVPR 2024 Youwei Liang, Junfeng He, Gang Li, Peizhao Li, Arseniy Klimovskiy, Nicholas Carolan, Jiao Sun, Jordi Pont-Tuset, Sarah Young, Feng Yang, Junjie Ke, Krishnamurthy Dj Dvijotham, Katie Collins, Yiwen Luo, Yang Li, Kai J Kohlhoff, Deepak Ramachandran, Vidhya Navalpakkam

We show that the predicted rich human feedback can be leveraged to improve image generation, for example, by selecting high-quality training data to finetune and improve the generative models, or by creating masks with predicted heatmaps to inpaint the problematic regions.

Text-to-Image Generation

ALOHA: from Attention to Likes -- a unified mOdel for understanding HumAn responses to diverse visual content

no code implementations 15 Dec 2023 Peizhao Li, Junfeng He, Gang Li, Rachit Bhargava, Shaolei Shen, Nachiappan Valliappan, Youwei Liang, Hongxiang Gu, Venky Ramachandran, Golnaz Farhadi, Yang Li, Kai J Kohlhoff, Vidhya Navalpakkam

Progress in human behavior modeling involves understanding both implicit, early-stage perceptual behavior such as human attention and explicit, later-stage behavior such as subjective preferences/likes.

Learning a Low-Rank Feature Representation: Achieving Better Trade-Off between Stability and Plasticity in Continual Learning

1 code implementation 14 Dec 2023 Zhenrong Liu, Yang Li, Yi Gong, Yik-Chung Wu

This approach optimizes network parameters in the null space of the past tasks' feature representation matrix to guarantee stability.

Continual Learning

Explainable Trajectory Representation through Dictionary Learning

no code implementations 13 Dec 2023 Yuanbo Tang, Zhiyuan Peng, Yang Li

A hierarchical dictionary learning scheme is also proposed to ensure the algorithm's scalability on large networks, leading to a multi-scale trajectory representation.

Data Compression Dictionary Learning +1

Astrocyte-Enabled Advancements in Spiking Neural Networks for Large Language Modeling

no code implementations 12 Dec 2023 Guobin Shen, Dongcheng Zhao, Yiting Dong, Yang Li, Jindong Li, Kang Sun, Yi Zeng

Within the complex neuroarchitecture of the brain, astrocytes play crucial roles in development, structure, and metabolism.

Language Modelling Text Generation

Fine-Grained Extraction of Road Networks via Joint Learning of Connectivity and Segmentation

1 code implementation 7 Dec 2023 Yijia Xu, Liqiang Zhang, Wuming Zhang, Suhong Liu, Jingwen Li, Xingang Li, Yuebin Wang, Yang Li

Road network extraction from satellite images is widely applied in intelligent traffic management and autonomous driving.

Autonomous Driving Management +2

Resource Allocation for Semantic Communication under Physical-layer Security

no code implementations 7 Dec 2023 Yang Li, Xinyu Zhou, Jun Zhao

The secrecy rate is the communication rate at which no information is disclosed to an eavesdropper.

Semantic Communication

Factor-Assisted Federated Learning for Personalized Optimization with Heterogeneous Data

no code implementations 7 Dec 2023 Feifei Wang, Huiyun Tang, Yang Li

To address this issue, we develop a novel personalized federated learning framework for heterogeneous data, which we refer to as FedSplit.

Personalized Federated Learning

Open-sourced Data Ecosystem in Autonomous Driving: the Present and Future

2 code implementations 6 Dec 2023 Hongyang Li, Yang Li, Huijie Wang, Jia Zeng, Huilin Xu, Pinlong Cai, Li Chen, Junchi Yan, Feng Xu, Lu Xiong, Jingdong Wang, Futang Zhu, Chunjing Xu, Tiancai Wang, Fei Xia, Beipeng Mu, Zhihui Peng, Dahua Lin, Yu Qiao

With the continuous maturation and application of autonomous driving technology, a systematic examination of open-source autonomous driving datasets becomes instrumental in fostering the robust evolution of the industry ecosystem.

Autonomous Driving

Deep Reinforcement Learning Based Optimal Energy Management of Multi-energy Microgrids with Uncertainties

no code implementations 30 Nov 2023 Yang Cui, Yang Xu, Yang Li, Yijian Wang, Xinpeng Zou

To help EMS formulate optimal dispatching schemes, a deep reinforcement learning (DRL)-based MEMG energy management scheme with renewable energy source (RES) uncertainty is proposed in this paper.

energy management Management +1

Perceptual Group Tokenizer: Building Perception with Iterative Grouping

no code implementations 30 Nov 2023 Zhiwei Deng, Ting Chen, Yang Li

In this paper, we propose the Perceptual Group Tokenizer, a model that entirely relies on grouping operations to extract visual features and perform self-supervised representation learning, where a series of grouping operations are used to iteratively hypothesize the context for pixels or superpixels to refine feature representations.

Representation Learning Self-Supervised Image Classification +2

Advancing Attack-Resilient Scheduling of Integrated Energy Systems with Demand Response via Deep Reinforcement Learning

no code implementations 28 Nov 2023 Yang Li, Wenjie Ma, Yuanzheng Li, Sen Li, Zhe Chen

Simulation results demonstrate that our method is capable of adequately addressing the uncertainties resulting from RES and loads, mitigating the impact of cyber-attacks on the scheduling strategy, and ensuring a stable demand supply for various energy sources.

Scheduling

Controlling Large Language Model-based Agents for Large-Scale Decision-Making: An Actor-Critic Approach

no code implementations 23 Nov 2023 Bin Zhang, Hangyu Mao, Jingqing Ruan, Ying Wen, Yang Li, Shao Zhang, Zhiwei Xu, Dapeng Li, Ziyue Li, Rui Zhao, Lijuan Li, Guoliang Fan

The remarkable progress in Large Language Models (LLMs) opens up new avenues for addressing planning and decision-making problems in Multi-Agent Systems (MAS).

Decision Making Hallucination +3

GENET: Unleashing the Power of Side Information for Recommendation via Hypergraph Pre-training

no code implementations 22 Nov 2023 Yang Li, Qi'ao Zhao, Chen Lin, Zhenjie Zhang, Xiaomin Zhu

(2) The diverse semantics of side information that describes items and users from multi-level in a context different from recommendation systems.

Sequential Recommendation

FBChain: A Blockchain-based Federated Learning Model with Efficiency and Secure Communication

no code implementations 21 Nov 2023 Yang Li, Chunhe Xia, Wei Liu, Chen Chen, Tianbo Wang

This article proposes a Blockchain-based Federated Learning (FBChain) model for federated learning parameter communication to overcome the above two problems.

Federated Learning

There is No Silver Bullet: Benchmarking Methods in Predictive Combinatorial Optimization

no code implementations 13 Nov 2023 Haoyu Geng, Hang Ruan, Runzhong Wang, Yang Li, Yang Wang, Lei Chen, Junchi Yan

Our study shows that PnO approaches are better than PtO on 7 out of 8 benchmarks, but there is no silver bullet found for the specific design choices of PnO.

Benchmarking Combinatorial Optimization +3

CeCNN: Copula-enhanced convolutional neural networks in joint prediction of refraction error and axial length based on ultra-widefield fundus images

1 code implementation 7 Nov 2023 Chong Zhong, Yang Li, Danjuan Yang, Meiyan Li, Xingyao Zhou, Bo Fu, Catherine C. Liu, A. H. Welsh

The CeCNN formulates a multiresponse regression that relates multiple dependent discrete-continuous responses and the image covariate, where the nonlinearity of the association is modeled by a backbone CNN.

regression

AdaptMVSNet: Efficient Multi-View Stereo with adaptive convolution and attention fusion

1 code implementation journal 2023 Pengfei Jiang, Xiaoyan Yang, Yuanjie Chen, Wenjie Song, Yang Li

To this end, adaptive convolution is introduced to significantly improve the efficiency in speed and metrics compared to current methods.

In-Context Prompt Editing For Conditional Audio Generation

no code implementations 1 Nov 2023 Ernie Chang, Pin-Jie Lin, Yang Li, Sidd Srinivasan, Gael Le Lan, David Kant, Yangyang Shi, Forrest Iandola, Vikas Chandra

We show that the framework enhanced the audio quality across the set of collected user prompts, which were edited with reference to the training captions as exemplars.

Audio Generation Retrieval

R$^3$ Prompting: Review, Rephrase and Resolve for Chain-of-Thought Reasoning in Large Language Models under Noisy Context

no code implementations 25 Oct 2023 Qingyuan Tian, Hanlun Zhu, Lei Wang, Yang Li, Yunshi Lan

Further analyses and ablation studies show the robustness and generalization of the R$^3$ prompting method in solving reasoning tasks in LLMs under noisy context.

Sentence

BRFL: A Blockchain-based Byzantine-Robust Federated Learning Model

no code implementations 20 Oct 2023 Yang Li, Chunhe Xia, Chang Li, Tianbo Wang

With the increasing importance of machine learning, the privacy and security of training data have become critical.

Federated Learning

Kernel Learning in Ridge Regression "Automatically" Yields Exact Low Rank Solution

1 code implementation 18 Oct 2023 Yunlu Chen, Yang Li, Keli Liu, Feng Ruan

Assuming that the covariates have nonzero explanatory power for the response only through a low dimensional subspace (central mean subspace), we find that the global minimizer of the finite sample kernel learning objective is also low rank with high probability.

regression

MM-BigBench: Evaluating Multimodal Models on Multimodal Content Comprehension Tasks

2 code implementations 13 Oct 2023 Xiaocui Yang, Wenfang Wu, Shi Feng, Ming Wang, Daling Wang, Yang Li, Qi Sun, Yifei Zhang, XiaoMing Fu, Soujanya Poria

Consequently, our work complements research on the performance of MLLMs in multimodal comprehension tasks, achieving a more comprehensive and holistic evaluation of MLLMs.

Multimodal Reasoning

A Zero-Shot Language Agent for Computer Control with Structured Reflection

no code implementations 12 Oct 2023 Tao Li, Gang Li, Zhiwei Deng, Bryan Wang, Yang Li

To perform a task, recent works often require a model to learn from trace examples of the task via either supervised learning or few/many-shot prompting.

Management

Language Models As Semantic Indexers

1 code implementation 11 Oct 2023 Bowen Jin, Hansi Zeng, Guoyin Wang, Xiusi Chen, Tianxin Wei, Ruirui Li, Zhengyang Wang, Zheng Li, Yang Li, Hanqing Lu, Suhang Wang, Jiawei Han, Xianfeng Tang

Semantic identifier (ID) is an important concept in information retrieval that aims to preserve the semantics of objects such as documents and items inside their IDs.

Contrastive Learning Information Retrieval +2

Automatic Macro Mining from Interaction Traces at Scale

1 code implementation 10 Oct 2023 Forrest Huang, Gang Li, Tao Li, Yang Li

Macros are building block tasks of our everyday smartphone activity (e.g., "login" or "booking a flight").

Secondary frequency control of islanded microgrid considering wind and solar stochastics

no code implementations 8 Oct 2023 Cheng Zhong, Zhifu Jiang, Xiangyu Zhang, Jikai Chen, Yang Li

Finally, a microgrid simulation model including multiple PV and wind DGs is built and performed in various scenarios compared to the traditional secondary frequency control method.

Model Predictive Control

Distribution-free risk assessment of regression-based machine learning algorithms

no code implementations 5 Oct 2023 Sukrita Singh, Neeraj Sarna, Yuanyuan Li, Yang Li, Agni Orfanoudaki, Michael Berger

We solve the risk-assessment problem using the conformal prediction approach, which provides prediction intervals that are guaranteed to contain the true label with a given probability.

Conformal Prediction Prediction Intervals +1

Low-carbon optimal dispatch of integrated energy system considering demand response under the tiered carbon trading mechanism

no code implementations 4 Oct 2023 Limeng Wang, Xuemeng Liu, Yang Li, Duo Chang, Xing Ren

The example results show that, considering the carbon trading cost and demand response under the tiered carbon trading mechanism, the total operating cost of IES is reduced by 5.69% and the carbon emission is reduced by 17.06%, which significantly improves the reliability, economy and low carbon performance of IES.

Scheduling

A Demand-Supply Cooperative Responding Strategy in Power System with High Renewable Energy Penetration

no code implementations 26 Sep 2023 Yuanzheng Li, Xinxin Long, Yang Li, Yizhou Ding, Tao Yang, Zhigang Zeng

In this context, unreasonable profit distributions on the demand-supply side would lead to the conflict of interests and diminish the effectiveness of cooperative responses.

Bad Actor, Good Advisor: Exploring the Role of Large Language Models in Fake News Detection

1 code implementation 21 Sep 2023 Beizhe Hu, Qiang Sheng, Juan Cao, Yuhui Shi, Yang Li, Danding Wang, Peng Qi

To instantiate this proposal, we design an adaptive rationale guidance network for fake news detection (ARG), in which SLMs selectively acquire insights on news analysis from the LLMs' rationales.

Fake News Detection

Learning Point-wise Abstaining Penalty for Point Cloud Anomaly Detection

1 code implementation 19 Sep 2023 Shaocong Xu, Pengfei Li, Xinyu Liu, Qianpu Sun, Yang Li, Shihui Guo, Zhen Wang, Bo Jiang, Rui Wang, Kehua Sheng, Bo Zhang, Hao Zhao

We demonstrate that learning different abstaining penalties, apart from point-wise penalty, for different types of (synthesized) outliers can further improve the performance.

Anomaly Detection Autonomous Driving +1

For A More Comprehensive Evaluation of 6DoF Object Pose Tracking

no code implementations 14 Sep 2023 Yang Li, Fan Zhong, Xin Wang, Shuangbing Song, Jiachen Li, Xueying Qin, Changhe Tu

The limitations of previous scoring methods and error metrics are analyzed, based on which we introduce our improved evaluation methods.

Pose Tracking

Folding Attention: Memory and Power Optimization for On-Device Transformer-based Streaming Speech Recognition

no code implementations 14 Sep 2023 Yang Li, Liangzhen Lai, Yuan Shangguan, Forrest N. Iandola, Zhaoheng Ni, Ernie Chang, Yangyang Shi, Vikas Chandra

Instead, the bottleneck lies in the linear projection layers of multi-head attention and feedforward networks, constituting a substantial portion of the model size and contributing significantly to computation, memory, and power usage.

speech-recognition Speech Recognition