Search Results for author: Bin Li

Found 253 papers, 87 papers with code

A Knowledge storage and semantic space alignment Method for Multi-documents dialogue generation

no code implementations dialdoc (ACL) 2022 Minjun Zhu, Bin Li, Yixuan Weng, Fei Xia

Question Answering (QA) is a Natural Language Processing (NLP) task that can measure language and semantic understanding ability; it requires a system not only to retrieve relevant documents from a large number of articles but also to answer the corresponding questions according to those documents.

Dialogue Generation Language Modelling +3

基于大规模语料库的《古籍汉字分级字表》研究(The Formulation of The graded Chinese character list of ancient books Based on Large-scale Corpus)

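no code implementations CCL 2021 Changwei Xu, Minxuan Feng, Bin Li, Yiguo Yuan

The Graded Chinese Character List of Ancient Books (《古籍汉字分级字表》) is a graded character list built from a large-scale corpus of ancient texts to help learners read ancient documents. It fills a gap in research on character lists for ancient books and grades the characters by learning priority; it currently contains 105 level-1 characters, 340 level-2 characters, and 555 level-3 characters. This paper describes the main basis and basic steps for developing the list and compares it with the traditional literacy primers known as the "San Bai Qian" (三百千) and with the List of Frequently Used Characters in Modern Chinese (《现代汉语常用字表》) to verify that its character selection is reasonable. The list helps learners master the most common characters in ancient texts first and improve their ability to read ancient books, thereby promoting the inheritance and development of fine traditional Chinese culture.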

VPAI_Lab at MedVidQA 2022: A Two-Stage Cross-modal Fusion Method for Medical Instructional Video Classification

1 code implementation BioNLP (ACL) 2022 Bin Li, Yixuan Weng, Fei Xia, Bin Sun, Shutao Li

Given an input video, the MedVidCL task aims to correctly classify it into one of three following categories: Medical Instructional, Medical Non-instructional, and Non-medical.

Video Classification

Knowledge Transfer with Visual Prompt in multi-modal Dialogue Understanding and Generation

no code implementations TU (COLING) 2022 Minjun Zhu, Yixuan Weng, Bin Li, Shizhu He, Kang Liu, Jun Zhao

In this work, we propose a knowledge transfer method with visual prompt (VPTG) fusing multi-modal data, which is a flexible module that can utilize the text-only seq2seq model to handle visual dialogue tasks.

Dialogue Understanding Knowledge Distillation +2

Continuing Pre-trained Model with Multiple Training Strategies for Emotional Classification

no code implementations WASSA (ACL) 2022 Bin Li, Yixuan Weng, Qiya Song, Bin Sun, Shutao Li

This paper describes the contribution of the LingJing team’s method to the Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis (WASSA) 2022 shared task on Emotion Classification.

Attribute Classification +4

中文连动句语义关系识别研究(Research on Semantic Relation Recognition of Chinese Serial-verb Sentences)

no code implementations CCL 2021 Chao Sun, Weiguang Qu, Tingxin Wei, Yanhui Gu, Bin Li, Junsheng Zhou

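A serial-verb sentence has the form "NP+VP1+VP2": it contains two or more verbs (or verb structures) whose agent is the same entity. Serial-verb sentences with the same structure can express several different semantic relations. Building on earlier classifications of the semantic relations between VP1 and VP2, this paper annotates a dataset of semantic relations in serial-verb sentences and recognizes these relations with neural networks. The method decomposes the task: it encodes sentences with BERT, first uses BiLSTM-CRF to identify the serial verbs (VP) and their subject (NP), and then uses BiLSTM-Attention on an encoding fused with the serial-verb information to classify the relation between the verbs. Experimental results verify the effectiveness of the proposed method.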

The First International Ancient Chinese Word Segmentation and POS Tagging Bakeoff: Overview of the EvaHan 2022 Evaluation Campaign

no code implementations LT4HALA (LREC) 2022 Bin Li, Yiguo Yuan, Jingya Lu, Minxuan Feng, Chao Xu, Weiguang Qu, Dongbo Wang

This paper presents the results of the First Ancient Chinese Word Segmentation and POS Tagging Bakeoff (EvaHan), which was held at the Second Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA) 2022, in the context of the 13th Edition of the Language Resources and Evaluation Conference (LREC 2022).

Chinese Word Segmentation POS +2

先秦词网构建及梵汉对比研究(The Construction of Pre-Qin Ancient Chinese WordNet and Cross Language Comparative Study between Ancient Sanskrit WordNet and Pre-Qin Ancient Chinese WordNet)

no code implementations CCL 2021 Xuehui Lu, Huidan Xu, Siyu Chen, Bin Li

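Pre-Qin Chinese holds an important place in research on the history of the Chinese language, yet previous work has never produced a structured Pre-Qin lexical resource, which makes it hard to meet the needs of ancient Chinese information processing and cross-language comparison. Internationally, wordnets for dozens of languages have been built on the semantic-class architecture of the English WordNet and have become basic resources for multilingual natural language processing and cross-language comparison. This paper surveys the construction of wordnets in China and abroad, especially wordnets for ancient languages and for Chinese, and then describes in detail the construction and correction of the Pre-Qin WordNet, which covers 43,591 words, 61,227 senses, and 17,975 semantic classes. Through a cross-language comparison with the ancient Sanskrit WordNet, the paper also attempts to analyze the lexical commonalities and differences between these two ancient languages and provides an initial validation of the Pre-Qin WordNet.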

中文词语离合现象识别研究(Research on Recognition of the Separation and Reunion Phenomena of Words in Chinese)

no code implementations CCL 2021 Lou Zhou, Weiguang Qu, Tingxin Wei, Junsheng Zhou, Bin Li, Yanhui Gu

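The separation and reunion of words is a special phenomenon in Chinese in which a word can be used either split or joined. This paper uses character-level sequence labeling to automatically recognize separated usages of two-character verbs, which avoids error propagation from Chinese word segmentation and POS tagging and saves the manual cost of writing matching rules and feature templates. During training, the Chinese pre-trained BERT model is fine-tuned to obtain task-oriented character representations, and a masking mechanism hides from the model the words that are split in a separated usage; this reduces the influence of the word itself on the recognition result and strengthens learning of the inserted components. Different masks are applied to the first and second morphemes to emphasize their order, which lets the model recognize complex and occasional separated usages. To obtain sentence representations with contextual information, the original and the masked sentence representations are fed into two BiLSTM layers with different parameters, and a CRF then captures the dependencies in the sentence's label sequence. The proposed BERT MASK + 2BiLSTMs + CRF model improves F1 by 2.85% over the best existing model for separable-word recognition.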

基于深度学习的实体关系抽取研究综述(Review of Entity Relation Extraction based on deep learning)

no code implementations CCL 2020 Zhentao Xia, Weiguang Qu, Yanhui Gu, Junsheng Zhou, Bin Li

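As a core subtask of information extraction, entity relation extraction is important for natural language processing applications such as knowledge graphs, intelligent question answering, and semantic search. Relation extraction aims to automatically identify the semantic relations between entities in unstructured text. This paper focuses on sentence-level relation extraction, introduces the main datasets used for the task, and reviews existing techniques, which fall into three groups: supervised relation extraction, distantly supervised relation extraction, and joint entity-relation extraction. We compare the models used for this task and analyze their contributions and shortcomings. Finally, we introduce the current state of research and the methods for Chinese entity relation extraction.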

Relation Extraction

DynGL-SDP: Dynamic Graph Learning for Semantic Dependency Parsing

1 code implementation COLING 2022 Bin Li, Miao Gao, Yunlong Fan, Yikemaiti Sataer, Zhiqiang Gao, Yaocheng Gui

A recent success in semantic dependency parsing shows that graph neural networks can make significant accuracy improvements, owing to their powerful ability to learn expressive graph representations.

Dependency Parsing graph construction +3

Align-smatch: A Novel Evaluation Method for Chinese Abstract Meaning Representation Parsing based on Alignment of Concept and Relation

no code implementations LREC 2022 Liming Xiao, Bin Li, Zhixing Xu, Kairui Huo, Minxuan Feng, Junsheng Zhou, Weiguang Qu

Therefore, to fill the gap in evaluation methods for Chinese AMR parsing, we have improved the triple-generation algorithm of the AMR evaluation metric smatch to make it compatible with concept alignment and relation alignment.

AMR Parsing Concept Alignment +2

Building a Chinese AMR Bank with Concept and Relation Alignments

no code implementations LILT 2019 Bin Li, Yuan Wen, Li Song, Weiguang Qu, Nianwen Xue

One significant change we have made to the AMR annotation methodology is the inclusion of the alignment between word tokens in the sentence and the concepts/relations in the CAMR annotation to make it easier for automatic parsers to model the correspondence between a sentence and its meaning representation.

Relation Sentence

多轮对话的篇章级抽象语义表示标注体系研究(Research on Discourse-level Abstract Meaning Representation Annotation framework in Multi-round Dialogue)

no code implementations CCL 2020 Tong Huang, Bin Li, Peiyi Yan, Tingting Ji, Weiguang Qu

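Dialogue analysis is a basic topic for natural language dialogue applications such as intelligent customer service and chatbots. Dialogue corpora differ considerably from ordinary written corpora: they contain many complex phenomena such as forms of address, emotional phrases, ellipsis, inverted word order, and redundancy, which strongly affect syntactic and semantic parsers, so the accuracy of automatic dialogue analysis has remained lower than on written corpora. The main reason is the lack of a rigorous formal description of multi-round dialogue, which hinders subsequent analysis and computation. Therefore, after reviewing dialogue annotation schemes and corpora in China and abroad, this paper proposes a discourse-level annotation framework for multi-round dialogue based on Abstract Meaning Representation. It discusses how to annotate discourse-level semantic structure, gives an alignment scheme for words and concept relations, adds semantic relations and concepts for forms of address and emotional phrases, adjusts the argument structure of words expressing subjective emotion, specifies the handling of some special dialogue phenomena, and designs a manual annotation platform, laying a foundation for large-scale annotation of and computational research on multi-round dialogue corpora.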

基于抽象语义表示的汉语疑问句的标注与分析(Chinese Interrogative Sentences Annotation and Analysis Based on the Abstract Meaning Representation)

no code implementations CCL 2020 Peiyi Yan, Bin Li, Tong Huang, Kairui Huo, Jin Chen, Weiguang Qu

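Syntactic and semantic analysis of interrogative sentences is widely used in search engines, information extraction, and question answering systems. Computational linguistics mostly handles interrogative sentences by combining question classification with syntactic parsing, and the accuracy and efficiency are still unsatisfactory. Linguistic research on interrogative sentences is rich, covering, for example, structural types, interrogative focus, and non-interrogative uses of interrogative pronouns, but it lacks a systematic formal representation. To address this problem, this paper uses Chinese Abstract Meaning Representation (CAMR), a graph-based representation of the whole-sentence semantics of Chinese, to annotate the semantic structure of interrogative sentences and to represent the interrogative focus together with the sentence semantics. It then selects 2,071 interrogative sentences from a 20,000-sentence corpus drawn from the web-media portion of the Penn Chinese Treebank CTB 8.0, primary school Chinese textbooks, and the Chinese translation of The Little Prince, and reports their main characteristics. The statistics show that every kind of interrogative pronoun can be represented by combining the interrogative concept amr-unknown with semantic relations, which fully expresses the key information, interrogative focus, and semantic structure of an interrogative sentence. Finally, the paper reports the distribution of interrogative focus according to the semantic relations associated with interrogative pronouns: cause, modifier, and patient account for the largest shares, at 26.53%, 16.73%, and 16.44% respectively. Annotation and analysis of interrogative sentences based on Abstract Meaning Representation can provide basic theory and resources for research on Chinese interrogative sentences.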

基于神经网络的连动句识别(Recognition of serial-verb sentences based on Neural Network)

no code implementations CCL 2020 Chao Sun, Weiguang Qu, Tingxin Wei, Yanhui Gu, Bin Li, Junsheng Zhou

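Serial-verb sentences are sentences with a serial-verb construction, a special syntactic structure that is common and frequently used in modern Chinese. Their grammatical structure and semantic relations are complex, which creates many problems for recognition. This paper studies the recognition of serial-verb sentences and proposes a neural-network-based method with two steps: first, simple rules are used to preprocess the corpus; second, the task is treated as text classification, with BERT encoding and a multi-layer CNN combined with a BiLSTM to extract features for classification. In experiments on a manually annotated corpus, the method reaches an accuracy of 92.71% and an F1 of 87.41%.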

Chaos in Motion: Unveiling Robustness in Remote Heart Rate Measurement through Brain-Inspired Skin Tracking

no code implementations 11 Apr 2024 Jie Wang, Jing Lian, Minjie Ma, Junqiang Lei, Chunbiao Li, Bin Li, Jizhao Liu

To address these issues, we regard the remote heart rate measurement as the process of analyzing the spatiotemporal characteristics of the optical flow signal in the video.

Uncertainty-Aware Deep Video Compression with Ensembles

no code implementations 28 Mar 2024 Wufei Ma, Jiahao Li, Bin Li, Yan Lu

Deep learning-based video compression is a challenging task, and many previous state-of-the-art learning-based video codecs use optical flows to exploit the temporal correlation between successive frames and then compress the residual error.

Motion Estimation Quantization +1

Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm

no code implementations 18 Mar 2024 Yi Wu, Ziqiang Li, Heliang Zheng, Chaoyue Wang, Bin Li

Drawing on recent advancements in diffusion models for text-to-image generation, identity-preserved personalization has made significant progress in accurately capturing specific identities with just a single reference image.

Text-to-Image Generation

Learning-augmented Online Minimization of Age of Information and Transmission Costs

no code implementations 5 Mar 2024 Zhongdong Liu, Keyuan Zhang, Bin Li, Yin Sun, Y. Thomas Hou, Bo Ji

To address this challenge, we develop a robust online algorithm to minimize the sum of transmission and staleness costs, ensuring a worst-case performance guarantee.

Unsupervised Cross-Domain Image Retrieval via Prototypical Optimal Transport

1 code implementation 28 Feb 2024 Bin Li, Ye Shi, Qian Yu, Jingya Wang

This paper introduces ProtoOT, a novel Optimal Transport formulation explicitly tailored for UCIR, which integrates intra-domain feature representation learning and cross-domain alignment into a unified framework.

Contrastive Learning Image Retrieval +2

Neural Video Compression with Feature Modulation

1 code implementation 27 Feb 2024 Jiahao Li, Bin Li, Yan Lu

This results in a better learning of the quantization scaler and helps our NVC support about 11.4 dB PSNR range.

Blocking Quantization +1

SA-MDKIF: A Scalable and Adaptable Medical Domain Knowledge Injection Framework for Large Language Models

no code implementations 1 Feb 2024 Tianhan Xu, Zhe Hu, Ling Chen, Bin Li

In the next stage, we train the skill router using task-specific downstream data and use this router to integrate the acquired skills with LLMs during inference.

A Systematic Literature Review on Explainability for Machine/Deep Learning-based Software Engineering Research

no code implementations 26 Jan 2024 Sicong Cao, Xiaobing Sun, Ratnadira Widyasari, David Lo, Xiaoxue Wu, Lili Bo, Jiale Zhang, Bin Li, Wei Liu, Di Wu, Yixin Chen

The remarkable achievements of Artificial Intelligence (AI) algorithms, particularly in Machine Learning (ML) and Deep Learning (DL), have fueled their extensive deployment across multiple sectors, including Software Engineering (SE).

Decision Making Vulnerability Detection

Towards Generative Abstract Reasoning: Completing Raven's Progressive Matrix via Rule Abstraction and Selection

no code implementations 18 Jan 2024 Fan Shi, Bin Li, Xiangyang Xue

In the odd-one-out task and two held-out configurations, RAISE can leverage acquired latent concepts and atomic rules to find the rule-breaking image in a matrix and handle problems with unseen combinations of rules and attributes.

Answer Generation Attribute +2

Accelerating Data Generation for Neural Operators via Krylov Subspace Recycling

no code implementations 17 Jan 2024 Hong Wang, Zhongkai Hao, Jie Wang, Zijie Geng, Zhen Wang, Bin Li, Feng Wu

To the best of our knowledge, SKR is the first attempt to address the time-consuming nature of data generation for learning neural operators.

A Study on Training and Developing Large Language Models for Behavior Tree Generation

no code implementations 16 Jan 2024 Fu Li, Xueying Wang, Bin Li, Yunlong Wu, Yanzhen Wang, Xiaodong Yi

The core contribution of this paper lies in the design of an LLM-based BT generation framework, which encompasses the entire process, from data synthesis and model training to application development and data verification.

Unsupervised Object-Centric Learning from Multiple Unspecified Viewpoints

no code implementations 3 Jan 2024 Jinyang Yuan, Tonglin Chen, Zhimeng Shen, Bin Li, Xiangyang Xue

This ability is essential for humans to identify the same object while moving and to learn from vision efficiently.

Object

InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks

2 code implementations 21 Dec 2023 Zhe Chen, Jiannan Wu, Wenhai Wang, Weijie Su, Guo Chen, Sen Xing, Muyan Zhong, Qinglong Zhang, Xizhou Zhu, Lewei Lu, Bin Li, Ping Luo, Tong Lu, Yu Qiao, Jifeng Dai

However, the progress in vision and vision-language foundation models, which are also critical elements of multi-modal AGI, has not kept pace with LLMs.

Ranked #1 on Zero-Shot Video Retrieval on MSR-VTT-full (using extra training data)

Image Retrieval Image-to-Text Retrieval +10

Adapter is All You Need for Tuning Visual Tasks

1 code implementation 25 Nov 2023 Dongshuo Yin, Leiyi Hu, Bin Li, Youqun Zhang

To fully demonstrate the practicality and generality of Mona, we conduct experiments on multiple representative visual tasks, including instance segmentation on COCO, semantic segmentation on ADE20K, object detection on Pascal VOC, and image classification on several common datasets.

Image Classification Instance Segmentation +4

Federated Transformed Learning for a Circular, Secure, and Tiny AI

no code implementations 24 Nov 2023 Weisi Guo, Schyler Sun, Bin Li, Sam Blakeman

Deep Learning (DL) is penetrating into a diverse range of mass mobility, smart living, and industrial applications, rapidly transforming the way we live and work.

Efficient Trigger Word Insertion

no code implementations 23 Nov 2023 Yueqi Zeng, Ziqiang Li, Pengfei Xia, Lei Liu, Bin Li

With the boom in natural language processing (NLP) in recent years, backdoor attacks pose immense threats to deep neural network models.

text-classification Text Classification

FedRA: A Random Allocation Strategy for Federated Tuning to Unleash the Power of Heterogeneous Clients

1 code implementation 19 Nov 2023 Shangchao Su, Bin Li, Xiangyang Xue

The implementation of FedRA is straightforward and can be seamlessly integrated into any transformer-based model without the need for further modification to the original model.

Federated Learning

One-Shot Federated Learning with Classifier-Guided Diffusion Models

no code implementations 15 Nov 2023 Mingzhao Yang, Shangchao Su, Bin Li, Xiangyang Xue

Leveraging the extensive knowledge stored in the pre-trained diffusion model, the synthetic datasets can assist us in surpassing the knowledge limitations of the client samples, resulting in aggregation models that even outperform the performance ceiling of centralized training in some cases, which is convincingly demonstrated in the sufficient quantification and visualization experiments conducted on three large-scale multi-domain image datasets.

Federated Learning

Adaptive Digital Twin for UAV-Assisted Integrated Sensing, Communication, and Computation Networks

no code implementations 26 Oct 2023 Bin Li, Wenshuai Liu, Wancheng Xie, Ning Zhang, Yan Zhang

In this paper, we study a digital twin (DT)-empowered integrated sensing, communication, and computation network.

Edge-computing

Evading Detection Actively: Toward Anti-Forensics against Forgery Localization

no code implementations 16 Oct 2023 Long Zhuo, Shenghai Luo, Shunquan Tan, Han Chen, Bin Li, Jiwu Huang

In adversarial training, SEAR employs a forgery localization model as a supervisor to explore tampering features and constructs a deep-learning concealer to erase corresponding traces.

Adversarial Attack Self-Supervised Learning

Explore the Effect of Data Selection on Poison Efficiency in Backdoor Attacks

no code implementations 15 Oct 2023 Ziqiang Li, Pengfei Xia, Hong Sun, Yueqi Zeng, Wei Zhang, Bin Li

In this study, we focus on improving the poisoning efficiency of backdoor attacks from the sample selection perspective.

Audio Classification Image Classification +2

Orientation-Independent Chinese Text Recognition in Scene Images

1 code implementation 3 Sep 2023 Haiyang Yu, Xiaocong Wang, Bin Li, Xiangyang Xue

We conduct experiments on a scene dataset for benchmarking Chinese text recognition, and the results demonstrate that the proposed method can indeed improve performance through disentangling content and orientation information.

Benchmarking Image Reconstruction +1

Chinese Text Recognition with A Pre-Trained CLIP-Like Model Through Image-IDS Aligning

1 code implementation ICCV 2023 Haiyang Yu, Xiaocong Wang, Bin Li, Xiangyang Xue

However, despite Chinese characters possessing different characteristics from Latin characters, such as complex inner structures and large categories, few methods have been proposed for Chinese Text Recognition (CTR).

Scene Text Recognition

Robust Computation Offloading and Trajectory Optimization for Multi-UAV-Assisted MEC: A Multi-Agent DRL Approach

no code implementations 24 Aug 2023 Bin Li, Rongrong Yang, Lei Liu, Junyi Wang, Ning Zhang, Mianxiong Dong

For multiple Unmanned-Aerial-Vehicles (UAVs) assisted Mobile Edge Computing (MEC) networks, we study the problem of combined computation and communication for user equipments deployed with multi-type tasks.

Edge-computing Robust Design

Rethinking Person Re-identification from a Projection-on-Prototypes Perspective

no code implementations 21 Aug 2023 Qizao Wang, Xuelin Qian, Bin Li, Yanwei Fu, Xiangyang Xue

In this paper, we rethink the role of the classifier in person Re-ID, and advocate a new perspective to conceive the classifier as a projection from image features to class prototypes.

Person Re-Identification Person Retrieval +1

ForensicsForest Family: A Series of Multi-scale Hierarchical Cascade Forests for Detecting GAN-generated Faces

no code implementations 2 Aug 2023 Jiucui Lu, Yuezun Li, Jiaran Zhou, Bin Li, Junyu Dong, Siwei Lyu

The proposed ForensicsForest family is composed of three variants: ForensicsForest, Hybrid ForensicsForest, and Divide-and-Conquer ForensicsForest.

Abstracting Concept-Changing Rules for Solving Raven's Progressive Matrix Problems

1 code implementation 15 Jul 2023 Fan Shi, Bin Li, Xiangyang Xue

Finally, we conduct experiments to illustrate the interpretability of CRAB in concept learning, answer selection, and global rule abstraction.

Answer Generation Answer Selection +1

GujiBERT and GujiGPT: Construction of Intelligent Information Processing Foundation Language Models for Ancient Texts

no code implementations 11 Jul 2023 Dongbo Wang, Chang Liu, Zhixiao Zhao, Si Shen, Liu Liu, Bin Li, Haotian Hu, Mengcheng Wu, Litao Lin, Xue Zhao, Xiyu Wang

In the context of the rapid development of large language models, we have meticulously trained and introduced the GujiBERT and GujiGPT language models, which are foundational models specifically designed for intelligent information processing of ancient texts.

Model Selection Part-Of-Speech Tagging +2

Prototypes as Explanation for Time Series Anomaly Detection

no code implementations 4 Jul 2023 Bin Li, Carsten Jentsch, Emmanuel Müller

Detecting abnormal patterns that deviate from a certain regular repeating pattern in time series is essential in many big data applications.

Anomaly Detection Time Series +1

Parameter-efficient is not sufficient: Exploring Parameter, Memory, and Time Efficient Adapter Tuning for Dense Predictions

no code implementations 16 Jun 2023 Dongshuo Yin, Xueting Han, Bin Li, Hao Feng, Jing Bai

We provide a gradient backpropagation highway for low-rank adapters which eliminates the need for expensive backpropagation through the frozen pre-trained model, resulting in substantial savings of training memory and training time.

Transfer Learning

OCTScenes: A Versatile Real-World Dataset of Tabletop Scenes for Object-Centric Learning

no code implementations 16 Jun 2023 Yinxuan Huang, Tonglin Chen, Zhimeng Shen, Jinghao Huang, Bin Li, Xiangyang Xue

The results demonstrate the shortcomings of state-of-the-art methods for learning meaningful representations from real-world data, despite their impressive performance on complex synthetic datasets.

Object Representation Learning

Efficient Backdoor Attacks for Deep Neural Networks in Real-world Scenarios

no code implementations 14 Jun 2023 Hong Sun, Ziqiang Li, Pengfei Xia, Heng Li, Beihao Xia, Yi Wu, Bin Li

However, existing backdoor attack methods make unrealistic assumptions, assuming that all training data comes from a single source and that attackers have full access to the training data.

Backdoor Attack

Enhanced Fine-grained Motion Diffusion for Text-driven Human Motion Synthesis

no code implementations 23 May 2023 Dong Wei, Xiaoning Sun, Huaijiang Sun, Bin Li, Shengxiang Hu, Weiqing Li, Jianfeng Lu

The emergence of text-driven motion synthesis techniques provides animators with great potential to create efficiently.

Motion Synthesis valid

Collaborative Chinese Text Recognition with Personalized Federated Learning

no code implementations 9 May 2023 Shangchao Su, Haiyang Yu, Bin Li, Xiangyang Xue

In Chinese text recognition, to compensate for the insufficient local data and improve the performance of local few-shot character recognition, it is often necessary for one organization to collect a large amount of data from similar organizations.

Personalized Federated Learning Privacy Preserving

Large Language Models Need Holistically Thought in Medical Conversational QA

1 code implementation 9 May 2023 Yixuan Weng, Bin Li, Fei Xia, Minjun Zhu, Bin Sun, Shizhu He, Kang Liu, Jun Zhao

The medical conversational question answering (CQA) system aims at providing a series of professional medical services to improve the efficiency of medical care.

Conversational Question Answering

Meta-Auxiliary Learning for Adaptive Human Pose Prediction

no code implementations 13 Apr 2023 Qiongjie Cui, Huaijiang Sun, Jianfeng Lu, Bin Li, Weiqing Li

Predicting high-fidelity future human poses, from a historically observed sequence, is decisive for intelligent robots to interact with humans.

Auxiliary Learning Pose Prediction +1

RAPID: Enabling Fast Online Policy Learning in Dynamic Public Cloud Environments

no code implementations 10 Apr 2023 Drew Penney, Bin Li, Lizhong Chen, Jaroslaw J. Sydir, Anna Drewek-Ossowicka, Ramesh Illikkal, Charlie Tai, Ravi Iyer, Andrew Herdrich

Resource sharing between multiple workloads has become a prominent practice among cloud service providers, motivated by demand for improved resource utilization and reduced cost of ownership.

FengWu: Pushing the Skillful Global Medium-range Weather Forecast beyond 10 Days Lead

1 code implementation 6 Apr 2023 Kang Chen, Tao Han, Junchao Gong, Lei Bai, Fenghua Ling, Jing-Jia Luo, Xi Chen, Leiming Ma, Tianning Zhang, Rui Su, Yuanzheng Ci, Bin Li, Xiaokang Yang, Wanli Ouyang

We present FengWu, an advanced data-driven global medium-range weather forecast system based on Artificial Intelligence (AI).

Mastering Symbolic Operations: Augmenting Language Models with Compiled Neural Networks

3 code implementations 4 Apr 2023 Yixuan Weng, Minjun Zhu, Fei Xia, Bin Li, Shizhu He, Kang Liu, Jun Zhao

Our work highlights the potential of seamlessly unifying explicit rule learning via CoNNs and implicit pattern learning in LMs, paving the way for true symbolic comprehension capabilities.

Arithmetic Reasoning Language Modelling

Mechanism Design for Ad Auctions with Display Prices

no code implementations 23 Mar 2023 Bin Li, Yahui Lei

In this paper, we study ad auctions with display prices from the perspective of mechanism design, in which advertisers are asked to submit both the costs and prices of their products.

Uncertainty-aware U-Net for Medical Landmark Detection

no code implementations 18 Mar 2023 Ziyang Ye, Haiyang Yu, Bin Li

To estimate the uncertainty, we propose a module named Pyramid Covariance Predictor to predict the covariance matrices of the target Gaussian distributions, which determine the distributions of landmarks and represent the uncertainty of landmark annotation.

Provably Convergent Subgraph-wise Sampling for Fast GNN Training

no code implementations 17 Mar 2023 Jie Wang, Zhihao Shi, Xize Liang, Shuiwang Ji, Bin Li, Feng Wu

During the message passing (MP) in GNNs, subgraph-wise sampling methods discard messages outside the mini-batches in backward passes to avoid the well-known neighbor explosion problem, i.e., the exponentially increasing dependencies of nodes with the number of MP iterations.

Neural Video Compression with Diverse Contexts

2 code implementations CVPR 2023 Jiahao Li, Bin Li, Yan Lu

Better yet, our codec has surpassed ECM, the next-generation traditional codec still under development, in both RGB and YUV420 colorspaces, in terms of PSNR.

Optical Flow Estimation Video Compression

Generalization in Visual Reinforcement Learning with the Reward Sequence Distribution

1 code implementation 19 Feb 2023 Jie Wang, Rui Yang, Zijie Geng, Zhihao Shi, Mingxuan Ye, Qi Zhou, Shuiwang Ji, Bin Li, Yongdong Zhang, Feng Wu

The appealing features of RSD-OA include that: (1) RSD-OA is invariant to visual distractions, as it is conditioned on the predefined subsequent action sequence without task-irrelevant information from transition dynamics, and (2) the reward sequence captures long-term task-relevant information in both rewards and transition dynamics.

reinforcement-learning Reinforcement Learning (RL) +1

Energy Efficient Computation Offloading in Aerial Edge Networks With Multi-Agent Cooperation

no code implementations 14 Feb 2023 Wenshuai Liu, Bin Li, Wancheng Xie, Yueyue Dai, Zesong Fei

With its high flexibility in supporting resource-intensive and time-sensitive applications, unmanned aerial vehicle (UAV)-assisted mobile edge computing (MEC) has been proposed as an innovative paradigm to support mobile users (MUs).

Edge-computing Scheduling

EVC: Towards Real-Time Neural Image Compression with Mask Decay

1 code implementation 10 Feb 2023 Guo-Hua Wang, Jiahao Li, Bin Li, Yan Lu

Both mask decay and residual representation learning greatly improve the RD performance of our scalable encoder.

Image Compression Representation Learning

Kainate receptor modulation by NETO2

no code implementations 2 Feb 2023 Lingli He, Jiahui Sun, Yiwei Gao, Bin Li, Yuhang Wang, Yanli Dong, Weidong An, Hang Li, Bei Yang, Yuhan Ge, Xuejun Cai Zhang, Yun Stone Shi, Yan Zhao

Glutamate-gated kainate receptors (KARs) are ubiquitous in the central nervous system of vertebrates, mediate synaptic transmission on post-synapse, and modulate transmitter release on pre-synapse.

Learning Trustworthy Model from Noisy Labels based on Rough Set for Surface Defect Detection

no code implementations 25 Jan 2023 Tongzhi Niu, Bin Li, Kai Li, Yufeng Lin, Yuwei Li, Weifeng Li, Zhenrong Wang

In surface defect detection, there are some suspicious regions that cannot be uniquely classified as abnormal or normal.

Defect Detection

Time-Conditioned Generative Modeling of Object-Centric Representations for Video Decomposition and Prediction

1 code implementation 21 Jan 2023 Chengmin Gao, Bin Li

To reconstruct the complete shape of an object accurately, we enhance the disentanglement between the latent representations of objects and views, where the latent representations of time-conditioned views are jointly inferred with a Transformer and then are input to a sequential extension of Slot Attention to learn object-centric representations.

Disentanglement Gaussian Processes +2

Motion Information Propagation for Neural Video Compression

no code implementations CVPR 2023 Linfeng Qi, Jiahao Li, Bin Li, Houqiang Li, Yan Lu

Meanwhile, besides assisting frame coding at the current time step, the feature from context generation will be propagated as motion condition when coding the subsequent motion latent.

Video Compression

Learning Accurate 3D Shape Based on Stereo Polarimetric Imaging

no code implementations CVPR 2023 Tianyu Huang, Haoang Li, Kejing He, Congying Sui, Bin Li, Yun-hui Liu

As to the orthographic projection problem, we propose a novel Viewing Direction-aided Positional Encoding (VDPE) strategy.

Test-time Personalizable Forecasting of 3D Human Poses

no code implementations ICCV 2023 Qiongjie Cui, Huaijiang Sun, Jianfeng Lu, Weiqing Li, Bin Li, Hongwei Yi, Haofan Wang

Current motion forecasting approaches typically train a deep end-to-end model from the source domain data, and then apply it directly to target subjects.

Motion Forecasting

Large Language Models are Better Reasoners with Self-Verification

1 code implementation 19 Dec 2022 Yixuan Weng, Minjun Zhu, Fei Xia, Bin Li, Shizhu He, Shengping Liu, Bin Sun, Kang Liu, Jun Zhao

By performing a backward verification of the answers that LLM deduced for itself, we can obtain interpretable answer validation scores to select the candidate answer with the highest score.

Arithmetic Reasoning Common Sense Reasoning +3

Adversarial Example Defense via Perturbation Grading Strategy

no code implementations 16 Dec 2022 Shaowei Zhu, Wanli Lyu, Bin Li, Zhaoxia Yin, Bin Luo

In addition, the proposed method does not modify any task model and can be used as a preprocessing module, which significantly reduces the deployment cost in practical applications.

Artificial Text Detection with Multiple Training Strategies

no code implementations 10 Dec 2022 Bin Li, Yixuan Weng, Qiya Song, Hanjun Deng

With the rapid progress of deep learning, artificial texts created by generative models are commonly used in news and social media.

Language Modelling Text Detection

Chinese Character Recognition with Radical-Structured Stroke Trees

no code implementations 24 Nov 2022 Haiyang Yu, Jingye Chen, Bin Li, Xiangyang Xue

In this paper, we represent each Chinese character as a stroke tree, which is organized according to its radical structures, to fully exploit the merits of both radical and stroke levels in a decent way.

Compositional Scene Modeling with Global Object-Centric Representations

no code implementations 21 Nov 2022 Tonglin Chen, Bin Li, Zhimeng Shen, Xiangyang Xue

Inspired by such an ability of humans, this paper proposes a compositional scene modeling method to infer global representations of canonical images of objects without any supervision.

Object Patch Matching +1

Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information

1 code implementation CVPR 2023 Weijie Su, Xizhou Zhu, Chenxin Tao, Lewei Lu, Bin Li, Gao Huang, Yu Qiao, Xiaogang Wang, Jie Zhou, Jifeng Dai

It has been proved that combining multiple pre-training strategies and data from various modalities/sources can greatly boost the training of large-scale models.

Ranked #2 on Semantic Segmentation on ADE20K (using extra training data)

Image Classification Long-tailed Object Detection +3

Federated Adaptive Prompt Tuning for Multi-Domain Collaborative Learning

1 code implementation 15 Nov 2022 Shangchao Su, Mingzhao Yang, Bin Li, Xiangyang Xue

In this paper, we propose a federated adaptive prompt tuning algorithm, FedAPT, for multi-domain collaborative image classification with powerful foundation models, like CLIP.

Federated Learning Image Classification

Visual Answer Localization with Cross-modal Mutual Knowledge Transfer

1 code implementation 26 Oct 2022 Yixuan Weng, Bin Li

In this paper, we propose a cross-modal mutual knowledge transfer span localization (MutualSL) method to reduce the knowledge deviation.

Transfer Learning

Slippage-robust Gaze Tracking for Near-eye Display

no code implementations 20 Oct 2022 Wei Zhang, Jiaxi Cao, Xiang Wang, Enqi Tian, Bin Li

In recent years, head-mounted near-eye display devices have become the key hardware foundation for virtual reality and augmented reality.

Human Joint Kinematics Diffusion-Refinement for Stochastic Motion Prediction

no code implementations 12 Oct 2022 Dong Wei, Huaijiang Sun, Bin Li, Jianfeng Lu, Weiqing Li, Xiaoning Sun, Shengxiang Hu

This process offers a natural way to obtain the "whitened" latents without any trainable parameters, and human motion prediction can be regarded as the reverse diffusion process that converts the noise distribution into realistic future motions conditioned on the observed sequence.

motion prediction Stochastic Human Motion Prediction

Learning to Locate Visual Answer in Video Corpus Using Question

1 code implementation 11 Oct 2022 Bin Li, Yixuan Weng, Bin Sun, Shutao Li

We introduce a new task, named video corpus visual answer localization (VCVAL), which aims to locate the visual answer in a large collection of untrimmed instructional videos using a natural language question.

Contrastive Learning Language Modelling +2

Domain Discrepancy Aware Distillation for Model Aggregation in Federated Learning

no code implementations 4 Oct 2022 Shangchao Su, Bin Li, Xiangyang Xue

In this paper, we first analyze the generalization bound of the aggregation model produced from knowledge distillation for the client domains, and then describe two challenges, server-to-client discrepancy and client-to-client discrepancy, brought to the aggregation model by the domain discrepancies.

Federated Learning Knowledge Distillation

Domain-Unified Prompt Representations for Source-Free Domain Generalization

1 code implementation 29 Sep 2022 Hongjing Niu, Hanting Li, Feng Zhao, Bin Li

The proposed scheme generates diverse prompts from a domain bank that contains many more diverse domains than existing DG datasets.

Source-free Domain Generalization

TODE-Trans: Transparent Object Depth Estimation with Transformer

1 code implementation 18 Sep 2022 Kang Chen, Shaochen Wang, Beihao Xia, Dongxu Li, Zhen Kan, Bin Li

We observe that the global characteristics of the transformer make it easier to extract contextual information to perform depth estimation of transparent areas.

Depth Estimation Object +2

Compositional Law Parsing with Latent Random Functions

1 code implementation 15 Sep 2022 Fan Shi, Bin Li, Xiangyang Xue

The automatic parsing of these laws indicates the model's ability to understand the scene, which makes law parsing play a central role in many visual tasks.

Position Visual Reasoning

Rain Removal from Light Field Images with 4D Convolution and Multi-scale Gaussian Process

1 code implementation 16 Aug 2022 Tao Yan, Mingyue Li, Bin Li, Yang Yang, Rynson W. H. Lau

However, making full use of the abundant information available from LFIs, such as the 2D array of sub-views and the disparity map of each sub-view, for effective rain removal is still a challenging problem.

Depth Estimation Rain Removal

Style Spectroscope: Improve Interpretability and Controllability through Fourier Analysis

no code implementations 12 Aug 2022 Zhiyu Jin, Xuli Shen, Bin Li, Xiangyang Xue

We connect Fourier amplitude and phase with Gram matrices and a content reconstruction loss in style transfer, respectively.

Style Transfer

Clear Memory-Augmented Auto-Encoder for Surface Defect Detection

no code implementations 8 Aug 2022 Wei Luo, Tongzhi Niu, Lixin Tang, Wenyong Yu, Bin Li

First, we propose a novel clear memory-augmented module (CMAM), which combines encoding and memory-encoding through forgetting and inputting, thereby repairing abnormal foregrounds and preserving clear backgrounds.

Anomaly Detection Defect Detection

Overlooked Poses Actually Make Sense: Distilling Privileged Knowledge for Human Motion Prediction

no code implementations 2 Aug 2022 Xiaoning Sun, Qiongjie Cui, Huaijiang Sun, Bin Li, Weiqing Li, Jianfeng Lu

Previous works on human motion prediction follow the pattern of building a mapping relation between the sequence observed and the one to be predicted.

Human motion prediction motion prediction +3

FakeCLR: Exploring Contrastive Learning for Solving Latent Discontinuity in Data-Efficient GANs

1 code implementation 18 Jul 2022 Ziqiang Li, Chaoyue Wang, Heliang Zheng, Jing Zhang, Bin Li

Since data augmentation strategies have largely alleviated the training instability, how to further improve the generative performance of DE-GANs becomes a hotspot.

Contrastive Learning Data Augmentation

Hybrid Spatial-Temporal Entropy Modelling for Neural Video Compression

1 code implementation 13 Jul 2022 Jiahao Li, Bin Li, Yan Lu

Besides estimating the probability distribution, our entropy model also generates the quantization step in a spatial-channel-wise manner.

Quantization Video Compression

Scene-Aware Prompt for Multi-modal Dialogue Understanding and Generation

no code implementations 5 Jul 2022 Bin Li, Yixuan Weng, Ziyu Ma, Bin Sun, Shutao Li

To fully leverage the visual information for both scene understanding and dialogue generation, we propose the scene-aware prompt for the MDUG task.

Dialogue Generation Dialogue Understanding +2

Cross-domain Federated Object Detection

no code implementations 30 Jun 2022 Shangchao Su, Bin Li, Chengzhi Zhang, Mingzhao Yang, Xiangyang Xue

Federated learning can enable multi-party collaborative learning without leaking client data.

Autonomous Driving Federated Learning +3

Adversarial Reconfigurable Intelligent Surface Against Physical Layer Key Generation

no code implementations 22 Jun 2022 Zhuangkun Wei, Bin Li, Weisi Guo

The development of reconfigurable intelligent surfaces (RIS) has recently advanced the research of physical layer security (PLS).

STD-NET: Search of Image Steganalytic Deep-learning Architecture via Hierarchical Tensor Decomposition

1 code implementation 12 Jun 2022 Shunquan Tan, Qiushi Li, Laiyuan Li, Bin Li, Jiwu Huang

We propose a normalized distortion threshold that evaluates the sensitivity of each convolutional layer of the base model, guiding STD-NET to compress the target network in an efficient and unsupervised manner and to obtain two network structures of different shapes with low computation cost and performance similar to that of the original one.

Model Compression Steganalysis +1

Siamese Image Modeling for Self-Supervised Vision Representation Learning

2 code implementations CVPR 2023 Chenxin Tao, Xizhou Zhu, Weijie Su, Gao Huang, Bin Li, Jie Zhou, Yu Qiao, Xiaogang Wang, Jifeng Dai

Driven by these analyses, we propose Siamese Image Modeling (SiameseIM), which predicts the dense representations of an augmented view, based on another masked view from the same image but with different augmentations.

Representation Learning Self-Supervised Learning +1

Dog nose print matching with dual global descriptor based on Contrastive Learning

1 code implementation 1 Jun 2022 Bin Li, Zhongan Wang, Nan Wu, Shuai Shi, Qijun Ma

These methods generally extract the global features as a descriptor to represent the original image.

Contrastive Learning

Learning Task-relevant Representations for Generalization via Characteristic Functions of Reward Sequence Distributions

1 code implementation 20 May 2022 Rui Yang, Jie Wang, Zijie Geng, Mingxuan Ye, Shuiwang Ji, Bin Li, Feng Wu

Generalization across different environments with the same tasks is critical for successful applications of visual reinforcement learning (RL) in real scenarios.

Reinforcement Learning (RL)

One-shot Federated Learning without Server-side Training

1 code implementation 26 Apr 2022 Shangchao Su, Bin Li, Xiangyang Xue

Federated Learning (FL) has recently made significant progress as a new machine learning paradigm for privacy protection.

Federated Learning Image Classification +1

Hierarchical Locality Sensitive Hashing for Structured Data: A Survey

no code implementations 24 Apr 2022 Wei Wu, Bin Li

Data similarity (or distance) computation is a fundamental research topic which fosters a variety of similarity-based machine learning and data mining applications.

Data-Efficient Backdoor Attacks

1 code implementation 22 Apr 2022 Pengfei Xia, Ziqiang Li, Wei Zhang, Bin Li

Recent studies have proven that deep neural networks are vulnerable to backdoor attacks.

LingYi: Medical Conversational Question Answering System based on Multi-modal Knowledge Graphs

1 code implementation 20 Apr 2022 Fei Xia, Bin Li, Yixuan Weng, Shizhu He, Kang Liu, Bin Sun, Shutao Li, Jun Zhao

The medical conversational system can relieve the burden of doctors and improve the efficiency of healthcare, especially during the pandemic.

Conversational Question Answering Dialogue Generation +3

A Comprehensive Survey on Data-Efficient GANs in Image Generation

no code implementations 18 Apr 2022 Ziqiang Li, Beihao Xia, Jing Zhang, Chaoyue Wang, Bin Li

Generative Adversarial Networks (GANs) have made remarkable progress in image synthesis.

Image Generation

Towards Better Chinese-centric Neural Machine Translation for Low-resource Languages

1 code implementation 9 Apr 2022 Bin Li, Yixuan Weng, Fei Xia, Hanjun Deng

The last decade has witnessed enormous improvements in science and technology, stimulating the growing demand for economic and cultural exchanges in various countries.

Machine Translation NMT +3

Prompt-based System for Personality and Interpersonal Reactivity Prediction

no code implementations WASSA (ACL) 2022 Bin Li, Yixuan Weng

This paper describes our proposed method for the Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis (WASSA) 2022 shared task on Personality Prediction (PER) and Reactivity Index Prediction (IRI).

Data Augmentation Language Modelling

Neural Compression-Based Feature Learning for Video Restoration

no code implementations CVPR 2022 Cong Huang, Jiahao Li, Bin Li, Dong Liu, Yan Lu

The temporal features usually contain various noisy and uncorrelated information, and they may interfere with the restoration of the current frame.

Denoising Quantization +3

Multi-Unit Diffusion Auctions with Intermediaries

no code implementations 15 Mar 2022 Bin Li, Dong Hao, Dengji Zhao

This paper studies multi-unit auctions powered by intermediaries, where each intermediary owns a private set of unit-demand buyers and all intermediaries are networked with each other.

Towards Visual-Prompt Temporal Answering Grounding in Medical Instructional Video

no code implementations 13 Mar 2022 Bin Li, Yixuan Weng, Bin Sun, Shutao Li

However, due to the weak correlations and huge gaps of the semantic features between the textual question and visual answer, existing methods adopting visual span predictor perform poorly in the TAGV task.

Language Modelling Question Answering +2

Remote blood pressure measurement via spatiotemporal mapping of a short-time facial video

no code implementations 7 Mar 2022 Jialiang Zhuang, Bin Li, Yun Zhang, YuHeng Chen, Xiujuan Zheng

Blood pressure (BP) monitoring is vital in daily healthcare, especially for cardiovascular diseases.

Compositional Scene Representation Learning via Reconstruction: A Survey

no code implementations 15 Feb 2022 Jinyang Yuan, Tonglin Chen, Bin Li, Xiangyang Xue

In this survey, we first outline the current progress on reconstruction-based compositional scene representation learning with deep neural networks, including development history and categorizations of existing methods from the perspectives of the modeling of visual scenes and the inference of scene representations; then provide benchmarks, including an open source toolbox to reproduce the benchmark experiments, of representative methods that consider the most extensively studied problem setting and form the foundation for other methods; and finally discuss the limitations of existing methods and future directions of this research topic.

Representation Learning

A Higher-Order Semantic Dependency Parser

1 code implementation 27 Jan 2022 Bin Li, Yunlong Fan, Yikemaiti Sataer, Zhiqiang Gao

Higher-order features bring significant accuracy gains in semantic dependency parsing.

Dependency Parsing Graph Learning +1

PROMPT: Learning Dynamic Resource Allocation Policies for Network Applications

no code implementations 19 Jan 2022 Drew Penney, Bin Li, Jaroslaw Sydir, Lizhong Chen, Charlie Tai, Stefan Lee, Eoin Walsh, Thomas Long

A growing number of service providers are exploring methods to improve server utilization and reduce power consumption by co-scheduling high-priority latency-critical workloads with best-effort workloads.

Scheduling

Benchmarking Chinese Text Recognition: Datasets, Baselines, and an Empirical Study

1 code implementation 30 Dec 2021 Haiyang Yu, Jingye Chen, Bin Li, Jianqi Ma, Mengnan Guan, Xixi Xu, Xiaocong Wang, Shaobo Qu, Xiangyang Xue

The experimental results indicate that baselines perform worse on CTR datasets than on English datasets because the characteristics of Chinese texts differ considerably from those of the Latin alphabet.

Attribute Benchmarking +1

Learning Robust Policy against Disturbance in Transition Dynamics via State-Conservative Policy Optimization

no code implementations 20 Dec 2021 Yufei Kuang, Miao Lu, Jie Wang, Qi Zhou, Bin Li, Houqiang Li

Many existing algorithms learn robust policies by modeling the disturbance and applying it to source environments during training, which usually requires prior knowledge about the disturbance and control of simulators.

ADBCMM: Acronym Disambiguation by Building Counterfactuals and Multilingual Mixing

1 code implementation 8 Dec 2021 Yixuan Weng, Fei Xia, Bin Li, Xiusheng Huang, Shizhu He

To address the above issue, this paper proposes a new method for acronym disambiguation, named ADBCMM, which can significantly improve the performance of low-resource languages by building counterfactuals and multilingual mixing.

Task 2

Unsupervised Learning of Compositional Scene Representations from Multiple Unspecified Viewpoints

no code implementations 7 Dec 2021 Jinyang Yuan, Bin Li, Xiangyang Xue

When observing a visual scene that contains multiple objects from multiple viewpoints, humans are able to perceive the scene in a compositional way from each viewpoint, while achieving the so-called "object constancy" across different viewpoints, even though the exact viewpoints are untold.

MT-TransUNet: Mediating Multi-Task Tokens in Transformers for Skin Lesion Segmentation and Classification

1 code implementation 3 Dec 2021 Jingye Chen, Jieneng Chen, Zongwei Zhou, Bin Li, Alan Yuille, Yongyi Lu

However, these approaches formulated skin cancer diagnosis as a simple classification task, dismissing the potential benefit from lesion segmentation.

Classification Computational Efficiency +4

Continuous-time edge modelling using non-parametric point processes

no code implementations NeurIPS 2021 Xuhui Fan, Bin Li, Feng Zhou, Scott Sisson

The mutually-exciting Hawkes process (ME-HP) is a natural choice to model reciprocity, which is an important attribute of continuous-time edge (dyadic) data.

Attribute Gaussian Processes +2

SimCLAD: A Simple Framework for Contrastive Learning of Acronym Disambiguation

no code implementations 29 Nov 2021 Bin Li, Fei Xia, Yixuan Weng, Xiusheng Huang, Bin Sun

In this paper, we propose a Simple framework for Contrastive Learning of Acronym Disambiguation (SimCLAD) method to better understand the acronym meanings.

Contrastive Learning document understanding +1

PSG: Prompt-based Sequence Generation for Acronym Extraction

no code implementations 29 Nov 2021 Bin Li, Fei Xia, Yixuan Weng, Xiusheng Huang, Bin Sun, Shutao Li

In this paper, we propose a Prompt-based Sequence Generation (PSG) method for the acronym extraction task.

document understanding Language Modelling +1

Temporal Context Mining for Learned Video Compression

1 code implementation 27 Nov 2021 Xihua Sheng, Jiahao Li, Bin Li, Li Li, Dong Liu, Yan Lu

From the stored propagated features, we propose to learn multi-scale temporal contexts, and re-fill the learned temporal contexts into the modules of our compression scheme, including the contextual encoder-decoder, the frame generator, and the temporal context encoder.

MS-SSIM SSIM +1

Enhancing Backdoor Attacks with Multi-Level MMD Regularization

1 code implementation 9 Nov 2021 Pengfei Xia, Hongjing Niu, Ziqiang Li, Bin Li

Then, ML-MMDR, a difference reduction method that adds multi-level MMD regularization into the loss, is proposed, and its effectiveness is verified on three typical difference-based defense methods.

Backdoor Attack

Tightening the Approximation Error of Adversarial Risk with Auto Loss Function Search

no code implementations 9 Nov 2021 Pengfei Xia, Ziqiang Li, Bin Li

The most common solution for this is to compute an approximate risk by replacing the 0-1 loss with a surrogate one.

Adversarial Robustness AutoML

RelationRS: Relationship Representation Network for Object Detection in Aerial Images

no code implementations 13 Oct 2021 Zhiming Liu, Xuefei Zhang, Chongyang Liu, Hao Wang, Chao Sun, Bin Li, Weifeng Sun, Pu Huang, Qingjun Li, Yu Liu, Haipeng Kuang, Jihong Xiu

To address these issues, we propose a relationship representation network for object detection in aerial images (RelationRS): 1) Firstly, multi-scale features are fused and enhanced by a dual relationship module (DRM) with conditional convolution.

Object object-detection +1

Equivalence Analysis between Counterfactual Regret Minimization and Online Mirror Descent

no code implementations 11 Oct 2021 Weiming Liu, Huacong Jiang, Bin Li, Houqiang Li

Follow-the-Regularized-Leader (FTRL) and Online Mirror Descent (OMD) are regret minimization algorithms for Online Convex Optimization (OCO); they are mathematically elegant but less practical in solving Extensive-Form Games (EFGs).

counterfactual

Stereo Dense Scene Reconstruction and Accurate Localization for Learning-Based Navigation of Laparoscope in Minimally Invasive Surgery

no code implementations 8 Oct 2021 Ruofeng Wei, Bin Li, Hangjie Mo, Bo Lu, Yonghao Long, Bohan Yang, Qi Dou, Yunhui Liu, Dong Sun

Then, we develop a dense visual reconstruction algorithm to represent the scene by surfels, estimate the laparoscope poses and fuse the depth maps into a unified reference coordinate for tissue reconstruction.

Anatomy Depth Estimation

Deep Contextual Video Compression

1 code implementation NeurIPS 2021 Jiahao Li, Bin Li, Yan Lu

In this paper, we propose a deep contextual video compression framework to enable a paradigm shift from predictive coding to conditional coding.

Video Compression

Actor-Critic Policy Optimization in a Large-Scale Imperfect-Information Game

no code implementations ICLR 2022 Haobo Fu, Weiming Liu, Shuang Wu, Yijia Wang, Tao Yang, Kai Li, Junliang Xing, Bin Li, Bo Ma, Qiang Fu, Yang Wei

The deep policy gradient method has demonstrated promising results in many large-scale games, where the agent learns purely from its own experience.

counterfactual Policy Gradient Methods

SurRoL: An Open-source Reinforcement Learning Centered and dVRK Compatible Platform for Surgical Robot Learning

1 code implementation 30 Aug 2021 Jiaqi Xu, Bin Li, Bo Lu, Yun-hui Liu, Qi Dou, Pheng-Ann Heng

Ten learning-based surgical tasks that are common in real autonomous surgical execution are built into the platform.

Reinforcement Learning (RL)

DuCN: Dual-children Network for Medical Diagnosis and Similar Case Recommendation towards COVID-19

no code implementations 3 Aug 2021 Chengtao Peng, Yunfei Long, Senhua Zhu, Dandan Tu, Bin Li

Our proposed network contains two stages: the first one is a lung region segmentation step and is used to exclude irrelevant factors, and the second is a detection and recommendation stage.

Medical Diagnosis

More but Correct: Generating Diversified and Entity-revised Medical Response

no code implementations 3 Aug 2021 Bin Li, Encheng Chen, Hongru Liu, Yixuan Weng, Bin Sun, Shutao Li, Yongping Bai, Meiling Hu

Medical Dialogue Generation (MDG) is intended to build a medical dialogue system for intelligent consultation, which can communicate with patients in real-time, thereby improving the efficiency of clinical diagnosis with broad application prospects.

Dialogue Generation

Self-Adversarial Training incorporating Forgery Attention for Image Forgery Localization

1 code implementation 6 Jul 2021 Long Zhuo, Shunquan Tan, Bin Li, Jiwu Huang

In this paper, we propose a self-adversarial training strategy and a reliable coarse-to-fine network that utilizes a self-attention mechanism to localize forged regions in forgery images.

Zero-Shot Chinese Character Recognition with Stroke-Level Decomposition

1 code implementation 22 Jun 2021 Jingye Chen, Bin Li, Xiangyang Xue

Inspired by the fact that humans can generalize to know how to write characters unseen before if they have learned stroke orders of some characters, we propose a stroke-based method by decomposing each character into a sequence of strokes, which are the most basic units of Chinese characters.

Bilateral Personalized Dialogue Generation with Contrastive Learning

1 code implementation 15 Jun 2021 Bin Li, Hanjun Deng

Generating personalized responses is one of the major challenges in natural human-robot interaction.

Contrastive Learning Dialogue Generation +2

Improving Cost Learning for JPEG Steganography by Exploiting JPEG Domain Knowledge

no code implementations 9 May 2021 Weixuan Tang, Bin Li, Mauro Barni, Jin Li, Jiwu Huang

To address the issue, in this paper we extend an existing automatic cost learning scheme to JPEG, where the proposed scheme called JEC-RL (JPEG Embedding Cost with Reinforcement Learning) is explicitly designed to tailor the JPEG DCT structure.

reinforcement-learning Reinforcement Learning (RL)

Context-Based Soft Actor Critic for Environments with Non-stationary Dynamics

1 code implementation 7 May 2021 Yuan Pu, Shaochen Wang, Xin Yao, Bin Li

The performance of deep reinforcement learning methods is prone to degenerate when applied to environments with non-stationary dynamics.

Continuous Control

Sparse online relative similarity learning

no code implementations 15 Apr 2021 Dezhong Yao, Peilin Zhao, Chen Yu, Hai Jin, Bin Li

This is clearly inefficient for high dimensional tasks due to its high memory and computational complexity.

Metric Learning

MCTSteg: A Monte Carlo Tree Search-based Reinforcement Learning Framework for Universal Non-additive Steganography

1 code implementation 25 Mar 2021 Xianbo Mo, Shunquan Tan, Bin Li, Jiwu Huang

Recent research has shown that non-additive image steganographic frameworks effectively improve security performance through adjusting distortion distribution.

Self-Learning

Raven's Progressive Matrices Completion with Latent Gaussian Process Priors

2 code implementations 22 Mar 2021 Fan Shi, Bin Li, Xiangyang Xue

In this paper we aim to solve the latter one by proposing a deep latent variable model, in which multiple Gaussian processes are employed as priors of latent variables to separately learn underlying abstract concepts from RPMs; thus the proposed model is interpretable in terms of concept-specific latent variables.

Answer Selection Gaussian Processes +1

Exploring The Effect of High-frequency Components in GANs Training

2 code implementations 20 Mar 2021 Ziqiang Li, Pengfei Xia, Xue Rui, Bin Li

Generative Adversarial Networks (GANs) have the ability to generate images that are visually indistinguishable from real images.

Vocal Bursts Intensity Prediction

Knowledge-Guided Object Discovery with Acquired Deep Impressions

1 code implementation 19 Mar 2021 Jinyang Yuan, Bin Li, Xiangyang Xue

The proposed ADI framework focuses on the acquisition and utilization of knowledge, and is complementary to existing deep generative models proposed for compositional scene representation.

Object Object Discovery +1

PENet: Towards Precise and Efficient Image Guided Depth Completion

3 code implementations 1 Mar 2021 Mu Hu, Shuling Wang, Bin Li, Shiyu Ning, Li Fan, Xiaojin Gong

More specifically, one branch inputs a color image and a sparse depth map to predict a dense depth map.

Depth Completion

Serial-parallel Multi-Scale Feature Fusion for Anatomy-Oriented Hand Joint Detection

no code implementations 19 Feb 2021 Bin Li, Hong Fu, Ruimin Li, Wendi Wang

Accurate hand joint detection from images is a fundamental topic that is essential for many applications in computer vision and human-computer interaction.

Anatomy

An Efficient Pessimistic-Optimistic Algorithm for Stochastic Linear Bandits with General Constraints

no code implementations NeurIPS 2021 Xin Liu, Bin Li, Pengyi Shi, Lei Ying

Thus, the overall computational complexity of our algorithm is similar to that of the linear UCB for unconstrained stochastic linear bandits.

Infant Cry Classification with Graph Convolutional Networks

no code implementations 31 Jan 2021 Chunyan Ji, Ming Chen, Bin Li, Yi Pan

We propose a graph convolutional network approach for robust infant cry classification.

Classification General Classification +1

Image Steganography based on Iteratively Adversarial Samples of A Synchronized-directions Sub-image

no code implementations 13 Jan 2021 Xinghong Qin, Shunquan Tan, Bin Li, Weixuan Tang, Jiwu Huang

In this paper, we present a novel steganography scheme denoted as ITE-SYN (based on ITEratively adversarial perturbations onto a SYNchronized-directions sub-image), by which security data is embedded with synchronizing modification directions to enhance security and then iteratively increased perturbations are added onto a sub-image to reduce loss with cover class label of the target CNN classifier.

Image Steganography Steganalysis