Search Results for author: Bin Sun

Found 48 papers, 16 papers with code

Continuing Pre-trained Model with Multiple Training Strategies for Emotional Classification

no code implementations WASSA (ACL) 2022 Bin Li, Yixuan Weng, Qiya Song, Bin Sun, Shutao Li

This paper describes the contribution of the LingJing team’s method to the Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis (WASSA) 2022 shared task on Emotion Classification.

Attribute Classification +4

VPAI_Lab at MedVidQA 2022: A Two-Stage Cross-modal Fusion Method for Medical Instructional Video Classification

1 code implementation BioNLP (ACL) 2022 Bin Li, Yixuan Weng, Fei Xia, Bin Sun, Shutao Li

Given an input video, the MedVidCL task aims to correctly classify it into one of the following three categories: Medical Instructional, Medical Non-instructional, and Non-medical.

Video Classification

GeReA: Question-Aware Prompt Captions for Knowledge-based Visual Question Answering

no code implementations 4 Feb 2024 Ziyu Ma, Shutao Li, Bin Sun, Jianfei Cai, Zuxiang Long, Fuyan Ma

Therefore, we propose GeReA, a generate-reason framework that prompts an MLLM like InstructBLIP with question-relevant vision and language information to generate knowledge-relevant descriptions and reasons over those descriptions for knowledge-based VQA.

Language Modelling Large Language Model +3

Turning Dust into Gold: Distilling Complex Reasoning Capabilities from LLMs by Leveraging Negative Data

1 code implementation 20 Dec 2023 Yiwei Li, Peiwen Yuan, Shaoxiong Feng, Boyuan Pan, Bin Sun, Xinglin Wang, HeDa Wang, Kan Li

In this work, we illustrate the merit of negative data and propose a model specialization framework to distill LLMs with negative samples besides positive ones.

Arithmetic Reasoning

EXMODD: An EXplanatory Multimodal Open-Domain Dialogue dataset

1 code implementation 17 Oct 2023 Hang Yin, Pinren Lu, Ziang Li, Bin Sun, Kan Li

The need for high-quality data has been a key issue hindering the research of dialogue tasks.

Language Modelling

Large Language Models Need Holistically Thought in Medical Conversational QA

1 code implementation 9 May 2023 Yixuan Weng, Bin Li, Fei Xia, Minjun Zhu, Bin Sun, Shizhu He, Kang Liu, Jun Zhao

The medical conversational question answering (CQA) system aims at providing a series of professional medical services to improve the efficiency of medical care.

Conversational Question Answering

LOGO-Former: Local-Global Spatio-Temporal Transformer for Dynamic Facial Expression Recognition

no code implementations 5 May 2023 Fuyan Ma, Bin Sun, Shutao Li

Previous methods for dynamic facial expression recognition (DFER) in the wild are mainly based on Convolutional Neural Networks (CNNs), whose local operations ignore the long-range dependencies in videos.

Dynamic Facial Expression Recognition Facial Expression Recognition

Heterogeneous-Branch Collaborative Learning for Dialogue Generation

no code implementations 21 Mar 2023 Yiwei Li, Shaoxiong Feng, Bin Sun, Kan Li

Collaborative learning, also known as online knowledge distillation, is an effective way to conduct one-stage group distillation in the absence of a well-trained large teacher model.

Attribute Dialogue Generation +1

Image as Set of Points

2 code implementations 2 Mar 2023 Xu Ma, Yuqian Zhou, Huan Wang, Can Qin, Bin Sun, Chang Liu, Yun Fu

Context clusters (CoCs) view an image as a set of unorganized points and extract features via a simplified clustering algorithm.

Clustering

Large Language Models are Better Reasoners with Self-Verification

1 code implementation 19 Dec 2022 Yixuan Weng, Minjun Zhu, Fei Xia, Bin Li, Shizhu He, Shengping Liu, Bin Sun, Kang Liu, Jun Zhao

By performing a backward verification of the answers that LLM deduced for itself, we can obtain interpretable answer validation scores to select the candidate answer with the highest score.

Arithmetic Reasoning Common Sense Reasoning +3

Towards Diverse, Relevant and Coherent Open-Domain Dialogue Generation via Hybrid Latent Variables

no code implementations 2 Dec 2022 Bin Sun, Yitong Li, Fei Mi, Weichao Wang, Yiwei Li, Kan Li

Specifically, HLV constrains the global semantics of responses through discrete latent variables and enriches responses with continuous latent variables.

Dialogue Generation Response Generation

Modeling Complex Dialogue Mappings via Sentence Semantic Segmentation Guided Conditional Variational Auto-Encoder

no code implementations 1 Dec 2022 Bin Sun, Shaoxiong Feng, Yiwei Li, Weichao Wang, Fei Mi, Yitong Li, Kan Li

Complex dialogue mappings (CDM), including one-to-many and many-to-one mappings, tend to make dialogue models generate incoherent or dull responses, and modeling these mappings remains a huge challenge for neural dialogue systems.

Dialogue Generation Semantic Segmentation +1

Learning to Locate Visual Answer in Video Corpus Using Question

1 code implementation 11 Oct 2022 Bin Li, Yixuan Weng, Bin Sun, Shutao Li

We introduce a new task, named video corpus visual answer localization (VCVAL), which aims to locate the visual answer in a large collection of untrimmed instructional videos using a natural language question.

Contrastive Learning Language Modelling +2

TransFiner: A Full-Scale Refinement Approach for Multiple Object Tracking

no code implementations 26 Jul 2022 Bin Sun

Multiple object tracking (MOT) is a task comprising detection and association.

Multiple Object Tracking

Scene-Aware Prompt for Multi-modal Dialogue Understanding and Generation

no code implementations 5 Jul 2022 Bin Li, Yixuan Weng, Ziyu Ma, Bin Sun, Shutao Li

To fully leverage the visual information for both scene understanding and dialogue generation, we propose the scene-aware prompt for the MDUG task.

Dialogue Generation Dialogue Understanding +2

Explicit and implicit models in infrared and visible image fusion

no code implementations 20 Jun 2022 Zixuan Wang, Bin Sun

Infrared and visible images, as multi-modal image pairs, show significant differences in the expression of the same scene.

Infrared And Visible Image Fusion

Towards Layer-wise Image Vectorization

1 code implementation CVPR 2022 Xu Ma, Yuqian Zhou, Xingqian Xu, Bin Sun, Valerii Filev, Nikita Orlov, Yun Fu, Humphrey Shi

Image rasterization is a mature technique in computer graphics, while image vectorization, the reverse path of rasterization, remains a major challenge.

Stop Filtering: Multi-View Attribute-Enhanced Dialogue Learning

no code implementations 23 May 2022 Yiwei Li, Bin Sun, Shaoxiong Feng, Kan Li

However, the discarded samples may obtain high scores in other perspectives and can provide regularization effects on the model learning, which causes the performance improvement to be sensitive to the filtering ratio.

Attribute

Spatio-Temporal Transformer for Dynamic Facial Expression Recognition in the Wild

no code implementations 10 May 2022 Fuyan Ma, Bin Sun, Shutao Li

Previous methods for dynamic facial expression recognition in the wild are mainly based on Convolutional Neural Networks (CNNs), whose local operations ignore the long-range dependencies in videos.

Dynamic Facial Expression Recognition Facial Expression Recognition +1

Diversifying Neural Dialogue Generation via Negative Distillation

no code implementations NAACL 2022 Yiwei Li, Shaoxiong Feng, Bin Sun, Kan Li

Generative dialogue models suffer badly from the generic response problem, limiting their applications to a few toy scenarios.

Dialogue Generation

LingYi: Medical Conversational Question Answering System based on Multi-modal Knowledge Graphs

1 code implementation 20 Apr 2022 Fei Xia, Bin Li, Yixuan Weng, Shizhu He, Kang Liu, Bin Sun, Shutao Li, Jun Zhao

The medical conversational system can relieve the burden of doctors and improve the efficiency of healthcare, especially during the pandemic.

Conversational Question Answering Dialogue Generation +3

Hybrid Pixel-Unshuffled Network for Lightweight Image Super-Resolution

1 code implementation 16 Mar 2022 Bin Sun, Yulun Zhang, Songyao Jiang, Yun Fu

In this paper, we propose a novel Hybrid Pixel-Unshuffled Network (HPUN) by introducing an efficient and effective downsampling module into the SR task.

Image Super-Resolution

Towards Visual-Prompt Temporal Answering Grounding in Medical Instructional Video

no code implementations 13 Mar 2022 Bin Li, Yixuan Weng, Bin Sun, Shutao Li

However, due to the weak correlations and huge gaps of the semantic features between the textual question and visual answer, existing methods adopting visual span predictor perform poorly in the TAGV task.

Language Modelling Question Answering +2

SimCLAD: A Simple Framework for Contrastive Learning of Acronym Disambiguation

no code implementations 29 Nov 2021 Bin Li, Fei Xia, Yixuan Weng, Xiusheng Huang, Bin Sun

In this paper, we propose a Simple framework for Contrastive Learning of Acronym Disambiguation (SimCLAD) method to better understand the acronym meanings.

Contrastive Learning document understanding +1

PSG: Prompt-based Sequence Generation for Acronym Extraction

no code implementations 29 Nov 2021 Bin Li, Fei Xia, Yixuan Weng, Xiusheng Huang, Bin Sun, Shutao Li

In this paper, we propose a Prompt-based Sequence Generation (PSG) method for the acronym extraction task.

document understanding Language Modelling +1

Hybrid Mutimodal Fusion for Dimensional Emotion Recognition

no code implementations 16 Oct 2021 Ziyu Ma, Fuyan Ma, Bin Sun, Shutao Li

For the MuSe-Stress sub-challenge, we highlight our solutions in three aspects: 1) the audio-visual features and the bio-signal features are used for emotional state recognition.

Emotion Recognition

Sign Language Recognition via Skeleton-Aware Multi-Model Ensemble

2 code implementations 12 Oct 2021 Songyao Jiang, Bin Sun, Lichen Wang, Yue Bai, Kunpeng Li, Yun Fu

Current Sign Language Recognition (SLR) methods usually extract features via deep neural networks and suffer overfitting due to limited and noisy data.

Action Recognition Sign Language Recognition +1

Grassmannian Graph-attentional Landmark Selection for Domain Adaptation

no code implementations 7 Sep 2021 Bin Sun, Shaofan Wang, Dehui Kong, Jinghua Li, BaoCai Yin

GGLS presents a landmark selection scheme using attention-induced neighbors of the graphical structure of samples and performs distribution adaptation and knowledge adaptation over Grassmann manifold.

Domain Adaptation

More but Correct: Generating Diversified and Entity-revised Medical Response

no code implementations 3 Aug 2021 Bin Li, Encheng Chen, Hongru Liu, Yixuan Weng, Bin Sun, Shutao Li, Yongping Bai, Meiling Hu

Medical Dialogue Generation (MDG) is intended to build a medical dialogue system for intelligent consultation, which can communicate with patients in real-time, thereby improving the efficiency of clinical diagnosis with broad application prospects.

Dialogue Generation

Generating Relevant and Coherent Dialogue Responses using Self-separated Conditional Variational AutoEncoders

no code implementations ACL 2021 Bin Sun, Shaoxiong Feng, Yiwei Li, Jiamou Liu, Kan Li

Conditional Variational AutoEncoder (CVAE) effectively increases the diversity and informativeness of responses in open-ended dialogue generation tasks through enriching the context vector with sampled latent variables.

Dialogue Generation Informativeness

THINK: A Novel Conversation Model for Generating Grammatically Correct and Coherent Responses

no code implementations 28 May 2021 Bin Sun, Shaoxiong Feng, Yiwei Li, Jiamou Liu, Kan Li

In this work, we proposed a conversation model named "THINK" (Teamwork generation Hover around Impressive Noticeable Keywords) to make the decoder more complicated and avoid generating duplicated and self-contradicting responses.

Informativeness

GAN for Vision, KG for Relation: a Two-stage Deep Network for Zero-shot Action Recognition

no code implementations 25 May 2021 Bin Sun, Dehui Kong, Shaofan Wang, Jinghua Li, BaoCai Yin, Xiaonan Luo

In the sampling stage, we utilize a generative adversarial network (GAN) trained by action features and word vectors of seen classes to synthesize the action features of unseen classes, which can balance the training sample data of seen classes and unseen classes.

Action Recognition Classification +3

Real-time Human Action Recognition Using Locally Aggregated Kinematic-Guided Skeletonlet and Supervised Hashing-by-Analysis Model

no code implementations 24 May 2021 Bin Sun, Shaofan Wang, Dehui Kong, LiChun Wang, BaoCai Yin

To tackle all these problems, we propose a real-time 3D action recognition framework by integrating the locally aggregated kinematic-guided skeletonlet (LAKS) with a supervised hashing-by-analysis (SHA) model.

3D Action Recognition Denoising

Facial Expression Recognition with Visual Transformers and Attentional Selective Fusion

no code implementations 31 Mar 2021 Fuyan Ma, Bin Sun, Shutao Li

Facial Expression Recognition (FER) in the wild is extremely challenging due to occlusions, variant head poses, face deformation and motion blur under unconstrained conditions.

Facial Expression Recognition Facial Expression Recognition (FER)

Skeleton Aware Multi-modal Sign Language Recognition

3 code implementations 16 Mar 2021 Songyao Jiang, Bin Sun, Lichen Wang, Yue Bai, Kunpeng Li, Yun Fu

Sign language is commonly used by deaf or speech impaired people to communicate but requires significant effort to master.

Sign Language Recognition Skeleton Based Action Recognition

Regularizing Dialogue Generation by Imitating Implicit Scenarios

no code implementations EMNLP 2020 Shaoxiong Feng, Xuancheng Ren, Hongshen Chen, Bin Sun, Kan Li, Xu Sun

Human dialogues are scenario-based and appropriate responses generally relate to the latent context knowledge entailed by the specific scenario.

Dialogue Generation Imitation Learning

Recent Advances and New Guidelines on Hyperspectral and Multispectral Image Fusion

no code implementations 8 Aug 2020 Renwei Dian, Shutao Li, Bin Sun, Anjing Guo

Hyperspectral image (HSI) with high spectral resolution often suffers from low spatial resolution owing to the limitations of imaging sensors.

LPRNet: Lightweight Deep Network by Low-rank Pointwise Residual Convolution

no code implementations 25 Oct 2019 Bin Sun, Jun Li, Ming Shao, Yun Fu

To reduce the computation and memory costs, we propose a novel lightweight deep learning module by low-rank pointwise residual (LPR) convolution, called LPRNet.

Face Alignment Image Classification +1

Real-time Memory Efficient Large-pose Face Alignment via Deep Evolutionary Network

no code implementations 25 Oct 2019 Bin Sun, Ming Shao, Siyu Xia, Yun Fu

To accelerate the model, we propose an efficient network structure to accelerate the evolutionary learning process through a factorization strategy.

Face Alignment Face Recognition

Cohomology of group theoretic Dehn fillings II

no code implementations 4 Aug 2019 Nansen Petrosyan, Bin Sun

We apply these results to obtain hyperbolic and acylindrically hyperbolic quotients with special properties.

Group Theory 20F67, 20F10, 20E06

Cohomology of group theoretic Dehn fillings II: A spectral sequence

no code implementations 29 Jul 2019 Bin Sun

This is the second paper in a series of three papers aiming to study cohomology of group theoretic Dehn fillings.

Group Theory

EV-Action: Electromyography-Vision Multi-Modal Action Dataset

1 code implementation 20 Apr 2019 Lichen Wang, Bin Sun, Joseph Robinson, Taotao Jing, Yun Fu

To make up for this, we introduce a new, large-scale EV-Action dataset in this work, which consists of RGB, depth, electromyography (EMG), and two skeleton modalities.

Action Analysis Action Recognition +3

GeoCapsNet: Aerial to Ground view Image Geo-localization using Capsule Network

no code implementations 12 Apr 2019 Bin Sun, Chen Chen, Yingying Zhu, Jianmin Jiang

The task of cross-view image geo-localization aims to determine the geo-location (GPS coordinates) of a query ground-view image by matching it with the GPS-tagged aerial (satellite) images in a reference dataset.

Image Retrieval Retrieval
