Search Results for author: Dawei Dai

Found 12 papers, 4 papers with code

HumanVLM: Foundation for Human-Scene Vision-Language Model

no code implementations5 Nov 2024 Dawei Dai, Xu Long, Li Yutang, Zhang YuanHui, Shuyin Xia

Specifically, (1) we create a large-scale human-scene multimodal image-text dataset (HumanCaption-10M) sourced from the Internet to facilitate domain-specific alignment; (2) we develop a captioning approach for human-centered images that captures human faces, bodies, and backgrounds, and construct a high-quality human-scene image-text dataset (HumanCaptionHQ, about 311k pairs) containing as much detailed information about humans as possible; and (3) using HumanCaption-10M and HumanCaptionHQ, we train HumanVLM.
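The two-stage data curriculum described above lends itself to a simple illustration. The sketch below is a minimal outline of such a pipeline, assuming hypothetical names (Stage, run_stage) and a freeze/unfreeze schedule that is not stated in the abstract; it is not the authors' code.

```python
# Hedged sketch of a two-stage training curriculum: large noisy corpus
# for alignment, then a smaller high-quality corpus for refinement.
# All names here are hypothetical illustrations.

from dataclasses import dataclass

@dataclass
class Stage:
    name: str
    dataset: str          # which image-text corpus this stage consumes
    frozen_vision: bool   # whether the vision encoder stays frozen (assumption)

def run_stage(stage: Stage) -> None:
    # A real pipeline would iterate over (image, caption) pairs and
    # optimize an image-text alignment loss; here we only log the plan.
    print(f"[{stage.name}] train on {stage.dataset} "
          f"(vision encoder frozen={stage.frozen_vision})")

curriculum = [
    Stage("align", "HumanCaption-10M", frozen_vision=True),
    Stage("refine", "HumanCaptionHQ (~311k pairs)", frozen_vision=False),
]
for stage in curriculum:
    run_stage(stage)
```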

Language Modeling, Language Modelling +1

Granular-ball Representation Learning for Deep CNN on Learning with Label Noise

no code implementations5 Sep 2024 Dawei Dai, Hao Zhu, Shuyin Xia, Guoyin Wang

Specifically, considering the classification task: (1) in the forward pass, we split the input samples into $gb$ samples at the feature level, each of which can correspond to multiple samples with varying counts while sharing a single label; (2) during backpropagation, we modify the gradient-allocation strategy of the GBC module so that it propagates normally; and (3) we develop an experience-replay policy to ensure the stability of the training process.
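A minimal sketch of the forward-pass idea in step (1): group same-label feature vectors into "granular balls" and let each ball contribute one (center, label) pair. The helper make_granular_balls is a hypothetical stand-in; the paper's actual GBC module and gradient-allocation rule are more involved.

```python
# Illustrative only: each "ball" is a group of same-label feature
# vectors, represented by its mean, that shares a single label.

import numpy as np

def make_granular_balls(features: np.ndarray, labels: np.ndarray):
    """Return one representative (center, label, size) per label group."""
    balls = []
    for lab in np.unique(labels):
        members = features[labels == lab]   # samples sharing one label
        center = members.mean(axis=0)       # ball representative
        balls.append((center, lab, len(members)))
    return balls

rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 4))             # 8 samples, 4-dim features
labs = np.array([0, 0, 1, 1, 1, 2, 2, 2])
for center, lab, n in make_granular_balls(feats, labs):
    print(f"label={lab}: ball of {n} samples, center={center.round(2)}")
```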

Attribute, Representation Learning

15M Multimodal Facial Image-Text Dataset

no code implementations11 Jul 2024 Dawei Dai, Yutang Li, Yingge Liu, Mingming Jia, Zhang YuanHui, Guoyin Wang

FaceCaption-15M comprises over 15 million pairs of facial images and their corresponding natural language descriptions of facial features, making it the largest facial image-caption dataset to date.

Image to text

Sketch Less Face Image Retrieval: A New Challenge

1 code implementation11 Feb 2023 Dawei Dai, Yutang Li, Liang Wang, Shiyu Fu, Shuyin Xia, Guoyin Wang

In this study, we propose a new task named sketch-less face image retrieval (SLFIR), in which retrieval is carried out at each stroke, with the aim of retrieving the target face photo from a partial sketch using as few strokes as possible (see Fig. 1).
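The stroke-by-stroke protocol can be illustrated with a toy ranking loop: after each new stroke, embed the partial sketch and rank the photo gallery by similarity. In the sketch below, embed_partial_sketch is a stand-in (a noisy copy of the target embedding that sharpens as strokes accumulate), not the paper's trained model.

```python
# Toy SLFIR evaluation loop with random gallery embeddings.

import numpy as np

rng = np.random.default_rng(1)
gallery = rng.normal(size=(100, 64))        # 100 face-photo embeddings
gallery /= np.linalg.norm(gallery, axis=1, keepdims=True)
target_idx = 7

def embed_partial_sketch(num_strokes: int) -> np.ndarray:
    # Stand-in: more strokes -> embedding closer to the target's.
    noise = rng.normal(size=64) / (num_strokes + 1)
    v = gallery[target_idx] + noise
    return v / np.linalg.norm(v)

for strokes in (1, 3, 5, 10):
    q = embed_partial_sketch(strokes)
    ranking = np.argsort(-(gallery @ q))    # descending cosine similarity
    rank_of_target = int(np.where(ranking == target_idx)[0][0]) + 1
    print(f"{strokes} strokes -> target ranked {rank_of_target}/100")
```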

Face Image Retrieval, Retrieval +1

A Study of Deep CNN Model with Labeling Noise Based on Granular-ball Computing

no code implementations17 Jul 2022 Dawei Dai, Donggen Li, Zhiguo Zhuang

In this paper, we pioneer a granular-ball neural network model that adopts the idea of multi-granularity to filter label-noise samples during training. This addresses the model instability that label noise causes in deep learning, greatly reducing the proportion of noisy labels among the training samples and improving the robustness of the neural network.
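As an illustration of the filtering idea, the sketch below groups nearby samples at one coarse granularity and flags those whose label disagrees with their group's majority. The function flag_label_noise and its grid parameter are hypothetical stand-ins: the actual model performs this inside training with learned features and multiple granularities.

```python
# Flag samples whose label conflicts with the majority label of the
# coarse "ball" (grid cell) they fall into. Illustrative only.

import numpy as np
from collections import Counter

def flag_label_noise(x: np.ndarray, y: np.ndarray, grid: float = 1.0):
    """Return indices of samples whose label conflicts with their ball."""
    keys = np.floor(x / grid).astype(int)   # coarse ball assignment
    flagged = []
    for key in {tuple(k) for k in keys}:
        idx = np.where((keys == key).all(axis=1))[0]
        majority, _ = Counter(y[idx]).most_common(1)[0]
        flagged.extend(i for i in idx if y[i] != majority)
    return sorted(flagged)

x = np.array([[0.1, 0.2], [0.3, 0.1], [0.2, 0.4], [5.0, 5.1]])
y = np.array([0, 0, 1, 1])                  # sample 2 looks mislabeled
print("suspected noisy labels at indices:", flag_label_noise(x, y))
```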

Decision Making

One-Stage Deep Edge Detection Based on Dense-Scale Feature Fusion and Pixel-Level Imbalance Learning

no code implementations17 Mar 2022 Dawei Dai, Chunjie Wang, Shuyin Xia, Yingge Liu, Guoyin Wang

Edge detection, a basic task in the field of computer vision, is an important preprocessing operation for the recognition and understanding of a visual scene.

Decoder, Edge Detection

Rethinking the Image Feature Biases Exhibited by Deep CNN Models

no code implementations3 Nov 2021 Dawei Dai, Yutang Li, Huanan Bao, Sy Xia, Guoyin Wang, Xiaoli Ma

From the results, we conclude that (1) the combined effect of certain features is typically far more influential than any single feature; and (2) in different tasks, neural models can exhibit different biases; that is, we can design a specific task to make a neural model biased toward a specific anticipated feature.
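The second conclusion suggests a simple probing recipe: train on data where two features both predict the label, then test on a conflict example where they disagree. The toy sketch below (entirely synthetic, with a nearest-centroid stand-in for a real model) demonstrates the recipe only, not the paper's experiments.

```python
# Two correlated features ("color", "shape") both predict the label at
# training time; a conflict probe reveals which one the model relies on.

import numpy as np

rng = np.random.default_rng(2)
n = 200
color = rng.normal(0, 1, n)
shape = color * 2.0 + rng.normal(0, 0.1, n)  # shape has larger magnitude
y = (color > 0).astype(int)                  # both features track the label
X = np.stack([color, shape], axis=1)

# Nearest-centroid "model": the feature dominating centroid distance is
# the feature the model is effectively biased toward.
c0, c1 = X[y == 0].mean(0), X[y == 1].mean(0)

# Conflict probe: color votes class 1, shape votes class 0.
probe = np.array([1.0, -2.0])
pred = int(np.linalg.norm(probe - c1) < np.linalg.norm(probe - c0))
print("prediction on conflict example:", pred,
      "(0 -> model followed shape, 1 -> model followed color)")
```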

Understanding the Feedforward Artificial Neural Network Model From the Perspective of Network Flow

no code implementations26 Apr 2017 Dawei Dai, Weimin Tan, Hong Zhan

Experiments on two types of ANN models, including multi-layer MLPs and CNNs, verify that network flow based on class pathways is a reasonable explanation for ANN models.
