Search Results for author: Fu Li

Found 47 papers, 24 papers with code

A Study on Training and Developing Large Language Models for Behavior Tree Generation

no code implementations16 Jan 2024 Fu Li, Xueying Wang, Bin Li, Yunlong Wu, Yanzhen Wang, Xiaodong Yi

The core contribution of this paper lies in the design of a BT generation framework based on LLM, which encompasses the entire process, from data synthesis and model training to application developing and data verification.

Investigating the Use of Traveltime and Reflection Tomography for Deep Learning-Based Sound-Speed Estimation in Ultrasound Computed Tomography

no code implementations16 Nov 2023 Gangwon Jeong, Fu Li, Umberto Villa, Mark A. Anastasio

Deep learning-based image-to-image learned reconstruction (IILR) methods are being investigated as scalable and computationally efficient alternatives.

Retinex-guided Channel-grouping based Patch Swap for Arbitrary Style Transfer

no code implementations19 Sep 2023 Chang Liu, Yi Niu, Mingming Ma, Fu Li, Guangming Shi

The basic principle of the patch-matching based style transfer is to substitute the patches of the content image feature maps by the closest patches from the style image feature maps.

Patch Matching Style Transfer

VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation

no code implementations1 Sep 2023 Xin Li, Wenqing Chu, Ye Wu, Weihang Yuan, Fanglong Liu, Qi Zhang, Fu Li, Haocheng Feng, Errui Ding, Jingdong Wang

In this paper, we present VideoGen, a text-to-video generation approach, which can generate a high-definition video with high frame fidelity and strong temporal consistency using reference-guided latent diffusion.

Decoder Text-to-Image Generation +2

EEG-based Emotion Style Transfer Network for Cross-dataset Emotion Recognition

no code implementations9 Aug 2023 Yijin Zhou, Fu Li, Yang Li, Youshuo Ji, Lijian Zhang, Yuanfang Chen, Wenming Zheng, Guangming Shi

The transfer module encodes the domain-specific information of source and target domains and then re-constructs the source domain's emotional pattern and the target domain's statistical characteristics into the new stylized EEG representations.

EEG EEG Emotion Recognition +1

Revisiting Neural Retrieval on Accelerators

2 code implementations6 Jun 2023 Jiaqi Zhai, Zhaojie Gong, Yueming Wang, Xiao Sun, Zheng Yan, Fu Li, Xing Liu

A key component of retrieval is to model (user, item) similarity, which is commonly represented as the dot product of two learned embeddings.

Information Retrieval Retrieval

Master: Meta Style Transformer for Controllable Zero-Shot and Few-Shot Artistic Style Transfer

no code implementations CVPR 2023 Hao Tang, Songhua Liu, Tianwei Lin, Shaoli Huang, Fu Li, Dongliang He, Xinchao Wang

On the other hand, different from the vanilla version, we adopt a learnable scaling operation on content features before content-style feature interaction, which better preserves the original similarity between a pair of content features while ensuring the stylization quality.

Meta-Learning Style Transfer

DeltaEdit: Exploring Text-free Training for Text-Driven Image Manipulation

1 code implementation CVPR 2023 Yueming Lyu, Tianwei Lin, Fu Li, Dongliang He, Jing Dong, Tieniu Tan

Our key idea is to investigate and identify a space, namely delta image and text space that has well-aligned distribution between CLIP visual feature differences of two images and CLIP textual embedding differences of source and target texts.

Image Manipulation

AdaCM: Adaptive ColorMLP for Real-Time Universal Photo-realistic Style Transfer

1 code implementation3 Dec 2022 Tianwei Lin, Honglin Lin, Fu Li, Dongliang He, Wenhao Wu, Meiling Wang, Xin Li, Yong liu

Then, in \textbf{AdaCM}, we adopt a CNN encoder to adaptively predict all parameters for the ColorMLP conditioned on each input content and style image pair.

4k Style Transfer

RRSR:Reciprocal Reference-based Image Super-Resolution with Progressive Feature Alignment and Selection

no code implementations8 Nov 2022 Lin Zhang, Xin Li, Dongliang He, Fu Li, Yili Wang, Zhaoxiang Zhang

While previous state-of-the-art RefSR methods mainly focus on improving the efficacy and robustness of reference feature transfer, it is generally overlooked that a well reconstructed SR image should enable better SR reconstruction for its similar LR images when it is referred to as.

feature selection Image Super-Resolution

It Takes Two: Masked Appearance-Motion Modeling for Self-supervised Video Transformer Pre-training

no code implementations11 Oct 2022 Yuxin Song, Min Yang, Wenhao Wu, Dongliang He, Fu Li, Jingdong Wang

In order to guide the encoder to fully excavate spatial-temporal features, two separate decoders are used for two pretext tasks of disentangled appearance and motion prediction.

Decoder motion prediction

CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval

no code implementations21 Aug 2022 Haoran Wang, Dongliang He, Wenhao Wu, Boyang xia, Min Yang, Fu Li, Yunlong Yu, Zhong Ji, Errui Ding, Jingdong Wang

We introduce dynamic dictionaries for both modalities to enlarge the scale of image-text pairs, and diversity-sensitiveness is achieved by adaptive negative pair weighting.

Clustering Contrastive Learning +4

Boosting Video-Text Retrieval with Explicit High-Level Semantics

no code implementations8 Aug 2022 Haoran Wang, Di Xu, Dongliang He, Fu Li, Zhong Ji, Jungong Han, Errui Ding

Video-text retrieval (VTR) is an attractive yet challenging task for multi-modal understanding, which aims to search for relevant video (text) given a query (video).

Retrieval Text Retrieval +3

Neural Color Operators for Sequential Image Retouching

2 code implementations17 Jul 2022 Yili Wang, Xin Li, Kun Xu, Dongliang He, Qi Zhang, Fu Li, Errui Ding

The neural color operator mimics the behavior of traditional color operators and learns pixelwise color transformation while its strength is controlled by a scalar.

Decoder Image Enhancement +1

GMSS: Graph-Based Multi-Task Self-Supervised Learning for EEG Emotion Recognition

1 code implementation12 Apr 2022 Yang Li, Ji Chen, Fu Li, Boxun Fu, Hao Wu, Youshuo Ji, Yijin Zhou, Yi Niu, Guangming Shi, Wenming Zheng

GMSS has the ability to learn more general representations by integrating multiple self-supervised tasks, including spatial and frequency jigsaw puzzle tasks, and contrastive learning tasks.

Contrastive Learning EEG +2

OSOP: A Multi-Stage One Shot Object Pose Estimation Framework

no code implementations CVPR 2022 Ivan Shugurov, Fu Li, Benjamin Busam, Slobodan Ilic

We present a novel one-shot method for object detection and 6 DoF pose estimation, that does not require training on target objects.

Object object-detection +2

Progressive Graph Convolution Network for EEG Emotion Recognition

no code implementations14 Dec 2021 Yijin Zhou, Fu Li, Yang Li, Youshuo Ji, Guangming Shi, Wenming Zheng, Lijian Zhang, Yuanfang Chen, Rui Cheng

Moreover, motivated by the observation of the relationship between coarse- and fine-grained emotions, we adopt a dual-head module that enables the PGCN to progressively learn more discriminative EEG features, from coarse-grained (easy) to fine-grained categories (difficult), referring to the hierarchical characteristic of emotion.

EEG EEG Emotion Recognition

Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model

1 code implementation CVPR 2022 Zipeng Xu, Tianwei Lin, Hao Tang, Fu Li, Dongliang He, Nicu Sebe, Radu Timofte, Luc van Gool, Errui Ding

We propose a novel framework, i. e., Predict, Prevent, and Evaluate (PPE), for disentangled text-driven image manipulation that requires little manual annotation while being applicable to a wide variety of manipulations.

Image Manipulation Language Modelling

Paint Transformer: Feed Forward Neural Painting with Stroke Prediction

2 code implementations ICCV 2021 Songhua Liu, Tianwei Lin, Dongliang He, Fu Li, Ruifeng Deng, Xin Li, Errui Ding, Hao Wang

Neural painting refers to the procedure of producing a series of strokes for a given image and non-photo-realistically recreating it using neural networks.

Object Detection Reinforcement Learning (RL) +1

AdaAttN: Revisit Attention Mechanism in Arbitrary Neural Style Transfer

3 code implementations ICCV 2021 Songhua Liu, Tianwei Lin, Dongliang He, Fu Li, Meiling Wang, Xin Li, Zhengxing Sun, Qian Li, Errui Ding

Finally, the content feature is normalized so that they demonstrate the same local feature statistics as the calculated per-point weighted style feature statistics.

Style Transfer Video Style Transfer

Image Inpainting by End-to-End Cascaded Refinement with Mask Awareness

1 code implementation28 Apr 2021 Manyu Zhu, Dongliang He, Xin Li, Chao Li, Fu Li, Xiao Liu, Errui Ding, Zhaoxiang Zhang

Inpainting arbitrary missing regions is challenging because learning valid features for various masked regions is nontrivial.

Decoder Image Inpainting +1

Learning Semantic Person Image Generation by Region-Adaptive Normalization

1 code implementation CVPR 2021 Zhengyao Lv, Xiaoming Li, Xin Li, Fu Li, Tianwei Lin, Dongliang He, WangMeng Zuo

In the first stage, we predict the target semantic parsing maps to eliminate the difficulties of pose transfer and further benefit the latter translation of per-region appearance style.

Pose Transfer Semantic Parsing +1

Drafting and Revision: Laplacian Pyramid Network for Fast High-Quality Artistic Style Transfer

2 code implementations CVPR 2021 Tianwei Lin, Zhuoqi Ma, Fu Li, Dongliang He, Xin Li, Errui Ding, Nannan Wang, Jie Li, Xinbo Gao

Inspired by the common painting process of drawing a draft and revising the details, we introduce a novel feed-forward method named Laplacian Pyramid Network (LapStyle).

Style Transfer

Experimental study of decoherence of the two-mode squeezed vacuum state via second harmonic generation

no code implementations22 Dec 2020 Fu Li, Tian Li, Girish S. Agarwal

Such a correlation is the most important characteristic of a two-mode squeezed state.

Optics Quantum Physics

MVFNet: Multi-View Fusion Network for Efficient Video Recognition

3 code implementations13 Dec 2020 Wenhao Wu, Dongliang He, Tianwei Lin, Fu Li, Chuang Gan, Errui Ding

Existing state-of-the-art methods have achieved excellent accuracy regardless of the complexity meanwhile efficient spatiotemporal modeling solutions are slightly inferior in performance.

Action Classification Action Recognition +2

A Novel Transferability Attention Neural Network Model for EEG Emotion Recognition

no code implementations21 Sep 2020 Yang Li, Boxun Fu, Fu Li, Guangming Shi, Wenming Zheng

So it is necessary to give more attention to the EEG samples with strong transferability rather than forcefully training a classification model by all the samples.

EEG EEG Emotion Recognition

Multi-Label Classification with Label Graph Superimposing

2 code implementations21 Nov 2019 Ya Wang, Dongliang He, Fu Li, Xiang Long, Zhichao Zhou, Jinwen Ma, Shilei Wen

In this paper, we propose a label graph superimposing framework to improve the conventional GCN+CNN framework developed for multi-label recognition in the following two aspects.

Attribute Classification +3

TruNet: Short Videos Generation from Long Videos via Story-Preserving Truncation

no code implementations14 Oct 2019 Fan Yang, Xiao Liu, Dongliang He, Chuang Gan, Jian Wang, Chao Li, Fu Li, Shilei Wen

In this work, we introduce a new problem, named as {\em story-preserving long video truncation}, that requires an algorithm to automatically truncate a long-duration video into multiple short and attractive sub-videos with each one containing an unbroken story.

Highlight Detection Video Summarization

Deep Concept-wise Temporal Convolutional Networks for Action Localization

2 code implementations26 Aug 2019 Xin Li, Tianwei Lin, Xiao Liu, Chuang Gan, WangMeng Zuo, Chao Li, Xiang Long, Dongliang He, Fu Li, Shilei Wen

In this paper, we empirically find that stacking more conventional temporal convolution layers actually deteriorates action classification performance, possibly ascribing to that all channels of 1D feature map, which generally are highly abstract and can be regarded as latent concepts, are excessively recombined in temporal convolution.

Action Classification Action Localization

Read, Watch, and Move: Reinforcement Learning for Temporally Grounding Natural Language Descriptions in Videos

1 code implementation21 Jan 2019 Dongliang He, Xiang Zhao, Jizhou Huang, Fu Li, Xiao Liu, Shilei Wen

The task of video grounding, which temporally localizes a natural language description in a video, plays an important role in understanding videos.

Decision Making Multi-Task Learning +3

StNet: Local and Global Spatial-Temporal Modeling for Action Recognition

8 code implementations5 Nov 2018 Dongliang He, Zhichao Zhou, Chuang Gan, Fu Li, Xiao Liu, Yandong Li, Li-Min Wang, Shilei Wen

In this paper, in contrast to the existing CNN+RNN or pure 3D convolution based approaches, we explore a novel spatial temporal network (StNet) architecture for both local and global spatial-temporal modeling in videos.

Action Recognition Temporal Action Localization

Exploiting Spatial-Temporal Modelling and Multi-Modal Fusion for Human Action Recognition

no code implementations27 Jun 2018 Dongliang He, Fu Li, Qijie Zhao, Xiang Long, Yi Fu, Shilei Wen

In this challenge, we propose spatial-temporal network (StNet) for better joint spatial-temporal modelling and comprehensively video understanding.

Action Recognition Temporal Action Localization +1

Revisiting the Effectiveness of Off-the-shelf Temporal Modeling Approaches for Large-scale Video Classification

no code implementations12 Aug 2017 Yunlong Bian, Chuang Gan, Xiao Liu, Fu Li, Xiang Long, Yandong Li, Heng Qi, Jie zhou, Shilei Wen, Yuanqing Lin

Experiment results on the challenging Kinetics dataset demonstrate that our proposed temporal modeling approaches can significantly improve existing approaches in the large-scale video recognition tasks.

Action Classification General Classification +2

Temporal Modeling Approaches for Large-scale Youtube-8M Video Understanding

1 code implementation14 Jul 2017 Fu Li, Chuang Gan, Xiao Liu, Yunlong Bian, Xiang Long, Yandong Li, Zhichao Li, Jie zhou, Shilei Wen

This paper describes our solution for the video recognition task of the Google Cloud and YouTube-8M Video Understanding Challenge that ranked the 3rd place.

Video Recognition Video Understanding

Combinatorial Multi-Armed Bandit with General Reward Functions

no code implementations NeurIPS 2016 Wei Chen, Wei Hu, Fu Li, Jian Li, Yu Liu, Pinyan Lu

Our framework enables a much larger class of reward functions such as the $\max()$ function and nonlinear utility functions.

Cannot find the paper you are looking for? You can Submit a new open access paper.