Search Results for author: Jie Wu

Found 54 papers, 18 papers with code

Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis

no code implementations 21 Apr 2024 Yuxi Ren, Xin Xia, Yanzuo Lu, Jiacheng Zhang, Jie Wu, Pan Xie, Xing Wang, Xuefeng Xiao

Current distillation techniques often dichotomize into two distinct aspects: i) ODE Trajectory Preservation; and ii) ODE Trajectory Reformulation.

Image Generation

LAECIPS: Large Vision Model Assisted Adaptive Edge-Cloud Collaboration for IoT-based Perception System

no code implementations 16 Apr 2024 Shijing Hu, Ruijun Deng, Xin Du, Zhihui Lu, Qiang Duan, Yi He, Shih-Chia Huang, Jie Wu

We propose to update the edge model and its collaboration strategy with the cloud under the supervision of the large vision model, so as to adapt to the dynamic IoT data streams.

Autonomous Driving Semantic Segmentation

ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback

1 code implementation 11 Apr 2024 Ming Li, Taojiannan Yang, Huafeng Kuang, Jie Wu, Zhaoning Wang, Xuefeng Xiao, Chen Chen

To this end, we propose ControlNet++, a novel approach that improves controllable generation by explicitly optimizing pixel-level cycle consistency between generated images and conditional controls.

SSIM

UniFL: Improve Stable Diffusion via Unified Feedback Learning

no code implementations 8 Apr 2024 Jiacheng Zhang, Jie Wu, Yuxi Ren, Xin Xia, Huafeng Kuang, Pan Xie, Jiashi Li, Xuefeng Xiao, Weilin Huang, Min Zheng, Lean Fu, Guanbin Li

Diffusion models have revolutionized the field of image generation, leading to the proliferation of high-quality models and diverse downstream applications.

Image Generation

ByteEdit: Boost, Comply and Accelerate Generative Image Editing

no code implementations 7 Apr 2024 Yuxi Ren, Jie Wu, Yanzuo Lu, Huafeng Kuang, Xin Xia, Xionghui Wang, Qianqian Wang, Yixing Zhu, Pan Xie, Shiyin Wang, Xuefeng Xiao, Yitong Wang, Min Zheng, Lean Fu

Recent advancements in diffusion-based generative image editing have sparked a profound revolution, reshaping the landscape of image outpainting and inpainting tasks.

Image Outpainting

ResAdapter: Domain Consistent Resolution Adapter for Diffusion Models

1 code implementation 4 Mar 2024 Jiaxiang Cheng, Pan Xie, Xin Xia, Jiashi Li, Jie Wu, Yuxi Ren, Huixia Li, Xuefeng Xiao, Min Zheng, Lean Fu

In particular, after learning a deep understanding of pure resolution priors, ResAdapter, trained on the general dataset, generates resolution-free images with personalized diffusion models while preserving their original style domain.

Image Generation

DiffusionGPT: LLM-Driven Text-to-Image Generation System

no code implementations 18 Jan 2024 Jie Qin, Jie Wu, Weifeng Chen, Yuxi Ren, Huixia Li, Hefeng Wu, Xuefeng Xiao, Rui Wang, Shilei Wen

Diffusion models have opened up new avenues for the field of image generation, resulting in the proliferation of high-quality models shared on open-source platforms.

Model Selection Text-to-Image Generation

MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation

no code implementations 9 Jan 2024 Weimin WANG, Jiawei Liu, Zhijie Lin, Jiangqiao Yan, Shuo Chen, Chetwin Low, Tuyen Hoang, Jie Wu, Jun Hao Liew, Hanshu Yan, Daquan Zhou, Jiashi Feng

The growing demand for high-fidelity video generation from textual descriptions has catalyzed significant research in this field.

MORPH Video Generation

Diversity-Based Recruitment in Crowdsensing By Combinatorial Multi-Armed Bandits

no code implementations 25 Dec 2023 Abdalaziz Sawwan, Jie Wu

Additionally, it accommodates the variability in task completion quality caused by overlaps in the same round, which can range from the maximum individual worker's quality to the summation of qualities of all assigned workers in the overlap.

Multi-Armed Bandits

DreamTuner: Single Image is Enough for Subject-Driven Generation

no code implementations 21 Dec 2023 Miao Hua, Jiawei Liu, Fei Ding, Wei Liu, Jie Wu, Qian He

Diffusion-based models have demonstrated impressive capabilities for text-to-image generation and are expected for personalized applications of subject-driven generation, which require the generation of customized concepts with one or a few reference images.

Text-to-Image Generation

FuXi-S2S: An accurate machine learning model for global subseasonal forecasts

no code implementations 15 Dec 2023 Lei Chen, Xiaohui Zhong, Jie Wu, Deliang Chen, Shangping Xie, Qingchen Chao, Chensen Lin, Zixin Hu, Bo Lu, Hao Li, Yuan Qi

Skillful subseasonal forecasts beyond 2 weeks are crucial for a wide range of applications across various sectors of society.

Weather Forecasting

On the Communication Complexity of Decentralized Bilevel Optimization

no code implementations 19 Nov 2023 Yihan Zhang, My T. Thai, Jie Wu, Hongchang Gao

To the best of our knowledge, this is the first stochastic algorithm achieving these theoretical results under the heterogeneous setting.

Bilevel Optimization

Instructed Language Models with Retrievers Are Powerful Entity Linkers

1 code implementation 6 Nov 2023 Zilin Xiao, Ming Gong, Jie Wu, Xingyao Zhang, Linjun Shou, Jian Pei, Daxin Jiang

Generative approaches powered by large language models (LLMs) have demonstrated emergent abilities in tasks that require complex reasoning abilities.

Entity Linking In-Context Learning

Coherent Entity Disambiguation via Modeling Topic and Categorical Dependency

no code implementations 6 Nov 2023 Zilin Xiao, Linjun Shou, Xingyao Zhang, Jie Wu, Ming Gong, Jian Pei, Daxin Jiang

We propose CoherentED, an ED system equipped with novel designs aimed at enhancing the coherence of entity predictions.

Entity Disambiguation

AutoDiffusion: Training-Free Optimization of Time Steps and Architectures for Automated Diffusion Model Acceleration

1 code implementation ICCV 2023 Lijiang Li, Huixia Li, Xiawu Zheng, Jie Wu, Xuefeng Xiao, Rui Wang, Min Zheng, Xin Pan, Fei Chao, Rongrong Ji

Therefore, we propose to search the optimal time steps sequence and compressed model architecture in a unified framework to achieve effective image generation for diffusion models without any further training.

Image Generation single-image-generation

DLIP: Distilling Language-Image Pre-training

no code implementations 24 Aug 2023 Huafeng Kuang, Jie Wu, Xiawu Zheng, Ming Li, Xuefeng Xiao, Rui Wang, Min Zheng, Rongrong Ji

Furthermore, DLIP succeeds in retaining more than 95% of the performance with 22.4% parameters and 24.8% FLOPs compared to the teacher model and accelerates inference speed by 2.7x.

Image Captioning Knowledge Distillation +5

AlignDet: Aligning Pre-training and Fine-tuning in Object Detection

1 code implementation ICCV 2023 Ming Li, Jie Wu, Xionghui Wang, Chen Chen, Jie Qin, Xuefeng Xiao, Rui Wang, Min Zheng, Xin Pan

To this end, we propose AlignDet, a unified pre-training framework that can be adapted to various existing detectors to alleviate the discrepancies.

Object Detection

You Can Trade Your Experience in Distributed Multi-Agent Multi-Armed Bandits

no code implementations 2023 IEEE/ACM 31st International Symposium on Quality of Service (IWQoS) 2023 Guoju Gao, He Huang, Jie Wu, Sijie Huang, Yang Du

In this paper, we propose a transaction-based multi-agent MAB framework, where agents can trade their bandit experience with each other to improve their total individual rewards.

Decision Making Multi-Armed Bandits

When Decentralized Optimization Meets Federated Learning

no code implementations 5 Jun 2023 Hongchang Gao, My T. Thai, Jie Wu

Federated learning is a new learning paradigm for extracting knowledge from distributed data.

Federated Learning

Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models

1 code implementation 23 May 2023 Weifeng Chen, Yatai Ji, Jie Wu, Hefeng Wu, Pan Xie, Jiashi Li, Xin Xia, Xuefeng Xiao, Liang Lin

Based on a pre-trained conditional text-to-image (T2I) diffusion model, our model aims to generate videos conditioned on a sequence of control signals, such as edge or depth maps.

Optical Flow Estimation Style Transfer +4

FreeSeg: Unified, Universal and Open-Vocabulary Image Segmentation

no code implementations CVPR 2023 Jie Qin, Jie Wu, Pengxiang Yan, Ming Li, Ren Yuxi, Xuefeng Xiao, Yitong Wang, Rui Wang, Shilei Wen, Xin Pan, Xingang Wang

Recently, open-vocabulary learning has emerged to accomplish segmentation for arbitrary categories of text-based descriptions, which popularizes the segmentation system to more general-purpose application scenarios.

Image Segmentation Instance Segmentation +3

Masked Vision-Language Transformers for Scene Text Recognition

1 code implementation 9 Nov 2022 Jie Wu, Ying Peng, Shengming Zhang, Weigang Qi, Jian Zhang

MVLT is trained in two stages: in the first stage, we design a STR-tailored pretraining method based on a masking strategy; in the second stage, we fine-tune our model and adopt an iterative correction method to improve the performance.

Scene Text Recognition

FedVeca: Federated Vectorized Averaging on Non-IID Data with Adaptive Bi-directional Global Objective

no code implementations 28 Sep 2022 Ping Luo, Jieren Cheng, Zhenhao Liu, N. Xiong, Jie Wu

However, the clients' Non-Independent and Identically Distributed (Non-IID) data negatively affect the trained model, and clients with different numbers of local updates may cause significant gaps to the local gradients in each communication round.

Federated Learning

Multi-Granularity Distillation Scheme Towards Lightweight Semi-Supervised Semantic Segmentation

1 code implementation 22 Aug 2022 Jie Qin, Jie Wu, Ming Li, Xuefeng Xiao, Min Zheng, Xingang Wang

Consequently, we offer the first attempt to provide lightweight SSSS models via a novel multi-granularity distillation (MGD) scheme, where multi-granularity is captured from three aspects: i) complementary teacher structure; ii) labeled-unlabeled data cooperative distillation; iii) hierarchical and multi-levels loss setting.

Knowledge Distillation Semi-Supervised Semantic Segmentation

Parallel Pre-trained Transformers (PPT) for Synthetic Data-based Instance Segmentation

no code implementations 22 Jun 2022 Ming Li, Jie Wu, Jinhang Cai, Jie Qin, Yuxi Ren, Xuefeng Xiao, Min Zheng, Rui Wang, Xin Pan

Recently, Synthetic data-based Instance Segmentation has become an exceedingly favorable optimization paradigm since it leverages simulation rendering and physics to generate high-quality image-annotation pairs.

Instance Segmentation Segmentation +1

TRT-ViT: TensorRT-oriented Vision Transformer

no code implementations 19 May 2022 Xin Xia, Jiashi Li, Jie Wu, Xing Wang, Xuefeng Xiao, Min Zheng, Rui Wang

We revisit the existing excellent Transformers from the perspective of practical application.

Image Classification object-detection +2

ScalableViT: Rethinking the Context-oriented Generalization of Vision Transformer

2 code implementations 21 Mar 2022 Rui Yang, Hailong Ma, Jie Wu, Yansong Tang, Xuefeng Xiao, Min Zheng, Xiu Li

The vanilla self-attention mechanism inherently relies on pre-defined and steadfast computational dimensions.

Activation Modulation and Recalibration Scheme for Weakly Supervised Semantic Segmentation

1 code implementation 16 Dec 2021 Jie Qin, Jie Wu, Xuefeng Xiao, Lujun Li, Xingang Wang

Extensive experiments show that AMR establishes a new state-of-the-art performance on the PASCAL VOC 2012 dataset, surpassing not only current methods trained with the image-level of supervision but also some methods relying on stronger supervision, such as saliency label.

Feature Importance Scene Understanding +3

Revisiting Discriminator in GAN Compression: A Generator-discriminator Cooperative Compression Scheme

1 code implementation NeurIPS 2021 Shaojie Li, Jie Wu, Xuefeng Xiao, Fei Chao, Xudong Mao, Rongrong Ji

In this work, we revisit the role of discriminator in GAN compression and design a novel generator-discriminator cooperative compression scheme for GAN compression, termed GCC.

Online Multi-Granularity Distillation for GAN Compression

1 code implementation ICCV 2021 Yuxi Ren, Jie Wu, Xuefeng Xiao, Jianchao Yang

It reveals that OMGD provides a feasible solution for the deployment of real-time image translation on resource-constrained devices.

Translation

Weakly-Supervised Spatio-Temporal Anomaly Detection in Surveillance Video

no code implementations 9 Aug 2021 Jie Wu, Wei Zhang, Guanbin Li, Wenhao Wu, Xiao Tan, YingYing Li, Errui Ding, Liang Lin

In this paper, we introduce a novel task, referred to as Weakly-Supervised Spatio-Temporal Anomaly Detection (WSSTAD) in surveillance video.

Anomaly Detection

Spoken Language Understanding for Task-oriented Dialogue Systems with Augmented Memory Networks

no code implementations NAACL 2021 Jie Wu, Ian Harris, Hongzhi Zhao

We adopt a key-value memory network to model slot context dynamically and to track more important slot tags decoded before, which are then fed into our decoder for slot tagging.

Intent Detection slot-filling +3

Auction-Based Combinatorial Multi-Armed Bandit Mechanisms with Strategic Arms

1 code implementation IEEE Conference on Computer Communications 2021 Guoju Gao, He Huang, Mingjun Xiao, Jie Wu, Yu-E Sun, Sheng Zhang

The multi-armed bandit (MAB) model has been deeply studied to solve many online learning problems, such as rate allocation in communication networks, Ad recommendation in social networks, etc.

Computational Efficiency

Robust Sequence Submodular Maximization

no code implementations NeurIPS 2020 Gamal Sallam, Zizhan Zheng, Jie Wu, Bo Ji

Compared to robust submodular maximization for set function, new challenges arise when sequence functions are concerned.

Reinforcement Learning for Weakly Supervised Temporal Grounding of Natural Language in Untrimmed Videos

no code implementations 18 Sep 2020 Jie Wu, Guanbin Li, Xiaoguang Han, Liang Lin

Temporal grounding of natural language in untrimmed videos is a fundamental yet challenging multimedia task facilitating cross-media visual content retrieval.

Reinforcement Learning (RL) +2

Fine-Grained Image Captioning with Global-Local Discriminative Objective

1 code implementation 21 Jul 2020 Jie Wu, Tianshui Chen, Hefeng Wu, Zhi Yang, Guangchun Luo, Liang Lin

This is primarily due to (i) the conservative characteristic of traditional training objectives that drives the model to generate correct but hardly discriminative captions for similar images and (ii) the uneven word distribution of the ground-truth captions, which encourages generating highly frequent words/phrases while suppressing the less frequent but more concrete ones.

Descriptive Image Captioning +2

Combinatorial Multi-Armed Bandit Based Unknown Worker Recruitment in Heterogeneous Crowdsensing

1 code implementation IEEE INFOCOM 2020 - IEEE Conference on Computer Communications 2020 Guoju Gao, Jie Wu, Mingjun Xiao, Guoliang Chen

In each round, every task may be covered by more than one recruited worker, but its completion quality depends only on these workers' maximum sensing quality.

Adversarially Trained Multi-Singer Sequence-To-Sequence Singing Synthesizer

no code implementations 18 Jun 2020 Jie Wu, Jian Luan

This paper presents a high quality singing synthesizer that is able to model a voice with limited available recordings.

XiaoiceSing: A High-Quality and Integrated Singing Voice Synthesis System

no code implementations 11 Jun 2020 Peiling Lu, Jie Wu, Jian Luan, Xu Tan, Li Zhou

This paper presents XiaoiceSing, a high-quality singing voice synthesis system which employs an integrated network for spectrum, F0 and duration modeling.

Singing Voice Synthesis Vocal Bursts Intensity Prediction

PMC-GANs: Generating Multi-Scale High-Quality Pedestrian with Multimodal Cascaded GANs

no code implementations 30 Dec 2019 Jie Wu, Ying Peng, Chenghao Zheng, Zongbo Hao, Jian Zhang

Recently, generative adversarial networks (GANs) have shown great advantages in synthesizing images, leading to a boost of explorations of using faked images to augment data.

Data Augmentation Pedestrian Detection

Data-driven prediction of vortex-induced vibration response of marine risers subjected to three-dimensional current

no code implementations 24 Jun 2019 Signe Riemer-Sørensen, Jie Wu, Halvor Lie, Svein Sævik, Sang-Woo Kim

The load model and hydrodynamic parameters in present VIV prediction tools are developed based on two-dimensional (2D) flow conditions, as it is challenging to consider the effect of 3D flow along the risers.

Clustering

Hand Gesture Recognition with Leap Motion

no code implementations 12 Nov 2017 Youchen Du, Shenglan Liu, Lin Feng, Menghui Chen, Jie Wu

The recent introduction of depth cameras like Leap Motion Controller allows researchers to exploit the depth information to recognize hand gesture more robustly.

Dimensionality Reduction Hand Gesture Recognition +1

Polarimetric Hierarchical Semantic Model and Scattering Mechanism Based PolSAR Image Classification

no code implementations 1 Jul 2015 Fang Liu, Junfei Shi, Licheng Jiao, Hongying Liu, Shuyuan Yang, Jie Wu, Hongxia Hao, Jialing Yuan

For polarimetric SAR (PolSAR) image classification, it is a challenge to classify the aggregated terrain types, such as the urban area, into semantic homogenous regions due to sharp bright-dark variations in intensity.

General Classification Image Classification

Room-temperature implementation of the Deutsch-Jozsa algorithm with a single electronic spin in diamond

no code implementations 12 Feb 2010 Fazhan Shi, Xing Rong, Nanyang Xu, Ya Wang, Jie Wu, Bo Chong, Xinhua Peng, Juliane Kniepert, Rolf-Simon Schoenfeld, Wolfgang Harneit, Mang Feng, Jiangfeng Du

The nitrogen-vacancy defect center (NV center) is a promising candidate for quantum information processing due to the possibility of coherent manipulation of individual spins in the absence of the cryogenic requirement.

Quantum Physics
