no code implementations • 13 Mar 2025 • Zexuan Yan, Yue Ma, Chang Zou, Wenteng Chen, Qifeng Chen, Linfeng Zhang
Inversion-based image editing is rapidly gaining momentum while suffering from significant computation overhead, hindering its application in real-time interactive scenarios.
no code implementations • 28 Jan 2025 • Shuai Chen, Yong Zu, Zhixi Feng, Shuyuan Yang, Mengchang Li, Yue Ma, Jun Liu, Qiukai Pan, Xinlei Zhang, Changjun Sun
HPTR enables the integration of radio signal features with expert knowledge, while FAF improves the modeling of high-frequency features critical for precise signal processing.
1 code implementation • 13 Jan 2025 • Zhen Xiong, Yuqi Li, Chuanguang Yang, Tiao Tan, Zhihong Zhu, Siyuan Li, Yue Ma
We find that deeper layers are always responsible for high-level content control, while shallow layers handle low-level content control.
no code implementations • 13 Jan 2025 • Junlong Liu, Yue Ma, Ruihui Zhao, Junhao Zheng, Qianli Ma, Yangyang Kang
Reranker models aim to re-rank passages based on the semantic similarity between the given query and the passages, and have recently received more attention due to the wide application of Retrieval-Augmented Generation.
no code implementations • 8 Jan 2025 • Siran Chen, Yuxiao Luo, Yue Ma, Yu Qiao, Yali Wang
Consequently, it can adaptively integrate all the video contexts of multi-scale temporal resolutions to enhance video understanding.
no code implementations • 21 Dec 2024 • Beiyuan Zhang, Yue Ma, Chunlei Fu, Xinyang Song, Zhenan Sun, Ziqiang Li
To tackle this, we propose a novel tuning-free multi-character video generation framework based on separated text and pose guidance.
no code implementations • 9 Dec 2024 • Zhen Wan, Yue Ma, Chenyang Qi, Zhiheng Liu, Tao Gui
In this paper, we present UniPaint, a unified generative space-time video inpainting framework that enables spatial-temporal inpainting and interpolation.
no code implementations • 8 Dec 2024 • Yue Ma, Huantao Ren, Boyu Wang, Jingang Jin, Senem Velipasalar, Qinru Qiu
Traditional CLIP-based classification methods identify the most similar text label for a test image by comparing their embeddings.
1 code implementation • 2 Dec 2024 • Chenyang Zhu, Kai Li, Yue Ma, Longxiang Tang, Chengyu Fang, Chubin Chen, Qifeng Chen, Xiu Li
They struggle to maintain consistency in both the foreground and background during concept swapping, especially when the shape difference is large between objects.
1 code implementation • 7 Nov 2024 • Jiangshan Wang, Junfu Pu, Zhongang Qi, Jiayi Guo, Yue Ma, Nisha Huang, Yuxin Chen, Xiu Li, Ying Shan
To address this issue, we propose RF-Solver, a novel training-free sampler that effectively enhances inversion precision by mitigating the errors in the ODE-solving process of rectified flow.
no code implementations • 5 Nov 2024 • Kunyu Feng, Yue Ma, Bingyuan Wang, Chenyang Qi, Haozhe Chen, Qifeng Chen, Zeyu Wang
Compared to UNet, Diffusion Transformers (DiT) demonstrate superior capabilities to effectively capture the long-range dependencies among patches, leading to higher-quality image generation.
no code implementations • 9 Sep 2024 • Zhongbin Sun, Xiaolong Li, Yiran Li, Yue Ma
In the present study, a novel memoryless method, MDSS, is proposed for multimodal anomaly detection. It employs a lightweight student-teacher network and a signed distance function to learn from RGB images and 3D point clouds, respectively, and combines the complementary anomaly information from the two modalities.
1 code implementation • 2 Sep 2024 • Qihua Chen, Yue Ma, Hongfa Wang, Junkun Yuan, Wenzhe Zhao, Qi Tian, Hongmei Wang, Shaobo Min, Qifeng Chen, Wei Liu
Coupling these two designs enables us to generate higher-resolution outpainting videos with rich content while keeping spatial and temporal consistency.
1 code implementation • 12 Aug 2024 • Zhichao Liao, Di Huang, Heming Fang, Yue Ma, Fengyuan Piao, Xinghui Li, Long Zeng, Pingfa Feng
To address this issue, we design a two-stage generative framework called MSFormer that mimics human sketching behavior and is the first to produce humanoid freehand sketches tailored for mechanical components.
1 code implementation • 27 Jul 2024 • Dongzhuoran Zhou, Hui Yang, Bo Xiong, Yue Ma, Evgeny Kharlamov
To this end, we propose the aggregation over compact manifolds (ACM) method, which replaces the existing information aggregation with aggregation over compact manifolds, a special type of manifold, and thereby avoids contracted aggregations.
1 code implementation • 13 Jun 2024 • Jiangshan Wang, Yue Ma, Jiayi Guo, Yicheng Xiao, Gao Huang, Xiu Li
Specifically, we propose an efficient sliding-window-based strategy to calculate the similarity among tokens in the diffusion features of source videos, identifying the tokens with high correspondence across frames.
no code implementations • 5 Jun 2024 • Jingyun Xue, Hongfa Wang, Qi Tian, Yue Ma, Andong Wang, Zhiyuan Zhao, Shaobo Min, Wenzhe Zhao, Kaihao Zhang, Heung-Yeung Shum, Wei Liu, Mengyang Liu, Wenhan Luo
While existing character image animation methods using pose sequences and reference images have shown promising performance, they tend to struggle with incoherent animation in complex scenarios, such as multiple character animation and body occlusion.
no code implementations • 4 Jun 2024 • Yue Ma, Hongyu Liu, Hongfa Wang, Heng Pan, Yingqing He, Junkun Yuan, Ailing Zeng, Chengfei Cai, Heung-Yeung Shum, Wei Liu, Qifeng Chen
We present Follow-Your-Emoji, a diffusion-based framework for portrait animation, which animates a reference portrait with target landmark sequences.
1 code implementation • 22 Apr 2024 • Chenyang Zhu, Kai Li, Yue Ma, Chunming He, Xiu Li
MultiBooth addresses these issues by dividing the multi-concept generation process into two phases: a single-concept learning phase and a multi-concept integration phase.
1 code implementation • 29 Mar 2024 • JianFeng Cai, Yue Ma, Zhixi Feng, Shuyuan Yang
In addition, this work has implications for how to efficiently utilize the multi-features of PolSAR data to learn better high-level representations in CL and how to better construct networks suited to PolSAR data.
1 code implementation • 13 Mar 2024 • Yue Ma, Yingqing He, Hongfa Wang, Andong Wang, Chenyang Qi, Chengfei Cai, Xiu Li, Zhifeng Li, Heung-Yeung Shum, Wei Liu, Qifeng Chen
Despite recent advances in image-to-video generation, better controllability and local animation are less explored.
1 code implementation • 23 Jan 2024 • Mingyang Li, Yue Ma, Qinru Qiu
This approach enables the creation of a semantic map of the environment and ensures reliable camera localization.
1 code implementation • 19 Dec 2023 • Siran Chen, Yue Ma, Yu Qiao, Yali Wang
It mimics various missing cases by randomly masking features of different camera views, then leverages the original features of these views as self-supervision, and reconstructs the masked ones with the distinct spatio-temporal context across views.
no code implementations • 18 Dec 2023 • Bingyuan Wang, Hengyu Meng, Zeyu Cai, Lanjiong Li, Yue Ma, Qifeng Chen, Zeyu Wang
Visual storytelling often uses nontypical aspect-ratio images like scroll paintings, comic strips, and panoramas to create an expressive and compelling narrative.
1 code implementation • 5 Dec 2023 • Yue Ma, Xiaodong Cun, Sen Liang, Jinbo Xing, Yingqing He, Chenyang Qi, Siran Chen, Qifeng Chen
Though succinct, our method is the first to show the ability of video property editing from a pre-trained text-to-image model.
1 code implementation • CVPR 2024 • Yicheng Xiao, Zhuoyan Luo, Yong Liu, Yue Ma, Hengwei Bian, Yatai Ji, Yujiu Yang, Xiu Li
Video Moment Retrieval (MR) and Highlight Detection (HD) have attracted significant attention due to the growing demand for video analysis.
Ranked #2 on Highlight Detection on YouTube Highlights
no code implementations • 16 May 2023 • Hui Yang, Patrick Koopmann, Yue Ma, Nicole Bidoit
Our evaluation indicates that our general modules are often smaller than classical modules and uniform interpolants computed by state-of-the-art methods and, compared with uniform interpolants, can be computed in significantly less time.
2 code implementations • 3 Apr 2023 • Yue Ma, Yingqing He, Xiaodong Cun, Xintao Wang, Siran Chen, Ying Shan, Xiu Li, Qifeng Chen
Generating text-editable and pose-controllable character videos is in pressing demand for creating various digital humans.
no code implementations • 12 Feb 2023 • Yicheng Xiao, Yue Ma, Shuyan Li, Hantao Zhou, Ran Liao, Xiu Li
In this paper, we propose SemanticAC, a semantics-assisted framework for Audio Classification to better leverage the semantic information.
no code implementations • 7 Dec 2022 • Yue Ma, Tianyu Yang, Yin Shan, Xiu Li
This paper presents SimVTP: a Simple Video-Text Pretraining framework via masked autoencoders.
Ranked #24 on Moment Retrieval on Charades-STA
no code implementations • 22 Oct 2021 • Marui Du, Yue Ma, Zuoquan Zhang
Nowadays small and medium-sized enterprises have become an essential part of the national economy.
no code implementations • 23 Sep 2021 • Jieying Chen, Yue Ma, Rafael Peñaloza, Hui Yang
We present a new algorithm for computing the union and intersection of all justifications for a given ontological consequence without first computing the set of all justifications.
no code implementations • 3 Nov 2020 • Jiacheng Wang, Yue Ma, Shuang Gao
The remarkable success of today's deep neural networks depends heavily on a massive amount of correctly labeled data.
no code implementations • 19 Dec 2019 • Yue Ma, Zengfeng Zeng, Dawei Zhu, Xuan Li, Yiying Yang, Xiaoyuan Yao, Kaijie Zhou, Jianping Shen
This paper describes our approach in DSTC 8 Track 4: Schema-Guided Dialogue State Tracking.
no code implementations • 2 Nov 2019 • Sanjay Kamath, Brigitte Grau, Yue Ma
We find an open-domain question answering model to be a better fit for this task than a reading comprehension model.
1 code implementation • 31 Oct 2019 • Yue Ma, Xiaojie Wang, Zhenjiang Dong, Hong Chen
Dialogue embeddings are learned by an LSTM in the middle of the network, and updated by feeding in all turn embeddings.
no code implementations • WS 2018 • Sanjay Kamath, Brigitte Grau, Yue Ma
The BIOASQ Task B Phase B challenge focuses on extracting answers from snippets for a given question.
no code implementations • 1 Jun 2014 • Said Jabbour, Yue Ma, Badran Raddaoui, Lakhdar Sais, Yakoub Salhi
One particular MUS-decomposition, named distributable MUS-decomposition, leads to an interesting partition of inconsistencies in a knowledge base such that multiple experts can check inconsistencies in parallel, which is impossible under existing measures.
no code implementations • 9 Jan 2013 • Xiaowang Zhang, Kewen Wang, Zhe Wang, Yue Ma, Guilin Qi
DL-Lite is an important family of description logics.