Search Results for author: Zhijie Lin

Found 24 papers, 8 papers with code

MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation

no code implementations • 9 Jan 2024 • Weimin WANG, Jiawei Liu, Zhijie Lin, Jiangqiao Yan, Shuo Chen, Chetwin Low, Tuyen Hoang, Jie Wu, Jun Hao Liew, Hanshu Yan, Daquan Zhou, Jiashi Feng

The growing demand for high-fidelity video generation from textual descriptions has catalyzed significant research in this field.

MORPH, Video Generation

ChatAnything: Facetime Chat with LLM-Enhanced Personas

no code implementations • 12 Nov 2023 • Yilin Zhao, Xinbin Yuan, ShangHua Gao, Zhijie Lin, Qibin Hou, Jiashi Feng, Daquan Zhou

For MoV, we use text-to-speech (TTS) algorithms with a variety of pre-defined tones and automatically select the best-matching one based on the user-provided text description.

In-Context Learning, Novel Concepts +2

Towards Garment Sewing Pattern Reconstruction from a Single Image

1 code implementation • 7 Nov 2023 • Lijuan Liu, Xiangyu Xu, Zhijie Lin, Jiabin Liang, Shuicheng Yan

In this work, we explore the challenging problem of recovering garment sewing patterns from daily photos for augmenting these applications.

Garment Reconstruction, Texture Synthesis +1

Unsupervised Discovery of Interpretable Directions in h-space of Pre-trained Diffusion Models

no code implementations • 15 Oct 2023 • Zijian Zhang, Luping Liu, Zhijie Lin, Yichen Zhu, Zhou Zhao

We propose the first unsupervised and learning-based method to identify interpretable directions in h-space of pre-trained diffusion models.

BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs

1 code implementation • 17 Jul 2023 • Yang Zhao, Zhijie Lin, Daquan Zhou, Zilong Huang, Jiashi Feng, Bingyi Kang

Our experiments show that BuboGPT achieves impressive multi-modality understanding and visual grounding abilities during interaction with humans.

Instruction Following, Sentence +1

Rethinking Multi-Contrast MRI Super-Resolution: Rectangle-Window Cross-Attention Transformer and Arbitrary-Scale Upsampling

no code implementations • ICCV 2023 • Guangyuan Li, Lei Zhao, Jiakai Sun, Zehua Lan, Zhanjie Zhang, Jiafu Chen, Zhijie Lin, Huaizhong Lin, Wei Xing

Recently, several methods have explored the potential of multi-contrast magnetic resonance imaging (MRI) super-resolution (SR) and obtained results superior to single-contrast SR methods.

Super-Resolution

Unsupervised Representation Learning from Pre-trained Diffusion Probabilistic Models

2 code implementations • 26 Dec 2022 • Zijian Zhang, Zhou Zhao, Zhijie Lin

These imply that the gap corresponds to the lost information of the image, and we can reconstruct the image by filling the gap.

Image Reconstruction, Representation Learning

A Survey: Deep Learning for Hyperspectral Image Classification with Few Labeled Samples

1 code implementation • 3 Dec 2021 • Sen Jia, Shuguo Jiang, Zhijie Lin, Nanying Li, Meng Xu, Shiqi Yu

In general, deep learning models often contain many trainable parameters and require a massive number of labeled samples to achieve optimal performance.

Active Learning, Few-Shot Learning +2

ST-DDPM: Explore Class Clustering for Conditional Diffusion Probabilistic Models

no code implementations • 29 Sep 2021 • Zhijie Lin, Zijian Zhang, Zhou Zhao

Score-based generative models involve sequentially corrupting the data distribution with noise and then learning to recover the data distribution based on score matching.

Clustering, Conditional Image Generation

SimulLR: Simultaneous Lip Reading Transducer with Attention-Guided Adaptive Memory

no code implementations • 31 Aug 2021 • Zhijie Lin, Zhou Zhao, Haoyuan Li, Jinglin Liu, Meng Zhang, Xingshan Zeng, Xiaofei He

Lip reading, aiming to recognize spoken sentences according to the given video of lip movements without relying on the audio stream, has attracted great interest due to its application in many scenarios.

Lip Reading

Cascaded Prediction Network via Segment Tree for Temporal Video Grounding

no code implementations • CVPR 2021 • Yang Zhao, Zhou Zhao, Zhu Zhang, Zhijie Lin

Temporal video grounding aims to localize the target segment which is semantically aligned with the given sentence in an untrimmed video.

Sentence, Video Grounding

Learning to Rehearse in Long Sequence Memorization

no code implementations • 2 Jun 2021 • Zhu Zhang, Chang Zhou, Jianxin Ma, Zhijie Lin, Jingren Zhou, Hongxia Yang, Zhou Zhao

Further, we design a history sampler to select informative fragments for rehearsal training, making the memory focus on the crucial information.

Memorization, Question Answering +1

To Learn Effective Features: Understanding the Task-Specific Adaptation of MAML

no code implementations • 1 Jan 2021 • Zhijie Lin, Zhou Zhao, Zhu Zhang, Huai Baoxing, Jing Yuan

Model-Agnostic Meta-Learning (MAML) (Finn et al., 2017) is one of the most well-known gradient-based meta-learning algorithms; it learns the meta-initialization through inner and outer optimization loops.

Contrastive Learning, Meta-Learning

Continual Memory: Can We Reason After Long-Term Memorization?

no code implementations • 1 Jan 2021 • Zhu Zhang, Chang Zhou, Zhou Zhao, Zhijie Lin, Jingren Zhou, Hongxia Yang

Existing reasoning tasks often follow the setting of "reasoning while experiencing", which rests on the important assumption that the raw contents can always be accessed while reasoning.

Memorization

Counterfactual Contrastive Learning for Weakly-Supervised Vision-Language Grounding

no code implementations • NeurIPS 2020 • Zhu Zhang, Zhou Zhao, Zhijie Lin, Jieming Zhu, Xiuqiang He

Weakly-supervised vision-language grounding aims to localize a target moment in a video or a specific region in an image according to the given sentence query, where only video-level or image-level sentence annotations are provided during training.

Contrastive Learning, counterfactual +2

Object-Aware Multi-Branch Relation Networks for Spatio-Temporal Video Grounding

no code implementations • 16 Aug 2020 • Zhu Zhang, Zhou Zhao, Zhijie Lin, Baoxing Huai, Nicholas Jing Yuan

Spatio-temporal video grounding aims to retrieve the spatio-temporal tube of a queried object according to the given sentence.

Object, Relation +4

Weakly-Supervised Video Moment Retrieval via Semantic Completion Network

no code implementations • 19 Nov 2019 • Zhijie Lin, Zhou Zhao, Zhu Zhang, Qi. Wang, Huasheng Liu

Video moment retrieval aims to find the moment that is most relevant to a given natural language query.

Moment Retrieval, Retrieval +2

Localizing Unseen Activities in Video via Image Query

no code implementations • 28 Jun 2019 • Zhu Zhang, Zhou Zhao, Zhijie Lin, Jingkuan Song, Deng Cai

Thus, we consider a new task to localize unseen activities in videos via image queries, named Image-Based Activity Localization.

Action Localization, Video Understanding

Open-Ended Long-Form Video Question Answering via Hierarchical Convolutional Self-Attention Networks

no code implementations • 28 Jun 2019 • Zhu Zhang, Zhou Zhao, Zhijie Lin, Jingkuan Song, Xiaofei He

Concretely, we first develop a hierarchical convolutional self-attention encoder to efficiently model long-form video contents, which builds the hierarchical structure for video sequences and captures question-aware long-range dependencies from video context.

Answer Generation, Question Answering +1

Cross-Modal Interaction Networks for Query-Based Moment Retrieval in Videos

1 code implementation • 6 Jun 2019 • Zhu Zhang, Zhijie Lin, Zhou Zhao, Zhenxin Xiao

Query-based moment retrieval aims to localize the most relevant moment in an untrimmed video according to the given natural language query.

Moment Retrieval, Natural Language Queries +2
