Search Results for author: Jian Ren

Found 50 papers, 19 papers with code

Personalized Image Aesthetics

no code implementations ICCV 2017 Jian Ren, Xiaohui Shen, Zhe Lin, Radomir Mech, David J. Foran

To accommodate our study, we first collect two distinct datasets, a large image dataset from Flickr and annotated by Amazon Mechanical Turk, and a small dataset of real personal albums rated by owners.

Active Learning

Factorized Adversarial Networks for Unsupervised Domain Adaptation

no code implementations4 Jun 2018 Jian Ren, Jianchao Yang, Ning Xu, David J. Foran

In this paper, we propose Factorized Adversarial Networks (FAN) to solve unsupervised domain adaptation problems for image classification tasks.

General Classification Image Classification +1

EIGEN: Ecologically-Inspired GENetic Approach for Neural Network Structure Searching from Scratch

no code implementations CVPR 2019 Jian Ren, Zhe Li, Jianchao Yang, Ning Xu, Tianbao Yang, David J. Foran

In this paper, we propose an Ecologically-Inspired GENetic (EIGEN) approach that uses the concept of succession, extinction, mimicry, and gene duplication to search neural network structure from scratch with poorly initialized simple network and few constraints forced during the evolution, as we assume no prior knowledge about the task domain.

Human Motion Transfer from Poses in the Wild

no code implementations7 Apr 2020 Jian Ren, Menglei Chai, Sergey Tulyakov, Chen Fang, Xiaohui Shen, Jianchao Yang

In this paper, we tackle the problem of human motion transfer, where we synthesize novel motion video for a target person that imitates the movement from a reference video.

Translation

Neural Hair Rendering

no code implementations ECCV 2020 Menglei Chai, Jian Ren, Sergey Tulyakov

Unlike existing supervised translation methods that require model-level similarity to preserve consistent structure representation for both real images and fake renderings, our method adopts an unsupervised solution to work on arbitrary hair models.

Translation

Lottery Ticket Preserves Weight Correlation: Is It Desirable or Not?

no code implementations19 Feb 2021 Ning Liu, Geng Yuan, Zhengping Che, Xuan Shen, Xiaolong Ma, Qing Jin, Jian Ren, Jian Tang, Sijia Liu, Yanzhi Wang

In deep model compression, the recent finding "Lottery Ticket Hypothesis" (LTH) (Frankle & Carbin, 2018) pointed out that there could exist a winning ticket (i. e., a properly pruned sub-network together with original weight initialization) that can achieve competitive performance than the original dense network.

Model Compression

Teachers Do More Than Teach: Compressing Image-to-Image Models

1 code implementation CVPR 2021 Qing Jin, Jian Ren, Oliver J. Woodford, Jiazhuo Wang, Geng Yuan, Yanzhi Wang, Sergey Tulyakov

In this work, we aim to address these issues by introducing a teacher network that provides a search space in which efficient network architectures can be found, in addition to performing knowledge distillation.

Knowledge Distillation

SMIL: Multimodal Learning with Severely Missing Modality

1 code implementation9 Mar 2021 Mengmeng Ma, Jian Ren, Long Zhao, Sergey Tulyakov, Cathy Wu, Xi Peng

A common assumption in multimodal learning is the completeness of training data, i. e., full modalities are available in all training examples.

Meta-Learning

In&Out : Diverse Image Outpainting via GAN Inversion

no code implementations1 Apr 2021 Yen-Chi Cheng, Chieh Hubert Lin, Hsin-Ying Lee, Jian Ren, Sergey Tulyakov, Ming-Hsuan Yang

Existing image outpainting methods pose the problem as a conditional image-to-image translation task, often generating repetitive structures and textures by replicating the content available in the input image.

Image Outpainting Image-to-Image Translation +1

Motion Representations for Articulated Animation

2 code implementations CVPR 2021 Aliaksandr Siarohin, Oliver J. Woodford, Jian Ren, Menglei Chai, Sergey Tulyakov

To facilitate animation and prevent the leakage of the shape of the driving object, we disentangle shape and pose of objects in the region space.

Object Video Reconstruction

BinarizedAttack: Structural Poisoning Attacks to Graph-based Anomaly Detection

1 code implementation18 Jun 2021 Yulin Zhu, Yuni Lai, Kaifa Zhao, Xiapu Luo, Mingquan Yuan, Jian Ren, Kai Zhou

Graph-based Anomaly Detection (GAD) is becoming prevalent due to the powerful representation abilities of graphs as well as recent advances in graph mining techniques.

Anomaly Detection Combinatorial Optimization +2

InOut: Diverse Image Outpainting via GAN Inversion

no code implementations CVPR 2022 Yen-Chi Cheng, Chieh Hubert Lin, Hsin-Ying Lee, Jian Ren, Sergey Tulyakov, Ming-Hsuan Yang

Existing image outpainting methods pose the problem as a conditional image-to-image translation task, often generating repetitive structures and textures by replicating the content available in the input image.

Image Outpainting Image-to-Image Translation

A Critical Analysis of Image-based Camera Pose Estimation Techniques

no code implementations15 Jan 2022 Meng Xu, Youchen Wang, Bin Xu, Jun Zhang, Jian Ren, Stefan Poslad, Pengfei Xu

Camera, and associated with its objects within the field of view, localization could benefit many computer vision fields, such as autonomous driving, robot navigation, and augmented reality (AR).

Autonomous Driving Camera Localization +3

F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization

1 code implementation ICLR 2022 Qing Jin, Jian Ren, Richard Zhuang, Sumant Hanumante, Zhengang Li, Zhiyu Chen, Yanzhi Wang, Kaiyuan Yang, Sergey Tulyakov

Our approach achieves comparable and better performance, when compared not only to existing quantization techniques with INT32 multiplication or floating-point arithmetic, but also to the full-precision counterparts, achieving state-of-the-art performance.

Quantization

Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning

1 code implementation CVPR 2022 Ligong Han, Jian Ren, Hsin-Ying Lee, Francesco Barbieri, Kyle Olszewski, Shervin Minaee, Dimitris Metaxas, Sergey Tulyakov

In addition, our model can extract visual information as suggested by the text prompt, e. g., "an object in image one is moving northeast", and generate corresponding videos.

Self-Learning Text Augmentation +1

R2L: Distilling Neural Radiance Field to Neural Light Field for Efficient Novel View Synthesis

1 code implementation31 Mar 2022 Huan Wang, Jian Ren, Zeng Huang, Kyle Olszewski, Menglei Chai, Yun Fu, Sergey Tulyakov

On the other hand, Neural Light Field (NeLF) presents a more straightforward representation over NeRF in novel view synthesis -- the rendering of a pixel amounts to one single forward pass without ray-marching.

Novel View Synthesis

Are Multimodal Transformers Robust to Missing Modality?

no code implementations CVPR 2022 Mengmeng Ma, Jian Ren, Long Zhao, Davide Testuggine, Xi Peng

Based on these findings, we propose a principle method to improve the robustness of Transformer models by automatically searching for an optimal fusion strategy regarding input data.

EfficientFormer: Vision Transformers at MobileNet Speed

10 code implementations2 Jun 2022 Yanyu Li, Geng Yuan, Yang Wen, Ju Hu, Georgios Evangelidis, Sergey Tulyakov, Yanzhi Wang, Jian Ren

Our work proves that properly designed transformers can reach extremely low latency on mobile devices while maintaining high performance.

Discrete Contrastive Diffusion for Cross-Modal Music and Image Generation

1 code implementation15 Jun 2022 Ye Zhu, Yu Wu, Kyle Olszewski, Jian Ren, Sergey Tulyakov, Yan Yan

Diffusion probabilistic models (DPMs) have become a popular approach to conditional generation, due to their promising results and support for cross-modal synthesis.

Contrastive Learning Denoising +2

Computer Vision-Aided Reconfigurable Intelligent Surface-Based Beam Tracking: Prototyping and Experimental Results

no code implementations11 Jul 2022 Ming Ouyang, Yucong Wang, Feifei Gao, Shun Zhang, Puchu Li, Jian Ren

The vision-aided RIS prototype system is tested in two mobile scenarios: RIS works in near-field conditions as a passive array antenna of the base station; RIS works in far-field conditions to assist the communication between the base station and the user equipment.

Cross-Modal 3D Shape Generation and Manipulation

no code implementations24 Jul 2022 Zezhou Cheng, Menglei Chai, Jian Ren, Hsin-Ying Lee, Kyle Olszewski, Zeng Huang, Subhransu Maji, Sergey Tulyakov

In this paper, we propose a generic multi-modal generative model that couples the 2D modalities and implicit 3D representations through shared latent spaces.

3D Generation 3D Shape Generation

RA-Depth: Resolution Adaptive Self-Supervised Monocular Depth Estimation

1 code implementation25 Jul 2022 Mu He, Le Hui, Yikai Bian, Jian Ren, Jin Xie, Jian Yang

In this paper, we propose a resolution adaptive self-supervised monocular depth estimation method (RA-Depth) by learning the scale invariance of the scene depth.

Data Augmentation Monocular Depth Estimation

PIM-QAT: Neural Network Quantization for Processing-In-Memory (PIM) Systems

no code implementations18 Sep 2022 Qing Jin, Zhiyu Chen, Jian Ren, Yanyu Li, Yanzhi Wang, Kaiyuan Yang

In this paper, we propose a method for training quantized networks to incorporate PIM quantization, which is ubiquitous to all PIM systems.

Quantization

Layer Freezing & Data Sieving: Missing Pieces of a Generic Framework for Sparse Training

1 code implementation22 Sep 2022 Geng Yuan, Yanyu Li, Sheng Li, Zhenglun Kong, Sergey Tulyakov, Xulong Tang, Yanzhi Wang, Jian Ren

Therefore, we analyze the feasibility and potentiality of using the layer freezing technique in sparse training and find it has the potential to save considerable training costs.

Make-A-Story: Visual Memory Conditioned Consistent Story Generation

1 code implementation CVPR 2023 Tanzila Rahman, Hsin-Ying Lee, Jian Ren, Sergey Tulyakov, Shweta Mahajan, Leonid Sigal

Our experiments for story generation on the MUGEN, the PororoSV and the FlintstonesSV dataset show that our method not only outperforms prior state-of-the-art in generating frames with high visual quality, which are consistent with the story, but also models appropriate correspondences between the characters and the background.

Sentence Story Generation +1

SINE: SINgle Image Editing with Text-to-Image Diffusion Models

1 code implementation CVPR 2023 Zhixing Zhang, Ligong Han, Arnab Ghosh, Dimitris Metaxas, Jian Ren

We propose a novel model-based guidance built upon the classifier-free guidance so that the knowledge from the model trained on a single image can be distilled into the pre-trained diffusion model, enabling content creation even with one given image.

Image Generation

Real-Time Neural Light Field on Mobile Devices

1 code implementation CVPR 2023 Junli Cao, Huan Wang, Pavlo Chemerys, Vladislav Shakhrai, Ju Hu, Yun Fu, Denys Makoviichuk, Sergey Tulyakov, Jian Ren

Nevertheless, to reach a similar rendering quality as NeRF, the network in NeLF is designed with intensive computation, which is not mobile-friendly.

Neural Rendering Novel View Synthesis

Rethinking Vision Transformers for MobileNet Size and Speed

6 code implementations ICCV 2023 Yanyu Li, Ju Hu, Yang Wen, Georgios Evangelidis, Kamyar Salahi, Yanzhi Wang, Sergey Tulyakov, Jian Ren

With the success of Vision Transformers (ViTs) in computer vision tasks, recent arts try to optimize the performance and complexity of ViTs to enable efficient deployment on mobile devices.

Invertible Neural Skinning

no code implementations CVPR 2023 Yash Kant, Aliaksandr Siarohin, Riza Alp Guler, Menglei Chai, Jian Ren, Sergey Tulyakov, Igor Gilitschenski

Next, we combine PIN with a differentiable LBS module to build an expressive and end-to-end Invertible Neural Skinning (INS) pipeline.

3D generation on ImageNet

no code implementations2 Mar 2023 Ivan Skorokhodov, Aliaksandr Siarohin, Yinghao Xu, Jian Ren, Hsin-Ying Lee, Peter Wonka, Sergey Tulyakov

Existing 3D-from-2D generators are typically designed for well-curated single-category datasets, where all the objects have (approximately) the same scale, 3D location, and orientation, and the camera always points to the center of the scene.

3D Generation

Two-Bit RIS-Aided Communications at 3.5GHz: Some Insights from the Measurement Results Under Multiple Practical Scenes

no code implementations19 May 2023 Shun Zhang, Haoran Sun, Runze Yu, Hongshenyuan Cui, Jian Ren, Feifei Gao, Shi Jin, Hongxiang Xie, Hao Wang

In particular, we adopt a self-developed broadband intelligent communication system 40MHz-Net (BICT-40N) terminal in order to fully acquire the channel information.

Intelligent Communication Quantization

COMCAT: Towards Efficient Compression and Customization of Attention-Based Vision Models

1 code implementation26 May 2023 Jinqi Xiao, Miao Yin, Yu Gong, Xiao Zang, Jian Ren, Bo Yuan

Attention-based vision models, such as Vision Transformer (ViT) and its variants, have shown promising performance in various computer vision tasks.

Model Compression

Fast and Automatic 3D Modeling of Antenna Structure Using CNN-LSTM Network for Efficient Data Generation

no code implementations27 Jun 2023 Zhaohui Wei, Zhao Zhou, Peng Wang, Jian Ren, Yingzeng Yin, Gert Frølund Pedersen, Ming Shen

In this study, we proposed a deep learning-assisted and image-based intelligent modeling approach for accelerating the data acquisition of antenna samples with different physical structures.

Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors

1 code implementation30 Jun 2023 Guocheng Qian, Jinjie Mai, Abdullah Hamdi, Jian Ren, Aliaksandr Siarohin, Bing Li, Hsin-Ying Lee, Ivan Skorokhodov, Peter Wonka, Sergey Tulyakov, Bernard Ghanem

We present Magic123, a two-stage coarse-to-fine approach for high-quality, textured 3D meshes generation from a single unposed image in the wild using both2D and 3D priors.

Image to 3D

CLGT: A Graph Transformer for Student Performance Prediction in Collaborative Learning

1 code implementation30 Jul 2023 Tianhao Peng, Yu Liang, Wenjun Wu, Jian Ren, Zhao Pengrui, Yanjun Pu

Based on this student interaction graph, we present an extended graph transformer framework for collaborative learning (CLGT) for evaluating and predicting the performance of students.

HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion

no code implementations12 Oct 2023 Xian Liu, Jian Ren, Aliaksandr Siarohin, Ivan Skorokhodov, Yanyu Li, Dahua Lin, Xihui Liu, Ziwei Liu, Sergey Tulyakov

Our model enforces the joint learning of image appearance, spatial relationship, and geometry in a unified network, where each branch in the model complements to each other with both structural awareness and textural richness.

Image Generation

E$^{2}$GAN: Efficient Training of Efficient GANs for Image-to-Image Translation

no code implementations11 Jan 2024 Yifan Gong, Zheng Zhan, Qing Jin, Yanyu Li, Yerlan Idelbayev, Xian Liu, Andrey Zharkov, Kfir Aberman, Sergey Tulyakov, Yanzhi Wang, Jian Ren

One highly promising direction for enabling flexible real-time on-device image editing is utilizing data distillation by leveraging large-scale text-to-image diffusion models, such as Stable Diffusion, to generate paired datasets used for training generative adversarial networks (GANs).

Image-to-Image Translation

Visual Concept-driven Image Generation with Text-to-Image Diffusion Model

no code implementations18 Feb 2024 Tanzila Rahman, Shweta Mahajan, Hsin-Ying Lee, Jian Ren, Sergey Tulyakov, Leonid Sigal

We illustrate that such joint alternating refinement leads to the learning of better tokens for concepts and, as a bi-product, latent masks.

Image Generation

Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis

no code implementations22 Feb 2024 Willi Menapace, Aliaksandr Siarohin, Ivan Skorokhodov, Ekaterina Deyneka, Tsai-Shien Chen, Anil Kag, Yuwei Fang, Aleksei Stoliar, Elisa Ricci, Jian Ren, Sergey Tulyakov

Since video content is highly redundant, we argue that naively bringing advances of image models to the video generation domain reduces motion fidelity, visual quality and impairs scalability.

Image Generation Text-to-Video Generation +1

Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

no code implementations29 Feb 2024 Tsai-Shien Chen, Aliaksandr Siarohin, Willi Menapace, Ekaterina Deyneka, Hsiang-wei Chao, Byung Eun Jeon, Yuwei Fang, Hsin-Ying Lee, Jian Ren, Ming-Hsuan Yang, Sergey Tulyakov

Next, we finetune a retrieval model on a small subset where the best caption of each video is manually selected and then employ the model in the whole dataset to select the best caption as the annotation.

Retrieval Text Retrieval +3

TextCraftor: Your Text Encoder Can be Image Quality Controller

no code implementations27 Mar 2024 Yanyu Li, Xian Liu, Anil Kag, Ju Hu, Yerlan Idelbayev, Dhritiman Sagar, Yanzhi Wang, Sergey Tulyakov, Jian Ren

Our findings reveal that, instead of replacing the CLIP text encoder used in Stable Diffusion with other large language models, we can enhance it through our proposed fine-tuning approach, TextCraftor, leading to substantial improvements in quantitative benchmarks and human assessments.

Image Generation

Cannot find the paper you are looking for? You can Submit a new open access paper.