Search Results for author: Jian Ren

Found 50 papers, 19 papers with code

Personalized Image Aesthetics

no code implementations • ICCV 2017 • Jian Ren, Xiaohui Shen, Zhe Lin, Radomir Mech, David J. Foran

To accommodate our study, we first collect two distinct datasets, a large image dataset from Flickr and annotated by Amazon Mechanical Turk, and a small dataset of real personal albums rated by owners.

Active Learning

Paper
Add Code

Adversarial Domain Adaptation for Classification of Prostate Histopathology Whole-Slide Images

no code implementations • 4 Jun 2018 • Jian Ren, Ilker Hacihaliloglu, Eric A. Singer, David J. Foran, Xin Qi

Automatic and accurate Gleason grading of histopathology tissue slides is crucial for prostate cancer diagnosis, treatment, and prognosis.

General Classification Unsupervised Domain Adaptation +1

Paper
Add Code

Factorized Adversarial Networks for Unsupervised Domain Adaptation

no code implementations • 4 Jun 2018 • Jian Ren, Jianchao Yang, Ning Xu, David J. Foran

In this paper, we propose Factorized Adversarial Networks (FAN) to solve unsupervised domain adaptation problems for image classification tasks.

General Classification Image Classification +1

Paper
Add Code

EIGEN: Ecologically-Inspired GENetic Approach for Neural Network Structure Searching from Scratch

no code implementations • CVPR 2019 • Jian Ren, Zhe Li, Jianchao Yang, Ning Xu, Tianbao Yang, David J. Foran

In this paper, we propose an Ecologically-Inspired GENetic (EIGEN) approach that uses the concept of succession, extinction, mimicry, and gene duplication to search neural network structure from scratch with poorly initialized simple network and few constraints forced during the evolution, as we assume no prior knowledge about the task domain.

Paper
Add Code

Human Motion Transfer from Poses in the Wild

no code implementations • 7 Apr 2020 • Jian Ren, Menglei Chai, Sergey Tulyakov, Chen Fang, Xiaohui Shen, Jianchao Yang

In this paper, we tackle the problem of human motion transfer, where we synthesize novel motion video for a target person that imitates the movement from a reference video.

Translation

Paper
Add Code

Neural Hair Rendering

no code implementations • ECCV 2020 • Menglei Chai, Jian Ren, Sergey Tulyakov

Unlike existing supervised translation methods that require model-level similarity to preserve consistent structure representation for both real images and fake renderings, our method adopts an unsupervised solution to work on arbitrary hair models.

Translation

Paper
Add Code

Lottery Ticket Preserves Weight Correlation: Is It Desirable or Not?

no code implementations • 19 Feb 2021 • Ning Liu, Geng Yuan, Zhengping Che, Xuan Shen, Xiaolong Ma, Qing Jin, Jian Ren, Jian Tang, Sijia Liu, Yanzhi Wang

In deep model compression, the recent finding "Lottery Ticket Hypothesis" (LTH) (Frankle & Carbin, 2018) pointed out that there could exist a winning ticket (i. e., a properly pruned sub-network together with original weight initialization) that can achieve competitive performance than the original dense network.

Model Compression

Paper
Add Code

Teachers Do More Than Teach: Compressing Image-to-Image Models

1 code implementation • CVPR 2021 • Qing Jin, Jian Ren, Oliver J. Woodford, Jiazhuo Wang, Geng Yuan, Yanzhi Wang, Sergey Tulyakov

In this work, we aim to address these issues by introducing a teacher network that provides a search space in which efficient network architectures can be found, in addition to performing knowledge distillation.

Knowledge Distillation

175

Paper
Code

SMIL: Multimodal Learning with Severely Missing Modality

1 code implementation • 9 Mar 2021 • Mengmeng Ma, Jian Ren, Long Zhao, Sergey Tulyakov, Cathy Wu, Xi Peng

A common assumption in multimodal learning is the completeness of training data, i. e., full modalities are available in all training examples.

Meta-Learning

Paper
Code

In&Out : Diverse Image Outpainting via GAN Inversion

no code implementations • 1 Apr 2021 • Yen-Chi Cheng, Chieh Hubert Lin, Hsin-Ying Lee, Jian Ren, Sergey Tulyakov, Ming-Hsuan Yang

Existing image outpainting methods pose the problem as a conditional image-to-image translation task, often generating repetitive structures and textures by replicating the content available in the input image.

Image Outpainting Image-to-Image Translation +1

Paper
Add Code

Motion Representations for Articulated Animation

2 code implementations • CVPR 2021 • Aliaksandr Siarohin, Oliver J. Woodford, Jian Ren, Menglei Chai, Sergey Tulyakov

To facilitate animation and prevent the leakage of the shape of the driving object, we disentangle shape and pose of objects in the region space.

Ranked #1 on Video Reconstruction on Tai-Chi-HD (512)

Object Video Reconstruction

14,190

Paper
Code

A Good Image Generator Is What You Need for High-Resolution Video Synthesis

1 code implementation • ICLR 2021 • Yu Tian, Jian Ren, Menglei Chai, Kyle Olszewski, Xi Peng, Dimitris N. Metaxas, Sergey Tulyakov

We introduce a motion generator that discovers the desired trajectory, in which content and motion are disentangled.

Ranked #32 on Video Generation on UCF-101

Video Generation

236

Paper
Code

Flow Guided Transformable Bottleneck Networks for Motion Retargeting

no code implementations • CVPR 2021 • Jian Ren, Menglei Chai, Oliver J. Woodford, Kyle Olszewski, Sergey Tulyakov

Human motion retargeting aims to transfer the motion of one person in a "driving" video or set of images to another person.

Image Generation motion retargeting

Paper
Add Code

BinarizedAttack: Structural Poisoning Attacks to Graph-based Anomaly Detection

1 code implementation • 18 Jun 2021 • Yulin Zhu, Yuni Lai, Kaifa Zhao, Xiapu Luo, Mingquan Yuan, Jian Ren, Kai Zhou

Graph-based Anomaly Detection (GAD) is becoming prevalent due to the powerful representation abilities of graphs as well as recent advances in graph mining techniques.

Anomaly Detection Combinatorial Optimization +2

Paper
Code

InOut: Diverse Image Outpainting via GAN Inversion

no code implementations • CVPR 2022 • Yen-Chi Cheng, Chieh Hubert Lin, Hsin-Ying Lee, Jian Ren, Sergey Tulyakov, Ming-Hsuan Yang

Image Outpainting Image-to-Image Translation

Paper
Add Code

A Critical Analysis of Image-based Camera Pose Estimation Techniques

no code implementations • 15 Jan 2022 • Meng Xu, Youchen Wang, Bin Xu, Jun Zhang, Jian Ren, Stefan Poslad, Pengfei Xu

Camera, and associated with its objects within the field of view, localization could benefit many computer vision fields, such as autonomous driving, robot navigation, and augmented reality (AR).

Autonomous Driving Camera Localization +3

Paper
Add Code

F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization

1 code implementation • ICLR 2022 • Qing Jin, Jian Ren, Richard Zhuang, Sumant Hanumante, Zhengang Li, Zhiyu Chen, Yanzhi Wang, Kaiyuan Yang, Sergey Tulyakov

Our approach achieves comparable and better performance, when compared not only to existing quantization techniques with INT32 multiplication or floating-point arithmetic, but also to the full-precision counterparts, achieving state-of-the-art performance.

Quantization

Paper
Code

Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning

1 code implementation • CVPR 2022 • Ligong Han, Jian Ren, Hsin-Ying Lee, Francesco Barbieri, Kyle Olszewski, Shervin Minaee, Dimitris Metaxas, Sergey Tulyakov

In addition, our model can extract visual information as suggested by the text prompt, e. g., "an object in image one is moving northeast", and generate corresponding videos.

Self-Learning Text Augmentation +1

186

Paper
Code

R2L: Distilling Neural Radiance Field to Neural Light Field for Efficient Novel View Synthesis

1 code implementation • 31 Mar 2022 • Huan Wang, Jian Ren, Zeng Huang, Kyle Olszewski, Menglei Chai, Yun Fu, Sergey Tulyakov

On the other hand, Neural Light Field (NeLF) presents a more straightforward representation over NeRF in novel view synthesis -- the rendering of a pixel amounts to one single forward pass without ray-marching.

Novel View Synthesis

182

Paper
Code

Are Multimodal Transformers Robust to Missing Modality?

no code implementations • CVPR 2022 • Mengmeng Ma, Jian Ren, Long Zhao, Davide Testuggine, Xi Peng

Based on these findings, we propose a principle method to improve the robustness of Transformer models by automatically searching for an optimal fusion strategy regarding input data.

Paper
Add Code

EfficientFormer: Vision Transformers at MobileNet Speed

10 code implementations • 2 Jun 2022 • Yanyu Li, Geng Yuan, Yang Wen, Ju Hu, Georgios Evangelidis, Sergey Tulyakov, Yanzhi Wang, Jian Ren

Our work proves that properly designed transformers can reach extremely low latency on mobile devices while maintaining high performance.

29,680

Paper
Code

Discrete Contrastive Diffusion for Cross-Modal Music and Image Generation

1 code implementation • 15 Jun 2022 • Ye Zhu, Yu Wu, Kyle Olszewski, Jian Ren, Sergey Tulyakov, Yan Yan

Diffusion probabilistic models (DPMs) have become a popular approach to conditional generation, due to their promising results and support for cross-modal synthesis.

Contrastive Learning Denoising +2

151

Paper
Code

Computer Vision-Aided Reconfigurable Intelligent Surface-Based Beam Tracking: Prototyping and Experimental Results

no code implementations • 11 Jul 2022 • Ming Ouyang, Yucong Wang, Feifei Gao, Shun Zhang, Puchu Li, Jian Ren

The vision-aided RIS prototype system is tested in two mobile scenarios: RIS works in near-field conditions as a passive array antenna of the base station; RIS works in far-field conditions to assist the communication between the base station and the user equipment.

Paper
Add Code

Cross-Modal 3D Shape Generation and Manipulation

no code implementations • 24 Jul 2022 • Zezhou Cheng, Menglei Chai, Jian Ren, Hsin-Ying Lee, Kyle Olszewski, Zeng Huang, Subhransu Maji, Sergey Tulyakov

In this paper, we propose a generic multi-modal generative model that couples the 2D modalities and implicit 3D representations through shared latent spaces.

3D Generation 3D Shape Generation

Paper
Add Code

RA-Depth: Resolution Adaptive Self-Supervised Monocular Depth Estimation

1 code implementation • 25 Jul 2022 • Mu He, Le Hui, Yikai Bian, Jian Ren, Jin Xie, Jian Yang

In this paper, we propose a resolution adaptive self-supervised monocular depth estimation method (RA-Depth) by learning the scale invariance of the scene depth.

Data Augmentation Monocular Depth Estimation

Paper
Code

PIM-QAT: Neural Network Quantization for Processing-In-Memory (PIM) Systems

no code implementations • 18 Sep 2022 • Qing Jin, Zhiyu Chen, Jian Ren, Yanyu Li, Yanzhi Wang, Kaiyuan Yang

In this paper, we propose a method for training quantized networks to incorporate PIM quantization, which is ubiquitous to all PIM systems.

Quantization

Paper
Add Code

Layer Freezing & Data Sieving: Missing Pieces of a Generic Framework for Sparse Training

1 code implementation • 22 Sep 2022 • Geng Yuan, Yanyu Li, Sheng Li, Zhenglun Kong, Sergey Tulyakov, Xulong Tang, Yanzhi Wang, Jian Ren

Therefore, we analyze the feasibility and potentiality of using the layer freezing technique in sparse training and find it has the potential to save considerable training costs.

Paper
Code

Make-A-Story: Visual Memory Conditioned Consistent Story Generation

1 code implementation • CVPR 2023 • Tanzila Rahman, Hsin-Ying Lee, Jian Ren, Sergey Tulyakov, Shweta Mahajan, Leonid Sigal

Our experiments for story generation on the MUGEN, the PororoSV and the FlintstonesSV dataset show that our method not only outperforms prior state-of-the-art in generating frames with high visual quality, which are consistent with the story, but also models appropriate correspondences between the characters and the background.

Sentence Story Generation +1

Paper
Code

SINE: SINgle Image Editing with Text-to-Image Diffusion Models

1 code implementation • CVPR 2023 • Zhixing Zhang, Ligong Han, Arnab Ghosh, Dimitris Metaxas, Jian Ren

We propose a novel model-based guidance built upon the classifier-free guidance so that the knowledge from the model trained on a single image can be distilled into the pre-trained diffusion model, enabling content creation even with one given image.

Image Generation

173

Paper
Code

Real-Time Neural Light Field on Mobile Devices

1 code implementation • CVPR 2023 • Junli Cao, Huan Wang, Pavlo Chemerys, Vladislav Shakhrai, Ju Hu, Yun Fu, Denys Makoviichuk, Sergey Tulyakov, Jian Ren

Nevertheless, to reach a similar rendering quality as NeRF, the network in NeLF is designed with intensive computation, which is not mobile-friendly.

Neural Rendering Novel View Synthesis

185

Paper
Code

Rethinking Vision Transformers for MobileNet Size and Speed

6 code implementations • ICCV 2023 • Yanyu Li, Ju Hu, Yang Wen, Georgios Evangelidis, Kamyar Salahi, Yanzhi Wang, Sergey Tulyakov, Jian Ren

With the success of Vision Transformers (ViTs) in computer vision tasks, recent arts try to optimize the performance and complexity of ViTs to enable efficient deployment on mobile devices.

29,680

Paper
Code

Unsupervised Volumetric Animation

no code implementations • CVPR 2023 • Aliaksandr Siarohin, Willi Menapace, Ivan Skorokhodov, Kyle Olszewski, Jian Ren, Hsin-Ying Lee, Menglei Chai, Sergey Tulyakov

We propose a novel approach for unsupervised 3D animation of non-rigid deformable objects.

Keypoint Estimation Novel View Synthesis

Paper
Add Code

Invertible Neural Skinning

no code implementations • CVPR 2023 • Yash Kant, Aliaksandr Siarohin, Riza Alp Guler, Menglei Chai, Jian Ren, Sergey Tulyakov, Igor Gilitschenski

Next, we combine PIN with a differentiable LBS module to build an expressive and end-to-end Invertible Neural Skinning (INS) pipeline.

Paper
Add Code

3D generation on ImageNet

no code implementations • 2 Mar 2023 • Ivan Skorokhodov, Aliaksandr Siarohin, Yinghao Xu, Jian Ren, Hsin-Ying Lee, Peter Wonka, Sergey Tulyakov

Existing 3D-from-2D generators are typically designed for well-curated single-category datasets, where all the objects have (approximately) the same scale, 3D location, and orientation, and the camera always points to the center of the scene.

3D Generation

Paper
Add Code

Two-Bit RIS-Aided Communications at 3.5GHz: Some Insights from the Measurement Results Under Multiple Practical Scenes

no code implementations • 19 May 2023 • Shun Zhang, Haoran Sun, Runze Yu, Hongshenyuan Cui, Jian Ren, Feifei Gao, Shi Jin, Hongxiang Xie, Hao Wang

In particular, we adopt a self-developed broadband intelligent communication system 40MHz-Net (BICT-40N) terminal in order to fully acquire the channel information.

Intelligent Communication Quantization

Paper
Add Code

COMCAT: Towards Efficient Compression and Customization of Attention-Based Vision Models

1 code implementation • 26 May 2023 • Jinqi Xiao, Miao Yin, Yu Gong, Xiao Zang, Jian Ren, Bo Yuan

Attention-based vision models, such as Vision Transformer (ViT) and its variants, have shown promising performance in various computer vision tasks.

Model Compression

Paper
Code

SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds

no code implementations • NeurIPS 2023 • Yanyu Li, Huan Wang, Qing Jin, Ju Hu, Pavlo Chemerys, Yun Fu, Yanzhi Wang, Sergey Tulyakov, Jian Ren

We achieve so by introducing efficient network architecture and improving step distillation.

Denoising

Paper
Add Code

Robust and Efficient Fault Diagnosis of mm-Wave Active Phased Arrays using Baseband Signal

no code implementations • 7 Jun 2023 • Martin H. Nielsen, Yufeng Zhang, Changbin Xue, Jian Ren, Yingzeng Yin, Ming Shen, Gert F. Pedersen

One key communication block in 5G and 6G radios is the active phased array (APA).

Fault Detection

Paper
Add Code

Fast and Automatic 3D Modeling of Antenna Structure Using CNN-LSTM Network for Efficient Data Generation

no code implementations • 27 Jun 2023 • Zhaohui Wei, Zhao Zhou, Peng Wang, Jian Ren, Yingzeng Yin, Gert Frølund Pedersen, Ming Shen

In this study, we proposed a deep learning-assisted and image-based intelligent modeling approach for accelerating the data acquisition of antenna samples with different physical structures.

Paper
Add Code

Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors

1 code implementation • 30 Jun 2023 • Guocheng Qian, Jinjie Mai, Abdullah Hamdi, Jian Ren, Aliaksandr Siarohin, Bing Li, Hsin-Ying Lee, Ivan Skorokhodov, Peter Wonka, Sergey Tulyakov, Bernard Ghanem

We present Magic123, a two-stage coarse-to-fine approach for high-quality, textured 3D meshes generation from a single unposed image in the wild using both2D and 3D priors.

Image to 3D

1,453

Paper
Code

CLGT: A Graph Transformer for Student Performance Prediction in Collaborative Learning

1 code implementation • 30 Jul 2023 • Tianhao Peng, Yu Liang, Wenjun Wu, Jian Ren, Zhao Pengrui, Yanjun Pu

Based on this student interaction graph, we present an extended graph transformer framework for collaborative learning (CLGT) for evaluating and predicting the performance of students.

Paper
Code

HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion

no code implementations • 12 Oct 2023 • Xian Liu, Jian Ren, Aliaksandr Siarohin, Ivan Skorokhodov, Yanyu Li, Dahua Lin, Xihui Liu, Ziwei Liu, Sergey Tulyakov

Our model enforces the joint learning of image appearance, spatial relationship, and geometry in a unified network, where each branch in the model complements to each other with both structural awareness and textural richness.

Image Generation

Paper
Add Code

iNVS: Repurposing Diffusion Inpainters for Novel View Synthesis

no code implementations • 24 Oct 2023 • Yash Kant, Aliaksandr Siarohin, Michael Vasilkovsky, Riza Alp Guler, Jian Ren, Sergey Tulyakov, Igor Gilitschenski

Our approach focuses on maximizing the reuse of visible pixels from the source image.

Novel View Synthesis

Paper
Add Code

E$^{2}$GAN: Efficient Training of Efficient GANs for Image-to-Image Translation

no code implementations • 11 Jan 2024 • Yifan Gong, Zheng Zhan, Qing Jin, Yanyu Li, Yerlan Idelbayev, Xian Liu, Andrey Zharkov, Kfir Aberman, Sergey Tulyakov, Yanzhi Wang, Jian Ren

One highly promising direction for enabling flexible real-time on-device image editing is utilizing data distillation by leveraging large-scale text-to-image diffusion models, such as Stable Diffusion, to generate paired datasets used for training generative adversarial networks (GANs).

Image-to-Image Translation

Paper
Add Code

AToM: Amortized Text-to-Mesh using 2D Diffusion

no code implementations • 1 Feb 2024 • Guocheng Qian, Junli Cao, Aliaksandr Siarohin, Yash Kant, Chaoyang Wang, Michael Vasilkovsky, Hsin-Ying Lee, Yuwei Fang, Ivan Skorokhodov, Peiye Zhuang, Igor Gilitschenski, Jian Ren, Bernard Ghanem, Kfir Aberman, Sergey Tulyakov

We introduce Amortized Text-to-Mesh (AToM), a feed-forward text-to-mesh framework optimized across multiple text prompts simultaneously.

Text to 3D

Paper
Add Code

SPAD : Spatially Aware Multiview Diffusers

no code implementations • 7 Feb 2024 • Yash Kant, Ziyi Wu, Michael Vasilkovsky, Guocheng Qian, Jian Ren, Riza Alp Guler, Bernard Ghanem, Sergey Tulyakov, Igor Gilitschenski, Aliaksandr Siarohin

We present SPAD, a novel approach for creating consistent multi-view images from text prompts or single images.

3D Generation Novel View Synthesis +1

Paper
Add Code

Visual Concept-driven Image Generation with Text-to-Image Diffusion Model

no code implementations • 18 Feb 2024 • Tanzila Rahman, Shweta Mahajan, Hsin-Ying Lee, Jian Ren, Sergey Tulyakov, Leonid Sigal

We illustrate that such joint alternating refinement leads to the learning of better tokens for concepts and, as a bi-product, latent masks.

Image Generation

Paper
Add Code

Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis

no code implementations • 22 Feb 2024 • Willi Menapace, Aliaksandr Siarohin, Ivan Skorokhodov, Ekaterina Deyneka, Tsai-Shien Chen, Anil Kag, Yuwei Fang, Aleksei Stoliar, Elisa Ricci, Jian Ren, Sergey Tulyakov

Since video content is highly redundant, we argue that naively bringing advances of image models to the video generation domain reduces motion fidelity, visual quality and impairs scalability.

Ranked #1 on Text-to-Video Generation on MSR-VTT

Image Generation Text-to-Video Generation +1

Paper
Add Code

Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

no code implementations • 29 Feb 2024 • Tsai-Shien Chen, Aliaksandr Siarohin, Willi Menapace, Ekaterina Deyneka, Hsiang-wei Chao, Byung Eun Jeon, Yuwei Fang, Hsin-Ying Lee, Jian Ren, Ming-Hsuan Yang, Sergey Tulyakov

Next, we finetune a retrieval model on a small subset where the best caption of each video is manually selected and then employ the model in the whole dataset to select the best caption as the annotation.

Retrieval Text Retrieval +3

Paper
Add Code

TextCraftor: Your Text Encoder Can be Image Quality Controller

no code implementations • 27 Mar 2024 • Yanyu Li, Xian Liu, Anil Kag, Ju Hu, Yerlan Idelbayev, Dhritiman Sagar, Yanzhi Wang, Sergey Tulyakov, Jian Ren

Our findings reveal that, instead of replacing the CLIP text encoder used in Stable Diffusion with other large language models, we can enhance it through our proposed fine-tuning approach, TextCraftor, leading to substantial improvements in quantitative benchmarks and human assessments.

Image Generation

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.