no code implementations • 6 Dec 2024 • Bohan Li, Jiazhe Guo, Hongsi Liu, Yingshuang Zou, Yikang Ding, Xiwu Chen, Hu Zhu, Feiyang Tan, Chi Zhang, Tiancai Wang, Shuchang Zhou, Li Zhang, Xiaojuan Qi, Hao Zhao, Mu Yang, Wenjun Zeng, Xin Jin
UniScene employs a progressive generation process that decomposes the complex task of scene generation into two hierarchical steps: (a) first generating semantic occupancy from a customized scene layout as a meta scene representation rich in both semantic and geometric information, and then (b) conditioned on occupancy, generating video and LiDAR data, respectively, with two novel transfer strategies of Gaussian-based Joint Rendering and Prior-guided Sparse Modeling.
no code implementations • 4 Dec 2024 • Ruibo Ming, Jingwei Wu, Zhewei Huang, Zhuoxuan Ju, Jianming Hu, Lihui Peng, Shuchang Zhou
Recent advances in auto-regressive large language models (LLMs) have shown their potential in generating high-quality text, inspiring researchers to apply them to image and video generation.
no code implementations • 16 Oct 2024 • Linfeng Xu, Fanman Meng, Qingbo Wu, Lili Pan, Heqian Qiu, Lanxiao Wang, Kailong Chen, Kanglei Geng, Yilei Qian, Haojie Wang, Shuchang Zhou, Shimou Ling, Zejia Liu, Nanlin Chen, YingJie Xu, Shaoxu Cheng, Bowen Tan, Ziyong Xu, Hongliang Li
The ARIC dataset has advantages of multiple perspectives, 32 activity categories, three modalities, and real-world classroom scenarios.
no code implementations • 5 Sep 2024 • Jing Cui, Yishi Xu, Zhewei Huang, Shuchang Zhou, Jianbin Jiao, Junge Zhang
Given the extensive research in the field of LLM security, we believe that summarizing the current state of affairs will help the research community better understand the present landscape and inform future developments.
1 code implementation • 9 Jul 2024 • Shuangkang Fang, Yufeng Wang, Yi-Hsuan Tsai, Yi Yang, Wenrui Ding, Shuchang Zhou, Ming-Hsuan Yang
Recent work on image content manipulation based on vision-language pre-training models has been effectively extended to text-driven 3D scene editing.
no code implementations • 26 Jan 2024 • Ruibo Ming, Zhewei Huang, Zhuoxuan Ju, Jianming Hu, Lihui Peng, Shuchang Zhou
Future Frame Synthesis (FFS) aims to enable models to generate sequences of future frames based on existing content.
1 code implementation • 26 Oct 2023 • Zhewei Huang, Ailin Huang, Xiaotao Hu, Chen Hu, Jun Xu, Shuchang Zhou
The Space-Time Video Super-Resolution (STVSR) task aims to enhance the visual quality of videos, by simultaneously performing video frame interpolation (VFI) and video super-resolution (VSR).
Space-time Video Super-resolution • Video Frame Interpolation
no code implementations • 10 Sep 2023 • Shuangkang Fang, Yufeng Wang, Yi Yang, Yi-Hsuan Tsai, Wenrui Ding, Shuchang Zhou, Ming-Hsuan Yang
To tackle these issues, we introduce a text-driven editing method, termed DN2N, which allows for the direct acquisition of a NeRF model with universal editing capabilities, eliminating the requirement for retraining.
no code implementations • 10 Aug 2023 • Miao Fan, Chen Hu, Shuchang Zhou
In this paper, we introduce a simple task designed to employ Golden as a reward model that validates the effectiveness of PPO and inspires it, primarily explaining the task of utilizing PPO to manipulate the tokenizer length of the output generated by the model.
1 code implementation • 8 Apr 2023 • Shuangkang Fang, Yufeng Wang, Yi Yang, Weixin Xu, Heng Wang, Wenrui Ding, Shuchang Zhou
For instance, PVD-AL can distill an MLP-based model from a Hashtables-based model at a 10~20X faster speed and 0.8dB~2dB higher PSNR than training the MLP-based model from scratch.
1 code implementation • CVPR 2023 • Shengchao Zhou, Weizhou Liu, Chen Hu, Shuchang Zhou, Chao Ma
In the field of 3D object detection for autonomous driving, the sensor portfolio including multi-modality and single-modality is diverse and complex.
1 code implementation • CVPR 2023 • Xiaotao Hu, Zhewei Huang, Ailin Huang, Jun Xu, Shuchang Zhou
The performance of video prediction has been greatly boosted by advanced deep neural networks.
Ranked #1 on Video Prediction on Cityscapes
1 code implementation • CVPR 2023 • Yun-Hao Cao, Peiqin Sun, Shuchang Zhou
We propose universally slimmable self-supervised learning (dubbed as US3L) to achieve better accuracy-efficiency trade-offs for deploying self-supervised models across different devices.
1 code implementation • 27 Feb 2023 • Ruihang Miao, Weizhou Liu, Mingrui Chen, Zheng Gong, Weixin Xu, Chen Hu, Shuchang Zhou
3D Semantic Scene Completion (SSC) can provide dense geometric and semantic scene representations, which can be applied in the field of autonomous driving and robotic systems.
Ranked #15 on 3D Semantic Scene Completion on SemanticKITTI
no code implementations • ICCV 2023 • Miao Fan, Mingrui Chen, Chen Hu, Shuchang Zhou
Image matching is a fundamental and critical task in various visual applications, such as Simultaneous Localization and Mapping (SLAM) and image retrieval, which require accurate pose estimation.
1 code implementation • 29 Nov 2022 • Shuangkang Fang, Weixin Xu, Heng Wang, Yi Yang, Yufeng Wang, Shuchang Zhou
In this paper, we propose Progressive Volume Distillation (PVD), a systematic distillation method that allows any-to-any conversions between different architectures, including MLP, sparse or low-rank tensors, hashtables and their compositions.
Ranked #1 on Novel View Synthesis on NeRF (Average PSNR metric)
1 code implementation • 12 Jul 2022 • Yun-Hao Cao, Peiqin Sun, Yechang Huang, Jianxin Wu, Shuchang Zhou
In this paper, we propose a method called synergistic self-supervised and quantization learning (SSQL) to pretrain quantization-friendly self-supervised models facilitating downstream deployment.
1 code implementation • 26 Jun 2022 • Ailin Huang, Zhewei Huang, Shuchang Zhou
This paper reports our solution for ACM Multimedia ViCo 2022 Conversational Head Generation Challenge, which aims to generate vivid face-to-face conversation videos based on audio and reference images.
2 code implementations • 4 Mar 2022 • Maxime Gasse, Quentin Cappart, Jonas Charfreitag, Laurent Charlin, Didier Chételat, Antonia Chmiela, Justin Dumouchelle, Ambros Gleixner, Aleksandr M. Kazachkov, Elias Khalil, Pawel Lichocki, Andrea Lodi, Miles Lubin, Chris J. Maddison, Christopher Morris, Dimitri J. Papageorgiou, Augustin Parjadis, Sebastian Pokutta, Antoine Prouvost, Lara Scavuzzo, Giulia Zarpellon, Linxin Yang, Sha Lai, Akang Wang, Xiaodong Luo, Xiang Zhou, Haohan Huang, Shengcheng Shao, Yuanming Zhu, Dong Zhang, Tao Quan, Zixuan Cao, Yang Xu, Zhewei Huang, Shuchang Zhou, Chen Binbin, He Minggui, Hao Hao, Zhang Zhiyu, An Zhiwu, Mao Kun
Combinatorial optimization is a well-established area in operations research and computer science.
1 code implementation • 25 Jan 2022 • Zixuan Cao, Yang Xu, Zhewei Huang, Shuchang Zhou
The Machine Learning for Combinatorial Optimization (ML4CO) NeurIPS 2021 competition aims to improve state-of-the-art combinatorial optimization solvers by replacing key heuristic components with machine learning models.
1 code implementation • 27 Nov 2021 • Yang Lin, Tianyu Zhang, Peiqin Sun, Zheng Li, Shuchang Zhou
Network quantization significantly reduces model inference complexity and has been widely used in real-world deployments.
Ranked #1 on Quantization on ImageNet
1 code implementation • 1 Nov 2021 • Weixin Xu, Zipeng Feng, Shuangkang Fang, Song Yuan, Yi Yang, Shuchang Zhou
For example, Transformer Networks do not have native support on many popular chips, and hence are difficult to deploy.
no code implementations • EACL 2021 • Yuekai Zhao, Shuchang Zhou, Zhihua Zhang
Large-scale transformers have been shown to achieve state-of-the-art results on neural machine translation.
13 code implementations • 12 Nov 2020 • Zhewei Huang, Tianyuan Zhang, Wen Heng, Boxin Shi, Shuchang Zhou
We propose RIFE, a Real-time Intermediate Flow Estimation algorithm for Video Frame Interpolation (VFI).
no code implementations • Findings of the Association for Computational Linguistics 2020 • Yuekai Zhao, Haoran Zhang, Shuchang Zhou, Zhihua Zhang
Active learning is an efficient approach for mitigating data dependency when training neural machine translation (NMT) models.
no code implementations • 30 Aug 2020 • Dachao Lin, Peiqin Sun, Guangzeng Xie, Shuchang Zhou, Zhihua Zhang
Quantized Neural Networks (QNNs) use low bit-width fixed-point numbers for representing weight parameters and activations, and are often used in real-world applications due to their saving of computation resources and reproducibility of results.
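The idea of representing a value with a low bit-width fixed-point number can be illustrated with a minimal sketch. It assumes a plain uniform quantizer over the range [0, 1]; the function name `quantize_unit` is hypothetical, and this is not claimed to be the scheme used in the paper above.

```python
def quantize_unit(x: float, bits: int) -> float:
    """Map x to the nearest of 2**bits evenly spaced levels in [0, 1].

    Assumption: a generic uniform quantizer, shown for illustration only.
    """
    levels = (1 << bits) - 1             # number of steps between 0 and 1
    clipped = min(max(x, 0.0), 1.0)      # clip out-of-range inputs to [0, 1]
    return round(clipped * levels) / levels


# With 2 bits, every input collapses onto one of four representable values.
representable = sorted({quantize_unit(i / 100, 2) for i in range(101)})
```

Because each value is stored as a small integer index into a fixed grid, repeated runs produce bit-identical results, which is consistent with the reproducibility benefit mentioned above.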
6 code implementations • ICCV 2019 • Zhewei Huang, Wen Heng, Shuchang Zhou
We show how to teach machines to paint like human painters, who can use a small number of strokes to create fantastic paintings.
no code implementations • 27 Sep 2018 • YuJun Li, Chengzhuo Ni, Guangzeng Xie, Wenhao Yang, Shuchang Zhou, Zhihua Zhang
A2VI is more efficient than the modified policy iteration, which is a classical approximate method for policy evaluation.
no code implementations • 18 Jul 2018 • Wen Heng, Shuchang Zhou, Tingting Jiang
The property of edge-free guarantees that the generated adversarial images can still preserve visual quality, even when perturbations are of large magnitudes.
1 code implementation • 23 Jun 2018 • Zhewei Huang, Wen Heng, Yuanzheng Tao, Shuchang Zhou
Background elimination for noisy character images or character images from real scenes is still a challenging problem, due to bewildering backgrounds, uneven illumination, low resolution, and various distortions.
no code implementations • 17 May 2018 • Guangzeng Xie, Yitan Wang, Shuchang Zhou, Zhihua Zhang
In this paper we explore acceleration techniques for large-scale nonconvex optimization problems, with a special focus on deep neural networks.
2 code implementations • 2 Apr 2018 • Łukasz Kidziński, Sharada Prasanna Mohanty, Carmichael Ong, Zhewei Huang, Shuchang Zhou, Anton Pechenko, Adam Stelmaszczyk, Piotr Jarosik, Mikhail Pavlov, Sergey Kolesnikov, Sergey Plis, Zhibo Chen, Zhizheng Zhang, Jiale Chen, Jun Shi, Zhuobin Zheng, Chun Yuan, Zhihui Lin, Henryk Michalewski, Piotr Miłoś, Błażej Osiński, Andrew Melnik, Malte Schilling, Helge Ritter, Sean Carroll, Jennifer Hicks, Sergey Levine, Marcel Salathé, Scott Delp
In the NIPS 2017 Learning to Run challenge, participants were tasked with building a controller for a musculoskeletal model to make it run as fast as possible through an obstacle course.
2 code implementations • 25 Dec 2017 • Zhewei Huang, Shuchang Zhou, BoEr Zhuang, Xinyu Zhou
We introduce an Actor-Critic Ensemble (ACE) method for improving the performance of the Deep Deterministic Policy Gradient (DDPG) algorithm.
no code implementations • 22 Jun 2017 • Shuchang Zhou, Yuzhi Wang, He Wen, Qinyao He, Yuheng Zou
Overall, our method improves the prediction accuracies of QNNs without introducing extra computation during inference, has negligible impact on training speed, and is applicable to both Convolutional Neural Networks and Recurrent Neural Networks.
2 code implementations • 14 May 2017 • Shuchang Zhou, Taihong Xiao, Yi Yang, Dieqiao Feng, Qinyao He, Weiran He
In this work, we propose a model that can learn object transfiguration from two unpaired sets of images: one set containing images that "have" that kind of object, and the other set being the opposite, with the mild constraint that the objects be located approximately at the same place.
32 code implementations • CVPR 2017 • Xinyu Zhou, Cong Yao, He Wen, Yuzhi Wang, Shuchang Zhou, Weiran He, Jiajun Liang
Previous approaches for scene text detection have already achieved promising performances across various benchmarks.
Ranked #3 on Scene Text Detection on COCO-Text
Curved Text Detection • Optical Character Recognition (OCR)
no code implementations • 1 Dec 2016 • He Wen, Shuchang Zhou, Zhe Liang, Yuxiang Zhang, Dieqiao Feng, Xinyu Zhou, Cong Yao
Fully convolutional neural networks give accurate, per-pixel prediction for input images and have applications like semantic segmentation.
2 code implementations • 30 Nov 2016 • Qinyao He, He Wen, Shuchang Zhou, Yuxin Wu, Cong Yao, Xinyu Zhou, Yuheng Zou
In addition, we propose balanced quantization methods for weights to further reduce performance degradation.
no code implementations • 29 Jun 2016 • Cong Yao, Xiang Bai, Nong Sang, Xinyu Zhou, Shuchang Zhou, Zhimin Cao
Recently, scene text detection has become an active research topic in computer vision and document analysis, because of its great importance and significant challenge.
Ranked #6 on Scene Text Detection on COCO-Text
13 code implementations • 20 Jun 2016 • Shuchang Zhou, Yuxin Wu, Zekun Ni, Xinyu Zhou, He Wen, Yuheng Zou
We propose DoReFa-Net, a method to train convolutional neural networks that have low bitwidth weights and activations using low bitwidth parameter gradients.
no code implementations • 31 Dec 2015 • Shuchang Zhou, Jia-Nan Wu, Yuxin Wu, Xinyu Zhou
In this paper, we propose and study a technique to reduce the number of parameters and computation time in convolutional neural networks.
no code implementations • 30 Nov 2015 • Cong Yao, Jia-Nan Wu, Xinyu Zhou, Chi Zhang, Shuchang Zhou, Zhimin Cao, Qi Yin
Different from focused texts present in natural images, which are captured with user's intention and intervention, incidental texts usually exhibit much more diversity, variability and complexity, thus posing significant difficulties and challenges for scene text detection and recognition algorithms.
no code implementations • 30 Jul 2015 • Shuchang Zhou, Yuxin Wu
In this paper we propose and study a technique to impose structural constraints on the output of a neural network, which can reduce the amount of computation and the number of parameters, besides improving prediction accuracy, when the output is known to approximately conform to the low-rankness prior.
no code implementations • 21 Jul 2015 • Shuchang Zhou, Jia-Nan Wu
In this paper we propose and study a technique to reduce the number of parameters and computation time in fully-connected layers of neural networks using the Kronecker product, at a mild cost in prediction quality.
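To see why a Kronecker product can shrink a fully-connected layer, the following minimal sketch builds a weight matrix from two small factors and compares parameter counts. It assumes a single product W = A ⊗ B; the helper names `kron` and `param_counts` are hypothetical, and the paper's actual layer construction may differ.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


def param_counts(m1, n1, m2, n2):
    """Parameters of a dense (m1*m2) x (n1*n2) matrix vs. its two factors."""
    dense = (m1 * m2) * (n1 * n2)
    factored = m1 * n1 + m2 * n2
    return dense, factored


# Two 2x2 factors stand in for a 4x4 weight matrix (illustrative values).
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
W = kron(A, B)

# A 1024x1024 layer written as the product of two 32x32 factors.
dense, factored = param_counts(32, 32, 32, 32)
```

In this illustration the dense layer holds 1,048,576 weights while the two factors hold 2,048 in total. The restricted structure of W is the likely source of the mild loss in prediction quality mentioned above.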
no code implementations • 10 Jun 2015 • Xinyu Zhou, Shuchang Zhou, Cong Yao, Zhimin Cao, Qi Yin
Recently, text detection and recognition in natural scenes have become increasingly popular in the computer vision community as well as the document analysis community.
no code implementations • 3 Oct 2014 • Shuchang Zhou, Zhihua Zhang, Xiaobing Feng
In this paper we propose and study an optimization problem over a matrix group orbit that we call Group Orbit Optimization (GOO).
no code implementations • 22 Jul 2013 • Zhihua Zhang, Shibo Zhao, Zebang Shen, Shuchang Zhou
In this paper we propose and study a family of sparsity-inducing penalty functions.