no code implementations • ICLR 2019 • Lili Meng, Bo Zhao, Bo Chang, Gao Huang, Frederick Tung, Leonid Sigal
Our model is efficient, as it proposes a separable spatio-temporal mechanism for video attention, while being able to identify important parts of the video both spatially and temporally.
Action Recognition In Videos
Temporal Action Localization
no code implementations • 19 Feb 2025 • Zheng Wu, Yiping Xie, Bo Zhao, Jiguang He, Fei Luo, Ning Deng, Zitong Yu
However, traditional single-modality approaches (RGB or Radio Frequency (RF)) face challenges in balancing robustness and accuracy due to lighting variations, motion artifacts, and skin tone bias.
no code implementations • 3 Feb 2025 • Boyan Gao, Bo Zhao, Shreyank N Gowda, Xingrun Xing, Yibo Yang, Timothy Hospedales, David A. Clifton
These issues deteriorate when the datasets are learned via matching the trajectories of networks trained on the real and synthetic datasets with a long horizon inner-loop.
1 code implementation • 6 Jan 2025 • Yuxiang Bao, Guoliang Kang, Linlin Yang, Xiaoyue Duan, Bo Zhao, Baochang Zhang
Differently, in this paper, we identify that the bias towards the frequent class may be encoded into features, i.e., the rare-specific features which play a key role in discriminating the rare class are much weaker than the frequent-specific features.
1 code implementation • 19 Dec 2024 • Junjie Zhou, Zheng Liu, Ze Liu, Shitao Xiao, Yueze Wang, Bo Zhao, Chen Jason Zhang, Defu Lian, Yongping Xiong
Despite the rapidly growing demand for multimodal retrieval, progress in this field remains severely constrained by a lack of training data.
Ranked #1 on Zero-Shot Composed Image Retrieval (ZS-CIR) on CIRCO
1 code implementation • 6 Nov 2024 • Pedro R. A. S. Bassi, Wenxuan Li, Yucheng Tang, Fabian Isensee, Zifu Wang, Jieneng Chen, Yu-Cheng Chou, Yannick Kirchhoff, Maximilian Rokuss, Ziyan Huang, Jin Ye, Junjun He, Tassilo Wald, Constantin Ulrich, Michael Baumgartner, Saikat Roy, Klaus H. Maier-Hein, Paul Jaeger, Yiwen Ye, Yutong Xie, Jianpeng Zhang, Ziyang Chen, Yong Xia, Zhaohu Xing, Lei Zhu, Yousef Sadegheih, Afshin Bozorgpour, Pratibha Kumari, Reza Azad, Dorit Merhof, Pengcheng Shi, Ting Ma, Yuxin Du, Fan Bai, Tiejun Huang, Bo Zhao, Haonan Wang, Xiaomeng Li, Hanxue Gu, Haoyu Dong, Jichen Yang, Maciej A. Mazurowski, Saumya Gupta, Linshan Wu, Jiaxin Zhuang, Hao Chen, Holger Roth, Daguang Xu, Matthew B. Blaschko, Sergio Decherchi, Andrea Cavalli, Alan L. Yuille, Zongwei Zhou
We are committed to expanding this benchmark to encourage more innovation of AI algorithms for the medical domain.
2 code implementations • 27 Sep 2024 • Xinlong Wang, Xiaosong Zhang, Zhengxiong Luo, Quan Sun, Yufeng Cui, Jinsheng Wang, Fan Zhang, Yueze Wang, Zhen Li, Qiying Yu, Yingli Zhao, Yulong Ao, Xuebin Min, Tao Li, Boya Wu, Bo Zhao, BoWen Zhang, Liangdong Wang, Guang Liu, Zheqi He, Xi Yang, Jingjing Liu, Yonghua Lin, Tiejun Huang, Zhongyuan Wang
While next-token prediction is considered a promising path towards artificial general intelligence, it has struggled to excel in multimodal tasks, which are still dominated by diffusion models (e.g., Stable Diffusion) and compositional approaches (e.g., CLIP combined with LLMs).
Ranked #134 on Visual Question Answering on MM-Vet
1 code implementation • 22 Sep 2024 • Yan Shu, Zheng Liu, Peitian Zhang, Minghao Qin, Junjie Zhou, Zhengyang Liang, Tiejun Huang, Bo Zhao
The VST module is trained by instruction fine-tuning, where two optimizing strategies are offered.
no code implementations • 13 Sep 2024 • Bach Do, Sina Jafari Ghalekohneh, Taiwo Adebiyi, Bo Zhao, Ruda Zhang
Nonreciprocal thermal emitters that break Kirchhoff's law of thermal radiation promise exciting opportunities for thermal and energy applications.
no code implementations • 10 Sep 2024 • Dingxin Cheng, Mingda Li, Jingyu Liu, Yongxin Guo, Bin Jiang, Qingbin Liu, Xi Chen, Bo Zhao
While this method excels in short video understanding, it may result in a blend of multiple event information in long videos due to coarse compression, which causes information redundancy.
no code implementations • 5 Sep 2024 • Mingze Gao, Jingyu Liu, Mingda Li, Jiangtao Xie, Qingbin Liu, Bo Zhao, Xi Chen, Hui Xiong
Multimodal Large Language Models (MLLMs) have significantly improved performance across various image-language applications.
no code implementations • 3 Jul 2024 • Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Chao Wang, Xinzhang Liu, Zihan Wang, Yu Zhao, Xin Wang, Yuyao Huang, Shuangyong Song, Yongxiang Li, Zheng Zhang, Bo Zhao, Aixin Sun, Yequan Wang, Zhongjiang He, Zhongyuan Wang, Xuelong Li, Tiejun Huang
Large Language Models (LLMs) represent a significant stride toward Artificial General Intelligence.
2 code implementations • 24 Jun 2024 • Henghui Ding, Chang Liu, Yunchao Wei, Nikhila Ravi, Shuting He, Song Bai, Philip Torr, Deshui Miao, Xin Li, Zhenyu He, YaoWei Wang, Ming-Hsuan Yang, Zhensong Xu, Jiangtao Yao, Chengjing Wu, Ting Liu, Luoqi Liu, Xinyu Liu, Jing Zhang, Kexin Zhang, Yuting Yang, Licheng Jiao, Shuyuan Yang, Mingqi Gao, Jingnan Luo, Jinyu Yang, Jungong Han, Feng Zheng, Bin Cao, Yisi Zhang, Xuanxu Lin, Xingjian He, Bo Zhao, Jing Liu, Feiyu Pan, Hao Fang, Xiankai Lu
Moreover, we provide a new motion expression guided video segmentation dataset MeViS to study the natural language-guided video understanding in complex environments.
no code implementations • 20 Jun 2024 • Bin Cao, Yisi Zhang, Xuanxu Lin, Xingjian He, Bo Zhao, Jing Liu
Motion Expression guided Video Segmentation is a challenging task that aims at segmenting objects in the video based on natural language expressions with motion descriptions.
Instance Segmentation
Referring Video Object Segmentation
1 code implementation • 19 Jun 2024 • Wenxiao Cai, Iaroslav Ponomarenko, Jianhao Yuan, Xiaoqi Li, Wankou Yang, Hao Dong, Bo Zhao
Vision Language Models (VLMs) have achieved impressive performance in 2D image understanding; however, they still struggle with spatial understanding, which is the foundation of Embodied AI.
Ranked #4 on Spatial Reasoning on 6-DoF SpatialBench
1 code implementation • 15 Jun 2024 • Yexin Liu, Zhengyang Liang, Yueze Wang, Xianfeng Wu, Feilong Tang, Muyang He, Jian Li, Zheng Liu, Harry Yang, SerNam Lim, Bo Zhao
To this end, we manually construct a benchmark with 12 categories and design evaluation metrics that assess the degree of error in MLLM responses even when the visual content is seemingly understood.
1 code implementation • 6 Jun 2024 • Junjie Zhou, Zheng Liu, Shitao Xiao, Bo Zhao, Yongping Xiong
Thirdly, we introduce a multi-stage training algorithm, which first aligns the visual token embedding with the text encoder using massive weakly labeled data, and then develops multi-modal representation capability using the generated composed image-text data.
Ranked #7 on Image Retrieval on CIRR
no code implementations • 6 Jun 2024 • Jiyao Zhang, Weiyao Huang, Bo Peng, Mingdong Wu, Fei Hu, Zijian Chen, Bo Zhao, Hao Dong
6D Object Pose Estimation is a crucial yet challenging task in computer vision, suffering from a significant lack of large-scale datasets.
3 code implementations • 6 Jun 2024 • Junjie Zhou, Yan Shu, Bo Zhao, Boya Wu, Zhengyang Liang, Shitao Xiao, Minghao Qin, Xi Yang, Yongping Xiong, Bo Zhang, Tiejun Huang, Zheng Liu
To address the above problems, we propose a new benchmark called MLVU (Multi-task Long Video Understanding Benchmark) for the comprehensive and in-depth evaluation of LVU.
no code implementations • 27 May 2024 • Jian Zhao, Lei Jin, Jianshu Li, Zheng Zhu, Yinglei Teng, Jiaojiao Zhao, Sadaf Gulshad, Zheng Wang, Bo Zhao, Xiangbo Shu, Yunchao Wei, Xuecheng Nie, Xiaojie Jin, Xiaodan Liang, Shin'ichi Satoh, Yandong Guo, Cewu Lu, Junliang Xing, Jane Shen Shengmei
The SkatingVerse Workshop & Challenge aims to encourage research in developing novel and accurate methods for human action understanding.
1 code implementation • 17 May 2024 • Yizhang Jin, Jian Li, Yexin Liu, Tianjun Gu, Kai Wu, Zhengkai Jiang, Muyang He, Bo Zhao, Xin Tan, Zhenye Gan, Yabiao Wang, Chengjie Wang, Lizhuang Ma
In the past year, Multimodal Large Language Models (MLLMs) have demonstrated remarkable performance in tasks such as visual question answering, visual understanding and reasoning.
no code implementations • 11 May 2024 • Renyou Xie, Xin Yin, Chaojie Li, Guo Chen, Nian Liu, Bo Zhao, ZhaoYang Dong
Distribution system state estimation (DSSE) plays a crucial role in the real-time monitoring, control, and operation of distribution networks.
no code implementations • 4 May 2024 • Tao Wang, Bo Zhao, Sicun Gao, Rose Yu
Physics-Informed Neural Networks (PINNs) have gained popularity in scientific computing in recent years.
1 code implementation • 29 Apr 2024 • Yichen Ouyang, Jianhao Yuan, Hao Zhao, Gaoang Wang, Bo Zhao
Generating long and consistent videos has emerged as a significant yet challenging problem.
no code implementations • 25 Apr 2024 • Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Chao Wang, Xinzhang Liu, Zihan Wang, Yu Zhao, Xin Wang, Yuyao Huang, Shuangyong Song, Yongxiang Li, Zheng Zhang, Bo Zhao, Aixin Sun, Yequan Wang, Zhongjiang He, Zhongyuan Wang, Xuelong Li, Tiejun Huang
Large language models (LLMs) have showcased profound capabilities in language understanding and generation, facilitating a wide array of applications.
no code implementations • 23 Apr 2024 • Chao Ren, Han Yu, Hongyi Peng, Xiaoli Tang, Bo Zhao, Liping Yi, Alysa Ziying Tan, Yulan Gao, Anran Li, Xiaoxiao Li, Zengxiang Li, Qiang Yang
This paper provides a comprehensive survey of the emerging field of Federated Foundation Models (FedFM), elucidating their synergistic relationship and exploring novel methodologies, challenges, and future directions that the FL research field needs to focus on in order to thrive in the age of FMs.
2 code implementations • 31 Mar 2024 • Fan Bai, Yuxin Du, Tiejun Huang, Max Q.-H. Meng, Bo Zhao
Additionally, we propose M3D-LaMed, a versatile multi-modal large language model for 3D medical image analysis.
1 code implementation • 28 Feb 2024 • Bin Cao, Jianhao Yuan, Yexin Liu, Jian Li, Shuyang Sun, Jing Liu, Bo Zhao
To alleviate artifacts and improve the quality of synthetic images, we fine-tune a Vision-Language Model (VLM) as an artifact classifier to automatically identify and classify a wide range of artifacts and provide supervision for further optimizing generative models.
no code implementations • 19 Feb 2024 • Xuelin Qian, Yu Wang, Simian Luo, Yinda Zhang, Ying Tai, Zhenyu Zhang, Chengjie Wang, Xiangyang Xue, Bo Zhao, Tiejun Huang, Yunsheng Wu, Yanwei Fu
In this paper, we extend auto-regressive models to 3D domains, and seek a stronger ability of 3D shape generation by improving auto-regressive models at capacity and scalability simultaneously.
1 code implementation • 18 Feb 2024 • Muyang He, Yexin Liu, Boya Wu, Jianhao Yuan, Yueze Wang, Tiejun Huang, Bo Zhao
Multimodal Large Language Models (MLLMs) have demonstrated notable capabilities in general visual understanding and reasoning tasks.
no code implementations • 16 Feb 2024 • Jianhao Yuan, Shuyang Sun, Daniel Omeiza, Bo Zhao, Paul Newman, Lars Kunze, Matthew Gadd
Recent advancements in Multi-Modal Large Language models (MLLMs) have shown promising potential in enhancing the explainability as a driving agent by producing control predictions along with natural language explanations.
no code implementations • 4 Feb 2024 • Wuxuan Jiang, Xiangjun Song, Shenbai Hong, Haijun Zhang, Wenxin Liu, Bo Zhao, Wei Xu, Yi Li
Accuracy and efficiency remain challenges for multi-party computation (MPC) frameworks.
no code implementations • 23 Jan 2024 • Lei You, Lele Cao, Mattias Nilsson, Bo Zhao, Lei Lei
Counterfactual explanations (CE) are the de facto method for providing insights into black-box decision-making models by identifying alternative inputs that lead to different outcomes.
no code implementations • 19 Jan 2024 • Bo Zhao, Huan Yang, Jianlong Fu
Face inpainting requires the model to have a precise global understanding of the facial position structure.
1 code implementation • 8 Dec 2023 • Marcel Wagenländer, Guo Li, Bo Zhao, Luo Mai, Peter Pietzuch
After a GPU change, Scalai uses the PTC to transform the job state: the PTC repartitions the dataset state under data parallelism and exposes it to DL workers through a virtual file system; and the PTC obtains the model state as partitioned checkpoints and transforms them to reflect the new parallelization configuration.
1 code implementation • 4 Dec 2023 • Qiaole Dong, Bo Zhao, Yanwei Fu
Recently, Google proposed DDVM, which demonstrates for the first time that a general diffusion model for image-to-image translation works impressively well on optical flow estimation without any specific designs like RAFT.
1 code implementation • 22 Nov 2023 • Yuxin Du, Fan Bai, Tiejun Huang, Bo Zhao
Precise image segmentation provides clinical study with instructive information.
no code implementations • 22 Oct 2023 • Xiao Ma, Guang Zheng, Chi Xu, L. Monika Moskal, Peng Gong, Qinghua Guo, Huabing Huang, Xuecao Li, Yong Pang, Cheng Wang, Huan Xie, Bailang Yu, Bo Zhao, Yuyu Zhou
Our results revealed that the method for estimating building height samples from the GEDI data was effective, with a Pearson's r of 0.78 and an RMSE of 3.67 m in comparison to the reference data.
1 code implementation • 16 Oct 2023 • Jianhao Yuan, Jie Zhang, Shuyang Sun, Philip Torr, Bo Zhao
Synthetic training data has gained prominence in numerous learning tasks and scenarios, offering advantages such as dataset augmentation, generalization evaluation, and privacy preservation.
no code implementations • 28 Sep 2023 • Evan Scope Crafts, Xianyang Zhang, Bo Zhao
The Bayesian Cramér-Rao bound (CRB) provides a lower bound on the mean square error of any Bayesian estimator under mild regularity conditions.
1 code implementation • 17 Jul 2023 • Shiye Lei, Hao Chen, Sen Zhang, Bo Zhao, DaCheng Tao
With the rapid development of Artificial Intelligence Generated Content (AIGC), it has become common practice in many learning tasks to train or fine-tune large models on synthetic data due to the data-scarcity and privacy leakage problems.
2 code implementations • 9 Jul 2023 • Bo Zhao, Boya Wu, Muyang He, Tiejun Huang
Thanks to the emergence of foundation models, large language and vision models are integrated to acquire the multimodal ability of visual captioning, question answering, etc.
no code implementations • 28 Jun 2023 • Jie Zhang, Xiaohua Qi, Bo Zhao
Thanks to the emerging foundation generative models, we propose a novel federated learning framework, namely Federated Generative Learning.
no code implementations • 20 Jun 2023 • Yu Wang, Xuelin Qian, Jingyang Huo, Tiejun Huang, Bo Zhao, Yanwei Fu
Through the adaptation of the Auto-Regressive model and the utilization of large language models, we have developed a remarkable model with an astounding 3.6 billion trainable parameters, establishing it as the largest 3D shape generation model to date, named Argus-3D.
1 code implementation • 8 Jun 2023 • Muyang He, Shuo Yang, Tiejun Huang, Bo Zhao
The state of the art of many learning tasks, e.g., image classification, is advanced by collecting larger datasets and then training larger models on them.
1 code implementation • NeurIPS 2023 • Salva Rühling Cachay, Bo Zhao, Hailey Joren, Rose Yu
While diffusion models can successfully generate data and make predictions, they are predominantly designed for static images.
1 code implementation • 22 May 2023 • Bo Zhao, Robert M. Gower, Robin Walters, Rose Yu
Finally, we show that integrating teleportation into a wide range of optimization algorithms and optimization-based meta-learning improves convergence.
no code implementations • 18 May 2023 • Hengfa Lu, Huihui Ye, Lawrence L. Wald, Bo Zhao
To address this problem, we present a new image reconstruction method for MR Fingerprinting, integrating low-rank and subspace modeling with a deep generative prior.
no code implementations • 15 May 2023 • Xuanchen Li, Yan Niu, Bo Zhao, Haoyuan Shi, Zitong An
In both applications, our model substantially alleviates artifacts such as Moiré and over-smoothness at similar or lower computational cost to currently top-performing models, as validated by diverse evaluations.
2 code implementations • CVPR 2023 • Lei Zhang, Jie Zhang, Bowen Lei, Subhabrata Mukherjee, Xiang Pan, Bo Zhao, Caiwen Ding, Yao Li, Dongkuan Xu
Dataset Distillation (DD), a newly emerging field, aims at generating much smaller but efficient synthetic training datasets from large ones.
no code implementations • journal 2022 • Bo Zhao
While there is a new and rapidly growing literature on the effects of climatic factors on economic and social outcomes, little research has been conducted to understand the fiscal impact of weather, especially at the sub-state level.
1 code implementation • 11 Nov 2022 • Xuan Rao, Bo Zhao, Xiaosong Yi, Derong Liu
In neural architecture search (NAS) methods based on latent space optimization (LSO), a deep generative model is trained to embed discrete neural architectures into a continuous latent space.
1 code implementation • 31 Oct 2022 • Bo Zhao, Iordan Ganev, Robin Walters, Rose Yu, Nima Dehmamy
Empirical studies of the loss landscape of deep networks have revealed that many local minima are connected through low-loss valleys.
no code implementations • 16 Oct 2022 • Hui Liu, Bo Zhao, Kehuan Zhang, Peng Liu
In this paper, we propose an AutoEncoder-based Adversarial Examples (AEAE) detector, that can guard DNN models by detecting adversarial examples with low computation in an unsupervised manner.
no code implementations • 3 Oct 2022 • Huanzhou Zhu, Bo Zhao, Gang Chen, Weifeng Chen, Yijie Chen, Liang Shi, Yaodong Yang, Peter Pietzuch, Lei Chen
Yet, current distributed RL systems tie the definition of RL algorithms to their distributed execution: they hard-code particular distribution strategies and only accelerate specific parts of the computation (e.g., policy network updates) on GPU workers.
1 code implementation • 14 Jul 2022 • Hui Shi, Yupeng Gu, Yitong Zhou, Bo Zhao, Sicun Gao, Jishen Zhao
In this paper, we propose the Multi-Interest Preference (MIP) model, an approach that not only produces multi-interest for users by using the user's sequential engagement more effectively but also automatically learns a set of weights to represent the preference over each embedding so that the candidates can be retrieved from each interest proportionally.
1 code implementation • 17 Jun 2022 • Peter Eckmann, Kunyang Sun, Bo Zhao, Mudong Feng, Michael K. Gilson, Rose Yu
We corroborate these docking-based results with more accurate molecular dynamics-based calculations of absolute binding free energy and show that one of our generated drug-like compounds has a predicted $K_D$ (a measure of binding affinity) of $6 \cdot 10^{-14}$ M against the human estrogen receptor, well beyond the affinities of typical early-stage drug candidates and most FDA-approved drugs to their respective targets.
1 code implementation • 1 Jun 2022 • Tian Dong, Bo Zhao, Lingjuan Lyu
In this work, we identify for the first time that dataset condensation (DC), which was originally designed for improving training efficiency, is also a better solution to replace traditional data generators for private data generation, thus providing privacy for free.
no code implementations • 23 May 2022 • Kai Wang, Bo Zhao, Xiangyu Peng, Zheng Zhu, Jiankang Deng, Xinchao Wang, Hakan Bilen, Yang You
Firstly, randomly masked face images are used to train the reconstruction module in FaceMAE.
1 code implementation • 21 May 2022 • Bo Zhao, Nima Dehmamy, Robin Walters, Rose Yu
Experimentally, we show that teleportation improves the convergence speed of gradient descent and AdaGrad for several optimization problems including test functions, multi-layer regressions, and MNIST classification.
1 code implementation • 18 Apr 2022 • Chengcheng Guo, Bo Zhao, Yanbing Bai
Coreset selection, which aims to select a subset of the most informative training samples, is a long-standing learning problem that can benefit many downstream tasks such as data-efficient learning, continual learning, neural architecture search, active learning, etc.
3 code implementations • 15 Apr 2022 • Bo Zhao, Hakan Bilen
However, images generated by traditional GANs are not as informative as real training samples when used to train deep neural networks.
2 code implementations • CVPR 2022 • Kai Wang, Bo Zhao, Xiangyu Peng, Zheng Zhu, Shuo Yang, Shuo Wang, Guan Huang, Hakan Bilen, Xinchao Wang, Yang You
Dataset condensation aims at reducing the network training effort through condensing a cumbersome training set into a compact synthetic one.
no code implementations • 4 Jan 2022 • Hui Liu, Bo Zhao, Yuefeng Peng, Weidong Li, Peng Liu
Experimental results show that the contributions of different image transformations to adversarial detection differ significantly, and that combining them can significantly improve the generic detection ability against state-of-the-art adversarial attacks.
no code implementations • 13 Dec 2021 • Junjun Hu, Yanhao Zhu, Bo Zhao, Jiexin Zheng, Chenxu Zhao, Xiangyu Zhu, Kangle Wu, Darun Tang
One of the challenges of logo recognition lies in the diversity of forms, such as symbols, texts or a combination of both; further, logos tend to be extremely concise in design while similar in appearance, suggesting the difficulty of learning discriminative representations.
no code implementations • 9 Dec 2021 • Zi Wang, Chen Qian, Di Guo, Hongwei Sun, Rushuai Li, Bo Zhao, Xiaobo Qu
Deep learning has shown astonishing performance in accelerated magnetic resonance imaging (MRI).
4 code implementations • 8 Oct 2021 • Bo Zhao, Hakan Bilen
Computational cost of training state-of-the-art deep models in many learning problems is rapidly increasing due to more sophisticated models and larger datasets.
Ranked #8 on Dataset Distillation - 1IPC on TinyImageNet
no code implementations • 24 Jul 2021 • Xinlin Zhang, Hengfa Lu, Di Guo, Zongying Lai, Huihui Ye, Xi Peng, Bo Zhao, Xiaobo Qu
The combination of the sparse sampling and the low-rank structured matrix reconstruction has shown promising performance, enabling a significant reduction of the magnetic resonance imaging data acquisition time.
no code implementations • 19 Jul 2021 • Hui Liu, Bo Zhao, Minzhi Ji, Yuefeng Peng, Jiabao Guo, Peng Liu
In this paper, we reveal that imperceptible adversarial examples are the product of recessive features misleading neural networks, and an adversarial attack is essentially a kind of method to enrich these recessive features in the image.
no code implementations • 16 Apr 2021 • Bo Zhao, Peng Sun, Liming Fang, Tao Wang, Keyu Jiang
The results demonstrate its effectiveness and superior performance compared to the state-of-the-art Byzantine-robust schemes in defending against typical data poisoning and model poisoning attacks under practical Non-IID data distributions.
no code implementations • 18 Mar 2021 • James Fox, Bo Zhao, Sivasankaran Rajamanickam, Rampi Ramprasad, Le Song
Learning 3D representations that generalize well to arbitrarily oriented inputs is a challenge of practical importance in applications varying from computer vision to physics and chemistry.
no code implementations • 8 Mar 2021 • Zhenhuan Huang, Xiaoyue Duan, Bo Zhao, Jinhu Lü, Baochang Zhang
We propose an Interpretable Attention Guided Network (IAGN) for fine-grained visual classification.
2 code implementations • 16 Feb 2021 • Bo Zhao, Hakan Bilen
In many machine learning problems, large-scale datasets have become the de-facto standard to train state-of-the-art deep networks at the price of heavy computation load.
Ranked #5 on Dataset Distillation - 1IPC on CUB-200-2011
no code implementations • 5 Jan 2021 • Jue Nan, Jian Lin, Yuchen Luo, Bo Zhao, Xiaopeng Li
Its feasibility has been demonstrated with numerical simulations of the adiabatic preparation for certain incommensurate particle-doping fractions, where the major problem to circumvent is the atomic localization in the incommensurate lattice.
Quantum Gases
Strongly Correlated Electrons
Quantum Physics
no code implementations • 31 Dec 2020 • Mahsa Ghasemi, Evan Scope Crafts, Bo Zhao, Ufuk Topcu
In planning problems, it is often challenging to fully model the desired specifications.
no code implementations • 23 Dec 2020 • Zichang He, Bo Zhao, Zheng Zhang
In this paper, we introduce an active low-rank tensor model for fast MR imaging.
1 code implementation • 14 Oct 2020 • Hui Liu, Bo Zhao, Minzhi Ji, Peng Liu
Adversarial examples are well-designed input samples, in which perturbations are imperceptible to the human eyes, but easily mislead the output of deep neural networks (DNNs).
2 code implementations • 27 Aug 2020 • Ke Ma, Bo Zhao, Leonid Sigal
Also, the generated images from our model have higher resolution, object classification accuracy and consistency, as compared to the previous state-of-the-art.
no code implementations • 15 Jul 2020 • Baoming Yan, Chen Zhou, Bo Zhao, Kan Guo, Jiang Yang, Xiaobo Li, Ming Zhang, Yizhou Wang
Finally, the model learns to compare global and local features separately, i.e., in two paths, before merging the similarities.
no code implementations • 7 Jul 2020 • Aditya Pal, Chantat Eksombatchai, Yitong Zhou, Bo Zhao, Charles Rosenberg, Jure Leskovec
Latent user representations are widely adopted in the tech industry for powering personalized recommender systems.
no code implementations • 15 Jun 2020 • Ce Ju, Ruihui Zhao, Jichao Sun, Xiguang Wei, Bo Zhao, Yang Liu, Hongshan Li, Tianjian Chen, Xinwei Zhang, Dashan Gao, Ben Tan, Han Yu, Chuning He, Yuan Jin
It adopts federated averaging during the model training process, without patient data being taken out of the hospitals during the whole process of model training and forecasting.
5 code implementations • ICLR 2021 • Bo Zhao, Konda Reddy Mopuri, Hakan Bilen
As the state-of-the-art machine learning methods in many fields rely on larger datasets, storing datasets and training models on them become significantly more expensive.
Ranked #4 on Dataset Distillation - 1IPC on CUB-200-2011
1 code implementation • 8 Jun 2020 • Bo Zhao, Shixiang Tang, Dapeng Chen, Hakan Bilen, Rui Zhao
With the explosion of digital data in recent years, continuously learning new tasks from a stream of data without forgetting previously acquired knowledge has become increasingly important.
3 code implementations • 8 Jan 2020 • Bo Zhao, Konda Reddy Mopuri, Hakan Bilen
Particularly, our approach can certainly extract the ground-truth labels as opposed to DLG, hence we name it Improved DLG (iDLG).
3 code implementations • 21 Oct 2019 • Polina Zablotskaia, Aliaksandr Siarohin, Bo Zhao, Leonid Sigal
In this paper, we focus on human motion transfer - generation of a video depicting a particular subject, observed in a single image, performing a series of motions exemplified by an auxiliary (driving) video.
no code implementations • CVPR 2019 • Bo Zhao, Lili Meng, Weidong Yin, Leonid Sigal
The representation of each object is disentangled into a specified/certain part (category) and an unspecified/uncertain part (appearance).
Ranked #2 on Layout-to-Image Generation on Visual Genome 64x64
no code implementations • 1 Oct 2018 • Lili Meng, Bo Zhao, Bo Chang, Gao Huang, Wei Sun, Frederick Tung, Leonid Sigal
Inspired by the observation that humans are able to process videos efficiently by only paying attention where and when it is needed, we propose an interpretable and easy plug-in spatial-temporal attention mechanism for video action recognition.
no code implementations • ACL 2018 • Yang Li, Bo Zhao, Ariel Fuxman, Fangbo Tao
The framework takes the enterprise corpus as input and produces a high-quality acronym disambiguation system as output.
no code implementations • ICML 2018 • Bo Zhao, Xinwei Sun, Yanwei Fu, Yuan YAO, Yizhou Wang
To solve this task, $L_{1}$ regularization is widely used for the pursuit of feature selection and avoiding overfitting, and yet the sparse estimation of features in $L_{1}$ regularization may cause the underfitting of training data.
1 code implementation • 12 Apr 2018 • Bo Zhao, Yanwei Fu, Rui Liang, Jia-Hong Wu, Yonggang Wang, Yizhou Wang
In classical ZSL algorithms, attributes are introduced as the intermediate semantic representation to realize the knowledge transfer from seen classes to unseen classes.
2 code implementations • ECCV 2018 • Bo Zhao, Bo Chang, Zequn Jie, Leonid Sigal
Existing methods for multi-domain image-to-image translation (or generation) attempt to directly map an input image (or a random vector) to an image in one of the output domains.
no code implementations • CVPR 2018 • Zequn Jie, Pengfei Wang, Yonggen Ling, Bo Zhao, Yunchao Wei, Jiashi Feng, Wei Liu
Left-right consistency check is an effective way to enhance the disparity estimation by referring to the information from the opposite view.
no code implementations • 20 Nov 2017 • Bo Zhao, Xinwei Sun, Yuan YAO, Yizhou Wang
With the learned SRG, each unseen class prototype (cluster center) in the image feature space can be synthesized by the linear combination of other class prototypes, so that testing instances can be classified based on the distance to these synthesized prototypes.
3 code implementations • 17 Nov 2017 • Jiahong Wu, He Zheng, Bo Zhao, Yixin Li, Baoming Yan, Rui Liang, Wenjia Wang, Shipei Zhou, Guosen Lin, Yanwei Fu, Yizhou Wang, Yonggang Wang
Significant progress has been achieved in Computer Vision by leveraging large-scale image datasets.
no code implementations • 23 Oct 2017 • Bo Zhao, Justin P. Haldar, Congyu Liao, Dan Ma, Yun Jiang, Mark A. Griswold, Kawin Setsompop, Lawrence L. Wald
Magnetic resonance (MR) fingerprinting is a new quantitative imaging paradigm, which simultaneously acquires multiple MR tissue parameter maps in a single experiment.
Signal Processing
no code implementations • ICCV 2017 • Hao Liu, Jiashi Feng, Zequn Jie, Karlekar Jayashree, Bo Zhao, Meibin Qi, Jianguo Jiang, Shuicheng Yan
We investigate the problem of person search in the wild in this work.
Ranked #4 on Person Re-Identification on CUHK-SYSU
no code implementations • CVPR 2017 • Bo Zhao, Jiashi Feng, Xiao Wu, Shuicheng Yan
We introduce a new fashion search protocol where attribute manipulation is allowed within the interaction between users and search engines, e.g., manipulating the color attribute of the clothing from red to blue.
no code implementations • 17 Apr 2017 • Bo Zhao, Xiao Wu, Zhi-Qi Cheng, Hao Liu, Zequn Jie, Jiashi Feng
This paper addresses a challenging problem: how to generate multi-view cloth images from only a single view input.
no code implementations • 2 Dec 2016 • Bo Zhao, Botong Wu, Tianfu Wu, Yizhou Wang
This paper presents a method of zero-shot learning (ZSL) which poses ZSL as the missing data problem, rather than the missing label problem.
no code implementations • 28 Jun 2016 • Bo Zhao, Xiao Wu, Jiashi Feng, Qiang Peng, Shuicheng Yan
Fine-grained object classification is a challenging task due to the subtle inter-class difference and large intra-class variation.
no code implementations • 22 Sep 2014 • Xi Peng, Rui Yan, Bo Zhao, Huajin Tang, Zhang Yi
Although the methods achieve a higher recognition rate than the traditional SPM, they consume more time to encode the local descriptors extracted from the image.