no code implementations • 25 Nov 2023 • Haoran Zhao, Fengxing Pan, Huqiuyue Ping, Yaoming Zhou
In this study, we present a novel paradigm for industrial robotic embodied agents, encapsulating an 'agent as cerebrum, controller as cerebellum' architecture.
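To make the division of labor concrete, here is a minimal sketch of how an 'agent as cerebrum, controller as cerebellum' split might be wired: a high-level agent (e.g. an LLM planner) decomposes a task into symbolic commands, while a low-level controller executes each command. All class and method names below are illustrative assumptions, not the paper's actual API.

```python
# Hypothetical sketch of an "agent as cerebrum, controller as cerebellum" split.
# The names below are illustrative, not the paper's actual interfaces.

class CerebrumAgent:
    """High-level planner (e.g. an LLM) that turns a task into subtask commands."""
    def plan(self, task: str) -> list[str]:
        # In practice this would prompt an LLM; here we return a fixed plan.
        return ["move_to(bin_A)", "grasp(part_7)", "move_to(station_2)", "release()"]

class CerebellumController:
    """Low-level controller that maps symbolic commands to motion primitives."""
    def execute(self, command: str) -> bool:
        # A real controller would call the robot's motion stack and report status.
        print(f"executing {command}")
        return True

def run(task: str) -> None:
    agent, controller = CerebrumAgent(), CerebellumController()
    for command in agent.plan(task):      # cerebrum: deliberate planning
        ok = controller.execute(command)  # cerebellum: fast, reactive execution
        if not ok:
            break                         # feed failures back to the planner

run("pick part_7 from bin_A and place it at station_2")
```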
no code implementations • 18 Nov 2023 • Haoran Zhao, Jake Ryland Williams
While Large Language Models (LLMs) become ever more dominant, classic pre-trained word embeddings sustain their relevance through computational efficiency and nuanced linguistic interpretation.
no code implementations • 13 Nov 2023 • Jake Ryland Williams, Haoran Zhao
We will discuss a general result about feed-forward neural networks and then extend this solution to compositional (multi-layer) networks, which are applied to a simplified transformer block containing feed-forward and self-attention layers.
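For reference, a simplified transformer block of the kind described — a self-attention layer followed by a feed-forward layer — can be written in a few lines of PyTorch. The layer sizes and the omission of normalization and dropout below are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class SimplifiedTransformerBlock(nn.Module):
    """Self-attention followed by a position-wise feed-forward layer."""
    def __init__(self, d_model: int = 64, n_heads: int = 4, d_ff: int = 256):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.ReLU(),
            nn.Linear(d_ff, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn_out, _ = self.attn(x, x, x)  # self-attention over the sequence
        x = x + attn_out                   # residual connection
        return x + self.ff(x)              # feed-forward with residual

x = torch.randn(2, 10, 64)                 # (batch, sequence, d_model)
print(SimplifiedTransformerBlock()(x).shape)
```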
no code implementations • 13 Nov 2023 • Jake Ryland Williams, Haoran Zhao
Iterative differential approximation methods that rely upon backpropagation have enabled the optimization of neural networks; however, at present, they remain computationally expensive, especially when training models at scale.
1 code implementation • 11 Oct 2023 • Mingcheng Chen, Haoran Zhao, Yuxiang Zhao, Hulei Fan, Hongqiao Gao, Yong Yu, Zheng Tian
Data-driven black-box model-based optimization (MBO) problems arise in many practical application scenarios, where the goal is to find a design over the whole design space that maximizes a black-box target function, given only a static offline dataset.
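A common baseline for offline MBO (given here as a generic illustration, not necessarily this paper's method) is to fit a surrogate model to the static dataset and then perform gradient ascent on the design with respect to the surrogate's prediction. The dataset, dimensions, and hyperparameters below are made up for the sketch.

```python
import torch
import torch.nn as nn

# Static offline dataset of (design, score) pairs; in practice loaded from disk.
X = torch.randn(512, 16)                      # candidate designs
y = -(X ** 2).sum(dim=1, keepdim=True)        # toy black-box scores

# Fit a surrogate model of the black-box target function on the offline data.
surrogate = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(surrogate(X), y)
    loss.backward()
    opt.step()

# Gradient ascent on the surrogate, starting from the best observed design.
design = X[y.argmax()].clone().requires_grad_(True)
design_opt = torch.optim.Adam([design], lr=1e-2)
for _ in range(100):
    design_opt.zero_grad()
    (-surrogate(design)).backward()           # maximize the predicted score
    design_opt.step()
print("optimized design:", design.detach())
```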
1 code implementation • 8 Aug 2023 • Jiaju Lin, Haoran Zhao, Aochi Zhang, Yiting Wu, Huqiuyue Ping, Qin Chen
With ChatGPT-like large language models (LLMs) prevailing in the community, how to evaluate their abilities is an open question.
1 code implementation • 11 Oct 2021 • Fei Zhou, Xin Sun, Junyu Dong, Haoran Zhao, Xiao Xiang Zhu
Although Convolutional Neural Networks (CNNs) have made substantial progress on the low-light image enhancement task, one critical problem of CNNs is the trade-off between model complexity and performance.
no code implementations • 21 Jun 2021 • Haoran Zhao, Xin Sun, Junyu Dong, Zihe Dong, Qiong Li
Recently, distillation approaches have been proposed to extract general knowledge from a teacher network to guide a student network.
no code implementations • 12 Apr 2021 • Haoran Zhao, Xin Sun, Junyu Dong, Hui Yu, Huiyu Zhou
Then the generated samples are used to train the compact student network under the supervision of the teacher.
no code implementations • 18 Mar 2021 • Haoran Zhao, Kun Gong, Xin Sun, Junyu Dong, Hui Yu
The proposed approach improves the performance of the student model, as the virtual sample created from multiple images produces similar probability distributions in the teacher and student networks.
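One way to picture the virtual-sample idea is a mixup-style combination of two images, with the student trained to match the teacher's output distribution on the mixed input. This is a generic sketch under that assumption; the paper's exact mixing scheme and loss may differ.

```python
import torch
import torch.nn.functional as F

def virtual_sample_kd_loss(teacher, student, x1, x2, alpha=1.0, T=4.0):
    """Mix two image batches into a virtual sample and match student to teacher on it."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    x_mix = lam * x1 + (1.0 - lam) * x2                # virtual sample from two images
    with torch.no_grad():
        p_teacher = F.softmax(teacher(x_mix) / T, dim=1)
    log_p_student = F.log_softmax(student(x_mix) / T, dim=1)
    # KL divergence encourages similar probability distributions on the virtual sample.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T * T
```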
1 code implementation • 23 Jul 2019 • Haoran Zhao, Xin Sun, Junyu Dong, Changrui Chen, Zihe Dong
Knowledge distillation aims to train a compact student network by transferring knowledge from a larger pre-trained teacher model.
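The standard distillation objective, in the spirit of Hinton et al. and given here as a generic sketch rather than this paper's specific loss, combines a hard-label cross-entropy term with a softened KL term between teacher and student predictions.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Hard-label cross-entropy plus softened KL to the teacher's predictions."""
    ce = F.cross_entropy(student_logits, labels)       # supervision from hard labels
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * T * T                                          # supervision from soft targets
    return alpha * ce + (1.0 - alpha) * kd
```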