LLaMA Pro: Progressive LLaMA with Block Expansion

1 code implementation 4 Jan 2024 Chengyue Wu, Yukang Gan, Yixiao Ge, Zeyu Lu, Jiahao Wang, Ye Feng, Ping Luo, Ying Shan

Humans generally acquire new skills without compromising the old; however, the opposite holds for Large Language Models (LLMs), e.g., from LLaMA to CodeLLaMA.

Instruction Following, Math

Binary Embedding-based Retrieval at Tencent

1 code implementation 17 Feb 2023 Yukang Gan, Yixiao Ge, Chang Zhou, Shupeng Su, Zhouchuan Xu, Xuyuan Xu, Quanchao Hui, Xiang Chen, Yexin Wang, Ying Shan

To tackle the challenge, we propose a binary embedding-based retrieval (BEBR) engine equipped with a recurrent binarization algorithm that enables customized bits per dimension.

Binarization, Retrieval

Cross-Modal Attentional Context Learning for RGB-D Object Detection

no code implementations 30 Oct 2018 Guanbin Li, Yukang Gan, Hejun Wu, Nong Xiao, Liang Lin

In this paper, we address this problem by developing a Cross-Modal Attentional Context (CMAC) learning framework, which enables the full exploitation of the context information from both RGB and depth data.

Autonomous Driving, Object, +2

Monocular Depth Estimation with Affinity, Vertical Pooling, and Label Enhancement

no code implementations ECCV 2018 Yukang Gan, Xiangyu Xu, Wenxiu Sun, Liang Lin

While significant progress has been made in monocular depth estimation with Convolutional Neural Networks (CNNs) extracting absolute features, such as edges and textures, the depth constraint of neighboring pixels, namely relative features, has been mostly ignored by recent methods.

Monocular Depth Estimation, Stereo Matching, +1

Knowledge-Guided Recurrent Neural Network Learning for Task-Oriented Action Prediction

no code implementations 15 Jul 2017 Liang Lin, Lili Huang, Tianshui Chen, Yukang Gan, Hui Cheng

This paper aims at task-oriented action prediction, i.e., predicting a sequence of actions towards accomplishing a specific task under a certain scene, which is a new problem in computer vision research.

Common Sense Reasoning, valid

LSTM-CF: Unifying Context Modeling and Fusion with LSTMs for RGB-D Scene Labeling

1 code implementation 18 Apr 2016 Zhen Li, Yukang Gan, Xiaodan Liang, Yizhou Yu, Hui Cheng, Liang Lin

Another long short-term memorized fusion layer is set up to integrate the contexts along the vertical direction from different channels, and perform bi-directional propagation of the fused vertical contexts along the horizontal direction to obtain true 2D global contexts.

Scene Labeling

