Search Results for author: Liang Zeng

Found 17 papers, 7 papers with code

Skywork-Math: Data Scaling Laws for Mathematical Reasoning in Large Language Models -- The Story Goes On

no code implementations 11 Jul 2024 Liang Zeng, Liangjun Zhong, Liang Zhao, Tianwen Wei, Liu Yang, Jujie He, Cheng Cheng, Rui Hu, Yang Liu, Shuicheng Yan, Han Fang, Yahui Zhou

In this paper, we investigate the underlying factors that potentially enhance the mathematical reasoning capabilities of large language models (LLMs).

GSM8K Math +1

Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning

no code implementations 20 Jun 2024 Chaojie Wang, Yanchen Deng, Zhiyi Lyu, Liang Zeng, Jujie He, Shuicheng Yan, Bo An

Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks.

GSM8K Math

Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models

1 code implementation 3 Jun 2024 Tianwen Wei, Bo Zhu, Liang Zhao, Cheng Cheng, Biye Li, Weiwei Lü, Peng Cheng, Jianhao Zhang, XiaoYu Zhang, Liang Zeng, Xiaokun Wang, Yutuan Ma, Rui Hu, Shuicheng Yan, Han Fang, Yahui Zhou

In this technical report, we introduce the training methodologies implemented in the development of Skywork-MoE, a high-performance mixture-of-experts (MoE) large language model (LLM) with 146 billion parameters and 16 experts.

Language Modelling Large Language Model

KeNet: Knowledge-enhanced Doc-Label Attention Network for Multi-label text classification

no code implementations 4 Mar 2024 Bo Li, Yuyan Chen, Liang Zeng

It is imperative to additionally acknowledge that the significance of knowledge is substantiated in the realm of MLTC.

Information Retrieval Multi Label Text Classification +4

When Spatio-Temporal Meet Wavelets: Disentangled Traffic Forecasting via Efficient Spectral Graph Attention Networks

1 code implementation IEEE 39th International Conference on Data Engineering (ICDE) 2023 Yuchen Fang, Yanjun Qin, Haiyong Luo, Fang Zhao, Bingbing Xu, Liang Zeng, Chenxing Wang

To capture these intricate dependencies, spatio-temporal networks, such as recurrent neural networks with graph convolution networks, graph convolution networks with temporal convolution networks, and temporal attention networks with full graph attention networks, are applied.

Graph Attention Traffic Prediction

MolKD: Distilling Cross-Modal Knowledge in Chemical Reactions for Molecular Property Prediction

no code implementations 3 May 2023 Liang Zeng, Lanqing Li, Jian Li

This paper studies this problem and proposes to incorporate chemical domain knowledge, specifically related to chemical reactions, for learning effective molecular representations.

Drug Discovery Molecular Property Prediction +1

Copy-Pasting Coherent Depth Regions Improves Contrastive Learning for Urban-Scene Segmentation

1 code implementation 25 Nov 2022 Liang Zeng, Attila Lengyel, Nergis Tömen, Jan van Gemert

For unsupervised semantic segmentation of urban scenes, our method surpasses the previous state-of-the-art baseline by +7.14% in mIoU on Cityscapes and +6.65% on KITTI.

Contrastive Learning Depth Estimation +3

ImGCL: Revisiting Graph Contrastive Learning on Imbalanced Node Classification

no code implementations 23 May 2022 Liang Zeng, Lanqing Li, Ziqi Gao, Peilin Zhao, Jian Li

Motivated by this observation, we propose a principled GCL framework on Imbalanced node classification (ImGCL), which automatically and adaptively balances the representations learned from GCL without labels.

Classification Contrastive Learning +2

Spatio-Temporal meets Wavelet: Disentangled Traffic Flow Forecasting via Efficient Spectral Graph Attention Network

no code implementations 6 Dec 2021 Yuchen Fang, Yanjun Qin, Haiyong Luo, Fang Zhao, Bingbing Xu, Chenxing Wang, Liang Zeng

Traffic forecasting is crucial for public safety and resource optimization, yet is very challenging due to three aspects: i) existing works mostly exploit intricate temporal patterns (e.g., the short-term thunderstorm and long-term daily trends) within a single method, which fail to accurately capture spatio-temporal dependencies under different schemas; ii) the under-exploration of the graph positional encoding limits the extraction of spatial information in the commonly used full graph attention network; iii) the quadratic complexity of the full graph attention introduces heavy computational needs.

Graph Attention Time Series Analysis

CDGNet: A Cross-Time Dynamic Graph-based Deep Learning Model for Traffic Forecasting

no code implementations 6 Dec 2021 Yuchen Fang, Yanjun Qin, Haiyong Luo, Fang Zhao, Liang Zeng, Bo Hui, Chenxing Wang

Besides, we propose a novel encoder-decoder architecture to incorporate the cross-time dynamic graph-based GCN for multi-step traffic forecasting.

Decoder

Inductive Matrix Completion Using Graph Autoencoder

2 code implementations 25 Aug 2021 Wei Shen, Chuheng Zhang, Yun Tian, Liang Zeng, Xiaonan He, Wanchun Dou, Xiaolong Xu

However, without node content (i.e., side information) for training, the user (or item) specific representation cannot be learned in the inductive setting, that is, a model trained on one group of users (or items) cannot adapt to new users (or items).

Graph Neural Network Matrix Completion

Trade When Opportunity Comes: Price Movement Forecasting via Locality-Aware Attention and Iterative Refinement Labeling

no code implementations 26 Jul 2021 Liang Zeng, Lei Wang, Hui Niu, Ruchen Zhang, Ling Wang, Jian Li

Price movement forecasting, aimed at predicting financial asset trends based on current market information, has achieved promising advancements through machine learning (ML) methods.

Metric Learning Time Series Analysis

AKE-GNN: Effective Graph Learning with Adaptive Knowledge Exchange

no code implementations 10 Jun 2021 Liang Zeng, Jin Xu, Zijun Yao, Yanqiao Zhu, Jian Li

In this paper, we propose to substitute these redundant channels with other informative channels to achieve this goal.

Graph Classification Graph Learning +4

Context-Aware Sparse Deep Coordination Graphs

1 code implementation ICLR 2022 Tonghan Wang, Liang Zeng, Weijun Dong, Qianlan Yang, Yang Yu, Chongjie Zhang

Learning sparse coordination graphs adaptive to the coordination dynamics among agents is a long-standing problem in cooperative multi-agent learning.

graph construction Graph Learning +2

AnatomyNet: Deep Learning for Fast and Fully Automated Whole-volume Segmentation of Head and Neck Anatomy

2 code implementations 15 Aug 2018 Wentao Zhu, Yufang Huang, Liang Zeng, Xuming Chen, Yong Liu, Zhen Qian, Nan Du, Wei Fan, Xiaohui Xie

Methods: Our deep learning model, called AnatomyNet, segments OARs from head and neck CT images in an end-to-end fashion, receiving whole-volume HaN CT images as input and generating masks of all OARs of interest in one shot.

3D Medical Imaging Segmentation Anatomy
