no code implementations • 11 Jul 2024 • Liang Zeng, Liangjun Zhong, Liang Zhao, Tianwen Wei, Liu Yang, Jujie He, Cheng Cheng, Rui Hu, Yang Liu, Shuicheng Yan, Han Fang, Yahui Zhou
In this paper, we investigate the underlying factors that potentially enhance the mathematical reasoning capabilities of large language models (LLMs).
no code implementations • 20 Jun 2024 • Chaojie Wang, Yanchen Deng, Zhiyi Lyu, Liang Zeng, Jujie He, Shuicheng Yan, Bo An
Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks.
1 code implementation • 3 Jun 2024 • Tianwen Wei, Bo Zhu, Liang Zhao, Cheng Cheng, Biye Li, Weiwei Lü, Peng Cheng, Jianhao Zhang, XiaoYu Zhang, Liang Zeng, Xiaokun Wang, Yutuan Ma, Rui Hu, Shuicheng Yan, Han Fang, Yahui Zhou
In this technical report, we introduce the training methodologies implemented in the development of Skywork-MoE, a high-performance mixture-of-experts (MoE) large language model (LLM) with 146 billion parameters and 16 experts.
no code implementations • 2 Jun 2024 • Liang Zhao, Tianwen Wei, Liang Zeng, Cheng Cheng, Liu Yang, Peng Cheng, Lijie Wang, Chenxia Li, Xuejie Wu, Bo Zhu, Yimeng Gan, Rui Hu, Shuicheng Yan, Han Fang, Yahui Zhou
We introduce LongSkywork, a long-context Large Language Model (LLM) capable of processing up to 200,000 tokens.
no code implementations • 4 Mar 2024 • Bo Li, Yuyan Chen, Liang Zeng
It is also important to acknowledge that the significance of knowledge has been substantiated in the realm of MLTC.
1 code implementation • IEEE 39th International Conference on Data Engineering (ICDE) 2023 • Yuchen Fang, Yanjun Qin, Haiyong Luo, Fang Zhao, Bingbing Xu, Liang Zeng, Chenxing Wang
To capture these intricate dependencies, spatio-temporal networks, such as recurrent neural networks with graph convolution networks, graph convolution networks with temporal convolution networks, and temporal attention networks with full graph attention networks, are applied.
Ranked #2 on Traffic Prediction on PeMSD8
no code implementations • 3 May 2023 • Liang Zeng, Lanqing Li, Jian Li
This paper studies this problem and proposes to incorporate chemical domain knowledge, specifically related to chemical reactions, for learning effective molecular representations.
1 code implementation • 25 Nov 2022 • Liang Zeng, Attila Lengyel, Nergis Tömen, Jan van Gemert
For unsupervised semantic segmentation of urban scenes, our method surpasses the previous state-of-the-art baseline by +7.14% in mIoU on Cityscapes and +6.65% on KITTI.
1 code implementation • 16 Sep 2022 • Lanqing Li, Liang Zeng, Ziqi Gao, Shen Yuan, Yatao Bian, Bingzhe Wu, Hengtong Zhang, Yang Yu, Chan Lu, Zhipeng Zhou, Hongteng Xu, Jia Li, Peilin Zhao, Pheng-Ann Heng
The last decade has witnessed a prosperous development of computational methods and dataset curation for AI-aided drug discovery (AIDD).
no code implementations • 23 May 2022 • Liang Zeng, Lanqing Li, Ziqi Gao, Peilin Zhao, Jian Li
Motivated by this observation, we propose a principled GCL framework on Imbalanced node classification (ImGCL), which automatically and adaptively balances the representations learned from GCL without labels.
no code implementations • 6 Dec 2021 • Yuchen Fang, Yanjun Qin, Haiyong Luo, Fang Zhao, Bingbing Xu, Chenxing Wang, Liang Zeng
Traffic forecasting is crucial for public safety and resource optimization, yet is very challenging due to three aspects: i) existing works mostly exploit intricate temporal patterns (e.g., the short-term thunderstorm and long-term daily trends) within a single method, and so fail to accurately capture spatio-temporal dependencies under different schemas; ii) the under-exploration of graph positional encoding limits the extraction of spatial information in the commonly used full graph attention network; iii) the quadratic complexity of full graph attention introduces heavy computational needs.
no code implementations • 6 Dec 2021 • Yuchen Fang, Yanjun Qin, Haiyong Luo, Fang Zhao, Liang Zeng, Bo Hui, Chenxing Wang
Besides, we propose a novel encoder-decoder architecture to incorporate the cross-time dynamic graph-based GCN for multi-step traffic forecasting.
2 code implementations • 25 Aug 2021 • Wei Shen, Chuheng Zhang, Yun Tian, Liang Zeng, Xiaonan He, Wanchun Dou, Xiaolong Xu
However, without node content (i.e., side information) for training, the user (or item) specific representation cannot be learned in the inductive setting, that is, a model trained on one group of users (or items) cannot adapt to new users (or items).
Ranked #3 on Recommendation Systems on MovieLens 1M
no code implementations • 26 Jul 2021 • Liang Zeng, Lei Wang, Hui Niu, Ruchen Zhang, Ling Wang, Jian Li
Price movement forecasting, aimed at predicting financial asset trends based on current market information, has achieved promising advancements through machine learning (ML) methods.
no code implementations • 10 Jun 2021 • Liang Zeng, Jin Xu, Zijun Yao, Yanqiao Zhu, Jian Li
In this paper, we propose to substitute these redundant channels with other informative channels to achieve this goal.
1 code implementation • ICLR 2022 • Tonghan Wang, Liang Zeng, Weijun Dong, Qianlan Yang, Yang Yu, Chongjie Zhang
Learning sparse coordination graphs adaptive to the coordination dynamics among agents is a long-standing problem in cooperative multi-agent learning.
2 code implementations • 15 Aug 2018 • Wentao Zhu, Yufang Huang, Liang Zeng, Xuming Chen, Yong Liu, Zhen Qian, Nan Du, Wei Fan, Xiaohui Xie
Methods: Our deep learning model, called AnatomyNet, segments OARs from head and neck CT images in an end-to-end fashion, receiving whole-volume HaN CT images as input and generating masks of all OARs of interest in one shot.