no code implementations • 30 Jan 2025 • Yue Wang, Qiuzhi Liu, Jiahao Xu, Tian Liang, Xingyu Chen, Zhiwei He, Linfeng Song, Dian Yu, Juntao Li, Zhuosheng Zhang, Rui Wang, Zhaopeng Tu, Haitao Mi, Dong Yu
To address underthinking, we propose a decoding strategy with a thought switching penalty (TIP) that discourages premature transitions between thoughts, encouraging deeper exploration of each reasoning path.
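The abstract only names the mechanism; below is a minimal sketch of how such a decoding-time penalty could work, assuming a hand-picked set of thought-transition tokens (e.g. "Wait", "Alternatively"); the token ids and hyperparameters are illustrative, not the paper's values.

```python
import torch

# Illustrative ids for tokens that typically open a new thought
# (e.g. "Wait", "Alternatively"); the actual TIP token set is model-specific.
SWITCH_TOKEN_IDS = [14524, 92014]

def apply_tip(logits: torch.Tensor, steps_in_thought: int,
              alpha: float = 3.0, beta: int = 600) -> torch.Tensor:
    """Subtract a penalty alpha from thought-switching tokens for the first
    beta decoding steps of the current thought, nudging the model to keep
    exploring its current reasoning path before switching."""
    if steps_in_thought < beta:
        logits = logits.clone()
        logits[..., SWITCH_TOKEN_IDS] -= alpha
    return logits
```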
no code implementations • 30 Dec 2024 • Xingyu Chen, Jiahao Xu, Tian Liang, Zhiwei He, Jianhui Pang, Dian Yu, Linfeng Song, Qiuzhi Liu, Mengfei Zhou, Zhuosheng Zhang, Rui Wang, Zhaopeng Tu, Haitao Mi, Dong Yu
The remarkable performance of models like OpenAI o1 can be attributed to their ability to emulate extended, human-like thinking during inference.
no code implementations • 22 Dec 2024 • Dian Yu, Yuheng Zhang, Jiahao Xu, Tian Liang, Linfeng Song, Zhaopeng Tu, Haitao Mi, Dong Yu
We propose CaP, a novel approach that uses external tools to refine chain-of-thought (CoT) responses generated by the same or other LLMs.
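The one-line summary leaves the refinement loop implicit; the sketch below shows the generic tool-in-the-loop pattern it gestures at (all details assumed, not the paper's algorithm): extract the program a CoT response contains, execute it with a real interpreter, and return the output as feedback for a revision round.

````python
import re
import subprocess

def execute_cot_code(cot_response: str, timeout: int = 10) -> str:
    """Extract the first fenced Python block from a chain-of-thought response
    and run it in a subprocess; the output (or error) can be fed back to the
    LLM as tool feedback for refining the response."""
    match = re.search(r"```python\n(.*?)```", cot_response, re.S)
    if match is None:
        return ""
    result = subprocess.run(["python", "-c", match.group(1)],
                            capture_output=True, text=True, timeout=timeout)
    return result.stdout.strip() or result.stderr.strip()
````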
1 code implementation • 16 Dec 2024 • Longyue Wang, Siyou Liu, Chenyang Lyu, Wenxiang Jiao, Xing Wang, Jiahao Xu, Zhaopeng Tu, Yan Gu, WeiYu Chen, Minghao Wu, Liting Zhou, Philipp Koehn, Andy Way, Yulin Yuan
Following last year's edition, we continue to host the WMT translation shared task this year: the second edition of the Discourse-Level Literary Translation task.
1 code implementation • 29 Nov 2024 • Zicheng Lin, Tian Liang, Jiahao Xu, Qiuzhi Liu, Xing Wang, Ruilin Luo, Chufan Shi, Siheng Li, Yujiu Yang, Zhaopeng Tu
Our results underscore the potential of leveraging critical tokens to reduce errors in reasoning tasks, advancing the development of AI systems capable of robust logical deduction.
1 code implementation • 27 Nov 2024 • Ziyin Zhang, Jiahao Xu, Tian Liang, Xingyu Chen, Zhiwei He, Rui Wang, Zhaopeng Tu
Speculative Decoding (SD) has become an important technique for accelerating inference in large language models.
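For context, the classic SD loop in its simplest greedy form (a sketch under simplifying assumptions: batch size 1, HF-style causal LMs, greedy verification instead of the rejection-sampling acceptance rule used in practice):

```python
import torch

@torch.no_grad()
def speculative_decode(target, draft, ids, k=4, max_new_tokens=64):
    """Greedy speculative decoding sketch: a small draft model proposes k
    tokens, the large target model verifies them in one forward pass, and
    the longest agreed prefix is kept."""
    produced = 0
    while produced < max_new_tokens:
        # 1) Draft k tokens autoregressively with the cheap model.
        cand = ids
        for _ in range(k):
            nxt = draft(cand).logits[:, -1].argmax(-1, keepdim=True)
            cand = torch.cat([cand, nxt], dim=-1)
        proposal = cand[:, ids.shape[1]:]
        # 2) A single target forward pass scores every proposed position.
        logits = target(cand).logits
        tgt_pred = logits[:, ids.shape[1] - 1:-1].argmax(-1)
        # 3) Accept the prefix where both models agree, plus one token
        #    from the target at the first disagreement.
        n_ok = int((tgt_pred == proposal).long().cumprod(-1).sum())
        bonus = logits[:, ids.shape[1] - 1 + n_ok].argmax(-1, keepdim=True)
        ids = torch.cat([ids, proposal[:, :n_ok], bonus], dim=-1)
        produced += n_ok + 1
    return ids
```

Each iteration emits at least one token chosen by the target model, so the output matches greedy decoding from the target alone while the expensive model runs far fewer forward passes.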
1 code implementation • 1 Nov 2024 • Jiahao Xu, Zikai Zhang, Rui Hu
Inspired by this, we propose MASA, a method that utilizes individual unlearning on local models to identify malicious models in FL.
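The abstract does not detail how unlearning separates malicious from benign updates; the snippet below is only an illustrative scoring loop under assumed mechanics (gradient ascent as "unlearning", a small server-held reference batch), not the paper's algorithm.

```python
import copy
import torch

def unlearning_score(model, ref_batch, loss_fn, lr=0.01, steps=1):
    """Apply a few gradient-ascent 'unlearning' steps on a reference batch
    and return the resulting loss increase; local models whose response to
    unlearning is anomalous relative to their peers can be flagged."""
    m = copy.deepcopy(model)
    opt = torch.optim.SGD(m.parameters(), lr=lr)
    x, y = ref_batch
    before = loss_fn(m(x), y).item()
    for _ in range(steps):
        loss = -loss_fn(m(x), y)  # ascend the loss to "unlearn" the batch
        opt.zero_grad()
        loss.backward()
        opt.step()
    after = loss_fn(m(x), y).item()
    return after - before
```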
no code implementations • 14 Oct 2024 • Zikai Zhang, Jiahao Xu, Ping Liu, Rui Hu
Specifically, fine-tuning federated FMs (FedFMs) with low-rank adaptation (LoRA) modules instead of the full model across multiple clients achieves both parameter efficiency and data privacy.
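The parameter-efficiency argument is easy to see in code: only the low-rank adapter tensors ever leave a client, and the server aggregates just those. A FedAvg-style sketch (the full FedFM pipeline involves more than this):

```python
import torch

def aggregate_lora(client_states, weights):
    """Weighted FedAvg over LoRA adapter weights only. `client_states` is a
    list of state dicts holding just the low-rank A/B matrices, so traffic
    scales with the adapter size rather than the frozen foundation model;
    `weights` are typically the clients' local data sizes."""
    total = float(sum(weights))
    return {
        name: sum((w / total) * sd[name] for sd, w in zip(client_states, weights))
        for name in client_states[0]
    }
```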
1 code implementation • 2 Sep 2024 • Jiahao Xu, Zikai Zhang, Rui Hu
To address these challenges, we propose the Layer-Adaptive Sparsified Model Aggregation (LASA) approach, which combines pre-aggregation sparsification with layer-wise adaptive aggregation to improve robustness.
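A rough sketch of the two named ingredients follows, with illustrative rules that are assumptions rather than the paper's exact design: top-k magnitude sparsification of each layer before aggregation, then a per-layer filter that drops extreme-norm contributions.

```python
import torch

def topk_sparsify(update, ratio=0.1):
    """Pre-aggregation sparsification: keep only the largest-magnitude
    entries of each layer's update (top-k per layer)."""
    out = {}
    for name, t in update.items():
        k = max(1, int(t.numel() * ratio))
        thresh = t.abs().flatten().kthvalue(t.numel() - k + 1).values
        out[name] = torch.where(t.abs() >= thresh, t, torch.zeros_like(t))
    return out

def layerwise_filtered_mean(updates):
    """Layer-wise adaptive aggregation (illustrative rule): per layer,
    discard client updates whose norm deviates far from the median norm,
    then average the rest."""
    agg = {}
    for name in updates[0]:
        norms = torch.stack([u[name].norm() for u in updates])
        med = norms.median()
        kept = [u[name] for u, n in zip(updates, norms) if abs(n - med) <= med]
        pool = kept if kept else [u[name] for u in updates]
        agg[name] = torch.stack(pool).mean(0)
    return agg
```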
2 code implementations • 12 Jul 2024 • Youliang Yuan, Wenxiang Jiao, Wenxuan Wang, Jen-tse Huang, Jiahao Xu, Tian Liang, Pinjia He, Zhaopeng Tu
DeRTa incorporates two novel components: (1) Maximum Likelihood Estimation (MLE) with a Harmful Response Prefix, which trains models to recognize and avoid unsafe content by prepending a segment of a harmful response to a safe response, and (2) Reinforced Transition Optimization (RTO), which equips models with the ability to transition from potential harm to a safety refusal consistently throughout the harmful response sequence.
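Component (1) is essentially a data-construction recipe; a hedged sketch (truncation point and formatting are assumptions, not the paper's exact setup):

```python
import random

def build_mle_prefix_example(prompt, harmful_response, safe_response,
                             max_prefix_frac=0.5):
    """Prepend a randomly truncated prefix of a harmful response to the safe
    response, so the model learns to pivot into a refusal even after unsafe
    content has already started."""
    cut = random.randint(0, int(len(harmful_response) * max_prefix_frac))
    return {"input": prompt, "target": harmful_response[:cut] + safe_response}
```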
1 code implementation • 6 May 2024 • Shuhao Mei, Xin Li, Yuxi Zhou, Jiahao Xu, Yong Zhang, Yuxuan Wan, Shan Cao, Qinghao Zhao, Shijia Geng, Junqing Xie, ShengYong Chen, Shenda Hong
Chronic Obstructive Pulmonary Disease (COPD) is a chronic lung condition characterized by airflow obstruction.
1 code implementation • 26 Mar 2024 • Yilun Zheng, Jiahao Xu, Lihui Chen
Under heterophily, where nodes with different labels tend to be connected based on semantic meaning, Graph Neural Networks (GNNs) often exhibit suboptimal performance.
Ranked #22 on Node Classification on Texas
1 code implementation • 20 Oct 2023 • Jiahao Xu, Wei Shao, Lihui Chen, Lemao Liu
This paper proposes the DistillCSE framework, which performs contrastive learning under the self-training paradigm with knowledge distillation.
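An illustrative form of such an objective (assumed, not the paper's exact loss): an InfoNCE term over two dropout views of each sentence, plus a KL term that distills the teacher's in-batch similarity distribution into the student.

```python
import torch
import torch.nn.functional as F

def distill_contrastive_loss(z1, z2, t1, t2, tau=0.05, lam=1.0):
    """z1, z2: student embeddings of two dropout views, shape [B, d];
    t1, t2: frozen teacher embeddings of the same views."""
    sim_s = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1) / tau
    labels = torch.arange(z1.size(0), device=z1.device)
    nce = F.cross_entropy(sim_s, labels)           # contrastive term
    with torch.no_grad():
        sim_t = F.cosine_similarity(t1.unsqueeze(1), t2.unsqueeze(0), dim=-1) / tau
    kd = F.kl_div(F.log_softmax(sim_s, dim=-1),    # distillation term
                  F.softmax(sim_t, dim=-1), reduction="batchmean")
    return nce + lam * kd
```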
1 code implementation • NAACL 2022 • Jiahao Xu, Yubin Ruan, Wei Bi, Guoping Huang, Shuming Shi, Lihui Chen, Lemao Liu
Back-translation (BT) is one of the most widely used techniques in neural machine translation (NMT) research.
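The BT recipe itself fits in a few lines: translate target-side monolingual sentences back into the source language with a reverse model, then pair the synthetic sources with the real targets as extra training data for the forward model (the checkpoint below is chosen purely for illustration).

```python
from transformers import pipeline

# Reverse (target -> source) model; here de->en for training an en->de system.
reverse = pipeline("translation", model="Helsinki-NLP/opus-mt-de-en")

def back_translate(target_sentences):
    """Return (synthetic source, real target) pairs for forward-model training."""
    synthetic = [o["translation_text"] for o in reverse(target_sentences)]
    return list(zip(synthetic, target_sentences))
```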
1 code implementation • 15 Sep 2023 • Yancheng Cai, Bo Zhang, Baopu Li, Tao Chen, Hongliang Yan, Jingdong Zhang, Jiahao Xu
Therefore, we focus on cross-domain background feature alignment while minimizing the influence of foreground features on the cross-domain alignment stage.
no code implementations • 22 May 2023 • Jiahao Xu, Wei Shao, Lihui Chen, Lemao Liu
This paper improves contrastive learning for sentence embeddings from two perspectives: handling dropout noise and addressing feature corruption.
1 code implementation • CVPR 2023 • Divya Saxena, Jiannong Cao, Jiahao Xu, Tarun Kulshrestha
Re-GAN stabilizes GAN training with less data and offers an alternative to existing GAN lottery-ticket and progressive-growing methods.
no code implementations • 20 May 2022 • Jiahao Xu, Zihuai Lin
Second, we investigate several CNN-based deep learning models (ResNet34, a hierarchical structure) alongside other architectures (LSTM, CLDNN).
no code implementations • 28 Oct 2020 • Xiaoyu Kou, Yankai Lin, Yuntao Li, Jiahao Xu, Peng Li, Jie zhou, Yan Zhang
Knowledge graph embedding (KGE), which aims to embed entities and relations into low-dimensional vectors, has recently attracted wide attention.
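The canonical example of such an embedding is TransE, where a true triple (h, r, t) should satisfy h + r ≈ t, so a lower translation distance means a more plausible triple:

```python
import torch

def transe_score(h, r, t, p=1):
    """TransE scoring: entities and relations live in the same low-dimensional
    space, and plausibility is the (negative) distance ||h + r - t||_p."""
    return torch.norm(h + r - t, p=p, dim=-1)
```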
1 code implementation • 23 Sep 2020 • Xinyi Zhang, Jiahao Xu, Charlie Soh, Lihui Chen
In this paper, we propose a Label-based Attention for Hierarchical Multi-label Text Classification Neural Network (LA-HCN), where a novel label-based attention module is designed to hierarchically extract important information from the text based on the labels at different hierarchy levels.
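A sketch of what a label-based attention module can look like (an illustration in the spirit of LA-HCN, not its exact design): label embeddings at one hierarchy level attend over the token states to build label-specific document representations.

```python
import torch
import torch.nn.functional as F

def label_attention(token_states, label_emb):
    """token_states: [B, L, d] encoder outputs; label_emb: [C, d] embeddings
    of the C labels at one hierarchy level. Returns [B, C, d] label-specific
    document vectors."""
    scores = torch.einsum("bld,cd->bcl", token_states, label_emb)  # affinities
    attn = F.softmax(scores, dim=-1)                               # over tokens
    return torch.einsum("bcl,bld->bcd", attn, token_states)
```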