no code implementations • 19 Feb 2024 • Zhengfu He, Xuyang Ge, Qiong Tang, Tianxiang Sun, Qinyuan Cheng, Xipeng Qiu
Sparse dictionary learning is a rapidly growing technique in mechanistic interpretability for tackling superposition and extracting more human-understandable features from model activations.
1 code implementation • 24 Jan 2024 • Qinyuan Cheng, Tianxiang Sun, Xiangyang Liu, Wenwei Zhang, Zhangyue Yin, ShiMin Li, Linyang Li, Zhengfu He, Kai Chen, Xipeng Qiu
To answer this question, we construct a model-specific "I don't know" (Idk) dataset for an assistant, which contains its known and unknown questions, based on existing open-domain question answering datasets.
1 code implementation • 28 Nov 2022 • Zhengfu He, Tianxiang Sun, Kuanning Wang, Xuanjing Huang, Xipeng Qiu
We present DiffusionBERT, a new generative masked language model based on discrete diffusion models.
1 code implementation • 14 Oct 2022 • Tianxiang Sun, Zhengfu He, Qin Zhu, Xipeng Qiu, Xuanjing Huang
MP2 is a set of combinable prompts pre-trained on 38 Chinese tasks.
1 code implementation • 23 May 2022 • Tianxiang Sun, Zhengfu He, Hong Qian, Yunhua Zhou, Xuanjing Huang, Xipeng Qiu
By contrast, gradient-free methods only require the forward computation of the PTM to tune the prompt, retaining the benefits of efficient tuning and deployment.
no code implementations • 13 Dec 2021 • Ximing Yang, Zhengfu He, Cheng Jin
Because most structural representations omit details, the lack of control over those details is one of the major weaknesses of structure-based controllable point cloud generation.