1 code implementation • 12 Mar 2025 • Zhaoling Chen, Xiangru Tang, Gangda Deng, Fang Wu, Jialong Wu, Zhiwei Jiang, Viktor Prasanna, Arman Cohan, Xingyao Wang
By parsing codebases into directed heterogeneous graphs, LocAgent creates a lightweight representation that captures code structures (files, classes, functions) and their dependencies (imports, invocations, inheritance), enabling LLM agents to effectively search and locate relevant entities through powerful multi-hop reasoning.
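As a rough illustration of the kind of representation described above (a minimal sketch, not LocAgent's actual implementation; `index_source` and the node/edge labels are assumptions), the snippet below indexes a Python source string into a directed heterogeneous graph with typed nodes (file, class, function) and typed edges (contains, imports, inherits), then answers a simple multi-hop reachability query.

```python
# Minimal sketch, not LocAgent's implementation: build a typed code graph and
# query it with a multi-hop traversal.
import ast
import networkx as nx

SOURCE = """
import os
class Base: ...
class Child(Base):
    def run(self): ...
"""

def index_source(name, source, g):
    g.add_node(name, kind="file")
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                g.add_node(alias.name, kind="module")
                g.add_edge(name, alias.name, kind="imports")
        elif isinstance(node, ast.ClassDef):
            cls = f"{name}::{node.name}"
            g.add_node(cls, kind="class")
            g.add_edge(name, cls, kind="contains")
            for base in node.bases:
                if isinstance(base, ast.Name):
                    g.add_edge(cls, f"{name}::{base.id}", kind="inherits")
        elif isinstance(node, ast.FunctionDef):
            fn = f"{name}::{node.name}"
            g.add_node(fn, kind="function")
            g.add_edge(name, fn, kind="contains")

g = nx.MultiDiGraph()
index_source("example.py", SOURCE, g)
# Entities reachable from the file within two hops (file -> class -> base class).
print(nx.single_source_shortest_path_length(g, "example.py", cutoff=2))
```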
1 code implementation • 10 Mar 2025 • Xiangru Tang, Daniel Shao, Jiwoong Sohn, Jiapeng Chen, Jiayi Zhang, Jinyu Xiang, Fang Wu, Yilun Zhao, Chenglin Wu, Wenqi Shi, Arman Cohan, Mark Gerstein
Large Language Models (LLMs) have shown impressive performance on existing medical question-answering benchmarks.
no code implementations • 11 Feb 2025 • Zicheng Liu, Siyuan Li, ZhiYuan Chen, Lei Xin, Fang Wu, Chang Yu, Qirong Yang, Yucheng Guo, Yujie Yang, Stan Z. Li
In this paper, we follow the guidance of the central dogma to redesign both the data and model pipeline and offer a comprehensive framework, Life-Code, that spans different biological functions.
no code implementations • 16 Oct 2024 • Zerui Xu, Fang Wu, Yuanyuan Zhang, Yue Zhao
Despite the advancements of large language models (LLMs) in general generation tasks, their potential in facilitating the generation of synthetic clinical trials is under-explored.
1 code implementation • 15 Jun 2024 • Yijun Liu, Yuan Meng, Fang Wu, Shenhao Peng, Hang Yao, Chaoyu Guan, Chen Tang, Xinzhu Ma, Zhi Wang, Wenwu Zhu
Based on this benchmark, we conduct extensive experiments with two well-known LLMs (English and Chinese) and four quantization algorithms to investigate this topic in depth, yielding several counter-intuitive and valuable findings, e.g., models quantized using a calibration set with the same distribution as the test data are not necessarily optimal.
1 code implementation • 13 Feb 2024 • Xiangru Tang, Howard Dai, Elizabeth Knight, Fang Wu, Yunyang Li, Tianxiao Li, Mark Gerstein
Within each theme, we identify a variety of subtasks and applications, highlighting important datasets, benchmarks, and model architectures and comparing the performance of top models.
1 code implementation • 4 Oct 2023 • Siyuan Li, Weiyang Jin, Zedong Wang, Fang Wu, Zicheng Liu, Cheng Tan, Stan Z. Li
The main challenge is how to distinguish high-quality pseudo labels in the presence of confirmation bias.
1 code implementation • 8 Apr 2023 • Fang Wu, Huiling Qin, Siyuan Li, Stan Z. Li, Xianyuan Zhan, Jinbo Xu
In the field of artificial intelligence for science, a persistent and essential challenge is the limited amount of labeled data available for real-world problems.
1 code implementation • 7 Jan 2023 • Fang Wu, Siyuan Li, Xurui Jin, Yinghui Jiang, Dragomir Radev, Zhangming Niu, Stan Z. Li
It takes advantage of MatchExplainer to fix the most informative portion of the graph and applies graph augmentations only to the remaining, less informative part.
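To illustrate the idea only (this is a hedged sketch, not the paper's actual procedure; `augment_keep_core`, the edge-dropping augmentation, and the hand-picked core nodes are assumptions), the snippet below keeps a given "informative" subgraph fixed and perturbs only the rest of the graph.

```python
# Sketch: augment a graph while leaving an explainer-selected core untouched.
import random
import networkx as nx

def augment_keep_core(g, core_nodes, drop_prob=0.3):
    aug = g.copy()
    for u, v in list(aug.edges()):
        if u in core_nodes and v in core_nodes:
            continue                      # never touch the informative core
        if random.random() < drop_prob:
            aug.remove_edge(u, v)         # drop edges only in the less informative part
    return aug

g = nx.karate_club_graph()
core = {0, 1, 2, 33}                      # stand-in for an explainer-selected subgraph
print(augment_keep_core(g, core).number_of_edges(), "edges after augmentation")
```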
1 code implementation • 7 Dec 2022 • Fang Wu, Lirong Wu, Dragomir Radev, Jinbo Xu, Stan Z. Li
Geometric deep learning has recently achieved great success in non-Euclidean domains, and learning on 3D structures of large biomolecules is emerging as a distinct research area.
3 code implementations • 27 May 2022 • Siyuan Li, Di Wu, Fang Wu, Zelin Zang, Stan Z. Li
We then propose an Architecture-Agnostic Masked Image Modeling framework (A$^2$MIM), which is compatible with both Transformers and CNNs in a unified way.
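As a generic illustration of masked image modeling (this is not A$^2$MIM itself; the tiny CNN, patch size, and masking ratio are placeholders), the sketch below masks random patches of an image and computes a reconstruction loss only on the masked pixels.

```python
# Generic masked image modeling sketch: mask patches, reconstruct masked pixels.
import torch
import torch.nn as nn

patch, img = 4, torch.randn(8, 3, 32, 32)                  # toy batch of images
mask = (torch.rand(8, 1, 32 // patch, 32 // patch) < 0.6).float()
mask = mask.repeat_interleave(patch, 2).repeat_interleave(patch, 3)

encoder = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.GELU(),
                        nn.Conv2d(32, 3, 3, padding=1))    # predicts pixel values
pred = encoder(img * (1 - mask))                            # encode the masked input
loss = (((pred - img) ** 2) * mask).sum() / mask.sum()      # loss on masked pixels only
loss.backward()
```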
1 code implementation • 15 May 2022 • Fang Wu, Siyuan Li, Lirong Wu, Dragomir Radev, Stan Z. Li
Graph neural networks (GNNs) mainly rely on the message-passing paradigm to propagate node features and build interactions, and different graph learning tasks require different ranges of node interactions.
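The generic message-passing paradigm mentioned above can be sketched in a few lines (this is a minimal NumPy illustration, not the paper's specific model; the mean-aggregation and ReLU update are assumptions): one layer aggregates features from one-hop neighbors, so stacking k layers gives each node a k-hop interaction range.

```python
# One round of message passing: aggregate neighbor features, then update.
import numpy as np

def message_passing_layer(adj, h, w):
    deg = adj.sum(axis=1, keepdims=True).clip(min=1.0)   # avoid division by zero
    messages = adj @ h / deg                              # mean over one-hop neighbors
    return np.maximum(0.0, (h + messages) @ w)            # update + ReLU

# Toy graph: 4 nodes on a path 0-1-2-3.
adj = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3)]:
    adj[i, j] = adj[j, i] = 1.0
h = np.eye(4)                                 # one-hot node features
w = np.random.default_rng(0).normal(size=(4, 4))
for _ in range(2):                            # two layers => 2-hop interactions
    h = message_passing_layer(adj, h, w)
print(h.shape)
```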
no code implementations • 19 Apr 2022 • Fang Wu, Stan Z. Li
To remove this requirement, we propose a novel model called DiffMD by directly estimating the gradient of the log density of molecular conformations.
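"Estimating the gradient of the log density" is the score of the data distribution; the sketch below illustrates the standard denoising score-matching recipe for learning such a score (this is a generic illustration, not DiffMD's actual objective or architecture; the small MLP, noise level, and 3-D coordinates are placeholders).

```python
# Denoising score matching sketch: train score_net(x_noisy) to match the known
# score of the Gaussian perturbation, -(x_noisy - x_clean) / sigma^2.
import torch
import torch.nn as nn

score_net = nn.Sequential(nn.Linear(3, 64), nn.SiLU(), nn.Linear(64, 3))
optimizer = torch.optim.Adam(score_net.parameters(), lr=1e-3)
sigma = 0.1

x_clean = torch.randn(256, 3)                  # stand-in for atom coordinates
for _ in range(100):
    noise = torch.randn_like(x_clean) * sigma
    x_noisy = x_clean + noise
    target = -noise / sigma**2                 # score of the perturbation kernel
    loss = ((score_net(x_noisy) - target) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```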
1 code implementation • 13 Feb 2022 • Fang Wu, Nicolas Courty, Shuting Jin, Stan Z. Li
Training data are usually limited or heterogeneous in many chemical and biological applications.
2 code implementations • 4 Oct 2021 • Fang Wu, Dragomir Radev, Stan Z. Li
Then HMGs are constructed with both atom-level and motif-level nodes.
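For intuition only, the following sketch shows what a heterogeneous graph with atom-level and motif-level nodes could look like (an assumption-laden illustration, not the paper's HMG construction: rings detected by RDKit stand in for whatever motif vocabulary the actual method uses).

```python
# Two node types -- atoms and ring "motifs" -- with motif-atom membership edges.
import networkx as nx
from rdkit import Chem

mol = Chem.MolFromSmiles("c1ccccc1O")           # phenol
g = nx.Graph()

for atom in mol.GetAtoms():                      # atom-level nodes
    g.add_node(("atom", atom.GetIdx()), symbol=atom.GetSymbol())
for bond in mol.GetBonds():                      # atom-atom edges
    g.add_edge(("atom", bond.GetBeginAtomIdx()), ("atom", bond.GetEndAtomIdx()))

for i, ring in enumerate(mol.GetRingInfo().AtomRings()):   # motif-level nodes
    g.add_node(("motif", i), size=len(ring))
    for idx in ring:                             # motif-atom membership edges
        g.add_edge(("motif", i), ("atom", idx))

print(g.number_of_nodes(), "nodes,", g.number_of_edges(), "edges")
```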
2 code implementations • 28 Mar 2021 • Fang Wu, Stan Z. Li
Sentence insertion is an interesting NLP problem but has received insufficient attention.