Molecule Captioning
19 papers with code • 2 benchmarks • 2 datasets
Molecular description generation (molecule captioning) is the task of producing a textual description of a molecule's structure, properties, biological activity, and applications from its molecular descriptors. It gives chemists and biologists quick access to essential molecular information, efficiently guiding their research and experiments.
Most implemented papers
A Molecular Multimodal Foundation Model Associating Molecule Graphs with Natural Language
Although artificial intelligence (AI) has made significant progress in understanding molecules across a wide range of fields, existing models generally acquire a single cognitive ability from a single molecular modality.
MolFM: A Multimodal Molecular Foundation Model
In this study, we introduce MolFM, a multimodal molecular foundation model designed to facilitate joint representation learning from molecular structures, biomedical texts, and knowledge graphs.
Translation between Molecules and Natural Language
We present $\textbf{MolT5}$ $-$ a self-supervised learning framework for pretraining models on a vast amount of unlabeled natural language text and molecule strings.
Graph-based Molecular Representation Learning
Recently, MRL has achieved considerable progress, especially in methods based on deep molecular graph learning.
Unifying Molecular and Textual Representations via Multi-task Language Modelling
Here, we propose the first multi-domain, multi-task language model that can solve a wide range of tasks in both the chemical and natural language domains.
Empowering Molecule Discovery for Molecule-Caption Translation with Large Language Models: A ChatGPT Perspective
In this work, we propose a novel LLM-based framework (MolReGPT) for molecule-caption translation, which introduces an In-Context Few-Shot Molecule Learning paradigm that enables LLMs such as ChatGPT to perform molecule discovery via in-context learning, without domain-specific pre-training or fine-tuning.
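The in-context few-shot idea above can be sketched as: retrieve the most similar (SMILES, caption) pairs from a small example pool and assemble them into a few-shot prompt for a general-purpose LLM. This is a minimal illustrative sketch, not MolReGPT's exact method; the similarity measure here (character-bigram Jaccard over SMILES strings), the example pool, and the prompt wording are all assumptions made for demonstration.

```python
def bigrams(s):
    """Character bigrams of a SMILES string (a crude structural proxy)."""
    return {s[i:i + 2] for i in range(len(s) - 1)}

def jaccard(a, b):
    """Jaccard similarity of the two bigram sets; 0.0 when both are empty."""
    sa, sb = bigrams(a), bigrams(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def build_prompt(query_smiles, example_pool, k=2):
    """Rank pool examples by similarity to the query and build a few-shot prompt."""
    ranked = sorted(example_pool,
                    key=lambda ex: jaccard(query_smiles, ex[0]),
                    reverse=True)
    parts = ["Describe the molecule given its SMILES string."]
    for smiles, caption in ranked[:k]:
        parts.append(f"SMILES: {smiles}\nCaption: {caption}")
    parts.append(f"SMILES: {query_smiles}\nCaption:")
    return "\n\n".join(parts)

# Hypothetical example pool; real systems would retrieve from a training corpus
# such as ChEBI-20 and use molecular-fingerprint similarity instead.
pool = [
    ("CCO", "Ethanol, a simple primary alcohol."),
    ("CC(=O)O", "Acetic acid, a simple carboxylic acid."),
    ("c1ccccc1", "Benzene, an aromatic hydrocarbon."),
]
prompt = build_prompt("CC(=O)OC", pool, k=2)
```

The resulting `prompt` string would then be sent to the LLM, whose completion after the final `Caption:` is taken as the generated description.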
GIT-Mol: A Multi-modal Large Language Model for Molecular Science with Graph, Image, and Text
Large language models have made significant strides in natural language processing, enabling innovative applications in molecular science by processing textual representations of molecules.
From Artificially Real to Real: Leveraging Pseudo Data from Large Language Models for Low-Resource Molecule Discovery
Furthermore, our method shows a sustained improvement as the volume of pseudo data increases, revealing the great potential of pseudo data in advancing low-resource cross-modal molecule discovery.
BioT5: Enriching Cross-modal Integration in Biology with Chemical Knowledge and Natural Language Associations
Recent advancements in biological research leverage the integration of molecules, proteins, and natural language to enhance drug discovery.
MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter
MolCA enables an LM (e.g., Galactica) to understand both text- and graph-based molecular content via the cross-modal projector.
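A cross-modal projector of this kind can be sketched as a small query-based attention module that compresses a variable number of graph-encoder node embeddings into a fixed number of "soft tokens" in the LM's embedding space. This is a minimal, hypothetical stand-in written in PyTorch, not MolCA's actual implementation; the dimensions, number of queries, and layer choices are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CrossModalProjector(nn.Module):
    """Map graph-encoder node embeddings (B, N, graph_dim) to a fixed set of
    soft tokens (B, num_queries, lm_dim) via learned queries + cross-attention."""

    def __init__(self, graph_dim=300, lm_dim=768, num_queries=8, num_heads=4):
        super().__init__()
        # Learned query vectors that attend over the molecule's node embeddings.
        self.queries = nn.Parameter(torch.randn(num_queries, graph_dim))
        self.attn = nn.MultiheadAttention(graph_dim, num_heads, batch_first=True)
        # Linear map from the graph space into the LM's embedding space.
        self.proj = nn.Linear(graph_dim, lm_dim)

    def forward(self, node_emb):
        # node_emb: (batch, num_nodes, graph_dim)
        batch = node_emb.size(0)
        q = self.queries.unsqueeze(0).expand(batch, -1, -1)
        out, _ = self.attn(q, node_emb, node_emb)  # cross-attention: queries -> nodes
        return self.proj(out)                      # (batch, num_queries, lm_dim)

projector = CrossModalProjector()
soft_tokens = projector(torch.randn(2, 10, 300))  # shape (2, 8, 768)
```

The resulting soft tokens would be prepended to the LM's text-token embeddings, letting a frozen or adapter-tuned LM condition its caption on the molecular graph.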