TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Cross-Modal Retrieval	ChEBI-20	All-Ensemble	Mean Rank	20.21	# 2
Cross-Modal Retrieval	ChEBI-20	All-Ensemble	Test MRR	49.9	# 2
Cross-Modal Retrieval	ChEBI-20	All-Ensemble	Hits@1	34.4	# 2
Cross-Modal Retrieval	ChEBI-20	All-Ensemble	Hits@10	81.1	# 2
Cross-Modal Retrieval	ChEBI-20	GCN2	Mean Rank	41.90	# 4
Cross-Modal Retrieval	ChEBI-20	GCN2	Test MRR	37.1	# 4
Cross-Modal Retrieval	ChEBI-20	GCN2	Hits@1	22.3	# 4
Cross-Modal Retrieval	ChEBI-20	GCN2	Hits@10	68.9	# 3
Cross-Modal Retrieval	ChEBI-20	MLP1	Mean Rank	30.38	# 3
Cross-Modal Retrieval	ChEBI-20	MLP1	Test MRR	37.2	# 3
Cross-Modal Retrieval	ChEBI-20	MLP1	Hits@1	22.4	# 3
Cross-Modal Retrieval	ChEBI-20	MLP1	Hits@10	68.6	# 4
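
The four metrics in this table are all functions of one quantity: the rank at which the correct molecule appears for each text query. The Test MRR and Hits@k values appear to be reported as percentages, since the abstract's 0.499 MRR matches the 49.9 listed for All-Ensemble. The sketch below shows how such metrics are conventionally computed from a list of 1-based ranks. The function name `retrieval_metrics` and the example ranks are illustrative assumptions, not taken from the paper or its code.

```python
def retrieval_metrics(ranks, ks=(1, 10)):
    """Compute standard retrieval metrics from 1-based ranks.

    ranks: for each query, the position of the correct item in the
           ranked candidate list (1 = retrieved first).
    Returns fractions in [0, 1]; multiply by 100 for percentages.
    """
    if not ranks:
        raise ValueError("ranks must be non-empty")
    n = len(ranks)
    metrics = {
        # Lower is better: average position of the correct item.
        "mean_rank": sum(ranks) / n,
        # Higher is better: average of reciprocal ranks.
        "mrr": sum(1.0 / r for r in ranks) / n,
    }
    for k in ks:
        # Fraction of queries whose correct item is in the top k.
        metrics[f"hits@{k}"] = sum(1 for r in ranks if r <= k) / n
    return metrics


# Hypothetical ranks for five queries (not real ChEBI-20 results).
example_ranks = [1, 2, 1, 15, 4]
m = retrieval_metrics(example_ranks)
```

Mean Rank is sensitive to a few badly ranked queries, while MRR and Hits@k emphasise the top of the list. That may be why the Mean Rank ordering of GCN2 and MLP1 differs from their Hits@10 ordering in the table.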

Text2Mol: Cross-Modal Molecule Retrieval with Natural Language Queries

EMNLP 2021 · Carl Edwards, ChengXiang Zhai, Heng Ji

We propose a new task, Text2Mol: retrieving molecules using natural language descriptions as queries. Natural language and molecules encode information in very different ways, which makes integrating the two modalities an exciting but challenging problem. Although some work has been done on text-based retrieval and structure-based retrieval, this new task requires integrating molecules and natural language more directly. It can also be viewed as an especially challenging cross-lingual retrieval problem, in which molecules are treated as a language with a unique grammar. We construct a paired dataset of molecules and their corresponding text descriptions, which we use to learn an aligned common semantic embedding space for retrieval. We extend this to a cross-modal attention-based model that supports explainability and reranking by interpreting the attentions as association rules. We also employ an ensemble approach to integrate our different architectures, which significantly improves MRR from 0.372 to 0.499. This multimodal approach opens a new perspective on solving problems in chemistry literature understanding and molecular machine learning.
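
Retrieval in an aligned embedding space, as the abstract describes it, can be sketched as follows: embed the text query and every candidate molecule into the same vector space, score each pair, and sort. This is a minimal illustration only. It assumes cosine similarity as the scoring function and uses made-up three-dimensional vectors; the names `cosine`, `rank_molecules`, and the `mol_*` identifiers are hypothetical and do not come from the paper's code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_molecules(text_vec, mol_vecs):
    """Return molecule ids sorted from most to least similar to the query."""
    scores = {mol_id: cosine(text_vec, vec) for mol_id, vec in mol_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy embeddings standing in for the outputs of trained text and
# molecule encoders (assumed, not real model outputs).
mol_vecs = {
    "mol_a": [1.0, 0.0, 0.0],
    "mol_b": [0.0, 1.0, 0.0],
    "mol_c": [0.7, 0.7, 0.0],
}
query = [0.9, 0.1, 0.0]  # embedding of a text description of mol_a

ranking = rank_molecules(query, mol_vecs)
# 1-based rank of the correct molecule, the quantity that feeds
# Mean Rank, MRR, and Hits@k.
rank_of_target = ranking.index("mol_a") + 1
```

In a real system the vectors would come from learned encoders, and the candidate set would be the full molecule pool rather than three entries.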

Code

cnedwards/text2mol official

Tasks

Cross-Modal Retrieval

Natural Language Queries

Retrieval

Datasets

Introduced in the Paper:

ChEBI-20

Results from the Paper

Ranked #2 on Cross-Modal Retrieval on ChEBI-20

