DisKnE is a benchmark for Disease Knowledge Evaluation built from MedNLI and MEDIQA-NLI. It is designed to test the medical reasoning capabilities of ML models, such as mapping symptoms to diseases.
The dataset was built by annotating each positive MedNLI example with the types of medical reasoning it requires. Negative examples were created by adversarially corrupting these positive examples. Furthermore, the training-test splits are defined per disease, ensuring that no knowledge about test diseases can be learned from the training data.
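
The per-disease split described above can be illustrated with a minimal sketch. It is not the benchmark's actual code: the record layout (a single `disease` field per example), the toy records, and the helper name `split_for_disease` are all assumptions made for illustration. The idea is only that every example about the held-out disease goes to the test set, so the training set carries no information about it.

```python
def split_for_disease(examples, held_out_disease):
    """Return (train, test) where test holds every example about the
    held-out disease and train holds everything else.

    Assumes each example is a dict with a hypothetical "disease" key.
    """
    train, test = [], []
    for ex in examples:
        if ex["disease"] == held_out_disease:
            test.append(ex)
        else:
            train.append(ex)
    return train, test


# Toy records, invented for illustration only (not taken from DisKnE).
examples = [
    {"disease": "pneumonia", "premise": "Patient has fever and cough.", "label": 1},
    {"disease": "pneumonia", "premise": "Patient has a sprained ankle.", "label": 0},
    {"disease": "diabetes", "premise": "Patient has elevated blood glucose.", "label": 1},
    {"disease": "asthma", "premise": "Patient reports wheezing.", "label": 1},
]

train, test = split_for_disease(examples, "pneumonia")

# The held-out disease must not appear anywhere in the training set.
train_diseases = {ex["disease"] for ex in train}
test_diseases = {ex["disease"] for ex in test}
assert train_diseases.isdisjoint(test_diseases)
```

Repeating this call once per disease would yield one training-test split per disease. Real examples may mention more than one disease, which this sketch does not handle.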