As part of our open-science mission, OpenADMET aims to curate the ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity) data used to train our models and to disseminate it for general use.
An easy and convenient way of sharing and accessing these datasets (largely borrowed from the geosciences) is via Intake catalogs. Intake is a lightweight, user-friendly data access tool that simplifies data discovery, loading, and sharing. See here for more information: https://intake.readthedocs.io/en/latest/index.html
This repository hosts Intake catalogs for various ADMET datasets curated by OpenADMET as well as the curation steps as implemented in openadmet_toolkit: https://github.com/OpenADMET/openadmet_toolkit
This repo is under very active development; we make no guarantees about the stability or correctness of any catalogs contained herein.
You can use the data contained here directly by downloading it or cloning the repo.
- To use the Intake catalogs, install the required dependencies:

      pip install intake

- Open a Python session or Jupyter session and load a catalog. Here we are loading a catalog of pChEMBL curated from ChEMBL for a set of targets with functionality available in openadmet_toolkit:

      import intake
      catalog = intake.open_catalog("https://github.com/OpenADMET/data-catalogs/blob/main/catalogs/activities/ChEMBL_pChEMBL_permissive/CATALOG_ChEMBL35_permissive.yaml")  # also available on S3

- List available datasets:

      print(catalog.entries)
      >>> ...

- Load a specific dataset, here for the Pregnane X receptor (PXR, CHEMBL3401):

      df = catalog["PXR_aggregated"].read()
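If loading from the GitHub URL above fails, one possible cause is that `blob` URLs usually serve an HTML page rather than the raw file. The sketch below shows one way to rewrite such a URL to its `raw.githubusercontent.com` equivalent using only the standard library. The helper name `to_raw_github_url` is hypothetical and not part of Intake or openadmet_toolkit, and it assumes the branch name contains no slashes.

```python
from urllib.parse import urlsplit, urlunsplit


def to_raw_github_url(url: str) -> str:
    """Rewrite a github.com 'blob' URL to its raw.githubusercontent.com form.

    Hypothetical helper for illustration only. Assumes the URL has the shape
    https://github.com/<owner>/<repo>/blob/<branch>/<path> and that <branch>
    contains no slashes. Any other URL is returned unchanged.
    """
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        return url
    owner, repo, _, ref, *path = segments
    raw_path = "/" + "/".join([owner, repo, ref, *path])
    return urlunsplit(("https", "raw.githubusercontent.com", raw_path, "", ""))


CATALOG_URL = (
    "https://github.com/OpenADMET/data-catalogs/blob/main/catalogs/activities/"
    "ChEMBL_pChEMBL_permissive/CATALOG_ChEMBL35_permissive.yaml"
)

raw_url = to_raw_github_url(CATALOG_URL)
print(raw_url)
```

The resulting `raw_url` could then be passed to `intake.open_catalog` in place of the original URL.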
This repository is distributed under an open license to promote accessibility and collaboration. Please refer to the LICENSE file for more details.
For questions, suggestions, or collaborations, please reach out via the OpenADMET organization or submit an issue in this repository.