OpenML-Python provides an easy-to-use Python interface for OpenML, an online platform for open science collaboration in machine learning. It can download data from and upload data to OpenML, such as datasets and machine learning experiment results.
Use the following code to get the credit-g dataset:

    import openml

    dataset = openml.datasets.get_dataset("credit-g")  # or by ID get_dataset(31)
    X, y, categorical_indicator, attribute_names = dataset.get_data(target="class")

Get a task for supervised classification on credit-g:

    import openml

    task = openml.tasks.get_task(31)
    dataset = task.get_dataset()
    X, y, categorical_indicator, attribute_names = dataset.get_data(target=task.target_name)
    # get splits for the first fold of 10-fold cross-validation
    train_indices, test_indices = task.get_train_test_split_indices(fold=0)

Use an OpenML benchmarking suite to get a curated list of machine-learning tasks:

    import openml

    suite = openml.study.get_suite("amlb-classification-all")  # Get a curated list of tasks for classification
    for task_id in suite.tasks:
        task = openml.tasks.get_task(task_id)

OpenML-Python is supported on Python 3.8 to 3.13 and is available on Linux, macOS, and Windows.
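Before installing, it can help to confirm that the running interpreter falls inside the supported range stated above. The following is a minimal sketch using only the standard library; the names `MIN_SUPPORTED`, `MAX_SUPPORTED`, and `is_supported` are illustrative and are not part of OpenML-Python.

```python
import sys

# Bounds copied from the support statement above (assumption: they are
# inclusive and refer to major.minor versions). Update them if it changes.
MIN_SUPPORTED = (3, 8)
MAX_SUPPORTED = (3, 13)


def is_supported(version_info=sys.version_info):
    """Return True if the major.minor version lies within the supported range."""
    major_minor = (version_info[0], version_info[1])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


# Report the result for the interpreter running this script.
current = "{}.{}".format(sys.version_info[0], sys.version_info[1])
if is_supported():
    print("Python " + current + " is within the supported range.")
else:
    print("Python " + current + " is outside the supported range.")
```

Only the major and minor components are compared, so patch releases such as 3.13.1 are treated the same as 3.13.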
You can install OpenML-Python with:

    pip install openml

If you use OpenML-Python in a scientific publication, we would appreciate a reference to the following paper:
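Matthias Feurer, Jan N. van Rijn, Arlind Kadra, Pieter Gijsbers, Neeratyoy Mallik, Sahithya Ravi, Andreas Müller, Joaquin Vanschoren, Frank Hutter. OpenML-Python: an extensible Python API for OpenML. Journal of Machine Learning Research, 22(100):1-5, 2021.
BibTeX entry: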

    @article{JMLR:v22:19-920,
      author  = {Matthias Feurer and Jan N. van Rijn and Arlind Kadra and Pieter Gijsbers and Neeratyoy Mallik and Sahithya Ravi and Andreas Müller and Joaquin Vanschoren and Frank Hutter},
      title   = {OpenML-Python: an extensible Python API for OpenML},
      journal = {Journal of Machine Learning Research},
      year    = {2021},
      volume  = {22},
      number  = {100},
      pages   = {1--5},
      url     = {http://jmlr.org/papers/v22/19-920.html}
    }

We welcome contributions from both new and experienced developers!
If you would like to contribute to OpenML-Python, please read our Contribution Guidelines.
If you are new to open-source development, a great way to get started is by looking at issues labeled "good first issue" in our GitHub issue tracker. These tasks are beginner-friendly and help you understand the project structure, development workflow, and how to submit a pull request.
