mirdata

mirdata is an open-source Python library that provides tools for working with common Music Information Retrieval (MIR) datasets, including tools for:

  • downloading datasets to a common location and format

  • validating that the files for a dataset are all present

  • loading annotation files to a common format, consistent with mir_eval

  • parsing track-level metadata for detailed evaluations.

pip install mirdata
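Once installed, the tools listed above can be combined in a few lines. The following is a minimal sketch, not the canonical usage: the helper name fetch_example_track is hypothetical, and it assumes a recent mirdata release that exposes mirdata.initialize() along with the download(), validate(), and choice_track() methods on the returned dataset object.

```python
def fetch_example_track(dataset_name, data_home=None):
    """Download and validate a dataset, then return one randomly chosen track.

    Hypothetical helper for illustration; it is not part of mirdata itself.
    """
    # Imported inside the function so the sketch can be defined
    # even on a machine where mirdata is not installed yet.
    import mirdata

    # Create the loader; data_home=None lets mirdata use its default location.
    dataset = mirdata.initialize(dataset_name, data_home=data_home)

    # Fetch the dataset files to the chosen location.
    dataset.download()

    # Report files that are missing locally or that fail their checksum.
    # The two results are returned untouched; see the mirdata docs for their layout.
    missing_files, invalid_checksums = dataset.validate()

    # Pick one random track, with its annotations and metadata attached.
    track = dataset.choice_track()
    return track, missing_files, invalid_checksums
```

Calling fetch_example_track("orchset") would download the Orchset dataset on first use, so it needs network access and disk space; any other supported dataset name can be substituted.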

For more details on how to use the library, see the Tutorial.

Citing mirdata

If you are using the library for your work, please cite the version you used as indexed at Zenodo:

https://doi.org/10.5281/zenodo.4355859

If you refer to mirdata’s design principles, motivation, etc., please cite the following paper [1]:

https://doi.org/10.5281/zenodo.3527750

[1] Rachel M. Bittner, Magdalena Fuentes, David Rubinstein, Andreas Jansson, Keunwoo Choi, and Thor Kell. “mirdata: Software for Reproducible Usage of Datasets.” In Proceedings of the 20th International Society for Music Information Retrieval (ISMIR) Conference, 2019.

When working with datasets, please cite the version of mirdata that you are using (given by the DOI above) and also cite the dataset itself. The dataset's reference is available from the respective dataset loader through its cite() method.
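As an illustration of the cite() method, the sketch below prints the reference for a given dataset. The helper name print_dataset_citation is hypothetical, and the code assumes that the loader is created with mirdata.initialize() and that cite() prints the reference rather than returning it.

```python
def print_dataset_citation(dataset_name):
    """Print the reference for a dataset via its loader's cite() method.

    Hypothetical helper for illustration; it is not part of mirdata itself.
    """
    # Imported inside the function so the sketch can be defined
    # even on a machine where mirdata is not installed yet.
    import mirdata

    # Initializing a loader does not download any data.
    dataset = mirdata.initialize(dataset_name)

    # Assumed to print the dataset's reference to standard output.
    dataset.cite()
```

For example, print_dataset_citation("orchset") would show the reference to include alongside the mirdata DOI.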

Contributing to mirdata

We welcome contributions to this library, especially new datasets. Please see Contributing for guidelines.

Further Information