mirdata

mirdata is an open-source Python library that provides tools for working with common Music Information Retrieval (MIR) datasets, including tools for:

  • downloading datasets to a common location and format

  • validating that the files for a dataset are all present

  • loading annotation files to a common format, consistent with mir_eval

  • parsing track-level metadata for detailed evaluations

pip install mirdata

For more details on how to use the library see the Tutorial.

Citing mirdata

If you are using the library for your work, please cite the version you used as indexed at Zenodo:

https://doi.org/10.5281/zenodo.4355859

If you refer to mirdata’s design principles, motivation, etc., please cite the following paper [1]:

https://doi.org/10.5281/zenodo.3527750

[1] Rachel M. Bittner, Magdalena Fuentes, David Rubinstein, Andreas Jansson, Keunwoo Choi, and Thor Kell. “mirdata: Software for Reproducible Usage of Datasets.” In Proceedings of the 20th International Society for Music Information Retrieval (ISMIR) Conference, 2019.

When working with datasets, please cite the version of mirdata that you are using (given by the DOI above) AND include the reference of the dataset, which can be found in the respective dataset loader using the cite() method.

Contributing to mirdata

We welcome contributions to this library, especially new datasets. Please see Contributing for guidelines.

Overview

pip install mirdata

mirdata is a library which aims to standardize how audio datasets are accessed in Python, removing the need to write custom loaders in every project and improving reproducibility. Working with datasets usually requires the often-cumbersome steps of downloading the data and writing load functions that parse related files (for example, audio and annotations) into a standard format for experimentation or evaluation. mirdata does all of this for you:

import mirdata

print(mirdata.list_datasets())

tinysol = mirdata.initialize('tinysol')
tinysol.download()

# get annotations and audio for a random track
example_track = tinysol.choice_track()
instrument = example_track.instrument_full
pitch = example_track.pitch
y, sr = example_track.audio

mirdata loaders contain methods to:

  • download(): download (or give instructions to download) a dataset

  • load_*(): load a dataset’s files (audio, metadata, annotations, etc.) into standard formats compatible with mir_eval and jams, so you don’t have to write these loaders yourself

  • validate(): validate that a dataset is complete and correct

  • cite(): quickly print a dataset’s relevant citation

  • access track and multitrack objects for grouping multiple annotations for a particular track/multitrack

  • and more

See the Tutorial for a detailed explanation of how to get started using this library.

mirdata design principles

Ease of use and contribution

We designed mirdata to be easy to use and easy to contribute to. mirdata simplifies the research pipeline considerably, facilitating research on a wider diversity of tasks and musical datasets. We provide detailed examples of how to interact with the library in the Tutorial, as well as a detailed explanation of how to contribute in Contributing. Additionally, we have a repository of Jupyter notebooks with usage examples for the different datasets.

Reproducibility

We aim for mirdata to increase research reproducibility by providing a common framework for MIR researchers to compare and validate their data. If mistakes are found in annotations, or audio versions change, the community can fix them in mirdata while still being able to compare methods moving forward.

canonical versions

The dataset loaders in mirdata are written for what we call the canonical version of a dataset. Whenever possible, this is the official release of the dataset as published by the dataset creator/s. When this is not possible (e.g. for data that is no longer available), the procedure we follow is to find as many copies of the data as possible from different researchers (at least 4) and use the most common one. To make this process transparent, when there are doubts about data consistency we open an issue and leave it to the community to discuss what to use.

Standardization

Different datasets have different annotations, metadata, etc. We try to respect the idiosyncrasies of each dataset as much as we can. For this reason, tracks in each Dataset in mirdata have different attributes, e.g. some may have artist information and some may not. However, there are some elements that are common to most datasets, and in these cases we standardize them to increase the usability of the library. Some examples of this are the annotations in mirdata, e.g. BeatData.

indexes

Indexes in mirdata are manifests of the files in a dataset and their corresponding md5 checksums. Specifically, an index is a json file with the mandatory top-level key version and at least one of the optional top-level keys metadata, tracks, multitracks or records. An index might look like:
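A sketch of a tracks-style index, with illustrative file names and md5 checksums:

```json
{
    "version": "1.0",
    "tracks": {
        "track1": {
            "audio": ["audio/track1.wav", "912ec803b2ce49e4a541068d495ab570"],
            "annotation": ["annotations/track1.csv", "2cf33591c3b28b382668952e236cccd5"]
        }
    }
}
```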

The optional top-level keys (tracks, multitracks and records) relate to different organizations of music datasets. tracks are used when a dataset is organized as a collection of individual tracks, i.e. mono or multi-channel audio (or only spectrograms) and their respective annotations. multitracks are used when a dataset comprises multitracks: different groups of tracks which are directly related to each other. Finally, records are used when a dataset consists of groups of tables (e.g. relational databases), as many recommendation datasets do.

See the contributing docs (section “1. Create an index”) for more information about mirdata indexes.

annotations

mirdata provides Annotation classes of various kinds which give a standard interface to different annotation formats. These classes are compatible with the mir_eval library’s expected format, as well as with the jams format, and can be extended to other formats if requested.

metadata

When available, we provide extensive and easy-to-access metadata to facilitate metadata-specific analysis. Metadata is available as attributes at the track level, e.g. track.artist.

Supported Datasets and Annotations

⭐ Dataset Quick Reference ⭐

This table is provided as a guide for users to select appropriate datasets. The list of annotations omits some metadata for brevity, and we document the dataset’s primary annotations only. The number of tracks indicates the number of unique “tracks” in a dataset, but it may not reflect the actual size or diversity of a dataset, as tracks can vary greatly in length (from a few seconds to a few minutes), and may be homogeneous.

“Downloadable” possible values:

  • ✅ : Freely downloadable

  • 🔑 : Available upon request

  • 📺 : YouTube links only

  • ❌ : Not available

Find the API documentation for each of the below datasets in Initializing.

| Dataset | Downloadable? | Annotation Types | Tracks | License |
| --- | --- | --- | --- | --- |
| AcousticBrainz Genre | audio: ❌ annotations: ✅ features: ✅ | — | >4M | — |
| Beatles | audio: ❌ annotations: ✅ | — | 180 | — |
| Beatport EDM key | audio: ✅ annotations: ✅ | — | 1486 | CC BY-SA 4.0 |
| cante100 | audio: 🔑 annotations: ✅ | F0 | 100 | Custom |
| DALI | audio: 📺 annotations: ✅ | — | 5358 | CC BY-SA 4.0 |
| Giantsteps key | audio: ✅ annotations: ✅ | global Key | 500 | CC BY-SA 4.0 |
| Giantsteps tempo | audio: ❌ annotations: ✅ | — | 664 | CC BY-SA 4.0 |
| Groove MIDI | audio: ✅ midi: ✅ | — | 1150 | CC BY-SA 4.0 |
| Gtzan-Genre | audio: ✅ annotations: ✅ | global Genre | 1000 | — |
| Guitarset | audio: ✅ midi: ✅ | — | 360 | MIT |
| Ikala | audio: ❌ annotations: ❌ | — | 252 | Custom |
| IRMAS | audio: ✅ annotations: ✅ | — | 9579 | CC BY-NC-SA 3.0 |
| MAESTRO | audio: ✅ annotations: ✅ | Piano Notes | 1282 | CC BY-NC-SA 4.0 |
| Medley-solos-DB | audio: ✅ annotations: ✅ | Instruments | 21571 | CC BY-SA 4.0 |
| MedleyDB melody | audio: 🔑 annotations: ✅ | Melody F0 | 108 | CC BY-NC-SA 4.0 |
| MedleyDB pitch | audio: 🔑 annotations: ✅ | — | 103 | CC BY-NC-SA 4.0 |
| Mridangam Stroke | audio: ✅ annotations: ✅ | — | 6977 | CC BY 3.0 |
| Orchset | audio: ✅ annotations: ✅ | Melody F0 | 64 | CC BY-NC-SA 4.0 |
| RWC classical | audio: ❌ annotations: ✅ | — | 61 | Custom |
| RWC jazz | audio: ❌ annotations: ✅ | — | 50 | Custom |
| RWC popular | audio: ❌ annotations: ✅ | — | 100 | Custom |
| Salami | audio: ❌ annotations: ✅ | Sections | 1359 | CC0 1.0 |
| Saraga Carnatic | audio: ✅ annotations: ✅ | — | 249 | CC BY-NC-SA 4.0 |
| Saraga Hindustani | audio: ✅ annotations: ✅ | — | 108 | CC BY-NC-SA 4.0 |
| Tinysol | audio: ✅ annotations: ✅ | — | 2913 | CC BY 4.0 |
| Tonality ClassicalDB | audio: ❌ annotations: ✅ | Global Key | 881 | CC BY-NC-SA 4.0 |

Annotation Types

The table above provides annotation types as a guide for choosing appropriate datasets, but it is difficult to generically categorize annotation types, as they depend on varying definitions and their meaning can change depending on the type of music they correspond to. Here we provide a rough guide to the types in this table, but we strongly recommend reading the dataset specific documentation to ensure the data is as you expect. To see how these annotation types are implemented in mirdata see Annotations.

Beats

Musical beats, typically encoded as a sequence of timestamps and corresponding beat positions. This implicitly includes downbeat information (the beginning of a musical measure).
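As a sketch of this encoding (the helper function and the 1-indexed position convention are illustrative, not mirdata API):

```python
def downbeat_times(times, positions):
    """Given beat timestamps and their positions within the measure
    (1 = downbeat, 2 = second beat, ...), return the downbeat times."""
    return [t for t, p in zip(times, positions) if p == 1]

# In 4/4, positions cycle 1, 2, 3, 4:
downbeat_times([0.5, 1.0, 1.5, 2.0, 2.5], [1, 2, 3, 4, 1])  # → [0.5, 2.5]
```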

Chords

Musical chords, e.g. as might be played on a guitar. Typically encoded as a sequence of labeled events, where each event has a start time, an end time, and a label. The label taxonomy varies per dataset, but labels typically encode a chord’s root and its quality, e.g. A:m7 for “A minor 7”.
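A minimal sketch of splitting such a label into root and quality (the helper is hypothetical; the root:quality syntax and the conventions that a bare root means a major triad and that N means “no chord” follow common chord-annotation practice):

```python
def parse_chord_label(label):
    """Split a chord label like 'A:m7' into (root, quality).
    'N' denotes 'no chord'; a bare root like 'A' implies a major triad."""
    if label == 'N':
        return None
    root, sep, quality = label.partition(':')
    return root, quality if sep else 'maj'

parse_chord_label('A:m7')  # → ('A', 'm7')
```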

Drums

Transcription of the drums, typically encoded as a sequence of labeled events, where the labels indicate which drum instrument (e.g. cymbal, snare drum) is played. These events often overlap with one another, as multiple drums can be played at the same time.

F0

Musical pitch contours, typically encoded as a time series indicating the musical pitch over time. The time series typically has evenly spaced timestamps, each with a corresponding pitch value which may be encoded in a number of formats/granularities, including midi note numbers and Hertz.

Genre

A typically global “tag”, indicating the genre of a recording. Note that the concept of genre is highly subjective and we refer those new to this task to this article.

Instruments

Labels indicating which instrument is present in a musical recording. This may refer to recordings of solo instruments, or to recordings with multiple instruments. The labels may be global to a recording, or they may vary over time, indicating the presence/absence of a particular instrument as a time series.

Key

Musical key. This can be defined globally for an audio file or as a sequence of events.

Lyrics

Lyrics corresponding to the singing voice of the audio. These may be raw text with no time information, or they may be time-aligned events. They may have varying levels of granularity (paragraph, line, word, phoneme, character) depending on the dataset.

Melody

The musical melody of a song. Melody has no universal definition and is typically defined per dataset. It is typically encoded as F0 or as Notes. Other annotation types, such as Vocal F0 or Vocal Notes, can often be considered melody annotations as well.

Notes

Musical note events, typically encoded as sequences of start time, end time, label. The label typically indicates a musical pitch, which may be in a number of formats/granularities, including midi note numbers, Hertz, or pitch class.
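These pitch encodings can be converted between one another; for example, the standard equal-temperament relation between MIDI note numbers and Hertz (A4 = MIDI 69 = 440 Hz; the helper names are illustrative):

```python
import math

def midi_to_hz(midi_note):
    """Frequency in Hz of a MIDI note number (A4 = 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((midi_note - 69) / 12.0)

def hz_to_midi(freq):
    """(Possibly fractional) MIDI note number of a frequency in Hz."""
    return 69.0 + 12.0 * math.log2(freq / 440.0)

midi_to_hz(69)   # → 440.0
hz_to_midi(880)  # → 81.0 (A5)
```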

Phrases

Musical phrase events, typically encoded as a sequence of timestamps indicating the boundary times, with labels given by solfège symbols. These annotations are not intended to describe the complete melody, but rather the musical phrases present in the track.

Sections

Musical sections, which may be “flat” or “hierarchical”, typically encoded by a sequence of timestamps indicating musical section boundary times. Section annotations sometimes also include labels for sections, which may indicate repetitions and/or the section type (e.g. Chorus, Verse).

Technique

The playing technique used by a particular instrument, for example “Pizzicato”. This label may be global for a given recording or encoded as a sequence of labeled events.

Tempo

The tempo of a song, typically in units of beats per minute (bpm). This is often indicated globally per track, but in practice tracks may have tempos that change, and some datasets encode tempo as a time-varying quantity. Additionally, there may be multiple reasonable tempos at any given time (for example, 2x or 0.5x a tempo value will often also be “correct”). For this reason, some datasets provide two or more different tempo values.
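A sketch of scoring with this ambiguity in mind (the helper is illustrative, not mirdata API; the 4% relative tolerance mirrors common tempo-evaluation practice):

```python
def tempo_matches(ref_bpm, est_bpm, rel_tol=0.04, factors=(0.5, 1.0, 2.0)):
    """True if the estimate is within rel_tol of the reference tempo,
    or of its half/double (common 'octave error' allowances)."""
    return any(
        abs(est_bpm - ref_bpm * f) <= rel_tol * ref_bpm * f
        for f in factors
    )

tempo_matches(120.0, 121.0)  # → True
tempo_matches(120.0, 60.5)   # → True (half-tempo)
tempo_matches(120.0, 90.0)   # → False
```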

Vocal Activity

A time series or sequence of events indicating when singing voice is present in a recording. This type of annotation is implicitly available when Vocal F0 or Vocal Notes annotations are available.

Stroke Name

An open “tag” to identify an instrument stroke name or type. Used for instruments that have specific stroke labels.

Tonic

The absolute tonic of a track. It may refer to the tonic of a single stroke, or to the tonal center of a track.

Tutorial

Installation

To install mirdata:

pip install mirdata

Usage

mirdata can be imported into your Python code as follows:

import mirdata

Initializing a dataset

Print a list of all available dataset loaders by calling:

import mirdata
print(mirdata.list_datasets())

To use a loader (for example, ‘orchset’), you need to initialize it by calling:

import mirdata
orchset = mirdata.initialize('orchset')

Now orchset is a Dataset object containing common methods, described below.

Downloading a dataset

All dataset loaders in mirdata have a download() function that allows the user to download the canonical version of the dataset (when available). When initializing a dataset, it is important to correctly set the directory where the dataset will be stored and retrieved.

Downloading a dataset into the default folder:

In this first example, data_home is not specified. Thus, ORCHSET will be downloaded to and retrieved from the mir_datasets folder created in the user’s home folder:

import mirdata
orchset = mirdata.initialize('orchset')
orchset.download()  # Dataset is downloaded to the user's home folder

Downloading a dataset into a specified folder:

Now data_home is specified, so ORCHSET will be downloaded to and retrieved from the specified folder:

orchset = mirdata.initialize('orchset', data_home='/Users/johnsmith/Desktop')
orchset.download()  # Dataset is downloaded to John Smith's desktop

Partially downloading a dataset

The download() function allows partial downloads of a dataset. In other words, where applicable, the user can select which elements of the dataset to download. Each dataset has a REMOTES dictionary where all the available elements are listed.

cante100 has different elements, as seen in its REMOTES dictionary. We can specify which of these elements are downloaded by passing the download() function the list of REMOTES keys we are interested in, via the partial_download argument.

A partial download example for the cante100 dataset could be:

cante100.download(partial_download=['spectrogram', 'melody', 'metadata'])

Validating a dataset

Using the validate() method we can check that the files in the local version match the canonical version, and that they were downloaded correctly (none of them are corrupted).

For big datasets: in future mirdata versions, random validation will be included to reduce validation time for very large datasets.

Accessing annotations

We can choose a random track from a dataset with the choice_track() method.

We can also access specific tracks by id. The available track ids can be accessed via the .track_ids attribute. In the next example we take the first track id, and then retrieve the melody annotation.

orchset_ids = orchset.track_ids  # the list of orchset's track ids
orchset_data = orchset.load_tracks()  # Load all tracks in the dataset
example_track = orchset_data[orchset_ids[0]]  # Get the first track

# Accessing the track's melody annotation
example_melody = example_track.melody

Alternatively, we don’t need to load the whole dataset to get a single track.

orchset_ids = orchset.track_ids  # the list of orchset's track ids
example_track = orchset.track(orchset_ids[0])  # load this particular track
example_melody = example_track.melody  # Get the melody from the first track

Accessing data remotely

Annotations can also be accessed through the load_*() methods, which may be useful, for instance, when your data isn’t available locally. If you have the path to an annotation file, you can use the module’s loading functions on it directly.

Annotation classes

mirdata defines annotation-specific data classes. These data classes are meant to standardize the format for all loaders, and are compatible with JAMS and mir_eval.

The list and descriptions of available annotation classes can be found in Annotations.

Note

These classes may be extended in the case that a loader requires it.

Iterating over datasets and annotations

In general, most datasets are a collection of tracks, and in most cases each track has an audio file along with annotations.

With the load_tracks() method, all tracks are loaded as a dictionary with the ids as keys and track objects (which include their respective audio and annotations, which are lazy-loaded on access) as values.

orchset = mirdata.initialize('orchset')
for key, track in orchset.load_tracks().items():
    print(key, track.audio_path)

Alternatively, we can loop over the track_ids list to directly access each track in the dataset.

orchset = mirdata.initialize('orchset')
for track_id in orchset.track_ids:
    print(track_id, orchset.track(track_id).audio_path)

Basic example: including mirdata in your pipeline

If we wanted to use orchset to evaluate the performance of a melody extraction algorithm (in our case, very_bad_melody_extractor), and then split the scores based on the metadata, we could do the following:
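A sketch of such a pipeline. Here very_bad_melody_extractor simply guesses a random frequency per frame, and raw_pitch_accuracy is a simplified stand-in for mir_eval’s melody metrics; the commented mirdata portion assumes orchset has been downloaded, and that the F0 annotation exposes times and frequencies attributes:

```python
import math
import random

def very_bad_melody_extractor(times):
    """'Estimate' a melody by guessing a random frequency (200-2000 Hz)
    for every time frame."""
    return [random.uniform(200.0, 2000.0) for _ in times]

def raw_pitch_accuracy(ref_freqs, est_freqs, tolerance_cents=50.0):
    """Fraction of voiced reference frames (freq > 0) whose estimate
    lies within tolerance_cents of the reference pitch."""
    voiced = [(r, e) for r, e in zip(ref_freqs, est_freqs) if r > 0]
    if not voiced:
        return 0.0
    correct = sum(
        1 for r, e in voiced
        if abs(1200.0 * math.log2(e / r)) <= tolerance_cents
    )
    return correct / len(voiced)

# Evaluating on every orchset track, keeping per-track scores that can
# then be split by metadata:
# import mirdata
# orchset = mirdata.initialize('orchset')
# orchset.download()
# scores = {}
# for track_id, track in orchset.load_tracks().items():
#     melody = track.melody
#     est_freqs = very_bad_melody_extractor(melody.times)
#     scores[track_id] = raw_pitch_accuracy(melody.frequencies, est_freqs)
```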

Unsurprisingly, very_bad_melody_extractor performs very badly!

Using mirdata with tensorflow

The following is a simple example of a generator that can be used to create a TensorFlow Dataset.
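A sketch: the generator works on the tracks dict returned by load_tracks(), and the TensorFlow wrapping in the comments assumes TF 2.x with illustrative dtypes and a dataset-specific label attribute:

```python
def audio_label_generator(tracks, label_attr):
    """Yield (audio, sample_rate, label) tuples from a dict of mirdata
    track objects, e.g. the one returned by dataset.load_tracks().
    label_attr names the (dataset-specific) track attribute to use as label."""
    for track_id in sorted(tracks):
        track = tracks[track_id]
        y, sr = track.audio  # audio is lazy-loaded on first access
        yield y, sr, getattr(track, label_attr)

# Wrapping it into a tf.data.Dataset (dtypes are illustrative):
# import mirdata
# import tensorflow as tf
# tinysol = mirdata.initialize('tinysol')
# tf_dataset = tf.data.Dataset.from_generator(
#     lambda: audio_label_generator(tinysol.load_tracks(), 'instrument_full'),
#     output_types=(tf.float32, tf.int32, tf.string),
# )
```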

In future mirdata versions, generators for TensorFlow and PyTorch will be included.

Initializing

mirdata.initialize(dataset_name, data_home=None)[source]

Load a mirdata dataset by name

Example

orchset = mirdata.initialize('orchset')  # get the orchset dataset
orchset.download()  # download orchset
orchset.validate()  # validate orchset
track = orchset.choice_track()  # load a random track
print(track)  # see what data a track contains
orchset.track_ids  # the list of all track ids

Parameters
  • dataset_name (str) – the dataset’s name; see mirdata.DATASETS for a complete list of possibilities

  • data_home (str or None) – path where the data lives. If None uses the default location.

Returns

Dataset – a mirdata.core.Dataset object

mirdata.list_datasets()[source]

Get a list of all mirdata dataset names

Returns

list – list of dataset names as strings

Dataset Loaders

acousticbrainz_genre

AcousticBrainz Genre dataset

class mirdata.datasets.acousticbrainz_genre.Dataset(data_home=None)[source]

The acousticbrainz genre dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally print a message.

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

filter_index(search_key)[source]

Load the indexes from the AcousticBrainz genre dataset that match search_key.

Parameters

search_key (str) – regex to match with folds, mbid or genres

Returns

dict – {track_id: track data}

license()[source]

Print the license

load_all_train()[source]

Load from AcousticBrainz genre dataset the tracks that are used for training across the four different datasets.

Returns

dict – {track_id: track data}

load_all_validation()[source]

Load from AcousticBrainz genre dataset the tracks that are used for validating across the four different datasets.

Returns

dict – {track_id: track data}

load_allmusic_train()[source]

Load from the AcousticBrainz genre dataset the tracks that are used for training in the allmusic dataset.

Returns

dict – {track_id: track data}

load_allmusic_validation()[source]

Load from AcousticBrainz genre dataset the tracks that are used for validation in allmusic dataset.

Returns

dict – {track_id: track data}

load_discogs_train()[source]

Load from AcousticBrainz genre dataset the tracks that are used for training in discogs dataset.

Returns

dict – {track_id: track data}

load_discogs_validation()[source]

Load from the AcousticBrainz genre dataset the tracks that are used for validation in the discogs dataset.

Returns

dict – {track_id: track data}

load_extractor(*args, **kwargs)[source]

Load an AcousticBrainz Dataset json file with all the features and metadata.

Parameters

fhandle (str or file-like) – path or file-like object pointing to a json file

Returns

dict – all the features and metadata

load_lastfm_train()[source]

Load from AcousticBrainz genre dataset the tracks that are used for training in lastfm dataset.

Returns

dict – {track_id: track data}

load_lastfm_validation()[source]

Load from AcousticBrainz genre dataset the tracks that are used for validation in lastfm dataset.

Returns

dict – {track_id: track data}

load_tagtraum_train()[source]

Load from AcousticBrainz genre dataset the tracks that are used for training in tagtraum dataset.

Returns

dict – {track_id: track data}

load_tagtraum_validation()[source]

Load from the AcousticBrainz genre dataset the tracks that are used for validation in the tagtraum dataset.

Returns

dict – {track_id: track data}

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files that are in the index but missing locally

  • list - files which have an invalid checksum

class mirdata.datasets.acousticbrainz_genre.Track(track_id, data_home, dataset_name, index, metadata)[source]

AcousticBrainz Genre Dataset track class

Parameters
  • track_id (str) – track id of the track

  • data_home (str) – Local path where the dataset is stored. If None, looks for the data in the default directory, ~/mir_datasets

Variables
  • track_id (str) – track id

  • genre (list) – human-labeled genre and subgenres list

  • mbid (str) – musicbrainz id

  • mbid_group (str) – musicbrainz id group

  • artist (list) – the track’s artist/s

  • title (list) – the track’s title

  • date (list) – the track’s release date/s

  • filename (str) – the track’s filename

  • album (list) – the track’s album/s

  • track_number (list) – the track number/s

  • tonal (dict) – dictionary of acousticbrainz tonal features

  • low_level (dict) – dictionary of acousticbrainz low-level features

  • rhythm (dict) – dictionary of acousticbrainz rhythm features

Other Parameters

acousticbrainz_metadata (dict) – dictionary of metadata provided by AcousticBrainz

property album

metadata album annotation

Returns

list – album

property artist

metadata artist annotation

Returns

list – artist

property date

metadata date annotation

Returns

list – date

property file_name

metadata file_name annotation

Returns

str – file name

property low_level

low_level track descriptors.

Returns

dict

  • ‘average_loudness’: dynamic range descriptor. It rescales average loudness, computed on 2sec windows with 1 sec overlap, into the [0,1] interval. The value of 0 corresponds to signals with large dynamic range, 1 corresponds to signal with little dynamic range. Algorithms: Loudness

  • ’dynamic_complexity’: dynamic complexity computed on 2sec windows with 1sec overlap. Algorithms: DynamicComplexity

  • ’silence_rate_20dB’, ‘silence_rate_30dB’, ‘silence_rate_60dB’: rate of silent frames in a signal for thresholds of 20, 30, and 60 dBs. Algorithms: SilenceRate

  • ’spectral_rms’: spectral RMS. Algorithms: RMS

  • ’spectral_flux’: spectral flux of a signal computed using L2-norm. Algorithms: Flux

  • ’spectral_centroid’, ‘spectral_kurtosis’, ‘spectral_spread’, ‘spectral_skewness’: centroid and central moments statistics describing the spectral shape. Algorithms: Centroid, CentralMoments

  • ’spectral_rolloff’: the roll-off frequency of a spectrum. Algorithms: RollOff

  • ’spectral_decrease’: spectral decrease. Algorithms: Decrease

  • ’hfc’: high frequency content descriptor as proposed by Masri. Algorithms: HFC

  • ’zerocrossingrate’ zero-crossing rate. Algorithms: ZeroCrossingRate

  • ’spectral_energy’: spectral energy. Algorithms: Energy

  • ’spectral_energyband_low’, ‘spectral_energyband_middle_low’, ‘spectral_energyband_middle_high’, ‘spectral_energyband_high’: spectral energy in frequency bands [20Hz, 150Hz], [150Hz, 800Hz], [800Hz, 4kHz], and [4kHz, 20kHz]. Algorithms: EnergyBand

  • ’barkbands’: spectral energy in 27 Bark bands. Algorithms: BarkBands

  • ’melbands’: spectral energy in 40 mel bands. Algorithms: MFCC

  • ’erbbands’: spectral energy in 40 ERB bands. Algorithms: ERBBands

  • ’mfcc’: the first 13 mel frequency cepstrum coefficients. See algorithm: MFCC

  • ’gfcc’: the first 13 gammatone feature cepstrum coefficients. Algorithms: GFCC

  • ’barkbands_crest’, ‘barkbands_flatness_db’: crest and flatness computed over energies in Bark bands. Algorithms: Crest, FlatnessDB

  • ’barkbands_kurtosis’, ‘barkbands_skewness’, ‘barkbands_spread’: central moments statistics over energies in Bark bands. Algorithms: CentralMoments

  • ’melbands_crest’, ‘melbands_flatness_db’: crest and flatness computed over energies in mel bands. Algorithms: Crest, FlatnessDB

  • ’melbands_kurtosis’, ‘melbands_skewness’, ‘melbands_spread’: central moments statistics over energies in mel bands. Algorithms: CentralMoments

  • ’erbbands_crest’, ‘erbbands_flatness_db’: crest and flatness computed over energies in ERB bands. Algorithms: Crest, FlatnessDB

  • ’erbbands_kurtosis’, ‘erbbands_skewness’, ‘erbbands_spread’: central moments statistics over energies in ERB bands. Algorithms: CentralMoments

  • ’dissonance’: sensory dissonance of a spectrum. Algorithms: Dissonance

  • ’spectral_entropy’: Shannon entropy of a spectrum. Algorithms: Entropy

  • ’pitch_salience’: pitch salience of a spectrum. Algorithms: PitchSalience

  • ’spectral_complexity’: spectral complexity. Algorithms: SpectralComplexity

  • ’spectral_contrast_coeffs’, ‘spectral_contrast_valleys’: spectral contrast features. Algorithms: SpectralContrast

property rhythm

rhythm essentia extractor descriptors

Returns

dict

  • ‘beats_position’: time positions [sec] of detected beats using beat tracking algorithm by Degara et al., 2012. Algorithms: RhythmExtractor2013, BeatTrackerDegara

  • ’beats_count’: number of detected beats

  • ’bpm’: BPM value according to detected beats

  • ’bpm_histogram_first_peak_bpm’, ‘bpm_histogram_first_peak_spread’, ‘bpm_histogram_first_peak_weight’, ‘bpm_histogram_second_peak_bpm’, ‘bpm_histogram_second_peak_spread’, ‘bpm_histogram_second_peak_weight’: descriptors characterizing the highest and second-highest peaks of the BPM histogram. Algorithms: BpmHistogramDescriptors

  • ’beats_loudness’, ‘beats_loudness_band_ratio’: spectral energy computed on beats segments of audio across the whole spectrum, and ratios of energy in 6 frequency bands. Algorithms: BeatsLoudness, SingleBeatLoudness

  • ’onset_rate’: number of detected onsets per second. Algorithms: OnsetRate

  • ’danceability’: danceability estimate. Algorithms: Danceability

property title

metadata title annotation

Returns

list – title

to_jams()[source]

the track’s data in jams format

Returns

jams.JAMS – return track data in jam format

property tonal

tonal features

Returns

dict

  • ‘tuning_frequency’: estimated tuning frequency [Hz]. Algorithms: TuningFrequency

  • ’tuning_nontempered_energy_ratio’ and ‘tuning_equal_tempered_deviation’

  • ’hpcp’, ‘thpcp’: 32-dimensional harmonic pitch class profile (HPCP) and its transposed version. Algorithms: HPCP

  • ’hpcp_entropy’: Shannon entropy of a HPCP vector. Algorithms: Entropy

  • ’key_key’, ‘key_scale’: Global key feature. Algorithms: Key

  • ’chords_key’, ‘chords_scale’: Global key extracted from chords detection.

  • ’chords_strength’, ‘chords_histogram’: strength of estimated chords and normalized histogram of their progression. Algorithms: ChordsDetection, ChordsDescriptors

  • ’chords_changes_rate’, ‘chords_number_rate’: chords change rate in the progression; ratio of different chords from the total number of chords in the progression; Algorithms: ChordsDetection, ChordsDescriptors

property tracknumber

metadata tracknumber annotation

Returns

list – tracknumber

mirdata.datasets.acousticbrainz_genre.load_extractor(fhandle)[source]

Load an AcousticBrainz Dataset json file with all the features and metadata.

Parameters

fhandle (str or file-like) – path or file-like object pointing to a json file

Returns

dict – all the features and metadata

beatles

Beatles Dataset Loader

class mirdata.datasets.beatles.Dataset(data_home=None)[source]

The beatles dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally print a message.

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected
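The IOError above is raised when a downloaded file’s checksum differs from the one recorded in the dataset index. A simplified sketch of that check — check_md5 and the choice of MD5 are assumptions for illustration; mirdata’s real logic lives in its download utilities:

```python
import hashlib

def check_md5(data: bytes, expected_md5: str) -> None:
    """Raise IOError if the MD5 of the downloaded bytes differs from expected."""
    actual = hashlib.md5(data).hexdigest()
    if actual != expected_md5:
        raise IOError(f"checksum mismatch: expected {expected_md5}, got {actual}")

payload = b"some downloaded archive bytes"
check_md5(payload, hashlib.md5(payload).hexdigest())  # passes silently
try:
    check_md5(payload, "0" * 32)
except IOError as err:
    print("invalid download:", err)  # prints the mismatch message
```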

license()[source]

Print the license

load_audio(*args, **kwargs)[source]

Load a Beatles audio file.

Parameters

fhandle (str or file-like) – path or file-like object pointing to an audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

load_beats(*args, **kwargs)[source]

Load Beatles format beat data from a file

Parameters

fhandle (str or file-like) – path or file-like object pointing to a beat annotation file

Returns

BeatData – loaded beat data

load_chords(*args, **kwargs)[source]

Load Beatles format chord data from a file

Parameters

fhandle (str or file-like) – path or file-like object pointing to a chord annotation file

Returns

ChordData – loaded chord data

load_sections(*args, **kwargs)[source]

Load Beatles format section data from a file

Parameters

fhandle (str or file-like) – path or file-like object pointing to a section annotation file

Returns

SectionData – loaded section data

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index but missing locally

  • list - files which have an invalid checksum
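Conceptually, validate() walks an index of expected files and checksums and builds the two returned lists. A stdlib-only sketch under that assumption — validate_index and the flat index layout are illustrative, not mirdata’s actual internals:

```python
import hashlib
import os
import tempfile

def validate_index(index, data_home):
    """Return (missing, invalid): files absent locally, files with a bad checksum."""
    missing, invalid = [], []
    for rel_path, expected_md5 in index.items():
        local = os.path.join(data_home, rel_path)
        if not os.path.exists(local):
            missing.append(rel_path)
            continue
        with open(local, "rb") as f:
            if hashlib.md5(f.read()).hexdigest() != expected_md5:
                invalid.append(rel_path)
    return missing, invalid

with tempfile.TemporaryDirectory() as home:
    with open(os.path.join(home, "good.txt"), "wb") as f:
        f.write(b"ok")
    index = {
        "good.txt": hashlib.md5(b"ok").hexdigest(),  # present, valid
        "gone.txt": "0" * 32,                        # never downloaded
    }
    print(validate_index(index, home))  # (['gone.txt'], [])
```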

class mirdata.datasets.beatles.Track(track_id, data_home, dataset_name, index, metadata)[source]

Beatles track class

Parameters
  • track_id (str) – track id of the track

  • data_home (str) – path where the data lives

Variables
  • audio_path (str) – track audio path

  • beats_path (str) – beat annotation path

  • chords_path (str) – chord annotation path

  • keys_path (str) – key annotation path

  • sections_path (str) – sections annotation path

  • title (str) – title of the track

  • track_id (str) – track id

Other Parameters
  • beats (BeatData) – human-labeled beat annotations

  • chords (ChordData) – human-labeled chord annotations

  • key (KeyData) – local key annotations

  • sections (SectionData) – section annotations

property audio

The track’s audio

Returns

  • np.ndarray - audio signal

  • float - sample rate

to_jams()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format

mirdata.datasets.beatles.load_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]

Load a Beatles audio file.

Parameters

fhandle (str or file-like) – path or file-like object pointing to an audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

mirdata.datasets.beatles.load_beats(fhandle: TextIO) → mirdata.annotations.BeatData[source]

Load Beatles format beat data from a file

Parameters

fhandle (str or file-like) – path or file-like object pointing to a beat annotation file

Returns

BeatData – loaded beat data

mirdata.datasets.beatles.load_chords(fhandle: TextIO) → mirdata.annotations.ChordData[source]

Load Beatles format chord data from a file

Parameters

fhandle (str or file-like) – path or file-like object pointing to a chord annotation file

Returns

ChordData – loaded chord data
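Chord annotations like these commonly use a plain-text “start end label” (.lab) layout. A minimal parser sketch — the real loader returns a ChordData object, and the three-column layout is an assumption about the file format for this example:

```python
import io

def parse_chord_lab(fhandle):
    """Parse 'start end label' lines into (intervals, labels) lists."""
    intervals, labels = [], []
    for line in fhandle:
        if not line.strip():
            continue  # skip blank lines
        start, end, label = line.split()
        intervals.append((float(start), float(end)))
        labels.append(label)
    return intervals, labels

lab = io.StringIO("0.0 2.61 N\n2.61 5.22 A:min\n")
intervals, labels = parse_chord_lab(lab)
print(labels)  # ['N', 'A:min']
```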

mirdata.datasets.beatles.load_key(fhandle: TextIO) → mirdata.annotations.KeyData[source]

Load Beatles format key data from a file

Parameters

fhandle (str or file-like) – path or file-like object pointing to a key annotation file

Returns

KeyData – loaded key data

mirdata.datasets.beatles.load_sections(fhandle: TextIO) → mirdata.annotations.SectionData[source]

Load Beatles format section data from a file

Parameters

fhandle (str or file-like) – path or file-like object pointing to a section annotation file

Returns

SectionData – loaded section data

beatport_key

beatport_key Dataset Loader

class mirdata.datasets.beatport_key.Dataset(data_home=None)[source]

The beatport_key dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download the dataset

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

license()[source]

Print the license

load_artist(*args, **kwargs)[source]

Load beatport_key artist data from a file

Parameters

metadata_path (str) – path to metadata annotation file

Returns

list – list of artists involved in the track.

load_audio(*args, **kwargs)[source]

Load a beatport_key audio file.

Parameters

audio_path (str) – path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

load_genre(*args, **kwargs)[source]

Load beatport_key genre data from a file

Parameters

metadata_path (str) – path to metadata annotation file

Returns

dict – a list of genres under the key ‘genres’ and a list of sub-genres under ‘sub_genres’

load_key(*args, **kwargs)[source]

Load beatport_key format key data from a file

Parameters

keys_path (str) – path to key annotation file

Returns

list – list of annotated keys

load_tempo(*args, **kwargs)[source]

Load beatport_key tempo data from a file

Parameters

metadata_path (str) – path to metadata annotation file

Returns

str – tempo in beats per minute

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index but missing locally

  • list - files which have an invalid checksum

class mirdata.datasets.beatport_key.Track(track_id, data_home, dataset_name, index, metadata)[source]

beatport_key track class

Parameters
  • track_id (str) – track id of the track

  • data_home (str) – Local path where the dataset is stored.

Variables
  • audio_path (str) – track audio path

  • keys_path (str) – key annotation path

  • metadata_path (str) – metadata annotation path

  • title (str) – title of the track

  • track_id (str) – track id

Other Parameters
  • key (list) – list of annotated musical keys

  • artists (list) – artists involved in the track

  • genre (dict) – genres and subgenres

  • tempo (int) – tempo in beats per minute

property audio

The track’s audio

Returns

  • np.ndarray - audio signal

  • float - sample rate

to_jams()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format

mirdata.datasets.beatport_key.load_artist(metadata_path)[source]

Load beatport_key artist data from a file

Parameters

metadata_path (str) – path to metadata annotation file

Returns

list – list of artists involved in the track.

mirdata.datasets.beatport_key.load_audio(audio_path)[source]

Load a beatport_key audio file.

Parameters

audio_path (str) – path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

mirdata.datasets.beatport_key.load_genre(metadata_path)[source]

Load beatport_key genre data from a file

Parameters

metadata_path (str) – path to metadata annotation file

Returns

dict – a list of genres under the key ‘genres’ and a list of sub-genres under ‘sub_genres’

mirdata.datasets.beatport_key.load_key(keys_path)[source]

Load beatport_key format key data from a file

Parameters

keys_path (str) – path to key annotation file

Returns

list – list of annotated keys

mirdata.datasets.beatport_key.load_tempo(metadata_path)[source]

Load beatport_key tempo data from a file

Parameters

metadata_path (str) – path to metadata annotation file

Returns

str – tempo in beats per minute
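The artist, genre, and tempo loaders above all read fields from the same JSON metadata file. A sketch of that pattern — the field names “artists”, “genres”, “sub_genres”, and “bpm” are illustrative assumptions for this example, not confirmed keys of the beatport_key metadata:

```python
import io
import json

def load_metadata_fields(fhandle):
    """Pull artists, genres/sub-genres, and tempo out of one metadata dict.

    Field names below are assumed for illustration only.
    """
    meta = json.load(fhandle)
    artists = [a["name"] for a in meta.get("artists", [])]
    genres = {
        "genres": meta.get("genres", []),
        "sub_genres": meta.get("sub_genres", []),
    }
    tempo = str(meta.get("bpm", ""))  # tempo is documented as a string
    return artists, genres, tempo

raw = io.StringIO(json.dumps({
    "artists": [{"name": "Some DJ"}],
    "genres": ["techno"],
    "sub_genres": ["detroit"],
    "bpm": 128,
}))
print(load_metadata_fields(raw))
# (['Some DJ'], {'genres': ['techno'], 'sub_genres': ['detroit']}, '128')
```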

cante100

cante100 Loader

class mirdata.datasets.cante100.Dataset(data_home=None)[source]

The cante100 dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally print a message.

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

license()[source]

Print the license

load_audio(*args, **kwargs)[source]

Load a cante100 audio file.

Parameters

fhandle (str) – path to an audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

load_melody(*args, **kwargs)[source]

Load cante100 f0 annotations

Parameters

fhandle (str or file-like) – path or file-like object pointing to melody annotation file

Returns

F0Data – predominant melody

load_notes(*args, **kwargs)[source]

Load note data from the annotation files

Parameters

fhandle (str or file-like) – path or file-like object pointing to a notes annotation file

Returns

NoteData – note annotations

load_spectrogram(*args, **kwargs)[source]

Load a cante100 dataset spectrogram file.

Parameters

fhandle (str or file-like) – path or file-like object pointing to an audio file

Returns

np.ndarray – spectrogram

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index but missing locally

  • list - files which have an invalid checksum

class mirdata.datasets.cante100.Track(track_id, data_home, dataset_name, index, metadata)[source]

cante100 track class

Parameters
  • track_id (str) – track id of the track

  • data_home (str) – Local path where the dataset is stored. If None, looks for the data in the default directory, ~/mir_datasets/cante100

Variables
  • track_id (str) – track id

  • identifier (str) – musicbrainz id of the track

  • artist (str) – performing artists

  • title (str) – title of the track song

  • release (str) – release where the track can be found

  • duration (str) – duration in seconds of the track

Other Parameters
  • melody (F0Data) – annotated melody

  • notes (NoteData) – annotated notes

property audio

The track’s audio

Returns

  • np.ndarray - audio signal

  • float - sample rate

property spectrogram

Spectrogram of the track’s audio

Returns

np.ndarray – spectrogram

to_jams()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format

mirdata.datasets.cante100.load_audio(fhandle: str) → Tuple[numpy.ndarray, float][source]

Load a cante100 audio file.

Parameters

fhandle (str) – path to an audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

mirdata.datasets.cante100.load_melody(fhandle: TextIO) → Optional[mirdata.annotations.F0Data][source]

Load cante100 f0 annotations

Parameters

fhandle (str or file-like) – path or file-like object pointing to melody annotation file

Returns

F0Data – predominant melody
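F0 annotations such as the cante100 melody are typically stored as time/frequency pairs, with a frequency of 0 marking unvoiced frames. A minimal sketch under that assumption — the comma delimiter and column order are not confirmed details of the cante100 files:

```python
import csv
import io

def parse_f0(fhandle):
    """Parse 'time,frequency' rows; a frequency <= 0 means the frame is unvoiced."""
    times, freqs, voicing = [], [], []
    for row in csv.reader(fhandle):
        t, f = float(row[0]), float(row[1])
        times.append(t)
        freqs.append(abs(f))
        voicing.append(1.0 if f > 0 else 0.0)
    return times, freqs, voicing

data = io.StringIO("0.00,0.0\n0.01,220.0\n0.02,221.5\n")
times, freqs, voicing = parse_f0(data)
print(voicing)  # [0.0, 1.0, 1.0]
```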

mirdata.datasets.cante100.load_notes(fhandle: TextIO) → mirdata.annotations.NoteData[source]

Load note data from the annotation files

Parameters

fhandle (str or file-like) – path or file-like object pointing to a notes annotation file

Returns

NoteData – note annotations

mirdata.datasets.cante100.load_spectrogram(fhandle: TextIO) → numpy.ndarray[source]

Load a cante100 dataset spectrogram file.

Parameters

fhandle (str or file-like) – path or file-like object pointing to an audio file

Returns

np.ndarray – spectrogram

dali

DALI Dataset Loader

class mirdata.datasets.dali.Dataset(data_home=None)[source]

The dali dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally print a message.

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

license()[source]

Print the license

load_annotations_class(*args, **kwargs)[source]

Load full annotations into the DALI class object

Parameters

annotations_path (str) – path to a DALI annotation file

Returns

DALI.annotations – DALI annotations object

load_annotations_granularity(*args, **kwargs)[source]

Load annotations at the specified level of granularity

Parameters
  • annotations_path (str) – path to a DALI annotation file

  • granularity (str) – one of ‘notes’, ‘words’, ‘lines’, ‘paragraphs’

Returns

NoteData for granularity=’notes’ or LyricData otherwise

load_audio(*args, **kwargs)[source]

Load a DALI audio file.

Parameters

fhandle (str or file-like) – path or file-like object pointing to an audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index but missing locally

  • list - files which have an invalid checksum

class mirdata.datasets.dali.Track(track_id, data_home, dataset_name, index, metadata)[source]

DALI melody Track class

Parameters

track_id (str) – track id of the track

Variables
  • album (str) – the track’s album

  • annotation_path (str) – path to the track’s annotation file

  • artist (str) – the track’s artist

  • audio_path (str) – path to the track’s audio file

  • audio_url (str) – youtube ID

  • dataset_version (int) – dataset annotation version

  • ground_truth (bool) – True if the annotation is verified

  • language (str) – sung language

  • release_date (str) – year the track was released

  • scores_manual (int) – manual score annotations

  • scores_ncc (float) – ncc score annotations

  • title (str) – the track’s title

  • track_id (str) – the unique track id

  • url_working (bool) – True if the youtube url was valid

Other Parameters
  • notes (NoteData) – vocal notes

  • words (LyricData) – word-level lyrics

  • lines (LyricData) – line-level lyrics

  • paragraphs (LyricData) – paragraph-level lyrics

  • annotation_object (DALI.Annotations) – DALI annotation object

property audio

The track’s audio

Returns

  • np.ndarray - audio signal

  • float - sample rate

to_jams()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format

mirdata.datasets.dali.load_annotations_class(annotations_path)[source]

Load full annotations into the DALI class object

Parameters

annotations_path (str) – path to a DALI annotation file

Returns

DALI.annotations – DALI annotations object

mirdata.datasets.dali.load_annotations_granularity(annotations_path, granularity)[source]

Load annotations at the specified level of granularity

Parameters
  • annotations_path (str) – path to a DALI annotation file

  • granularity (str) – one of ‘notes’, ‘words’, ‘lines’, ‘paragraphs’

Returns

NoteData for granularity=’notes’ or LyricData otherwise
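The ‘words’, ‘lines’, and ‘paragraphs’ granularities are nested groupings of the same lyric stream. A toy sketch of merging word-level annotations into line-level ones — the tuple layout here is illustrative; the real loader works with the DALI package’s annotation objects:

```python
def words_to_lines(words):
    """Merge (start, end, text, line_idx) word tuples into per-line annotations."""
    lines = {}
    for start, end, text, line_idx in words:
        if line_idx not in lines:
            lines[line_idx] = [start, end, [text]]
        else:
            lines[line_idx][1] = end          # extend the line to the last word
            lines[line_idx][2].append(text)   # accumulate the line's words
    return [(s, e, " ".join(ws)) for s, e, ws in (lines[k] for k in sorted(lines))]

words = [
    (0.0, 0.4, "hello", 0),
    (0.5, 0.9, "world", 0),
    (1.2, 1.6, "again", 1),
]
print(words_to_lines(words))  # [(0.0, 0.9, 'hello world'), (1.2, 1.6, 'again')]
```

The same merge applied to line indices grouped by paragraph yields the ‘paragraphs’ granularity.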

mirdata.datasets.dali.load_audio(fhandle: BinaryIO) → Optional[Tuple[numpy.ndarray, float]][source]

Load a DALI audio file.

Parameters

fhandle (str or file-like) – path or file-like object pointing to an audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

giantsteps_key

giantsteps_key Dataset Loader

class mirdata.datasets.giantsteps_key.Dataset(data_home=None)[source]

The giantsteps_key dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally print a message.

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

license()[source]

Print the license

load_artist(*args, **kwargs)[source]

Load giantsteps_key artist data from a file

Parameters

fhandle (str or file-like) – File-like object or path pointing to metadata annotation file

Returns

list – list of artists involved in the track.

load_audio(*args, **kwargs)[source]

Load a giantsteps_key audio file.

Parameters

fhandle (str or file-like) – path pointing to an audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

load_genre(*args, **kwargs)[source]

Load giantsteps_key genre data from a file

Parameters

fhandle (str or file-like) – File-like object or path pointing to metadata annotation file

Returns

dict – {‘genres’: […], ‘subgenres’: […]}

load_key(*args, **kwargs)[source]

Load giantsteps_key format key data from a file

Parameters

fhandle (str or file-like) – File like object or string pointing to key annotation file

Returns

str – loaded key data

load_tempo(*args, **kwargs)[source]

Load giantsteps_key tempo data from a file

Parameters

fhandle (str or file-like) – File-like object or string pointing to metadata annotation file

Returns

str – loaded tempo data

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index but missing locally

  • list - files which have an invalid checksum

class mirdata.datasets.giantsteps_key.Track(track_id, data_home, dataset_name, index, metadata)[source]

giantsteps_key track class

Parameters

track_id (str) – track id of the track

Variables
  • audio_path (str) – track audio path

  • keys_path (str) – key annotation path

  • metadata_path (str) – metadata annotation path

  • title (str) – title of the track

  • track_id (str) – track id

Other Parameters
  • key (str) – musical key annotation

  • artists (list) – list of artists involved

  • genres (dict) – genres and subgenres

  • tempo (int) – crowdsourced tempo annotations in beats per minute

property audio

The track’s audio

Returns

  • np.ndarray - audio signal

  • float - sample rate

to_jams()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format

mirdata.datasets.giantsteps_key.load_artist(fhandle: TextIO) → List[str][source]

Load giantsteps_key artist data from a file

Parameters

fhandle (str or file-like) – File-like object or path pointing to metadata annotation file

Returns

list – list of artists involved in the track.

mirdata.datasets.giantsteps_key.load_audio(fhandle: str) → Tuple[numpy.ndarray, float][source]

Load a giantsteps_key audio file.

Parameters

fhandle (str or file-like) – path pointing to an audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

mirdata.datasets.giantsteps_key.load_genre(fhandle: TextIO) → Dict[str, List[str]][source]

Load giantsteps_key genre data from a file

Parameters

fhandle (str or file-like) – File-like object or path pointing to metadata annotation file

Returns

dict – {‘genres’: […], ‘subgenres’: […]}

mirdata.datasets.giantsteps_key.load_key(fhandle: TextIO) → str[source]

Load giantsteps_key format key data from a file

Parameters

fhandle (str or file-like) – File like object or string pointing to key annotation file

Returns

str – loaded key data

mirdata.datasets.giantsteps_key.load_tempo(fhandle: TextIO) → str[source]

Load giantsteps_key tempo data from a file

Parameters

fhandle (str or file-like) – File-like object or string pointing to metadata annotation file

Returns

str – loaded tempo data

giantsteps_tempo

giantsteps_tempo Dataset Loader

class mirdata.datasets.giantsteps_tempo.Dataset(data_home=None)[source]

The giantsteps_tempo dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally print a message.

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

license()[source]

Print the license

load_audio(*args, **kwargs)[source]

Load a giantsteps_tempo audio file.

Parameters

fhandle (str or file-like) – path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

load_genre(*args, **kwargs)[source]

Load genre data from a file

Parameters

fhandle (str or file-like) – path or file-like object pointing to a metadata annotation file

Returns

str – loaded genre data

load_tempo(*args, **kwargs)[source]

Load giantsteps_tempo tempo data from a file, ordered by confidence

Parameters

fhandle (str or file-like) – File-like object or path to tempo annotation file

Returns

annotations.TempoData – Tempo data

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index but missing locally

  • list - files which have an invalid checksum

class mirdata.datasets.giantsteps_tempo.Track(track_id, data_home, dataset_name, index, metadata)[source]

giantsteps_tempo track class

Parameters

track_id (str) – track id of the track

Variables
  • audio_path (str) – track audio path

  • title (str) – title of the track

  • track_id (str) – track id

  • annotation_v1_path (str) – track annotation v1 path

  • annotation_v2_path (str) – track annotation v2 path

Other Parameters
  • genre (dict) – Human-labeled metadata annotation

  • tempo (list) – List of annotations.TempoData, ordered by confidence

  • tempo_v2 (list) – List of annotations.TempoData for version 2, ordered by confidence

property audio

The track’s audio

Returns

  • np.ndarray - audio signal

  • float - sample rate

to_jams()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format

to_jams_v2()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format

mirdata.datasets.giantsteps_tempo.load_audio(fhandle: str) → Tuple[numpy.ndarray, float][source]

Load a giantsteps_tempo audio file.

Parameters

fhandle (str or file-like) – path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

mirdata.datasets.giantsteps_tempo.load_genre(fhandle: TextIO) → str[source]

Load genre data from a file

Parameters

fhandle (str or file-like) – path or file-like object pointing to a metadata annotation file

Returns

str – loaded genre data

mirdata.datasets.giantsteps_tempo.load_tempo(fhandle: TextIO) → mirdata.annotations.TempoData[source]

Load giantsteps_tempo tempo data from a file, ordered by confidence

Parameters

fhandle (str or file-like) – File-like object or path to tempo annotation file

Returns

annotations.TempoData – Tempo data
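“Ordered by confidence” means the tempo candidates are sorted so the most confident estimate comes first. A minimal sketch of that ordering — the (tempo, confidence) tuple layout is an assumption for this example; the real return type is annotations.TempoData:

```python
def order_by_confidence(candidates):
    """Sort (tempo_bpm, confidence) pairs, most confident first."""
    return sorted(candidates, key=lambda c: c[1], reverse=True)

# Two candidate tempi, e.g. a metrical-level ambiguity (half vs. full tempo):
candidates = [(64.0, 0.2), (128.0, 0.8)]
print(order_by_confidence(candidates))  # [(128.0, 0.8), (64.0, 0.2)]
```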

groove_midi

Groove MIDI Loader

class mirdata.datasets.groove_midi.Dataset(data_home=None)[source]

The groove_midi dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally print a message.

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

license()[source]

Print the license

load_audio(*args, **kwargs)[source]

Load a Groove MIDI audio file.

Parameters

path – path to an audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

load_beats(*args, **kwargs)[source]

Load beat data from the midi file.

Parameters
  • midi_path (str) – path to midi file

  • midi (pretty_midi.PrettyMIDI) – pre-loaded midi object, or None. If None, the midi object is loaded using midi_path

Returns

annotations.BeatData – machine generated beat data

load_drum_events(*args, **kwargs)[source]

Load drum events from the midi file.

Parameters
  • midi_path (str) – path to midi file

  • midi (pretty_midi.PrettyMIDI) – pre-loaded midi object, or None. If None, the midi object is loaded using midi_path

Returns

annotations.EventData – drum event data

load_midi(*args, **kwargs)[source]

Load a Groove MIDI midi file.

Parameters

fhandle (str or file-like) – File-like object or path to midi file

Returns

midi_data (pretty_midi.PrettyMIDI) – pretty_midi object

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index but missing locally

  • list - files which have an invalid checksum

class mirdata.datasets.groove_midi.Track(track_id, data_home, dataset_name, index, metadata)[source]

Groove MIDI Track class

Parameters

track_id (str) – track id of the track

Variables
  • drummer (str) – Drummer id of the track (ex. ‘drummer1’)

  • session (str) – Type of session (ex. ‘session1’, ‘eval_session’)

  • track_id (str) – track id of the track (ex. ‘drummer1/eval_session/1’)

  • style (str) – Style (genre, groove type) of the track (ex. ‘funk/groove1’)

  • tempo (int) – track tempo in beats per minute (ex. 138)

  • beat_type (str) – Whether the track is a beat or a fill (ex. ‘beat’)

  • time_signature (str) – Time signature of the track (ex. ‘4-4’, ‘6-8’)

  • midi_path (str) – Path to the midi file

  • audio_path (str) – Path to the audio file

  • duration (float) – Duration of the midi file in seconds

  • split (str) – Whether the track is for a train/valid/test set. One of ‘train’, ‘valid’ or ‘test’.

Other Parameters
  • beats (BeatData) – Machine-generated beat annotations

  • drum_events (EventData) – Annotated drum kit events

  • midi (pretty_midi.PrettyMIDI) – object containing MIDI information

property audio

The track’s audio

Returns

  • np.ndarray - audio signal

  • float - sample rate

to_jams()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format

mirdata.datasets.groove_midi.load_audio(path: str) → Tuple[Optional[numpy.ndarray], Optional[float]][source]

Load a Groove MIDI audio file.

Parameters

path – path to an audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

mirdata.datasets.groove_midi.load_beats(midi_path, midi=None)[source]

Load beat data from the midi file.

Parameters
  • midi_path (str) – path to midi file

  • midi (pretty_midi.PrettyMIDI) – pre-loaded midi object, or None. If None, the midi object is loaded using midi_path

Returns

annotations.BeatData – machine generated beat data

mirdata.datasets.groove_midi.load_drum_events(midi_path, midi=None)[source]

Load drum events from the midi file.

Parameters
  • midi_path (str) – path to midi file

  • midi (pretty_midi.PrettyMIDI) – pre-loaded midi object, or None. If None, the midi object is loaded using midi_path

Returns

annotations.EventData – drum event data

mirdata.datasets.groove_midi.load_midi(fhandle: BinaryIO) → Optional[pretty_midi.PrettyMIDI][source]

Load a Groove MIDI midi file.

Parameters

fhandle (str or file-like) – File-like object or path to midi file

Returns

midi_data (pretty_midi.PrettyMIDI) – pretty_midi object

gtzan_genre

GTZAN-Genre Dataset Loader

class mirdata.datasets.gtzan_genre.Dataset(data_home=None)[source]

The gtzan_genre dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally print a message.

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

license()[source]

Print the license

load_audio(*args, **kwargs)[source]

Load a GTZAN audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index that are missing locally

  • list - files which have an invalid checksum

class mirdata.datasets.gtzan_genre.Track(track_id, data_home, dataset_name, index, metadata)[source]

gtzan_genre Track class

Parameters

track_id (str) – track id of the track

Variables
  • audio_path (str) – path to the audio file

  • genre (str) – annotated genre

  • track_id (str) – track id

property audio

The track’s audio

Returns

  • np.ndarray - audio signal

  • float - sample rate

to_jams()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format

mirdata.datasets.gtzan_genre.load_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]

Load a GTZAN audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

guitarset

GuitarSet Loader

class mirdata.datasets.guitarset.Dataset(data_home=None)[source]

The guitarset dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally print a message.

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

license()[source]

Print the license

load_audio(*args, **kwargs)[source]

Load a Guitarset audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

load_beats(*args, **kwargs)[source]

Load a Guitarset beats annotation.

Parameters

fhandle (str or file-like) – File-like object or path of the jams annotation file

Returns

BeatData – Beat data

load_chords(*args, **kwargs)[source]

Load a guitarset chord annotation.

Parameters
  • jams_path (str) – Path of the jams annotation file

  • leadsheet_version (bool) – Whether to load the leadsheet version of the chord annotation. If False, load the inferred version.

Returns

ChordData – Chord data

load_key_mode(*args, **kwargs)[source]

Load a Guitarset key-mode annotation.

Parameters

fhandle (str or file-like) – File-like object or path of the jams annotation file

Returns

KeyData – Key data

load_multitrack_audio(*args, **kwargs)[source]

Load a Guitarset multitrack audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

load_notes(*args, **kwargs)[source]

Load a guitarset note annotation for a given string

Parameters
  • jams_path (str) – Path of the jams annotation file

  • string_num (int), in range(6) – Which string to load. 0 is the Low E string, 5 is the high e string.

Returns

NoteData – Note data for the given string

load_pitch_contour(*args, **kwargs)[source]

Load a guitarset pitch contour annotation for a given string

Parameters
  • jams_path (str) – Path of the jams annotation file

  • string_num (int), in range(6) – Which string to load. 0 is the Low E string, 5 is the high e string.

Returns

F0Data – Pitch contour data for the given string

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index that are missing locally

  • list - files which have an invalid checksum

class mirdata.datasets.guitarset.Track(track_id, data_home, dataset_name, index, metadata)[source]

guitarset Track class

Parameters

track_id (str) – track id of the track

Variables
  • audio_hex_cln_path (str) – path to the debleeded hex wave file

  • audio_hex_path (str) – path to the original hex wave file

  • audio_mic_path (str) – path to the mono wave via microphone

  • audio_mix_path (str) – path to the mono wave via downmixing hex pickup

  • jams_path (str) – path to the jams file

  • mode (str) – one of [‘solo’, ‘comp’]. For each excerpt, players are asked to first play in ‘comp’ mode and later play a ‘solo’ version on top of the already recorded comp.

  • player_id (str) – ID of the different players. one of [‘00’, ‘01’, … , ‘05’]

  • style (str) – one of [‘Jazz’, ‘Bossa Nova’, ‘Rock’, ‘Singer-Songwriter’, ‘Funk’]

  • tempo (float) – BPM of the track

  • track_id (str) – track id

Other Parameters
  • beats (BeatData) – beat positions

  • leadsheet_chords (ChordData) – chords as written in the leadsheet

  • inferred_chords (ChordData) – chords inferred from played transcription

  • key_mode (KeyData) – key and mode

  • pitch_contours (dict) – Pitch contours per string, keyed by string name (‘E’, ‘A’, ‘D’, ‘G’, ‘B’, ‘e’), each mapping to an F0Data object

  • notes (dict) – Notes per string, keyed by string name (‘E’, ‘A’, ‘D’, ‘G’, ‘B’, ‘e’), each mapping to a NoteData object

property audio_hex

Hexaphonic audio (6-channels) with one channel per string

Returns

  • np.ndarray - audio signal

  • float - sample rate

property audio_hex_cln

Hexaphonic audio (6-channels) with one channel per string, after bleed removal

Returns

  • np.ndarray - audio signal

  • float - sample rate

property audio_mic

The track’s audio

Returns

  • np.ndarray - audio signal

  • float - sample rate

property audio_mix

Mixture audio (mono)

Returns

  • np.ndarray - audio signal

  • float - sample rate

to_jams()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format

mirdata.datasets.guitarset.load_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]

Load a Guitarset audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

mirdata.datasets.guitarset.load_beats(fhandle: TextIO) → mirdata.annotations.BeatData[source]

Load a Guitarset beats annotation.

Parameters

fhandle (str or file-like) – File-like object or path of the jams annotation file

Returns

BeatData – Beat data

mirdata.datasets.guitarset.load_chords(jams_path, leadsheet_version=True)[source]

Load a guitarset chord annotation.

Parameters
  • jams_path (str) – Path of the jams annotation file

  • leadsheet_version (bool) – Whether to load the leadsheet version of the chord annotation. If False, load the inferred version.

Returns

ChordData – Chord data

mirdata.datasets.guitarset.load_key_mode(fhandle: TextIO) → mirdata.annotations.KeyData[source]

Load a Guitarset key-mode annotation.

Parameters

fhandle (str or file-like) – File-like object or path of the jams annotation file

Returns

KeyData – Key data

mirdata.datasets.guitarset.load_multitrack_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]

Load a Guitarset multitrack audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

mirdata.datasets.guitarset.load_notes(jams_path, string_num)[source]

Load a guitarset note annotation for a given string

Parameters
  • jams_path (str) – Path of the jams annotation file

  • string_num (int), in range(6) – Which string to load. 0 is the Low E string, 5 is the high e string.

Returns

NoteData – Note data for the given string

mirdata.datasets.guitarset.load_pitch_contour(jams_path, string_num)[source]

Load a guitarset pitch contour annotation for a given string

Parameters
  • jams_path (str) – Path of the jams annotation file

  • string_num (int), in range(6) – Which string to load. 0 is the Low E string, 5 is the high e string.

Returns

F0Data – Pitch contour data for the given string

ikala

iKala Dataset Loader

class mirdata.datasets.ikala.Dataset(data_home=None)[source]

The ikala dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally print a message.

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

license()[source]

Print the license

load_f0(*args, **kwargs)[source]

Load an ikala f0 annotation

Parameters

fhandle (str or file-like) – File-like object or path to f0 annotation file

Raises

IOError – If f0_path does not exist

Returns

F0Data – the f0 annotation data

load_instrumental_audio(*args, **kwargs)[source]

Load ikala instrumental audio

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - audio signal

  • float - sample rate

load_lyrics(*args, **kwargs)[source]

Load an ikala lyrics annotation

Parameters

fhandle (str or file-like) – File-like object or path to lyric annotation file

Raises

IOError – if lyrics_path does not exist

Returns

LyricData – lyric annotation data

load_mix_audio(*args, **kwargs)[source]

Load an ikala mix.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - audio signal

  • float - sample rate

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

load_vocal_audio(*args, **kwargs)[source]

Load ikala vocal audio

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - audio signal

  • float - sample rate

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index that are missing locally

  • list - files which have an invalid checksum

class mirdata.datasets.ikala.Track(track_id, data_home, dataset_name, index, metadata)[source]

ikala Track class

Parameters

track_id (str) – track id of the track

Variables
  • audio_path (str) – path to the track’s audio file

  • f0_path (str) – path to the track’s f0 annotation file

  • lyrics_path (str) – path to the track’s lyric annotation file

  • section (str) – section. Either ‘verse’ or ‘chorus’

  • singer_id (str) – singer id

  • song_id (str) – song id of the track

  • track_id (str) – track id

Other Parameters
  • f0 (F0Data) – human-annotated singing voice pitch

  • lyrics (LyricData) – human-annotated lyrics

property instrumental_audio

instrumental audio (mono)

Returns

  • np.ndarray - audio signal

  • float - sample rate

property mix_audio

mixture audio (mono)

Returns

  • np.ndarray - audio signal

  • float - sample rate

to_jams()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format

property vocal_audio

solo vocal audio (mono)

Returns

  • np.ndarray - audio signal

  • float - sample rate

mirdata.datasets.ikala.load_f0(fhandle: TextIO) → mirdata.annotations.F0Data[source]

Load an ikala f0 annotation

Parameters

fhandle (str or file-like) – File-like object or path to f0 annotation file

Raises

IOError – If f0_path does not exist

Returns

F0Data – the f0 annotation data

mirdata.datasets.ikala.load_instrumental_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]

Load ikala instrumental audio

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - audio signal

  • float - sample rate

mirdata.datasets.ikala.load_lyrics(fhandle: TextIO) → mirdata.annotations.LyricData[source]

Load an ikala lyrics annotation

Parameters

fhandle (str or file-like) – File-like object or path to lyric annotation file

Raises

IOError – if lyrics_path does not exist

Returns

LyricData – lyric annotation data

mirdata.datasets.ikala.load_mix_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]

Load an ikala mix.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - audio signal

  • float - sample rate

mirdata.datasets.ikala.load_vocal_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]

Load ikala vocal audio

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - audio signal

  • float - sample rate

irmas

IRMAS Loader

class mirdata.datasets.irmas.Dataset(data_home=None)[source]

The irmas dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally print a message.

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

license()[source]

Print the license

load_audio(*args, **kwargs)[source]

Load an IRMAS dataset audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

load_pred_inst(*args, **kwargs)[source]

Load predominant instrument of track

Parameters

fhandle (str or file-like) – File-like object or path where the test annotations are stored.

Returns

list(str) – test track predominant instrument(s) annotations

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index that are missing locally

  • list - files which have an invalid checksum

class mirdata.datasets.irmas.Track(track_id, data_home, dataset_name, index, metadata)[source]

IRMAS track class

Parameters
  • track_id (str) – track id of the track

  • data_home (str) – Local path where the dataset is stored. If None, looks for the data in the default directory, ~/mir_datasets/irmas

Variables
  • track_id (str) – track id

  • predominant_instrument (list) – Training tracks predominant instrument

  • train (bool) – flag to identify if the track is from the training or the testing dataset

  • genre (str) – string containing the namecode of the genre of the track.

  • drum (bool) – flag to identify if the track contains drums or not.

Other Parameters

instrument (list) – list of predominant instruments as str

property audio

The track’s audio signal

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

to_jams()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format

mirdata.datasets.irmas.load_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]

Load an IRMAS dataset audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

mirdata.datasets.irmas.load_pred_inst(fhandle: TextIO) → List[str][source]

Load predominant instrument of track

Parameters

fhandle (str or file-like) – File-like object or path where the test annotations are stored.

Returns

list(str) – test track predominant instrument(s) annotations

maestro

MAESTRO Dataset Loader

class mirdata.datasets.maestro.Dataset(data_home=None)[source]

The maestro dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download the dataset

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

license()[source]

Print the license

load_audio(*args, **kwargs)[source]

Load a MAESTRO audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

load_midi(*args, **kwargs)[source]

Load a MAESTRO midi file.

Parameters

fhandle (str or file-like) – File-like object or path to midi file

Returns

pretty_midi.PrettyMIDI – pretty_midi object

load_notes(*args, **kwargs)[source]

Load note data from the midi file.

Parameters
  • midi_path (str) – path to midi file

  • midi (pretty_midi.PrettyMIDI) – pre-loaded midi object, or None. If None, the midi object is loaded using midi_path

Returns

NoteData – note annotations

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index that are missing locally

  • list - files which have an invalid checksum

class mirdata.datasets.maestro.Track(track_id, data_home, dataset_name, index, metadata)[source]

MAESTRO Track class

Parameters

track_id (str) – track id of the track

Variables
  • audio_path (str) – Path to the track’s audio file

  • canonical_composer (str) – Composer of the piece, standardized on a single spelling for a given name.

  • canonical_title (str) – Title of the piece. Not guaranteed to be standardized to a single representation.

  • duration (float) – Duration in seconds, based on the MIDI file.

  • midi_path (str) – Path to the track’s MIDI file

  • split (str) – Suggested train/validation/test split.

  • track_id (str) – track id

  • year (int) – Year of performance.

Cached Properties
  • midi (pretty_midi.PrettyMIDI) – object containing MIDI annotations

  • notes (NoteData) – annotated piano notes

property audio

The track’s audio

Returns

  • np.ndarray - audio signal

  • float - sample rate

to_jams()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format

mirdata.datasets.maestro.load_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]

Load a MAESTRO audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

mirdata.datasets.maestro.load_midi(fhandle: BinaryIO) → pretty_midi.PrettyMIDI[source]

Load a MAESTRO midi file.

Parameters

fhandle (str or file-like) – File-like object or path to midi file

Returns

pretty_midi.PrettyMIDI – pretty_midi object

mirdata.datasets.maestro.load_notes(midi_path, midi=None)[source]

Load note data from the midi file.

Parameters
  • midi_path (str) – path to midi file

  • midi (pretty_midi.PrettyMIDI) – pre-loaded midi object, or None. If None, the midi object is loaded using midi_path

Returns

NoteData – note annotations

medley_solos_db

Medley-solos-DB Dataset Loader.

class mirdata.datasets.medley_solos_db.Dataset(data_home=None)[source]

The medley_solos_db dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally print a message.

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

license()[source]

Print the license

load_audio(*args, **kwargs)[source]

Load a Medley Solos DB audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index that are missing locally

  • list - files which have an invalid checksum

class mirdata.datasets.medley_solos_db.Track(track_id, data_home, dataset_name, index, metadata)[source]

medley_solos_db Track class

Parameters

track_id (str) – track id of the track

Variables
  • audio_path (str) – path to the track’s audio file

  • instrument (str) – instrument encoded by its English name

  • instrument_id (int) – instrument encoded as an integer

  • song_id (int) – song encoded as an integer

  • subset (str) – either equal to ‘train’, ‘validation’, or ‘test’

  • track_id (str) – track id

property audio

The track’s audio

Returns

  • np.ndarray - audio signal

  • float - sample rate

to_jams()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format

mirdata.datasets.medley_solos_db.load_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]

Load a Medley Solos DB audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

medleydb_melody

MedleyDB melody Dataset Loader

class mirdata.datasets.medleydb_melody.Dataset(data_home=None)[source]

The medleydb_melody dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally print a message.

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

license()[source]

Print the license

load_audio(*args, **kwargs)[source]

Load a MedleyDB audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

load_melody(*args, **kwargs)[source]

Load a MedleyDB melody1 or melody2 annotation file

Parameters

fhandle (str or file-like) – File-like object or path to a melody annotation file

Raises

IOError – if melody_path does not exist

Returns

F0Data – melody data

load_melody3(*args, **kwargs)[source]

Load a MedleyDB melody3 annotation file

Parameters

fhandle (str or file-like) – File-like object or path to a melody 3 annotation file

Raises

IOError – if melody_path does not exist

Returns

MultiF0Data – melody 3 annotation data

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index that are missing locally

  • list - files which have an invalid checksum

class mirdata.datasets.medleydb_melody.Track(track_id, data_home, dataset_name, index, metadata)[source]

medleydb_melody Track class

Parameters

track_id (str) – track id of the track

Variables
  • artist (str) – artist

  • audio_path (str) – path to the audio file

  • genre (str) – genre

  • is_excerpt (bool) – True if the track is an excerpt

  • is_instrumental (bool) – True if the track does not contain vocals

  • melody1_path (str) – path to the melody1 annotation file

  • melody2_path (str) – path to the melody2 annotation file

  • melody3_path (str) – path to the melody3 annotation file

  • n_sources (int) – Number of instruments in the track

  • title (str) – title

  • track_id (str) – track id

Other Parameters
  • melody1 (F0Data) – the pitch of the single most predominant source (often the voice)

  • melody2 (F0Data) – the pitch of the predominant source for each point in time

  • melody3 (MultiF0Data) – the pitch of any melodic source. Allows for more than one f0 value at a time

property audio

The track’s audio

Returns

  • np.ndarray - audio signal

  • float - sample rate

to_jams()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format

mirdata.datasets.medleydb_melody.load_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]

Load a MedleyDB audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

mirdata.datasets.medleydb_melody.load_melody(fhandle: TextIO) → mirdata.annotations.F0Data[source]

Load a MedleyDB melody1 or melody2 annotation file

Parameters

fhandle (str or file-like) – File-like object or path to a melody annotation file

Raises

IOError – if melody_path does not exist

Returns

F0Data – melody data

mirdata.datasets.medleydb_melody.load_melody3(fhandle: TextIO) → mirdata.annotations.MultiF0Data[source]

Load a MedleyDB melody3 annotation file

Parameters

fhandle (str or file-like) – File-like object or path to a melody 3 annotation file

Raises

IOError – if melody_path does not exist

Returns

MultiF0Data – melody 3 annotation data
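A minimal sketch of a typical medleydb_melody workflow, assuming mirdata is installed and the MedleyDB audio has been obtained separately (only some remotes are openly downloadable). `summarize_validation` is a hypothetical helper, not part of mirdata:

```python
def summarize_validation(missing, invalid):
    """Condense the two lists returned by Dataset.validate() into one line."""
    return "missing: {}, invalid checksum: {}".format(len(missing), len(invalid))

def example_usage():  # requires mirdata; MedleyDB audio must be obtained separately
    import mirdata

    dataset = mirdata.initialize("medleydb_melody")
    dataset.download()                     # fetch the downloadable remotes
    missing, invalid = dataset.validate(verbose=False)
    print(summarize_validation(missing, invalid))

    track = dataset.choice_track()         # a random Track
    melody = track.melody2                 # F0Data: predominant-source pitch
    audio, sr = track.audio                # mono signal and sample rate
    return melody, sr
```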

medleydb_pitch

MedleyDB pitch Dataset Loader

class mirdata.datasets.medleydb_pitch.Dataset(data_home=None)[source]

The medleydb_pitch dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally print a message.

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

license()[source]

Print the license

load_audio(*args, **kwargs)[source]

Load a MedleyDB audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

load_pitch(*args, **kwargs)[source]

Load a MedleyDB pitch annotation file

Parameters

pitch_path (str) – path to pitch annotation file

Raises

IOError – if pitch_path doesn’t exist

Returns

F0Data – pitch annotation

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index that are missing locally

  • list - files which have an invalid checksum

class mirdata.datasets.medleydb_pitch.Track(track_id, data_home, dataset_name, index, metadata)[source]

medleydb_pitch Track class

Parameters

track_id (str) – track id of the track

Variables
  • artist (str) – artist

  • audio_path (str) – path to the audio file

  • genre (str) – genre

  • instrument (str) – instrument of the track

  • pitch_path (str) – path to the pitch annotation file

  • title (str) – title

  • track_id (str) – track id

Other Parameters

pitch (F0Data) – human annotated pitch

property audio

The track’s audio

Returns

  • np.ndarray - audio signal

  • float - sample rate

to_jams()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format

mirdata.datasets.medleydb_pitch.load_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]

Load a MedleyDB audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

mirdata.datasets.medleydb_pitch.load_pitch(fhandle: TextIO) → mirdata.annotations.F0Data[source]

Load a MedleyDB pitch annotation file

Parameters

pitch_path (str) – path to pitch annotation file

Raises

IOError – if pitch_path doesn’t exist

Returns

F0Data – pitch annotation
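Because load_pitch accepts a file-like object, annotations can be parsed from any text stream. The comma-separated "time,frequency" layout assumed below mirrors the MedleyDB pitch files (an assumption here, as is the annotation path); `split_pitch_rows` is an illustrative helper, not part of mirdata:

```python
def split_pitch_rows(text):
    """Split 'time,frequency' rows into parallel float lists."""
    times, freqs = [], []
    for line in text.strip().splitlines():
        t, f = line.split(",")
        times.append(float(t))
        freqs.append(float(f))
    return times, freqs

def example_usage():  # requires mirdata; the annotation path is hypothetical
    from mirdata.datasets import medleydb_pitch
    with open("pitch_annotation.csv") as fhandle:
        f0data = medleydb_pitch.load_pitch(fhandle)   # returns F0Data
    return f0data
```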

mridangam_stroke

Mridangam Stroke Dataset Loader

class mirdata.datasets.mridangam_stroke.Dataset(data_home=None)[source]

The mridangam_stroke dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally print a message.

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

license()[source]

Print the license

load_audio(*args, **kwargs)[source]

Load a Mridangam Stroke Dataset audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index that are missing locally

  • list - files which have an invalid checksum

class mirdata.datasets.mridangam_stroke.Track(track_id, data_home, dataset_name, index, metadata)[source]

Mridangam Stroke track class

Parameters
  • track_id (str) – track id of the track

  • data_home (str) – Local path where the dataset is stored.

Variables
  • track_id (str) – track id

  • audio_path (str) – audio path

  • stroke_name (str) – name of the Mridangam stroke present in Track

  • tonic (str) – tonic of the stroke in the Track

property audio

The track’s audio

Returns

  • np.ndarray - audio signal

  • float - sample rate

to_jams()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format

mirdata.datasets.mridangam_stroke.load_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]

Load a Mridangam Stroke Dataset audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file
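The stroke_name and tonic metadata make it straightforward to partition the dataset for stroke-classification experiments. A hedged sketch, assuming mirdata is installed and the dataset downloaded; `group_by_value` is a plain helper, not part of mirdata:

```python
def group_by_value(pairs):
    """Group (track_id, value) pairs into {value: [track_ids]}."""
    groups = {}
    for track_id, value in pairs:
        groups.setdefault(value, []).append(track_id)
    return groups

def example_usage():  # requires mirdata and a downloaded dataset
    import mirdata
    dataset = mirdata.initialize("mridangam_stroke")
    tracks = dataset.load_tracks()        # {track_id: Track}
    return group_by_value((tid, t.stroke_name) for tid, t in tracks.items())
```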

orchset

ORCHSET Dataset Loader

class mirdata.datasets.orchset.Dataset(data_home=None)[source]

The orchset dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally print a message.

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

license()[source]

Print the license

load_audio_mono(*args, **kwargs)[source]

Load an Orchset audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

load_audio_stereo(*args, **kwargs)[source]

Load an Orchset audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the stereo audio signal

  • float - The sample rate of the audio file

load_melody(*args, **kwargs)[source]

Load an Orchset melody annotation file

Parameters

fhandle (str or file-like) – File-like object or path to melody annotation file

Raises

IOError – if melody_path doesn’t exist

Returns

F0Data – melody annotation data

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index that are missing locally

  • list - files which have an invalid checksum

class mirdata.datasets.orchset.Track(track_id, data_home, dataset_name, index, metadata)[source]

orchset Track class

Parameters

track_id (str) – track id of the track

Variables
  • alternating_melody (bool) – True if the melody alternates between instruments

  • audio_path_mono (str) – path to the mono audio file

  • audio_path_stereo (str) – path to the stereo audio file

  • composer (str) – the work’s composer

  • contains_brass (bool) – True if the track contains any brass instrument

  • contains_strings (bool) – True if the track contains any string instrument

  • contains_winds (bool) – True if the track contains any wind instrument

  • excerpt (str) – whether the track is an excerpt

  • melody_path (str) – path to the melody annotation file

  • only_brass (bool) – True if the track contains brass instruments only

  • only_strings (bool) – True if the track contains string instruments only

  • only_winds (bool) – True if the track contains wind instruments only

  • predominant_melodic_instruments (list) – List of instruments which play the melody

  • track_id (str) – track id

  • work (str) – The musical work

Other Parameters

melody (F0Data) – melody annotation

property audio_mono

the track’s audio (mono)

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

property audio_stereo

the track’s audio (stereo)

Returns

  • np.ndarray - the stereo audio signal

  • float - The sample rate of the audio file

to_jams()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format

mirdata.datasets.orchset.load_audio_mono(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]

Load an Orchset audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

mirdata.datasets.orchset.load_audio_stereo(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]

Load an Orchset audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the stereo audio signal

  • float - The sample rate of the audio file

mirdata.datasets.orchset.load_melody(fhandle: TextIO) → mirdata.annotations.F0Data[source]

Load an Orchset melody annotation file

Parameters

fhandle (str or file-like) – File-like object or path to melody annotation file

Raises

IOError – if melody_path doesn’t exist

Returns

F0Data – melody annotation data
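Orchset tracks expose both mono and stereo audio. As an illustrative sketch, a simple sample-wise mean approximates a mono downmix of the stereo channels; mirdata already provides audio_mono directly, so `downmix_to_mono` is purely for illustration, and the mirdata calls assume the dataset has been downloaded:

```python
def downmix_to_mono(left, right):
    """Average two channel sample sequences into one mono list."""
    return [0.5 * (l + r) for l, r in zip(left, right)]

def example_usage():  # requires mirdata and a downloaded dataset
    import mirdata
    dataset = mirdata.initialize("orchset")
    track = dataset.choice_track()
    y_stereo, sr = track.audio_stereo     # 2-channel signal
    y_mono, _ = track.audio_mono          # provided directly by mirdata
    melody = track.melody                 # F0Data annotation
    return y_mono, sr, melody
```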

rwc_classical

RWC Classical Dataset Loader

class mirdata.datasets.rwc_classical.Dataset(data_home=None)[source]

The rwc_classical dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally print a message.

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

license()[source]

Print the license

load_audio(*args, **kwargs)[source]

Load an RWC audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

load_beats(*args, **kwargs)[source]

Load RWC beat data from a file

Parameters

fhandle (str or file-like) – File-like object or path to beats annotation file

Returns

BeatData – beat data

load_sections(*args, **kwargs)[source]

Load RWC section data from a file

Parameters

fhandle (str or file-like) – File-like object or path to sections annotation file

Returns

SectionData – section data

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index that are missing locally

  • list - files which have an invalid checksum

class mirdata.datasets.rwc_classical.Track(track_id, data_home, dataset_name, index, metadata)[source]

rwc_classical Track class

Parameters

track_id (str) – track id of the track

Variables
  • artist (str) – the track’s artist

  • audio_path (str) – path of the audio file

  • beats_path (str) – path of the beat annotation file

  • category (str) – One of ‘Symphony’, ‘Concerto’, ‘Orchestral’, ‘Solo’, ‘Chamber’, ‘Vocal’, or blank.

  • composer (str) – Composer of this Track.

  • duration (float) – Duration of the track in seconds

  • piece_number (str) – Piece number of this Track, [1-50]

  • sections_path (str) – path of the section annotation file

  • suffix (str) – a string in the range M01-M06

  • title (str) – Title of the track.

  • track_id (str) – track id

  • track_number (str) – CD track number of this Track

Other Parameters
  • sections (SectionData) – human-labeled section annotations

  • beats (BeatData) – human-labeled beat annotations

property audio

The track’s audio

Returns

  • np.ndarray - audio signal

  • float - sample rate

to_jams()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format

mirdata.datasets.rwc_classical.load_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]

Load an RWC audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

mirdata.datasets.rwc_classical.load_beats(fhandle: TextIO) → mirdata.annotations.BeatData[source]

Load RWC beat data from a file

Parameters

fhandle (str or file-like) – File-like object or path to beats annotation file

Returns

BeatData – beat data

mirdata.datasets.rwc_classical.load_sections(fhandle: TextIO) → Optional[mirdata.annotations.SectionData][source]

Load RWC section data from a file

Parameters

fhandle (str or file-like) – File-like object or path to sections annotation file

Returns

SectionData – section data
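BeatData stores beat times in seconds, so a rough global tempo can be read off the median inter-beat interval. A hedged sketch, assuming mirdata is installed and the RWC Classical data is available locally; `tempo_from_beats` is an illustrative helper, not part of mirdata:

```python
def tempo_from_beats(beat_times):
    """Estimate global BPM from the median inter-beat interval (seconds)."""
    intervals = sorted(b - a for a, b in zip(beat_times, beat_times[1:]))
    median = intervals[len(intervals) // 2]
    return 60.0 / median

def example_usage():  # requires mirdata and a downloaded dataset
    import mirdata
    dataset = mirdata.initialize("rwc_classical")
    track = dataset.choice_track()
    beats = track.beats                   # BeatData with beat times in seconds
    return tempo_from_beats(list(beats.times))
```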

rwc_jazz

RWC Jazz Dataset Loader.

class mirdata.datasets.rwc_jazz.Dataset(data_home=None)[source]

The rwc_jazz dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally print a message.

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

license()[source]

Print the license

load_audio(*args, **kwargs)[source]

Load an RWC audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

load_beats(*args, **kwargs)[source]

Load RWC beat data from a file

Parameters

fhandle (str or file-like) – File-like object or path to beats annotation file

Returns

BeatData – beat data

load_sections(*args, **kwargs)[source]

Load RWC section data from a file

Parameters

fhandle (str or file-like) – File-like object or path to sections annotation file

Returns

SectionData – section data

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index that are missing locally

  • list - files which have an invalid checksum

class mirdata.datasets.rwc_jazz.Track(track_id, data_home, dataset_name, index, metadata)[source]

rwc_jazz Track class

Parameters

track_id (str) – track id of the track

Variables
  • artist (str) – Artist name

  • audio_path (str) – path of the audio file

  • beats_path (str) – path of the beat annotation file

  • duration (float) – Duration of the track in seconds

  • instruments (str) – list of instruments used

  • piece_number (str) – Piece number of this Track, [1-50]

  • sections_path (str) – path of the section annotation file

  • suffix (str) – a string in the range M01-M04

  • title (str) – Title of the track.

  • track_id (str) – track id

  • track_number (str) – CD track number of this Track

  • variation (str) – style variations

Other Parameters
  • sections (SectionData) – human-labeled section data

  • beats (BeatData) – human-labeled beat data

property audio

The track’s audio

Returns

  • np.ndarray - audio signal

  • float - sample rate

to_jams()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format
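Since the instruments field is a single string, tracks featuring a given instrument can be selected with a plain substring filter (the exact string format is an assumption here). A hedged sketch, assuming mirdata is installed and the dataset downloaded; `filter_by_instrument` is an illustrative helper, not part of mirdata:

```python
def filter_by_instrument(instruments_by_id, name):
    """Return track ids whose instrument string mentions name (case-insensitive)."""
    name = name.lower()
    return [tid for tid, instruments in instruments_by_id.items()
            if name in (instruments or "").lower()]

def example_usage():  # requires mirdata and a downloaded dataset
    import mirdata
    dataset = mirdata.initialize("rwc_jazz")
    tracks = dataset.load_tracks()
    return filter_by_instrument(
        {tid: t.instruments for tid, t in tracks.items()}, "piano")
```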

salami

SALAMI Dataset Loader

class mirdata.datasets.salami.Dataset(data_home=None)[source]

The salami dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally print a message.

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

license()[source]

Print the license

load_audio(*args, **kwargs)[source]

Load a Salami audio file.

Parameters

fhandle (str or file-like) – path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

load_sections(*args, **kwargs)[source]

Load salami sections data from a file

Parameters

fhandle (str or file-like) – File-like object or path to section annotation file

Returns

SectionData – section data

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index that are missing locally

  • list - files which have an invalid checksum

class mirdata.datasets.salami.Track(track_id, data_home, dataset_name, index, metadata)[source]

salami Track class

Parameters

track_id (str) – track id of the track

Variables
  • annotator_1_id (str) – number that identifies annotator 1

  • annotator_1_time (str) – time that annotator 1 took to complete the annotation

  • annotator_2_id (str) – number that identifies annotator 2

  • annotator_2_time (str) – time that annotator 2 took to complete the annotation

  • artist (str) – song artist

  • audio_path (str) – path to the audio file

  • broad_genre (str) – broad genre of the song

  • duration (float) – duration of song in seconds

  • genre (str) – genre of the song

  • sections_annotator1_lowercase_path (str) – path to annotations in hierarchy level 1 from annotator 1

  • sections_annotator1_uppercase_path (str) – path to annotations in hierarchy level 0 from annotator 1

  • sections_annotator2_lowercase_path (str) – path to annotations in hierarchy level 1 from annotator 2

  • sections_annotator2_uppercase_path (str) – path to annotations in hierarchy level 0 from annotator 2

  • source (str) – dataset or source of song

  • title (str) – title of the song

Other Parameters
  • sections_annotator_1_uppercase (SectionData) – annotations in hierarchy level 0 from annotator 1

  • sections_annotator_1_lowercase (SectionData) – annotations in hierarchy level 1 from annotator 1

  • sections_annotator_2_uppercase (SectionData) – annotations in hierarchy level 0 from annotator 2

  • sections_annotator_2_lowercase (SectionData) – annotations in hierarchy level 1 from annotator 2

property audio

The track’s audio

Returns

  • np.ndarray - audio signal

  • float - sample rate

to_jams()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format

mirdata.datasets.salami.load_audio(fhandle: str) → Tuple[numpy.ndarray, float][source]

Load a Salami audio file.

Parameters

fhandle (str or file-like) – path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

mirdata.datasets.salami.load_sections(fhandle: TextIO) → mirdata.annotations.SectionData[source]

Load salami sections data from a file

Parameters

fhandle (str or file-like) – File-like object or path to section annotation file

Returns

SectionData – section data
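Because SALAMI provides two annotators per track, simple inter-annotator comparisons are possible. The sketch below uses a naive greedy boundary matcher for illustration (mir_eval.segment offers proper metrics); it assumes mirdata is installed, the dataset is downloaded, and that SectionData exposes an `intervals` array as documented:

```python
def matched_boundaries(times_a, times_b, tol=0.5):
    """Greedily count boundaries in times_a with a partner in times_b within tol seconds."""
    unused = list(times_b)
    hits = 0
    for t in times_a:
        for u in unused:
            if abs(t - u) <= tol:
                unused.remove(u)
                hits += 1
                break
    return hits

def example_usage():  # requires mirdata and a downloaded dataset
    import mirdata
    dataset = mirdata.initialize("salami")
    track = dataset.choice_track()
    a = track.sections_annotator_1_uppercase   # SectionData (may be None)
    b = track.sections_annotator_2_uppercase
    if a is None or b is None:
        return None
    return matched_boundaries(list(a.intervals[:, 0]), list(b.intervals[:, 0]))
```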

saraga_carnatic

Saraga Dataset Loader

class mirdata.datasets.saraga_carnatic.Dataset(data_home=None)[source]

The saraga_carnatic dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally print a message.

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

license()[source]

Print the license

load_audio(*args, **kwargs)[source]

Load a Saraga Carnatic audio file.

Parameters

audio_path (str) – path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

load_metadata(*args, **kwargs)[source]

Load a Saraga Carnatic metadata file

Parameters

metadata_path (str) – path to metadata json file

Returns

dict

metadata with the following fields

  • title (str): Title of the piece in the track

  • mbid (str): MusicBrainz ID of the track

  • album_artists (list, dicts): list of dicts containing the album artists present in the track and their mbids

  • artists (list, dicts): list of dicts containing information about the featured artists in the track

  • raaga (list, dict): list of dicts containing information about the raagas present in the track

  • form (list, dict): list of dicts containing information about the forms present in the track

  • work (list, dicts): list of dicts containing the works present in the piece and their mbids

  • taala (list, dicts): list of dicts containing the taalas present in the track and their uuids

  • concert (list, dicts): list of dicts containing the concerts where the track is present and their mbids

load_phrases(*args, **kwargs)[source]

Load phrases

Parameters

phrases_path (str) – Local path where the phrase annotation is stored. If None, returns None.

Returns

EventData – phrases annotation for track

load_pitch(*args, **kwargs)[source]

Load pitch

Parameters

pitch_path (str) – Local path where the pitch annotation is stored. If None, returns None.

Returns

F0Data – pitch annotation

load_sama(*args, **kwargs)[source]

Load sama

Parameters

sama_path (str) – Local path where the sama annotation is stored. If None, returns None.

Returns

BeatData – sama annotations

load_sections(*args, **kwargs)[source]

Load sections from carnatic collection

Parameters

sections_path (str) – Local path where the section annotation is stored.

Returns

SectionData – section annotations for track

load_tempo(*args, **kwargs)[source]

Load tempo from carnatic collection

Parameters

tempo_path (str) – Local path where the tempo annotation is stored.

Returns

dict

Dictionary of tempo information with the following keys:

  • tempo_apm: tempo in aksharas per minute (APM)

  • tempo_bpm: tempo in beats per minute (BPM)

  • sama_interval: median duration (in seconds) of one tāla cycle

  • beats_per_cycle: number of beats in one cycle of the tāla

  • subdivisions: number of aksharas per beat of the tāla

load_tonic(*args, **kwargs)[source]

Load track absolute tonic

Parameters

tonic_path (str) – Local path where the tonic annotation is stored. If None, returns None.

Returns

float – Tonic annotation in Hz

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index that are missing locally

  • list - files which have an invalid checksum

class mirdata.datasets.saraga_carnatic.Track(track_id, data_home, dataset_name, index, metadata)[source]

Saraga Carnatic Track class

Parameters
  • track_id (str) – track id of the track

  • data_home (str) – Local path where the dataset is stored. default=None If None, looks for the data in the default directory, ~/mir_datasets

Variables
  • audio_path (str) – path to audio file

  • audio_ghatam_path (str) – path to ghatam audio file

  • audio_mridangam_left_path (str) – path to mridangam left audio file

  • audio_mridangam_right_path (str) – path to mridangam right audio file

  • audio_violin_path (str) – path to violin audio file

  • audio_vocal_s_path (str) – path to vocal s audio file

  • audio_vocal_path (str) – path to vocal audio file

  • ctonic_path (str) – path to ctonic annotation file

  • pitch_path (str) – path to pitch annotation file

  • pitch_vocal_path (str) – path to vocal pitch annotation file

  • tempo_path (str) – path to tempo annotation file

  • sama_path (str) – path to sama annotation file

  • sections_path (str) – path to sections annotation file

  • phrases_path (str) – path to phrases annotation file

  • metadata_path (str) – path to metadata file

Other Parameters
  • tonic (float) – tonic annotation

  • pitch (F0Data) – pitch annotation

  • pitch_vocal (F0Data) – vocal pitch annotation

  • tempo (dict) – tempo annotations

  • sama (BeatData) – sama section annotations

  • sections (SectionData) – track section annotations

  • phrases (EventData) – phrase annotations

  • metadata (dict) – track metadata with the following fields:

    • title (str): Title of the piece in the track

    • mbid (str): MusicBrainz ID of the track

    • album_artists (list, dict): list of dicts containing the album artists present in the track and their mbids

    • artists (list, dict): list of dicts with information about the featured artists in the track

    • raaga (list, dict): list of dicts with information about the raagas present in the track

    • form (list, dict): list of dicts with information about the forms present in the track

    • work (list, dict): list of dicts containing the works present in the piece and their mbids

    • taala (list, dict): list of dicts containing the taalas present in the track and their uuids

    • concert (list, dict): list of dicts containing the concerts where the track is present and their mbids

property audio

The track’s audio

Returns

  • np.ndarray - audio signal

  • float - sample rate

to_jams()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format

mirdata.datasets.saraga_carnatic.load_audio(audio_path)[source]

Load a Saraga Carnatic audio file.

Parameters

audio_path (str) – path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

mirdata.datasets.saraga_carnatic.load_metadata(metadata_path)[source]

Load a Saraga Carnatic metadata file

Parameters

metadata_path (str) – path to metadata json file

Returns

dict

metadata with the following fields

  • title (str): Title of the piece in the track

  • mbid (str): MusicBrainz ID of the track

  • album_artists (list, dict): list of dicts containing the album artists present in the track and their mbids

  • artists (list, dict): list of dicts with information about the featured artists in the track

  • raaga (list, dict): list of dicts with information about the raagas present in the track

  • form (list, dict): list of dicts with information about the forms present in the track

  • work (list, dict): list of dicts containing the works present in the piece and their mbids

  • taala (list, dict): list of dicts containing the taalas present in the track and their uuids

  • concert (list, dict): list of dicts containing the concerts where the track is present and their mbids

mirdata.datasets.saraga_carnatic.load_phrases(phrases_path)[source]

Load phrases

Parameters

phrases_path (str) – Local path where the phrase annotation is stored. If None, returns None.

Returns

EventData – phrases annotation for track

mirdata.datasets.saraga_carnatic.load_pitch(pitch_path)[source]

Load pitch

Parameters

pitch_path (str) – Local path where the pitch annotation is stored. If None, returns None.

Returns

F0Data – pitch annotation
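
For intuition, the work a pitch loader does can be sketched as follows. This assumes a simple two-column, tab-separated time/frequency text format, and `parse_pitch` is a hypothetical helper, not mirdata's actual implementation:

```python
import csv
import io

import numpy as np

def parse_pitch(fhandle):
    """Parse a two-column (time, frequency) annotation into the parallel
    arrays an F0Data object needs. Hypothetical sketch, not mirdata code."""
    times, freqs = [], []
    for row in csv.reader(fhandle, delimiter="\t"):
        times.append(float(row[0]))
        freqs.append(float(row[1]))
    times = np.array(times)
    freqs = np.array(freqs)
    # unvoiced frames are conventionally marked with frequency 0
    confidence = (freqs > 0).astype(float)
    return times, freqs, confidence

# tiny in-memory demonstration instead of a real annotation file
times, freqs, confidence = parse_pitch(io.StringIO("0.0\t0.0\n0.0044\t220.5\n"))
```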

mirdata.datasets.saraga_carnatic.load_sama(sama_path)[source]

Load sama

Parameters

sama_path (str) – Local path where the sama annotation is stored. If None, returns None.

Returns

BeatData – sama annotations

mirdata.datasets.saraga_carnatic.load_sections(sections_path)[source]

Load sections from carnatic collection

Parameters

sections_path (str) – Local path where the section annotation is stored.

Returns

SectionData – section annotations for track

mirdata.datasets.saraga_carnatic.load_tempo(tempo_path)[source]

Load tempo from carnatic collection

Parameters

tempo_path (str) – Local path where the tempo annotation is stored.

Returns

dict

Dictionary of tempo information with the following keys:

  • tempo_apm: tempo in aksharas per minute (APM)

  • tempo_bpm: tempo in beats per minute (BPM)

  • sama_interval: median duration (in seconds) of one tāla cycle

  • beats_per_cycle: number of beats in one cycle of the tāla

  • subdivisions: number of aksharas per beat of the tāla

mirdata.datasets.saraga_carnatic.load_tonic(tonic_path)[source]

Load track absolute tonic

Parameters

tonic_path (str) – Local path where the tonic annotation is stored. If None, returns None.

Returns

float – Tonic annotation in Hz

saraga_hindustani

Saraga Dataset Loader

class mirdata.datasets.saraga_hindustani.Dataset(data_home=None)[source]

The saraga_hindustani dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally print a message.

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

license()[source]

Print the license

load_audio(*args, **kwargs)[source]

Load a Saraga Hindustani audio file.

Parameters

audio_path (str) – path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

load_phrases(*args, **kwargs)[source]

Load phrases

Parameters

phrases_path (str) – Local path where the phrase annotation is stored. If None, returns None.

Returns

EventData – phrases annotation for track

load_pitch(*args, **kwargs)[source]

Load automatically extracted pitch or melody

Parameters

pitch_path (str) – Local path where the pitch annotation is stored. If None, returns None.

Returns

F0Data – pitch annotation

load_sama(*args, **kwargs)[source]

Load sama

Parameters

sama_path (str) – Local path where the sama annotation is stored. If None, returns None.

Returns

BeatData – sama annotations

load_sections(*args, **kwargs)[source]

Load track sections

Parameters

sections_path (str) – Local path where the section annotation is stored.

Returns

SectionData – section annotations for track

load_tempo(*args, **kwargs)[source]

Load tempo from hindustani collection

Parameters

tempo_path (str) – Local path where the tempo annotation is stored.

Returns

dict – Dictionary of tempo information with the following keys:

  • tempo: median tempo for the section in mātrās per minute (MPM)

  • matra_interval: tempo expressed as the duration of the mātra (essentially dividing 60 by tempo, expressed in seconds)

  • sama_interval: median duration of one tāl cycle in the section

  • matras_per_cycle: indicator of the structure of the tāl, showing the number of mātrā in a cycle of the tāl of the recording

  • start_time: start time of the section

  • duration: duration of the section
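
As with the Carnatic tempo annotations, the keys above are related by simple arithmetic. A sketch with example values (illustration only, not mirdata code):

```python
# Example values for one section (hypothetical, for illustration only)
tempo_mpm = 120        # median tempo in mātrās per minute (MPM)
matras_per_cycle = 16  # e.g. a 16-mātrā tāl cycle

# seconds per mātrā: 60 divided by the tempo in MPM
matra_interval = 60.0 / tempo_mpm

# seconds per tāl cycle: mātrā duration times mātrās per cycle
sama_interval = matra_interval * matras_per_cycle
```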

load_tonic(*args, **kwargs)[source]

Load track absolute tonic

Parameters

tonic_path (str) – Local path where the tonic annotation is stored. If None, returns None.

Returns

float – Tonic annotation in Hz

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index that are missing locally

  • list - files which have an invalid checksum

class mirdata.datasets.saraga_hindustani.Track(track_id, data_home, dataset_name, index, metadata)[source]

Saraga Hindustani Track class

Parameters
  • track_id (str) – track id of the track

  • data_home (str) – Local path where the dataset is stored. default=None If None, looks for the data in the default directory, ~/mir_datasets

Variables
  • audio_path (str) – path to audio file

  • ctonic_path (str) – path to ctonic annotation file

  • pitch_path (str) – path to pitch annotation file

  • tempo_path (str) – path to tempo annotation file

  • sama_path (str) – path to sama annotation file

  • sections_path (str) – path to sections annotation file

  • phrases_path (str) – path to phrases annotation file

  • metadata_path (str) – path to metadata annotation file

Other Parameters
  • tonic (float) – tonic annotation

  • pitch (F0Data) – pitch annotation

  • tempo (dict) – tempo annotations

  • sama (BeatData) – Sama section annotations

  • sections (SectionData) – track section annotations

  • phrases (EventData) – phrase annotations

  • metadata (dict) – track metadata with the following fields

    • title (str): Title of the piece in the track

    • mbid (str): MusicBrainz ID of the track

    • album_artists (list, dict): list of dicts containing the album artists present in the track and their mbids

    • artists (list, dict): list of dicts with information about the featured artists in the track

    • raags (list, dict): list of dicts with information about the raags present in the track

    • forms (list, dict): list of dicts with information about the forms present in the track

    • release (list, dict): list of dicts with information about the release where the track is found

    • works (list, dict): list of dicts containing the works present in the piece and their mbids

    • taals (list, dict): list of dicts containing the taals present in the track and their uuids

    • layas (list, dict): list of dicts containing the layas present in the track and their uuids

property audio

The track’s audio

Returns

  • np.ndarray - audio signal

  • float - sample rate

to_jams()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format

mirdata.datasets.saraga_hindustani.load_audio(audio_path)[source]

Load a Saraga Hindustani audio file.

Parameters

audio_path (str) – path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

mirdata.datasets.saraga_hindustani.load_metadata(metadata_path)[source]

Load a Saraga Hindustani metadata file

Parameters

metadata_path (str) – path to metadata json file

Returns

dict

metadata with the following fields

  • title (str): Title of the piece in the track

  • mbid (str): MusicBrainz ID of the track

  • album_artists (list, dict): list of dicts containing the album artists present in the track and their mbids

  • artists (list, dict): list of dicts with information about the featured artists in the track

  • raags (list, dict): list of dicts with information about the raags present in the track

  • forms (list, dict): list of dicts with information about the forms present in the track

  • release (list, dict): list of dicts with information about the release where the track is found

  • works (list, dict): list of dicts containing the works present in the piece and their mbids

  • taals (list, dict): list of dicts containing the taals present in the track and their uuids

  • layas (list, dict): list of dicts containing the layas present in the track and their uuids

mirdata.datasets.saraga_hindustani.load_phrases(phrases_path)[source]

Load phrases

Parameters

phrases_path (str) – Local path where the phrase annotation is stored. If None, returns None.

Returns

EventData – phrases annotation for track

mirdata.datasets.saraga_hindustani.load_pitch(pitch_path)[source]

Load automatically extracted pitch or melody

Parameters

pitch_path (str) – Local path where the pitch annotation is stored. If None, returns None.

Returns

F0Data – pitch annotation

mirdata.datasets.saraga_hindustani.load_sama(sama_path)[source]

Load sama

Parameters

sama_path (str) – Local path where the sama annotation is stored. If None, returns None.

Returns

BeatData – sama annotations

mirdata.datasets.saraga_hindustani.load_sections(sections_path)[source]

Load track sections

Parameters

sections_path (str) – Local path where the section annotation is stored.

Returns

SectionData – section annotations for track

mirdata.datasets.saraga_hindustani.load_tempo(tempo_path)[source]

Load tempo from hindustani collection

Parameters

tempo_path (str) – Local path where the tempo annotation is stored.

Returns

dict – Dictionary of tempo information with the following keys:

  • tempo: median tempo for the section in mātrās per minute (MPM)

  • matra_interval: tempo expressed as the duration of the mātra (essentially dividing 60 by tempo, expressed in seconds)

  • sama_interval: median duration of one tāl cycle in the section

  • matras_per_cycle: indicator of the structure of the tāl, showing the number of mātrā in a cycle of the tāl of the recording

  • start_time: start time of the section

  • duration: duration of the section

mirdata.datasets.saraga_hindustani.load_tonic(tonic_path)[source]

Load track absolute tonic

Parameters

tonic_path (str) – Local path where the tonic annotation is stored. If None, returns None.

Returns

float – Tonic annotation in Hz

tinysol

TinySOL Dataset Loader.

class mirdata.datasets.tinysol.Dataset(data_home=None)[source]

The tinysol dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally print a message.

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

license()[source]

Print the license

load_audio(*args, **kwargs)[source]

Load a TinySOL audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index that are missing locally

  • list - files which have an invalid checksum

class mirdata.datasets.tinysol.Track(track_id, data_home, dataset_name, index, metadata)[source]

tinysol Track class

Parameters

track_id (str) – track id of the track

Variables
  • audio_path (str) – path of the audio file

  • dynamics (str) – dynamics abbreviation. Ex: pp, mf, ff, etc.

  • dynamics_id (int) – pp=0, p=1, mf=2, f=3, ff=4

  • family (str) – instrument family encoded by its English name

  • instance_id (int) – instance ID. Either equal to 0, 1, 2, or 3.

  • instrument_abbr (str) – instrument abbreviation

  • instrument_full (str) – instrument encoded by its English name

  • is_resampled (bool) – True if this sample was pitch-shifted from a neighbor; False if it was genuinely recorded.

  • pitch (str) – string containing English pitch class and octave number

  • pitch_id (int) – MIDI note index, where middle C (“C4”) corresponds to 60

  • string_id (NoneType) – string ID. By musical convention, the first string is the highest. On wind instruments, this is replaced by None.

  • technique_abbr (str) – playing technique abbreviation

  • technique_full (str) – playing technique encoded by its English name

  • track_id (str) – track id
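
The relationship between the pitch and pitch_id attributes above can be illustrated with a small conversion helper (pitch_to_midi is hypothetical and not part of mirdata; it simply applies the stated convention that "C4" maps to 60):

```python
def pitch_to_midi(pitch):
    """Convert a pitch string such as "C4" or "F#3" to a MIDI note index,
    with middle C ("C4") mapping to 60. Hypothetical helper for illustration."""
    names = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
    semitone = names[pitch[0]]
    rest = pitch[1:]
    if rest.startswith("#"):     # sharp raises by a semitone
        semitone += 1
        rest = rest[1:]
    elif rest.startswith("b"):   # flat lowers by a semitone
        semitone -= 1
        rest = rest[1:]
    octave = int(rest)
    # MIDI octave numbering: C-1 is note 0, so C4 lands on 12 * 5 = 60
    return 12 * (octave + 1) + semitone
```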

property audio

The track’s audio

Returns

  • np.ndarray - audio signal

  • float - sample rate

to_jams()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format

mirdata.datasets.tinysol.load_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]

Load a TinySOL audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

tonality_classicaldb

Tonality classicalDB Dataset Loader

class mirdata.datasets.tonality_classicaldb.Dataset(data_home=None)[source]

The tonality_classicaldb dataset

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally print a message.

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

license()[source]

Print the license

load_audio(*args, **kwargs)[source]

Load a Tonality classicalDB audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

load_hpcp(*args, **kwargs)[source]

Load Tonality classicalDB HPCP feature from a file

Parameters

fhandle (str or file-like) – File-like object or path to HPCP file

Returns

np.ndarray – loaded HPCP data

load_key(*args, **kwargs)[source]

Load Tonality classicalDB format key data from a file

Parameters

fhandle (str or file-like) – File-like object or path to key annotation file

Returns

str – musical key data

load_musicbrainz(*args, **kwargs)[source]

Load Tonality classicalDB musicbrainz metadata from a file

Parameters

fhandle (str or file-like) – File-like object or path to musicbrainz metadata file

Returns

dict – musicbrainz metadata

load_spectrum(*args, **kwargs)[source]

Load Tonality classicalDB spectrum data from a file

Parameters

fhandle (str or file-like) – File-like object or path to spectrum file

Returns

np.ndarray – spectrum data

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index that are missing locally

  • list - files which have an invalid checksum

class mirdata.datasets.tonality_classicaldb.Track(track_id, data_home, dataset_name, index, metadata)[source]

tonality_classicaldb track class

Parameters

track_id (str) – track id of the track

Variables
  • audio_path (str) – track audio path

  • key_path (str) – key annotation path

  • title (str) – title of the track

  • track_id (str) – track id

Other Parameters
  • key (str) – key annotation

  • spectrum (np.array) – computed audio spectrum

  • hpcp (np.array) – computed hpcp

  • musicbrainz_metadata (dict) – MusicBrainz metadata

property audio

The track’s audio

Returns

  • np.ndarray - audio signal

  • float - sample rate

to_jams()[source]

Get the track’s data in jams format

Returns

jams.JAMS – the track’s data in jams format

mirdata.datasets.tonality_classicaldb.load_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]

Load a Tonality classicalDB audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

  • np.ndarray - the mono audio signal

  • float - The sample rate of the audio file

mirdata.datasets.tonality_classicaldb.load_hpcp(fhandle: TextIO) → numpy.ndarray[source]

Load Tonality classicalDB HPCP feature from a file

Parameters

fhandle (str or file-like) – File-like object or path to HPCP file

Returns

np.ndarray – loaded HPCP data

mirdata.datasets.tonality_classicaldb.load_key(fhandle: TextIO) → str[source]

Load Tonality classicalDB format key data from a file

Parameters

fhandle (str or file-like) – File-like object or path to key annotation file

Returns

str – musical key data

mirdata.datasets.tonality_classicaldb.load_musicbrainz(fhandle: TextIO) → Dict[Any, Any][source]

Load Tonality classicalDB musicbrainz metadata from a file

Parameters

fhandle (str or file-like) – File-like object or path to musicbrainz metadata file

Returns

dict – musicbrainz metadata

mirdata.datasets.tonality_classicaldb.load_spectrum(fhandle: TextIO) → numpy.ndarray[source]

Load Tonality classicalDB spectrum data from a file

Parameters

fhandle (str or file-like) – File-like object or path to spectrum file

Returns

np.ndarray – spectrum data

Core

Core mirdata classes

class mirdata.core.Dataset(data_home=None, name=None, track_class=None, bibtex=None, remotes=None, download_info=None, license_info=None, custom_index_path=None)[source]

mirdata Dataset class

Variables
  • data_home (str) – path where mirdata will look for the dataset

  • name (str) – the identifier of the dataset

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • readme (str) – information about the dataset

  • track (function) – a function mapping a track_id to a mirdata.core.Track

__init__(data_home=None, name=None, track_class=None, bibtex=None, remotes=None, download_info=None, license_info=None, custom_index_path=None)[source]

Dataset init method

Parameters
  • data_home (str or None) – path where mirdata will look for the dataset

  • name (str or None) – the identifier of the dataset

  • track_class (mirdata.core.Track or None) – a Track class

  • bibtex (str or None) – dataset citation/s in bibtex format

  • remotes (dict or None) – data to be downloaded

  • download_info (str or None) – download instructions or caveats

  • license_info (str or None) – license of the dataset

  • custom_index_path (str or None) – overwrites the default index path for remote indexes

choice_track()[source]

Choose a random track

Returns

Track – a Track object instantiated by a random track_id

cite()[source]

Print the reference

property default_path

Get the default path for the dataset

Returns

str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally print a message.

Parameters
  • partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises
  • ValueError – if invalid keys are passed to partial_download

  • IOError – if a downloaded file’s checksum is different from expected

license()[source]

Print the license

load_tracks()[source]

Load all tracks in the dataset

Returns

dict – {track_id: track data}

Raises

NotImplementedError – If the dataset does not support Tracks

track_ids[source]

Return track ids

Returns

list – A list of track ids

validate(verbose=True)[source]

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

  • list - files in the index that are missing locally

  • list - files which have an invalid checksum
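
A validation pass of this shape can be sketched as follows, assuming the index maps relative file paths to expected md5 checksums (check_index is a hypothetical helper for illustration, not mirdata's internals):

```python
import hashlib
import os
import tempfile

def check_index(index, data_home):
    """Return (missing, invalid): files absent on disk, and files whose
    md5 checksum does not match the index. Sketch, not mirdata code."""
    missing, invalid = [], []
    for rel_path, checksum in index.items():
        path = os.path.join(data_home, rel_path)
        if not os.path.exists(path):
            missing.append(rel_path)
        else:
            with open(path, "rb") as fhandle:
                if hashlib.md5(fhandle.read()).hexdigest() != checksum:
                    invalid.append(rel_path)
    return missing, invalid

# tiny demonstration against a throwaway directory
data_home = tempfile.mkdtemp()
with open(os.path.join(data_home, "a.wav"), "wb") as fhandle:
    fhandle.write(b"audio bytes")
index = {
    "a.wav": hashlib.md5(b"audio bytes").hexdigest(),  # present and valid
    "b.wav": "00000000000000000000000000000000",       # not on disk
}
missing, invalid = check_index(index, data_home)
```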

class mirdata.core.MultiTrack(track_id, data_home, dataset_name, index, metadata=None)[source]

MultiTrack class.

A multitrack class is a collection of track objects and their associated audio that can be mixed together. A multitrack is itself a Track, and can have its own associated audio (such as a mastered mix), its own metadata, and its own annotations.

get_mix()[source]

Create a linear mixture given a subset of tracks.

Parameters

track_keys (list) – list of track keys to mix together

Returns

np.ndarray – mixture audio with shape (n_samples, n_channels)

get_random_target(n_tracks=None, min_weight=0.3, max_weight=1.0)[source]

Get a random target by combining a random selection of tracks with random weights

Parameters
  • n_tracks (int or None) – number of tracks to randomly mix. If None, uses all tracks

  • min_weight (float) – minimum possible weight when mixing

  • max_weight (float) – maximum possible weight when mixing

Returns

  • np.ndarray - mixture audio with shape (n_samples, n_channels)

  • list - list of keys of included tracks

  • list - list of weights used to mix tracks

get_target(track_keys, weights=None, average=True, enforce_length=True)[source]

Get target which is a linear mixture of tracks

Parameters
  • track_keys (list) – list of track keys to mix together

  • weights (list or None) – list of positive scalars to be used in the average

  • average (bool) – if True, computes a weighted average of the tracks if False, computes a weighted sum of the tracks

  • enforce_length (bool) – If True, raises ValueError if the tracks are not the same length. If False, pads audio with zeros to match the length of the longest track

Returns

np.ndarray – target audio with shape (n_channels, n_samples)

Raises

ValueError – if the sample rates of the tracks are not equal, or if enforce_length=True and the track lengths are not equal
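
The weighted linear mixture described above can be sketched in a few lines of NumPy (linear_mixture is an illustration under the stated assumptions of equal-length mono signals, not mirdata's implementation):

```python
import numpy as np

def linear_mixture(signals, weights=None, average=True):
    """Mix equal-length mono signals. With average=True, returns the
    weighted average; otherwise the weighted sum. Sketch for illustration."""
    stacked = np.stack(signals)              # shape (n_tracks, n_samples)
    if weights is None:
        weights = np.ones(len(signals))
    weights = np.asarray(weights, dtype=float)
    weighted = stacked * weights[:, np.newaxis]
    if average:
        return weighted.sum(axis=0) / weights.sum()
    return weighted.sum(axis=0)

mix = linear_mixture(
    [np.array([1.0, 2.0]), np.array([3.0, 4.0])], weights=[1.0, 3.0]
)
```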

class mirdata.core.Track(track_id, data_home, dataset_name, index, metadata=None)[source]

Track base class

See the docs for each dataset loader’s Track class for details

__init__(track_id, data_home, dataset_name, index, metadata=None)[source]

Track init method. Sets boilerplate attributes, including:

  • track_id

  • _dataset_name

  • _data_home

  • _track_paths

  • _track_metadata

Parameters
  • track_id (str) – track id

  • data_home (str) – path where mirdata will look for the dataset

  • dataset_name (str) – the identifier of the dataset

  • index (dict) – the dataset’s file index

  • metadata (dict or None) – a dictionary of metadata or None

class mirdata.core.cached_property(func)[source]

Cached property decorator

A property that is only computed once per instance and then replaces itself with an ordinary attribute. Deleting the attribute resets the property. Source: https://github.com/bottlepy/bottle/commit/fa7733e075da0d790d809aa3d2f53071897e6f76
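
The pattern can be sketched as follows (a minimal illustration of the same idea; mirdata's version follows the bottle implementation linked above). Because the descriptor defines only __get__, the value stored in the instance __dict__ shadows it on later accesses:

```python
class cached_property:
    """Non-data descriptor: the wrapped method runs once per instance and
    its result replaces the property as a plain attribute, so `del obj.attr`
    resets the cache. Sketch for illustration."""

    def __init__(self, func):
        self.func = func
        self.__doc__ = getattr(func, "__doc__", None)

    def __get__(self, obj, cls):
        if obj is None:
            return self
        value = obj.__dict__[self.func.__name__] = self.func(obj)
        return value


calls = []

class Demo:
    @cached_property
    def answer(self):
        calls.append(1)  # record each actual computation
        return 42

d = Demo()
first, second = d.answer, d.answer  # second access hits the cached attribute
```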

mirdata.core.copy_docs(original)[source]

Decorator function to copy docs from one function to another

mirdata.core.docstring_inherit(parent)[source]

Decorator function to inherit docstrings from the parent class.

Adds documented Attributes from the parent to the child docs.

mirdata.core.none_path_join(partial_path_list)[source]

Join a list of partial paths. If any part of the path is None, returns None.

Parameters

partial_path_list (list) – List of partial paths

Returns

str or None – joined path string or None
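
A sketch of the described behavior (illustration only; the real function lives in mirdata.core):

```python
import os

def none_path_join(partial_path_list):
    """Join path components, short-circuiting to None if any is missing.
    Sketch of the documented behavior, not mirdata's source."""
    if any(part is None for part in partial_path_list):
        return None
    return os.path.join(*partial_path_list)

joined = none_path_join(["data", "audio", "track.wav"])
nothing = none_path_join(["data", None, "track.wav"])
```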

Annotations

mirdata annotation data types

class mirdata.annotations.Annotation[source]

Annotation base class

class mirdata.annotations.BeatData(times, positions=None)[source]

BeatData class

Variables
  • times (np.ndarray) – array of time stamps (as floats) in seconds with positive, strictly increasing values

  • positions (np.ndarray or None) – array of beat positions (as ints) e.g. 1, 2, 3, 4
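
The constraint on times can be checked in a few lines of NumPy (valid_beat_times is a hypothetical helper; treating a time stamp of exactly 0 as allowed is an assumption, not something the docs specify):

```python
import numpy as np

def valid_beat_times(times):
    """Check that times are non-negative and strictly increasing,
    per the BeatData contract above. Sketch for illustration."""
    times = np.asarray(times, dtype=float)
    if times.size == 0:
        return True
    return bool(np.all(times >= 0) and np.all(np.diff(times) > 0))
```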

class mirdata.annotations.ChordData(intervals, labels, confidence=None)[source]

ChordData class

Variables
  • intervals (np.ndarray or None) – (n x 2) array of intervals (as floats) in seconds in the form [start_time, end_time] with positive time stamps and end_time >= start_time.

  • labels (list) – list of chord labels (as strings)

  • confidence (np.ndarray or None) – array of confidence values between 0 and 1

class mirdata.annotations.EventData(intervals, events)[source]

EventData class

Variables
  • intervals (np.ndarray) – (n x 2) array of intervals (as floats) in seconds in the form [start_time, end_time] with positive time stamps and end_time >= start_time.

  • events (list) – list of event labels (as strings)

class mirdata.annotations.F0Data(times, frequencies, confidence=None)[source]

F0Data class

Variables
  • times (np.ndarray) – array of time stamps (as floats) in seconds with positive, strictly increasing values

  • frequencies (np.ndarray) – array of frequency values (as floats) in Hz

  • confidence (np.ndarray or None) – array of confidence values between 0 and 1

class mirdata.annotations.KeyData(intervals, keys)[source]

KeyData class

Variables
  • intervals (np.ndarray) – (n x 2) array of intervals (as floats) in seconds in the form [start_time, end_time] with positive time stamps and end_time >= start_time.

  • keys (list) – list of key labels (as strings)

class mirdata.annotations.LyricData(intervals, lyrics, pronunciations=None)[source]

LyricData class

Variables
  • intervals (np.ndarray) – (n x 2) array of intervals (as floats) in seconds in the form [start_time, end_time] with positive time stamps and end_time >= start_time.

  • lyrics (list) – list of lyrics (as strings)

  • pronunciations (list or None) – list of pronunciations (as strings)

class mirdata.annotations.MultiF0Data(times, frequency_list, confidence_list=None)[source]

MultiF0Data class

Variables
  • times (np.ndarray) – array of time stamps (as floats) in seconds with positive, strictly increasing values

  • frequency_list (list) – list of lists of frequency values (as floats) in Hz

  • confidence_list (list or None) – list of lists of confidence values between 0 and 1

class mirdata.annotations.NoteData(intervals, notes, confidence=None)[source]

NoteData class

Variables
  • intervals (np.ndarray) – (n x 2) array of intervals (as floats) in seconds in the form [start_time, end_time] with positive time stamps and end_time >= start_time.

  • notes (np.ndarray) – array of notes (as floats) in Hz

  • confidence (np.ndarray or None) – array of confidence values between 0 and 1

class mirdata.annotations.SectionData(intervals, labels=None)[source]

SectionData class

Variables
  • intervals (np.ndarray) – (n x 2) array of intervals (as floats) in seconds in the form [start_time, end_time]; times should be positive and intervals should have non-negative duration

  • labels (list or None) – list of labels (as strings)

class mirdata.annotations.TempoData(intervals, value, confidence=None)[source]

TempoData class

Variables
  • intervals (np.ndarray) – (n x 2) array of intervals (as floats) in seconds in the form [start_time, end_time] with positive time stamps and end_time >= start_time.

  • value (list) – list of tempo values (as floats)

  • confidence (np.ndarray or None) – array of confidence values between 0 and 1

mirdata.annotations.validate_array_like(array_like, expected_type, expected_dtype, none_allowed=False)[source]

Validate that array-like object is well formed

If array_like is None, validation passes automatically.

Parameters
  • array_like (array-like) – object to validate

  • expected_type (type) – expected type, either list or np.ndarray

  • expected_dtype (type) – expected dtype

  • none_allowed (bool) – if True, allows array to be None

Raises
  • TypeError – if type/dtype does not match expected_type/expected_dtype

  • ValueError – if the array is not well formed

mirdata.annotations.validate_confidence(confidence)[source]

Validate if confidence is well-formed.

If confidence is None, validation passes automatically

Parameters

confidence (np.ndarray) – an array of confidence values

Raises

ValueError – if confidence values are not between 0 and 1

mirdata.annotations.validate_intervals(intervals)[source]

Validate if intervals are well-formed.

If intervals is None, validation passes automatically

Parameters

intervals (np.ndarray) – (n x 2) array

Raises
  • ValueError – if intervals have an invalid shape, have negative values, or if end times are smaller than start times.
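
The documented checks can be sketched with numpy (an illustrative reimplementation; mirdata’s actual code may differ in details):

```python
import numpy as np

def validate_intervals(intervals):
    # sketch of the documented checks: shape, sign, and interval ordering
    if intervals is None:
        return
    intervals = np.asarray(intervals)
    if intervals.ndim != 2 or intervals.shape[1] != 2:
        raise ValueError("intervals should be an (n x 2) array")
    if (intervals < 0).any():
        raise ValueError("intervals should not have negative values")
    if (intervals[:, 1] < intervals[:, 0]).any():
        raise ValueError("end times should not be smaller than start times")
```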

mirdata.annotations.validate_lengths_equal(array_list)[source]

Validate that arrays in list are equal in length

Some arrays may be None, and the validation for these is skipped.

Parameters

array_list (list) – list of array-like objects

Raises

ValueError – if arrays are not equal in length

mirdata.annotations.validate_times(times)[source]

Validate if times are well-formed.

If times is None, validation passes automatically

Parameters

times (np.ndarray) – an array of time stamps

Raises

ValueError – if times have negative values or are non-increasing
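
A sketch of what this check amounts to (not mirdata’s exact code):

```python
import numpy as np

def validate_times(times):
    # sketch of the documented checks: non-negative and increasing time stamps
    if times is None:
        return
    times = np.asarray(times)
    if (times < 0).any():
        raise ValueError("times should not have negative values")
    if (np.diff(times) < 0).any():
        raise ValueError("times should be increasing")
```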

Advanced

mirdata.validate

Utility functions for mirdata

mirdata.validate.log_message(message, verbose=True)[source]

Helper function to log message

Parameters
  • message (str) – message to log

  • verbose (bool) – if false, the message is not logged

mirdata.validate.md5(file_path)[source]

Get md5 hash of a file.

Parameters

file_path (str) – File path

Returns

str – md5 hash of data in file_path
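
A sketch of how such a checksum can be computed with hashlib (close to, but not necessarily identical to, mirdata’s implementation):

```python
import hashlib

def md5(file_path):
    # hash in chunks so large files are not read into memory at once
    hash_md5 = hashlib.md5()
    with open(file_path, "rb") as fhandle:
        for chunk in iter(lambda: fhandle.read(4096), b""):
            hash_md5.update(chunk)
    return hash_md5.hexdigest()
```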

mirdata.validate.validate(local_path, checksum)[source]

Validate that a file exists and has the correct checksum

Parameters
  • local_path (str) – file path

  • checksum (str) – md5 checksum

Returns

  • bool - True if file exists

  • bool - True if checksum matches

mirdata.validate.validate_files(file_dict, data_home, verbose)[source]

Validate files

Parameters
  • file_dict (dict) – dictionary of file information

  • data_home (str) – path where the data lives

  • verbose (bool) – if True, show progress

Returns

  • dict - missing files

  • dict - files with invalid checksums

mirdata.validate.validate_index(dataset_index, data_home, verbose=True)[source]

Validate files in a dataset’s index

Parameters
  • dataset_index (list) – dataset indices

  • data_home (str) – Local home path where the dataset is stored

  • verbose (bool) – if true, prints validation status while running

Returns

  • dict - file paths that are in the index but missing locally

  • dict - file paths with differing checksums

mirdata.validate.validate_metadata(file_dict, data_home, verbose)[source]

Validate metadata files

Parameters
  • file_dict (dict) – dictionary of file information

  • data_home (str) – path where the data lives

  • verbose (bool) – if True, show progress

Returns

  • dict - missing files

  • dict - files with invalid checksums

mirdata.validate.validator(dataset_index, data_home, verbose=True)[source]

Checks the existence and validity of files stored locally with respect to the paths and file checksums stored in the reference index. Logs invalid checksums and missing files.

Parameters
  • dataset_index (list) – dataset indices

  • data_home (str) – Local home path where the dataset is stored

  • verbose (bool) – if True (default), prints missing and invalid files to stdout. Otherwise, this function is equivalent to validate_index.

Returns

  • missing_files (list) - file paths that are in the dataset index but missing locally

  • invalid_checksums (list) - file paths that exist in the dataset index but have a different checksum compared to the reference checksum

mirdata.download_utils

Utilities for downloading from the web.

class mirdata.download_utils.DownloadProgressBar(*_, **__)[source]

Wrap tqdm to show download progress

class mirdata.download_utils.RemoteFileMetadata(filename, url, checksum, destination_dir=None, unpack_directories=None)[source]

The metadata for a remote file

Variables
  • filename (str) – the remote file’s basename

  • url (str) – the remote file’s url

  • checksum (str) – the remote file’s md5 checksum

  • destination_dir (str or None) – the relative path for where to save the file

  • unpack_directories (list or None) – list of relative directories. For each directory, the contents will be moved to destination_dir (or data_home if not provided)

mirdata.download_utils.download_from_remote(remote, save_dir, force_overwrite)[source]

Download a remote dataset file: fetch the file at the remote’s url, save it into save_dir using the remote’s filename, and verify its integrity based on the MD5 checksum of the downloaded file.

Adapted from scikit-learn’s sklearn.datasets.base._fetch_remote.

Parameters
  • remote (RemoteFileMetadata) – Named tuple containing remote dataset meta information: url, filename and checksum

  • save_dir (str) – Directory to save the file to. Usually data_home

  • force_overwrite (bool) – If True, overwrite existing file with the downloaded file. If False, does not overwrite, but checks that checksum is consistent.

Returns

str – Full path of the created file.

mirdata.download_utils.download_tar_file(tar_remote, save_dir, force_overwrite, cleanup)[source]

Download and untar a tar file.

Parameters
  • tar_remote (RemoteFileMetadata) – Object containing download information

  • save_dir (str) – Path to save downloaded file

  • force_overwrite (bool) – If True, overwrites existing files

  • cleanup (bool) – If True, remove tarfile after untarring

mirdata.download_utils.download_zip_file(zip_remote, save_dir, force_overwrite, cleanup)[source]

Download and unzip a zip file.

Parameters
  • zip_remote (RemoteFileMetadata) – Object containing download information

  • save_dir (str) – Path to save downloaded file

  • force_overwrite (bool) – If True, overwrites existing files

  • cleanup (bool) – If True, remove zipfile after unzipping

mirdata.download_utils.downloader(save_dir, remotes=None, partial_download=None, info_message=None, force_overwrite=False, cleanup=False)[source]

Download data to save_dir and optionally log a message.

Parameters
  • save_dir (str) – The directory to download the data

  • remotes (dict or None) – A dictionary of RemoteFileMetadata tuples of data in zip format. If None, there is no data to download

  • partial_download (list or None) – A list of keys to partially download the remote objects of the download dict. If None, all data is downloaded

  • info_message (str or None) – A string of info to log when this function is called. If None, no string is logged.

  • force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.

  • cleanup (bool) – Whether to delete the zip/tar file after extracting.

mirdata.download_utils.extractall_unicode(zfile, out_dir)[source]

Extract all files inside a zip archive to an output directory.

Unlike zipfile’s default extraction, it checks for correct file name encoding.

Parameters
  • zfile (obj) – Zip file object created with zipfile.ZipFile

  • out_dir (str) – Output folder

mirdata.download_utils.move_directory_contents(source_dir, target_dir)[source]

Move the contents of source_dir into target_dir, and delete source_dir

Parameters
  • source_dir (str) – path to source directory

  • target_dir (str) – path to target directory

mirdata.download_utils.untar(tar_path, cleanup)[source]

Untar a tar file inside its current directory.

Parameters
  • tar_path (str) – Path to tar file

  • cleanup (bool) – If True, remove tarfile after untarring

mirdata.download_utils.unzip(zip_path, cleanup)[source]

Unzip a zip file inside its current directory.

Parameters
  • zip_path (str) – Path to zip file

  • cleanup (bool) – If True, remove zipfile after unzipping
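
The documented behavior maps closely onto the standard library’s zipfile module. An illustrative sketch (mirdata’s actual implementation additionally handles file name encoding via extractall_unicode):

```python
import os
import zipfile

def unzip(zip_path, cleanup):
    # extract into the archive's own directory; optionally delete the archive
    with zipfile.ZipFile(zip_path, "r") as zfile:
        zfile.extractall(os.path.dirname(zip_path) or ".")
    if cleanup:
        os.remove(zip_path)
```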

mirdata.jams_utils

Utilities for converting mirdata Annotation classes to jams format.

mirdata.jams_utils.beats_to_jams(beat_data, description=None)[source]

Convert beat annotations into jams format.

Parameters
  • beat_data (annotations.BeatData) – beat data object

  • description (str) – annotation description

Returns

jams.Annotation – jams annotation object.

mirdata.jams_utils.chords_to_jams(chord_data, description=None)[source]

Convert chord annotations into jams format.

Parameters
  • chord_data (annotations.ChordData) – chord data object

  • description (str) – annotation description

Returns

jams.Annotation – jams annotation object.

mirdata.jams_utils.events_to_jams(event_data, description=None)[source]

Convert events annotations into jams format.

Parameters
  • event_data (annotations.EventData) – event data object

  • description (str) – annotation description

Returns

jams.Annotation – jams annotation object.

mirdata.jams_utils.f0s_to_jams(f0_data, description=None)[source]

Convert f0 annotations into jams format.

Parameters
  • f0_data (annotations.F0Data) – f0 annotation object

  • description (str) – annotation description

Returns

jams.Annotation – jams annotation object.

mirdata.jams_utils.jams_converter(audio_path=None, spectrogram_path=None, beat_data=None, chord_data=None, note_data=None, f0_data=None, section_data=None, multi_section_data=None, tempo_data=None, event_data=None, key_data=None, lyrics_data=None, tags_gtzan_data=None, tags_open_data=None, metadata=None)[source]

Convert annotations from a track to JAMS format.

Parameters
  • audio_path (str or None) – A path to the corresponding audio file, or None. If provided, the audio file will be read to compute the duration. If None, ‘duration’ must be a field in the metadata dictionary, or the resulting jam object will not validate.

  • spectrogram_path (str or None) – A path to the corresponding spectrogram file, or None.

  • beat_data (list or None) – A list of tuples of (annotations.BeatData, str), where str describes the annotation (e.g. ‘beats_1’).

  • chord_data (list or None) – A list of tuples of (annotations.ChordData, str), where str describes the annotation.

  • note_data (list or None) – A list of tuples of (annotations.NoteData, str), where str describes the annotation.

  • f0_data (list or None) – A list of tuples of (annotations.F0Data, str), where str describes the annotation.

  • section_data (list or None) – A list of tuples of (annotations.SectionData, str), where str describes the annotation.

  • multi_section_data (list or None) – A list of tuples. Tuples in multi_section_data should contain another list of tuples, indicating annotations in the different levels, e.g. ([(segments0, level0), (segments1, level1)], annotator), and a str indicating the annotator

  • tempo_data (list or None) – A list of tuples of (float, str), where float gives the tempo in bpm and str describes the annotation.

  • event_data (list or None) – A list of tuples of (annotations.EventData, str), where str describes the annotation.

  • key_data (list or None) – A list of tuples of (annotations.KeyData, str), where str describes the annotation.

  • lyrics_data (list or None) – A list of tuples of (annotations.LyricData, str), where str describes the annotation.

  • tags_gtzan_data (list or None) – A list of tuples of (str, str), where the first str is the tag and the second is a descriptor of the annotation.

  • tags_open_data (list or None) – A list of tuples of (str, str), where the first str is the tag and the second is a descriptor of the annotation.

  • metadata (dict or None) – A dictionary containing the track metadata.

Returns

jams.JAMS – A JAMS object containing the annotations.

mirdata.jams_utils.keys_to_jams(key_data, description)[source]

Convert key annotations into jams format.

Parameters
  • key_data (annotations.KeyData) – key data object

  • description (str) – annotation description

Returns

jams.Annotation – jams annotation object.

mirdata.jams_utils.lyrics_to_jams(lyric_data, description=None)[source]

Convert lyric annotations into jams format.

Parameters
  • lyric_data (annotations.LyricData) – lyric annotation object

  • description (str) – annotation description

Returns

jams.Annotation – jams annotation object.

mirdata.jams_utils.multi_sections_to_jams(multisection_data, description)[source]

Convert multi-section annotations into jams format.

Parameters
  • multisection_data (list) – list of tuples of the form [(SectionData, int)]

  • description (str) – annotation description

Returns

jams.Annotation – jams annotation object.

mirdata.jams_utils.notes_to_jams(note_data, description)[source]

Convert note annotations into jams format.

Parameters
  • note_data (annotations.NoteData) – note data object

  • description (str) – annotation description

Returns

jams.Annotation – jams annotation object.

mirdata.jams_utils.sections_to_jams(section_data, description=None)[source]

Convert section annotations into jams format.

Parameters
  • section_data (annotations.SectionData) – section data object

  • description (str) – annotation description

Returns

jams.Annotation – jams annotation object.

mirdata.jams_utils.tag_to_jams(tag_data, namespace='tag_open', description=None)[source]

Convert tag annotations into jams format.

Parameters
  • tag_data – tag annotation data

  • namespace (str) – the jams-compatible tag namespace

  • description (str) – annotation description

Returns

jams.Annotation – jams annotation object.

mirdata.jams_utils.tempos_to_jams(tempo_data, description=None)[source]

Convert tempo annotations into jams format.

Parameters
  • tempo_data (annotations.TempoData) – tempo data object

  • description (str) – annotation description

Returns

jams.Annotation – jams annotation object.

Contributing

We encourage contributions to mirdata, especially new dataset loaders. To contribute a new loader, follow the steps indicated below and create a Pull Request (PR) to the GitHub repository. For any questions or comments about your contribution, you can open an issue or start a discussion in the repository.

Installing mirdata for development purposes

To install mirdata for development purposes:

  • First run:

git clone https://github.com/mir-dataset-loaders/mirdata.git
  • Then, from inside the cloned repository, install the dependencies needed for running tests and building the documentation:

pip install .
pip install .[tests]
pip install .[docs]
pip install .[dali]

We recommend installing pyenv to manage your Python versions and to install all mirdata requirements. You will want to install the latest versions of Python 3.6 and 3.7. Once pyenv and the Python versions are configured, install pytest. Make sure all pytest plugins are installed so your code is tested automatically. Finally, run:

pytest tests/ --local

All tests should pass!

Writing a new dataset loader

The steps to add a new dataset loader to mirdata are:

  1. Create an index

  2. Create a module

  3. Add tests

  4. Submit your loader

Before starting, if your dataset is not fully downloadable you should:

  1. Contact the mirdata team by opening an issue or PR so we can discuss how to proceed with the closed dataset.

  2. Show that the version used to create the checksum is the “canonical” one, either by getting the version from the dataset creator, or by verifying equivalence with several other copies of the dataset.

To reduce friction, we will make commits on top of contributors’ PRs by default unless the please-do-not-edit flag is used.

1. Create an index

mirdata’s structure relies on indexes. Indexes are dictionaries containing information about the structure of the dataset, which is necessary for mirdata’s loading and validating functionality. In particular, indexes contain information about the files included in the dataset, their location, and their checksums. The necessary steps are:

  1. To create an index, first create a script in scripts/, e.g. make_dataset_index.py, which generates an index file.

  2. Then run the script on the canonical versions of the dataset and save the index in mirdata/datasets/indexes/ as dataset_index.json.

An example index to use as a guideline is included in the contributing documentation.

More examples of scripts used to create dataset indexes can be found in the scripts folder.

tracks

Most MIR datasets are organized as a collection of tracks and annotations. In this case, the index should make use of the tracks top-level key. A dictionary should be stored under the tracks top-level key, where the keys are the unique track ids of the dataset. The values are a dictionary of files associated with a track id, along with their checksums. These files can be, for instance, audio files or annotations related to the track id. File paths are relative to the top-level directory of the dataset.
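
A minimal sketch of such a tracks index (the track id, file names, and checksums here are made up for illustration):

```python
import json

index = {
    "version": "1.0",
    "tracks": {
        "track1": {
            # each entry is [relative file path, md5 checksum]
            "audio": ["audio/track1.wav", "912ec803b2ce49e4a541068d495ab570"],
            "annotation": ["annotations/track1.beats.txt", "2205e48de5f93c784733ffcca841d2b5"],
        },
    },
}
print(json.dumps(index, indent=4))
```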

multitracks
records
2. Create a module

Once the index is created you can create the loader. For that, we suggest you start from the example module in the documentation and adjust it for your dataset. To quickstart a new module:

  1. Copy the example module and save it to mirdata/datasets/<your_dataset_name>.py

  2. Find & Replace Example with <your_dataset_name>.

  3. Remove any lines beginning with # – which are there as guidelines.

You may find existing loaders useful as references; for many more examples, see the datasets folder.

3. Add tests

To finish your contribution, include tests that check the integrity of your loader. For this, follow these steps:

  1. Make a toy version of the dataset in the tests folder tests/resources/mir_datasets/my_dataset/, so you can test against little data. For example:

    • Include all audio and annotation files for one track of the dataset

    • For each audio/annotation file, reduce the audio length to 1-2 seconds and remove all but a few of the annotations.

    • If the dataset has a metadata file, reduce the length to a few lines.

  2. Test all of the dataset specific code, e.g. the public attributes of the Track class, the load functions and any other custom functions you wrote. See the tests folder for reference. If your loader has a custom download function, add tests similar to this loader.

  3. Locally run pytest -s tests/test_full_dataset.py --local --dataset my_dataset before submitting your loader to make sure everything is working.

Note

We have written automated tests for every loader’s cite, download, validate, load, and track_ids functions, as well as some basic edge cases of the Track class, so you don’t need to write tests for these!

Running your tests locally

Before creating a PR, you should run all the tests locally like this:

pytest tests/ --local

The --local flag skips tests that are built to run only on the remote testing environment.

To run one specific test file:

pytest tests/test_ikala.py

Finally, there is one local test you should run, which we can’t easily run in our testing environment.

pytest -s tests/test_full_dataset.py --local --dataset dataset

Where dataset is the name of the module of the dataset you added. The -s flag tells pytest not to suppress print statements, which is useful here for seeing the download progress bar when testing the download function.

This tests that your dataset downloads, validates, and loads properly for every track. This test takes a long time for some datasets, but it’s important to ensure the integrity of the library.

We’ve added one extra convenience flag for this test, for getting the tests running when the download is very slow:

pytest -s tests/test_full_dataset.py --local --dataset my_dataset --skip-download

which will skip the downloading step. Note that this is just for convenience during debugging - the tests should eventually all pass without this flag.

Working with big datasets

In the development of large datasets, it is advisable to create an index as small as possible, to streamline the implementation of the dataset loader and to keep the tests fast.

Working with remote indexes

For the end-user there is no difference between the remote and local indexes. However, indexes can get large when there are a lot of tracks in the dataset. In these cases, storing and accessing an index remotely can be convenient. Large indexes can be added to REMOTES, and will be downloaded with the rest of the dataset. For example:

"index": download_utils.RemoteFileMetadata(
    filename="acousticbrainz_genre_index.json.zip",
    url="https://zenodo.org/record/4298580/files/acousticbrainz_genre_index.json.zip?download=1",
    checksum="810f1c003f53cbe58002ba96e6d4d138",
)

Unlike local indexes, the remote indexes will live in the data_home directory. When creating the Dataset object, specify the custom_index_path to where the index will be downloaded (as a relative path to data_home).

Reducing the testing space usage

We are trying to keep the test resources folder size as small as possible, because it can get really heavy as new loaders are added. We kindly ask the contributors to reduce the size of the testing data if possible (e.g. trimming the audio tracks, keeping just two rows for csv files).

4. Submit your loader

Before you submit your loader make sure to:

  1. Add your module to docs/source/mirdata.rst in alphabetical order

  2. Add your module to docs/source/table.rst in alphabetical order as follows:

* - Dataset
  - Downloadable?
  - Annotation Types
  - Tracks
  - License

An example of this for the Beatport EDM key dataset:

* - Beatport EDM key
  - - audio: ✅
    - annotations: ✅
  - - global :ref:`key`
  - 1486
  - .. image:: https://licensebuttons.net/l/by-sa/3.0/88x31.png
       :target: https://creativecommons.org/licenses/by-sa/4.0

(you can check that this was done correctly by clicking on the readthedocs check when you open a PR). You can find license badge images and links here.

Pull Request template

When starting your PR, please use the new_loader.md template; it will simplify the reviewing process and also help you make a complete PR. You can do that by adding &template=new_loader.md at the end of the url when you are creating the PR:

...mir-dataset-loaders/mirdata/compare?expand=1 will become ...mir-dataset-loaders/mirdata/compare?expand=1&template=new_loader.md.

Docs

Staged docs for every new PR are built, and you can look at them by clicking on the “readthedocs” test in a PR. To quickly troubleshoot any issues, you can build the docs locally by navigating to the docs folder and running make html (note: you must have sphinx installed). Then open the generated _build/source/index.html file in your web browser to view.

Troubleshooting

If github shows a red X next to your latest commit, it means one of our checks is not passing. This could mean:

  1. running black has failed – this means that your code is not formatted according to black’s code-style. To fix this, simply run the following from inside the top level folder of the repository:

black --target-version py38 mirdata/ tests/
  2. the test coverage is too low – this means that there are too many new lines of code introduced that are not tested.

  3. the docs build has failed – this means that one of the changes you made to the documentation has caused the build to fail. Check the formatting in your changes and make sure they are consistent.

  4. the tests have failed – this means at least one of the tests is failing. Run the tests locally to make sure they are passing. If they are passing locally but failing in the check, open an issue and we can help debug.

Documentation

This documentation is in rst format. It is built using Sphinx and hosted on readthedocs. The API documentation is built using autodoc, which autogenerates documentation from the code’s docstrings. We use the napoleon plugin for building docs in Google docstring style. See the next section for docstring conventions.

mirdata uses Google’s Docstring formatting style. Here are some common examples.

Note

The small formatting details in these examples are important. Differences in new lines, indentation, and spacing make a difference in how the documentation is rendered. For example writing Returns: will render correctly, but Returns or Returns : will not.

Functions:

def add_to_list(list_of_numbers, scalar):
    """Add a scalar to every element of a list.
    You can write a continuation of the function description here on the next line.

    You can optionally write more about the function here. If you want to add an example
    of how this function can be used, you can do it like below.

    Example:
        .. code-block:: python

        foo = add_to_list([1, 2, 3], 2)

    Args:
        list_of_numbers (list): A short description that fits on one line.
        scalar (float):
            Description of the second parameter. If there is a lot to say you can
            overflow to a second line.

    Returns:
        list: Description of the return. The type here is not in parentheses

    """
    return [x + scalar for x in list_of_numbers]

Functions with more than one return value:

def multiple_returns():
    """This function has no arguments, but more than one return value. Autodoc with napoleon doesn't handle this well,
    and we use this formatting as a workaround.

    Returns:
        * int - the first return value
        * bool - the second return value

    """
    return 42, True

One-line docstrings

def some_function():
    """
    One line docstrings must be on their own separate line, or autodoc does not build them properly
    """
    ...

Objects

"""Description of the class
overflowing to a second line if it's long

Some more details here

Args:
    foo (str): First argument to the __init__ method
    bar (int): Second argument to the __init__ method

Attributes:
    foobar (str): First track attribute
    barfoo (bool): Second track attribute

Cached Properties:
    foofoo (list): Cached properties are special mirdata attributes
    barbar (None): They are lazy loaded properties.
    barf (bool): Document them with this special header.

"""

Conventions

Loading from files

We use the following libraries for loading data from files:

Format               library
audio (wav, mp3, …)  librosa
midi                 pretty_midi
json                 json
csv                  csv
jams                 jams

Track Attributes

Custom track attributes should be global, track-level data. For some datasets, there is a separate, dataset-level metadata file with track-level metadata, e.g. as a csv. When a single file is needed for more than one track, we recommend writing a _metadata cached property (which returns a dictionary, either keyed by track_id or freeform) in the Dataset class (see the dataset module example code above). When this is specified, it will populate a track’s hidden _track_metadata field, which can be accessed from the Track class.

For example, if _metadata returns a dictionary of the form:

{
    'track1': {
        'artist': 'A',
        'genre': 'Z'
    },
    'track2': {
        'artist': 'B',
        'genre': 'Y'
    }
}

the _track_metadata for track_id=track2 will be:

{
    'artist': 'B',
    'genre': 'Y'
}
Load methods vs Track properties

Track properties and cached properties should be trivial, and directly call a load_* method. There should be no additional logic in a track property/cached property, and instead all logic should be done in the load method. We separate these because the track properties are only usable when data is available locally - when data is remote, the load methods are used instead.
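
As an illustration of this separation (load_beats here is a hypothetical loader, not part of mirdata):

```python
def load_beats(beats_path):
    # hypothetical load function: one beat time per line in a text file;
    # all parsing logic lives here, usable with local or remote data
    with open(beats_path) as fhandle:
        return [float(line) for line in fhandle]

class Track:
    def __init__(self, beats_path):
        self.beats_path = beats_path

    @property
    def beats(self):
        # the property is trivial: it only delegates to the load function
        return load_beats(self.beats_path)
```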

Missing Data

If a Track has a property, for example a type of annotation, that is present for some tracks and not others, the property should be set to None when it isn’t available.

The index should only contain key-values for files that exist.

Custom Decorators

cached_property

This is used primarily for Track classes.

This decorator causes an Object’s function to behave like an attribute (aka, like the @property decorator), but caches the value in memory after it is first accessed. This is used for data which is relatively large and loaded from files.
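
A minimal version of this pattern (the idea, not mirdata’s exact code):

```python
class cached_property:
    """Compute the wrapped method once, then store the result as an
    ordinary instance attribute so later accesses skip the function."""

    def __init__(self, func):
        self.func = func
        self.__doc__ = func.__doc__

    def __get__(self, obj, cls):
        if obj is None:
            return self
        # storing the value in obj.__dict__ shadows this non-data descriptor,
        # so the function is never called again for this instance
        value = obj.__dict__[self.func.__name__] = self.func(obj)
        return value
```

Because the cached value lives in the instance dictionary, deleting the attribute resets the property, as described above.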

docstring_inherit

This decorator is used for children of the Dataset class, and copies the Attributes from the parent class to the docstring of the child. This gives us clear and complete docs without a lot of copy-paste.

copy_docs

This decorator is used mainly for a dataset’s load_ functions, which are attached to a loader’s Dataset class. The attached function is identical, and this decorator simply copies the docstring from another function.

coerce_to_bytes_io/coerce_to_string_io

These are two decorators used to simplify the loading of various Track members and to give users the ability to use file streams instead of paths, in case the data is in a remote location, e.g. GCS. The decorators modify the function to:

  • Return None if None is passed in.

  • Open a file if a string path is passed in (in read mode: ‘r’ for string_io, ‘rb’ for bytes_io) and pass the file handle to the decorated function.

  • Pass the file handle to the decorated function if a file-like object is passed.

This cannot be used if the function to be decorated takes multiple arguments. coerce_to_bytes_io should not be used if trying to load an mp3 with librosa as libsndfile does not support mp3 yet and audioread expects a path.
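
A sketch of the string_io variant of this pattern (read_first_line is a hypothetical decorated loader, for illustration only):

```python
import functools
import io

def coerce_to_string_io(func):
    # accept None, a path string, or a file-like object, and always
    # hand the decorated function an open text file handle
    @functools.wraps(func)
    def wrapper(file_path_or_obj):
        if file_path_or_obj is None:
            return None
        if isinstance(file_path_or_obj, str):
            with open(file_path_or_obj, "r") as fhandle:
                return func(fhandle)
        return func(file_path_or_obj)
    return wrapper

@coerce_to_string_io
def read_first_line(fhandle):
    # hypothetical loader that works on any text stream
    return fhandle.readline().strip()
```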

FAQ

How do I add a new loader?

Take a look at our Contributing docs!

How do I get access to a dataset if the download function says it’s not available?

We don’t distribute data ourselves, so unfortunately it is up to you to find the data yourself. We strongly encourage you to favor datasets which are currently available.

Can you send me the data for a dataset which is not available?

No, we do not host or distribute datasets.

How do I request a new dataset?

Open an issue and tag it with the “New Loader” label.

What do I do if my data fails validation?

Very often, data fails validation because of how the files are named or how the folder is structured. If this is the case, try renaming/reorganizing your data to match what mirdata expects. If your data fails validation because of the checksums, this means you are using data which differs from what most people are using, and you should try to get the more common dataset version, for example by using the data loader’s download function.

How do you choose the data that is used to create the checksums?

Whenever possible, the data downloaded using .download() is the same data used to create the checksums. If this isn’t possible, we did our best to get the data from the original source (the dataset creator) in order to create the checksums. If that was also not possible, we collected as many versions of the data as we could from different users of the dataset, computed checksums on all of them, and used the version that was most common among them.

Does mirdata provide data loaders for PyTorch/TensorFlow?

For now, no. Music datasets vary widely in their annotation types and supported tasks. To provide such loaders, there would need to be “standard” ways to encode the desired inputs/outputs, which unfortunately do not exist across most datasets and use cases. Still, this library provides the necessary first step, and it is easy to build data loaders on top of it. For more information, see Using mirdata with tensorflow.
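
For example, a framework-agnostic generator can be layered on top of a mirdata-style tracks dictionary. StubTrack below is a stand-in so the sketch is self-contained; with mirdata installed and data downloaded, you would iterate over dataset.load_tracks() instead, and the attribute names depend on the dataset:

```python
class StubTrack:
    # stand-in for a mirdata Track; real attribute names vary per dataset
    def __init__(self, audio, label):
        self.audio = audio
        self.label = label

def iter_examples(tracks):
    # yield (audio, label) pairs in a stable order, skipping missing data
    for track_id in sorted(tracks):
        track = tracks[track_id]
        if track.audio is None or track.label is None:
            continue  # mirdata sets missing annotations to None
        yield track.audio, track.label

tracks = {
    "t1": StubTrack([0.1, 0.2], "guitar"),
    "t2": StubTrack(None, "flute"),  # audio missing for this track
}
pairs = list(iter_examples(tracks))
```

A generator of this shape can be handed directly to framework utilities that consume Python iterables.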

Why the name, mirdata?

mirdata = mir + data. MIR is an acronym for Music Information Retrieval, and the library was built for working with data.

If I find a mistake in an annotation, should I fix it in the loader?

No. All datasets have “mistakes”, and we do not want to create another version of each dataset ourselves. The loaders should load the data as released. After that, it’s up to the user what they want to do with it.

Does mirdata support data which lives off-disk?

Yes. While simple usage of mirdata assumes that data lives on-disk, it can be used with off-disk data as well. See Accessing data remotely for details.