
Downloading Datasets

Login Required

You must be logged in for this command to run successfully.

With this command, you can easily download the datasets available for a particular challenge.

By default, all dataset files are downloaded:

aicrowd dataset download -c CHALLENGE

where CHALLENGE is the challenge slug or URL.


Multiple files may be downloaded in parallel, depending on the number of CPU cores on your machine.

Downloading specific files

If you only want to download some specific files, you have two options:

Specifying patterns

In the dataset listing, every dataset has a title. You can specify a glob pattern to select the files you want to download.

For example,

aicrowd dataset download -c CHALLENGE '*train*'
would download only the files that have the word 'train' in them.

Multiple glob patterns can be specified.
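
To get a feel for what a pattern would select before downloading anything, the matching can be sketched locally with the shell's own glob matching. This is only an illustration: the titles and the select_titles helper below are hypothetical, and the CLI's matcher may differ in details such as case sensitivity.

```shell
# Hypothetical dataset titles; real ones come from the challenge's dataset listing.
TITLES="train.csv
test.csv
train_labels.csv
sample_submission.csv"

# Print the titles that match a glob pattern, using the shell's `case` matching.
# Assumption: the CLI matches patterns in a similar way; details may differ.
select_titles() {
    pattern=$1
    printf '%s\n' "$TITLES" | while IFS= read -r title; do
        case $title in
            $pattern) printf '%s\n' "$title" ;;
        esac
    done
}

SELECTED=$(select_titles '*train*')
printf '%s\n' "$SELECTED"
```

When passing a pattern to aicrowd, keep it in quotes so that your shell hands it over verbatim instead of expanding it against local files. Assuming multiple patterns are given as separate arguments, a call such as aicrowd dataset download -c CHALLENGE '*train*' '*test*' would select files matching either pattern.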

Specifying indices

In the dataset listing, every dataset has an index. You can specify the indices of the datasets you want to download.

For example,

aicrowd dataset download -c CHALLENGE 0
would download only the first dataset in the listing.

Multiple indices can be specified.
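
As a rough sketch of a multi-index call, the snippet below builds the command and prints it as a dry run. The challenge slug my-challenge and the indices are hypothetical, and passing the indices as separate space-separated arguments is an assumption based on the single-index form above.

```shell
# Hypothetical challenge slug, for illustration only.
CHALLENGE="my-challenge"

# Assumption: several indices are passed as separate positional arguments.
CMD="aicrowd dataset download -c $CHALLENGE 0 2"

# Dry run: print the command so it can be checked before running it.
echo "$CMD"
```

If the assumption holds, running the printed command would download the first and third datasets in the listing.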

Downloading more (or less) files in parallel

You can control the number of threads spawned for downloading datasets with the -j option.

For example,

aicrowd dataset download -c CHALLENGE -j 4
would download 4 files simultaneously.
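
One possible way to pick a value for -j is to start from the machine's core count and cap it. The sketch below does that and prints the resulting command as a dry run. The cap of 4 (MAX_JOBS) is an arbitrary, hypothetical choice, and CHALLENGE remains a placeholder for the challenge slug.

```shell
# Number of online CPU cores; fall back to 1 if it cannot be determined.
CORES=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
case $CORES in
    ''|*[!0-9]*) CORES=1 ;;
esac

# Hypothetical cap so a many-core machine does not start too many downloads at once.
MAX_JOBS=4
if [ "$CORES" -gt "$MAX_JOBS" ]; then
    JOBS=$MAX_JOBS
else
    JOBS=$CORES
fi

# Dry run: print the command; CHALLENGE is a placeholder for the slug or URL.
echo "aicrowd dataset download -c CHALLENGE -j $JOBS"
```

A lower value may be preferable on a slow or metered connection, since more parallel downloads share the same bandwidth.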