MS COCO 2014 Dataset?


Hi, would it be possible to add MS COCO 2014’s train/val/test datasets?

The train dataset is 13.5 GB in size and it would be very helpful if you could add it :slight_smile:



Hi @cbsudux,

The easiest and quickest way to get the MS COCO 2014 dataset ready for your purposes is to upload it directly to your Neptune storage.

Use the data upload command:

  1. pip install neptune-cli -U (this makes sure that you have the latest version of neptune-cli)
  2. neptune data upload -r --project PROJECT path
    • PROJECT is your project short ID
    • path is something like /local/path/to/MS-COCO-2014

Your data will be uploaded to PROJECT/uploads/MS-COCO-2014.
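
If the connection is unreliable, one option is to wrap the upload command in a small retry loop. Below is a rough sketch in Python, not an official Neptune tool. The helper names (build_upload_command, upload_with_retries) are made up for illustration, and it assumes that the CLI exits with a non-zero status when a transfer is interrupted. It is also not known here whether the CLI resumes a partial upload, so each retry may start from the beginning.

```python
import subprocess
import time


def build_upload_command(project, path):
    """Return the upload command from the steps above as an argument list."""
    return ["neptune", "data", "upload", "-r", "--project", project, path]


def upload_with_retries(project, path, attempts=5, delay_seconds=30,
                        run=subprocess.call):
    """Re-run the upload until it exits with status 0 or attempts run out.

    Assumption: a non-zero exit status means the upload failed or was
    interrupted. Returns the number of the attempt that succeeded.
    """
    command = build_upload_command(project, path)
    for attempt in range(1, attempts + 1):
        if run(command) == 0:
            return attempt
        # Wait before the next try, but not after the final failure.
        if attempt < attempts:
            time.sleep(delay_seconds)
    raise RuntimeError("upload did not succeed after %d attempts" % attempts)
```

For example, upload_with_retries("PROJECT", "/local/path/to/MS-COCO-2014") would run the same command as step 2, up to five times. The run parameter exists only so the loop can be tried out with a stand-in function instead of the real CLI.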

Happy training :wink:



I know how to upload a dataset. :slight_smile:

It’s very difficult for me to upload something that is 13.5 GB in size and I would be very happy if someone from the Neptune team could do it.



Hey @cbsudux,

Sure! What sort of difficulties do you face? Maybe we can help in some way? :slight_smile:



The download gets interrupted for me. It’s either due to fluctuating Wi-Fi or some other interruption. I’ve tried around 10 times already.

I’ve seen a few Kaggle datasets being uploaded. Would it be possible to upload this too?

It will be useful to the community in general :slight_smile:


Hi @cbsudux,

On Tuesday morning I will discuss this with the team. We need to check a few things, namely the data format and the license. I will get back to you on Tuesday afternoon :slight_smile:

In the meantime, you can copy and paste the error that you get. I hope that the interruptions you mentioned manifest themselves somehow.




Have you decided on something? :slight_smile:
The dataset is instrumental to my research and I’d greatly appreciate it if you could upload COCO.



Hi @cbsudux,

I have good news for you. We have decided to upload MS COCO to the /public directory :slight_smile:. However, first I need to check the license of this dataset.

Also, we have decided that we want to take it one step further. We will create a larger collection of important ML/DL datasets. Do you have other datasets in mind that you think we should upload as well?




Thanks a lot! I’d really appreciate it if you could do it soon.

Well, the other dataset I can think of is the VQA dataset. I’m sure there are more.




Nice! Please give me the exact links to the MS COCO and VQA datasets.


MS COCO 2014 is more widely used


Hi @cbsudux,

The COCO dataset is now available here: /public/datasets/coco

Happy training,


Thanks a lot!