MS COCO 2014 Dataset?


#1

Hi, would it be possible to add MS COCO 2014’s train/val/test datasets?

The train dataset is 13.5 GB in size and it would be very helpful if you can add it :slight_smile:

Cheers


#2

Hi @cbsudux,

The easiest and quickest way to get the MS COCO 2014 dataset ready for your purposes is to upload it directly to your Neptune storage.

Use the data upload command:

  1. pip install neptune-cli -U (this makes sure that you have the latest version of neptune-cli)
  2. neptune data upload -r --project PROJECT path
    where:
    • PROJECT is your project short ID
    • path is something like /local/path/to/MS-COCO-2014

Your data will be uploaded to PROJECT/uploads/MS-COCO-2014.
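
If you would rather script step 2, for example to retry automatically when the command fails, here is a minimal Python sketch. It only wraps the exact command shown above; the function names, the retry count, and the wait time are hypothetical choices, and it assumes the CLI exits with a non-zero code on failure. Whether a retried upload resumes or starts over is not something this sketch knows about.

```python
import subprocess
import time


def build_upload_command(project, path):
    """Return the argument list for the upload command from step 2."""
    return ["neptune", "data", "upload", "-r", "--project", project, path]


def upload_with_retries(project, path, attempts=3, wait_seconds=30,
                        runner=subprocess.call):
    """Run the upload and retry on a non-zero exit code.

    `runner` takes the argument list and returns an exit code; it defaults
    to subprocess.call and can be swapped out for testing.
    Returns True if any attempt succeeds, False otherwise.
    """
    command = build_upload_command(project, path)
    for attempt in range(1, attempts + 1):
        if runner(command) == 0:
            return True
        if attempt < attempts:
            # Assumption: a short pause gives a flaky connection time to recover.
            time.sleep(wait_seconds)
    return False
```

You would call it as upload_with_retries("PROJECT", "/local/path/to/MS-COCO-2014"), with PROJECT replaced by your project short ID.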

Happy training :wink:
Kamil


#3

Hi,

I know how to upload a dataset. :slight_smile:

It’s very difficult for me to upload something that is 13.5 GB in size, and I would be very happy if someone from the Neptune team could do it.

Cheers


#4

Hey @cbsudux,

Sure! What sort of difficulties do you face? Maybe we can help in some way? :slight_smile:

Best,
Kamil


#5

The download gets interrupted for me. It’s either due to fluctuating Wi-Fi or some other interruption. I’ve tried around 10 times already.

I’ve seen a few Kaggle datasets being uploaded. Would it be possible to upload this one too?

It will be useful to the community in general :slight_smile:


#6

Hi @cbsudux,

On Tuesday morning I will discuss this with the team. We need to check a few things, e.g. the data format and license. I will get back to you on Tuesday afternoon :slight_smile:

In the meantime, you can copy and paste the error that you get. I hope the interruptions you mentioned manifest themselves somehow.

Best,
Kamil


#7

Hi,

Have you decided on something? :slight_smile:
The dataset is instrumental to my research and I’d greatly appreciate it if you could upload COCO.

Cheers


#8

Hi @cbsudux,

I have good news for you. We have decided to upload MS COCO to the /public directory :slight_smile:. However, first I need to check the license of this dataset.

Also, we have decided that we want to take it one step further. We will create a larger collection of important ML/DL datasets. Do you have other datasets in mind that you think we should upload as well?

Best,
Kamil


#9

Hi,

Thanks a lot! I’d really appreciate it if you could do it soon.

Well, the other dataset I can think of is the VQA dataset. I’m sure there are more.

Cheers


#10

@cbsudux

Nice! Please give me the exact links to the MS COCO and VQA datasets.


#11

http://cocodataset.org/#download

MS COCO 2014 is more widely used


#12

Hi @cbsudux,

The COCO dataset is now available here: /public/datasets/coco

Happy training,
Kamil


#13

Thanks a lot!