Avito dataset (for newbies + Windows OS)


#1

Sorry, I read your previous post about the competition, but I still can't upload the data I need. If possible, could you help me, please?


#2

Hi @Alexander,

As @jakub_czakon has said in another post, unfortunately we cannot upload this competition’s data for legal reasons. However, maybe I can offer a workaround for your problems with uploading the data to Neptune (in the meantime we’ll be working to improve data management).

  1. Open a Jupyter notebook within Neptune.

  2. Install Kaggle CLI:

    !pip install kaggle
    
  3. Provide your Kaggle API key (you can find it on your account page, https://www.kaggle.com/<your username>/account, in the API section):

    !mkdir -p /root/.kaggle
    !echo '{"username":"...","key":"..."}' > /root/.kaggle/kaggle.json && chmod 600 /root/.kaggle/kaggle.json
    
  4. Download the data to the /output directory:

    !kaggle competitions download -c avito-demand-prediction -p /output/avito/ -f test_jpg.zip
    

    Repeat this command for every file you need (for example, train_jpg.zip, which step 6 unzips). You could omit the -f parameter, but this competition has its data uploaded twice, once as a single large file and once as a few smaller ones, so you would download 50 GB of data twice.

  5. You can see what you downloaded:

    !ls -l /output/avito/
    
  6. Unzip the files (this will take a while!):

    !unzip /output/avito/train_jpg.zip -d /output/avito/
    
  7. Now you can close the notebook and use the data from this experiment in another experiment using --input. For example,

    neptune run --input /AVITO-123/output/avito/
    

    Within the experiment, the data will be available under

    /input/avito/
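
If you prefer to script steps 3 to 6 from a single notebook cell, something along these lines might work. It is only a rough sketch: the helper names (`write_kaggle_credentials`, `download_command`, `fetch_and_unzip`) are made up for illustration, the file list is an assumption (only test_jpg.zip and train_jpg.zip are mentioned above), and it assumes each downloaded archive keeps its original file name.

```python
import json
import os
import subprocess
import zipfile
from pathlib import Path

COMPETITION = "avito-demand-prediction"
# Assumed file list: only these two archives are mentioned in the steps above.
FILES = ("train_jpg.zip", "test_jpg.zip")


def write_kaggle_credentials(username, key, config_dir="/root/.kaggle"):
    """Write kaggle.json, creating the directory and restricting permissions."""
    config_dir = Path(config_dir)
    config_dir.mkdir(parents=True, exist_ok=True)
    path = config_dir / "kaggle.json"
    path.write_text(json.dumps({"username": username, "key": key}))
    os.chmod(path, 0o600)  # readable and writable by the owner only
    return path


def download_command(filename, target_dir="/output/avito/"):
    """Build the same CLI call as in step 4 for a single file."""
    return ["kaggle", "competitions", "download",
            "-c", COMPETITION, "-p", str(target_dir), "-f", filename]


def fetch_and_unzip(files=FILES, target_dir="/output/avito/"):
    """Download each archive with the Kaggle CLI, then extract it in place."""
    for name in files:
        # check=True stops the loop if the CLI reports an error.
        subprocess.run(download_command(name, target_dir), check=True)
        with zipfile.ZipFile(Path(target_dir) / name) as archive:
            archive.extractall(target_dir)
```

In a notebook you would call `write_kaggle_credentials("...", "...")` with your own username and key, then `fetch_and_unzip()`. Extracting with `zipfile` instead of `!unzip` is just a convenience here; it will still take a while for archives of this size.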
    
    

Hope this helps! If you run into any problems, let us know and we’ll figure something out :slight_smile:


#3

Btw, @Alexander, can you describe the problems you experienced with the upload?


#4

Hi @piotr.lusakowski and @Alexander,

Together with @jakub_czakon, we will double-check the rules of the competition. We also want to make it easy for our users to participate in this challenge.

Cheers,
Kamil