Neptune release 2017.10.05


#1

A new version of Neptune has just been released.

NOTE: This release introduces changes breaking backward compatibility.
You have to upgrade your neptune-cli:
pip install -U neptune-cli

Released components:

Change log:

  • Added new, simplified experiment parameters. You need to change your code to fit the Neptune 2.1 parameters model.
    Changes introduced in parameters:
    • neptune.Context().params is no longer supported,
    • the parameters section is no longer supported in the neptune.yaml config file,
    • read more about the new parameters here.
  • Introduced a new leaderboard with the following new functionality:
    • reordering columns via drag and drop,
    • resizing columns,
    • editing an experiment’s metadata (name, tags, notes) inline from the leaderboard,
    • new columns:
      • running time,
      • source code size,
      • source code md5sum,
      • git hash,
      • worker type,
      • environment.
    • Added the ability to pin a column to a fixed, right-hand part of the leaderboard that does not move while scrolling (helpful for comparing experiments by a specific column).
    • Introduced a Short ID to make experiments easier to distinguish.
    • Added the ability to resize the dashboard pane and almost hide it when a notebook is enabled.
  • Added new environment: pytorch-0.2.0.
  • Changed the directory where code is executed from /source to /neptune. Users can now write to /neptune. This partially addresses a previously reported problem.
  • Fixed a bug where neptune run --environment did not work.

Neptune.yaml error global_path
#2

So we can only define our “parameters” on the command line?


#3

I don’t understand this change to parameters.
If I have a model that deals with 20 parameters (or more), do I have to define all of them on the command line?
I really liked being able to keep parameters with their descriptions in the .yaml file… :’(


#4

@NOWIS You can still define parameters in the yaml file. Please read more about how to do that here: https://docs.neptune.ml/config/experiment/#cmd-args.
We changed our approach to parameters because the previous one required users to change their code to use Neptune parameters. Previously, you had to use the Neptune-specific neptune.Context().params.x to get the value of the x param. Now parameters are passed to your source code like in any other Python script, via sys.argv, so you do not have to change your code to start using Neptune.
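
To make this concrete (the script and parameter names below are hypothetical, not taken from the Neptune docs), your training script simply reads its parameters from the command line, for example with argparse:

# hypothetical training script; Neptune passes parameters as ordinary command-line arguments
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--learning_rate', type=float, default=0.1)
parser.add_argument('--num_leaves', type=int, default=31)
args = parser.parse_args()

print(args.learning_rate, args.num_leaves)  # use the values like in any other Python script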


#5

@rhryciuk Alright, thanks! It just wasn’t spelled out clearly in the parameters documentation! I understand better now.
The “Example” section in the config doc is obsolete, and I think examples are really useful for seeing the right way to set up an experiment. It would be great to update it in the future :wink:
Thanks again for the help!


#6

@NOWIS Thanks for your comment. We have updated the “Example” page in the “Config” section. Please refresh the page bypassing the browser cache (Ctrl+Shift+R).


#7

@rhryciuk I’m still confused. Say I want to run a grid search over parameters x = [1,2] and y = [3,4]; should this go in neptune.yaml? https://docs.neptune.ml/config/experiment/ has a grid search example section, but I’m lost as to where to put these config settings.


#8

You should add this to the end of your yaml:
cmd-args: "--x %[1, 2] --y %[3,4]"


#9

I understand the benefits of this feature, but I also believe it becomes a real problem when dealing with a large number of parameters.
As an example, here is the .yaml I’m using to train a LightGBM model.

  • Can we still give parameters a description in the .yaml file? Descriptions are pretty useful when we have many parameters.
  • I feel like a file full of cmd-args will be completely unreadable. Don’t you think? Is there another way to deal with args than this one-line cmd-args thing?

Sorry to insist; it’s just that I really feel we’re losing something important in this release.

parameters:
  boosting: 
    description: boosting type, can be 'gbdt', 'rf', 'dart', or 'goss'. default = 'gbdt'. 
    default: 'gbdt'

  application:
    description: type of task, default = 'regression'. Can be 'regression', 'binary', 'multiclass', etc...
    default: 'regression'

  learning_rate:
    description: learning_rate, default = 0.1.
    default: 
      values: [0.02, 0.05]

  num_leaves:
    description: number of leaves in one tree, default = 31.
    default: 
      values: [16, 64]

  num_threads: 
    description: Number of threads. For the best speed, number of real CPU core. default = 1.
    default: 4

  max_depth:
    description: Max depth for tree model. Deal with overfit. < 0 means no limit. default = -1.
    default: -1

  min_data_in_leaf:
    description: Minimal number of data in a leaf. Can use to deal with over-fit. default = 20.
    default: 20

  min_sum_hessian_in_leaf:
    description: Minimal sum hessian in a leaf. Can use to deal with over-fit. default = 1e-3.
    default: 1e-3

  feature_fraction: 
    description: Random selection of features at each iteration. Can use for over-fit & speed. default = 1.0.
    default: 
      values: [0.8, 0.9]

  bagging_fraction:
    description: Random selection of data. Can use for over-fit & speed. Must set bagging_freq to a non zero value as well.
    default: 
      values: [0.8, 0.9]

  bagging_freq:
    description: Frequency for bagging, 0 means disable bagging. Will perform bagging at every n iterations.
    default: 
      values: [1, 5]

  lambda_l1:
    description: l1 regularization, default = 0.
    default: 
      values: [1, 4]

  lambda_l2:
    description: l2 regularization, default = 0.
    default: 
      values: [1, 4]

  min_gain_to_split:
    description: The minimal gain to perform split.
    default: 
      values: [0.1, 0.5]

  max_bin:
    description: Max number of bin that feature values will bucket in. Small bin reduce accuracy, but deal with over-fit. LightGBM will auto compress memory according max_bin. default = 255.
    default: 255

  min_data_in_bin:
    description: Min number of data inside one bin. Avoid one-data-one-bin -> over-fitting. default = 5.
    default: 5

  use_missing:
    description: Set to false will disable the special handle of missing value. default = True.
    default: True


  is_unbalance:
    description: Used in binary classification. Set this to true if training data are unbalance. default = False
    default: True

#10

Thanks. This lets me launch the grid search without hitting an error in my terminal.

However, in the Neptune UI I get an error: select_experiment.py: error: argument --seed: invalid int value: '100.0'

The --seed is a parameter I pass to the experiment to control the random state of a sampler. It’s supposed to be an integer. In my neptune.yaml, I have this line:

cmd-args: "--seed %[100,123] --down-sample %[0.1,0.2,0.5] --shuffle-seed %[111,123]"

The seed values are 100 and 123, not 100.0 or 123.0. I’m not sure how or why Neptune treats the value range as floats and throws this error. Could you please advise?


#11

We’ll take a look at the floating-point grid search parameters on Monday. Yes, this is unexpected behavior and probably a bug.


#12

@TianhaoLu We are working on this issue. If a value list or range contains only integers, the individual values should be integers too. A fix will be included in the next release. For now, you can declare your parameters as floats and cast them to int in your code.
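
As a sketch of that workaround (reusing the --seed argument from the report above; the argparse usage is illustrative), accept the value as a float and cast it back to an int inside your script:

# workaround until the fix ships: accept the grid-search value as a float, then cast to int
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--seed', type=float)  # Neptune currently passes 100.0 instead of 100
args = parser.parse_args()

seed = int(args.seed)  # the sampler gets the integer it expects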


#13

@NOWIS You are right; we didn’t anticipate this drawback of the new approach. In the upcoming release (4 weeks from now) we will introduce another change that is essentially a merge of the new approach and the old one. In particular, you will be able to define your parameters in YAML like this:

# description of parameter_x
parameter_x: 4

# description of parameter_y
parameter_y: [1, 4]

We decided not to support a special “description” field, since regular YAML comments can be used for that. I’m sorry for the inconvenience, and thanks for the feedback!
Piotr


#14

Thanks! I’m glad you’re open to suggestions and criticism! :slight_smile:
Simon


#15

@TianhaoLu We have released a fix for your problem.
Please run pip install -U neptune-cli to upgrade your Neptune CLI.


#16

From my point of view, the parameters change goes in the wrong direction. As mentioned earlier, I now need to turn all my parameters into command-line arguments, which is less readable than before. I also have to parse the input myself now.

Also, on parameter descriptions vs. comments: the description was very useful because it also appeared on the dashboard in the parameters tab. Now this field has disappeared, which is not helpful, because I no longer know what some parameters mean. I would like to get it back; it was very helpful :slight_smile:


#17

Regarding the need to change parameters to command-line arguments: it won’t be necessary. We will also restore the ctx.parameters object, so you will be able to use it again; it will also contain the command-line arguments tracked by Neptune.

After taking into consideration your feedback, and that of a few other users who contacted me directly, we will restore the parameter descriptions as well. Next time, before we decide to make a change that breaks backward compatibility, we will consult this forum first. Once again, I am sorry.
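
For illustration only, and assuming the restored object behaves like the pre-2.1 neptune.Context().params access mentioned earlier in this thread, reading a parameter might again look roughly like this (the exact attribute name and behavior may differ once it ships):

# speculative sketch; the restored API may differ in detail
import neptune

ctx = neptune.Context()
learning_rate = ctx.params.learning_rate  # parameters, including tracked command-line arguments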


#18

Hi @anitaka, for the cmd-args in neptune.yaml, what should I put for string parameters?

For example, I have two params, “x” and “model”.

Do I write cmd-args: "--x %[1, 2] --model %["svm", "gbt"]"?


#19

I recommend using single quotes, like this:
cmd-args: '--x %[1, 2] --model %["svm", "gbt"]'


#20

Thanks, it seems to work!