Support simple statistical functions for numeric outputs in the experiments page

The experiments page is a great way to compare the performance of different configurations of the same experiment; however, the “Numeric Logs” display only shows the last value of each scalar.
A typical optimization goal is to minimize or maximize some scalar training objective.
Practically, this means that the value of the objective function will start out high, decrease to a minimum, and then, if overfitting occurs, start to increase again.
It is important to know the minimum value that the scalar took over the course of the experiment.

Currently I have to select and compare all the experiments and manually inspect and record the minimum for each.
I don’t see a way to show, for instance, min(value), max(value), or mean(value) in the experiments page.
Are there any plans to implement this function?

Hi @spotiris,

Thank you for this suggestion.

Solution

Right now, you can do this programmatically. At the end of your experiment (where you call stop, or end the experiment in some other way), you can fetch the metrics data and calculate summary statistics:

import neptune

project = neptune.init('organization/project')
# pass the id of the experiment you want to summarize
exp = project.get_experiments(id='SOME-ID')[0]
metrics_df = exp.get_numeric_channels_values('metric_1', 'metric_2')

In the above example, metrics_df is a pandas.DataFrame, so calculating min, max, mean, or std is straightforward. Once you have your values, send them back to Neptune, for example:

# calculate mean
metric_1_mean = metrics_df['metric_1'].mean()

# option 1: log to Neptune as metric
neptune.log_metric('metric_1_mean', metric_1_mean)

# option 2: log to Neptune as property
neptune.set_property('metric_1_mean', metric_1_mean)
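
For completeness, the remaining statistics follow the same pattern (a minimal sketch, reusing the placeholder metric name from the example above):

# the other summary statistics work the same way
neptune.log_metric('metric_1_min', metrics_df['metric_1'].min())
neptune.log_metric('metric_1_max', metrics_df['metric_1'].max())
neptune.log_metric('metric_1_std', metrics_df['metric_1'].std())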

Feature request

In general, it seems interesting to introduce such summaries of metrics in-app. Do you think that mean, max, and min are sufficient? Or should we maybe allow users to develop their own functions, register them, and execute them at the end of the experiment (like callbacks), as in the sketch below? The result would be appended as a column to the experiments table.
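
Purely to illustrate that idea - none of this is an existing Neptune feature, and the names are made up - the user-side pattern could look roughly like this, reusing metrics_df and log_metric from the workaround above:

# hypothetical sketch: register your own summary functions
# and apply them to each channel at the end of the experiment
summary_fns = {
    'min': lambda s: s.min(),
    'last10_mean': lambda s: s.tail(10).mean(),  # example of a custom, user-defined summary
}

for channel in ['metric_1', 'metric_2']:
    for fn_name, fn in summary_fns.items():
        neptune.log_metric('{}_{}'.format(channel, fn_name), fn(metrics_df[channel]))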

What do you think?

Best,
Kamil

Hi @spotiris,

I just wanted to check whether you had a chance to look at my answer. I am also curious about your view on the feature request that I drafted.

Looking forward to discussing it :slight_smile:

Best,
Kamil

Hi @kamil.kaczmarek,

I have thought about writing a new ‘min_loss’ field for new experiments, as you have suggested, and I think that is currently achievable.
However, it is difficult to do the same for existing experiments, which need to be compared against new ones.

I think the feature I have requested would be convenient; for me, this is one of the most important functions of the experiments page.
It would be better as a frontend feature than as part of the API, because that would also make it usable with experiments that did not produce the mean/min/max statistics at train time.

Another way to formulate the feature request would be, rather than reporting the ‘min/mean/max’ of a channel, to report a given percentile of the channel values.
For example, the min would be the 0th percentile, the max the 100th, and the median the 50th (although the median does not necessarily reflect the mean).
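
Concretely, with the DataFrame from your snippet this would just be a quantile call (a sketch only, reusing your placeholder metric name):

# 0th, 50th and 100th percentiles, i.e. min, median and max of the channel
p0, p50, p100 = metrics_df['metric_1'].quantile([0.0, 0.5, 1.0])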

Thanks for the quick response, sorry for the delayed reply!
Kind regards,
Steve

Hi @spotiris,

Thank you for this additional explanation.

I feel that such functions in the frontend might be very handy when you want to compare old experiments with more recent ones. I can see value in having summary functions available in the frontend - in the experiments view specifically. I have noted it down as a feature request.

Right now, the workaround that I suggested in one of my previous posts might work for you. One inconvenience is that, in order to compare with the latest experiments, you need to run this code:

project = neptune.init('organization/project')
experiments = project.get_experiments(state='succeeded')

Here, experiments is a list of Neptune Experiment objects, and from each of them you can retrieve the metric values (as a pandas DataFrame) like this:

for exp in experiments:
    metrics_df = exp.get_numeric_channels_values('metric_1', 'metric_2')

Check the docs for get_experiments - there are several options for filtering experiments.
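
If it helps, here is a rough sketch of how these pieces could be combined into a small comparison table (the channel name and the choice of min() are just placeholders, and I am assuming exp.id holds the short experiment id):

import pandas as pd

# extend the loop above to build a comparison table of per-experiment minima
rows = []
for exp in experiments:
    metrics_df = exp.get_numeric_channels_values('metric_1')
    rows.append({'experiment': exp.id, 'metric_1_min': metrics_df['metric_1'].min()})

comparison = pd.DataFrame(rows).sort_values('metric_1_min')
print(comparison)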

I hope this helps :slightly_smiling_face:

Best,
Kamil