
Not possible to send infs and nans #183

Open
wjaskowski opened this issue Oct 2, 2019 · 18 comments

Comments

@wjaskowski
Contributor

What is the reason not to allow sending infs and nans as metric values? I imagine it's impossible to plot them, but this is still useful information.

@kamil-kaczmarek
Contributor

Hi @wjaskowski thanks for reaching out.

Indeed, right now we do not accept NaN/None/(+/-)Inf values.

However, we had some internal discussion about it.

One idea is to make it similar to what TensorBoard does: each NaN/None is displayed as a graphic icon, like a triangle or star. Its location on the y-axis is determined by the preceding numeric value, and its location on the x-axis is the preceding value + 1.

What do you think?

@wjaskowski
Contributor Author

Sounds good. You might also want to consider placing those triangles at the bottom/top of the visible plot.

But visualization is one thing; the most important thing is being able to send and download the data.

@kamil-kaczmarek
Contributor

Thanks for the suggestion 🙂

We will consider it as well.

@fwindolf

This is a somewhat stale issue, but it's the first thing that comes up when you google the behaviour.

Any news/updates on this?

@SiddhantSadangi
Member

Hello @fwindolf ,
This feature request is quite deep in our backlog, so currently, there is no ETA for it, unfortunately.
Is this behavior a blocker for your workflows?

@fwindolf

Not really a blocker, but NaNs occurring during training for whatever reason seems to be a common enough problem that experiment tracking shouldn't completely break, IMO.

So a

run["my_metric"].append(1.0)
run["my_metric"].append(float("nan"))
run["my_metric"].append(3.0)

will only show the 1.0. I see why adding NaN support would open up quite a few edge cases for visualizations etc., but maybe a short-term fix could be simply ignoring NaN, +/-inf, etc. during the list iteration when syncing the metric.

@SiddhantSadangi
Member

Would replacing the nan/inf values with 0/some high-end value while logging be a viable workaround in your case?
Something like:

import math

metric = float("nan")
if math.isnan(metric):  # replace NaN with 0 before logging
    run["my_metric"].append(0)

I've also submitted your feedback around ignoring NaN/inf to the product team. Thank you :)

@SiddhantSadangi
Member

Hello @fwindolf ,
Just checking if the above workaround works for you

@fwindolf

fwindolf commented Jun 1, 2023

Sorry, I missed the notification for the last comment.

We solved it by not logging nans as 0, inf as a big number which is okay for now. It skews the readability of graphs but it's better than not seeing anything.

Thanks for forwarding the issue!
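
For anyone finding this later, here is a minimal sketch of that clamping approach (the run handle, the helper name, and the sentinel value are assumptions for illustration, not part of Neptune's API):

import math

INF_SENTINEL = 1e12  # arbitrary large stand-in for +/-inf; pick a value that won't collide with real metrics

def append_clamped(run, key, value):
    # Replace nan with 0 and +/-inf with a large sentinel before logging.
    if math.isnan(value):
        value = 0.0
    elif math.isinf(value):
        value = INF_SENTINEL if value > 0 else -INF_SENTINEL
    run[key].append(value)

# e.g. append_clamped(run, "my_metric", loss)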

@SiddhantSadangi
Member

Did you mean not logging nans as 0? :)

@rschiewer

rschiewer commented Jun 12, 2023

Hi there! Is this something that is being actively worked on? I ran into some real trouble recently because my training process diverged and the logs did not show where the NaNs first started to appear. This can be very valuable information for debugging. What is bad about the way e.g. TensorBoard handles NaN/inf?

While the workaround is fine in most cases, my model showed values around zero all the time and then started to diverge, so replacing NaNs with zeros works in principle but is not ideal in my situation.

@SiddhantSadangi
Member

Hello @rschiewer ,

The product team is currently scoping this. This seems to involve relatively high engineering effort, so there is no ETA as of now, unfortunately :(

In your case, since the values hover around zero, can you replace NaNs with a high value so that they show up in charts, and you can then know when your model starts diverging?

@SiddhantSadangi SiddhantSadangi removed their assignment Sep 19, 2023
@SiddhantSadangi
Member

SiddhantSadangi commented Dec 6, 2023

Hey everyone! Just a quick update here.

Neptune v1.8.3 now skips trying to log NaN and Inf values and throws a warning instead. This means you no longer have to check for nan/inf values in your code 🥳
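
With that change, the earlier example from this thread should behave like this (a sketch based on the behaviour described above):

run["my_metric"].append(1.0)
run["my_metric"].append(float("nan"))  # skipped with a warning instead of raising, as of v1.8.3
run["my_metric"].append(3.0)           # logged normally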

@DanTremonti

Bump!

> Neptune v1.8.3 now skips trying to log NaN and Inf values and throws a warning instead.

Thanks for this change team.

> One idea is to make it similar to what TensorBoard does: each NaN/None is displayed as a graphic icon, like a triangle or star.

This sounds like a very useful way to visualize out-of-range values. I suppose logging when things go wrong is a crucial part of experimenting, and this way it will also help with comparisons across different experiments. Eager to hear your thoughts on this.

@SiddhantSadangi
Member

Hey @DanTremonti 👋
As you can see, this is a long-standing feature request 😃

There are several aspects we need to consider before finalizing a solution here, mainly related to front-end performance, as any custom rendering will need to be managed at the front end.

Although we are aware of this request, we don't anticipate any changes in the short term as we are currently focusing on performance improvements.

I'll keep this thread updated ✅

@DanTremonti

Thank you for the quick response @SiddhantSadangi
I'm happy to hear that this feature request is being considered :)

@brynhayder

brynhayder commented Aug 22, 2024

Thank you for your work.

I don't see this skipping behaviour in the docs of either append or FloatSeries. For the benefit of future users, please could you add it?

When the warning says the values are skipped, is the counter on the series advanced or not? Considering logging loss values throughout training, you might want the step counter to remain consistent regardless of whether the loss was NaN. The series would then contain missing values, which the user at least knows correspond to invalid floats. Either way, please could you make the behaviour clear in the docs?
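
For what it's worth, one way to keep steps aligned regardless of skipped values is to pass the step explicitly (a sketch assuming append accepts a step argument, as in recent Neptune client versions):

for step, loss in enumerate(losses):
    # With an explicit step, a skipped nan/inf leaves a gap at its step
    # instead of shifting all later points by one.
    run["train/loss"].append(loss, step=step)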

@SiddhantSadangi
Member

Hey @brynhayder 👋
Thanks for your comment!

The skipping behaviour is mentioned here: https://docs.neptune.ai/help/value_of_unsupported_type/#working-around-none-inf-or-nan. However, we should also mention this in the API reference for append. I'll pass this on to the docs team ✅

The counter on the series is currently not advanced when a nan or inf value is encountered. But your feedback makes sense, and I totally resonate with it. I'll pass it on to the product team ✅
