Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make time resource versions more predictable #45

Open
jwntrs opened this issue Aug 19, 2019 · 11 comments
Open

Make time resource versions more predictable #45

jwntrs opened this issue Aug 19, 2019 · 11 comments

Comments

@jwntrs
Copy link

jwntrs commented Aug 19, 2019

What challenge are you facing?

Right now time resource versions are unpredictable. The resource check will output the current timestamp after its interval has elapsed (or whatever other configuration is used). However, if the check gets delayed then the gap between versions is unpredictable.

What would make this better?

Make time resource versions more predictable. Instead of emitting the current timestamp, emit versions at set times based on the interval specified. This means that if the last version emitted was 1:42:05 and the interval was 1m then if the check runs at say 1:45:30 the following versions would be emitted: 1:43:05, 1:44:05, 1:45:05. This way regardless of when the check actually runs, we end up with a deterministic set of versions.

@evanchaoli
Copy link

If interval is 3m, last version emitted at 1:40:00, and check runs at 1:45:00, what version should be emitted?

@jwntrs
Copy link
Author

jwntrs commented Aug 20, 2019

Would be emitted by the current time resource? Or should be emitted, as proposed by this issue?

Currently it would emit 1:45:00.
This issue argues that it should emit 1:43:00 instead.

@abhishekgupta-lab
Copy link

I am also facing same issue and its very unpredictable sometimes it works correctly and trigger it once but sometimes it triggers job twice. do we know what is causing this unpredictable behavior and ETA on fix?

@owenfarrell
Copy link
Member

So, I started hacking away at this yesterday, but came across some key questions. Below are the assumptions that I've started running with.

A1: When provided with a previous version, check should a list of all valid versions since the specified version

The assumption here is to strictly adhere to the contract for implementing a resource and that anyone who wants to trigger jobs for every version can configure the pipeline to do just that.

Example:

resources:
- name: this_resource_should_generate_3_new_versions_each_time_it_is_checked
  type: time
  check_every: 15m
  source:
    interval: 5m

A2: When defining an interval and a start, new versions will be generated based on the start time (not the previous version)

The assumption here is to essentially "reset" the interval calculation each day. This ensures consistency for situations where the specified interval doesn't divide evenly in to a 24 hour period or where there aren't 24 hours in a day (switching to/from daylight saving time).

Example:

resources:
- name: this_interval_does_not_divide_evenly_in_to_24h
  type: time
  source:
    interval: 35m
    start: 3:00am
    stop: 6:00am

The above configuration would generate the following versions every day (note that these are the versions generated, not when the versions are generated).

  • 3:00:00
  • 3:35:00
  • 4:05:00
  • 4:40:00
  • 5:15:00
  • 5:50:00

A3: Implementing this kind of consistency only applies when an interval is defined.

If an interval is not defined, then the existing behavior (generating a version based on the current timestamp) will continue when within the start/stop window.

@vito
Copy link
Member

vito commented Jun 9, 2020

@owenfarrell Awesome - happy to see this being worked on!

A1 and A2 make sense to me.

For A3, to be honest I'm having a hard time interpreting what start + stop without interval should even mean. If it just returns the current time, that means the version history would be coupled to the cluster-wide checking interval, which seems like a bad idea. Does anyone use this functionality? Maybe we should just have it return the start time and ignore stop? 🤔 (Or just make interval required?)

@owenfarrell
Copy link
Member

tl;dr I think there are legitimate use cases for the existing validation logic (interval and/or stop + start). But I also see the rationale for making stop optional.

Our use case was specifically to trigger infrequent jobs that don't have strict timing requirements (3rd party repository checks, key rotations, etc.).

If it just returns the current time, that means the version history would be coupled to the cluster-wide checking interval

True story. In our scenario, we care that a new version is generated, but we don't particularly care what the version actually is.

Given the interval for our jobs, we don't want to burden Concourse with the management/overhead of running check on a frequent cadence when new versions are rarely (relatively speaking) generated.

So our approach relies on check_every + start/stop (but no interval) to generate exactly 1 new version at some point in the start/stop window.

@vito
Copy link
Member

vito commented Jun 9, 2020

@owenfarrell Gotcha. In that case having it just return the start time (once it's elapsed) sounds reasonable to me. Though maybe we should rename it to at or something since stop isn't really respected.

So there would be 3 use cases:

  • interval: run on a fixed interval
  • interval + start + stop: run on a fixed interval within the time range
  • at: run only at the specified time

How does that sound?

@owenfarrell
Copy link
Member

@vito That makes sense to me.

I have a draft PR out there which addresses the original report (but not at). The logic looks a bit gnarly, but mostly because it's not DRY yet.

Any initial feedback would be appreciated!

@owenfarrell
Copy link
Member

@vito - I could use a bit of guidance on one more scenario that I've run in to (and it relates to @xtremerui's question on #51).

A2: When defining an interval and a start, new versions will be generated based on the start time (not the previous version)

One implication of this assumption is that versions being generated out in the wild today will no longer be valid in the new implementation since start times are defined down to minute-level precision.

There are a couple of knock-on implications that come about as a result:

  1. Strict adherence to the contract for implementing a resource would suggest that if/when provided a version with sub-minute precision, that version is not returned in the response from check since that version is no longer valid.
  2. When specifying an interval, the first execution after upgrading will be a bit delayed. For example: if the last version generated by v1.2.1 of this resource is 2020-06-05T15:12:34.567-04:00, a 1m interval won't generate another version until 2020-06-05T15:14:00-04:00. That's a small scale example, but illustrates the issue at hand.

Before I go any further down this rabbit hole (which is going to have a significant impact on the unit tests for check), I wanted to ground these implications.

Thoughts?

@vito
Copy link
Member

vito commented Jun 10, 2020

@owenfarrell Those both sound like what I would expect. I appreciate the detailed thoughts on this! 🙂

@Kump3r
Copy link

Kump3r commented Aug 30, 2022

Hello all,

We are also facing the same issue.
I am just checking in, that this is also relevant for us.
Thank you for the efforts!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants