Deprecate TripUpdate.schedule_relationship = ADDED, add TripUpdate.schedule_relationship = NEW / REPLACEMENT to specify new / replaced trips which do not run on a schedule from the GTFS static. #504
…elds in TripProperties and StopTimeProperties to support fields needed for such trips
Nice to see some movement in that direction! I think you used a markdown editor that changed the formatting of a lot of tables, which makes it hard to see the actual diff from your proposal. Would it be possible to fix that? You put a lot in the PR description that's not actually in the proposed changes. Is that just to start the discussion? Some of it is quite consequential; for example, I'm a bit puzzled about how a consumer is supposed to ingest ADDED changes like this, with arbitrary trips and no more information than a headsign. Which route is that on? Are those added trips supported only on existing routes in the GTFS? If the answer is no, we're getting quite close to the service change proposal: https://bit.ly/gtfs-service-changes-v3_1 |
Thanks for opening this PR! OTP has had an implementation of ADDED for a long time but its behaviour is severely underspecified. I'd love to formalise it. Yes, OTP allows you to create completely new free-form trips that have no relation to an existing pattern or trip. It tries to match the given route id to an existing one, but if none is in the message a dummy one is created. For once, OTP is really following the "just give us what you have, and we will try to work it out" strategy. The only requirement we have is that the stop ids must match the static GTFS. The question is what should happen when they don't. Should the entire update be dropped, or only individual stops? Does that even need to be specified? I agree with what @gcamp said about the markdown tables and the issue description. Lastly, you might find it easier to get this through review if you split it into two PRs: one for ADDED and one for REPLACEMENT. That's just a guess though. |
I think that the requirement for the whole trip to be specified is written in the code. Let me know if it is not clear enough. I'll fix the formatting later today. |
"The whole journey of the added trip must be specified" is a fact in the core of this PR, noted in the updated definition of
The route and direction of the ADDED trip is specified in |
I do not want to specify the behaviour of missing stops at this moment because it may depend on the client's capability for dynamically adding stops via |
So this is pretty much a codification of what OTP has been supporting for several years. This would of course be very convenient for us, but I would like to hear more voices from the community, in particular producers. I know that HSL (Helsinki) has been using this as both a producer and a consumer (OTP) for many years. MBTA has also indicated that they use ADDED as a producer. @optionsome @jfabi @sam-hickey-ibigroup |
Accept suggestion by @leonardehrenfried for definition of TripUpdate.ScheduleRelationship = ADDED Co-authored-by: Leonard Ehrenfried <[email protected]>
formatting fix Co-authored-by: Leonard Ehrenfried <[email protected]>
Producing it for over 12 years too. |
HSL doesn't produce or consume ADDED or REPLACEMENT updates currently, if that was what you were referring to. |
What happens when you ADD a trip and then CANCEL it again? Should it become invisible in the system ( |
That's a good question. I still need to think about how things will work. My producer implementation cancels an added trip using TripUpdate.schedule_relationship = The questions are that:
|
As we are considering |
Actually, @skinkie is right. If you use FULL_DATASET then the moment you fetch the new version of the RT feed the old ADDED trip will completely vanish and it neither exists as DELETED nor CANCELLED. It's like it never existed. However, once there is movement towards specifying INCREMENTAL we will have to revisit this. |
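To make the FULL_DATASET behaviour described above concrete, here is a minimal sketch of a consumer that rebuilds its real-time state from scratch on every fetch. The `TripUpdate` dataclass and `apply_full_dataset` are hypothetical, simplified stand-ins for illustration only, not the GTFS-Realtime bindings or OTP's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TripUpdate:
    """Hypothetical, simplified stand-in for a GTFS-Realtime TripUpdate."""
    trip_id: str
    schedule_relationship: str  # e.g. "SCHEDULED", "ADDED", "CANCELED"


def apply_full_dataset(updates: list[TripUpdate]) -> dict[str, TripUpdate]:
    """Build the real-time state from one FULL_DATASET feed.

    Nothing is carried over from the previous fetch, so a trip that is
    missing from `updates` simply does not exist in the returned state.
    """
    return {update.trip_id: update for update in updates}


# First fetch: the producer publishes an extra (ADDED) trip.
state = apply_full_dataset([
    TripUpdate("static-1", "SCHEDULED"),
    TripUpdate("extra-1", "ADDED"),
])

# Second fetch: the producer has withdrawn the extra trip by omitting it.
state = apply_full_dataset([TripUpdate("static-1", "SCHEDULED")])

# The ADDED trip is neither CANCELED nor DELETED; it is just gone.
print(sorted(state))
```

With an incremental feed the previous state would have to be kept and patched instead, which is where an explicit cancellation or deletion of the added trip becomes necessary.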
Does anyone know how Google and Apple handle ADDED? @eliasmbd I don't know who the relevant person from Apple would be. Could you tag them? |
@leonardehrenfried at Google the thing was limited to stop sequences previously observed. Hence if the ADDED trip was an instance of a stop sequence that is part of the database, it could be processed. I don't know if it is already capable of processing a partial instance of a stop sequence. https://support.google.com/transitpartners/answer/10106497?hl=en#zippy=%2Cadd-with-tripupdates |
For @miklcct:
I think this is a reasonable concern. I'm supportive of a different name for the new enum to avoid this (perhaps
Yep, that's the nature / risk of being an early adopter for a spec proposal. But easier for these 5 than the 250+ feeds I counted that have specified an ADDED enum in the past four months. For @skinkie:
I'd be sympathetic to this argument if the enum hadn't been around for more than a decade at this point. Plus, I think you give us too much credit for our ability to get that many feeds proactively cleaned up. It's work that needs to be done, but I'd rather it be explicit what kind of feed behavior we are dealing with. |
Apologies, forgot to respond to the question about |
How many of your 250+ feeds are actually using ADDED to specify new services in line with my proposal / OpenTripPlanner implementation? Most of those who gave a +1 are already producing ADDED as a means to specify short-term planned train services, and we can't be sure whether there are many more feeds in the wild using the same implementation as those who have +1'd.
A decade ago, REPLACEMENT was removed from the spec. Unfortunately, Sydney Trains were not notified and they still keep using it because there are no alternatives (they would have voted against such removal if they had known about the problem). Does Google still process REPLACEMENT trips, and are you aware of anyone, apart from Sydney Trains, who has kept producing REPLACEMENT trips over the decade? Are they using the pre-2014 specification, which is hard to implement, or the OpenTripPlanner implementation, which requires specifying the whole trip? The OpenTripPlanner community's finding is that this feature (ADDED and REPLACEMENT) hasn't been used actively for years, and I contributed a number of bug fixes to support my application. I need some consensus to decide how I should progress on this. Should I work on using a new field called NEW instead of ADDED? Similarly, should I use another new field called MODIFIED instead of REPLACEMENT? (I chose these two words because they are what OpenTripPlanner calls these trips in the internal model, but feel free to suggest yours) |
This is also not true. Both were implemented in OTP 1.x; the active OTP development community (read: Entur) moved to OTP 2.x. Many users were still using OTP 1.x, especially for the realtime support. So the Dutch community has been using this since 2013 for railways (major) and for buses (minor). I would prefer that we take this discussion in a versioning direction, rather than introducing new fields. |
I don't know. I can try to do some research on this, but it's the uncertainty that makes me uncomfortable retroactively adding new required semantics for this trip type. |
This is why I think it would be a good idea to start to introduce something like 'compliant with GTFS-RT version' field, so we can proactively distinguish the past and the future. |
There is already a If we want to move forward in terms of versioning, I'll need it to be 2.1 to guarantee the behaviour specified in my proposal (then it will need to be bumped to 2.2 if some other underspecified field, such as the behaviour of incremental feeds, is formally specified). A producer who is aware of this will then have to set the value in the feed to 2.1, while new consumers who are aware of this feature can retain guesswork if the feed version is not specified as 2.1 or above. |
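The consumer side of that versioning idea could be sketched roughly as follows. This is only an illustration under assumptions: the threshold 2.1 is the hypothetical number from the comment above, the version string is assumed to be read from the feed header (for example `FeedHeader.gtfs_realtime_version`), and both function names are made up.

```python
from typing import Optional

# Hypothetical threshold: the comment above suggests 2.1 for these semantics.
STRICT_SINCE = (2, 1)


def parse_version(version: str) -> tuple[int, ...]:
    """Turn "2.1" into (2, 1) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def uses_specified_semantics(feed_version: Optional[str]) -> bool:
    """Return True when the feed declares a version with the specified behaviour.

    A missing or malformed version falls back to False, meaning the consumer
    keeps its existing guesswork for that feed.
    """
    if not feed_version:
        return False
    try:
        return parse_version(feed_version) >= STRICT_SINCE
    except ValueError:
        return False


for declared in ("2.0", "2.1", "2.10", None):
    print(declared, uses_specified_semantics(declared))
```

Comparing integer tuples rather than raw strings avoids treating "2.10" as older than "2.2".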
According to this discussion, Translink was still using REPLACEMENT trips as of 2019. |
I've got lots of thoughts on versioning but the tl;dr is that it's not something I'd support for making frequent backwards-compatible-breaking-for-producers changes to the spec. Which is to say, if we are ever going to bump the major version number for GTFS-Realtime, we'd want to bundle a number of changes together, not just this one. So put another way, if you want to make progress on this proposal in 2025, I don't think versioning is going to be the quickest path. |
One additional general comment: only some of the new fields are marked as experimental. Was this intended? It seems all new fields should be noted as experimental.
gtfs-realtime/spec/en/reference.md
Outdated
@@ -454,12 +475,14 @@ Note that if the trip_id is not known, then station sequence ids in TripUpdate a | |||
|
|||
TripDescriptor.route_id cannot be used within an Alert EntitySelector to specify a route-wide alert that affects all trips for a route - use EntitySelector.route_id instead. | |||
|
|||
If `schedule_relationship` is `ADDED`, `trip_id` must be set to a value that does not exist in the GTFS feed, and `route_id` must be set to associate the trip with a route. `start_date` should be set, and `direction_id` may be set for the new trip. |
This `route_id` must be in the GTFS feed? If yes, that should be stated here.
gtfs-realtime/spec/en/reference.md
Outdated
@@ -164,17 +165,18 @@ Note that the update can describe a trip that has already completed.To this end, | |||
|------------------|------------|----------------|-------------------|-------------------| | |||
| **trip** | [TripDescriptor](#message-tripdescriptor) | Required | One | The Trip that this message applies to. There can be at most one TripUpdate entity for each actual trip instance. If there is none, that means there is no prediction information available. It does *not* mean that the trip is progressing according to schedule. | | |||
| **vehicle** | [VehicleDescriptor](#message-vehicledescriptor) | Optional | One | Additional information on the vehicle that is serving this trip. | | |||
| **stop_time_update** | [StopTimeUpdate](#message-stoptimeupdate) | Conditionally required | Many | Updates to StopTimes for the trip (both future, i.e., predictions, and in some cases, past ones, i.e., those that already happened). The updates must be sorted by stop_sequence, and apply for all the following stops of the trip up to the next specified stop_time_update. At least one stop_time_update must be provided for the trip unless the trip.schedule_relationship is CANCELED, DELETED, or DUPLICATED. If the trip is canceled or deleted, no stop_time_updates need to be provided. If stop_time_updates are provided for a canceled or deleted trip then the trip.schedule_relationship takes precedence over any stop_time_updates and their associated schedule_relationship. If the trip is duplicated, stop_time_updates may be provided to indicate real-time information for the new trip. | | |||
| **stop_time_update** | [StopTimeUpdate](#message-stoptimeupdate) | Conditionally required | Many | Updates to StopTimes for the trip (both future, i.e., predictions, and in some cases, past ones, i.e., those that already happened). The updates must be sorted by stop_sequence, and apply for all the following stops of the trip up to the next specified stop_time_update.<br>If trip.schedule_relationship is SCHEDULED, at least one stop_time_update must be provided for the trip.<br>If trip.schedule_relationship is ADDED or REPLACEMENT, stop_time_updates must be provided for all stops in the added or replacement trip, and the stop times in the static GTFS are not used.<br>If the trip is canceled or deleted, no stop_time_updates need to be provided. If stop_time_updates are provided for a canceled or deleted trip then the trip.schedule_relationship takes precedence over any stop_time_updates and their associated schedule_relationship. If the trip is duplicated, stop_time_updates may be provided to indicate real-time information for the new trip. | |
For ADDED and REPLACEMENT trips, do stop_time_updates need to be provided for all stops at all times when the trip is in the feed, even when those stop_time_updates refer to times in the past? If yes, it would be good to state that to avoid ambiguity.
Yes, the text already says that stop_time_updates must be provided for all stops in the added or replacement trip.
Agreed. But the first line says, “Updates to StopTimes for the trip (both future, i.e., predictions, and in some cases, past ones, i.e., those that already happened).” The “in some cases” implies past times do not need to be provided, and in practice many producers do not publish past stop times. If all times (including those in the past) need to be always provided for ADDED and REPLACEMENT trips, then I would suggest stating that explicitly.
Another thing to consider from https://github.com/google/transit/blob/master/gtfs-realtime/spec/en/trip-updates.md:
You are allowed, but not required, to drop past stop times. Producers should not drop a past StopTimeUpdate if it refers to a stop with a scheduled arrival time in the future for the given trip (i.e. the vehicle has passed the stop ahead of schedule), as otherwise it will be concluded that there is no update for this stop.
Updates can be supplied for both past and future events. The producer is allowed, although not required, to drop past events.
It makes sense that all stops with future times on ADDED and REPLACEMENT trips must be published as there is no GTFS trip to link back to, but is there a reason ADDED and REPLACEMENT trips need to include times in the past?
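One way to see what is at stake in this question: for an ADDED or REPLACEMENT trip the consumer has no static stop list, so the list of `stop_time_update`s is the only source for the trip's pattern. The sketch below uses a hypothetical, simplified `StopTimeUpdate` (not the real protobuf message) and made-up stop IDs and times to show what a consumer sees when a producer drops past stops.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StopTimeUpdate:
    """Hypothetical, simplified stand-in for a GTFS-Realtime StopTimeUpdate."""
    stop_id: str
    departure_time: Optional[int] = None  # POSIX seconds


def stop_pattern(updates: list[StopTimeUpdate]) -> list[str]:
    """Derive the trip's stop list purely from the real-time updates."""
    return [update.stop_id for update in updates]


full = [
    StopTimeUpdate("A", 1000),
    StopTimeUpdate("B", 1300),
    StopTimeUpdate("C", 1600),
]

# A producer that drops past stop times once the vehicle has left stop A.
now = 1200
trimmed = [u for u in full if u.departure_time >= now]

print(stop_pattern(full))     # ['A', 'B', 'C']
print(stop_pattern(trimmed))  # ['B', 'C']
```

From the trimmed feed alone, a consumer cannot tell whether the trip started at A and has already left, or never served A at all, unless it remembers earlier versions of the feed.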
gtfs-realtime/spec/en/reference.md
Outdated
**Fields** | ||
|
||
| _**Field Name**_ | _**Type**_ | _**Required**_ | _**Cardinality**_ | _**Description**_ | | ||
|------------------|------------|----------------|-------------------|-------------------| | ||
| **trip_id** | [string](https://protobuf.dev/programming-guides/proto2/#scalar) | Conditionally required | One | The trip_id from the GTFS feed that this selector refers to. For non frequency-based trips (trips not defined in GTFS frequencies.txt), this field is enough to uniquely identify the trip. For frequency-based trips defined in GTFS frequencies.txt, trip_id, start_time, and start_date are all required. For scheduled-based trips (trips not defined in GTFS frequencies.txt), trip_id can only be omitted if the trip can be uniquely identified by a combination of route_id, direction_id, start_time, and start_date, and all those fields are provided. When schedule_relationship is DUPLICATED within a TripUpdate, the trip_id identifies the trip from static GTFS to be duplicated. When schedule_relationship is DUPLICATED within a VehiclePosition, the trip_id identifies the new duplicate trip and must contain the value for the corresponding TripUpdate.TripProperties.trip_id. | | ||
| **route_id** | [string](https://protobuf.dev/programming-guides/proto2/#scalar) | Conditionally required | One | The route_id from the GTFS that this selector refers to. If trip_id is omitted, route_id, direction_id, start_time, and schedule_relationship=SCHEDULED must all be set to identify a trip instance. TripDescriptor.route_id should not be used within an Alert EntitySelector to specify a route-wide alert that affects all trips for a route - use EntitySelector.route_id instead. | | ||
| **trip_id** | [string](https://protobuf.dev/programming-guides/proto2/#scalar) | Conditionally required | One | The trip_id from the GTFS feed that this selector refers to. For non frequency-based trips (trips not defined in GTFS frequencies.txt), this field is enough to uniquely identify the trip. For frequency-based trips defined in GTFS frequencies.txt, trip_id, start_time, and start_date are all required. For scheduled-based trips (trips not defined in GTFS frequencies.txt), trip_id can only be omitted if the trip can be uniquely identified by a combination of route_id, direction_id, start_time, and start_date, and all those fields are provided. When schedule_relationship is ADDED, it must be specified with a unique value not exist in the GTFS static, When schedule_relationship is DUPLICATED within a TripUpdate, the trip_id identifies the trip from static GTFS to be duplicated. When schedule_relationship is DUPLICATED within a VehiclePosition, the trip_id identifies the new duplicate trip and must contain the value for the corresponding TripUpdate.TripProperties.trip_id. | |
Need to define rules for `trip_id` for REPLACEMENT trips. I think the idea is that `trip_id` for REPLACEMENT trips must match a GTFS `trip_id` and fully replaces the original GTFS `trip_id`.
It's the same as for the other values except ADDED: it matches one of the existing trips.
gtfs-realtime/spec/en/reference.md
Outdated
@@ -215,7 +220,7 @@ The relation between this StopTime and the static schedule. | |||
|-------------|---------------| | |||
| **SCHEDULED** | The vehicle is proceeding in accordance with its static schedule of stops, although not necessarily according to the times of the schedule. This is the **default** behavior. At least one of arrival and departure must be provided. Frequency-based trips (GTFS frequencies.txt with exact_times = 0) should not have a SCHEDULED value and should use UNSCHEDULED instead. | | |||
| **SKIPPED** | The stop is skipped, i.e., the vehicle will not stop at this stop. Arrival and departure are optional. When set `SKIPPED` is not propagated to subsequent stops in the same trip (i.e., the vehicle will stop at subsequent stops in the trip unless those stops also have a `stop_time_update` with `schedule_relationship: SKIPPED`). Delay from a previous stop in the trip *does* propagate over the `SKIPPED` stop. In other words, if a `stop_time_update` with an `arrival` or `departure` prediction is not set for a stop after the `SKIPPED` stop, the prediction upstream of the `SKIPPED` stop will be propagated to the stop after the `SKIPPED` stop and subsequent stops in the trip until a `stop_time_update` for a subsequent stop is provided. | | |||
| **NO_DATA** | No data is given for this stop. It indicates that there is no realtime timing information available. When set NO_DATA is propagated through subsequent stops so this is the recommended way of specifying from which stop you do not have realtime timing information. When NO_DATA is set neither arrival nor departure should be supplied. | | |||
| **NO_DATA** | No data is given for this stop. It indicates that there is no realtime timing information available. When set NO_DATA is propagated through subsequent stops so this is the recommended way of specifying from which stop you do not have realtime timing information. When NO_DATA is set neither arrival nor departure should be supplied. NO_DATA must not be used for added or replacement trips, as the StopTimeUpdate defines the stop list of the trip. | |
Not allowing NO_DATA to be used for ADDED or REPLACEMENT trips means that producers can output ADDED or REPLACEMENT trips only when they have real-time data available for the trip (SCHEDULED = "The vehicle is proceeding in accordance with its static schedule of stops..."). With the addition of `scheduled_time`, there is an opportunity to allow consumers to define these ADDED or REPLACEMENT trips before real-time data is available. This would allow producers to provide valuable data to consumers earlier so customers get the info sooner. What do others think?
I think your idea is good, and I have to change the definition of `NO_DATA` to "no real-time data is given for this stop".
The original behaviour of ADDED trips was never specified. No one actually knew what it meant. How is it a breaking change if there wasn't even a behaviour to break? If you relied on some unspecified behaviour for your systems to work, it is your problem. |
What I've often seen is more like option 1 in the https://gtfs.org/documentation/realtime/examples/migration-duplicated/ where the ADDED trips tripId refers to the static GTFS trip id that is being copied. This would now be strictly disallowed because this proposal states "In this proposal, TripUpdate.schedule_relationship = ADDED should be used to add trips which do not duplicate an existing trip" and "trip_id in the TripDescriptor for added trips must be completely new (not found in GTFS static)" |
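The distinction between those two existing uses of ADDED can be checked mechanically against the static feed. A minimal sketch, assuming the consumer already has the set of static `trip_id`s loaded; the function name and the returned labels are made up for illustration.

```python
def classify_added_trip(trip_id: str, static_trip_ids: set[str]) -> str:
    """Say how an ADDED trip relates to the static GTFS (labels are made up)."""
    if trip_id in static_trip_ids:
        # Duplicate-style usage: the ADDED trip reuses a static trip_id.
        return "copies a static trip"
    # The usage the proposal describes: a trip_id unknown to the static GTFS.
    return "unrelated to static GTFS"


static_trip_ids = {"T1", "T2"}
feed_added_trip_ids = ["T1", "EXTRA-7"]

for trip_id in feed_added_trip_ids:
    print(trip_id, "->", classify_added_trip(trip_id, static_trip_ids))
```

Feeds in the first category are the ones that would become invalid if the new rules were attached to the existing ADDED value.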
I promised I'd come back with a few more data points on my end.
So I come back to my original point. Though ADDED was never formalized, there are a significant number of existing feeds that are using the enum in a way that would become invalid with the proposed change. Thus, I consider this a backwards-compatible-breaking change and I'm still -1 on the proposal as currently written. But as I said, I'm supportive of a well-documented path for feeds that want to specify new trips. I think that's best achieved through a new ScheduleRelationship enum tag. |
Agreed with @bdferris-v2 that even though ADDED was not previously fully defined, the changes proposed here would break existing feeds. I am also in favor of changing this proposal to use a new ScheduleRelationship value. |
The voting period is now over. The result was 5 yes and 1 no. According to the Amendment process, for this vote, unanimous consensus is NOT reached, hence this proposal is not accepted. As the advocate my intention is to continue to work on the proposal. |
Thank you for your continued advocacy and for keeping us informed of the results. Wishing you the best in your ongoing efforts. |
Unfortunately my proposal isn't workable because there are existing feeds and implementations where As a result, I will work on a new version where the use of For |
da8039f
to
4361f12
Compare
I would be more in favor of a version bump than adding a new enumeration. |
I have updated the PR and would like some discussion. I aim to present this for voting (with a NEW enum) in early February. |
|
||
### Using ADDED and NEW entities in the same feed | ||
|
||
If you are a producer who has been using the `ADDED` enumeration to specify trips which are unrelated to the schedule, to avoid disruption to existing consumers it is recommended that you continue to produce `ADDED` entities for these trips but also add `NEW` entities for the same trip. |
I don't like this approach; therefore I think it's good to seriously consider a version bump where the 'ADDED' enumeration is completely removed, as @skinkie suggested.
The use of `TripUpdate.schedule_relationship = ADDED` was unspecified, and different producers / consumers used it in different ways. For example, it is sometimes used to specify additional departures on an existing route, but it is also used to specify departures which can't be matched to any existing trips.
This PR attempts to fully deprecate ADDED and to introduce NEW and REPLACEMENT, based on the implementation of OpenTripPlanner, which specifies the whole journey to be added or replaced. Additional fields, such as headsigns and pickup / drop-off types, are introduced as required to support the full specification of completely new trips.
NEW
In this proposal, `TripUpdate.schedule_relationship = NEW` should be used to add trips which do not duplicate an existing trip. Such trips are considered to be unrelated to any existing trips in the GTFS Static and can serve an arbitrary pattern, including completely new patterns not found in the GTFS Static.
A typical use case is relief trips for extra demand, typically after big events.
NEW trips are intended as a migration target for existing feeds which use ADDED to specify new trips unrelated to the static GTFS, including OpenTripPlanner.
`trip_id` in the `TripDescriptor` for new trips must be completely new (not found in GTFS static) and unique, and a `start_date` should also be specified (I am not using the word "must" here because it is permitted not to specify `start_date` to match scheduled trips; in this case the trip is assumed to run today).
The whole journey of the added trip must be specified, in stop order, as `StopTimeUpdate`s inside the `TripUpdate` without any omission. Fields are added to `TripProperties` and `StopTimeProperties` for essential information such as names, headsigns, and pickup / drop-off types.
REPLACEMENT
I propose to un-deprecate `TripUpdate.schedule_relationship = REPLACEMENT` as well. It works in the same way as `NEW`, except that the `TripDescriptor` must match one instance of a scheduled trip (like other values of `TripUpdate.schedule_relationship`), and that instance is replaced with the complete replacement trip specified in the form of `StopTimeUpdate`s, like an added trip. The original stop times in the GTFS static are not considered by the replacement trip in any form, to avoid confusion. The replacement trip can serve an arbitrary pattern with an arbitrary schedule; the only expectation is that the passenger should recognise the replacement trip as a replacement of the original trip.
A typical use case is a short-term timetable change or a short-term (near real-time) diversion, where the fact that the `trip_id` remains the same can be used by journey planners to notify the user that the booked service has been changed. (In particular, I have successfully used this feature to handle real-time train diversions in GB in OpenTripPlanner and route users to alight at diverted stops, which is something neither Google Maps nor Citymapper can do now.)
This is the behaviour implemented in OpenTripPlanner, which is equivalent to deleting the matched trip and processing the replacement `TripUpdate` as an ADDED trip mentioned above.
Relationship to `TripModification`
`TripModification` provides a way to modify trips en masse by specifying a list of trip IDs to which the same detour can be applied. However, it is not suited to changing the schedule on a per-trip basis, i.e. replacing the trip with a completely different schedule after any diversions with different running times (common due to pathing constraints on railways).
It should be forbidden to modify the same trip via a `REPLACEMENT` trip update and also via a `TripModification`.
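To illustrate how the rules in this description fit together on the consumer side, here is a rough sketch that applies one NEW or REPLACEMENT update to an in-memory timetable. It is an assumption-laden illustration, not OpenTripPlanner's code: `TripUpdate`, `apply_update`, the `timetable` mapping and the `modified_trip_ids` set are all hypothetical, and stops are reduced to bare IDs.

```python
from dataclasses import dataclass, field


@dataclass
class TripUpdate:
    """Hypothetical, simplified stand-in for a GTFS-Realtime TripUpdate."""
    trip_id: str
    schedule_relationship: str  # "NEW" or "REPLACEMENT" in this sketch
    stop_ids: list[str] = field(default_factory=list)  # full journey, in order


def apply_update(
    timetable: dict[str, list[str]],
    static_trip_ids: set[str],
    modified_trip_ids: set[str],
    update: TripUpdate,
) -> None:
    """Apply one update to `timetable`, a mapping of trip_id to stop list.

    `modified_trip_ids` holds trips already targeted by a TripModification.
    Raises ValueError when the update breaks a rule from the description.
    """
    if not update.stop_ids:
        raise ValueError("the whole journey must be specified")

    if update.schedule_relationship == "NEW":
        if update.trip_id in static_trip_ids or update.trip_id in timetable:
            raise ValueError("trip_id of a NEW trip must be new and unique")
    elif update.schedule_relationship == "REPLACEMENT":
        if update.trip_id not in static_trip_ids:
            raise ValueError("REPLACEMENT must match a scheduled trip")
        if update.trip_id in modified_trip_ids:
            raise ValueError("trip is already changed by a TripModification")
        # Equivalent to deleting the matched trip before adding the new one.
        timetable.pop(update.trip_id, None)
    else:
        raise ValueError("unsupported schedule_relationship")

    # From here on, NEW and REPLACEMENT are handled identically.
    timetable[update.trip_id] = list(update.stop_ids)


static = {"T1": ["A", "B", "C"]}
timetable = {trip_id: list(stops) for trip_id, stops in static.items()}

apply_update(timetable, set(static), set(), TripUpdate("X1", "NEW", ["A", "D"]))
apply_update(
    timetable, set(static), set(), TripUpdate("T1", "REPLACEMENT", ["A", "E", "C"])
)
print(timetable)
```

The sketch handles a single feed snapshot; a real consumer would also need to decide what happens to these entries on the next fetch.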