Generate Raptor transfer cache in parallel #6326
base: dev-2.x
Conversation
Codecov Report
Attention: Patch coverage is …

Additional details and impacted files

@@             Coverage Diff              @@
##             dev-2.x     #6326    +/-   ##
============================================
  Coverage      69.85%     69.85%
- Complexity     17921      17931    +10
============================================
  Files           2035       2035
  Lines          76495      76520    +25
  Branches        7824       7826     +2
============================================
+ Hits           53434      53454    +20
- Misses         20324      20326     +2
- Partials        2737       2740     +3

View full report in Codecov by Sentry.
I think this is going to reduce the throughput slightly. The issue is that one request that requires new transfers to be generated will steal processor time from other requests. I am not sure how this affects memory fetches, but it might have a negative effect on running trip searches - at least if a planning request is swapped out in favour of calculating transfers. The threads also lose log-trace-parameters propagation and graceful timeout handling. The parallel processing at least needs to be put behind a feature flag.
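One way to keep the log-trace parameters when work is handed off to worker threads is to copy the SLF4J MDC into each task. A minimal sketch, assuming SLF4J is the logging facade in use; the wrapper class itself is hypothetical and not part of this PR:

```java
import java.util.Map;
import java.util.concurrent.Callable;
import org.slf4j.MDC;

/** Copies the submitting thread's MDC (trace parameters) into the worker thread. */
final class MdcPropagatingTask<T> implements Callable<T> {

  private final Callable<T> delegate;
  // Snapshot of the caller's MDC, taken at construction time on the request thread.
  private final Map<String, String> callerContext = MDC.getCopyOfContextMap();

  MdcPropagatingTask(Callable<T> delegate) {
    this.delegate = delegate;
  }

  @Override
  public T call() throws Exception {
    Map<String, String> previous = MDC.getCopyOfContextMap();
    if (callerContext != null) {
      MDC.setContextMap(callerContext); // restore the request's trace parameters
    }
    try {
      return delegate.call();
    } finally {
      if (previous != null) {
        MDC.setContextMap(previous); // put the worker thread's MDC back as it was
      } else {
        MDC.clear();
      }
    }
  }
}
```

Tasks submitted to an executor would then be wrapped, e.g. `pool.submit(new MdcPropagatingTask<>(task))`, so log lines emitted during transfer calculation still carry the request's trace parameters.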
There is also the possibility of only computing these in parallel before start-up but not after the server is running. I don't know whether this code is used for both cases or not.
I specifically need it to compute in parallel in order to bring our response time down from 4 minutes to 1 minute.
Only check the feature flag during run time, not during start-up.
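A minimal, self-contained sketch of that gating: the start-up computation stays parallel unconditionally, while at run time a feature flag decides. All names and parameters here are hypothetical stand-ins, not OTP's real API:

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.Stream;

final class TransferCacheParallelism {

  /** Start-up computation is always parallel; at run time the feature flag decides. */
  static <T> void computeAll(
    List<T> sources,
    Consumer<T> computeTransfers,
    boolean serverIsRunning,
    boolean parallelFlagEnabled
  ) {
    boolean useParallel = !serverIsRunning || parallelFlagEnabled;
    Stream<T> stream = useParallel ? sources.parallelStream() : sources.stream();
    stream.forEach(computeTransfers);
  }
}
```

Gating only the run-time path limits the thread-contention concern raised above to deployments that explicitly opt in, while start-up still benefits from all available cores.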
I tested this in our dev environment (without parallel routing, so just for start-up). I did not witness the transfer cache processing being faster; it was maybe even slightly slower on the machines we use. I benchmarked on D4ads v5 machines (4 vCPUs) in Azure.
How many transfers do you have, and what is the load on your machine? We are running on our own hardware (we placed a new physical server into the data centre just a few weeks ago).
I tried it with a few different instances with slightly different configurations, around 3-7 cache requests each. I'm not sure about the total number of transfers, but I tried with a countrywide deployment, for example. There shouldn't be much other load on the machines, but I tested these in Kubernetes environments, so there are some kube services running on the machines, and some of the deployments also had real-time updaters configured. I wouldn't mind if someone else could also do some benchmarking to see whether these changes have a positive effect or not.
# Conflicts:
#   application/src/main/java/org/opentripplanner/routing/algorithm/raptoradapter/transit/RaptorTransferIndex.java
I deployed this to one of my environments and I did see a slight speed-up. Definitely not as much as @miklcct is seeing, but still a bit faster.
OTP is designed to run one request in one thread - this optimizes throughput, but has the disadvantage that long-lasting requests take more time. When testing this it is important to have enough requests to "activate" all threads. This can be done using a test client which runs N threads and immediately sends a new request when getting a result back. Then run the same test with and without this change and compare the throughput. I am OK with this after the change where we can turn off this feature in REQUEST_SCOPE.
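A minimal sketch of such a test client, using Java's built-in HttpClient; the endpoint URL, query parameters, thread count, and duration are placeholders:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class ThroughputTestClient {

  public static void main(String[] args) throws Exception {
    int nThreads = 8;                              // enough to keep all server threads busy
    Duration testDuration = Duration.ofMinutes(2); // fixed window for the comparison
    HttpClient client = HttpClient.newHttpClient();
    HttpRequest request = HttpRequest.newBuilder(
      URI.create("http://localhost:8080/otp/routers/default/plan?fromPlace=...&toPlace=...")
    ).GET().build();

    AtomicLong completed = new AtomicLong();
    Instant end = Instant.now().plus(testDuration);
    ExecutorService pool = Executors.newFixedThreadPool(nThreads);
    for (int i = 0; i < nThreads; i++) {
      pool.submit(() -> {
        // Each worker immediately sends the next request as soon as one completes.
        while (Instant.now().isBefore(end)) {
          try {
            client.send(request, HttpResponse.BodyHandlers.discarding());
            completed.incrementAndGet();
          } catch (Exception e) {
            // count failures separately in a real test; ignored in this sketch
          }
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(testDuration.toMinutes() + 1, TimeUnit.MINUTES);
    System.out.println("Requests completed: " + completed.get());
  }
}
```

The completed-request count with and without the change then gives a rough throughput comparison.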
OK, I will retest this in our environment at some point to re-validate how it affects performance in our setup. If we can still observe a slightly negative effect, perhaps the start-up parallelism should be put behind another feature flag.
How many transfers do you have? I now have 5.7M WALK transfers and 16.1M BIKE transfers in my whole graph on the test server, configured with 20-minute transfers for the whole of GB; the graph is 10.3 GB in size.
I don't know the exact number but it's much, much smaller. About 50k stops.
Related to this, at Entur we will roll back PR #6238 - it degrades the performance a lot. The real solution to this problem is to optimize the transfer calculations - they should not touch the AStar paths at all. As part of this I would also like to see what would happen if we calculated the transfers in every request, using a lazy approach. This would possibly have a good effect on large graphs.
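A minimal sketch of that lazy idea: instead of precomputing the whole transfer index for a routing profile, transfers from a stop are computed the first time Raptor asks for them and then memoized. All names are hypothetical; this is not the proposed implementation:

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntFunction;

/** Lazily computed, memoized transfers per stop index. */
final class LazyTransferIndex<T> {

  private final ConcurrentHashMap<Integer, List<T>> transfersByStop = new ConcurrentHashMap<>();
  private final IntFunction<List<T>> calculator;

  LazyTransferIndex(IntFunction<List<T>> calculator) {
    this.calculator = calculator;
  }

  /** Compute on first use; subsequent requests reuse the cached result. */
  List<T> transfersFrom(int stopIndex) {
    return transfersByStop.computeIfAbsent(stopIndex, calculator::apply);
  }
}
```

With computeIfAbsent, concurrent requests for the same stop block until the first computation finishes, so the cost is spread over the requests that actually need each stop instead of being paid all up front.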
Why aren't the performance problems visible in the speed tests?
I am looking forward to this, and it may well be our long-term solution. If it means a local journey with a new configuration can be done in 15 seconds instead of 2 minutes (even if a long-distance journey still requires 2 minutes), we can more easily explain this to our customers.
Does the speed test have requests which generate the run-time Raptor cache on a large transfer graph? It should show up immediately. This PR, combined with the roll-back of #6238, has finally made reasonable bike transfers on the whole of GB realistic, even for a long ride between London Terminals.
Summary
This makes the Raptor transfer cache generation run in parallel.
Our GB-wide deployment pre-caches 4 configurations on start-up. Before applying this fix, it takes 16 minutes to cache the 4 configurations for the whole of GB on a 16-core machine.
After applying this patch, it only takes 5 minutes.
Also, the journey planning response time for a new configuration has been reduced correspondingly, from more than 4 minutes to around 1.5 minutes.
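For illustration, a minimal sketch of the kind of change this implies (not the actual diff): build each stop's transfer list in a parallel stream, since the per-stop lists are independent of each other. `nStops` and `calculateTransfersFrom` are hypothetical stand-ins for the real index-building code:

```java
import java.util.List;
import java.util.function.IntFunction;
import java.util.stream.IntStream;

final class ParallelTransferIndexBuild {

  /** Builds the per-stop transfer lists concurrently; encounter order is preserved. */
  static <T> List<List<T>> build(int nStops, IntFunction<List<T>> calculateTransfersFrom) {
    // Each stop's transfer list is independent of the others, so the work
    // distributes cleanly across the common fork-join pool.
    return IntStream.range(0, nStops)
      .parallel()
      .mapToObj(calculateTransfersFrom)
      .toList();
  }
}
```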
Issue
#6312
Unit tests
None. This is a performance improvement only, with no externally visible change.
Documentation
N/A
Changelog
Bumping the serialization version id
Not needed