Lifting the local lock for dynamic execution

In the previous post, we saw how accounting for artifact download times makes the dynamic strategy live to its promise of delivering the best of local and remote build times.

Or does it? If you think about it closely, that change made it so that builds that were purely local couldn’t be made worse by enabling the dynamic scheduler: the dynamic strategy would always favor the local branch of a spawn if the remote one took a long time. But for builds that were better off when they were fully remote (think of a fully-cached build with great networking), this is not true: the dynamic strategy might hurt them because because we may discard some of those remote cache hits.

Think about this scenario:

Send RPC for remote execution and detect a cache hit.
Start downloading outputs.
Start the local unsandboxed spawn, which grabs the output lock upfront.
Downloads complete, but the lock is taken so the remote branch aborts.
The local unsandboxed spawn now has to run through completion, even if that takes a long time or a lot of local resources (think battery!).

Which, in a picture, looks like this:

Slide 29 of my talk on dynamic execution for BazelCon 2019

Not great. The real problem here is the local unsandboxed strategy. If we were using the sandboxed strategy, all would be good because we’d only take the output lock once the local spawn finished execution. But because we do not use sandboxing for performance reasons, we take the lock right away.

What if we could remove the output lock?

What if, instead, we could make the local strategy run free and make the remote branch actively cancel the local branch when it is ready to put the outputs in place?

Then, we would make as much progress as possible on each branch and stop it at the last minute:

Slide 30 of my talk on dynamic execution for BazelCon 2019

Great, but it turns out that this “simple” idea has been very difficult to implement. Let’s see why.

The first step was to refactor the dynamic strategy to not use the output lock. What we need instead is for each branch to cancel each other right before writing to the output tree. I like calling this “cross-cancellations” and you can compare this to coöperative vs. preëmptive scheduling: using a lock is akin to the former while issuing cancellations is akin to the latter.

But implementing cross-cancellations was such an intrusive change that I did not feel comfortable rolling it out without a flag. And, well, because the change was so extensive, I actually forked the implementation because the flag would have made the majority of the code conditional. This is why we now have the LegacyDynamicSpawnStrategy and the DynamicSpawnStrategy. You can pick which one to use with the --[no]legacy_spawn_strategy flag, and my plan is to remove the legacy one at some point this year, of course.

Other than intrusive, these changes were also so incredibly hard to get right due to concurrency issues. And especially, because cancelling a future and reliably waiting for its completion is difficult. Java’s Interrupts don’t make it easy either. It is even more painful when you have to reliably terminate subprocesses as we already saw.

Anyway. The second step was making the local unsandboxed strategy run without grabbing any kind of lock. A trivial change really, but useless without the new dynamic scheduler. This behavior is gated behind --experimental_local_lockfree_output.

For the most part, this all works now but none of this is enabled by default yet. I’m still encountering some form of race condition where the local branch leaves some files behind after cancellation, which confuses the remote branch… and only seems to happen for tree artifacts. More on this later. And once that is done, I’ll also have to profile and optimize the new DynamicSpawnStrategy as it’s not giving better build times yet.

The eventual side effect of these features is that they will enable us to remove or tighten the --experimental_local_execution_delay hack.

This article is part number 5 of 6 of the Bazel dynamic execution series.