Skip to content

[SE-0526] Revision for cooperative cancellation and deadline composition#3289

Open
phausler wants to merge 2 commits into
swiftlang:mainfrom
phausler:pr/deadline_revision
Open

[SE-0526] Revision for cooperative cancellation and deadline composition#3289
phausler wants to merge 2 commits into
swiftlang:mainfrom
phausler:pr/deadline_revision

Conversation

@phausler
Copy link
Copy Markdown
Contributor

Major highlights of the changes to the proposal:

  1. The DeadlineExceededError was removed in favor of avoiding creating a wrapper error. This makes the error handling considerably more ergonomic and avoids existentials
  2. The currentDeadline was removed since that is not needed for composition, but also caused severe ergnomic issues accessing the any InstantProtocol.
  3. A new interface for CancellationError was added to provide a reasoning for why a task or child task was cancelled. This also includes new cancellation methods for Task, TaskGroup, ThrowingTaskGroup, DiscardingTaskGroup and ThrowingDiscardingTaskGroup.
  4. Additional references were added to other related proposals
  5. Contrasting analysis on the naming and apis were added comparing other languages (Go and Kotlin).

Comment thread proposals/0526-deadline.md Outdated
propagates without drift. Kotlin's `withTimeout` is duration-based, but
Kotlin's coroutine scope carries a single deadline internally and computes the
minimum against any new timeout, which is effectively what the nested
`withDeadline` composition in this proposal achieves explicitly.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

immediately communicates to the reader that a temporal bound is in effect, which aids code
review and debugging. Names centered on the mechanism (`withAutomaticTaskCancellation`)
require the reader to infer the temporal aspect, while names centered on the concept
(`withDeadline`) let the reader infer the mechanism from context.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

public struct CancellationError: Error {
public enum Reason {
case taskCancelled
case deadlineExpired
Copy link
Copy Markdown
Contributor

@ktoso ktoso May 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't we want to include the instant here? Since it's not generic now we could do so without much hassle; it would help show better error messages I think?

I'm split about this, I'm also arguing below that failures should be logged at failure sites and therefore we need not propagate more information about the cancellation anywhere, as we're coupling unrelated pieces of code this way...

At the same time, I kindof am wondering if the instant is harmless enough here, as it would be hard to abuse, vs an open string or any object. 🤔

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The withDeadline method is still generic over a clock and instant. In my opinion, we don't need to include it. The reason I first wrapped the error was mostly because I wanted to differentiate between the two separate failures. I researched what most other languages do for deadline/timeouts and most of them don't provide any additional information besides that the deadline expired.


/// The operation threw an error before the deadline expired.
case operationFailed
case custom(String)
Copy link
Copy Markdown
Contributor

@ktoso ktoso May 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh boy, that's a really big one... Perhaps it'd be best avoided to add this "custom", as this effectively turns the communication entirely bidirectional and I'm a bit worried about coupling this can cause.

There's a few things bad about this, so I'd suggest we skip this entirely.

  1. the bi-directional coupling by parsing exact messages has me worried here; the deadline number is kinda defensible, but more arbitrary information is starting to get worrying.
  2. It opens up the debate if this should be Sendable & Error, but I think that would be equally bad:
    1. the logging of errors should be done at failure sites, so I think we're right to not propagate more information into the cancellation. If we did, we then couple potentially unrelated pieces of code and expect some "random" cancellation handler to be able to do something with the information...

Proposal: let's not include custom(...) at all; but keep the deadlineExpired (nit I personally prefer "exceeded" but expired is fine as well if people like it). There is no inherent need for this, and it opens a lot of unconformable design questions I'm not sure we ought to be solving.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I argued for the custom cancellation reason because this has been a common request. We have many users that asked for being able to pass a cancellation reason when cancelling a task or a task group to aid with understanding where a cancellation comes from. It is not about where the logging is done which often already happens but that logging most often is just Something threw an error: CancellationError(). This allows developers to add a cancellation reason that will help them understand why a certain task got cancelled and it can be propagated up again through the CancellationError.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I always am pretty allergic to those both-ways couplings, in reactive-streams we specifically avoided this back-channel and managed to avoid the same in the JDK still. But perhaps this is different enough from the streams chain case; cancellation there happens a lot, and making them heavy was a concern.

The way I see the String here is that we would not actually encourage libraries to make much use of it, but rather it'd be more like a debugging tool...

There's another problem to discuss though here; if this implies that task eventually people will ask to gain the "task.cancellationCause" which would be really bad as we'd then have to keep those errors around in memory for a lot longer... I think we should immediately argue that we should not offer such cause "storage" in task; similarly, I'd be worried if we then start to grow APIs like isCancelled(because: .deadlineExceeded) which might then grow into something to "get the cause (error)" which would force us into storing the errors -- something I'd rather want to avoid.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would argue that we want to store those in the task and keep them around until the task is cleaned-up. I would like to understand your hesitance around this more. As I tried to allude to above, many Swift server adopters have asked me if we could add such a cancellation reason API including being able to retrieve the cancellation reason from the i.e. Task.cancellationReason: CancellationError.Reason?. This comes from a real need not only when debugging but also for deployed application. It is very common that a CancellationError is thrown and it is often very unclear why that happened. Most of the time it is due to the fact that something cancelled your task e.g. a deadline being exceeded or other business logic determining that a task should be cancelled. Having a cancellation reason makes it a lot easier to understand what is going on in your system both during development and in production.

I am not concerned with the performance of this for those reasons:

  • Cancellation is often the error edge was is not expected to run in hot paths
  • We are already storing the value or the error of a task. Storing a string in a record doesn't seem like it is going to cause a lot of overhead here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one note: the payload for that is actually really difficult to encapsulate into an implementation. So if we do remove that it would make the implementation MUCH easier.

Copy link
Copy Markdown
Contributor

@ktoso ktoso May 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would argue that we want to store those in the task and keep them around until the task is cleaned-up.

This is really stepping outside the scope of this proposal and doesn't feel right to introduce a half baked part of an API like this by smuggling in the cause string here, but not offering the rest of the story -- if we are to move towards heavier cancellation than just a plain signal. It might be added in the future if we decide to do so, separately.

--

I did so some more research on failure reasons; and indeed many runtimes have a string for it (seems Go was pressured into adding it as well huh, and Kotlin/JS both have reasons, though Tokio remains just a signal). So, I'm warming up to the idea but not necessarily the storage part, because as noted earlier this isn't "free" and we should discuss this separately from introducing deadlines IMHO.

--

We are already storing the value or the error of a task. Storing a string in a record doesn't seem like it is going to cause a lot of overhead here.

That's completely unrelated storage though. It's not like we can use the result storage for this; and it cannot be just an on-demand record either, so it would grow all tasks by another record. So yeah it does have wider impact and deserves its own discussion, which doesn't belong to the deadlines introduction.

Let's remove the custom case here, and revisit separately.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving aside the custom storage for a second which I can get behind pulling into a separate proposal. I am wondering how we are implementing this proposal without being able to retrieve the cancellation reason of a task. In particular, let's assume this code:

try await withDeadline(in: .seconds(5)) {
  try await Task.sleep(for: .seconds(10))
}

Task.sleep throws a CancellationError. I would expect that with this proposal the reason on this error is deadlineExceeded. Without being able to retrieve the cancellation reason of your current task though, I don't see how we are able to implement this. Maybe I am missing something here.

A way to side-step all of this is to put the entire reason into a future direction which would mean that for now users aren't able to distinguish a normal cancellation from a deadline based cancellation.

Comment thread proposals/0526-deadline.md
Comment thread proposals/0526-deadline.md Outdated
Comment thread proposals/0526-deadline.md Outdated
public struct CancellationError: Error {
public enum Reason {
case taskCancelled
case deadlineExpired
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The withDeadline method is still generic over a clock and instant. In my opinion, we don't need to include it. The reason I first wrapped the error was mostly because I wanted to differentiate between the two separate failures. I researched what most other languages do for deadline/timeouts and most of them don't provide any additional information besides that the deadline expired.


/// The operation threw an error before the deadline expired.
case operationFailed
case custom(String)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I argued for the custom cancellation reason because this has been a common request. We have many users that asked for being able to pass a cancellation reason when cancelling a task or a task group to aid with understanding where a cancellation comes from. It is not about where the logging is done which often already happens but that logging most often is just Something threw an error: CancellationError(). This allows developers to add a cancellation reason that will help them understand why a certain task got cancelled and it can be propagated up again through the CancellationError.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants