So from @wingo's blog I have learned about Concurrent ML. (His concurrency tag is also worth perusing generally.)
The core insight, IIUC, is to take the atomic blocking operations – roughly, the same ones our current cancellation guarantee applies to – and systematically split them into a few consistent sub-operations:
- attempt to perform the operation synchronously, possibly wrapping the result in some thunk
- publish that we're waiting for the operation to be doable and should be woken up if it might succeed
- give up on waiting ("unpublish")
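To make that concrete, here's roughly what the split might look like as an interface (a sketch with made-up names, not an actual trio API):

```python
from abc import ABC, abstractmethod
from typing import Callable, Optional

class Operation(ABC):
    """One atomic blocking operation, split into the three
    sub-operations described above (names are illustrative)."""

    @abstractmethod
    def attempt(self) -> Optional[Callable[[], object]]:
        """Try to perform the operation synchronously. Return a thunk
        wrapping the result on success, or None if it would block."""

    @abstractmethod
    def publish(self, wakeup: Callable[[], None]) -> None:
        """Register that we're waiting: arrange for `wakeup` to be
        called whenever the operation might now succeed."""

    @abstractmethod
    def unpublish(self) -> None:
        """Give up on waiting -- basically our abort callback."""
```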
We already implicitly have this basic structure open-coded in a bunch of places, e.g. the `_try_sync` helper in `trio/socket.py`, the classes in `trio/_sync.py`, etc. Pretty much any place you see the `yield_if_cancelled`/`yield_briefly_no_cancel` pair in trio fits this general pattern, and "unpublish" is basically just our abort callback. So part of the advantage of reifying this would be to simplify the code, by having a single implementation of the overall pattern that we could slot things into – but even more, because given the above pieces you can create generic implementations of three variants:
- Try to perform the operation, but fail if it would block (like the `*_nowait` variants that we currently implement in an ad hoc way)
- Perform the operation, blocking as necessary (what we do now)
- A thing like golang's `select`, where you can say "perform exactly one of the following operations: ..."
(The first is done by just calling the "try this" sub-operation; the second you do by trying and then blocking if you fail, in a loop, with the unpublish operation as the abort callback; the third is done by calling a bunch of "try this" sub-operations, and if one succeeds you're done, otherwise you publish all the operations and go to sleep. There are some subtleties around knowing which operation woke you up, when unpublish happens, etc., but that's the basic idea.)
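Here's a rough sketch of those three generic implementations on top of the interface above, glossing over the which-operation-woke-us and lost-wakeup subtleties (illustrative only; the real thing would plug into trio's parking/abort machinery rather than `trio.Event`):

```python
import trio

class WouldBlock(Exception):
    pass

def perform_nowait(op):
    # Variant 1: try once, and fail rather than block.
    thunk = op.attempt()
    if thunk is None:
        raise WouldBlock
    return thunk()

async def perform(op):
    # Variant 2: try, and on failure publish + sleep, in a loop.
    while True:
        thunk = op.attempt()
        if thunk is not None:
            return thunk()
        woken = trio.Event()
        op.publish(woken.set)
        try:
            await woken.wait()   # cancellation would fire unpublish
        finally:
            op.unpublish()       # assume this is harmless after wakeup

async def select(ops):
    # Variant 3: perform exactly one of `ops`.
    while True:
        for op in ops:
            thunk = op.attempt()
            if thunk is not None:
                return thunk()
        woken = trio.Event()
        for op in ops:
            op.publish(woken.set)
        try:
            await woken.wait()
        finally:
            for op in ops:
                op.unpublish()
```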
Right now we have a bunch of manual implementations of `await x()` / `x_nowait()` primitives. It's not clear that we have enough, either; #14 has an example of a case where you semantically need `accept_nowait`, and for HTTP client connection pooling, when you pull a connection out of the pool you need something like `receive_some_nowait` to check whether the server has closed it before you try to use it.
Also, a golang-style `select` is potentially quite useful but isn't possible right now (at least for trio's built-in operations – of course you could build a golang-style channels library on top, but then `select` would only work on that library's operations). You can spawn some child tasks to try doing all the things concurrently, but there's no way to make sure that no more than one of them completes – for that you'd need to be able to (1) guarantee that all of the operations really are cleanly cancellable, and (2) perform the cancellation synchronously with the first operation completing, which isn't possible if the last thing an operation does after committing its work is to call `yield_briefly_no_cancel`.
An ancillary benefit is that if we expose these things as a standard interface to users, this would also serve as very clear documentation of which operations are actually the atomic cancellable ones.
But, there are also some issues, I think:
IOCP: the above pattern works for BSD-style non-blocking socket operations, but not for Windows IOCP operations (#52). You can implement cancellable blocking operations as IOCP calls (that's basically what they are), and nowait operations using Windows' non-blocking calls, but IOCP has no generic equivalent to epoll for implementing wake-me-when-this-non-blocking-call-might-succeed, which means that golang-style `select` is not possible. All you can do is ask the kernel to start initiating all the operations, and then by the time you find out that, say, your `recv` has finished, your `send` might also have finished. I guess this might be possible to work around: for `send` I'm pretty sure we can treat IOCP as a kind of epoll and then use non-blocking send. For `recv` I'm not sure if this works, and for `accept` I'm pretty sure it doesn't, but for these operations you can more-or-less fake them as being always cancellable by using a little user-space buffer to push back any result that you want to pretend didn't happen. Are there any other cases? Do we need to change the pattern to accommodate this? The pushback trick doesn't seem compatible with a strict separation between "try it" and "wake me when you want to try again" primitives.
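For what it's worth, the pushback part of that trick could be as simple as this (purely illustrative; `raw_recv` stands in for whatever IOCP-backed receive we'd actually have underneath):

```python
class PushbackRecv:
    """Wraps a receive so that a completion we want to pretend
    didn't happen can be stashed and replayed to the next caller."""

    def __init__(self, raw_recv):
        self._raw_recv = raw_recv  # hypothetical IOCP-backed recv
        self._pushback = b""

    def push_back(self, data):
        # Called when a recv completed but we've decided to act as
        # if it hadn't (e.g. it "lost" a select, or was cancelled
        # after the kernel had already delivered bytes).
        self._pushback = data + self._pushback

    async def recv(self, max_bytes):
        # Serve pushed-back bytes before touching the kernel again.
        if self._pushback:
            data = self._pushback[:max_bytes]
            self._pushback = self._pushback[max_bytes:]
            return data
        return await self._raw_recv(max_bytes)
```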
The main problem with HTTP connection pooling isn't checking whether the socket is readable – we can already do that in the specific case of a raw socket. It's that this isn't something you can reasonably abstract in terms of generic "streams". In particular, if you have a TLS-wrapped socket, then you actually don't want to check whether the TLS layer has data available; you really do want to check the raw socket underneath. And in any case I'm not sure this would help make it easier to implement `receive_some_nowait` as a generic feature on streams, because receiving on a TLS stream is a very complex operation that may require things like lock acquisition, and all of the sub-operations above have to be synchronous. So maybe the HTTP connection pool case isn't really a good motivator anyway.
`Lock.acquire` and `Lock.acquire_nowait` are tricky because of fairness (#54); it's not the case that `acquire` is just a loop like `while acquire_nowait fails: sleep until it might succeed`, because that creates a race on wakeup. I don't think it's possible to implement a fair mutex using just the framework described above. The problem is that we really need the two operations to be "try it" and "block until the operation is done"; a retry loop just doesn't work. So maybe this is actually equivalent to the IOCP case? Maybe we need primitives:
- try it without executing any cancel/schedule point
- publish whatever we need to publish to be woken up (but don't actually sleep), with some callbacks for abort, reschedule, and we-got-successfully-rescheduled
I think this is flexible enough to implement fair synchronization primitives and handle all the public operations. E.g. for golang-style `select`, we'd want to arrange things so that when one operation gets rescheduled, we immediately abort all the other operations before waiting to actually be woken up – and this would need to happen from the context of the task that's handing off the mutex (for example).
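As a sanity check that these two primitives are enough for fairness, here's a sketch of a FIFO lock where `release()` hands ownership directly to the next waiter instead of waking it up to retry, so there's no race for a barging task to win (`waiter.reschedule()` is a stand-in for whatever reschedule callback the real machinery would provide):

```python
from collections import deque

class WouldBlock(Exception):  # as in the earlier sketch
    pass

class FairLock:
    def __init__(self):
        self._locked = False
        self._waiters = deque()

    def acquire_nowait(self):
        # "Try it", with no cancel/schedule point.
        if self._locked:
            raise WouldBlock
        self._locked = True

    def acquire_publish(self, waiter):
        # "Publish": park the waiter. Its reschedule callback will be
        # invoked by release() with ownership already transferred, so
        # there's nothing left to race over when it wakes up.
        self._waiters.append(waiter)

    def release(self):
        assert self._locked
        if self._waiters:
            # Direct handoff: the lock never becomes free, so a
            # concurrent acquire_nowait can't barge in front.
            waiter = self._waiters.popleft()
            waiter.reschedule()
        else:
            self._locked = False
```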
But... this isn't quite right for stuff like non-blocking socket operations, where you actually need a try-sleep-retry-sleep-retry-... loop. Need to think some more about this.