Implement epoll APIs in the JS filesystem #27207
Open
guybedford wants to merge 3 commits into
sbc100
reviewed
Jun 27, 2026
sbc100
left a comment
Collaborator
I think I like this direction.
I've not had time to look at all the details yet, but it seems like a great idea to unify the node events like this.
Adds epoll_create1/epoll_ctl/epoll_wait/epoll_pwait and a non-blocking JS-callback variant, emscripten_epoll_set_callback, on a single fd readiness model shared with poll().
Readiness is source-based: producers (sockets, pipes) post edges to a wait-queue on the FS node, which dup'd fds share. An epoll instance is a real FS fd whose stream holds an interest map (fd -> registration) and a ready list. epoll_ctl ADD arms a persistent listener on the watched node - the registration's edge in the interest graph; on an edge the listener appends the registration to the epoll's ready list (Linux's rdllist) and wakes any waiter. Because a source-based model only learns readiness from edges, epoll_ctl ADD/MOD also samples the current level once, so an fd already ready when watched is reported with no further event needed.
A wait consumes the ready list (Linux's ep_send_events): each listed registration is re-derived against its current mask; level-triggered ones still ready are re-listed at the tail, edge-triggered ones leave until the next edge, and a no-longer-ready (spurious) edge is dropped. A fired EPOLLONESHOT drops its watched-node listener until EPOLL_CTL_MOD re-arms it, so a dead edge carries no traffic. The ready list is an intrusive doubly-linked list, so draining is O(ready) rather than O(registered), and the remainder past maxevents is rotated to the front for round-robin fairness.
emscripten_epoll_set_callback registers a persistent consumer on that same ready list: the runtime delivers the ready set to the callback on each progress, with no blocking and no ASYNCIFY/JSPI. It is armed once (not per spin), re-fires on the next tick while the set stays ready (so level and overflow drain as a blocking epoll_wait loop would), and there is at most one callback per epoll (a second call replaces it; a NULL callback unregisters). Per-fd EPOLLET/EPOLLONESHOT apply unchanged, so a single callback can mix level/edge/oneshot fds.
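The ready-list drain described above can be modelled in a few lines of plain C. This is a minimal sketch, not the PR's JS implementation: `Reg`, `ReadyList`, `ready_push_tail`, and `ready_drain` are hypothetical names, and the `ready_now` flag stands in for re-deriving a registration's current mask. It shows level-triggered items being re-listed at the tail, edge-triggered items leaving the list, spurious edges being dropped, and the unscanned remainder past `maxevents` returning to the front.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical registration record: one per watched fd, linked intrusively. */
typedef struct Reg {
  int fd;
  bool edge_triggered;      /* models EPOLLET */
  bool ready_now;           /* stand-in for re-deriving the current mask */
  bool listed;              /* true while linked into a ready list */
  struct Reg *prev, *next;
} Reg;

/* Circular list with a sentinel head, so link and unlink are O(1). */
typedef struct { Reg head; } ReadyList;

void ready_init(ReadyList *l) { l->head.prev = l->head.next = &l->head; }

/* Called on an edge: append unless the registration is already listed. */
void ready_push_tail(ReadyList *l, Reg *r) {
  if (r->listed) return;
  r->prev = l->head.prev;
  r->next = &l->head;
  l->head.prev->next = r;
  l->head.prev = r;
  r->listed = true;
}

static void ready_unlink(Reg *r) {
  r->prev->next = r->next;
  r->next->prev = r->prev;
  r->prev = r->next = NULL;
  r->listed = false;
}

/* Report up to maxevents ready fds into out[]; returns how many were reported. */
int ready_drain(ReadyList *l, int *out, int maxevents) {
  assert(maxevents > 0);
  /* Detach the current contents into a private scan list. */
  ReadyList scan;
  ready_init(&scan);
  if (l->head.next != &l->head) {
    scan.head.next = l->head.next;
    scan.head.prev = l->head.prev;
    scan.head.next->prev = &scan.head;
    scan.head.prev->next = &scan.head;
    ready_init(l);
  }
  int n = 0;
  while (n < maxevents && scan.head.next != &scan.head) {
    Reg *r = scan.head.next;
    ready_unlink(r);
    if (!r->ready_now) continue;                    /* spurious edge: drop */
    out[n++] = r->fd;
    if (!r->edge_triggered) ready_push_tail(l, r);  /* level: re-list at tail */
  }
  /* Splice the unscanned remainder back in front of the re-listed items. */
  if (scan.head.next != &scan.head) {
    Reg *first = scan.head.next, *last = scan.head.prev;
    first->prev = &l->head;
    last->next = l->head.next;
    l->head.next->prev = last;
    l->head.next = first;
  }
  return n;
}
```

With three level-triggered registrations that stay ready and `maxevents` of 2, successive drains in this model report fds 1,2 and then 3,1, so no registration is starved.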
A blocking epoll_wait (under PROXY_TO_PTHREAD, ASYNCIFY, or JSPI) consumes the same ready list, so a wait and a callback on one epoll take disjoint slices rather than each seeing a private copy. The callback is delivered on the main thread's event loop (under PROXY_TO_PTHREAD use a blocking epoll_wait instead), and keeps the runtime alive only while the set can still fire: once every watched fd is closed the set is terminal and the keepalive is dropped, so no explicit disposal is required (closing the epoll or passing a NULL callback also dispose).
Registrations key on the open file description (the dup-shared stream state), matching Linux: closing a watched fd and reusing its number for a different open does not resurrect the registration onto the new fd. A close (socket, pipe, or a nested epoll) notifies its node, so the watching epoll promptly re-derives and drops the registration - the analog of Linux's eventpoll_release_file walking the watched file's epitem list.
Only sockets and pipes derive real readiness; every other stream type (regular files across MEMFS/NODEFS/NODERAWFS, devices, ttys) has no poll handler and is treated as always readable+writable, so epoll_ctl rejects it with EPERM. This also fixes poll() crashing on a NODERAWFS regular file, whose stream carries no stream_ops at all.
EPOLLEXCLUSIVE distributes its single wakeup across multiple epolls watching one fd (round-robin), which suppresses the thundering herd for that case; suppressing it across multiple waiters on a single epoll is out of scope (one instance, and they already share the ready list).
Known limitations: WASMFS epoll is out of scope (link error); ttys are not pollable (no poll handler), unlike Linux; and eviction of a closed watched fd is keyed on the fd number, so (unlike Linux) a dup that keeps the underlying description alive does not preserve the registration.
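The EPERM rejection of regular files follows stock Linux epoll, where `epoll_ctl` fails with EPERM for a target that does not support epoll. A minimal sketch of a check for that behaviour, assuming a Linux host with a writable temporary directory; the helper name `epoll_add_regular_file_errno` is hypothetical and is not taken from the PR's test suite:

```c
#include <errno.h>
#include <stdio.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Tries to register a regular file with epoll.
   Returns the errno from EPOLL_CTL_ADD, 0 if the add unexpectedly
   succeeds, or -1 if the temporary file or epoll fd cannot be created. */
int epoll_add_regular_file_errno(void) {
  FILE *f = tmpfile();                 /* anonymous regular file */
  if (!f) return -1;
  int ep = epoll_create1(0);
  if (ep < 0) {
    fclose(f);
    return -1;
  }

  struct epoll_event ev = {0};
  ev.events = EPOLLIN;
  ev.data.fd = fileno(f);

  int err = 0;
  if (epoll_ctl(ep, EPOLL_CTL_ADD, fileno(f), &ev) != 0) {
    err = errno;                       /* EPERM on Linux for regular files */
  }

  close(ep);
  fclose(f);
  return err;
}
```

Callers that accept arbitrary fds can treat EPERM as "always ready" and skip registration, since a regular file never blocks in the readiness sense.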
sbc100
reviewed
Jun 30, 2026
Updated version of #27201, based on #27206. Also includes an integrated callback model for #27181, to fully verify the unified wait/callback approach on epoll semantics.
Resolves #5033, #10556.
Adds `epoll_create1`, `epoll_ctl`, `epoll_wait`, `epoll_pwait` and a non-blocking JS-callback variant, `emscripten_epoll_set_callback`, on a single fd readiness model shared with `poll()`.
Readiness in the JS FS system is already event-driven in SOCKFS and PIPEFS. The integration point is the per-inode wait-queue, with each FS node carrying a `listeners` set. Producers then call `notifyNodeListeners(node, flags)` on ready transitions. There is no separate or parallel readiness machinery - it integrates directly with the existing model. `pollOne(fd, events)` is reused on the same readiness definition.
Per standard epoll semantics:
- `epoll_ctl ADD` installs a new listener on the watched node. That listener appends the registration to the epoll's ready list and wakes any waiter. If the fd is already ready when added, it is put on the ready list immediately.
- `epoll_wait` consumes the ready list, re-checking each item against its current mask via `pollOne`.
- `EPOLLONESHOT` clears listeners after firing to avoid unnecessary callback firing. `EPOLL_CTL_MOD` can then re-arm them.
- `EPOLLET` is implemented so that items which remain ready are not re-fired.
- `EPOLLEXCLUSIVE` is passed to listeners, allowing only one wake across multiple epoll listeners to avoid the "thundering herd".
- With `maxevents`, draining follows Linux-like semantics by reporting ready items round-robin. To achieve this without losing performance, a doubly-linked list is used for the registrations. A simpler set / array with copying could be used instead if this approach is not wanted.
To support JS callbacks without JSPI/threads, a new `emscripten_epoll_set_callback` is implemented. It was implemented here to verify its integration with all of the implemented epoll semantics, but could be split out into a separate PR if necessary. It registers a persistent consumer on the same ready list as the epoll - the runtime delivers the ready set to the callback on each progress as if it were responding to an `epoll_wait`, but on the next tick after exiting the stack, with no blocking and no ASYNCIFY/JSPI. It is armed once for the entire epoll, then re-fires on the next tick while the set stays ready (so level and overflow drain as a blocking `epoll_wait` loop would). There is at most one callback per epoll (a second call replaces it; a NULL callback unregisters). Integration with ready-list semantics works out naturally, as the callback is just another consumer of the ready list. `EPOLLET`/`EPOLLONESHOT`/`EPOLLEXCLUSIVE`/`maxevents` all apply to this callback design, so a single callback fully integrates with normal epoll semantics.
Most of the diff is tests, covering these semantics in depth, including error handling, level versus edge reporting, nesting and ELOOP, fd-close auto-removal, JSPI and pthreads, real sockets, and deregistration. For `emscripten_epoll_set_callback`, comprehensive tests are added that run it in parallel with a JSPI blocking `epoll_wait` and verify that both deterministically drain the same ready list: a wait and a callback on one epoll take disjoint slices rather than each seeing private or overlapping copies.
Minor semantic divergences to note:
- `epoll_pwait` ignores `sigmask`
- `epoll_create1` ignores `EPOLL_CLOEXEC`
- `epoll_event` under Wasm in Musl is laid out as aligned 16 rather than x86-64's packed 12 bytes.
PR made with AI assistance, under my review.
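The level versus edge reporting described above can be exercised with a pipe that is made readable before it is registered. This is a minimal sketch against stock Linux epoll, assuming a Linux host; the helper name `count_ready_reports` is hypothetical and this is not one of the PR's tests. It also covers the case where the fd is already ready at `EPOLL_CTL_ADD` time.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Counts how many of `waits` consecutive zero-timeout epoll_wait calls
   report the pipe's read end, without ever reading the pending byte.
   Pass 0 for level-triggered or EPOLLET for edge-triggered.
   Returns -1 if setup fails. */
int count_ready_reports(uint32_t extra_flags, int waits) {
  assert(waits > 0);
  int fds[2];
  if (pipe(fds) != 0) return -1;
  int ep = epoll_create1(0);
  if (ep < 0) {
    close(fds[0]);
    close(fds[1]);
    return -1;
  }

  int reports = -1;
  /* Make the read end ready BEFORE it is registered. */
  if (write(fds[1], "x", 1) == 1) {
    struct epoll_event ev = {0};
    ev.events = EPOLLIN | extra_flags;
    ev.data.fd = fds[0];
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fds[0], &ev) == 0) {
      reports = 0;
      for (int i = 0; i < waits; i++) {
        struct epoll_event out;
        int n = epoll_wait(ep, &out, 1, 0);   /* timeout 0: never blocks */
        if (n == 1 && out.data.fd == fds[0]) reports++;
      }
    }
  }

  close(ep);
  close(fds[0]);
  close(fds[1]);
  return reports;
}
```

On Linux, the level-triggered registration is reported on every wait while the byte stays unread, and the edge-triggered one is reported once until a new write produces another edge.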