Skip to content

fix(vsock): restore listener after accept#2434

Open
archevel wants to merge 27 commits into
hermit-os:mainfrom
archevel:fix-issue-with-vsock-accept
Open

fix(vsock): restore listener after accept#2434
archevel wants to merge 27 commits into
hermit-os:mainfrom
archevel:fix-issue-with-vsock-accept

Conversation

@archevel

Copy link
Copy Markdown

After accepting a connection, move the accepted socket to an ephemeral port and reset the listener entry to Listen state. Track the listen port separately in Socket so subsequent accept calls always find the listener entry regardless of how many connections have been accepted.

Fixes #2433

@archevel

Copy link
Copy Markdown
Author

As a reference I made this pull request as well with a regression test:
hermit-os/hermit-rs#996

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Benchmark Results

Details
Benchmark Current: 7ac357f Previous: badd4c4 Performance Ratio
startup_benchmark Build Time 82.78 s 81.10 s 1.02
startup_benchmark File Size 0.80 MB 0.80 MB 1.00
Startup Time - 1 core 0.74 s (±0.03 s) 0.76 s (±0.03 s) 0.98
Startup Time - 2 cores 0.75 s (±0.02 s) 0.77 s (±0.02 s) 0.98
Startup Time - 4 cores 0.77 s (±0.02 s) 0.77 s (±0.02 s) 1.01
multithreaded_benchmark Build Time 79.20 s 83.97 s 0.94
multithreaded_benchmark File Size 0.90 MB 0.90 MB 1.00
Multithreaded Pi Efficiency - 2 Threads 89.65 % (±7.02 %) 89.33 % (±7.36 %) 1.00
Multithreaded Pi Efficiency - 4 Threads 44.54 % (±3.33 %) 43.68 % (±2.83 %) 1.02
Multithreaded Pi Efficiency - 8 Threads 26.22 % (±1.54 %) 25.39 % (±2.06 %) 1.03
micro_benchmarks Build Time 77.31 s 82.38 s 0.94
micro_benchmarks File Size 0.90 MB 0.90 MB 1.00
Scheduling time - 1 thread 64.76 ticks (±2.47 ticks) 67.71 ticks (±4.24 ticks) 0.96
Scheduling time - 2 threads 36.78 ticks (±5.03 ticks) 38.11 ticks (±5.12 ticks) 0.96
Micro - Time for syscall (getpid) 4.15 ticks (±0.75 ticks) 4.23 ticks (±0.74 ticks) 0.98
Memcpy speed - (built_in) block size 4096 85205.99 MByte/s (±58823.11 MByte/s) 79468.16 MByte/s (±55116.80 MByte/s) 1.07
Memcpy speed - (built_in) block size 1048576 30557.55 MByte/s (±24664.48 MByte/s) 30161.95 MByte/s (±24453.66 MByte/s) 1.01
Memcpy speed - (built_in) block size 16777216 29333.44 MByte/s (±24140.41 MByte/s) 26353.33 MByte/s (±21799.52 MByte/s) 1.11
Memset speed - (built_in) block size 4096 84757.63 MByte/s (±58536.59 MByte/s) 79293.80 MByte/s (±54991.57 MByte/s) 1.07
Memset speed - (built_in) block size 1048576 31307.78 MByte/s (±25103.65 MByte/s) 30914.27 MByte/s (±24888.27 MByte/s) 1.01
Memset speed - (built_in) block size 16777216 30116.00 MByte/s (±24595.84 MByte/s) 27104.01 MByte/s (±22268.97 MByte/s) 1.11
Memcpy speed - (rust) block size 4096 73222.72 MByte/s (±51115.91 MByte/s) 75054.08 MByte/s (±52569.58 MByte/s) 0.98
Memcpy speed - (rust) block size 1048576 30295.14 MByte/s (±24516.39 MByte/s) 30128.07 MByte/s (±24471.52 MByte/s) 1.01
Memcpy speed - (rust) block size 16777216 29452.94 MByte/s (±24249.44 MByte/s) 26810.93 MByte/s (±22268.16 MByte/s) 1.10
Memset speed - (rust) block size 4096 73732.13 MByte/s (±51496.31 MByte/s) 76095.53 MByte/s (±53191.49 MByte/s) 0.97
Memset speed - (rust) block size 1048576 31054.17 MByte/s (±24960.97 MByte/s) 30853.29 MByte/s (±24883.30 MByte/s) 1.01
Memset speed - (rust) block size 16777216 30225.26 MByte/s (±24693.61 MByte/s) 27563.43 MByte/s (±22730.99 MByte/s) 1.10
alloc_benchmarks Build Time 76.30 s 76.59 s 1.00
alloc_benchmarks File Size 0.88 MB 0.88 MB 1.00
Allocations - Allocation success 91.31 % 91.31 % 1
Allocations - Deallocation success 100.00 % 100.00 % 1
Allocations - Pre-fail Allocations 61.44 % 61.44 % 1
Allocations - Average Allocation time 4668.00 Ticks (±130.56 Ticks) 3943.99 Ticks (±137.69 Ticks) 1.18
Allocations - Average Allocation time (no fail) 5436.63 Ticks (±152.21 Ticks) 4914.38 Ticks (±99.02 Ticks) 1.11
Allocations - Average Deallocation time 1068.38 Ticks (±220.09 Ticks) 1615.85 Ticks (±252.34 Ticks) 0.66
mutex_benchmark Build Time 78.42 s 79.65 s 0.98
mutex_benchmark File Size 0.90 MB 0.90 MB 1.00
Mutex Stress Test Average Time per Iteration - 1 Threads 12.18 ns (±0.43 ns) 12.20 ns (±0.45 ns) 1.00
Mutex Stress Test Average Time per Iteration - 2 Threads 43.40 ns (±2.12 ns) 41.90 ns (±2.35 ns) 1.04

This comment was automatically generated by workflow using github-action-benchmark.

@mkroening mkroening self-assigned this May 16, 2026
@mkroening mkroening self-requested a review May 16, 2026 17:06
@archevel archevel force-pushed the fix-issue-with-vsock-accept branch from 457df22 to 1391ad4 Compare May 28, 2026 09:16

@mkroening mkroening left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for finding and fixing this bug! :)

Sorry for the late reply. I can reproduce the fix.

Regarding the implementation:
If I understand correctly, your changes now track two ports for listening sockets, one for the listener and one for the last connection, right? This does not support multiple open connections at the same time, right?

Perhaps doing something similar to TCP would be more useful: return a proper connection FD from accept instead of the current dummy NullSocket.

Regarding the test:
Could you adapt the existing vsock tests instead?

@archevel

Copy link
Copy Markdown
Author

No worries! I appreciate you taking the time to have look. I've kept working on this locally and had to do some additional changes. Those are out of scope for this I think, as they involved needing to patch libc and mio to handle epoll for hermit. I'm working on that in order to be able to use firecracker to run a hermit based vm and communicate over vsock. Anyway, that is a larger piece of work, but I think I actually ended up aligning this more with how the TCP is handled. I'll take a peek and adjust this PR.

archevel added 7 commits June 20, 2026 12:07
After accepting a connection, move the accepted socket to an ephemeral
port and reset the listener entry to Listen state. Track the listen port
separately in Socket so subsequent accept calls always find the listener
entry regardless of how many connections have been accepted.

Fixes hermit-os#2433
@archevel archevel force-pushed the fix-issue-with-vsock-accept branch from 1f35b4d to e06d128 Compare June 20, 2026 13:48
@archevel

Copy link
Copy Markdown
Author

I've rebased the changes on main. I then changed to align it more with how the tcp socket works (handling multiple concurrent connections). Let me know if there are any other issues I should address/fix! The CI failed, so I'll take a peek and see if I understand why.

@archevel

Copy link
Copy Markdown
Author

I think the CI fails because it isn't aligned with the changes in hermit-os/hermit-rs#996

@archevel

archevel commented Jun 20, 2026

Copy link
Copy Markdown
Author

@mkroening I realized that the code doesn't really handle multiple cids currently. E.g. If the host assigns two vsock cids, say 3 and 4. Currently listening on either cid on port 1234 will get connections for both I think. Might be a separate issue to solve.

Edit: I realized this wasn't a real issue since qemu can only assign a single cid to a guest with vhost-vsock. However, the code did not fail on binding on a non-assigned cid. I've now added so that a bind on e.g. cid 4 when cid 3 was assigned fails. Binding on VMADDR_CID_ANY works. Arguably there might be a case for allowing binding on CID 1 (loopback), but since hermit apps as far as I understand are essentially single processes there ought to be better ways to handle internal messages....

@archevel archevel requested a review from mkroening June 25, 2026 13:49
@archevel

Copy link
Copy Markdown
Author

@mkroening I'm not entierly familiar with the setup, but this PR won't pass the CI unless hermit-os/hermit-rs#996 is merged. However, merging that will likely cause OTHER PRs to start failing before merging this PR... Is there anything I should amend to make this smoother?

@archevel

Copy link
Copy Markdown
Author

I ended up using claude to aid me with the accept backlog and try to align it with the tcp code. Not sure if that is acceptable? In the process I discovered that mio doesn't handle select in the same way for the native socket types (e.g. TcpStream wrapped in IoSource) as it does for a raw file descriptor in a mio::unix::SourceFd. This causes an issue where the reader is never woken up even when data is arriving. I made a workaround in the app I am working on. Should I raise it as an issue against the kernel? Not sure exactly where the fix should go (might be in mio).

archevel and others added 12 commits July 1, 2026 09:40
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
MSI-X is required to support cloud-hypervisor, as legacy interrupts are
not supported. Additionally, it allows us to set separate handlers for
device configuration interrupts and interrupts of each queue (though
currently we only operate on the queue type granularity). It also seems
to have higher performance.
Verifying the checksum of fragmented packets would require reassembling them,
which we do not want to do in the kernel because of its complexity. IP
fragmentation is "considered fragile" anyways (see RFC 8900).
The value for the receive buffer size when GUEST_TSO{4,6} is negotiated has been
updated in the version 1.4 of the specification in order to allow for larger
packets on IPv6.

IPv6 headers do not eat away from the range of the 16-bit IP packet length
field, which affords 40 additional bytes. The actual increase in the value is 39
instead of 40 because the maximum for the header field is 65535 and not 65536,
which I guess was overlooked when the previous version of the specification was
written.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

vsock listener cannot accept a second connection

4 participants