Kernel Mount (seaweed-vfs) — Architecture, Performance & Memory
Technical reference for the Kernel Mount — how the kernel module and daemon split the work, why it’s faster than FUSE for cached access, and why the mount process can’t run out of memory as file count grows.
Architecture
seaweed-vfs follows the WEKA model: a thin in-kernel module owns the VFS integration — inodes, dentries, and the Linux page cache — and a userspace daemon (sw-kd) does all SeaweedFS networking (filer gRPC for metadata, volume HTTP for data). The kernel does zero networking; the hard, fast-moving datapath stays in maintainable userspace.
The two halves talk over a /dev/seaweedvfs character device — an OrangeFS-style request/reply channel. The daemon serves that device; the kernel forwards VFS operations to it and caches the results.
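A request/reply channel of this kind usually tags each request so the daemon's reply can be matched to the waiting kernel operation. The sketch below illustrates that idea only. The header layout, the opcode value, and all function names are hypothetical and are not the actual seaweed-vfs wire format, which this page does not describe.

```python
import struct

# Hypothetical layouts (assumptions, NOT the real /dev/seaweedvfs format):
# request header: tag (u64), opcode (u32), path length (u32), little-endian
# reply header:   tag (u64), status (i32, 0 or negative errno), payload length (u32)
REQUEST_HEADER = struct.Struct("<QII")
REPLY_HEADER = struct.Struct("<QiI")

OP_LOOKUP = 1  # illustrative opcode value


def encode_request(tag, opcode, path):
    """Serialize a path-based request: fixed header followed by the UTF-8 path."""
    raw = path.encode("utf-8")
    return REQUEST_HEADER.pack(tag, opcode, len(raw)) + raw


def decode_request(buf):
    """Parse a request back into (tag, opcode, path)."""
    tag, opcode, path_len = REQUEST_HEADER.unpack_from(buf)
    start = REQUEST_HEADER.size
    return tag, opcode, buf[start:start + path_len].decode("utf-8")


def encode_reply(tag, status, payload=b""):
    """Serialize a reply that echoes the request tag."""
    return REPLY_HEADER.pack(tag, status, len(payload)) + payload


def decode_reply(buf):
    """Parse a reply back into (tag, status, payload)."""
    tag, status, payload_len = REPLY_HEADER.unpack_from(buf)
    start = REPLY_HEADER.size
    return tag, status, buf[start:start + payload_len]
```

Echoing the tag is what would let many requests be outstanding at once, since replies can then arrive in any order.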
Performance
Because the module owns the page cache and dentry cache:
- Cached reads and metadata are served in-kernel — no per-operation userspace round-trip. A FUSE client bounces essentially every VFS call out to a userspace process; the kernel mount calls the daemon only on a cache miss or a write.
- Kernel readahead accelerates sequential reads, and repeated reads are served straight from the page cache like any local filesystem.
- An optional io_uring fast path (ublk-style) keeps many requests in flight for high-concurrency workloads; the default read()/write() channel is the fallback.
(Throughput and latency vary by workload and hardware — these are architectural properties, not a published benchmark.)
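The cache-miss-only behaviour can be illustrated with a toy model that counts daemon round-trips. Everything here is an assumption made for illustration: the class names are invented, the "page cache" is a plain dictionary, and the pass-through class models the simplified picture of FUSE given above rather than real FUSE caching options.

```python
class CountingDaemon:
    """Stand-in for the userspace daemon; counts how often it is called."""

    def __init__(self, files):
        self.files = files
        self.round_trips = 0

    def read(self, path):
        self.round_trips += 1
        return self.files[path]


class CachedMount:
    """Kernel-mount-like model: the daemon is called only on a cache miss."""

    def __init__(self, daemon):
        self.daemon = daemon
        self.page_cache = {}  # toy stand-in for the kernel page cache

    def read(self, path):
        if path not in self.page_cache:
            self.page_cache[path] = self.daemon.read(path)
        return self.page_cache[path]


class PassThroughMount:
    """Simplified per-call model: every read crosses into userspace."""

    def __init__(self, daemon):
        self.daemon = daemon

    def read(self, path):
        return self.daemon.read(path)


def count_round_trips(mount_cls, reads=100):
    """Read the same file repeatedly and report daemon round-trips."""
    daemon = CountingDaemon({"/data/a.bin": b"payload"})
    mount = mount_cls(daemon)
    for _ in range(reads):
        mount.read("/data/a.bin")
    return daemon.round_trips
```

In this model, `count_round_trips(CachedMount)` returns 1 and `count_round_trips(PassThroughMount)` returns 100 for the default 100 reads. The model says nothing about real throughput or latency.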
Memory: why it can’t OOM from scale
A FUSE client keeps a per-inode map in its own process. That map is a non-reclaimable heap that grows with the number of files the kernel has looked up; at large file counts it reaches multiple GB and can OOM the mount — ≈6.8 GB RSS was reported at ~33M files in upstream issue seaweedfs#10020.
The kernel mount keeps no per-inode map in the mount process: inode numbers are derived from the path (ino = hash(path)), so the daemon process stays small, about 8 MB regardless of file count (measured flat from 500K to 33M files). Per-file metadata is still cached, but by the OS as reclaimable dentry/inode cache that the kernel evicts under memory pressure. That cache therefore does not cause an OOM on its own, and it is a cost every filesystem client incurs, not a heap unique to this mount.
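A minimal sketch of deriving an inode number from a path without a lookup table follows. The source states only ino = hash(path); the choice of 64-bit FNV-1a and the function name are assumptions for illustration, and the real module may use a different hash.

```python
# 64-bit FNV-1a constants (standard published values).
FNV_OFFSET_BASIS = 0xCBF29CE484222325
FNV_PRIME = 0x100000001B3
MASK_64 = 0xFFFFFFFFFFFFFFFF


def path_to_ino(path):
    """Map a path to a 64-bit number using FNV-1a (assumed hash, see lead-in).

    The result depends only on the path bytes, so no per-inode table
    is needed to recompute it after a daemon restart.
    """
    h = FNV_OFFSET_BASIS
    for byte in path.encode("utf-8"):
        h ^= byte
        h = (h * FNV_PRIME) & MASK_64
    return h
```

Because the mapping is a pure function of the path, the same path always yields the same number. Any hash can collide for distinct paths; how seaweed-vfs handles that case is not covered here.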
In short: the part that grows with file count is reclaimable kernel cache, not an unbounded userspace heap — so the mount process can’t be the thing that runs you out of memory.
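As a rough sanity check on the figures quoted in this section, the per-file cost can be computed directly. This is back-of-the-envelope arithmetic only: it reads "GB" and "MB" as decimal units (an assumption) and treats the whole RSS as per-file state, which overstates the true per-entry size.

```python
# Figures quoted in this section.
fuse_rss_bytes = 6.8e9     # "≈6.8 GB", read as decimal gigabytes (assumption)
daemon_rss_bytes = 8e6     # "about 8 MB", read as decimal megabytes (assumption)
file_count = 33e6          # "~33M files"

# Average resident bytes per looked-up file in each case.
fuse_bytes_per_file = fuse_rss_bytes / file_count
daemon_bytes_per_file = daemon_rss_bytes / file_count

print(f"FUSE client: ~{fuse_bytes_per_file:.0f} bytes per file")
print(f"sw-kd:       ~{daemon_bytes_per_file:.2f} bytes per file")
```

The first figure comes out at roughly 200 bytes per file, which is plausible for a per-inode map entry. The second is well under one byte per file, consistent with the daemon holding no per-file state at all.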
Stateless daemon
The kernel↔daemon protocol is path-based — each request carries its target path(s), so the daemon holds no inode map and a restart is transparent. You can restart or upgrade sw-kd without unmounting: in-flight requests get -ENOTCONN and the daemon re-attaches (a brief I/O pause, no unmount).
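If the -ENOTCONN result reaches the calling application during a daemon restart (an assumption; this page does not say whether the kernel retries internally), a caller could ride out the brief pause with a small retry wrapper. The helper name, attempt count, and delay are illustrative choices, not part of seaweed-vfs.

```python
import errno
import time


def retry_on_enotconn(operation, attempts=5, delay_seconds=0.2):
    """Call operation(), retrying only when it fails with ENOTCONN.

    Any other OSError, or ENOTCONN on the final attempt, is re-raised.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except OSError as exc:
            is_last = attempt == attempts - 1
            if exc.errno != errno.ENOTCONN or is_last:
                raise
            time.sleep(delay_seconds)


def read_file(path):
    """Example operation: read a whole file from the mount."""
    with open(path, "rb") as handle:
        return handle.read()
```

A caller would wrap an operation as `retry_on_enotconn(lambda: read_file("/mnt/seaweed/some/file"))`, where the mount path is a placeholder.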
Requirements & limits
- Linux kernel 6.1 or newer, x86_64 or arm64 (CI validates 6.1 LTS → 7.0).
- A reachable filer gRPC endpoint (filer HTTP port + 10000; default 18888).
- Under Secure Boot, the module must be signed by an enrolled key: DKMS signs with a per-host key and prompts enrollment; precompiled modules use a key you enroll once with mokutil.
- The kernel does no networking; all SeaweedFS I/O (and any future RDMA datapath) lives in the daemon.
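The kernel-version, architecture, and port rules above can be expressed as a small preflight check. This is a sketch, not a shipped tool: the function names are hypothetical, and it assumes arm64 hosts report their machine type as "aarch64" or "arm64".

```python
import platform
import re

MINIMUM_KERNEL = (6, 1)
# arm64 Linux hosts commonly report "aarch64"; both spellings are accepted.
SUPPORTED_ARCHES = {"x86_64", "aarch64", "arm64"}


def kernel_supported(release, minimum=MINIMUM_KERNEL):
    """Return True if a release string such as '6.8.0-45-generic' meets the minimum."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        return False
    return (int(match.group(1)), int(match.group(2))) >= minimum


def arch_supported(machine):
    """Return True for the architectures listed in the requirements."""
    return machine in SUPPORTED_ARCHES


def filer_grpc_port(http_port=8888):
    """Apply the 'filer HTTP port + 10000' rule; 8888 gives the default 18888."""
    return http_port + 10000


def host_supported():
    """Check the running host against the kernel and architecture requirements."""
    return kernel_supported(platform.release()) and arch_supported(platform.machine())
```

The check does not cover Secure Boot key enrollment or filer reachability, both of which still need to be verified separately.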
See Kernel Mount for install and mount instructions, and Kernel Mount operations for manual install, building from source, deploy, and upgrade details.