-
These are not huge numbers. I regularly analyse codebases larger than that (although the number of directories seems excessive; maybe it was counted wrong). I'm not familiar with the virtual environment tools you mention, but I'm inclined to believe this is a problem with your machine, environment, or setup.
-
Root Cause Analysis: "Too many open files" on macOS Docker (VirtioFS)
Issue: #14307
Executive Summary
Environment: 12 CPU cores, macOS + OrbStack/VirtioFS, ~24,000 PHP files in ~9,200 dirs + ~16,892 vendor files in ~4,603 dirs. Docker container ulimits explicitly set to
The core cause is 8 worker processes each independently performing redundant filesystem scanning and file hashing during initialization. This is the only variable that changes between failing cold runs (8 workers) and succeeding warm runs (1-2 workers). With 12 CPU cores, the
Identified exhaustion mechanism: The EMFILE error originates from the macOS HOST side, not the container. Apple's VirtioFS implementation (via
Proven Facts (from code analysis)
1. Worker Initialization Call Chain
Each worker process follows this path:
WorkerCommand::execute()
-> CommandHelper::begin()
-> ContainerFactory::create()
-> postInitializeContainer()
-> getService('betterReflectionSourceLocator')
-> BetterReflectionSourceLocatorFactory::create()
-> for each directory: OptimizedDirectorySourceLocatorRepository::getOrCreate()
-> OptimizedDirectorySourceLocatorFactory::createByDirectory()
   1. FileFinder::findFiles()
   2. hash_file() on ALL files
   3. createCachedDirectorySourceLocator()
      -> cache->load() [FileCacheStorage - disk]
      -> symbolFinderInFiles->findSymbols() [only on cache miss]
      -> cache->save()
2. PHP Internal FD Behavior at Each Step
Symfony Finder (
3. FD Inheritance via proc_open()
PHP's
However, PHP's
The spawning loop in
Inherited FDs: 7 × 2 tmpfile FDs (14) + 1 TCP server socket + ~5 base FDs = ~20 inherited FDs for worker 8. These inherited FDs are useless to the child (it never uses them) but consume FD table entries.
4. Actual Peak FD Count Per Process
Worker 8 (worst case):
Worker 1 (best case): ~15-20 FDs (minimal inheritance). Main process (after all workers spawned): ~30 FDs. With a container ulimit of 1,048,576, even the worst-case ~43 FDs per process is negligible. EMFILE cannot originate from inside the container.
5. Source Locator Cache IS Shared via Filesystem
Even on a cold run, workers skip
6. Worker Count Depends on File Count
// Scheduler.php:48-53
$numberOfProcesses = min(
    max((int) floor(count($jobs) / $this->minimumNumberOfJobsPerProcess), 1),
    $cpuCores,
);
$usedNumberOfProcesses = min($numberOfProcesses, $this->maximumNumberOfProcesses);
// maximumNumberOfProcesses default = 8 (conf/config.neon:95)
With 12 CPU cores:
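The expression quoted from Scheduler.php can be traced for a few job counts with a small sketch. It is written in Python purely as an illustration of the arithmetic, not as PHPStan code. The cap of 8 processes is the value stated in the snippet's comment; `minimum_jobs_per_process=2` and the job counts are assumed values chosen for the example, not taken from this thread.

```python
def used_number_of_processes(job_count, cpu_cores,
                             minimum_jobs_per_process, maximum_processes):
    """Mirror the min/max expression quoted from Scheduler.php."""
    number_of_processes = min(
        # Integer division plays the role of (int) floor(...) in the PHP code.
        max(job_count // minimum_jobs_per_process, 1),
        cpu_cores,
    )
    return min(number_of_processes, maximum_processes)


# Hypothetical job counts; minimum_jobs_per_process=2 is an assumption.
for jobs in (1, 4, 16, 400):
    workers = used_number_of_processes(
        jobs, cpu_cores=12, minimum_jobs_per_process=2, maximum_processes=8
    )
    print(f"{jobs} job(s) -> {workers} worker(s)")
# 1 job(s) -> 1 worker(s)
# 4 job(s) -> 2 worker(s)
# 16 job(s) -> 8 worker(s)
# 400 job(s) -> 8 worker(s)
```

Under these assumptions, a run with few jobs (as on a warm cache) stays at 1-2 workers, while a run with many jobs hits the cap of 8 even though 12 cores are available.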
Per-Worker Filesystem Operations (Redundant)
Each worker, regardless of cache state:
With 8 workers: ~1,750,000 redundant FUSE operations during initialization. Main process performs the same ~219,000 ops independently, so total is ~1,969,000 FUSE ops, of which ~1,750,000 (89%) are redundant.
Identified FD Exhaustion Mechanism
Apple VirtioFS inode caching on the macOS host
The EMFILE error originates from the macOS HOST, not from inside the Docker container. The mechanism:
Why container-side limits are ruled out
The reporter's Docker configuration explicitly sets
Why FUSE queue limits are ruled out
The Linux FUSE client has
Correlation with observed behavior
The 8 concurrent workers not only multiply FUSE operations but also delay
Why Redundant Scanning Is Definitively the Core Cause
This is a proof by elimination:
The ONLY difference between cold and warm runs is the number of workers (determined by files-to-analyze count via the Scheduler). Each worker performs the same initialization filesystem operations regardless — the Finder scan and hash_file are NOT proportional to how many files the worker will analyze. The 2-8x difference in concurrent FUSE load is what pushes the host-side VirtioFS process past its FD limit. This proves:
The Fix
This issue only manifests in virtualized environments on macOS (Docker via OrbStack/Docker Desktop using VirtioFS). On native filesystems there is no FUSE intermediary holding FDs. The bottleneck is Apple's VirtioFS holding an
PHPStan cannot control host OS settings. The only lever it controls is how many FUSE operations it generates, and 89% are redundant. Eliminating them keeps the host-side VirtioFS process below its FD limit, resolving the "Too many open files" error without requiring any environment changes.
What workers currently do (redundantly)
Each worker runs
Even when the cache contains valid data (cache HIT on warm runs), steps 1 and 2 are always performed for cache validation. The scan and hash are the expensive operations: they account for ~219,000 FUSE ops per worker. The symbol discovery (
What workers need
Workers need three mapping arrays from
These are pure data: arrays of strings, fully serializable. The
Approach: share pre-computed mapping data via temp file
Step 1: Main process collects mapping data (already computed)
The main process already computes all ODSL mappings during
Step 2: Serialize to temp file
Before spawning workers in
[
    'directories' => [
        '/path/to/src' => [$classToFile, $functionToFiles, $constantToFile],
        '/path/to/vendor/pkg' => [$classToFile, $functionToFiles, $constantToFile],
        // ...
    ],
    'byFiles' => [
        'odsl-installed-files' => [$classToFile, $functionToFiles, $constantToFile],
    ],
]
Estimated size: ~3-5 MB for ~40,900 files. Format: PHP
Step 3: Workers load pre-computed data instead of scanning
Pass the temp file path as a new
Critical timing detail: When the parameter is set,
Two entry points that must be covered
ODSL instances are created through two independent code paths:
Both paths perform
Cleanup
Delete the temp file after
Impact
This eliminates ~1,750,000 redundant FUSE operations (89% of total) during worker initialization. Each worker's init reduces from ~219,000 FUSE ops to ~1 (reading the temp file). No changes to analysis behavior, no backward compatibility concerns (all affected classes are
Secondary improvement: close-on-exec for tmpfiles
PHP's
This is a hygiene improvement, not a fix for the reported issue. The ~20 inherited FDs are 4+ orders of magnitude below the container's 1M ulimit, and the bottleneck is host-side VirtioFS FD exhaustion from ~1,750,000 FUSE operations, not container-side FD count. PHP does not expose
Appendix: Investigated and Ruled Out as Root Cause
During analysis, several other code paths were examined as potential sources of the EMFILE error. They are documented here to explain why they were ruled out and, in one case, to flag a real but secondary bug.
-
Environment
ulimit -n: 1,048,576
Symptom
Cold PHPStan runs (no result cache) intermittently fail with:
Warm runs (with cache) always succeed. Failure rate on cold runs: ~80%.
Investigation
We ruled out all obvious causes:
/proc/PID/fd snapshots (far below any limit)
kern.maxfiles = 276,480
Key proof: same files on native filesystem always succeed:
src/ and tests/ to container-local tmpfs
Root cause
PHPStan's actual filesystem footprint is much larger than just the analyzed files. Beyond scanning src/ and tests/ (7,164 files, 4,554 directories), it performs symbol discovery across the entire vendor/ directory (16,892 files, 4,603 directories) via Composer's classmap (17,693 entries) and PSR-4 autoloader. The total bind mount footprint is ~24,000 files across ~9,200 directories.
With 12 parallel workers, each independently scanning directories and resolving types from vendor, the potential burst is 12 × 33K ≈ 400K concurrent VirtioFS operations, all funneled through OrbStack's VirtioFS layer to the macOS host.
Bind mount file operations are constrained by macOS host kernel parameters, not container ulimits. We confirmed this empirically: a single container process could open 138,085 FDs to a bind-mounted file (correlating with the macOS default kern.maxfilesperproc of 138,240), versus 1,048,573 FDs to a container-local file (matching the container ulimit). Under the heavy parallel I/O described above, the VirtioFS layer intermittently returns EMFILE to the container. The exact internal mechanism is opaque (OrbStack is closed-source), but the fix is definitive: increasing kern.maxfilesperproc and restarting OrbStack eliminates the problem completely.
Solution
Increase macOS kernel file limits and restart OrbStack (the process must be restarted to inherit the new limits):
sudo sysctl -w kern.maxfiles=1048576 kern.maxfilesperproc=524288
# Then restart OrbStack; the running process still holds the old limits
For persistence across reboots, create a LaunchDaemon that sets these values at boot (before OrbStack starts).
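One way to produce such a LaunchDaemon definition is sketched below, using Python's standard `plistlib` to generate the property list. The label `local.sysctl.maxfiles` and the helper name are hypothetical, chosen only for this example; the two limits are the values from the command above. This is a sketch under those assumptions, not a tested installation procedure.

```python
import plistlib


def build_sysctl_launchdaemon(label, maxfiles, maxfilesperproc):
    """Return an XML property list (bytes) for a job that runs sysctl once at load."""
    job = {
        "Label": label,  # hypothetical label; any unique reverse-DNS style name works
        "ProgramArguments": [
            "/usr/sbin/sysctl",
            "-w",
            f"kern.maxfiles={maxfiles}",
            f"kern.maxfilesperproc={maxfilesperproc}",
        ],
        "RunAtLoad": True,  # run when launchd loads the job, i.e. at boot
    }
    return plistlib.dumps(job)


plist_bytes = build_sysctl_launchdaemon("local.sysctl.maxfiles", 1048576, 524288)
print(plist_bytes.decode("utf-8"))
```

The printed XML would typically be saved as a root-owned file under /Library/LaunchDaemons/, named after the label. Whether the job reliably runs before OrbStack starts depends on how OrbStack is launched, so it is worth checking `sysctl kern.maxfilesperproc` after a reboot.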
Note: There is no OrbStack-specific or per-app setting for this — kern.maxfilesperproc can only be changed system-wide.
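The empirical check described under Root cause (opening descriptors to a single file until the OS refuses) can be approximated with a short probe. This is a minimal Python sketch, not the script used for the measurements in this post; the function name and the `cap` safety limit are assumptions made for the example.

```python
import errno
import os


def count_openable_fds(path, cap=2_000_000):
    """Open `path` read-only until the OS refuses or `cap` is reached.

    Returns (number_opened, errno_name); errno_name is None if the cap
    was reached without an error.
    """
    fds = []
    reason = None
    try:
        while len(fds) < cap:
            try:
                fds.append(os.open(path, os.O_RDONLY))
            except OSError as exc:
                # EMFILE: per-process limit reached; ENFILE: system-wide limit.
                reason = errno.errorcode.get(exc.errno, str(exc.errno))
                break
    finally:
        # Always release the descriptors so the probe leaves no residue.
        for fd in fds:
            os.close(fd)
    return len(fds), reason
```

Running it inside the container once against a file on the bind mount and once against a container-local file makes it possible to compare the two ceilings. The first number would be expected to track the host's kern.maxfilesperproc and the second the container ulimit, if the behaviour reported above holds.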
Affected setups
This likely affects any macOS Docker runtime using VirtioFS (OrbStack, Docker Desktop) when: