
feat: add ddev utility port-diagnose command to identify port conflicts, fixes #8085 (#8260) [skip ci] #8260

Merged
rfay merged 42 commits into ddev:main from rfay:20260328_rfay_port_diagnose
Apr 15, 2026

Conversation

@rfay
Member

@rfay rfay commented Mar 28, 2026

The Issue

DDEV users struggle to identify which processes are occupying the ports DDEV needs (80, 443, or project-specific ones), especially on Windows/WSL2 where the blocking process may be on either side.

How This PR Solves The Issue

Adds ddev utility port-diagnose that:

  1. Requires poweroff first: Checks for running DDEV projects and ddev-router to avoid false positives
  2. Detects blocking processes using a multi-method chain:
    • lsof → sudo lsof (with password prompt) → ss → /proc/net/tcp (Linux)
    • lsof → sudo lsof (macOS)
    • PowerShell Get-NetTCPConnection (Windows, WSL2 Windows-side)
  3. Terse one-line output per port with process name, PID, command, and platform label
  4. Actionable hints for known processes (apache2, nginx, caddy, IIS, Docker Desktop, OrbStack, Lando, wslrelay)
  5. Deduplicates worker processes (e.g., apache2 parent + children)
  6. Detects processes before dialing to avoid killing single-connection listeners like nc -l
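
The multi-method chain in item 2 can be pictured as an ordered list of detectors tried until one yields a result. The following is only a minimal sketch of that pattern: `PortProcess`, `detector`, and `firstResult` are hypothetical names, and the stub detectors stand in for real lsof, ss, and /proc/net/tcp lookups.

```go
package main

import (
	"errors"
	"fmt"
)

// PortProcess describes one process found listening on a port.
// The type and its fields are illustrative, not DDEV's actual types.
type PortProcess struct {
	Name string
	PID  int
}

// detector is one way of finding listeners on a port (for example lsof or ss).
type detector struct {
	label string
	find  func(port int) ([]PortProcess, error)
}

// firstResult runs each detector in order and returns the first non-empty
// result together with the label of the detector that produced it.
func firstResult(port int, chain []detector) ([]PortProcess, string) {
	for _, d := range chain {
		procs, err := d.find(port)
		if err != nil || len(procs) == 0 {
			continue // tool missing, permission denied, or nothing found: try the next one
		}
		return procs, d.label
	}
	return nil, ""
}

func main() {
	// Stub detectors: the first fails, the second finds nothing, the third succeeds.
	chain := []detector{
		{"lsof", func(int) ([]PortProcess, error) { return nil, errors.New("lsof not installed") }},
		{"ss", func(int) ([]PortProcess, error) { return nil, nil }},
		{"/proc/net/tcp", func(int) ([]PortProcess, error) {
			return []PortProcess{{Name: "nc", PID: 4242}}, nil
		}},
	}
	procs, via := firstResult(8142, chain)
	fmt.Println(procs, via)
}
```

Keeping the chain as data makes it straightforward to use a different order per platform, which is roughly what the Linux and macOS bullets describe.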

Manual Testing Instructions

Prerequisites for all platforms: Run ddev poweroff first. The command will refuse to run if DDEV services are active.

macOS

  1. Start apache: sudo apachectl start
  2. Run from a project directory:
    ddev utility port-diagnose
    
    Expected: Port 80 shows IN USE by httpd (PID ...) [macOS] with hint sudo apachectl stop. You will be prompted for sudo password if lsof can't see the process without it. Port 443 and others should show Available.
  3. Stop apache: sudo apachectl stop
  4. Hold a port with nc: nc -l -k 8142 &
  5. Run again — port 8142 (XHGui HTTPS) should show IN USE by nc. Kill nc afterward.
  6. Run outside a project directory (cd /tmp && ddev utility port-diagnose) — should check only ports 80 and 443.

Linux (non-WSL2)

  1. Start apache: sudo systemctl start apache2 (or nginx)
  2. Run from a project directory:
    ddev utility port-diagnose
    
    Expected: Port 80 shows IN USE by apache2 (PID ...) [Linux] with systemctl stop/disable hints and apt-get remove hint.
  3. Stop apache: sudo systemctl stop apache2
  4. Hold a port with nc: nc -l -k -p 8142 &
  5. Run again — port 8142 should show IN USE by nc.
  6. Test without lsof installed: sudo apt remove lsof, re-run — should fall through to ss or /proc/net/tcp detection and still find nc.

WSL2

  1. Start apache inside WSL2: sudo systemctl start apache2
  2. Run from a project directory:
    ddev utility port-diagnose
    
    Expected: Port 80 shows IN USE by apache2 (PID ...) [Linux (WSL2)].
  3. On the Windows side, hold a port with PowerShell: $l = [System.Net.Sockets.TcpListener]::new([System.Net.IPAddress]::Any, 8142); $l.Start(). Run port-diagnose again — should show the Windows process with [Windows] label.
  4. If you see wslrelay on the Windows side, the hint should say to check WSL2 and run ddev poweroff there.
  5. Hold a port with nc: nc -l -k -p 8142 & — should be detected.

Windows (native, Docker Desktop with WSL2 backend)

  1. Hold a port using PowerShell (run in a separate terminal):
    $l = [System.Net.Sockets.TcpListener]::new([System.Net.IPAddress]::Any, 8142)
    $l.Start()
    Write-Output "Listening on 8142 - press Ctrl+C to stop"
    [Console]::In.ReadLine() | Out-Null
    $l.Stop()
  2. Run from a project directory:
    ddev utility port-diagnose
    
    Expected: Port 8142 shows IN USE by powershell (PID ...) [Windows] with Stop-Process hint.
  3. Stop the listener and verify it shows Available.
  4. If DDEV is running in WSL2, the command should detect wslrelay forwarding and hint to check WSL2.

All platforms: Edge cases

  • Run with DDEV still running — should print active projects/router and exit with code 2 asking for ddev poweroff.
  • Run outside a project directory — should check only default ports 80 and 443.
  • Run with no conflicts — should print All required ports are available.

Automated Testing Overview

Platform-specific tests with real process simulation:

  • nc-based detection tests: TestFindPortProcessesNC_Linux, _WSL2, _macOS — start nc, verify detection (skip if nc unavailable)
  • Own-process detection: TestFindPortProcessesOwnProcess (Unix), _Windows — Go test holds a port, verifies PID match
  • Method-specific tests: TestFindPortProcessesLsof, TestFindPortProcessesSS, TestFindPortProcessesProcNet — test each detection method directly
  • Windows PowerShell test: TestFindWindowsPortProcesses — starts a .NET TcpListener via PowerShell, verifies detection
  • Parser tests: TestParseLsofOutputFiltersListen (ESTABLISHED connections filtered), TestParseLsofOutputNoStateField (macOS compatibility)
  • Unit tests: TestDeduplicateByName, TestPortHints (all known processes), TestPortHintsPlatformSpecific

Release/Deployment Notes

  • New user-facing command; no breaking changes
  • Works on all supported platforms: macOS, Linux, Windows (native and WSL2)
  • No new dependencies — uses existing OS tools (lsof, ss, PowerShell)

@github-actions

github-actions Bot commented Mar 28, 2026

@rfay
Member Author

rfay commented Mar 30, 2026

WSL2 manual test results (all passing):

  1. Project dir, no conflicts → all ports Available, exit 0
  2. nc -k on project port 8142 → detects nc [Linux (WSL2)] + wslrelay [Windows]
  3. nc -l (single-connection) → detected AND nc stays alive after detection
  4. Outside project dir → checks only ports 80 and 443
  5. DDEV running (project + router) → refuses with exit 2, names running projects and router
  6. Router only running → refuses with exit 2
  7. Windows-side PowerShell TcpListener on 8142 → detects powershell [Windows] with Stop-Process hint
  8. Multiple ports in use (8025 + 8142) → both detected correctly
  9. All clear after cleanup → exit 0

@rfay rfay force-pushed the 20260328_rfay_port_diagnose branch from 574a89d to d8a7581 Compare April 10, 2026 13:59
Member

@stasadev stasadev left a comment

I tested it on Linux by running this in one window:

docker run --rm -it -p 80:80 -p 443:443 nginx

In another window:

$ ddev utility port-diagnose
Port diagnostics for project: l12
 
Unable to identify the process without elevated privileges. 
Running: sudo /usr/sbin/lsof -iTCP:80 -sTCP:LISTEN -n -P
 
You may be prompted for your password. 
Port 80 (router HTTP): IN USE by docker-proxy (PID 667534, cmd=/usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 80 -container-ip 172.17.0.2 -container-port 80 -use-listen-fd) [Linux]
 
  sudo kill 667534
 
Port 443 (router HTTPS): IN USE by docker-proxy (PID 667574, cmd=/usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 443 -container-ip 172.17.0.2 -container-port 443 -use-listen-fd) [Linux]
 
  sudo kill 667574
 
Port 8025 (Mailpit HTTP): Available
 
Port 8026 (Mailpit HTTPS): Available
 
Port 8143 (XHGui HTTP): Available
 
Port 8142 (XHGui HTTPS): Available

$ sudo kill 667534

$ sudo kill 667574

$ ddev utility port-diagnose
Port diagnostics for project: l12
 
Port 80 (router HTTP): Available
 
Port 443 (router HTTPS): Available
 
Port 8025 (Mailpit HTTP): Available
 
Port 8026 (Mailpit HTTPS): Available
 
Port 8143 (XHGui HTTP): Available
 
Port 8142 (XHGui HTTPS): Available
 
All required ports are available.

$ ddev start
...
Error response from daemon: failed to set up container networking: driver failed programming external connectivity on endpoint ddev-router (1023cd8a4df8930a6dd04dc00acfc150b9a2c89df797a1b66857d6df36b6e161): Bind for 0.0.0.0:80 failed: port is already allocated'

Just noting that sudo kill isn't always going to work here: it didn't actually kill the docker run process, which had also bound the same ports on IPv6:

$ sudo netstat -tulpn | grep LISTEN
tcp6       0      0 :::80                   :::*                    LISTEN      667547/docker-proxy 
tcp6       0      0 :::443                  :::*                    LISTEN      667581/docker-proxy

@rfay
Member Author

rfay commented Apr 11, 2026

macOS Manual Test Results

Tested with all 5 Docker providers running simultaneously on macOS arm64 (Apple M4): OrbStack, Colima, Lima, Rancher Desktop, Docker Desktop.

Each scenario runs ddev utility port-diagnose from outside a project directory (checking default ports 80 and 443) with the specified active Docker context and the specified process holding port 80.

| # | Active context | Port held by | Output |
|---|---|---|---|
| 1 | OrbStack | nc -k -l 80 (non-Docker process) | ✅ "Consider stopping this process using OS tools, e.g. 'kill <pid>'" |
| 2 | OrbStack | OrbStack container | ✅ "Container 'orb-conflict' is holding this port. Run: docker stop orb-conflict" |
| 3 | Lima | Lima container | ✅ "Container 'lima-conflict' is holding this port. Run: docker stop lima-conflict" |
| 4 | Colima | Colima container | ✅ "Container 'colima-conflict' is holding this port. Run: docker stop colima-conflict" |
| 5 | Rancher Desktop | Rancher Desktop container | ✅ "Container 'rd-conflict' is holding this port. Run: docker stop rd-conflict" |
| 6 | Docker Desktop | Docker Desktop container | ✅ "Container 'dd-conflict' is holding this port. Run: docker stop dd-conflict" |
| 7 | OrbStack | Docker Desktop container | ✅ "Docker Desktop is running and holding this port (but is not your active Docker provider). Quit Docker Desktop from the menu bar or: killall 'Docker Desktop'" |
| 8 | OrbStack | Colima container | ✅ "Colima is running and holding this port (but is not your active Docker provider). Stop Colima: colima stop" |
| 9 | OrbStack | Lima container | ✅ "Lima is running and holding this port (but is not your active Docker provider). Stop Lima: limactl stop default" |
| 10 | OrbStack | Rancher Desktop container | ✅ "Rancher Desktop is running and holding this port (but is not your active Docker provider). Quit Rancher Desktop from the menu bar." |
| 11 | Docker Desktop | Lima container | ✅ "Lima is running and holding this port (but is not your active Docker provider). Stop Lima: limactl stop default" |
| 12 | Lima | Colima container | ✅ "Colima is running and holding this port (but is not your active Docker provider). Stop Colima: colima stop" |
| 13 | Colima | Lima container | ✅ "Lima is running and holding this port (but is not your active Docker provider). Stop Lima: limactl stop default" |

How same-provider container identification works

For same-provider conflicts (tests 2–6), the tool uses the Docker API (GetDockerContainers) to identify the exact container holding the port and emits docker stop <name> directly.

Rancher Desktop note

Rancher Desktop's ssh port-forward mux uses SO_REUSEPORT, which allows a second net.Listen to succeed on the same port even when a listener is active. The port-free check was updated from a bind-only test to a bind + dial test (isPortFree) to catch this correctly.
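
A bind + dial check of this kind might look roughly like the sketch below. It is not the PR's `isPortFree`: `portLooksFree` and `demo` are hypothetical names, the loopback address and 500 ms timeout are assumptions, and the error matching targets Unix-like systems.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"strconv"
	"syscall"
	"time"
)

// portLooksFree is a hypothetical stand-in for a bind + dial port check.
func portLooksFree(port int) bool {
	addr := net.JoinHostPort("127.0.0.1", strconv.Itoa(port))

	ln, err := net.Listen("tcp", addr)
	if err == nil {
		_ = ln.Close()
	} else if errors.Is(err, syscall.EADDRINUSE) {
		return false // another socket clearly holds the port; no need to dial
	}

	// Reaching here means either the bind succeeded (which SO_REUSEPORT can
	// allow even with an active listener) or it failed for some other reason
	// such as EACCES on ports below 1024. Neither case settles the question,
	// so try to connect.
	conn, dialErr := net.DialTimeout("tcp", addr, 500*time.Millisecond)
	if dialErr != nil {
		return true // nothing accepted the connection
	}
	_ = conn.Close()
	return false
}

// demo holds a loopback port briefly and reports what portLooksFree says
// while the listener is open and again after it has been closed.
func demo() (whileOpen, afterClose bool, err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, false, err
	}
	port := ln.Addr().(*net.TCPAddr).Port
	whileOpen = portLooksFree(port)
	if err := ln.Close(); err != nil {
		return whileOpen, false, err
	}
	afterClose = portLooksFree(port)
	return whileOpen, afterClose, nil
}

func main() {
	fmt.Println(demo())
}
```

In this sketch the dial only happens when the bind result is inconclusive, so an ordinary single-connection listener is rejected at the bind step without ever being connected to.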

@rfay
Member Author

rfay commented Apr 11, 2026

Linux Test Matrix

Tested on Ubuntu 25.10 (Linux arm64) with all three Docker providers running simultaneously: docker-ce (rootful), docker-rootless, podman-rootless.

Each scenario runs ddev utility port-diagnose from outside a project directory (checking default ports 80 and 443) with the specified active Docker context and the specified process holding port 80.

| # | Active context | lsof? | Port held by | Output |
|---|---|---|---|---|
| 1 | docker-ce | yes | sudo nc -l -k -p 80 (non-Docker, root) | ✅ "Port 80 (HTTP): IN USE by nc (PID …) — Consider stopping this process using OS tools, e.g. 'kill <pid>'" |
| 2 | docker-ce | yes | docker-ce container | ✅ "Container 'test-ce' is holding this port. Run: docker stop test-ce" |
| 3 | docker-ce | yes | docker-ce container (root docker-proxy) | ✅ sudo lsof reveals docker-proxy, container name shown via Docker API |
| 4 | docker-ce | no | docker-ce container (root docker-proxy) | ✅ falls back to sudo ss, still identifies docker-proxy and shows container name |
| 5 | docker-ce | no | sudo nc -l -k -p 80 (root, non-Docker) | ✅ sudo ss identifies nc, gives kill hint |
| 6 | docker-ce | yes | docker-rootless container (rootlesskit, user-owned) | ✅ "Docker rootless has a container holding this port (but is not your active Docker provider). Check: DOCKER_CONTEXT=rootless docker ps" |
| 7 | docker-ce | yes | podman-rootless container (rootlessport, user-owned) | ✅ "Podman has a container holding this port (but is not your active Docker provider). Check: podman ps" |
| 8 | podman-rootless | yes | docker-ce container (root docker-proxy) | ✅ "Docker CE (rootful) has a container holding this port (but is not your active Docker provider). Check: sudo docker ps" |
| 9 | podman-rootless | yes | podman-rootless container | ✅ "Container 'test-podman' is holding this port. Run: docker stop test-podman" |
| 10 | podman-rootless | yes | docker-rootless container (rootlesskit) | ✅ "Docker rootless has a container holding this port (but is not your active Docker provider)." |
| 11 | docker-ce | yes | port free | ✅ "Port 80 (HTTP): Available" |

Notes

Process names by provider:

  • docker-ce (rootful): docker-proxy (runs as root — requires sudo lsof/ss to identify)
  • docker-rootless: rootlesskit (runs as current user — visible without sudo)
  • podman-rootless: rootlessport (runs as current user — visible without sudo)

lsof absent (scenarios 4–5): Falls back to sudo ss cleanly. The "Running: sudo lsof…" line is suppressed when lsof is not installed. Scenario 5 (sudo ss finds nc) gives a kill hint; the "try installing lsof" hint only appears if sudo ss also fails to identify the process.

Podman container lookup fix: Podman's container-list API omits the IP field from port entries. The original p.IP.IsValid() guard prevented container name lookup from working under Podman; it was removed. This fix is covered by the new TestContainerNameForPort unit test.
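
To illustrate why the guard mattered, here is a small sketch of a port-to-container lookup that matches on the published port alone. `PortBinding`, `Container`, and this `containerNameForPort` are simplified, hypothetical stand-ins and not the Docker API types or DDEV's actual function.

```go
package main

import (
	"fmt"
	"net/netip"
)

// PortBinding is a simplified, hypothetical view of one published port.
type PortBinding struct {
	IP         netip.Addr // left as the zero value (invalid) when the engine omits the IP
	PublicPort uint16
}

// Container is a simplified, hypothetical view of one container-list entry.
type Container struct {
	Name  string
	Ports []PortBinding
}

// containerNameForPort returns the first container that publishes the port.
// It deliberately does not require p.IP.IsValid(), so entries without an IP
// (as reported for Podman above) still match.
func containerNameForPort(containers []Container, port uint16) (string, bool) {
	if port == 0 {
		return "", false // 0 means "not published" in this sketch
	}
	for _, c := range containers {
		for _, p := range c.Ports {
			if p.PublicPort == port {
				return c.Name, true
			}
		}
	}
	return "", false
}

func main() {
	containers := []Container{
		{Name: "test-podman", Ports: []PortBinding{{PublicPort: 80}}}, // no IP set
	}
	fmt.Println(containerNameForPort(containers, 80))
}
```

With an `IsValid()` check in the inner condition, the entry above would never match, which is the failure mode the note describes.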

New process names added to hints: rootlesskit/rootlessk (Docker rootless) and rootlessport/rootlessp (Podman) are now recognised and routed through dockerProviderHints() for cross-provider detection.

@rfay
Member Author

rfay commented Apr 11, 2026

Windows (native) Manual Testing — arm64 and amd64, Docker Desktop and Rancher Desktop

Tested on:

  • Windows 11 arm64 with Docker Desktop
  • Windows 11 amd64 with both Rancher Desktop and Docker Desktop (tested separately and together)

Bugs found and fixed

Bug: All ports falsely reported IN USE on Windows when nothing is listening (commit bdf92d6f3)

On Windows-native, findWindowsPortProcesses returning empty was treated as "IN USE (unable to identify)" because the isPortFree check was gated behind !nodeps.IsWindows() and never called. Every port reported as in use even when nothing was listening. Fixed by adding an explicit isPortFree check on Windows before the "unable to identify" path.

Bug: Docker Desktop container reported twice per port (commit eff854ef7)

On Docker Desktop for Windows, a container binding port 80/443 caused both wslrelay.exe (the WSL2→Windows relay) and com.docker.backend.exe (the Docker Desktop proxy) to appear as separate port-holders. Each generated its own "IN USE" line with identical hints pointing to the same container and the same docker stop command. Fixed by adding suppressWSLRelayIfRedundant(), called after deduplicateByName(), which drops wslrelay when any other process is also present for the same port. wslrelay is preserved when it is the sole entry (bare WSL2 distro service case).
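
The suppression rule described here is small enough to sketch. This is an illustration of the stated behavior only: `PortProcess`, `isWSLRelay`, and `suppressRelayIfRedundant` are hypothetical names, and the real `suppressWSLRelayIfRedundant()` may differ in its details.

```go
package main

import (
	"fmt"
	"strings"
)

// PortProcess is an illustrative record of one process holding a port.
type PortProcess struct {
	Name string
	PID  int
}

// isWSLRelay reports whether the entry is the WSL2-to-Windows relay process.
func isWSLRelay(p PortProcess) bool {
	name := strings.ToLower(p.Name)
	return name == "wslrelay" || name == "wslrelay.exe"
}

// suppressRelayIfRedundant drops wslrelay entries when any other process is
// listed for the same port, and keeps wslrelay when it is the only entry.
func suppressRelayIfRedundant(procs []PortProcess) []PortProcess {
	others := make([]PortProcess, 0, len(procs))
	for _, p := range procs {
		if !isWSLRelay(p) {
			others = append(others, p)
		}
	}
	if len(others) == 0 {
		return procs // wslrelay is the sole holder: keep it so the user sees something
	}
	return others
}

func main() {
	both := []PortProcess{{Name: "wslrelay", PID: 21268}, {Name: "com.docker.backend", PID: 4100}}
	fmt.Println(suppressRelayIfRedundant(both))
	fmt.Println(suppressRelayIfRedundant(both[:1]))
}
```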


Test results

| # | Scenario | Result | Notes |
|---|---|---|---|
| W1 | Port free — Docker Desktop active, nothing on 80/443 | ✅ "All required ports are available." | Required the isPortFree fix above — previously showed false IN USE |
| W2 | PowerShell TcpListener on port 80 | powershell (PID XXXX, cmd=...powershell.exe) [Windows] / Stop-Process -Id XXXX (PowerShell as Admin) | Process name, full path, and hint all correct |
| W3 | Docker Desktop container docker run -d -p 80:80 nginx | ✅ Single entry: com.docker.backend (PID XXXX, cmd=...com.docker.backend.exe) [Windows] / Container 'name' is holding this port. Run: docker stop name | Required the wslrelay dedup fix — previously showed two entries per port (wslrelay + com.docker.backend) with identical hints |
| W4 | Non-Docker process (PowerShell listener) | ✅ See W2 | Stop-Process hint correct |
| W5 | Rancher Desktop — port free, inside project | ✅ "All required ports are available." | Tested on amd64; all 6 project ports checked and reported correctly |
| W6 | Rancher Desktop — container on port 80 | com.docker.backend identified with container name and docker stop hint | See note below — on amd64 with both providers installed, both contexts resolve to Docker Desktop's daemon |
| W7 | Outside DDEV project directory | ✅ "Not in a DDEV project directory — checking default ports 80 and 443." | Correct on both providers |
| W8 | DDEV running — ddev start active | ✅ "DDEV is currently active (running projects: d11.windows; ddev-router is running). Please run 'ddev poweroff' first." | Correctly blocks false-conflict reporting while DDEV is up |

Note on W6 / cross-provider behavior: When both Rancher Desktop and Docker Desktop are installed and running simultaneously on Windows, Docker Desktop claims the npipe:////./pipe/docker_engine socket (normally Rancher Desktop's endpoint). Both contexts (default and desktop-linux) resolve to Docker Desktop's daemon — ddev version reports docker-platform: docker-desktop for both. True daemon isolation between the two providers while both are running was not achievable in this configuration; stopping one provider entirely before testing the other is required for a clean Rancher-only test.


Additional fixes made during Windows testing

wslrelay hint improvement (commit 24e86ae51): The wslrelay hint now calls findContainerForPort to distinguish between a Docker/Rancher Desktop container forwarded via WSL2 and a service in a different WSL2 distro. When a container is found, it names the container and suggests docker stop <name>; otherwise it guides the user to check each distro with wsl --list / wsl -d <distro> -- ss -tlnp.

docker-proxy + Rancher Desktop mismatch (commit 24e86ae51): Rancher Desktop in dockerd mode uses docker-proxy internally. Previously this triggered "Docker CE (rootful) has a container holding this port (but is not your active Docker provider)" — now correctly routes through dockerContainerHints when Rancher Desktop is the active provider.


Key things verified on Windows (amd64 + arm64)

  • PowerShell Get-NetTCPConnection path works with Docker Desktop
  • Correct process name shown (com.docker.backend) for Docker Desktop port forwarder — single entry per port
  • Container name resolved and docker stop <name> suggested
  • wslrelay suppressed when com.docker.backend is co-listed (same port, same container)
  • wslrelay preserved when it is the sole entry (bare WSL2 service)
  • No sudo or lsof code paths triggered on Windows
  • Hint message includes PowerShell command for manual investigation
  • Free ports correctly report "Available" (required Windows-specific fix)
  • Outside-project fallback correctly checks only ports 80 and 443
  • DDEV-active guard (running project + router) correctly fires and names both
  • Rancher Desktop: all basic scenarios pass (port free, outside project, DDEV-active guard)
  • Rancher Desktop in isolation (no Docker Desktop installed) — wslrelay-only container path not directly verified on this machine; code path exists and wslrelay hint tested via portHints unit tests

Proposed additional test scenarios (Groups G/H)

Group G — Rancher Desktop container conflict (WSL2, Rancher only):

docker run -d -p 80:80 --name test-nginx nginx
ddev utility port-diagnose
# Expected: wslrelay [Windows] + "Container 'test-nginx' is forwarded to Windows via WSL2. Run: docker stop test-nginx"
docker stop test-nginx && docker rm test-nginx

Group H — Different WSL2 distro holds port 80:

# In PowerShell, start a listener in a second distro:
wsl -d Ubuntu-22.04 -- sudo service nginx start
# Then in ddev distro:
# ddev utility port-diagnose
# Expected: wslrelay [Windows] + "A WSL2 distro is forwarding this port to Windows.
#   Otherwise check which distro holds it — in PowerShell:
#     wsl --list
#     wsl -d <distro> -- ss -tlnp"
wsl -d Ubuntu-22.04 -- sudo service nginx stop

@rfay
Member Author

rfay commented Apr 11, 2026

WSL2 Test Results

Environment: WSL2 Ubuntu arm64, NAT networking mode, docker-ce (rootful), passwordless sudo, lsof installed unless noted. Tested with binary built from this branch.

Bugs found and fixed during this session (commit 36c6f82):

  1. EACCES false positive — ports < 1024 (80, 443) incorrectly reported IN USE when nothing was listening; isPortFree was treating permission-denied the same as address-in-use.
  2. Root-owned Linux listener hidden by wslrelay — when a root process held a port, wslrelay.exe appeared on the Windows side first, making allProcs non-empty and skipping sudo lsof entirely. Linux process was never identified.
  3. Stale compose YAML overriding project port config: GetPrimaryRouterHTTPPort read DDEV_ROUTER_HTTP_PORT from the last rendered compose YAML rather than the current config. Fixed by setting app.ComposeYaml = nil before reading ports.
  4. Spurious sudo messages for Windows-only port holders — when a port was held only on the Windows side, "Unable to identify the process" and "Running: sudo lsof..." were printed unnecessarily. Fixed by gating sudo escalation on isPortFree for the Linux side.

Group A — Baseline

# Scenario Result Notes
A1 Outside project dir, ports free "Not in a DDEV project directory — checking default ports 80 and 443." / "All required ports are available." / exit 0
A2 Port < 1024 free (EACCES fix) Ports 80 and 443 correctly show Available with nothing listening — no false IN USE from bind permission error
A3 Inside project dir, all ports free Project name shown, all 6 ports (HTTP, HTTPS, Mailpit HTTP/HTTPS, XHGui HTTP/HTTPS) checked, exit 0
A4 DDEV project + router running Exit 2: "DDEV is currently active (running projects: d11; ddev-router is running). Please run 'ddev poweroff' first."
A5 Router only running (project stopped) Exit 2: "DDEV is currently active (ddev-router is running)."

Group B — Linux-side conflicts (user-owned, visible without sudo)

# Scenario Result Notes
B1 nc -l -k -p 8025 in WSL2 "Port 8025 (Mailpit HTTP): IN USE by nc (PID …, cmd=nc …) [Linux (WSL2)]" + wslrelay [Windows], exit 1
B2 nc -l 8025 single-connection listener nc detected AND still alive after port-diagnose exits — detection uses lsof/ss, never dials the port

Group C — Linux-side conflicts (root-owned, requires sudo)

# Scenario lsof? Result Notes
C1 Root-owned nc on port 80 yes wslrelay [Windows] + nc [Linux (WSL2)] both shown (was bug #2; fixed)
C3 docker-ce container publishing port 80 (docker-proxy) yes sudo lsof reveals docker-proxy; Docker API resolves container name; "Container 'conflict-test' is holding this port. Run: docker stop conflict-test"
C4 docker-ce container publishing port 80 (docker-proxy) no sudo ss fallback used; "Running: sudo lsof…" line suppressed (lsof not installed); container name still resolved via Docker API

Group D — Windows-side conflicts

# Scenario Result Notes
D1 PowerShell TcpListener on port 8025 "Port 8025: IN USE by powershell (PID …) [Windows]" with Stop-Process hint; no spurious "Unable to identify…" or sudo lsof messages (was bug #4; fixed)
D2 Docker Desktop container publishing port 8025 wslrelay [Windows] + com.docker.backend [Windows] both detected; "Docker Desktop is running and holding this port (but is not your active Docker provider). Quit Docker Desktop…" hint correct. Minor: wslrelay hint says "run ddev poweroff inside WSL2" which is slightly misleading (real owner is Docker Desktop); com.docker.backend line gives the correct actionable hint. Also: isPortFree returns false for this case because Docker Desktop's Windows-side listener is accessible at 127.0.0.1 from WSL2, so sudo lsof runs (finds nothing extra) — cosmetic noise only.

Group E — Cross-distro (listener in another WSL2 distro)

# Scenario Result Notes
E1 nc -l -k 8025 in ubuntu-desktop distro WSL2 distros share a network namespace: bind fails in our distro (socket visible in /proc/net/tcp), but the process is in a different PID namespace so lsof/ss cannot identify it. Windows side shows wslrelay correctly. Output: "Port 8025: IN USE by wslrelay [Windows]" with "Run 'ddev poweroff' inside WSL2" hint.
E2 Process visibility across distros ✅ confirmed port-diagnose cannot see into another distro's process table — only wslrelay on the Windows side is reported. The hint ("check WSL2") is directionally correct but not precise (user would need to check all running distros). Acceptable for a rare edge case.

Group F — Port config edge cases

# Scenario Result Notes
F1 Project port changed without restart (stale compose YAML) router_http_port: 9999 in config.yaml, rendered YAML still has DDEV_ROUTER_HTTP_PORT: 80 — port-diagnose shows 9999 (was bug #3; fixed)
F2 Outside project dir Only ports 80 and 443 checked
F3 Inside project dir All six configured project ports checked

@rfay rfay marked this pull request as ready for review April 11, 2026 23:44
@rfay rfay requested review from a team as code owners April 11, 2026 23:44
@rfay
Member Author

rfay commented Apr 11, 2026

This is crazy territory, with results being different on every OS and Docker provider. I didn't expect to spend all day on it. It's good enough I imagine, and involves no risk that I can see.

Collaborator

@tyler36 tyler36 left a comment

Tested on WSL2 (Ubuntu 24.04) on Win10 with Docker Desktop; it works as expected.

I get "wslrelay.exe [Windows]" messages when denying sudo access and "[Linux (WSL2)]" messages when allowing it. Both messages point to a WSL-side conflict, so this should not be a problem.

Test

WSL2 (Ubuntu 24.04) on Win10

$ ddev -v
ddev version v1.25.1-108-g98312a83c
  1. Start Apache on port 80
$ sudo systemctl start apache2
  2. Run diagnostic WITHOUT sudo use
$ ddev utility port-diagnose
...
Allow sudo use? [y/N] (no): n
Port 80 (router HTTP): IN USE by wslrelay (PID 21268, cmd=C:\Program Files\WSL\wslrelay.exe) [Windows]
  3. Run diagnostic WITH sudo use
$ ddev utility port-diagnose
...
Allow sudo use? [y/N] (no): y
Port 80 (router HTTP): IN USE by apache2 (PID 10171, cmd=/usr/sbin/apache2 -k start) [Linux (WSL2)]
  4. Stop Apache2
$ sudo systemctl stop apache2
  5. Run diagnostic WITHOUT sudo use
$ ddev utility port-diagnose
...
Allow sudo use? [y/N] (no): n
Port 80 (router HTTP): Available
  6. Run diagnostic WITH sudo use
$ ddev utility port-diagnose
...
Allow sudo use? [y/N] (no): y
Port 80 (router HTTP): Available

@stasadev stasadev changed the title feat: add ddev utility port-diagnose command to identify port conflicts feat: add ddev utility port-diagnose command to identify port conflicts, fixes #8085 Apr 14, 2026
Member

@stasadev stasadev left a comment

Looks good to me.

Thank you for chasing down the rootful/rootless paths and so many testing scenarios (which I didn't test).

I have a small nitpick about highlighting ddev utility port-diagnose in the docs.

* IIS on Windows (can affect WSL2). You’ll have to disable it in the Windows settings.

To dig deeper, you can use a number of tools to find out what process is listening.
To dig deeper, run `ddev utility port-diagnose` from your project directory. It checks each port your project needs, identifies the blocking process by name and PID, and suggests how to stop it:
Member

This is added too far down the page, people just won't read this info.

Considering the effort put into this command, I suggest moving it or adding a tip at the top of:
https://ddev--8260.org.readthedocs.build/en/8260/users/usage/troubleshooting/#web-server-ports-already-occupied

Member Author

Reworked the troubleshooting.md, need to re-read it in the morning, but it should be better and this should be ready to go.

rfay and others added 14 commits April 14, 2026 20:23
…icts

Implements issue ddev#8085: a new `ddev utility port-diagnose` command (aliased
as `ddev ut port-diagnose`) that identifies which processes are occupying
ports needed by DDEV projects.

Features:
- Checks project-specific ports (router HTTP/HTTPS, Mailpit, XHGui)
- Falls back to checking ports 80 and 443 when outside a project
- Identifies blocking processes by PID, name, and full command line
- On Linux/macOS: uses lsof with fallback to ss
- On Windows: uses PowerShell Get-NetTCPConnection
- On WSL2: checks both Linux and Windows sides independently
- Provides actionable hints for kill/stop/disable/uninstall based on process

Tests:
- TestPortDiagnoseAvailablePort: smoke test
- TestPortDiagnoseInUsePort: verifies process identification
- TestPortHints: validates hint generation for known processes

Docs:
- Updated troubleshooting.md with port-diagnose example and guidance
- Added before manual lsof/netstat instructions

Implementation follows DDEV patterns:
- Uses existing netutil.IsPortActive() for initial checks
- Reuses platform detection from nodeps package
- Follows DDEV diagnostic command structure
- Cleans golangci-lint with SplitSeq and CutPrefix modernization
…ter process detection

- Require ddev poweroff before running, to avoid false positives from running projects
- Terse one-line output per port: "Port 80 (HTTP): IN USE by nginx (PID 1234) [Linux]"
- Better process detection chain: lsof -> sudo lsof -> ss -> /proc/net/tcp
- No longer depends on lsof being installed; falls through to ss and /proc/net/tcp
- Still doesn't detect nc on Linux (process detection issue to investigate)

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
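
For readers unfamiliar with the last fallback in that chain, the sketch below shows one way to read /proc/net/tcp-formatted text: keep rows in LISTEN state (`st` equal to `0A`) whose hex-encoded local port matches, and return their socket inodes. The function name `listeningInodes` and the embedded sample are illustrative assumptions; mapping an inode back to a PID (by scanning `/proc/<pid>/fd` for `socket:[inode]` links) is left out.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// sampleProcNetTCP is made-up data in the /proc/net/tcp layout:
// one listener on port 80 (0x0050) and one established connection.
const sampleProcNetTCP = `  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:0050 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 12345 1 0000000000000000 100 0 0 10 0
   1: 0100007F:1FBE 0100007F:C350 01 00000000:00000000 00:00000000 00000000  1000        0 67890 1 0000000000000000 20 4 30 10 -1
`

// listeningInodes returns the socket inodes of rows that are in LISTEN state
// on the given local port.
func listeningInodes(procNetTCP string, port int) []string {
	var inodes []string
	sc := bufio.NewScanner(strings.NewReader(procNetTCP))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 10 || fields[3] != "0A" {
			continue // header row, short row, or socket not in LISTEN state
		}
		// local_address has the form HEXIP:HEXPORT.
		_, hexPort, ok := strings.Cut(fields[1], ":")
		if !ok {
			continue
		}
		p, err := strconv.ParseUint(hexPort, 16, 16)
		if err != nil || int(p) != port {
			continue
		}
		inodes = append(inodes, fields[9]) // inode is the tenth whitespace-separated field
	}
	return inodes
}

func main() {
	fmt.Println(listeningInodes(sampleProcNetTCP, 80))
}
```
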
…skip ci]

- Check for ddev-router container in addition to running projects
- Fix TestPortHints to work on macOS (no systemctl) and all platforms

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…p ci]

- Remove linux-only guard so sudo lsof runs on macOS too
- Drop -n flag and connect stdin/stderr so sudo can prompt for password
- Print one-time message explaining why elevated privileges are needed

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…se hints [skip ci]

- Use /usr/sbin/lsof with fallback to PATH lsof (macOS keeps it in /usr/sbin)
- Remove incorrect "brew uninstall httpd" hint for apache on macOS
- Make all hints concise single-line output

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…p ci]

Apache2 (and similar) runs parent + worker processes all listening on the
same port. Deduplicate by process name so only one entry is shown per port.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
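
A name-based deduplication like the one this commit describes could be as simple as the sketch below. `PortProcess` and `dedupeByName` are hypothetical, and keeping the first entry seen for each name is an assumption; the commit message only says one entry is shown per port.

```go
package main

import "fmt"

// PortProcess is an illustrative record of one process holding a port.
type PortProcess struct {
	Name string
	PID  int
}

// dedupeByName keeps the first entry seen for each process name and
// preserves the original order of the remaining entries.
func dedupeByName(procs []PortProcess) []PortProcess {
	seen := make(map[string]bool, len(procs))
	out := make([]PortProcess, 0, len(procs))
	for _, p := range procs {
		if seen[p.Name] {
			continue // a worker with the same name as an earlier entry
		}
		seen[p.Name] = true
		out = append(out, p)
	}
	return out
}

func main() {
	procs := []PortProcess{
		{Name: "apache2", PID: 10171}, // parent
		{Name: "apache2", PID: 10172}, // worker
		{Name: "apache2", PID: 10173}, // worker
	}
	fmt.Println(dedupeByName(procs))
}
```
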
…nection listeners [skip ci]

IsPortActive() dials the port, which causes single-connection listeners
like `nc -l` to accept and exit before lsof can find them. Restructured
to find processes first (lsof/ss/proc), then fall back to IsPortActive
only when no processes are found.

Also improved the "unidentifiable" message to suggest the manual sudo
lsof command, and suppressed noisy sudo stderr when password is required.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…kip ci]

- Separate nc-based tests for Linux, WSL2, macOS (skip if nc unavailable)
- Direct tests for each detection method: lsof, ss, /proc/net/tcp, Windows
- TestDeduplicateByName for apache2 worker process dedup
- Expanded TestPortHints with subtests for all known processes
- TestPortHintsPlatformSpecific verifies correct commands per OS

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…p ci]

On macOS, lsof -sTCP:LISTEN was still returning ESTABLISHED connections
(Chrome, Discord outbound to remote port 443). Now parseLsofOutput
requests the T field (-F pcnT) and only accepts entries with TST=LISTEN.

Also shows the full sudo command being run so users know what to expect
and can run it themselves.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…skip ci]

Some macOS lsof versions may not emit the T (TCP state) field. Now
entries without a TST= line are accepted (trusting the -sTCP:LISTEN
filter), while entries with an explicit non-LISTEN state are rejected.
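
The accept/reject rule described in the two commits above can be sketched as follows. This is an assumption-laden illustration, not the PR's `parseLsofOutput`: the `listener` type and `parseListeners` name are hypothetical, and the sample output is made up. It relies only on the documented `lsof -F` convention that each output line starts with a one-character field identifier (`p` PID, `c` command, `f` file descriptor, `n` name, `T` TCP info such as `TST=LISTEN`).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// listener is a hypothetical record for one listening socket.
type listener struct {
	PID     int
	Command string
	Name    string // lsof "n" field, e.g. "*:80"
}

// parseListeners reads `lsof -F pcnT` style output. A file entry is
// accepted when its TCP state is LISTEN or when no TST= line was
// emitted at all; an explicit non-LISTEN state is rejected.
func parseListeners(out string) []listener {
	var (
		result  []listener
		pid     int
		command string
		name    string
		state   string
		inFile  bool
	)
	// flush finalizes the file entry collected so far, if any.
	flush := func() {
		if inFile && name != "" && (state == "" || state == "LISTEN") {
			result = append(result, listener{PID: pid, Command: command, Name: name})
		}
		inFile, name, state = false, "", ""
	}
	for _, line := range strings.Split(out, "\n") {
		if line == "" {
			continue
		}
		tag, val := line[0], line[1:]
		switch tag {
		case 'p': // new process set
			flush()
			pid, _ = strconv.Atoi(val)
			command = ""
		case 'c':
			command = val
		case 'f': // new file set within the current process
			flush()
			inFile = true
		case 'n':
			inFile = true
			name = val
		case 'T':
			if s, ok := strings.CutPrefix(val, "ST="); ok {
				state = s
			}
		}
	}
	flush()
	return result
}

func main() {
	sample := "p100\ncnginx\nf6\nn*:80\nTST=LISTEN\n" +
		"p200\ncChrome\nf30\nn192.168.1.5:51000->93.184.216.34:443\nTST=ESTABLISHED\n" +
		"p300\ncnc\nf3\nn*:8142\n"
	for _, l := range parseListeners(sample) {
		fmt.Println(l.PID, l.Command, l.Name)
	}
}
```

Finalizing on `p`/`f` boundaries rather than on the `T` line keeps the sketch independent of the order in which `n` and `T` fields appear within a file set.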

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
macOS nc does not allow -p with -l (port must be positional);
Linux nc requires -p to specify the listen port.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…kip ci]

$pid is a read-only automatic variable in PowerShell, so the Get-NetTCPConnection
script's assignment to it failed and detection was broken on all Windows/WSL2
systems. Renamed the variable to $procId.

Added TestFindWindowsPortProcesses that starts a .NET TcpListener on the
Windows side via PowerShell and verifies findWindowsPortProcesses finds it.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
rfay and others added 23 commits April 14, 2026 20:23
…hints

- Update troubleshooting docs to match current terse output format
- Reset sudoMessageShown at start of runPortDiagnose to avoid state leaks
- Early break in findPortProcessesProcNet after matching a PID's inode
- Validate port is numeric before interpolating into PowerShell script
- Show PowerShell hint instead of sudo lsof on Windows when process is
  unidentifiable
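
A sketch of the "validate, then interpolate" step, combined with the `$procId` naming from the earlier fix. The function name `buildListenerScript` and the exact script text are assumptions for illustration; the PR's script may look different. `Get-NetTCPConnection` (with `-LocalPort` and `-State`), its `OwningProcess` property, and `Get-Process -Id` are standard PowerShell.

```go
package main

import (
	"fmt"
	"strconv"
)

// buildListenerScript returns a PowerShell snippet that lists the
// processes listening on port. The port is parsed as an integer first
// and the parsed number, not the raw string, is interpolated, so
// arbitrary text can never reach the script.
func buildListenerScript(port string) (string, error) {
	n, err := strconv.Atoi(port)
	if err != nil || n < 1 || n > 65535 {
		return "", fmt.Errorf("invalid port %q", port)
	}
	// $procId is used because $pid is a read-only automatic variable.
	script := fmt.Sprintf(`Get-NetTCPConnection -LocalPort %d -State Listen -ErrorAction SilentlyContinue | ForEach-Object {
  $procId = $_.OwningProcess
  $proc = Get-Process -Id $procId -ErrorAction SilentlyContinue
  "$procId,$($proc.ProcessName)"
}`, n)
	return script, nil
}

func main() {
	script, err := buildListenerScript("80")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(script)
}
```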

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Three issues fixed:

1. `lsof -sTCP:LISTEN` on macOS arm64 returns UDP connections (e.g. Chrome
   QUIC traffic to remote port 443), causing false port-conflict reports.
   Fixed by using `-iTCP:<port>` instead of `-i :<port>` to restrict lsof
   to TCP sockets before any filtering.

2. `findPortProcessesSudoLsof` connected stdin unconditionally, so in CI
   environments where Buildkite provides a pseudo-terminal, sudo would
   prompt for a password and hang for hours. Fixed with an `isTerminal`
   check — stdin is only connected in interactive sessions.

3. The "Unable to identify the process without elevated privileges" message
   and sudo attempt fired even for free ports (lsof exits 1 when nothing
   matches). Moved sudo detection out of `findPortProcesses` into
   `runPortDiagnose`, gated behind `IsPortActive`, so the elevated path
   only runs when something is actually listening.
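
For the first fix, a small sketch of how the lsof argument list might be assembled. Only the three flags named in these commit messages (`-iTCP:<port>`, `-sTCP:LISTEN`, `-F pcnT`) are used; the helper name `lsofListenArgs` is hypothetical, and the real code may pass additional options.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// lsofListenArgs builds lsof arguments that restrict the query to TCP
// sockets on the given port in LISTEN state, with machine-readable
// field output. Hypothetical helper for illustration only.
func lsofListenArgs(port int) []string {
	return []string{
		"-iTCP:" + strconv.Itoa(port), // TCP only, so UDP/QUIC sockets never match
		"-sTCP:LISTEN",                // listening sockets only
		"-F", "pcnT",                  // fields: PID, command, name, TCP info
	}
}

func main() {
	fmt.Println("lsof " + strings.Join(lsofListenArgs(443), " "))
}
```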

Also fixes `TestFindPortProcessesNC_macOS` to call `findPortProcessesLsof`
directly (skipping the sudo fallback) and skip gracefully when non-sudo
lsof cannot see nc — the macOS CI restriction that triggered the hang.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
- Add docker-proxy case: killing the proxy doesn't free the port;
  guide users to 'docker ps' / 'docker stop <name>' instead.
- Extend OrbStack match to cover "OrbStack Helper" (the process name
  lsof reports on macOS/OrbStack), with the same docker stop guidance.
- Replace 'sudo kill <pid>' default with a softer suggestion:
  "Consider stopping this process using OS tools, e.g. 'kill <pid>'".
- Replace IsPortActive with isPortBindable (net.Listen on 0.0.0.0)
  for the sudo-gate availability check. IsPortActive dials the Docker
  IP which on OrbStack is a VM gateway address, not loopback, so it
  missed listeners on *:80/443. A bind attempt correctly detects any
  listener on any local interface.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
When a port is held by a Docker provider process (docker-proxy, ssh/limactl
for Colima, OrbStack Helper), use context-aware hints:

- If the process belongs to the active Docker provider, query the Docker API
  (GetDockerContainers) to find exactly which container holds the port and
  emit: "Container 'name' is holding this port. Run: docker stop name"
- If the process belongs to a different (non-active) Docker provider, suggest
  stopping that provider instead (colima stop, quit OrbStack menu, etc.)

Add helpers:
- activeDockerProvider(): uses dockerutil.IsColima/IsOrbStack/etc. to name
  the currently active provider
- findContainerForPort(): queries the Docker API by PublicPort to find the
  container name without shelling out to docker ps
- dockerContainerHints(): wraps findContainerForPort with a fallback
- dockerProviderHints(): dispatches to container hints or provider-stop hints

Also add Colima/Lima ssh-mux and limactl cases to portHints, and thread
cmdLine and port through the portHints signature to support these lookups.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…aths [skip ci]

- Detect plain Lima (/.lima/ in cmdline) separately from Colima (/.colima/)
  so limactl/ssh port-forwards on Lima without Colima are correctly identified
- Add Lima and Rancher Desktop to dockerProviderHints non-active provider branch
- Remove hardcoded /Users/testbot paths from hint tests; use relative ~/.lima
  and ~/.colima paths (expanded at test time) so CI runners with any home
  directory pass correctly
- Relax provider hint test assertions to check for "port" (present in all
  provider branches) rather than provider-specific strings that vary depending
  on which Docker provider is active in the test environment

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…kip ci]

Two fixes discovered during manual testing with all 5 macOS Docker
providers running simultaneously:

1. Rancher Desktop forwards ports via an ssh mux whose cmdline contains
   "rancher-desktop/lima" — add a matching case before the generic
   /.lima/ case so it maps to "Rancher Desktop" rather than "Lima",
   and add the Rancher Desktop entry to dockerProviderHints.

2. Rancher Desktop's ssh mux uses SO_REUSEPORT, which allows a second
   net.Listen to succeed on the same port even when a listener is
   already active. This caused isPortBindable to return true (free)
   for a port actually held by a Rancher Desktop container. Replace
   isPortBindable with isPortFree, which performs a bind followed by
   a 250ms dial to 127.0.0.1:<port>: if something answers the dial
   after a successful bind, the port is in use despite the bind
   succeeding.

All 13 manual test scenarios passed after these fixes (see PR comment).

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
When lsof is absent (stock Ubuntu), fall back to sudo ss -tlnp to find
root-owned listeners such as docker-proxy. Also update the "unable to
identify" hint to suggest sudo apt-get install lsof when lsof is missing.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…Linux [skip ci]

Adds full Linux multi-provider support to `ddev utility port-diagnose`:

- `activeDockerProvider()` now returns "Podman", "Docker (rootless)", or
  "Docker" for Linux providers (previously returned "" for all of them).

- `portHints()` recognises the Linux rootless port-forwarding processes:
  `rootlesskit`/`rootlessk` (Docker rootless) and `rootlessport`/`rootlessp`
  (Podman). Both now route through `dockerProviderHints()` for cross-provider
  awareness. `docker-proxy` likewise routes through `dockerProviderHints("Docker")`
  instead of calling `dockerContainerHints` directly.

- `dockerProviderHints()` now emits provider-specific stop hints for each
  Linux provider when it is not the active one, e.g. "Podman has a container
  holding this port (but is not your active Docker provider). Check: podman ps".

- `findContainerForPort()` drops the `p.IP.IsValid()` guard: Podman's
  container-list API omits the IP field entirely, so the guard prevented
  container lookup from ever succeeding under Podman. Checking `PublicPort`
  alone is correct — a zero PublicPort won't match a real port number.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Bugs found and fixed during WSL2 (NAT mode, docker-ce) manual testing:

1. EACCES false positive on ports < 1024: isPortFree treated EACCES
   (permission denied) the same as EADDRINUSE (address in use). On Linux
   with ip_unprivileged_port_start=1024, binding port 80 as non-root
   returns EACCES — port is free but was reported as IN USE. Fix:
   distinguish EACCES from other bind errors and fall through to a
   dial-only check.

2. Root-owned Linux listener hidden by wslrelay: When a root process
   held a port, findPortProcesses (unprivileged) returned empty. But
   findWindowsPortProcesses found wslrelay.exe, making allProcs non-empty
   and skipping the sudo lsof escalation. Fix: track Linux-side and
   Windows-side results separately; sudo escalation is driven by whether
   the Linux side found something, independent of the Windows side.

3. Stale compose YAML overriding current project port config:
   GetPrimaryRouterHTTPPort reads DDEV_ROUTER_HTTP_PORT from the last
   rendered compose YAML (set during previous ddev start), overriding the
   current project config. Fix: set app.ComposeYaml = nil before reading
   ports, same as Start() already does.

4. Spurious "Unable to identify..." and "Running: sudo lsof..." when port
   is held only on the Windows side: when a Windows-only process held a
   port (e.g. PowerShell TcpListener), isPortFree returned true for the
   Linux side but the code still entered the sudo escalation path because
   allProcs was non-empty from the Windows check. Fix: call isPortFree
   once and use it to gate sudo escalation — only escalate when the port
   is confirmed in use on the Linux side.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…istro conflicts [skip ci]

The wslrelay hint now calls findContainerForPort to distinguish between
a Docker/Rancher Desktop container forwarded via WSL2 and a service
running in a different WSL2 distro. When a container is found, it is
named with a docker stop suggestion; otherwise the user is guided to
check each distro with wsl --list and ss -tlnp.

Also fix docker-proxy being misidentified as Docker CE (rootful) when
Rancher Desktop is the active provider — Rancher Desktop in dockerd
mode uses docker-proxy internally, so it should route through
dockerContainerHints instead of the cross-provider warning.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…ip ci]

On Windows-native, findWindowsPortProcesses returning empty was treated
as "IN USE (unable to identify)" because the isPortFree check was gated
behind !nodeps.IsWindows(). This produced false positives for every free
port. Add an explicit isPortFree check on Windows before the
"unable to identify" path so that genuinely free ports report Available.

Found during Windows native manual testing (test matrix scenario #1).

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…resent

On Docker Desktop for Windows, a container binding port 80/443 causes
both wslrelay.exe (the WSL2→Windows relay) and com.docker.backend.exe
(the Docker Desktop proxy) to appear as separate port-holders. The old
code reported both, producing two identical "IN USE" lines per port
pointing to the same container with the same docker-stop hint.

Add suppressWSLRelayIfRedundant(), called after deduplicateByName(),
which drops wslrelay when any other process is also present. wslrelay
is always a subordinate relay; the co-listed provider process carries
the actionable hint. When wslrelay is the sole entry (bare WSL2 service
case) it is preserved unchanged.
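
The rule can be sketched as below. The `portProcess` type and the name matching in `isWSLRelay` are assumptions for illustration; only the behavior (drop wslrelay when any other process is listed, keep it when it is alone) comes from the description above.

```go
package main

import (
	"fmt"
	"strings"
)

// portProcess is a hypothetical record for one process found on a port.
type portProcess struct {
	Name string
	PID  int
}

// isWSLRelay matches the relay with or without the .exe suffix.
func isWSLRelay(name string) bool {
	n := strings.ToLower(name)
	return n == "wslrelay" || n == "wslrelay.exe"
}

// suppressWSLRelayIfRedundant drops wslrelay entries when at least one
// other process is present, and returns the input unchanged otherwise.
func suppressWSLRelayIfRedundant(procs []portProcess) []portProcess {
	hasOther := false
	for _, p := range procs {
		if !isWSLRelay(p.Name) {
			hasOther = true
			break
		}
	}
	if !hasOther {
		return procs // wslrelay alone (or nothing): keep as-is
	}
	out := make([]portProcess, 0, len(procs))
	for _, p := range procs {
		if !isWSLRelay(p.Name) {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	both := []portProcess{{Name: "wslrelay.exe", PID: 10}, {Name: "com.docker.backend.exe", PID: 20}}
	alone := []portProcess{{Name: "wslrelay.exe", PID: 10}}
	fmt.Println(suppressWSLRelayIfRedundant(both))
	fmt.Println(suppressWSLRelayIfRedundant(alone))
}
```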

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Extracts the container-matching loop from findContainerForPort into a
pure helper containerNameForPort(hostPort int, containers []container.Summary)
so it can be tested without a live Docker API.

TestContainerNameForPort covers:
- Docker CE style (IP field populated with 0.0.0.0)
- Podman style (IP field absent / zero netip.Addr)
- Port not matched
- Unexposed port (PublicPort == 0) does not false-match
- Leading slash stripped from container name
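
A sketch of what such a pure helper could look like. To stay self-contained it uses small stand-in types (`containerSummary`, `portBinding`) rather than the Docker API's `container.Summary`, so the field set is an assumption and the real signature differs as noted above.

```go
package main

import (
	"fmt"
	"strings"
)

// portBinding and containerSummary are hypothetical stand-ins for the
// Docker API types; only the fields needed for matching are modeled.
type portBinding struct {
	PublicPort uint16 // 0 when the port is not published
}

type containerSummary struct {
	Names []string // Docker reports names with a leading slash
	Ports []portBinding
}

// containerNameForPort returns the name of the first container that
// publishes hostPort, or "" if none does. Only PublicPort is checked,
// so entries without an IP field (Podman style) still match.
func containerNameForPort(hostPort int, containers []containerSummary) string {
	if hostPort <= 0 {
		return "" // guards against matching unpublished (zero) ports
	}
	for _, c := range containers {
		if len(c.Names) == 0 {
			continue
		}
		for _, p := range c.Ports {
			if int(p.PublicPort) == hostPort {
				return strings.TrimPrefix(c.Names[0], "/")
			}
		}
	}
	return ""
}

func main() {
	containers := []containerSummary{
		{Names: []string{"/db"}, Ports: []portBinding{{PublicPort: 0}}},
		{Names: []string{"/web"}, Ports: []portBinding{{PublicPort: 80}}},
	}
	fmt.Println(containerNameForPort(80, containers))
}
```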

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…sudo flag

Before running any elevated command, show the exact full-path sudo command(s)
that may be used and ask via util.ConfirmTo. Each invocation also prints the
command being run. --allow-sudo skips the prompt (useful for scripts/CI).
Document the flag in commands.md and troubleshooting.md.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Use exec.LookPath as fallback so the lsof path passed to sudo is always
absolute, not a bare command name that could be hijacked via PATH.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Resolving sudo/lsof/ss via PATH is a security risk; accept only known
canonical locations (/usr/bin/sudo, /usr/sbin/lsof, /usr/sbin/ss, etc.).
If a tool is not found at a canonical path it is treated as unavailable.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Set global router ports to unprivileged test ports so runPortDiagnose
can be exercised end-to-end without root. Tests skip when DDEV is active.
Also read router ports from globalconfig when not in a project directory.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…tests

GetActiveProjects() misses a running router holding ports. PowerOff() stops
both all projects and the router, ensuring a clean slate for the test.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
When Docker Desktop is the active provider, the hint describes the
container (not the provider name), so assert "port" which is present
in all branches — consistent with other provider-dependent test cases.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
The manual lsof/netstat/ps/Windows-netstat diagnostic blocks are now
redundant since `ddev utility port-diagnose` identifies the blocking
process automatically. Merge the duplicate Methods 1 and 3, reorder
the common-tools list under Method 1 where it belongs, and update the
WSL2 section to point at port-diagnose instead of removed commands.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
@rfay rfay force-pushed the 20260328_rfay_port_diagnose branch from 98312a8 to c7debd2 April 15, 2026 02:35
Member

@stasadev stasadev left a comment

Looks good to me.

@rfay
Member Author

rfay commented Apr 15, 2026

I read https://ddev--8260.org.readthedocs.build/en/8260/users/usage/troubleshooting/#web-server-ports-already-occupied again, thanks for your suggestions. It's trimmed down, with the new tool as the lead. This whole page is a bit too much, trying to cover everything ever discovered in the history of DDEV :) I'm not sure what can be done about that.

@rfay rfay changed the title from "feat: add ddev utility port-diagnose command to identify port conflicts, fixes #8085" to "feat: add ddev utility port-diagnose command to identify port conflicts, fixes #8085 (#8260) [skip ci]" Apr 15, 2026
@rfay rfay merged commit 69e79ab into ddev:main Apr 15, 2026
36 of 37 checks passed
@rfay rfay deleted the 20260328_rfay_port_diagnose branch April 15, 2026 14:25

Development

Successfully merging this pull request may close these issues.

Feature Request: ddev utility port-diagnose

3 participants