Fix flaky LinuxResourceHealthCheckTests by isolating MeterListener with ReferenceEquals #7302

Merged
stephentoub merged 6 commits into main from copilot/fix-flaky-healthcheck-test
Feb 17, 2026

Contributor

Copilot AI commented Feb 14, 2026

  • Analyze code coverage issue
  • Understand that CheckHealthAsync code path needs to be exercised
  • Implement hybrid approach: use local MeterListener for test correctness + call CheckHealthAsync for code coverage
  • Test passes all cases successfully
  • Reply to user comment
Original prompt

Problem

The test Microsoft.Extensions.Diagnostics.HealthChecks.Test.LinuxResourceHealthCheckTests.TestCpuAndMemoryChecks_WithMetrics is flaky in CI. It fails with:

Assert.Equal() Failure: Values differ
Expected: Healthy
Actual:   Unhealthy

at line 198 in test/Libraries/Microsoft.Extensions.Diagnostics.HealthChecks.ResourceUtilization.Tests/Linux/LinuxResourceHealthCheckTests.cs.

Root Cause

The test creates a Meter with the hardcoded name "Microsoft.Extensions.Diagnostics.ResourceMonitoring" (line 157). The ResourceUtilizationHealthCheck class creates a MeterListener that subscribes to all instruments from any meter with that name via OnInstrumentPublished:

// In ResourceUtilizationHealthCheck.cs, lines 184-190
private void OnInstrumentPublished(Instrument instrument, MeterListener listener)
{
    if (instrument.Meter.Name == ResourceUtilizationInstruments.MeterName)
    {
        listener.EnableMeasurementEvents(instrument);
    }
}

Since this is a [ConditionalTheory] with [MemberData], multiple test cases run. When test cases overlap in execution (or other tests in the same process also create meters with the same name), one test's MeterListener picks up instruments from another test's LinuxUtilizationProvider, receiving wrong measurement values. A test case expecting Healthy can receive high CPU/memory values from a concurrently-running test case that expects Unhealthy.
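
The contamination described above can be reproduced outside the test suite. The following is a minimal sketch, assuming a hypothetical meter name (`Demo.SharedMeter`) and instrument name (`demo.cpu`) that are not part of the repo: two distinct `Meter` instances share one name, and a listener that filters by name ends up observing both gauges.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

public static class NameFilterDemo
{
    // Hypothetical name, standing in for ResourceUtilizationInstruments.MeterName.
    private const string SharedMeterName = "Demo.SharedMeter";

    public static List<double> CollectByName()
    {
        var seen = new List<double>();

        // Two independent meters that happen to share a name,
        // as two overlapping test cases would create.
        using var meterA = new Meter(SharedMeterName);
        using var meterB = new Meter(SharedMeterName);
        meterA.CreateObservableGauge<double>("demo.cpu", () => 0.1); // "healthy" case
        meterB.CreateObservableGauge<double>("demo.cpu", () => 0.9); // "unhealthy" case

        using var listener = new MeterListener();
        listener.InstrumentPublished = (instrument, l) =>
        {
            // Name-based filter: matches instruments from BOTH meters.
            if (instrument.Meter.Name == SharedMeterName)
            {
                l.EnableMeasurementEvents(instrument);
            }
        };
        listener.SetMeasurementEventCallback<double>(
            (instrument, value, tags, state) => seen.Add(value));

        listener.Start();
        listener.RecordObservableInstruments();
        return seen;
    }
}
```

Calling `NameFilterDemo.CollectByName()` returns measurements from both meters, which is how a case expecting Healthy can end up seeing another case's high values.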

Note that line 155 generates a unique name var meterName = Guid.NewGuid().ToString() but it is never used — the meter on line 157 uses the hardcoded constant instead. However, simply using a unique meter name would break the test because ResourceUtilizationHealthCheck.OnInstrumentPublished would reject instruments whose meter name doesn't match the hardcoded ResourceUtilizationInstruments.MeterName.

Fix

The fix should be a test-only change in test/Libraries/Microsoft.Extensions.Diagnostics.HealthChecks.ResourceUtilization.Tests/Linux/LinuxResourceHealthCheckTests.cs.

Instead of relying on ResourceUtilizationHealthCheck's MeterListener (which matches by meter name globally), the test should use its own local MeterListener that filters by meter reference using ReferenceEquals(instrument.Meter, meter). This is the same isolation pattern already used in this repo by:

  • WindowsCountersTests.cs — uses ReferenceEquals(meter, instrument.Meter) in InstrumentPublished
  • AcceptanceTest.cs — uses ReferenceEquals(instrument.Meter.Scope, meterScope) in OnInstrumentPublished
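
For comparison, here is a sketch of the reference-based filter under the same assumptions (the meter and instrument names are hypothetical, and this is not the code from `WindowsCountersTests.cs` or `AcceptanceTest.cs`). Only the instrument owned by the exact `Meter` instance is enabled, even though a second meter shares its name.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

public static class ReferenceFilterDemo
{
    // Hypothetical name shared by both meters.
    private const string SharedMeterName = "Demo.SharedMeter.Isolated";

    public static List<double> CollectByReference()
    {
        var seen = new List<double>();

        using var ownMeter = new Meter(SharedMeterName);
        using var otherMeter = new Meter(SharedMeterName);
        ownMeter.CreateObservableGauge<double>("demo.cpu", () => 0.1);
        otherMeter.CreateObservableGauge<double>("demo.cpu", () => 0.9);

        using var listener = new MeterListener();
        listener.InstrumentPublished = (instrument, l) =>
        {
            // Reference-based filter: only instruments of this exact instance.
            if (ReferenceEquals(instrument.Meter, ownMeter))
            {
                l.EnableMeasurementEvents(instrument);
            }
        };
        listener.SetMeasurementEventCallback<double>(
            (instrument, value, tags, state) => seen.Add(value));

        listener.Start();
        listener.RecordObservableInstruments();
        return seen;
    }
}
```

`ReferenceFilterDemo.CollectByReference()` yields a single measurement, the one from `ownMeter`, regardless of what other meters exist in the process.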

The test should:

  1. Remove the unused meterName variable.
  2. Replace the ResourceUtilizationHealthCheck instance with a local MeterListener scoped via ReferenceEquals to the specific meter instance created in the test.
  3. Collect cpuUsedPercentage and memoryUsedPercentage from the local listener's measurement callback (multiplying by 100 since Linux gauges are in [0,1] range), matching what ResourceUtilizationHealthCheck.OnMeasurementRecorded does.
  4. Call listener.RecordObservableInstruments() to trigger observation.
  5. Call ResourceUtilizationHealthCheck.EvaluateHealthStatusAsync(cpuUsedPercentage, memoryUsedPercentage, options) to evaluate health status. Note: EvaluateHealthStatusAsync is private static — if it cannot be accessed from the test, the test should instead inline the equivalent health evaluation logic, or use the same thresholds comparison that EvaluateHealthStatusAsync performs to construct and verify the expected HealthCheckResult.

The key instruments to listen for are:

  • ResourceUtilizationInstruments.ProcessCpuUtilization and ContainerCpuLimitUtilization: _cpuUsedPercentage = measurement * 100
  • ResourceUtilizationInstruments.ProcessMemoryUtilization and ContainerMemoryLimitUtilization: _memoryUsedPercentage = measurement * 100
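
The collection steps above could look roughly like the sketch below. The instrument names (`demo.cpu.utilization`, `demo.memory.utilization`) and the `LocalUtilizationCollector` helper are hypothetical stand-ins; the real test would match the `ResourceUtilizationInstruments` names instead.

```csharp
using System;
using System.Diagnostics.Metrics;

public static class LocalUtilizationCollector
{
    // Hypothetical instrument names, used only for this illustration.
    public const string CpuInstrument = "demo.cpu.utilization";
    public const string MemoryInstrument = "demo.memory.utilization";

    public static (double CpuPercent, double MemoryPercent) Collect(Meter meter)
    {
        double cpuUsedPercentage = 0;
        double memoryUsedPercentage = 0;

        using var listener = new MeterListener();
        listener.InstrumentPublished = (instrument, l) =>
        {
            // Scope the listener to the caller's meter instance.
            if (ReferenceEquals(instrument.Meter, meter))
            {
                l.EnableMeasurementEvents(instrument);
            }
        };
        listener.SetMeasurementEventCallback<double>((instrument, value, tags, state) =>
        {
            // Gauges report a fraction in [0, 1]; scale to a percentage.
            switch (instrument.Name)
            {
                case CpuInstrument:
                    cpuUsedPercentage = value * 100;
                    break;
                case MemoryInstrument:
                    memoryUsedPercentage = value * 100;
                    break;
            }
        });

        listener.Start();
        listener.RecordObservableInstruments(); // triggers the gauge callbacks
        return (cpuUsedPercentage, memoryUsedPercentage);
    }
}
```

A caller would create its own `Meter`, register observable gauges on it, and pass that same instance to `Collect`.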

After collecting the measurements, the test should evaluate health status using the same logic as EvaluateHealthStatusAsync in ResourceUtilizationHealthCheck.cs (lines 42-90) which compares the percentages against the CpuThresholds and MemoryThresholds from ResourceUtilizationHealthCheckOptions.
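
If the evaluation has to be inlined, the comparison might be sketched as follows. This is not the library's implementation: `DemoHealth`, `DemoThresholds`, and `DemoHealthEvaluator` are hypothetical types, and the strict greater-than comparison and "worst of the two resources wins" rule are assumptions, so the actual behavior of `EvaluateHealthStatusAsync` should be checked against the source.

```csharp
using System;

// Hypothetical stand-ins for HealthStatus and the threshold options.
public enum DemoHealth { Healthy = 0, Degraded = 1, Unhealthy = 2 }

public sealed record DemoThresholds(double? DegradedPercent, double? UnhealthyPercent);

public static class DemoHealthEvaluator
{
    // Classifies one resource; assumes "greater than" semantics for thresholds.
    public static DemoHealth Classify(double usedPercent, DemoThresholds thresholds)
    {
        if (thresholds.UnhealthyPercent is double unhealthy && usedPercent > unhealthy)
        {
            return DemoHealth.Unhealthy;
        }

        if (thresholds.DegradedPercent is double degraded && usedPercent > degraded)
        {
            return DemoHealth.Degraded;
        }

        return DemoHealth.Healthy;
    }

    // Takes gauge fractions in [0, 1], scales them to percentages,
    // and reports the more severe of the two statuses (an assumption).
    public static DemoHealth Evaluate(
        double cpuFraction, double memoryFraction,
        DemoThresholds cpu, DemoThresholds memory)
    {
        var cpuStatus = Classify(cpuFraction * 100, cpu);
        var memoryStatus = Classify(memoryFraction * 100, memory);
        return (DemoHealth)Math.Max((int)cpuStatus, (int)memoryStatus);
    }
}
```

With thresholds of 80 (degraded) and 90 (unhealthy) for both resources, fractions of 0.5 and 0.5 classify as healthy, while a CPU fraction of 0.95 classifies as unhealthy under these assumptions.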

If making EvaluateHealthStatusAsync accessible is too invasive, an alternative simpler approach: keep the ResourceUtilizationHealthCheck but add [Collection("ResourceMonitoring Meter Tests")] to serialize with other tests that create meters with the same name. This is the pattern used in this repo for "Tcp Connection Tests", "StdoutUsage", and "EnvironmentVariableTests" collections. However, this only prevents intra-collection parallelism and may not fully prevent contamination from other test classes.

The preferred approach is the ReferenceEquals-based local listener since it provides complete isolation regardless of parallelism.

This pull request was created from Copilot chat.



Copilot AI changed the title from "[WIP] Fix flaky Linux resource health check test" to "Fix flaky LinuxResourceHealthCheckTests by isolating MeterListener with ReferenceEquals" Feb 14, 2026
Copilot AI requested a review from stephentoub February 14, 2026 12:52
@stephentoub stephentoub marked this pull request as ready for review February 14, 2026 15:17
@stephentoub stephentoub requested a review from a team as a code owner February 14, 2026 15:17
Copilot AI review requested due to automatic review settings February 14, 2026 15:17
@stephentoub stephentoub enabled auto-merge (squash) February 14, 2026 15:17
Contributor

Copilot AI left a comment

Pull request overview

This PR addresses CI flakiness in LinuxResourceHealthCheckTests.TestCpuAndMemoryChecks_WithMetrics by avoiding cross-test metric contamination from MeterListener subscriptions that match meters globally by name, and instead isolating metric collection to the specific Meter instance created by the test.

Changes:

  • Replaced the test’s use of ResourceUtilizationHealthCheck’s internal MeterListener with a locally-scoped MeterListener filtered via ReferenceEquals(meter, instrument.Meter).
  • Captured CPU/memory utilization measurements from observable instruments and evaluated health via ResourceUtilizationHealthCheck.EvaluateHealthStatusAsync.
  • Removed unused variables (dataTracker, meterName, checkContext) and switched to string literals for instrument names to avoid type ambiguity.

@stephentoub
Member

@copilot CI is complaining about code coverage. Can you take a look?
https://dev.azure.com/dnceng-public/public/_build/results?buildId=1294962&view=codecoverage-tab

auto-merge was automatically disabled February 15, 2026 12:00

Head branch was pushed to by a user without write access

Contributor Author

Copilot AI commented Feb 15, 2026

@copilot CI is complaining about code coverage. Can you take a look?
https://dev.azure.com/dnceng-public/public/_build/results?buildId=1294962&view=codecoverage-tab

Fixed in f335eda. The test now calls CheckHealthAsync to maintain code coverage while using a local MeterListener with ReferenceEquals to ensure correctness and avoid flakiness from parallel test execution.

@stephentoub stephentoub merged commit 0d92f3f into main Feb 17, 2026
6 checks passed
@stephentoub stephentoub deleted the copilot/fix-flaky-healthcheck-test branch February 17, 2026 14:39
@github-actions github-actions bot locked and limited conversation to collaborators Mar 20, 2026