Operator Troubleshooting¶
This page documents the expected failure modes for a Ranger run and how operators should interpret them.
First Principle¶
A partial run is still useful.
Ranger should preserve successful domain results even when some targets are unreachable, some credentials are missing, or some optional integrations are not present.
Collector States¶
The manifest should clearly distinguish:
success— the domain ran and returned the expected datapartial— the domain ran but some evidence or subtargets were unavailablefailed— the domain should have run but could not completeskipped— the domain was not run because inputs, credentials, or operator intent excluded itnot-applicable— the domain is not relevant to the detected environment
Common Problems¶
Arc Run Command Transport Fails¶
Symptoms:
- collectors are attempted via Arc but return non-zero exit codes or empty output
- manifest shows
status: skippedwith a message referencing Arc transport unavailability
Check:
Az.ConnectedMachinemodule is installed:Get-Module Az.ConnectedMachine -ListAvailable- active Az context exists:
Get-AzContext - nodes are registered as Arc-enabled servers in the configured subscription and resource group
- the Az identity has
Microsoft.HybridCompute/machines/runCommands/actionon the machines behavior.transportis set toautoorarcin config —winrmwill not attempt Arc fallback
Connectivity Matrix Shows Unexpected Posture¶
Symptoms:
- run reports
disconnectedposture when cluster nodes should be reachable - collectors are skipped that you expected to run
Check:
manifest.run.connectivity.cluster.targets— confirms which hosts were probed and whether they responded on ports 5985 or 5986- WinRM TrustedHosts configuration on the runner
- DNS resolution for node FQDNs from the runner
behavior.degradationMode— set tograceful(skip) orstrict(fail) as intended
To inspect the connectivity matrix after a run:
$m = Get-Content .\manifest\audit-manifest.json | ConvertFrom-Json
$m.run.connectivity | ConvertTo-Json -Depth 5
Progress Bars Not Showing¶
Symptoms:
-ShowProgresswas passed but no progress display appears
Check:
PwshSpectreConsoleis installed:Get-Module PwshSpectreConsole -ListAvailable- the session is interactive — progress is automatically suppressed in CI (
$env:CI,$env:GITHUB_ACTIONS, etc.) and when-Unattendedis set - the terminal supports ANSI escape sequences (Windows Terminal, VS Code terminal, or iTerm2 work; classic
cmd.exedoes not)
If PwshSpectreConsole is absent, Ranger falls back to Write-Progress silently — this is expected.
WinRM Access Fails¶
Symptoms:
- cluster, storage, networking, VM, or management-tool domains fail early
- node-level collectors never start
Check:
- credential validity
- WinRM reachability from the jump box
- DNS resolution for node names
- firewall rules on the management path
Azure Authentication Fails¶
Symptoms:
- Azure integration, monitoring, policy, update, and backup domains fail or skip
Check:
- current Az context or service principal configuration
- subscription and tenant alignment
- RBAC scope for the selected resource group or Azure Local instance
Hardware Domain Shows skipped With No BMC Configured¶
Since v2.6.5 (#316), the hardware collector is automatically removed from the run when targets.bmc.endpoints is empty and the hardware domain was not explicitly included via domains.include. This is expected behavior — no misleading skipped entry will appear in the log.
If you want hardware collection but see skipped:
- Confirm
targets.bmc.endpointscontains at least one IP or hostname in your config, or - Pass BMC IPs when prompted — in interactive sessions Ranger asks
Include BMC / iDRAC hardware collection? [Y/N]before collectors run, or - Add
hardwaretodomains.includein your config to force-include it (you will then be prompted for BMC endpoints or the collector will fail with a clear error)
Redfish or iDRAC Access Fails¶
Symptoms:
- hardware or OEM-management domains skip or fail with a BMC connectivity error
Check:
- BMC reachability from the workstation
- correct endpoint names or IPs
- credential validity
- certificate or TLS trust posture
Key Vault Resolution Fails¶
Symptoms:
- Ranger cannot resolve one or more
keyvault://references - local-identity discovery lacks Key Vault-backed detail
Check:
- the Key Vault exists in the correct subscription and resource group
- the secret name and optional version are correct
- the calling identity has the required Key Vault role assignments
- network reachability to the Key Vault endpoint exists
Local Identity with Key Vault Tooling Confusion¶
Symptoms:
- an operator expects WAC or SCVMM behavior that does not work in a local-identity deployment
Check:
- whether the environment is AD-backed or local identity with Key Vault
- documented tool support limitations for local-identity deployments
Current Microsoft documentation states that Windows Admin Center is not supported in Azure Key Vault-based identity environments and SCVMM support is limited or unsupported.
Azure Local VM Management Prerequisites Missing¶
Symptoms:
- Arc VM or Azure Local VM management resources are absent or unhealthy
Check:
- Arc Resource Bridge exists and is healthy
- Custom Location exists and is healthy
- Azure Local instance, resource bridge, and related resources are in the same supported region
- firewall URL requirements are satisfied
Disconnected Environment Assumptions Leak Into Connected Docs¶
Symptoms:
- operators expect public-cloud Azure behavior in a disconnected deployment
Check:
- whether the environment is actually a disconnected-operations deployment
- whether the required local control-plane prerequisites exist
- whether a dedicated management cluster exists where the documented disconnected model requires it
Run Failed Before Any Log Entries Appeared¶
Before v2.6.5, the log file was not opened until after config loading, auto-discovery, and validation completed — so a failure during that phase produced either no log file or an empty one with only the header line.
Since v2.6.5 (#318), ranger.log begins with a # bootstrap phase section that captures every Write-RangerLog call made during config load, auto-discovery, and validation. The first line is always the invocation parameters:
# bootstrap phase
[2026-04-17T10:32:01][INFO ] Invoke-AzureLocalRanger: invoked — ConfigPath='.\ranger.yml'
[2026-04-17T10:32:02][INFO ] Invoke-RangerAzureAutoDiscovery: tenantId '...' sourced from active Az session
...
# run phase
[2026-04-17T10:32:05][INFO ] AzureLocalRanger run started — package: tplabs-current-state-20260417T103201Z
If the run failed before a package root could be created (e.g., the output path is invalid), there will still be no log file — look at the PowerShell error output in that case. For everything else, the bootstrap section tells you exactly what Ranger saw when it started.
To get the most detail in the bootstrap section, set behavior.logLevel: debug in your config or pass -Verbose to the PowerShell session before invoking.
What To Do With Missing Data¶
Missing data should not be hidden. Ranger should surface one of these explanations:
- target unreachable
- credential unavailable
- not applicable to current topology
- feature not supported in current release
- skipped by operator configuration
That explanation is part of the artifact quality, not an implementation detail.
Escalation Path¶
When a run result is surprising, operators should narrow the problem in this order:
- validate configuration
- validate target reachability
- validate credentials and RBAC
- rerun only the affected domains
- review collector status and provenance in the manifest