Task 03: Enable HCI Insights
DOCUMENT CATEGORY: Runbook
SCOPE: HCI Insights monitoring integration
PURPOSE: Enable Azure Local Insights workbook for cluster health and performance monitoring
MASTER REFERENCE: Microsoft Learn - Azure Local Insights
Status: Active
Azure Local Insights provides a rich, pre-built Azure Monitor Workbook that visualizes cluster health, node status, storage performance, and VM state. Enabling Insights automatically installs the Azure Monitor Agent (if not already present) and configures the required Data Collection Rules for HCI-specific telemetry.
Prerequisites
| Requirement | Description | Validation |
|---|---|---|
| Log Analytics Workspace | Created in Task 01 | Workspace accessible |
| Azure Monitor Agent | Installed on cluster nodes (Task 02) | Extension status: Succeeded |
| Arc-Enabled Cluster | Azure Local cluster registered | Portal shows cluster resource |
| RBAC Permissions | Monitoring Contributor | Role assignment verified |
Variables from variables.yml
| Variable | Config Path | Example |
|---|---|---|
| AZURE_SUBSCRIPTION_ID | azure.subscription.id | 00000000-0000-0000-0000-000000000000 |
| AZURE_SUBSCRIPTION_NAME | azure.subscription.name | Azure Local Production |
| AZURE_RESOURCE_GROUP | azure.resource_group.name | rg-azurelocal-prod-eus2 |
| CLUSTER_NAME | cluster.name | azl-dal-cl01 |
| LOG_ANALYTICS_WORKSPACE_NAME | monitoring.log_analytics.workspace_name | law-azl-DAL-prod-01 |
| SITE_CODE | site.code | DAL |
| CLUSTER_NODE_01_NAME | nodes[0].name | azl-dal-node-01 |
Overview
HCI Insights collects data from specific Windows Event Log channels and performance counters:
| Data Type | Source | Purpose |
|---|---|---|
| Health Events | Microsoft-Windows-Health/Operational | Node and component health status |
| SDDC Events | Microsoft-Windows-SDDC-Management/Operational | Cluster management events |
| Memory | Memory\Available Bytes | Memory utilization |
| Network | Network Interface(*)\Bytes Total/sec | Network throughput |
| CPU | Processor(_Total)\% Processor Time | CPU utilization |
| RDMA | RDMA Activity(*)\RDMA Inbound Bytes/sec, RDMA Activity(*)\RDMA Outbound Bytes/sec | Storage network performance |
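To spot-check that these counters can actually be sampled on a node, a minimal sketch run locally on a cluster node might look like the following. The counter paths come from the table above, with the RDMA row written out as separate inbound and outbound counters (an assumption about the exact counter names). `Test-InsightsCounter` is a hypothetical helper name, not part of any module.

```powershell
# Hypothetical helper: reports whether each counter path can be sampled locally.
function Test-InsightsCounter {
    param(
        [Parameter(Mandatory)]
        [string[]]$CounterPath
    )
    foreach ($path in $CounterPath) {
        $available = $true
        try {
            # One sample is enough to prove the counter set and instance exist.
            $null = Get-Counter -Counter $path -MaxSamples 1 -ErrorAction Stop
        } catch {
            $available = $false
        }
        [pscustomobject]@{ Counter = $path; Available = $available }
    }
}

$insightsCounters = @(
    '\Memory\Available Bytes',
    '\Network Interface(*)\Bytes Total/sec',
    '\Processor(_Total)\% Processor Time',
    '\RDMA Activity(*)\RDMA Inbound Bytes/sec',   # assumed expansion of the RDMA row
    '\RDMA Activity(*)\RDMA Outbound Bytes/sec'   # assumed expansion of the RDMA row
)

Test-InsightsCounter -CounterPath $insightsCounters | Format-Table -AutoSize
```

A node without RDMA-capable adapters would be expected to report the two RDMA counters as unavailable, which is not in itself an Insights fault.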
Configuration Options
Insights can be enabled in one of three ways:
- Azure Portal
- Orchestrated Script (Azure Policy)
- Standalone Script
Azure Portal
Step 3.1: Enable Insights from Cluster Resource
- Navigate to Azure Portal → Azure Local → Select your cluster
- On the Capabilities tab, locate the Insights tile
- Click Insights to open the configuration page
- Click Get Started
Step 3.2: Configure Data Collection Rule
- On the Insights configuration page, either:
  - Select an existing DCR from the dropdown, or
  - Click Create New to create a dedicated Insights DCR
- For a new DCR, use these settings:
| Setting | Value |
|---|---|
| Subscription | {{AZURE_SUBSCRIPTION_NAME}} |
| DCR Name | AzureStackHCI-{{CLUSTER_NAME}}-dcr |
| Data Collection Endpoint | dce-{{SITE_CODE}}-azl-01 |
- Click Review + create
When you configure Insights, Azure Monitor Agent is automatically installed on all cluster nodes if not already present. The DCRs created by Insights have the prefix AzureStackHCI-.
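The naming convention above can be checked from PowerShell. The following is a rough sketch, assuming the Az.Monitor module is installed and a session is already authenticated. `Get-InsightsDcrName` and `Test-InsightsDcrName` are hypothetical helpers introduced here for illustration; only `Get-AzDataCollectionRule` is a real cmdlet.

```powershell
# Hypothetical helper: builds the DCR name used in this runbook's settings table.
function Get-InsightsDcrName {
    param([Parameter(Mandatory)][string]$ClusterName)
    "AzureStackHCI-$ClusterName-dcr"
}

# Hypothetical helper: true when a DCR name carries the Insights prefix.
function Test-InsightsDcrName {
    param([Parameter(Mandatory)][string]$Name)
    $Name.StartsWith('AzureStackHCI-', [System.StringComparison]::OrdinalIgnoreCase)
}

$expected = Get-InsightsDcrName -ClusterName '{{CLUSTER_NAME}}'

# Requires Az.Monitor and an authenticated session; skipped when the module is absent.
if (Get-Command Get-AzDataCollectionRule -ErrorAction SilentlyContinue) {
    Get-AzDataCollectionRule -ResourceGroupName '{{AZURE_RESOURCE_GROUP}}' |
        Where-Object { Test-InsightsDcrName -Name $_.Name } |
        Select-Object Name, @{ n = 'MatchesRunbookName'; e = { $_.Name -eq $expected } }
}
```

An empty result suggests that no Insights DCR exists in the resource group yet, or that it was created in a different resource group.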
Step 3.3: Verify Insights Configuration
- Return to your cluster's Capabilities tab
- The Insights tile should now show Configured
- Click Insights to view the workbook
Orchestrated Script (Azure Policy)
For organizations with multiple Azure Local clusters, use Azure Policy to enable Insights at scale:
Step 3.1: Assign Built-in Policy
- Navigate to Azure Policy → Definitions
- Search for "Azure Local" or "HCI Insights"
- Select Configure Azure Local machines to be associated with a Data Collection Rule
- Click Assign
Step 3.2: Configure Policy Parameters
| Parameter | Value |
|---|---|
| Scope | Management group or subscription containing clusters |
| Data Collection Rule Resource ID | Resource ID of your DCR |
| Data Collection Endpoint Resource ID | Resource ID of your DCE |
- Click Review + create → Create
Step 3.3: Trigger Remediation
- Navigate to Policy → Compliance
- Find the assigned policy
- Click Create Remediation Task to apply to existing clusters
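The remediation step can also be scripted. The sketch below assumes the Az.PolicyInsights module, an authenticated session, and the resource ID of the assignment created in Step 3.2. The function names and the task naming scheme are hypothetical. If the assignment is an initiative rather than a single policy, `Start-AzPolicyRemediation` additionally needs `-PolicyDefinitionReferenceId`, which this sketch does not cover.

```powershell
# Hypothetical helper: timestamped task name so repeated runs do not collide.
function New-RemediationName {
    param([datetime]$Now = (Get-Date))
    'remediate-hci-insights-{0:yyyyMMddHHmm}' -f $Now
}

# Hypothetical wrapper around the Az.PolicyInsights remediation cmdlets.
function Start-InsightsRemediation {
    param(
        [Parameter(Mandatory)][string]$PolicyAssignmentId
    )
    $name = New-RemediationName
    # Creates the remediation task in the current subscription context.
    Start-AzPolicyRemediation -Name $name -PolicyAssignmentId $PolicyAssignmentId | Out-Null
    # Reads the task back so the caller can see its provisioning state.
    Get-AzPolicyRemediation -Name $name | Select-Object Name, ProvisioningState
}

# Usage (after Connect-AzAccount, with the assignment ID from Step 3.2):
# Start-InsightsRemediation -PolicyAssignmentId '<policy-assignment-resource-id>'
```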
Standalone Script
#Requires -Modules Az.Accounts, Az.StackHCI, Az.OperationalInsights, Az.ConnectedMachine
# Variables
$SubscriptionId = "{{AZURE_SUBSCRIPTION_ID}}"
$ResourceGroup = "{{AZURE_RESOURCE_GROUP}}"
$ClusterName = "{{CLUSTER_NAME}}"
$WorkspaceName = "{{LOG_ANALYTICS_WORKSPACE_NAME}}"
# Connect to Azure
Connect-AzAccount -Subscription $SubscriptionId
# Get cluster resource (stops here if the cluster is not registered)
$cluster = Get-AzStackHciCluster `
    -ResourceGroupName $ResourceGroup `
    -Name $ClusterName `
    -ErrorAction Stop
# Get workspace (stops here if the workspace does not exist)
$workspace = Get-AzOperationalInsightsWorkspace `
    -ResourceGroupName $ResourceGroup `
    -Name $WorkspaceName `
    -ErrorAction Stop
# Note: For Azure Local 23H2+, Insights is enabled through the portal.
# This script confirms the prerequisites and prints the portal steps,
# including the DCR name that Insights should use.
$dcrName = "AzureStackHCI-$ClusterName-dcr"
Write-Host "To enable Insights:" -ForegroundColor Cyan
Write-Host "1. Navigate to Azure Portal → Azure Local → $ClusterName"
Write-Host "2. Click on Insights tile under Capabilities"
Write-Host "3. Select or create DCR: $dcrName"
Write-Host "4. Workspace: $WorkspaceName"
# Verify cluster nodes are registered with Arc
$arcServers = Get-AzConnectedMachine -ResourceGroupName $ResourceGroup |
    Where-Object { $_.Name -like "*$ClusterName*" -or $_.Name -like "{{CLUSTER_NODE_01_NAME}}*" }
Write-Host "`nCluster nodes registered with Arc:" -ForegroundColor Yellow
$arcServers | ForEach-Object { Write-Host " - $($_.Name): $($_.Status)" }
Using the Insights Workbook
Once enabled, the Insights workbook provides several views:
Cluster Health Overview
| Metric | Description | Alert Threshold |
|---|---|---|
| Cluster Health | Overall cluster status | Warning/Critical |
| Node Health | Individual node status | Any node unhealthy |
| Storage Health | Storage pool and volume status | Degraded/Unhealthy |
| VM Health | Virtual machine states | Failed VMs |
Performance Monitoring
The workbook displays:
- CPU utilization across all nodes
- Memory availability trends
- Network throughput (total bytes/sec)
- RDMA performance for storage traffic
- Storage latency for CSV volumes
Navigation Tabs
| Tab | Content |
|---|---|
| Overview | Cluster summary, health status, quick stats |
| Nodes | Per-node CPU, memory, network metrics |
| Storage | Volume health, capacity, latency |
| VMs | VM count, state distribution |
Validation
Verify Insights Status
# Check cluster registration status
$cluster = Get-AzStackHciCluster `
    -ResourceGroupName "{{AZURE_RESOURCE_GROUP}}" `
    -Name "{{CLUSTER_NAME}}"
Write-Host "Cluster status: $($cluster.Status)"
# Check for AMA extensions on nodes (list every node in the cluster)
$nodes = @("{{CLUSTER_NODE_01_NAME}}", "{{CLUSTER_NODE_02_NAME}}")
foreach ($node in $nodes) {
    $ext = Get-AzConnectedMachineExtension `
        -ResourceGroupName "{{AZURE_RESOURCE_GROUP}}" `
        -MachineName $node `
        -Name "AzureMonitorWindowsAgent"
    Write-Host "$node AMA Status: $($ext.ProvisioningState)"
}
Verify Data Collection
Run these queries in Log Analytics to confirm data is flowing:
// Check Health events
Event
| where Source == "Microsoft-Windows-Health"
| where TimeGenerated > ago(1h)
| summarize count() by Computer
| order by Computer asc
// Check SDDC Management events
Event
| where Source == "Microsoft-Windows-SDDC-Management"
| where TimeGenerated > ago(1h)
| summarize count() by Computer
Sample Health Query
// Cluster health summary
Event
| where Source == "Microsoft-Windows-SDDC-Management"
| where EventID == 3000 // Server health event
| where TimeGenerated > ago(24h)
| extend ParsedData = parse_json(RenderedDescription)
| project TimeGenerated, Computer, HealthState = ParsedData.HealthState
| summarize arg_max(TimeGenerated, *) by Computer // latest health record per node
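The data-flow checks in this section can also be run from PowerShell instead of the Log Analytics query editor. The sketch below assumes the Az.OperationalInsights module and an authenticated session. `Get-EventCountQuery` and `Invoke-InsightsEventCount` are hypothetical helpers; they wrap the real `Get-AzOperationalInsightsWorkspace` and `Invoke-AzOperationalInsightsQuery` cmdlets.

```powershell
# Hypothetical helper: builds the per-computer event count query used in this section.
function Get-EventCountQuery {
    param(
        [Parameter(Mandatory)][string]$Source,
        [int]$LookbackHours = 1
    )
    @"
Event
| where Source == "$Source"
| where TimeGenerated > ago(${LookbackHours}h)
| summarize count() by Computer
| order by Computer asc
"@
}

# Hypothetical wrapper: runs the query against the workspace and returns the rows.
function Invoke-InsightsEventCount {
    param(
        [Parameter(Mandatory)][string]$ResourceGroup,
        [Parameter(Mandatory)][string]$WorkspaceName,
        [Parameter(Mandatory)][string]$Source
    )
    $workspace = Get-AzOperationalInsightsWorkspace -ResourceGroupName $ResourceGroup -Name $WorkspaceName
    $query = Get-EventCountQuery -Source $Source
    # CustomerId is the workspace GUID that the query API expects.
    (Invoke-AzOperationalInsightsQuery -WorkspaceId $workspace.CustomerId -Query $query).Results
}

# Usage (after Connect-AzAccount):
# Invoke-InsightsEventCount -ResourceGroup '{{AZURE_RESOURCE_GROUP}}' `
#     -WorkspaceName '{{LOG_ANALYTICS_WORKSPACE_NAME}}' -Source 'Microsoft-Windows-Health'
```

If a node is missing from the returned rows, no events from that source arrived from it within the lookback window.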
Troubleshooting
| Issue | Possible Cause | Resolution |
|---|---|---|
| Insights shows "Not configured" | AMA not installed | Check Extensions on cluster nodes |
| No data in workbook | DCR not associated | Verify DCR associations in Monitor → DCRs |
| Stale data (>15 min old) | Agent connectivity issue | Run azcmagent show on each node and confirm the agent is connected |
| Missing health events | Event log channel not enabled | Verify Windows Event Log settings |
| Cluster shows "Other" status | Recent Arc reconnection | Wait for next health check cycle |
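The stale-data row above uses a 15-minute threshold. As an illustration only, that check can be expressed as a small function that compares the timestamp of the most recent record from a node against the current time. `Test-TelemetryStale` is a hypothetical helper, and where the last-record timestamp comes from (for example, the newest TimeGenerated value for that Computer) is left to the caller.

```powershell
# Hypothetical helper: true when the newest record is older than the threshold.
function Test-TelemetryStale {
    param(
        [Parameter(Mandatory)][datetime]$LastRecordUtc,
        [datetime]$NowUtc = [datetime]::UtcNow,
        [int]$ThresholdMinutes = 15   # matches the troubleshooting table
    )
    ($NowUtc - $LastRecordUtc).TotalMinutes -gt $ThresholdMinutes
}

# Example with fixed timestamps: the last record is 30 minutes old.
$last = [datetime]::new(2026, 3, 24, 10, 0, 0, [System.DateTimeKind]::Utc)
$now  = [datetime]::new(2026, 3, 24, 10, 30, 0, [System.DateTimeKind]::Utc)
Test-TelemetryStale -LastRecordUtc $last -NowUtc $now   # True
```

Both timestamps should be in UTC; mixing local and UTC values would skew the result by the local offset.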
Event Log Verification
On each cluster node, verify the required event logs are enabled:
# Run on each cluster node
$logs = @(
    "Microsoft-Windows-Health/Operational",
    "Microsoft-Windows-SDDC-Management/Operational"
)
foreach ($log in $logs) {
    $logInfo = Get-WinEvent -ListLog $log -ErrorAction SilentlyContinue
    if (-not $logInfo) {
        Write-Host "❌ $log - Not found" -ForegroundColor Red
    } elseif ($logInfo.IsEnabled) {
        Write-Host "✅ $log - Enabled" -ForegroundColor Green
    } else {
        # The channel exists but is switched off, so no events are collected
        Write-Host "⚠️ $log - Present but disabled" -ForegroundColor Yellow
    }
}
Variables Reference
| Variable | Description | Example |
|---|---|---|
| {{CLUSTER_NAME}} | Azure Local cluster name | azl-dal-cl01 |
| {{CLUSTER_NODE_01_NAME}} | First node hostname | azl-dal-node-01 |
| {{LOG_ANALYTICS_WORKSPACE_NAME}} | Workspace name | law-azl-DAL-prod-01 |
Next Steps
After enabling HCI Insights:
- ➡️ Task 04: Setup Alerting (configure alert rules based on Insights data)
- Review the Insights workbook for baseline understanding
- Bookmark key workbook views for operational monitoring
- Consider enabling at scale using Azure Policy for multiple clusters
Navigation
| Previous | Up | Next |
|---|---|---|
| ← Task 02: Azure Monitor Agent | Phase 02: Monitoring & Observability | Task 04: Setup Alerting → |
| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0.0 | 2026-03-24 | Azure Local Cloudnology Team | Initial release |