Skip to main content
Version: Next

Part 7: Cluster Testing & Validation

Runbook Azure

DOCUMENT CATEGORY: Runbook SCOPE: Production readiness validation PURPOSE: Validate cluster for production workloads and customer handover MASTER REFERENCE: Microsoft Learn - Validate Deployment

Status: Active


Overview

This section covers comprehensive technical validation of the Azure Local cluster to ensure production readiness before customer handover. Each step generates a validation report that feeds into the consolidated handover documentation.

Purpose

  • Validate all cluster components are functioning correctly
  • Document performance baselines for storage and network
  • Test high availability and disaster recovery capabilities
  • Verify security posture and compliance
  • Generate reports for customer handover package

Validation Steps

StepDescriptionDurationDeliverable
1Infrastructure Health Validation1-2 hoursHealth validation report
2VMFleet Storage Performance Testing4-6 hoursStorage performance baseline report
3Network & RDMA Validation1-2 hoursNetwork validation report
4High Availability Testing2-4 hoursHA/failover test results, RTO documentation
5Security & Compliance Validation1-2 hoursSecurity posture report
6Backup & DR Validation2-4 hoursBackup/restore test results

Total Estimated Duration: 12-20 hours (steps can be parallelized)


Prerequisites

Before starting validation:

  • All cluster deployment stages complete (Parts 1-6)
  • Administrative access to all cluster nodes
  • Windows Server 2022 Core VHD available (for VMFleet)
  • Maintenance window scheduled (VMFleet creates significant I/O load)
  • Test VMs will be created during validation (deleted after)

Test Environment Setup

Dedicated test VMs are created during Step 4 (HA Testing):

Test VMOSPurposeDelete After
TEST-WIN-01Windows Server 2022HA testing, backup testing, connectivityYes
TEST-WIN-02Windows Server 2022Live migration target, failover testingYes
TEST-LNX-01Ubuntu 22.04 LTSCross-platform validation, network testingYes

Validation Reports

Each step generates a separate report stored in a central location:

\\<ClusterName>\ClusterStorage$\Collect\validation-reports\
├── 01-infrastructure-health-report-YYYYMMDD.txt
├── 02-vmfleet-storage-baseline-YYYYMMDD.csv
├── 02-vmfleet-storage-summary-YYYYMMDD.txt
├── 03-network-rdma-validation-YYYYMMDD.txt
├── 04-ha-failover-test-results-YYYYMMDD.txt
├── 05-security-compliance-report-YYYYMMDD.txt
├── 06-backup-dr-validation-YYYYMMDD.txt
└── CONSOLIDATED-VALIDATION-REPORT-YYYYMMDD.pdf

Consolidated Handover Report

After all steps complete, generate consolidated report:

# Script to consolidate all validation reports
# Located in: scripts/validation/New-ConsolidatedValidationReport.ps1

The consolidated report includes:

  • Executive summary (pass/fail status for each category)
  • Storage performance baseline metrics
  • Network performance metrics
  • HA/failover test results with RTO/RPO
  • Security compliance status
  • Backup validation results
  • Recommendations and known issues

Success Criteria

Production Readiness Checklist

CategoryCriteriaStatus
InfrastructureAll nodes healthy, Test-Cluster passes, no health faults
StorageVMFleet baseline documented, performance within expected range
NetworkRDMA operational, DCB configured, all VLANs accessible
High AvailabilityLive migration < 5s, node failover < 2 min, quorum maintained
SecurityDefender score > 80%, no critical recommendations, RBAC verified
BackupBackup job successful, restore test passed, RPO/RTO documented

Validation Sign-Off

RoleNameDateSignature
Azure Local Cloud Engineer
Azure Local Cloud QA
Project Manager

Next Steps

After all validation steps complete with passing status:

  1. Generate consolidated validation report
  2. Archive individual reports to handover package
  3. Clean up test VMs and VMFleet resources
  4. Proceed to Part 8: Validation & Handover for team sign-offs