Schedule clean up of your sandbox tenant with Azure Automation


Dan Rios


Introduction

If you and your teams use a sandbox Azure tenant to simulate deployments that don’t fit into the typical subscription scope (for example, full-blown Landing Zones), you may also want to clear down the Azure resources on that tenant on a daily basis, to keep costs low and the tenant clean for the next deployment.

I’ll detail how you can implement this with an Azure Automation runbook on a schedule with a PowerShell script.

!!! Please be extra careful implementing this, as it is destructive with NO WAY to recover what has been deleted !!! Proceed at your own risk, and be sure to triple-check permissions and target scopes. This post is an example; your scenario may differ, but the concept will be similar to implement.

Prerequisites

  • Azure Key Vault deployed in your main Azure tenant
  • Azure Automation Account deployed in your main Azure tenant

Authentication Setup

  1. Create a SPN (Service Principal Name) in the Azure Sandbox Tenant with a secret (take note of the secret!)
  2. Check that you have enabled the Managed Identity on the Azure Automation Account
  3. Grant the Managed Identity the ‘Key Vault Secrets User’ RBAC role on the Key Vault
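
The three steps above can also be scripted. This is a minimal sketch, assuming the Az PowerShell module, a system-assigned Managed Identity and a Key Vault that uses the Azure RBAC permission model. The function names and the default display name are hypothetical, so adjust them to your environment.

```powershell
# Sketch only: function names and default values are hypothetical placeholders.

function New-SandboxSpn {
    # Step 1: run while signed in to the SANDBOX tenant.
    param([string]$DisplayName = 'spn-sandbox-cleanup')

    $sp = New-AzADServicePrincipal -DisplayName $DisplayName
    # The generated secret can only be read at creation time, so capture it now.
    [pscustomobject]@{
        AppId  = $sp.AppId
        Secret = $sp.PasswordCredentials.SecretText
    }
}

function Grant-AutomationKeyVaultAccess {
    # Steps 2 and 3: run while signed in to the MAIN tenant.
    param(
        [Parameter(Mandatory)][string]$ResourceGroupName,
        [Parameter(Mandatory)][string]$AutomationAccountName,
        [Parameter(Mandatory)][string]$VaultName
    )

    # Step 2: check that the Managed Identity is enabled on the Automation Account.
    $account = Get-AzAutomationAccount -ResourceGroupName $ResourceGroupName -Name $AutomationAccountName
    if (-not $account.Identity.PrincipalId) {
        throw "No Managed Identity found on Automation Account '$AutomationAccountName'."
    }

    # Step 3: grant the identity read access to secrets, scoped to this vault only.
    $vault = Get-AzKeyVault -VaultName $VaultName
    New-AzRoleAssignment -ObjectId $account.Identity.PrincipalId `
        -RoleDefinitionName 'Key Vault Secrets User' `
        -Scope $vault.ResourceId
}
```

Scoping the role assignment to the single vault, rather than to the resource group or subscription, keeps the Managed Identity's access as narrow as the runbook needs.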

Key Vault Secrets

In the Key Vault ensure the following secret values are added:

  1. Application (client) Id of the Service Principal (not to be confused with the Object Id of the Service Principal or of the App registration associated with it)
  2. Service Principal secret
  3. Tenant Id of the Azure Sandbox
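
As a rough sketch, the three secrets can be added with `Set-AzKeyVaultSecret`. The secret names `tenantId`, `appId` and `secret` match the ones the runbook script reads. The function name is hypothetical, and it assumes the account running it is allowed to write secrets to the vault.

```powershell
# Sketch only: Set-SandboxSecret is a hypothetical helper name.
function Set-SandboxSecret {
    param(
        [Parameter(Mandatory)][string]$VaultName,
        [Parameter(Mandatory)][string]$TenantId,
        [Parameter(Mandatory)][string]$AppId,
        [Parameter(Mandatory)][securestring]$SpnSecret
    )

    # Key Vault stores every value as a SecureString, including the two Ids.
    $values = [ordered]@{
        tenantId = ConvertTo-SecureString -String $TenantId -AsPlainText -Force
        appId    = ConvertTo-SecureString -String $AppId -AsPlainText -Force
        secret   = $SpnSecret
    }

    foreach ($name in $values.Keys) {
        Set-AzKeyVaultSecret -VaultName $VaultName -Name $name -SecretValue $values[$name] | Out-Null
        Write-Output "Stored secret '$name' in $VaultName."
    }
}

# Example usage (prompts for the SPN secret so it never appears in shell history):
# Set-SandboxSecret -VaultName 'kv-rios-example' -TenantId '<sandbox tenant id>' `
#     -AppId '<application id>' -SpnSecret (Read-Host -AsSecureString 'SPN secret')
```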

Sandbox RBAC

For the runbook to successfully clear down every resource group in the tenant, we must give the SPN a role assignment on the root management group under which the script should find subscriptions. In your case, the scope may be more granular than that.

In my example, I have assigned the Service Principal the ‘Owner’ role on the Tenant Root Group, so the assignment is inherited by everything beneath it.
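
The same role assignment can be sketched in PowerShell. It assumes the Tenant Root Group still has its default Id, which is the tenant Id, and that the account running it is allowed to assign roles at that scope. The function name is hypothetical.

```powershell
# Sketch only: Grant-SandboxOwner is a hypothetical helper name.
function Grant-SandboxOwner {
    param(
        [Parameter(Mandatory)][string]$AppId,     # Application Id of the SPN
        [Parameter(Mandatory)][string]$TenantId   # Sandbox tenant Id
    )

    # Assumption: the Tenant Root Group uses its default Id (the tenant Id).
    $scope = "/providers/Microsoft.Management/managementGroups/$TenantId"

    # Run while signed in to the SANDBOX tenant.
    New-AzRoleAssignment -ApplicationId $AppId -RoleDefinitionName 'Owner' -Scope $scope
}
```

If you only want a subset of subscriptions cleared, pass a lower management group or a subscription scope instead of the root.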

Azure Automation Runbook

In my example, we’ll use a PowerShell runbook linked to a schedule that runs daily, but you can amend this to your needs. We don’t need to import any specific Azure modules here, as they are already baked in for use.

  1. Create a new Azure Automation runbook:
  • Runbook type: PowerShell
  • Runtime version: 7.2 (preview)

Runbook script

Once created, go to the runbook, edit it in the Portal (or VS Code), paste the script in, then save and publish it.

Adjust the Key Vault name and secret names according to yours.

pwsh/AzureAutomationRunBooks/azure-sandbox-destroy.ps1 at main · riosengineer/pwsh (github.com)

# Connect to Key Vault with the Managed Identity (MI) and get the secret values
Connect-AzAccount -Identity
$vaultName = "kv-rios-example"
$tenantId = Get-AzKeyVaultSecret -VaultName $vaultName -Name "tenantId" -AsPlainText
$appId = Get-AzKeyVaultSecret -VaultName $vaultName -Name "appId" -AsPlainText
$spnsecret = Get-AzKeyVaultSecret -VaultName $vaultName -Name "secret" -AsPlainText


# Login to Sandbox Tenant via SPN
$secret = ConvertTo-SecureString -String $spnsecret -AsPlainText -Force
$pscredential = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $appId, $secret
Connect-AzAccount -ServicePrincipal -Credential $pscredential -Tenant $tenantId -Verbose

# Get subscriptions
$subscriptions = Get-AzSubscription

# Loop through each subscription and delete every unlocked resource group as a background job
foreach ($subscription in $subscriptions) {
    Select-AzSubscription -SubscriptionId $subscription.Id
    $resourceGroups = Get-AzResourceGroup
    foreach ($resourceGroup in $resourceGroups) {
        $lock = Get-AzResourceLock -ResourceGroupName $resourceGroup.ResourceGroupName
        if ($null -eq $lock) {
            Remove-AzResourceGroup -Name $resourceGroup.ResourceGroupName -Force -AsJob
            Write-Output "Sending delete job for Resource Group: $($resourceGroup.ResourceGroupName)."
        }
        else {
            Write-Output "Resource group $($resourceGroup.ResourceGroupName) has a resource lock present so cannot be deleted."
        }
    }
}

The script runs each delete request as a job. This is to combat long runtimes from certain Azure resources that can take a while to remove (or time out). This way, the runbook won’t wait for each deletion to return and will execute quickly.

Additionally, the script writes its output to the logs so we can see what it picks up during a run, which helps with troubleshooting and auditing.
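
Given how destructive the runbook is, it can help to preview what it would pick up before the first real run. The sketch below is not part of the original script: it walks the same subscriptions and resource groups with the same lock check, but only reports and deletes nothing. The function name is hypothetical, and it assumes you are already connected to the sandbox tenant as the SPN.

```powershell
# Sketch only: a read-only preview of the runbook's selection logic.
function Get-CleanupPreview {
    foreach ($subscription in Get-AzSubscription) {
        Set-AzContext -SubscriptionId $subscription.Id | Out-Null

        foreach ($resourceGroup in Get-AzResourceGroup) {
            $lock = Get-AzResourceLock -ResourceGroupName $resourceGroup.ResourceGroupName

            # One row per resource group; WouldDelete mirrors the runbook's lock check.
            [pscustomobject]@{
                Subscription  = $subscription.Name
                ResourceGroup = $resourceGroup.ResourceGroupName
                WouldDelete   = ($null -eq $lock)
            }
        }
    }
}

# Example usage:
# Get-CleanupPreview | Format-Table
```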

Creating the automation schedule

Now that the runbook is published, we can create a schedule and associate it with the runbook. In the Automation Account, go to Schedules > Add a schedule and create one with the recurrence you need (daily, in my example).

Lastly, go back to the runbook, select ‘Link to a schedule’ and choose your newly created schedule to complete the setup.
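
If you would rather script the schedule than click through the Portal, the following is a minimal sketch using the Az.Automation cmdlets. The function name, the default schedule name and the default start time (19:00 tomorrow, local time) are all hypothetical choices.

```powershell
# Sketch only: names and the default start time are hypothetical.
function Register-DailyCleanup {
    param(
        [Parameter(Mandatory)][string]$ResourceGroupName,
        [Parameter(Mandatory)][string]$AutomationAccountName,
        [Parameter(Mandatory)][string]$RunbookName,
        [string]$ScheduleName = 'daily-sandbox-cleanup',
        [datetime]$StartTime = (Get-Date).Date.AddDays(1).AddHours(19)
    )

    $common = @{
        ResourceGroupName     = $ResourceGroupName
        AutomationAccountName = $AutomationAccountName
    }

    # Create a schedule that recurs every day from the start time.
    New-AzAutomationSchedule @common -Name $ScheduleName -StartTime $StartTime -DayInterval 1 | Out-Null

    # Link the published runbook to the schedule.
    Register-AzAutomationScheduledRunbook @common -RunbookName $RunbookName -ScheduleName $ScheduleName
}
```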

Test run

Now we’re set to go. After I created some dummy resource groups over in the sandbox tenant, we can see that the script detects them and outputs the information to the runbook logs for us! Awesome.
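
A test run can also be started and inspected from PowerShell instead of the Portal. This is a sketch under the same assumptions as before (Az.Automation, signed in to the main tenant). The function name and the polling defaults are hypothetical.

```powershell
# Sketch only: starts the runbook, polls until the job ends, then returns its output lines.
function Invoke-CleanupTestRun {
    param(
        [Parameter(Mandatory)][string]$ResourceGroupName,
        [Parameter(Mandatory)][string]$AutomationAccountName,
        [Parameter(Mandatory)][string]$RunbookName,
        [int]$PollSeconds = 10,
        [int]$MaxPolls = 60
    )

    $common = @{
        ResourceGroupName     = $ResourceGroupName
        AutomationAccountName = $AutomationAccountName
    }

    $job = Start-AzAutomationRunbook @common -Name $RunbookName
    $finalStates = 'Completed', 'Failed', 'Stopped', 'Suspended'

    for ($i = 0; $i -lt $MaxPolls -and $job.Status -notin $finalStates; $i++) {
        Start-Sleep -Seconds $PollSeconds
        $job = Get-AzAutomationJob @common -Id $job.JobId
    }

    if ($job.Status -ne 'Completed') {
        Write-Warning "Job $($job.JobId) ended with status '$($job.Status)'."
    }

    # The Output stream holds the Write-Output lines from the runbook.
    Get-AzAutomationJobOutput @common -Id $job.JobId -Stream Output |
        Select-Object -ExpandProperty Summary
}
```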

What if I want to exclude certain resource groups?

Adding a resource lock to the resource group (or at a sub-level if needed) will prevent the script from deleting the resources within. This is easier than adding more complex logic to the script.
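
As a small sketch, a lock can be added to a resource group you want to keep like this. The function name and the default lock name are hypothetical.

```powershell
# Sketch only: Protect-SandboxResourceGroup and the lock name are hypothetical.
function Protect-SandboxResourceGroup {
    param(
        [Parameter(Mandatory)][string]$ResourceGroupName,
        [string]$LockName = 'exclude-from-cleanup'
    )

    # Any lock on the resource group makes the runbook skip it; CanNotDelete is the least restrictive level.
    New-AzResourceLock -LockName $LockName `
        -LockLevel CanNotDelete `
        -ResourceGroupName $ResourceGroupName `
        -LockNotes 'Excluded from the scheduled sandbox clean up.' `
        -Force
}

# Example usage (run against the sandbox tenant):
# Protect-SandboxResourceGroup -ResourceGroupName 'rg-keep-me'
```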

Source control?

I recommend implementing Azure Automation source control. Not only will it auto-sync changes from the repository to the Azure runbook when you make updates, it is also good practice for a robust CI/CD lifecycle with relevant tests before going live. It’s out of scope for this post but worth a mention.

Conclusion

The script in my example is quite rudimentary, but it can be expanded to cover your specific requirements. For example, running the deletions as jobs could pose issues if they time out, as the current script has no way to retry them, so that is something I may look to add at a later stage.
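
One possible shape for that retry logic, purely as a sketch and not something the current script does, is a helper that starts a job, waits for it with a timeout, and starts it again if it fails or times out. The function name and the defaults are hypothetical. Note the trade-off: waiting on jobs means the runbook no longer finishes quickly.

```powershell
# Sketch only: Invoke-JobWithRetry is a hypothetical helper, not part of the original script.
function Invoke-JobWithRetry {
    param(
        # Script block that starts and returns ONE job,
        # e.g. { Remove-AzResourceGroup -Name 'rg-example' -Force -AsJob }
        [Parameter(Mandatory)][scriptblock]$Start,
        [int]$MaxAttempts = 3,
        [int]$TimeoutSeconds = 600
    )

    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        $job = & $Start

        # Wait-Job returns nothing if the timeout elapses before the job ends.
        $finished = Wait-Job -Job $job -Timeout $TimeoutSeconds

        if ($finished -and $job.State -eq 'Completed') {
            Remove-Job -Job $job
            return $true
        }

        Write-Warning "Attempt $attempt of $MaxAttempts ended in state '$($job.State)'."
        Stop-Job -Job $job
        Remove-Job -Job $job -Force
    }

    return $false
}
```

In the runbook loop, the call to `Remove-AzResourceGroup ... -AsJob` would be wrapped in a script block and passed as `-Start`. A fuller version might start all jobs first and only retry the failures afterwards, to keep some of the parallelism.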

That’s all there is to it, I hope others find the post useful!
