Cloud-native AVaaS is there. Just take the pill.

Przemek Sempruch
5 min read · Oct 27, 2024

Photo by Volodymyr Hryshchenko on Unsplash

In my previous article, Yet Another AVaaS, This Time with Azure Functions, I discussed a DIY approach to implementing AVaaS in Azure.

Since then, Microsoft has introduced a fully managed malware scanning solution: Defender for Storage. This tool is a lifesaver for many scenarios where you must ensure that processed binaries are malware-free without spending too much time or money. It’s a perfect solution for ticking off that crucial security checkbox, such as the file-handling requirements in the OWASP ASVS framework.

What Is It?

This feature offers near-real-time integration between Azure Storage Accounts (Blob containers) and Defender for Storage. Given that Storage Accounts are a ubiquitous and security-approved component in most Azure architectures, this integration is a significant enhancement.

When a blob is uploaded to a container within a Storage Account enabled for malware scanning, the scan is automatically triggered. The process works through Storage Account notifications sent to the Azure Event Grid System Topic upon upload. These notifications send the blob metadata to a special endpoint, triggering the Defender scan. Finally, the blob is annotated with relevant index tags indicating the scan result. The tag can show one of three states: no threats found, malicious, or error.
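To make the three states concrete, here is a minimal sketch of how a consumer might interpret the result tag. The tag key and values below are my assumption of what Defender writes at the time of writing, so verify them against one of your own scanned blobs; `ScanState` and `classify` are hypothetical names used only for illustration.

```python
from enum import Enum
from typing import Mapping

# Assumption: tag key and values as written by Defender for Storage at the
# time of writing. Check a scanned blob in your own account before relying on them.
SCAN_RESULT_TAG = "Malware Scanning scan result"


class ScanState(Enum):
    PENDING = "pending"      # no result tag yet: scan not finished (or never ran)
    CLEAN = "clean"
    MALICIOUS = "malicious"
    ERROR = "error"


_TAG_TO_STATE = {
    "no threats found": ScanState.CLEAN,
    "malicious": ScanState.MALICIOUS,
    "error": ScanState.ERROR,
}


def classify(tags: Mapping[str, str]) -> ScanState:
    """Map a blob's index tags to a scan state, failing closed on surprises."""
    raw = tags.get(SCAN_RESULT_TAG)
    if raw is None:
        return ScanState.PENDING
    # Unknown values are treated as errors so a blob is never trusted by accident.
    return _TAG_TO_STATE.get(raw.strip().lower(), ScanState.ERROR)
```

The fail-closed default is a design choice: anything that is not explicitly "no threats found" should keep the file in the untrusted area.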

There are many reasons why the scan may fail. It may be a permission issue, an attempt to scan a client-side-encrypted payload, or a file that exceeds the size limit.

To retrieve the outcome, you can poll the index tag while waiting for results, but there are better options, such as pushing outcome notifications through Azure Event Grid. You can also log the events to a Log Analytics workspace of your choice for audit, or raise a security alert with Defender for Cloud. You can enable all of these options or just a few of them.
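If you do go with polling, it is worth bounding it with a timeout, since there is no guarantee when the tag will appear. A rough sketch, assuming the same tag key as Defender uses at the time of writing: `get_tags` is a hypothetical callable that you would back with something like `BlobClient.get_blob_tags()` from `azure-storage-blob`, and `sleep` and `clock` are injectable only to keep the function testable.

```python
import time
from typing import Callable, Mapping, Optional

SCAN_RESULT_TAG = "Malware Scanning scan result"  # assumed tag key, verify it


def wait_for_scan_result(
    get_tags: Callable[[], Mapping[str, str]],
    timeout_s: float = 120.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Poll the blob's index tags until a scan result appears.

    Returns the raw tag value, or None if the timeout expires first.
    """
    deadline = clock() + timeout_s
    while True:
        result = get_tags().get(SCAN_RESULT_TAG)
        if result is not None:
            return result
        if clock() >= deadline:
            return None  # caller decides: retry later, alert, or give up
        sleep(interval_s)
```

A `None` return is not the same as a scan error; it only means the result did not arrive in time, which is one reason the push-based options tend to be a better fit.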

The malware scanner is fully managed, meaning Microsoft maintains AV definitions for you. This is fantastic as long as your workflows can align with the asynchronous integration provided by Azure. However, if you need synchronous integration, you’ll need a custom solution like the one described in Yet Another AVaaS, This Time with Azure Functions, or you can use malware scanners included in firewalls such as Azure (Web) Firewall or Barracuda CloudGen Firewall.

Example architecture

Below you can find an example architecture implemented on one of our projects.

This diagram has been designed using resources from Flaticon.com

The diagram presents a typical workload where a binary file is uploaded from the external network and must be checked for malware before being processed by the downstream business workloads.

The web app uploads the file to the storage account which acts as a staging area (untrusted). No other system can reference the files kept there. This storage account is enabled for malware scanning. The file is annotated with a unique reference ID which is subsequently processed by the relevant domain microservice.

Defender for Storage executes the scan and, once it completes, publishes a scan-result notification to the Event Grid instance.

The Result Handler, implemented as a function, polls the topic and verifies the outcome. If no threats were found, the file is copied to the trusted storage account. This account is not enabled for malware scanning. The database is updated with the outcome of scanning. When the orchestration executed by the function is successful, the function acknowledges the consumption of the event with Azure Event Grid. Any downstream process can now check the file status with the relevant microservice, retrieve the file based on the reference ID and location of the storage account, and serve it to its consumers.
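The decision logic of the Result Handler can be sketched roughly as below. This is not our production code: the event field names (`blobUri`, `scanResultType`) are my assumption of the scan-result event payload at the time of writing, and `copy_to_trusted`, `record_status`, and `alert_ops` are hypothetical callables standing in for the storage copy, the database update, and the Azure Monitor notification.

```python
from dataclasses import dataclass
from typing import Any, Callable, Mapping


@dataclass(frozen=True)
class ScanOutcome:
    blob_uri: str
    result: str


def parse_outcome(event: Mapping[str, Any]) -> ScanOutcome:
    # Assumed payload shape: {"data": {"blobUri": ..., "scanResultType": ...}}
    data = event["data"]
    return ScanOutcome(blob_uri=data["blobUri"], result=data["scanResultType"])


def handle_scan_event(
    event: Mapping[str, Any],
    copy_to_trusted: Callable[[str], None],
    record_status: Callable[[str, str], None],
    alert_ops: Callable[[str], None],
) -> bool:
    """Process one scan-result event; return True if it may be acknowledged."""
    try:
        outcome = parse_outcome(event)
        if outcome.result == "No threats found":
            # Only clean files ever leave the untrusted staging account.
            copy_to_trusted(outcome.blob_uri)
            record_status(outcome.blob_uri, "clean")
        elif outcome.result == "Malicious":
            record_status(outcome.blob_uri, "malicious")
        else:
            # Anything unexpected is treated as a failed scan.
            record_status(outcome.blob_uri, "scan-error")
            alert_ops(f"scan did not complete for {outcome.blob_uri}")
        return True
    except Exception as exc:
        # Do not acknowledge: the event stays available for a later retry.
        alert_ops(f"result handling failed: {exc}")
        return False
```

Returning a boolean keeps the acknowledgement decision with the caller, so the event is only acknowledged after the copy and the database update have both succeeded.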

In the event of failure, the Operations Team is notified with Azure Monitor and can take action such as:

  • re-trigger the scan
  • re-trigger the result handler through the Azure Event Grid message replay
  • remove the file from the untrusted storage account

Key takeaways

Limitations

No serious cloud provider will offer you a service without limits. Although often neglected, constraint validation is a key step before proceeding with any design, cloud or not.

Here are some limitations that you should cross-validate your use case against first:

  • File size limitation: currently 2 GB.
  • Lack of support for client-side encrypted files: such files will not be scanned. The great news is that customer-managed keys are supported.
  • No execution time SLA: it is near real-time, but that is it. Think of it as part of a larger choreography and account for delays.
  • No support for Data Lakes (Azure Data Lake Storage Gen2): make sure the files are scanned for malware before storing them in the lake.
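The limitations above lend themselves to a quick pre-flight check in the upload path. A minimal sketch, assuming the limits quoted in this article (check the current documentation, as they may change); `UploadCandidate` and `scan_blockers` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List

# Assumption: the 2 GB limit quoted above; verify against current documentation.
MAX_SCAN_SIZE_BYTES = 2 * 1024**3


@dataclass(frozen=True)
class UploadCandidate:
    size_bytes: int
    client_side_encrypted: bool
    target_is_data_lake: bool  # storage account with hierarchical namespace


def scan_blockers(candidate: UploadCandidate) -> List[str]:
    """Return the reasons this upload could not be scanned (empty list = OK)."""
    blockers = []
    if candidate.size_bytes > MAX_SCAN_SIZE_BYTES:
        blockers.append("file exceeds the scan size limit")
    if candidate.client_side_encrypted:
        blockers.append("client-side encrypted payloads are not scanned")
    if candidate.target_is_data_lake:
        blockers.append("scan in a staging account before storing in the lake")
    return blockers
```

Rejecting such uploads early is usually cheaper than discovering an error tag after the fact.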

Private Network support

A key feature of Defender for Storage is its ability to integrate with Storage Accounts that are exposed only to private networks. This applies to both the ingress (triggering the scan) and egress (triggering the result handler) processes. We used an Azure Event Grid Topic as a notification sink because it seemed the most practical and robust way to handle the scanning outcomes.

While the ingress side is straightforward, the egress side is more complex. When passing the outcome further down the consumption chain to, for example, an Azure Function App or Azure Service Bus, Azure Event Grid requires, at the time of writing, that the target (the Function App, any other HTTP webhook, or the Service Bus) be exposed on the Internet. This may be perceived as an unnecessary security risk, especially if all your compute resources are only available in private networks, such as Azure App Service Environments. Always coordinate with your GRC department to ensure compliance!

The good news is that you can use Event Grid pull delivery: malware-scanning outcome events are delivered to the Event Grid Topic, and the handler pulls them through the associated Event Grid Topic Subscription. Although this is not an event-driven approach, the handler can consume the events entirely within the private network using a dedicated Private Endpoint. It works nicely with the DR site (in the secondary region), too. This greatly simplifies the architecture, especially if you have stringent security requirements.
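The pull-based consumption loop boils down to receive, process, then acknowledge or release. Here is a rough sketch of one iteration. The `PullClient` protocol is hypothetical: it loosely mirrors the receive/acknowledge/release operations of pull delivery with lock tokens, but it is not the actual SDK interface, so you would need to adapt it to the client library you use.

```python
from typing import Any, Callable, List, Mapping, Protocol, Sequence, Tuple

Event = Mapping[str, Any]


class PullClient(Protocol):
    """Hypothetical minimal view of a pull-delivery client."""

    def receive(self, max_events: int) -> Sequence[Tuple[str, Event]]:
        """Return (lock_token, event) pairs."""
        ...

    def acknowledge(self, lock_tokens: List[str]) -> None: ...

    def release(self, lock_tokens: List[str]) -> None: ...


def drain_once(
    client: PullClient,
    handle: Callable[[Event], bool],
    max_events: int = 10,
) -> Tuple[int, int]:
    """Pull one batch; acknowledge handled events and release the rest.

    Returns (acknowledged_count, released_count).
    """
    acked: List[str] = []
    released: List[str] = []
    for lock_token, event in client.receive(max_events):
        try:
            ok = handle(event)
        except Exception:
            ok = False  # a crashing handler must not lose the event
        (acked if ok else released).append(lock_token)
    if acked:
        client.acknowledge(acked)
    if released:
        client.release(released)  # makes the events available for redelivery
    return len(acked), len(released)
```

Because events are acknowledged only after successful handling, a failed run can simply be repeated, which is also what makes the replay option for the Operations Team workable.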

Index Tags

When a blob is scanned, the results are stored as index tags on the scanned blob. This feature is useful for scenarios where you need to search for blobs and identify those marked as malicious, those marked as clean, or those where the scan failed. With these tags, you can easily detect failures and take the necessary recovery steps, such as re-triggering the scan.
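Searching by tag comes down to building a tag filter expression. The helper below is a small sketch that only builds the filter string; the tag key is my assumption of what Defender writes at the time of writing, and `scan_result_filter` is a hypothetical name. The resulting string is what you would pass to a tag search such as `BlobServiceClient.find_blobs_by_tags` in `azure-storage-blob`.

```python
from typing import Optional

SCAN_RESULT_TAG = "Malware Scanning scan result"  # assumed tag key, verify it


def scan_result_filter(value: str, container: Optional[str] = None) -> str:
    """Build a blob index tag filter for a given scan result.

    Tag keys containing spaces must be wrapped in double quotes and values
    in single quotes; quote characters in the inputs are rejected rather
    than escaped.
    """
    for text in (value, container or ""):
        if "'" in text or '"' in text:
            raise ValueError("quotes are not allowed in filter values")
    clause = f"\"{SCAN_RESULT_TAG}\" = '{value}'"
    if container:
        # Narrow the search to one container, e.g. the untrusted staging area.
        clause = f"@container = '{container}' AND {clause}"
    return clause
```

For example, `scan_result_filter("Error", container="staging")` yields a filter for every blob in a hypothetical `staging` container whose scan failed, which is a handy starting point for a re-trigger job.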

Just Go For It

Microsoft’s Defender for Storage is a fantastic tool that addresses a long-standing gap in cloud services. While there are some design limitations to keep in mind, it effectively meets the needs of most use cases. Malware scanning has always been an Achilles’ heel for cloud providers, leaving many in the architecture community wondering why such a crucial service was not included sooner. But Microsoft has once again taken the lead. Now, when you need malware scanning, you can avoid the usual headaches. The solution is here!
