Secure AWS to Azure integration

Przemek Sempruch
7 min read · May 30, 2022

Building cloud-based solutions in a single cloud is often not enough today. Cloud platforms such as AWS, Azure and GCP each offer unique features and services that you may want to leverage no matter where your primary workloads are deployed.

Some say development is similar across all cloud providers. This is often true from a functional standpoint; when it comes to technical implementation, however, the similarities usually end. In this article I would like to guide you through building a scalable bridge between AWS and Azure that overcomes those differences.

The problem

AWS and Microsoft have implemented their cloud security fabric differently. AWS bet on entirely bespoke IAM controls, whereas Microsoft invested in the industry-standard OAuth 2.0/OpenID Connect protocols for Azure. Both approaches have their pros and cons, but ultimately there is no out-of-the-box way to translate one into the other in a secure, scalable way with minimal maintenance.

A generic way…

As a quick reminder, any resource residing outside an Azure Subscription can acquire an access token from Azure AD using the Client Credentials flow. The caller authenticates to Azure AD with a Service Principal's clientID and clientSecret and, when requesting the access token, specifies the scopes it wants the token for.
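To make the reminder concrete, here is a minimal sketch of the request such a caller would send. The tenant ID, client ID, secret and scope values are placeholders I made up for illustration; the helper only builds the request and leaves the actual HTTP POST to whatever client you prefer.

```python
from urllib.parse import urlencode


def build_client_credentials_request(tenant_id, client_id, client_secret, scope):
    """Return (url, form_body) for an Azure AD client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,  # the confidential part this article tries to avoid
        "scope": scope,
    }
    # The body is sent as application/x-www-form-urlencoded in an HTTP POST.
    return url, urlencode(form)


# Placeholder values, for illustration only.
url, body = build_client_credentials_request(
    tenant_id="00000000-0000-0000-0000-000000000000",
    client_id="my-client-id",
    client_secret="my-client-secret",
    scope="https://servicebus.azure.net/.default",
)
```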

So what does this mean for workloads deployed to AWS? Imagine the most basic scenario, where a service needs to consume an Azure AD protected resource. The resource may be (the list is not exhaustive):

  • a native Azure resource such as Azure Service Bus
  • a custom resource deployed with Azure App Service (including Azure Functions) or behind Azure API Management
  • a Microsoft Dataverse (or, more generally, Power Platform) resource, e.g. a table
  • the Microsoft Graph API

Your AWS service must reference the clientID and clientSecret generated for a Service Principal in Azure AD that is authorised to access some of the above items. Ideally these credentials are stored in AWS Secrets Manager or AWS Systems Manager (SSM) Parameter Store.
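As a rough sketch of that basic setup, the snippet below reads the credentials from a secret stored as JSON. The secret name and the `clientId`/`clientSecret` key names are my own assumptions; in a real Lambda the client would come from `boto3.client("secretsmanager")`, while here a small stand-in object is used so the example runs on its own.

```python
import json


def load_azure_credentials(secrets_client, secret_id):
    """Read the Service Principal credentials from a JSON-formatted secret."""
    response = secrets_client.get_secret_value(SecretId=secret_id)
    secret = json.loads(response["SecretString"])
    # Key names are an assumption about how the secret was stored.
    return secret["clientId"], secret["clientSecret"]


class FakeSecretsClient:
    """Stand-in for boto3.client("secretsmanager"), for illustration only."""

    def get_secret_value(self, SecretId):
        return {
            "SecretString": json.dumps(
                {"clientId": "my-client-id", "clientSecret": "my-client-secret"}
            )
        }


client_id, client_secret = load_azure_credentials(
    FakeSecretsClient(), "azure/service-principal"  # hypothetical secret name
)
```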

The above works for a simple deployment, but may not be ideal for larger production deployments, for the reasons below.

Scale, security and maintenance

Imagine you have multiple workloads in AWS that need to access an Azure AD protected resource. This means that the same credentials must be referenced by multiple AWS service instances. You need to make sure they are referenced correctly and, in the longer term, rotated in a way that does not interfere with your workloads.

Also, your Azure team may not be the same team that operates AWS. The coordination needed whenever those credentials must be rotated in Azure AD, whether as a result of periodic rotation or a security incident response, is an overhead one would ideally avoid.

Going even further, imagine you would like to access many Azure AD protected resources and be able to tell which AWS workloads can access which of them. It gets a bit more complicated, does it not? One could introduce another bunch of Service Principals in Azure AD, but that would only increase the maintenance overhead.

The solution

It must be stated that the above problems are not blockers to implementing your AWS-2-Azure integration. Implementing the basic scenario may be the right thing for you and your project. Larger-scale projects, however, usually look to minimise risk and maintenance overhead, and the upfront investment in reusable building blocks is less of a hassle for them. That said, this solution can be packaged as a module and ultimately serve as an accelerator for any project.

The cornerstone of this solution is a feature that Microsoft introduced some time ago: Workload Identity Federation (WIF). It allows you to integrate external workloads with Azure AD protected resources without relying on confidential information such as a Service Principal clientSecret. If you are familiar with the concept of Azure Managed Identity available in Azure Subscriptions, WIF is its equivalent for external workloads. At the time of writing WIF is in Preview, so it has its ups and downs. Please refer to the documentation (https://docs.microsoft.com/en-us/azure/active-directory/develop/workload-identity-federation) to understand the orchestration and the configuration required between the external identity provider and Azure AD to issue access tokens.
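For orientation, the Azure AD side of that configuration is a federated identity credential attached to the app registration. The sketch below shows the shape of such a credential as I understand it from the Microsoft documentation; the name, issuer URL and subject are hypothetical values, and the issuer is assumed to be the base URL of the broker described later in this article.

```python
# Hypothetical values; only the overall shape is meant to be illustrative.
federated_identity_credential = {
    "name": "aws-2-azure-broker",
    # Must match the `iss` claim of the tokens the external provider issues,
    # and is where Azure AD looks for /.well-known/openid-configuration.
    "issuer": "https://broker.example.com",
    # Must match the `sub` claim of the incoming token.
    "subject": "aws-2-azure-broker",
    # Audience Azure AD expects in the `aud` claim of the incoming token.
    "audiences": ["api://AzureADTokenExchange"],
}
```

Azure AD trusts a token only when its issuer, subject and audience match a credential like this one, which is why no clientSecret has to be shared with the external workload.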

It is worth noting that WIF integrates out-of-the-box with Google Cloud Platform Service Accounts because they use the same protocol, OpenID Connect. The AWS security platform is completely different, so one needs to implement a translation mechanism to achieve the same result.

The proposed solution uses an AWS-2-Azure Broker whose responsibility is to seamlessly provide Azure AD access tokens to AWS workloads.

The AWS-2-Azure Broker is a packaged set of AWS resources:

  • API Gateway
  • Handlers Lambda Function

An HTTP API exposed over AWS API Gateway consists of two OpenID Discovery endpoints:

  • OpenID Well-Known endpoint (GET /.well-known/openid-configuration), which presents an OpenID compliant configuration
  • JWKS endpoint (GET /jwks), which presents a JWK Set compliant repository of the public keys used to verify the signature of the token

and one Token Endpoint (GET /token/{scope}), which accepts, as a path parameter, the scope to be requested from Azure AD.
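The two Discovery Endpoints return small JSON documents. The following is a minimal sketch of what they could contain, assuming an RS256 key and a hypothetical issuer URL; which discovery fields Azure AD actually reads beyond `issuer` and `jwks_uri` is not something I can confirm, so treat the rest as conventional OpenID metadata.

```python
import base64


def b64url_uint(value: int) -> str:
    """Encode a positive integer as unpadded base64url, as JWK requires for n and e."""
    raw = value.to_bytes((value.bit_length() + 7) // 8, "big")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def openid_configuration(issuer: str) -> dict:
    """Body for GET /.well-known/openid-configuration."""
    return {
        "issuer": issuer,
        "jwks_uri": f"{issuer}/jwks",
        "response_types_supported": ["id_token"],
        "subject_types_supported": ["public"],
        "id_token_signing_alg_values_supported": ["RS256"],
    }


def jwks(kid: str, modulus: int, exponent: int) -> dict:
    """Body for GET /jwks, exposing one RSA public key."""
    return {
        "keys": [
            {
                "kty": "RSA",
                "use": "sig",
                "alg": "RS256",
                "kid": kid,  # must match the `kid` header of issued tokens
                "n": b64url_uint(modulus),
                "e": b64url_uint(exponent),
            }
        ]
    }


# Hypothetical issuer; the tiny modulus is for illustration only, not a real key.
config = openid_configuration("https://broker.example.com")
key_set = jwks("key-1", modulus=3233, exponent=65537)
```

During key rotation the JWKS document can list both the old and the new public key, so tokens signed with either one still verify.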

The Discovery Endpoints are exposed to the public Internet, since Azure AD queries them while validating incoming tokens (in the next section I share a suggestion on how to restrict this a bit more). The Token Endpoint is secured with AWS IAM at the granularity of a single Azure AD scope. This allows you to create identity-based IAM policies that grant access to individual AWS API Gateway endpoints and to attach those policies to the IAM Roles used by specific AWS services. With that mapping in place you get fine-grained and flexible security, because granting new permissions to a service or removing old ones happens instantly as part of the AWS IAM configuration. An identity-based IAM policy can then allow a given role to call the Token Endpoint for one specific scope only.
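A policy of that kind might look like the sketch below, built here as a Python dictionary and printed as JSON. The region, account ID, API ID, stage and the `servicebus` scope segment are all placeholders of mine; how a real Azure AD scope is encoded into the `{scope}` path segment is a design decision the article leaves open.

```python
import json


def token_scope_policy(region, account_id, api_id, stage, scope_segment):
    """Identity-based policy allowing GET /token/{scope} for a single scope."""
    resource = (
        f"arn:aws:execute-api:{region}:{account_id}:"
        f"{api_id}/{stage}/GET/token/{scope_segment}"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "execute-api:Invoke",
                "Resource": resource,
            }
        ],
    }


# Placeholder identifiers, for illustration only.
policy = token_scope_policy("eu-west-1", "123456789012", "a1b2c3d4e5", "prod", "servicebus")
print(json.dumps(policy, indent=2))
```

Attaching this policy to the IAM Role of one service, and a policy for a different scope segment to another, gives each workload access to exactly the Azure AD scopes it needs.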

Okay, but how does the AWS-2-Azure Broker generate a token that Azure AD can validate against the Discovery Endpoints and eventually exchange for an Azure AD access token?

As part of the Lambda handler for the Token Endpoint, a one-off JWT is generated and signed with an Azure AD compatible algorithm; at the moment an RSA private/public key pair is recommended. Doesn't that mean you need to store the private key somewhere? Yes, it does. Storing the private key is required but, as opposed to a Service Principal clientSecret generated by Azure AD, the private key is generated and managed in AWS rather than in Azure. It is a small change, but an important one from a maintenance point of view. The key can be rotated periodically, and as long as the public key belonging to the newly generated private key is exposed under the JWKS endpoint, the integration will work just fine.
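The handler logic described above can be sketched roughly as follows. This is not the author's implementation: the issuer, subject and key ID are hypothetical, and signing is delegated to a `sign` callable that is assumed to perform RS256 (RSASSA-PKCS1-v1_5 with SHA-256) using the private key held in AWS. The second helper builds the form that exchanges the signed JWT for an Azure AD access token, using a client assertion in place of a clientSecret.

```python
import base64
import json
import time
import uuid
from urllib.parse import urlencode


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_assertion(issuer, subject, kid, sign, now=None, lifetime=300):
    """Build a short-lived JWT; `sign` is assumed to return an RS256 signature."""
    now = int(time.time()) if now is None else now
    header = {"alg": "RS256", "typ": "JWT", "kid": kid}
    claims = {
        "iss": issuer,                        # must match the federated credential
        "sub": subject,                       # must match the federated credential
        "aud": "api://AzureADTokenExchange",  # audience Azure AD expects
        "iat": now,
        "nbf": now,
        "exp": now + lifetime,
        "jti": str(uuid.uuid4()),             # makes each token one-off
    }
    signing_input = (
        b64url(json.dumps(header).encode()) + "." + b64url(json.dumps(claims).encode())
    )
    signature = sign(signing_input.encode("ascii"))
    return signing_input + "." + b64url(signature)


def build_exchange_form(client_id, scope, assertion):
    """Form body for the Azure AD token endpoint, with no clientSecret involved."""
    return urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "scope": scope,
        "client_assertion_type": "urn:ietf:params:oauth:client-assertion-type:jwt-bearer",
        "client_assertion": assertion,
    })


# Illustration only: a dummy signer instead of a real RSA signature.
assertion = build_assertion(
    "https://broker.example.com", "aws-2-azure-broker", "key-1",
    sign=lambda data: b"not-a-real-signature", now=1000,
)
form = build_exchange_form("my-client-id", "https://graph.microsoft.com/.default", assertion)
```

Keeping the signer behind a callable means the private key can live wherever suits you in AWS without the rest of the handler changing.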

With the above design we have achieved a universal, scalable and easy-to-maintain integration between AWS and Azure workloads. Still, there are a few ideas that could further improve this setup.

Tired of design concepts and looking for code? Stay tuned, I will try to provide a few code snippets that are key to this.

Ideas for improvements

Azure AD Access Tokens caching

Acquired Azure AD access tokens can and should be cached. A token can be kept for as long as its exp claim allows, so the cache lifetime differs per token and the cache provided by AWS API Gateway is not a good answer. I suggest using DynamoDB, which tends to go hand-in-hand with AWS Lambda.
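Here is a minimal sketch of that caching logic, assuming the access token is a JWT whose exp claim can be read from its payload. A plain dictionary stands in for the DynamoDB table so the example runs on its own, and the `fetch` callable and the 60-second safety margin are my own assumptions.

```python
import base64
import json
import time


def token_expiry(access_token: str) -> int:
    """Read the exp claim from a JWT payload (no signature verification)."""
    payload = access_token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore base64 padding
    return int(json.loads(base64.urlsafe_b64decode(payload))["exp"])


class TokenCache:
    """Caches one token per scope until shortly before it expires."""

    def __init__(self, fetch, store=None, skew=60):
        self.fetch = fetch                               # callable: scope -> access token
        self.store = {} if store is None else store      # stand-in for a DynamoDB table
        self.skew = skew                                 # seconds of safety margin before exp

    def get(self, scope, now=None):
        now = time.time() if now is None else now
        entry = self.store.get(scope)
        if entry and entry["exp"] - self.skew > now:
            return entry["token"]
        token = self.fetch(scope)
        self.store[scope] = {"token": token, "exp": token_expiry(token)}
        return token
```

With DynamoDB, the stored exp value could also serve as the item's TTL attribute so expired tokens are cleaned up automatically. The `expires_in` field of the Azure AD token response is an alternative source for the lifetime if you prefer not to parse the token.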

Discovery Endpoints caching

Token caching is important because it has the greatest impact on the performance of the integration. If you are looking for even better results, make sure Azure AD does not query your Discovery Endpoints too often. You can also serve those requests from a cache, but make sure the cache is reloaded whenever the private keys are rotated.

The setup works without that caching, but the extra requests slow things down and unnecessarily add to your bill. According to the OpenID specification, one can instruct clients of the Well-Known OpenID configuration and JWKS endpoints to cache the content using the Cache-Control header. At the moment it is hard to say whether Azure AD respects it, though: its caching behaviour is inconsistent, which may have to do with the Preview status.
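As a small illustration, a Lambda handler behind an API Gateway proxy integration could attach the header like this. The one-hour max-age is an arbitrary value of mine and, as noted above, there is no guarantee Azure AD honours it.

```python
import json


def discovery_response(document: dict, max_age_seconds: int = 3600) -> dict:
    """Wrap a discovery document in a Lambda proxy response with caching hints."""
    return {
        "statusCode": 200,
        "headers": {
            "Content-Type": "application/json",
            "Cache-Control": f"public, max-age={max_age_seconds}",
        },
        "body": json.dumps(document),
    }


# Hypothetical JWKS payload with no keys, for illustration only.
response = discovery_response({"keys": []})
```

A shorter max-age makes key rotation take effect sooner, at the cost of more requests reaching the broker.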

Further securing Token Endpoint

The design assumes that Token Endpoint security is based on AWS IAM. It may be further hardened to accept traffic only from the VPC or VPC Endpoint of your choice.

Further securing Discovery Endpoints

You may not feel comfortable exposing the Discovery Endpoints to the public because of strict security policies enforced on your project. To meet such requirements you can allowlist the Azure AD IP ranges that Microsoft publishes at https://www.microsoft.com/en-us/download/details.aspx?id=56519, knowing that only Azure AD will be querying the endpoints. I suggest using AWS WAF for this, as the number of IP ranges to allowlist is quite large and is unlikely to fit into an AWS API Gateway resource policy.
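To give an idea of how the published ranges could be consumed, the sketch below extracts the prefixes for one service tag and checks an address against them. The `values` / `properties.addressPrefixes` layout and the `AzureActiveDirectory` tag name reflect my understanding of the downloadable JSON file and should be verified against the actual download; the sample data uses documentation-only address ranges.

```python
import ipaddress


def service_tag_networks(service_tags: dict, tag_name: str = "AzureActiveDirectory"):
    """Return the IP networks listed under one service tag."""
    for entry in service_tags["values"]:
        if entry["name"] == tag_name:
            prefixes = entry["properties"]["addressPrefixes"]
            return [ipaddress.ip_network(prefix) for prefix in prefixes]
    return []


def is_allowed(ip: str, networks) -> bool:
    """True if the address falls inside any allowlisted network."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)


# Made-up sample in the assumed file layout, using documentation-only ranges.
sample = {
    "values": [
        {
            "name": "AzureActiveDirectory",
            "properties": {"addressPrefixes": ["203.0.113.0/24", "2001:db8::/32"]},
        }
    ]
}
networks = service_tag_networks(sample)
```

The same list of prefixes could be loaded into an IP set for the firewall rule and refreshed whenever Microsoft publishes a new version of the file.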
