Azure Storage Access Deep Dive

SqlInSix Tech Blog
8 min read · Feb 15, 2024


In this post, we cover the basics of Azure RBAC roles for Azure storage before diving into deeper access levels involving other RBAC roles, custom RBAC roles, and ACLs.

Questions

  • Question 1: You grant OurUser an RBAC role that gives the account permission to assign POSIX access controls. What is the minimum RBAC role you assigned?
  • Question 2: You want to grant an external user read access to a file stored in Azure file storage for 30 days. What shared access signature would you use for the least permissions (or most restriction) in terms of access?
  • Question 3: You want to set access for the user-group-other using POSIX as read, write and execute for the user, read and write for the group and read for other. What would the POSIX codes be and the corresponding numerical values?
  • Question 4: You have a team of data engineers that you want to provide appropriate access to a development storage account used for a development Azure Synapse instance. What is the most appropriate Azure access resource that you will use here?

Terms

  • RBAC: role-based access control. Generally, we use RBAC roles when we want to apply broad permissions on an Azure storage account to a user or group of users, such as making data engineers storage blob data contributors on an Azure storage account that supports Azure Synapse.
  • ACL: access control list. Generally, we use ACLs when we want to apply more detailed permissions, such as permission to a folder or file within a container within an Azure storage account (such as giving a user read access to a specific file).
  • POSIX (Portable Operating System Interface): access controls that allow more granular control over folders and files within a POSIX compliant operating system (Unix/Linux).
  • ADLS: Azure data lake storage, a storage solution in Azure for data lakes.
  • ADLS Gen1: hyperscale storage for big data workloads based on the Hadoop Distributed File System (HDFS) that will be retired on February 29th, 2024.
  • ADLS Gen2: an improvement on Gen1 in that it adds hierarchical namespaces with better scalability and cost. It also works more easily with blob storage: because it is built on top of blob storage, you can use both the data lake storage API and the blob storage API.
  • SAS: shared access signatures allow administrators to provide restricted access to a storage account through a token string, with constraints attached to a storage URI. These constraints can be applied for cases such as limiting permissions to a specific folder, limiting access to a specific time window, restricting access to an IP address range, etc. These fall into 3 categories: user delegation SAS (applies only to blob storage), service SAS (can only be used for one of the four storage types of blob, table, queue, and file) and account SAS (applies to any storage type and is not limited to one). The generation of SAS tokens by users cannot be audited.

Basic Role-Based Access Controls

In working with Azure storage, Azure Synapse, and Databricks, the below are the most common RBAC roles that you will assign, use, or have assigned to you, relative to your position in the organization. Keep in mind that you can create a custom RBAC role that combines access privileges, along with specific access in some cases (like an ACL). Custom RBAC roles are much more common in organizations than the standard roles; I rarely see standard RBAC roles being used in Azure because they often grant overly broad permissions. However, if you want to use Microsoft's standard RBAC roles, the below are the generally used ones that you can find without adding your own custom roles.

  • Owner: we’ll start with the highest level of permission; a user with this role has full access to the Azure storage account.
  • Contributor: while this will give a user full access to an Azure storage account, they will not be able to assign RBAC roles.
  • Reader: this will give a user the ability to view the storage account and resources within it but prevent any changes.
  • Storage blob data owner: the highest level of access to Azure storage blob containers, including the ability to assign POSIX access controls.
  • Storage blob data contributor: with data contributor access, a user can read, write and delete within Azure storage containers and blobs. This will be a minimum requirement for Azure Synapse if a user will need to both read and write data.
  • Storage blob data reader: a user will have read and list access to Azure storage blobs and containers.
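To make the "least privilege" reasoning behind these roles concrete, the sketch below reduces the built-in roles above to a toy capability map and picks the smallest role covering a set of required actions. This is purely my own illustration, not an Azure API; the capability sets are a simplification of the role descriptions above.

```python
# Toy simplification of the built-in roles listed above (not an Azure API).
ROLE_CAPABILITIES = {
    "Owner": {"read", "write", "delete", "assign_rbac"},
    "Contributor": {"read", "write", "delete"},
    "Reader": {"read"},
    "Storage Blob Data Owner": {"read", "write", "delete", "set_acl"},
    "Storage Blob Data Contributor": {"read", "write", "delete"},
    "Storage Blob Data Reader": {"read", "list"},
}

def minimum_role(required: set) -> str:
    """Return the least-privileged role (fewest capabilities) that
    covers all of the required actions."""
    candidates = [r for r, caps in ROLE_CAPABILITIES.items() if required <= caps]
    return min(candidates, key=lambda r: len(ROLE_CAPABILITIES[r]))

# Assigning POSIX access controls requires storage blob data owner:
print(minimum_role({"read", "write", "set_acl"}))  # → Storage Blob Data Owner
```

The same logic answers Question 1: of the roles above, only storage blob data owner includes the ability to assign POSIX access controls.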

As a quick security note, even in a test or POC environment, I would still avoid giving owner permissions. As an example here: if I have a team of developers building a POC data warehouse in Synapse and they have all the data they’ll need in storage, there’s no need for them to have as high of access as owner.

When it comes to POSIX access controls, these allow us more granular control over folders and files within a POSIX-compliant operating system (Unix/Linux). As a shallow dive into these, as this is a deep topic in and of itself, the below POSIX codes represent the levels of access for Owner/Group/Other:

  • r represents read permission; the numerical value of 4 represents this.
  • w represents write permission; the numerical value of 2 represents this.
  • x represents execute permission; the numerical value of 1 represents this.
  • - represents no permission; the numerical value of 0 represents this.

An example of this would be rwxr--r--. This means the following:

  • The owner can read, write and execute
  • The group can read
  • Other can read

Translating this into numerical values, this would equate to 744. To get these values, we add the numerical values for each permission. In the case of the owner, they have read, write and execute (4 + 2 + 1 = 7). The group only has read (4 = 4). And other only has read (4 = 4). Keep in mind that this is a very superficial dive into POSIX so that we have an understanding of where we might apply this with Azure storage. I highly recommend this article if you want to deep dive into POSIX with the Unix/Linux system.
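The symbolic-to-numerical translation above is mechanical enough to sketch in a few lines of Python (a hypothetical helper for illustration, not part of any Azure SDK):

```python
def posix_to_octal(mode: str) -> str:
    """Convert a 9-character POSIX mode string (e.g. 'rwxr--r--')
    to its numerical form (e.g. '744')."""
    values = {"r": 4, "w": 2, "x": 1, "-": 0}
    digits = []
    # Walk the owner, group, and other triads in turn and sum each one
    for i in range(0, 9, 3):
        triad = mode[i:i + 3]
        digits.append(str(sum(values[c] for c in triad)))
    return "".join(digits)

print(posix_to_octal("rwxr--r--"))  # → 744 (owner rwx, group r--, other r--)
```

The same helper confirms the worked example: rwx is 4 + 2 + 1 = 7, and each r-- is 4.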

Working With Access Control Lists (ACLs)

When assigning ACLs, ADLS Gen2 is required for the storage account in question. As highlighted above, we assign ACLs when we want extremely specific access controls on folders and files within containers of an Azure storage account. In our development example here, we’ll be using PowerShell; however, you can also assign these in the Azure portal by navigating to the container and folder in question and selecting “Manage ACL” from the dropdown.

For PowerShell, I will assume that you already have the Az module installed and that you know how to select the appropriate subscription. In the below PowerShell, we get the storage context by using our Microsoft Entra ID. If you cannot execute this with your Microsoft Entra ID, you will need to either evaluate your permissions or use an alternate method (such as using a key). As we’ve seen earlier, you will need a minimum of storage blob data owner in order to assign POSIX controls.

$storagecontext = New-AzStorageContext -StorageAccountName "OurStorageAccount" -UseConnectedAccount

Next, we’ll get the ACL permissions for a file and output them to PowerShell’s ISE; this code assumes that you have a storage account set up with a path and a file for which you can get the ACL information:

$filesystem = "OurFileSystem"
$file = "OurPath/OurFile.txt"
$result = Get-AzDataLakeGen2Item -Context $storagecontext -FileSystem $filesystem -Path $file
Write-Output $result.ACL

Let’s set the permissions that we had above this and highlight what we’ll be doing here:

  • The user can read, write and execute rwx (numerical value of 7)
  • The group can read r-- (numerical value of 4)
  • Other can read r-- (numerical value of 4)

$filesystem = "OurFileSystem"
$path = "OurPath/"
# Build the ACL object: owner rwx, group r--, other r-- (744)
$updateacl = Set-AzDataLakeGen2ItemAclObject -AccessControlType user -Permission rwx
$updateacl = Set-AzDataLakeGen2ItemAclObject -AccessControlType group -Permission r-- -InputObject $updateacl
$updateacl = Set-AzDataLakeGen2ItemAclObject -AccessControlType other -Permission r-- -InputObject $updateacl
Update-AzDataLakeGen2Item -Context $storagecontext -FileSystem $filesystem -Path $path -Acl $updateacl
$result = Get-AzDataLakeGen2Item -Context $storagecontext -FileSystem $filesystem -Path $path
Write-Output $result.ACL

Remember that you must have permissions to update the ACLs. If you do not have permissions, you will get an error.
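When reading ACLs back, it helps to be comfortable with the short text form that POSIX tooling uses for ACL entries (the Az cmdlets return entry objects rather than this string, so treat the helper below as an illustrative sketch of the notation, not of the cmdlet output):

```python
def parse_acl(acl_text: str) -> dict:
    """Parse a short-form POSIX ACL string such as
    'user::rwx,group::r--,other::r--' into a scope -> permission map."""
    entries = {}
    for entry in acl_text.split(","):
        scope, qualifier, perms = entry.split(":")
        # An empty qualifier means the owning user/group;
        # a named qualifier (e.g. 'user:alice:r--') is a specific principal.
        key = scope if not qualifier else f"{scope}:{qualifier}"
        entries[key] = perms
    return entries

acl = parse_acl("user::rwx,group::r--,other::r--")
print(acl)  # → {'user': 'rwx', 'group': 'r--', 'other': 'r--'}
```

This matches the 744 example we set above: owner rwx, group r--, other r--.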

As a general best practice, unless you have a granular view of permissions, I would avoid mixing user permissions, such as using ACLs along with standard RBAC roles. As we see with ACLs, if we are going to use them, then we need insight into those permissions; if we combine RBAC roles with ACLs, we must ensure that we have insight into both. Unfortunately, in most environments this becomes complex, even when administrators feel excited about using a new Azure tool. I would highly recommend that you stick with one standard, even if that feels less exciting; I’ve seen two examples where a user had permissions through several overlooked avenues, and this caused issues when the user was no longer with the company. All this being said, ACLs can enhance security and allow administrators more control over permissions if this is the preferred route.

When Would We Use An SAS Token?

After covering permissions with role-based access controls and access control lists, we may wonder when we would use a shared access signature. For external users that will have regular access to resources and will need to develop or build on storage resources, we would seldom provide an SAS token because it may invite more problems than it solves. In this case, I would turn to RBAC roles or ACLs.

However, SAS tokens are extremely useful when we want a limited-time access policy along with heavy restrictions on access to the data. An example would be a daily file that is generated and provided to external users, where we give the external users a window of 7 days to extract the data from the file from a specific IP address range. However, even with this control, because we cannot audit who generates SAS tokens, I would be extremely cautious about allowing this. If you do allow this, you will want to regularly audit Azure Monitor for who is accessing your storage account.
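Conceptually, an SAS token is just the constraints (permissions, expiry, IP range) encoded as query parameters plus an HMAC-SHA256 signature computed over them with the account key, so the service can verify nothing was tampered with. The sketch below illustrates only that mechanism; it deliberately does not reproduce Azure's exact string-to-sign format, and the key and IP range are made-up demo values (though `sp`, `se`, and `sip` are real SAS query parameter names).

```python
import base64
import hashlib
import hmac
from datetime import datetime, timedelta, timezone

def sign_constraints(account_key_b64: str, constraints: dict) -> str:
    """Illustrative only: sign SAS-style constraints with HMAC-SHA256,
    the same primitive Azure uses. The real Azure string-to-sign
    format differs; see the service SAS documentation."""
    string_to_sign = "\n".join(f"{k}={v}" for k, v in sorted(constraints.items()))
    key = base64.b64decode(account_key_b64)
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("utf-8")

demo_key = base64.b64encode(b"demo-account-key").decode()  # made-up key
expiry = datetime.now(timezone.utc) + timedelta(days=7)
constraints = {
    "sp": "r",                                    # permissions: read only
    "se": expiry.strftime("%Y-%m-%dT%H:%M:%SZ"),  # expiry: 7-day window
    "sip": "203.0.113.0-203.0.113.255",           # allowed IP range (demo)
}
token = "&".join(f"{k}={v}" for k, v in sorted(constraints.items()))
token += "&sig=" + sign_constraints(demo_key, constraints)
```

Because the signature covers the constraints, an external user cannot widen the permissions or extend the expiry without invalidating the token; only a holder of the account key can mint a new one, which is exactly why unaudited token generation is the risk discussed above.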

Answers

  • Answer 1: Storage blob data owner
  • Answer 2: Service shared access signature. A user delegated shared access signature can only be used for blob storage. An account shared access signature can be used with any type of Azure storage technology (blob, table, file, queue) and is better if you want to apply access to multiple resources. The most restricted in this case would be a service shared access signature.
  • Answer 3: rwxrw-r--; numerical value would be 764
  • Answer 4: While you could assign ACLs, that would be overkill in this instance, as an RBAC role will work. The reasons the RBAC role is better: it’s broader, this is a development environment (which means it has no production data), and the data engineers don’t need an owner-level role to use Azure Synapse.

Note: all images in the post are either created through actual code runs or from Pixabay. The written content of this post is copyrighted; all rights reserved. None of the written content in this post may be used in any artificial intelligence.
