Azure Blob Storage Interview Questions

What is an Azure Storage Account?

It contains all of your Azure Storage data objects: blobs (organized in containers), files, queues, and tables.
It provides a unique namespace, with a separate endpoint for each storage service, that can be accessed from anywhere in the world over HTTP or HTTPS.
It is an Azure resource and is included in a resource group.
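
As a rough illustration of the per-service endpoints, the sketch below builds the default endpoint URL of each service from an account name. The account name `mystorageacct` and the helper `default_endpoints` are hypothetical, and the sketch assumes the standard `core.windows.net` suffix of the Azure public cloud (no custom domains or sovereign clouds).

```python
# Sketch: default public endpoints of a storage account (Azure public cloud assumed).
SERVICES = ("blob", "file", "queue", "table")


def default_endpoints(account_name: str) -> dict:
    """Return the default HTTPS endpoint for each storage service."""
    return {
        service: f"https://{account_name}.{service}.core.windows.net"
        for service in SERVICES
    }


# "mystorageacct" is a made-up account name used only for illustration.
for service, url in default_endpoints("mystorageacct").items():
    print(f"{service:5} -> {url}")
```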

What is Azure Blob Storage? What kind of content is ideal for storing in Blob Storage?

Azure Blob Storage is an object storage solution provided by Azure that can be accessed from anywhere in the world using HTTP or HTTPS protocols.

Blob Storage is ideal for storing massive amounts of unstructured data such as text or binary data.

Images and documents that can be served directly to the browser
Text and binary files for distributed access
Streaming video and audio files
Storing backup data
Storing data for analysis

What is a container in Azure Storage?

A container organizes a set of blobs, similar to a directory in a file system. A storage account can include an unlimited number of containers, and a container can store an unlimited number of blobs. Container names must be 3 to 63 characters long and may contain only lowercase letters, numbers, and hyphens.
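
The naming rule can be checked before calling the service. The sketch below is a minimal validator, assuming the documented rules for ordinary containers: 3 to 63 characters, lowercase letters, digits, and hyphens only, starting and ending with a letter or digit, and no consecutive hyphens. The function name is hypothetical, and system containers such as $web are deliberately not covered.

```python
import re

# One or more lowercase alphanumeric groups separated by single hyphens.
_NAME_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_container_name(name: str) -> bool:
    """Check a name against the rules for ordinary (non-system) containers."""
    return 3 <= len(name) <= 63 and _NAME_PATTERN.fullmatch(name) is not None


print(is_valid_container_name("my-container"))  # valid
print(is_valid_container_name("MyContainer"))   # invalid: uppercase letters
```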

How can you host a static website in Azure Storage Account?

Go into your storage account.
In the left-hand menu, find the Data management section.
There you’ll find the Static website option. Click on it.
You’ll have an option to enable it.
Once you’ve enabled it, a $web container is created.
There are two fields to fill in: the index document name and the error document path. The error document path is optional.
You’ll also be able to see the website’s primary endpoint.
Now go to your $web container and upload the site files programmatically or manually.
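
When uploading programmatically, each file is usually given a blob name that mirrors its relative path and a content type that matches its extension, so that browsers render the page instead of downloading it. The sketch below only prepares that mapping with the standard library; it does not call any Azure SDK, and the helper name `plan_web_uploads` is hypothetical.

```python
import mimetypes


def plan_web_uploads(relative_paths):
    """Map each local relative path to a (blob_name, content_type) pair."""
    plan = []
    for rel_path in relative_paths:
        # Blob names use forward slashes, whatever the local OS separator is.
        blob_name = rel_path.replace("\\", "/")
        content_type, _ = mimetypes.guess_type(blob_name)
        # Fall back to a generic binary type when the extension is unknown.
        plan.append((blob_name, content_type or "application/octet-stream"))
    return plan


for blob_name, content_type in plan_web_uploads(["index.html", "css\\site.css"]):
    print(blob_name, content_type)
```

The resulting pairs can then be passed to whichever upload tool or SDK you use for the $web container.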

Can we host multiple static websites on a single Azure Storage Account?

No. Each storage account has a single $web container and therefore hosts one static website, so you should create a separate Azure Storage Account for each static website.

What is soft delete for blobs?

Blob soft delete protects an individual blob, snapshot, or version from accidental deletes or overwrites by maintaining the deleted data in the system for a specified period. During the retention period, you can restore a soft-deleted object to its state when it was deleted. After the retention period has expired, the object is permanently deleted.
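
The retention window is simple date arithmetic. The sketch below assumes a retention period expressed in whole days (the service accepts 1 to 365) and uses hypothetical helper names; it models the behavior described above rather than calling the service.

```python
from datetime import datetime, timedelta, timezone


def permanent_delete_time(deleted_at: datetime, retention_days: int) -> datetime:
    """Return the moment a soft-deleted object becomes unrecoverable."""
    if not 1 <= retention_days <= 365:
        raise ValueError("retention_days must be between 1 and 365")
    return deleted_at + timedelta(days=retention_days)


def is_restorable(deleted_at: datetime, retention_days: int, now: datetime) -> bool:
    """True while the object is still inside its retention period."""
    return now < permanent_delete_time(deleted_at, retention_days)


deleted = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(permanent_delete_time(deleted, 7))
```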

What do you mean by blob snapshots?

A snapshot is a read-only version of a blob that’s taken at a point in time.

A snapshot of a blob is identical to its base blob, except that a DateTime value is appended to the blob URI to indicate the time at which the snapshot was taken.

http://storagesample.blob.core.windows.net/mydrives/myvhd?snapshot=2011-03-09T01:42:34.9360000Z
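
Since the snapshot is addressed through a query parameter, a snapshot URI can be split into its base blob URI and snapshot timestamp with the standard library. This is a small sketch; the helper name `split_snapshot_uri` is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit


def split_snapshot_uri(uri: str):
    """Return (base_blob_uri, snapshot_timestamp or None)."""
    parts = urlsplit(uri)
    snapshot = parse_qs(parts.query).get("snapshot", [None])[0]
    # Rebuild the URI without the query string to get the base blob.
    base = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return base, snapshot


base, snapshot = split_snapshot_uri(
    "http://storagesample.blob.core.windows.net/mydrives/myvhd"
    "?snapshot=2011-03-09T01:42:34.9360000Z"
)
print(base)
print(snapshot)
```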

How do you delete a blob with snapshots?

If a blob has snapshots, the blob cannot be deleted unless the snapshots are also deleted.

What base Azure resource does Azure Data Lake Storage Gen2 depend on?

Azure Data Lake Storage Gen2 is built on Azure Blob Storage.

Can we store a hierarchical namespace structure or nested directories in Blob Storage?

Yes. A hierarchical namespace can be enabled when creating a general-purpose v2 or premium block blob storage account. Accounts with a hierarchical namespace have slightly higher costs than flat-namespace Blob Storage.

What are the different types of blobs in Azure?

Block Blobs: store text and binary data. A block blob is composed of blocks, up to a maximum of 50,000 blocks per blob. Blocks can differ in size, and each block can be managed individually through the block operations the service provides.
Append Blobs: made up of blocks like block blobs, but optimized for append operations. Blocks can only be appended to the end of the blob; updating or deleting existing blocks is not allowed.
Page Blobs: a collection of 512-byte pages optimized for random read and write operations. Page blobs store virtual hard drive (VHD) files and serve as disks for Azure virtual machines.

What are the different types of Access Tiers?

Hot: optimized for data accessed frequently. It has the highest storage cost and the lowest access cost compared to the other tiers. Data in active use, or expected to be read from or written to frequently, should be kept in the hot tier.
Cool: has lower storage cost and higher access cost compared to hot storage. Data that is not used frequently but expected to be available immediately when required should be stored in the cool tier. Data should remain in this tier for at least 30 days.
Archive: has the lowest storage cost but higher data retrieval cost than hot and cool tiers. Data should remain in this tier for at least 180 days. Otherwise, a deletion charge needs to be paid. Blob data is offline in archive storage and should be rehydrated to an online tier (hot or cool) before accessing. Blob metadata remains online and can be used to list the blob.
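
The minimum retention periods translate into an early deletion charge when a blob leaves a tier too soon. The sketch below assumes the charge is prorated by the number of days remaining in the minimum period (30 for cool, 180 for archive) and returns only that number of days; actual prices are not modeled, and the helper name is hypothetical.

```python
# Minimum number of days a blob is expected to stay in each tier.
MIN_RETENTION_DAYS = {"hot": 0, "cool": 30, "archive": 180}


def early_deletion_days(tier: str, days_stored: int) -> int:
    """Days of storage billed as an early deletion charge (0 if none)."""
    minimum = MIN_RETENTION_DAYS[tier.lower()]
    return max(0, minimum - days_stored)


# A blob deleted from archive after 45 days is billed for the remaining 135.
print(early_deletion_days("archive", 45))
```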

What do we mean by Blob-level tiering?

Azure Storage offers different access tiers so that blob data can be stored in the most cost-effective manner. The hot and cool access tiers can be set at both the account level and the blob level. The tier set at the account level is inherited by every blob unless a tier is explicitly set on the blob.

Can we set the Archive tier for the entire storage account?

No. The archive tier can be set only at the blob level, individually for each blob that needs it.

Can we change the tier to archive for page or append blobs?

No. Archive storage and blob-level tiering support only block blobs.
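
The tiering rules from the last few answers can be condensed into one check. This is a sketch of those rules only (hot and cool at account or blob level, archive at blob level only, blob-level tiering for block blobs only); the function name and the string values it accepts are hypothetical.

```python
TIERS = {"hot", "cool", "archive"}


def can_set_tier(tier: str, level: str, blob_type: str = "block") -> bool:
    """Return True if the tier may be set at the given level.

    level is "account" or "blob"; blob_type is "block", "append", or "page"
    and only matters when level is "blob".
    """
    tier, level, blob_type = tier.lower(), level.lower(), blob_type.lower()
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier}")
    if level == "account":
        # Archive cannot be the default tier of an account.
        return tier in {"hot", "cool"}
    if level == "blob":
        # Blob-level tiering (including archive) applies to block blobs only.
        return blob_type == "block"
    raise ValueError(f"unknown level: {level}")


print(can_set_tier("archive", "account"))        # account-level archive
print(can_set_tier("archive", "blob", "block"))  # blob-level archive
```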

Does Azure support tiering for Premium storage accounts?

No. Data stored in a block blob storage account (Premium performance) cannot currently be tiered to hot, cool, or archive using Set Blob Tier or Azure Blob Storage lifecycle management. Instead, to move data, you must synchronously copy blobs from the block blob storage account to the hot access tier in a different account using the Put Block From URL API or a version of AzCopy that supports this API.

What are blob index tags?

As datasets get larger, finding a specific object in a sea of data can be difficult. Blob index tags provide data management and discovery capabilities by using key-value index tag attributes. You can categorize and find objects within a single container or across all containers in your storage account.

Dynamically categorize your blobs using key-value index tags
Quickly find specifically tagged blobs across an entire storage account
Specify conditional behaviors for blob APIs based on the evaluation of index tags
Use index tags for advanced controls on features like blob lifecycle management
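
To make the key-value idea concrete, the sketch below validates a set of tags and builds a filter expression for finding blobs that carry all of them. It assumes the documented limits of 10 tags per blob, keys of 1 to 128 characters, and values of up to 256 characters, and it assumes the query syntax in which keys are double-quoted and values single-quoted. The helper names are hypothetical, and values containing quotes are rejected rather than escaped.

```python
MAX_TAGS_PER_BLOB = 10


def validate_tags(tags: dict) -> None:
    """Raise ValueError if the tags exceed the assumed service limits."""
    if len(tags) > MAX_TAGS_PER_BLOB:
        raise ValueError("a blob can carry at most 10 index tags")
    for key, value in tags.items():
        if not 1 <= len(key) <= 128:
            raise ValueError(f"tag key length out of range: {key!r}")
        if len(value) > 256:
            raise ValueError(f"tag value too long for key {key!r}")
        if '"' in key or "'" in value:
            raise ValueError("quotes are not handled by this sketch")


def build_filter(tags: dict) -> str:
    """Build an expression matching blobs that have every given tag."""
    validate_tags(tags)
    return " AND ".join(f"\"{key}\" = '{value}'" for key, value in tags.items())


print(build_filter({"Project": "Contoso", "Status": "Processed"}))
```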

What is the Network File System (NFS) 3.0 protocol?

The NFS version 3 protocol enables safe asynchronous writes on the server, which improves performance by allowing the server to cache client write requests in memory. As a result, the client does not need to wait for the server to commit the changes to the disk, so the response time is faster.

It also supports 64-bit file sizes and offsets, allowing clients to access more than 2 GB of file data. In addition, it can run over either TCP or UDP on an IP network.

How does Azure Blob Storage support the NFS 3.0 protocol?

It’s always been a challenge to run large-scale legacy workloads, such as High-Performance Computing (HPC) in the cloud. One reason is that applications often use traditional file protocols such as NFS or Server Message Block (SMB) to access data.

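Blob Storage supports mounting a container over NFS 3.0, so such applications can reach blob data through a familiar file protocol. NFS 3.0 protocol support requires blobs to be organized into a hierarchical namespace. You can enable a hierarchical namespace when you create a storage account. The ability to use a hierarchical namespace was introduced by Azure Data Lake Storage Gen2.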

What are the types of Azure Storage redundancy?

Azure Storage always stores multiple copies of your data to protect it from planned and unplanned events, including transient hardware failures, network or power outages, and massive natural disasters.

When deciding which redundancy option is best for your scenario, consider the tradeoffs between lower costs and higher availability.

Locally redundant storage (LRS) copies your data synchronously three times within a single physical location in the primary region. LRS is the least expensive replication option but is not recommended for applications requiring high availability or durability.
Zone-redundant storage (ZRS) copies your data synchronously across three Azure availability zones in the primary region. For applications requiring high availability, Microsoft recommends using ZRS in the primary region and also replicating to a secondary region.
Geo-redundant storage (GRS) copies your data synchronously three times within a single physical location in the primary region using LRS. It then copies your data asynchronously to a single physical location in the secondary region. Finally, within the secondary region, your data is replicated synchronously three times using LRS.
Geo-zone-redundant storage (GZRS) copies your data synchronously across three Azure availability zones in the primary region using ZRS. It then copies your data asynchronously to a single physical location in the secondary region. Finally, within the secondary region, your data is replicated synchronously three times using LRS.
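
The four options differ along two axes: whether the primary region spreads copies across availability zones, and whether data is also copied to a secondary region. The sketch below encodes only that decision, with hypothetical function and parameter names; cost and regional availability are not modeled.

```python
# Total copies kept by each option, per the descriptions above.
COPIES = {"LRS": 3, "ZRS": 3, "GRS": 6, "GZRS": 6}


def pick_redundancy(survive_zone_outage: bool, survive_region_outage: bool) -> str:
    """Return the least redundant option that meets both requirements."""
    if survive_region_outage:
        return "GZRS" if survive_zone_outage else "GRS"
    return "ZRS" if survive_zone_outage else "LRS"


choice = pick_redundancy(survive_zone_outage=True, survive_region_outage=False)
print(choice, COPIES[choice])
```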

How many availability zones are present in a region?

A region that supports availability zones has at least three zones, each made up of one or more physical datacenters. If we choose ZRS, the three copies of the data sit in three different zones.

Can we use Azure Managed Disks with the ZRS/GRS/GZRS options?

Azure Managed Disks support LRS (locally redundant storage) and, for some disk types and regions, ZRS. GRS and GZRS are not available for managed disks.

What is object replication for block blobs?

Object replication asynchronously copies block blobs between a source storage account and a destination account. Some scenarios supported by object replication include:

Minimizing latency. Object replication can reduce latency for read requests by enabling clients to consume data from a region that is in closer physical proximity.
Increasing efficiency for compute workloads. With object replication, compute workloads can process the same sets of block blobs in different regions.
Optimizing data distribution. You can process or analyze data in a single location and then replicate the results to additional regions.
Optimizing costs. You can reduce expenses after your data has been replicated by moving it to the archive tier using life cycle management policies.

What are the differences between a page blob and a block blob?

Page blobs are a collection of 512-byte pages optimized for random read and write operations. To create a page blob, you initialize the page blob and specify the maximum size the page blob will grow to. To add or update the contents of a page blob, you write a page or pages by specifying an offset and a range that both align to 512-byte page boundaries. A write to a page blob can overwrite just one page, some pages, or up to 4 MiB of the page blob. Writes to page blobs happen in place and are immediately committed to the blob. The maximum size for a page blob is 8 TiB.
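
The alignment and size rules for page writes can be expressed as a short check. This sketch only validates the numbers described above (512-byte alignment, at most 4 MiB per write, staying within the blob's declared size); the function name is hypothetical and nothing is sent to the service.

```python
PAGE_SIZE = 512
MAX_WRITE_BYTES = 4 * 1024 * 1024  # 4 MiB per write operation


def is_valid_page_write(offset: int, length: int, blob_size: int) -> bool:
    """Check that a write aligns to page boundaries and fits in the blob."""
    aligned = offset % PAGE_SIZE == 0 and length % PAGE_SIZE == 0
    sized = 0 < length <= MAX_WRITE_BYTES
    in_bounds = offset >= 0 and offset + length <= blob_size
    return aligned and sized and in_bounds


print(is_valid_page_write(offset=0, length=512, blob_size=1024))    # aligned
print(is_valid_page_write(offset=100, length=512, blob_size=1024))  # misaligned
```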

Block blobs are optimized for uploading large amounts of data efficiently. Block blobs are composed of blocks, each of which is identified by a block ID. A block blob can include up to 50,000 blocks. Each block in a block blob can have a different size, up to the maximum permitted for the service version in use, which is 4000 MiB in the latest versions.
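
These limits determine how an upload must be split. The sketch below computes the number of blocks needed for a given blob and block size and derives the resulting maximum blob size, assuming the 4000 figure above is in MiB. The helper name is hypothetical.

```python
MAX_BLOCKS = 50_000
MAX_BLOCK_BYTES = 4000 * 1024 * 1024  # 4000 MiB, assumed latest service limit
MAX_BLOB_BYTES = MAX_BLOCKS * MAX_BLOCK_BYTES


def blocks_needed(blob_bytes: int, block_bytes: int) -> int:
    """Number of blocks required to upload blob_bytes in block_bytes chunks."""
    if not 0 < block_bytes <= MAX_BLOCK_BYTES:
        raise ValueError("block size out of range")
    count = -(-blob_bytes // block_bytes)  # ceiling division on integers
    if count > MAX_BLOCKS:
        raise ValueError("too many blocks; use a larger block size")
    return count


MIB = 1024 * 1024
print(blocks_needed(10 * MIB, 4 * MIB))       # 10 MiB in 4 MiB blocks
print(round(MAX_BLOB_BYTES / 1024 ** 4, 1))   # maximum blob size in TiB
```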

What is a blob lease in Azure Storage?

The Lease Blob operation creates and manages a lock on a blob for write and delete operations. The lock duration can be 15 to 60 seconds or can be infinite.

The Lease Blob operation can be called in one of five modes:

Acquire, to request a new lease.
Renew, to renew an existing lease.
Change, to change the ID of an existing lease.
Release, to free the lease if it is no longer needed so that another client may immediately acquire a lease against the blob.
Break, to end the lease but ensure that another client cannot acquire a new lease until the current lease period has expired.
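
The duration rule and the five modes can be captured in a few lines. This sketch assumes the REST convention of passing -1 to request an infinite lease; the constant and function names are hypothetical, and no request is sent to the service.

```python
LEASE_ACTIONS = {"acquire", "renew", "change", "release", "break"}
INFINITE_LEASE = -1  # assumed sentinel for a lease that never expires


def is_valid_lease_duration(seconds: int) -> bool:
    """A lease lasts 15 to 60 seconds, or is infinite."""
    return seconds == INFINITE_LEASE or 15 <= seconds <= 60


def is_valid_lease_action(action: str) -> bool:
    """Check the action against the five Lease Blob modes."""
    return action.lower() in LEASE_ACTIONS


print(is_valid_lease_duration(30))
print(is_valid_lease_duration(INFINITE_LEASE))
print(is_valid_lease_duration(5))
```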

How can we set up multi-region static website hosting using an Azure storage account?

If you plan to host a website in multiple geographies, we recommend using a Content Delivery Network for regional caching. Use Azure Front Door if you want to serve different content in each region. It also provides failover capabilities.