
How to store data in Azure

2 July 2022 · Azure

Cloud providers have grown huge.

They have developed a massive list of services that can be really confusing.

Most of the services are made for enterprise companies to entice them onto the cloud. Most of us will never need any of them.

We only really need the core Azure services to make our application work.

Picking these core services out from all the others is tough, which is why I wanted to write this blog post.

There are loads of data storage options in Azure, but which one is the best? I’ve broken this question down by data type.

For each data type, I’ve summarized the best options Azure has, including the most notable and useful features of each service.


Tabular Data

Let’s start with the most common data type, tabular data.

Data is stored in tables with columns and rows.

Most of the time, tables can be related to each other to show the relationships in the data.

An example of tabular data could be a table containing information about an application’s users:

UserId | Email            | Password Hash                    | Profile Picture
5761   | [email protected] | cea6d76a1abf56e23b09f873847b9942 | raj-profile1.jpeg
5762   | [email protected] | f1714434fcf2012d9ee452ee6f0bdf50 | penny-profile1.jpeg
5763   | [email protected] | 11026e7d4395d27454c39477eb528aad | sheldon-profile3.jpeg
5764   | [email protected] | 5da5f1466e81f2a213c854363848bd40 | amy-profile2.jpeg

Example of tabular data showing users
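To make the idea of related tables concrete, here is a minimal sketch using Python's built-in sqlite3 module as a local stand-in for a real SQL database. The users table mirrors the example above. The posts table, the column names, and the example.com email addresses are assumptions made up for illustration.

```python
import sqlite3

# In-memory SQLite is only a stand-in here; it is not Azure SQL Database.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("""
    CREATE TABLE users (
        user_id         INTEGER PRIMARY KEY,
        email           TEXT NOT NULL,
        password_hash   TEXT NOT NULL,
        profile_picture TEXT
    )
""")

# Hypothetical second table: each post points back at the user who wrote it.
conn.execute("""
    CREATE TABLE posts (
        post_id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(user_id),
        title   TEXT NOT NULL
    )
""")

conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?)",
    [
        # Emails are placeholders, not the addresses from the original table.
        (5761, "raj@example.com", "cea6d76a1abf56e23b09f873847b9942", "raj-profile1.jpeg"),
        (5762, "penny@example.com", "f1714434fcf2012d9ee452ee6f0bdf50", "penny-profile1.jpeg"),
    ],
)
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [
        (1, 5761, "Hello Azure"),
        (2, 5761, "Table Storage tips"),
        (3, 5762, "My first post"),
    ],
)

# The relationship lets us join the two tables in a single query.
rows = conn.execute("""
    SELECT u.email, p.title
    FROM users AS u
    JOIN posts AS p ON p.user_id = u.user_id
    ORDER BY p.post_id
""").fetchall()

for email, title in rows:
    print(email, "wrote", title)
```

The same CREATE TABLE and JOIN ideas carry over to SQL Server, although the exact data types and syntax differ between database engines.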

Azure SQL Database

Azure SQL icon

SQL is by far the most widely used database language. Many databases use it to query the data and to manage the database itself.

Microsoft has its own SQL-based database system called SQL Server. Azure provides managed versions of SQL Server, which are probably the best way to store tabular data in Azure.

To make things confusing, there are a few different SQL Server services within Azure. They cover specific scenarios like migrating legacy on-premises SQL Servers to Azure, or SQL for edge computing and IoT devices. For a cloud-native application, you should start with Azure SQL Database.

Notable features

  • Serverless Pricing Tier - The price of a database is split between storage and compute. Compute is used when the database does work such as executing queries. With the serverless pricing tier, your database scales its compute up and down with usage. When the database is not receiving any requests it can pause, so you pay for storage only and nothing for compute! This is awesome for databases with sporadic usage.
  • Geo-replication - Automatically replicate data from your primary database to a secondary database in another region.
  • Encryption - Azure SQL Database makes protecting your data easy. All of the data in the database will be encrypted at rest by checking a checkbox.
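  • Elastic Pool - Several databases share one pool of resources, which suits groups of databases with varying and unpredictable usage. See https://docs.microsoft.com/en-gb/azure/azure-sql/database/elastic-pool-overview?view=azuresql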

Table Storage

Table storage is a favorite of mine. It’s an underrated service.

Azure table icon

That’s a table!

It’s massively scalable and a lot cheaper than a traditional SQL database.

It is consumption-based, so you don’t pay for things like RAM and CPU as you would with a traditional database. You only pay for the amount of data you store and the number of transactions you execute.

It’s very easy to set up. No time is wasted on defining the columns of your data because it is schemaless.

The downside of Table storage is that it’s not as flexible as normal SQL. It doesn’t have all the basic database operations that you would expect. For example, queries are very limited because Table storage has no secondary indexes: only the partition key and row key are indexed.

Partition Key | Row Key  | Username     | Country
gmail.com     | dave123  | Dave Watts   | United Kingdom
gmail.com     | spuddy8  | Sam Puddy    | United States
outlook.com   | rossyroo | Rossy Carter | France
outlook.com   | niros67  | Hollie Niros | Spain

Example of data in Table Storage

Notable Features

  • Partitions and Row keys - A partition key and a row key are the only schema needed. These dictate how the data is stored and indexed in Table Storage. It’s an important decision that depends on the data and the number of reads/writes. Microsoft gives good recommendations in their documentation about how to choose partition and row keys.
  • Redundancy - Like all good Azure services, Table storage has built-in data redundancy. Data can be copied automatically across availability zones.
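To see why the choice of partition and row keys matters so much, here is a toy in-memory model of a key-based table. It is only a sketch: ToyTable and its methods are made-up names, not the Azure SDK, and the real service is far more sophisticated. The data is the example table from above.

```python
class ToyTable:
    """Toy model of a partition key / row key store (not the Azure SDK)."""

    def __init__(self):
        # partition key -> {row key -> entity properties}
        self._partitions = {}

    def upsert(self, partition_key, row_key, **properties):
        # Schemaless: every entity can carry its own set of properties.
        self._partitions.setdefault(partition_key, {})[row_key] = properties

    def get(self, partition_key, row_key):
        # Point lookup: both keys are known, so nothing has to be scanned.
        return self._partitions.get(partition_key, {}).get(row_key)

    def query_partition(self, partition_key):
        # Partition query: only the entities of one partition are read.
        return list(self._partitions.get(partition_key, {}).values())

    def scan(self, predicate):
        # Filtering on any non-key property has to look at every entity.
        return [
            entity
            for rows in self._partitions.values()
            for entity in rows.values()
            if predicate(entity)
        ]


table = ToyTable()
table.upsert("gmail.com", "dave123", Username="Dave Watts", Country="United Kingdom")
table.upsert("gmail.com", "spuddy8", Username="Sam Puddy", Country="United States")
table.upsert("outlook.com", "rossyroo", Username="Rossy Carter", Country="France")
table.upsert("outlook.com", "niros67", Username="Hollie Niros", Country="Spain")

dave = table.get("gmail.com", "dave123")               # cheap: uses both keys
outlook_users = table.query_partition("outlook.com")   # reads one partition
in_spain = table.scan(lambda e: e["Country"] == "Spain")  # reads everything
```

Lookups that use both keys stay cheap however big the table gets, while a filter on Country has to touch every entity. That is the trade-off to keep in mind when picking keys.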

Real-life example of Azure Table Storage

Splash screen of Have I been pwned

I hope not!

Have I Been Pwned used Table storage in a previous iteration of the data breach service.

Troy Hunt (no relation), a web security consultant and Microsoft MVP, runs the service. As part of HIBP he used Table storage to store 154 million records.

He found that storing 100GB of data and hitting it 10 million times cost only $8 a month. He also found the query speed very good, with results returning in as little as 4 milliseconds.

Read more on his usage of Table storage in his very detailed blog post.


NoSQL Document Data

Relational databases are cool but a lot of planning is needed to design the tables and relations. They still power most of the applications we use today but many developers are caring less about the data layer and just want something simple and easy so they can focus on other things.

Instead of storing data in related tables, document databases store data in combined documents. Data spread out over 5 different tables in a relational model could be stored inside a single document. Documents are usually represented in blocks of JSON.

A lot of developers find document databases easier to use. A common critique of relational models is their clunkiness with object-oriented languages. Translating related tables into objects can be awkward, whereas translating documents into objects is very smooth.
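As a rough illustration of how smoothly a document maps onto objects, here is a small Python sketch. The document shape, the field names, and the values are all made up for this example. In a relational model the same data would likely be spread over users, addresses, and orders tables.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Address:
    city: str
    country: str


@dataclass
class Order:
    order_id: int
    item: str


@dataclass
class User:
    user_id: int
    email: str
    address: Address
    orders: List[Order]


# One hypothetical document holds the user together with all related data.
document = """
{
  "user_id": 5761,
  "email": "raj@example.com",
  "address": {"city": "London", "country": "United Kingdom"},
  "orders": [
    {"order_id": 1, "item": "Keyboard"},
    {"order_id": 2, "item": "Monitor"}
  ]
}
"""


def user_from_document(raw: str) -> User:
    # The nesting of the JSON matches the nesting of the objects one to one.
    data = json.loads(raw)
    return User(
        user_id=data["user_id"],
        email=data["email"],
        address=Address(**data["address"]),
        orders=[Order(**order) for order in data["orders"]],
    )


user = user_from_document(document)
print(user.address.city, len(user.orders))
```

No joins are needed to rebuild the object, because everything it owns lives in the same document.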

Cosmos DB

Azure cosmos db icon

It’s the size of a planet!

Time to talk about the big dog.

Cosmos DB is Microsoft’s proprietary, fully managed NoSQL database.

It’s designed to be as easy as possible for developers.

Cosmos takes care of problems like scaling, performance, and global distribution. It does this with little configuration and setup needed.

When developers don’t have to worry about the database they can focus on building amazing applications.

Notable features

  • Planet scale - Max storage for a Cosmos container? Unlimited.
  • Guaranteed speed - Speed and availability are actually guaranteed by a Cosmos DB SLA. Microsoft will give you credits for any failures to meet this SLA.
  • Multiple Models - Cosmos is not just NoSQL documents. Data can be stored in key-value, column-family, and graph database formats.
  • Multiple APIs - As well as multiple models, Cosmos supports multiple APIs. You can interact with Cosmos through MongoDB, Gremlin, Cassandra, and Table storage APIs. Migrating an application that already uses one of these APIs can be as simple as changing a connection string.
  • Consistency levels - Cosmos has 5 levels of data consistency: Strong, Bounded staleness, Session, Consistent prefix, and Eventual. More levels allow developers to fine-tune the trade-off between consistency and performance to their needs.
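The difference between consistency levels is easier to feel with a toy model. The sketch below is not how Cosmos DB is implemented and only imitates three of the five levels (Strong, Session, and Eventual). ToyReplicatedStore and its methods are made-up names: one primary copy, one replica that lags behind until replicate() is called.

```python
class ToyReplicatedStore:
    """Toy model: one primary and one replica that lags until replicate()."""

    def __init__(self):
        self.primary = {}   # key -> (version, value)
        self.replica = {}   # possibly stale copy of the primary
        self.version = 0

    def write(self, key, value):
        # Writes land on the primary. The returned version acts as a
        # session token the client can present on later reads.
        self.version += 1
        self.primary[key] = (self.version, value)
        return self.version

    def replicate(self):
        # Replication catches up: the replica now matches the primary.
        self.replica = dict(self.primary)

    def read_strong(self, key):
        # Strong: always returns the latest write.
        entry = self.primary.get(key)
        return entry[1] if entry else None

    def read_eventual(self, key):
        # Eventual: returns whatever the replica has, which may be stale.
        entry = self.replica.get(key)
        return entry[1] if entry else None

    def read_session(self, key, session_token):
        # Session: a client must at least see its own writes, so fall back
        # to the primary when the replica is behind the session token.
        entry = self.replica.get(key)
        if entry is None or entry[0] < session_token:
            entry = self.primary.get(key)
        return entry[1] if entry else None


store = ToyReplicatedStore()
store.write("score", 10)
store.replicate()                  # the replica has seen score = 10
token = store.write("score", 20)   # the replica has not seen this yet

strong = store.read_strong("score")
eventual = store.read_eventual("score")
session = store.read_session("score", token)

store.replicate()
eventual_after_sync = store.read_eventual("score")
```

In this toy, the eventual read is served from the lagging replica and can return the old value, while the strong and session reads always reflect the client's latest write. Weaker levels generally trade freshness for lower latency and cost.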

Real-life example of Cosmos DB

Screenshot of walking dead game

Zombies!

The Walking Dead: No Man’s Land is a popular mobile game that uses Cosmos DB.

The game exploded when it was released. It was downloaded 1 million times in the first weekend. The scalability of Cosmos ensured this sudden surge in players was no problem.

According to the game’s website, the game has been downloaded over 23 million times. On average, players complete 8.7 million missions every week. 79 billion walkers have been killed since the game was released.

Read more on how the game uses Cosmos.


Big Files

Okay, so we’ve covered the data types that get your application going, but what about our big files?

Where do we put our images, PDFs, videos, and log files?

Blob Storage

Azure blob storage icon

I was looking for somewhere to store my 1’s and 0’s

Blob storage is the simplest service on this list.

It’s for storing large amounts of unstructured data.

Simple but very powerful.

Notable Features

  • CDN - Blob storage integrates with Azure CDN to cache blobs close to your users. So if you’re storing images for a website, loading them from the CDN reduces load times for your site.
  • Hot, Cool, and Archive access tiers - Access tiers let you control the cost of storing your data, depending on how often it is accessed. For data that is accessed and modified frequently, the Hot tier has the lowest access costs but the highest storage costs. The Archive tier is for data that you save just in case but don’t look at for years. The Cool tier sits in between the two.
  • TTL - Blob lifecycle policies let you delete a blob or change its tier after a period of time.
  • Index tags - These let you categorize your blobs with key-value tags. They can be extremely useful if you have a lot of blobs to search through.
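To show how access tiers and lifecycle policies fit together, here is a small sketch of the kind of rule a policy expresses. The thresholds, blob names, and dates are hypothetical, and the function only imitates the decision locally. In Azure you would configure a lifecycle management policy on the storage account rather than run code like this.

```python
from datetime import date

# Hypothetical thresholds, in days since the blob was last modified.
COOL_AFTER_DAYS = 30
ARCHIVE_AFTER_DAYS = 180
DELETE_AFTER_DAYS = 730


def lifecycle_action(last_modified: date, today: date) -> str:
    """Return the tier (or 'delete') a blob of this age would end up in."""
    age_days = (today - last_modified).days
    # Check the oldest threshold first so the strongest rule wins.
    if age_days > DELETE_AFTER_DAYS:
        return "delete"
    if age_days > ARCHIVE_AFTER_DAYS:
        return "archive"
    if age_days > COOL_AFTER_DAYS:
        return "cool"
    return "hot"


today = date(2022, 7, 2)
blobs = {
    "logo.png": date(2022, 6, 25),
    "invoice-2022-03.pdf": date(2022, 3, 1),
    "app-2021-06.log": date(2021, 6, 1),
    "backup-2019.zip": date(2019, 1, 1),
}

actions = {
    name: lifecycle_action(modified, today)
    for name, modified in blobs.items()
}
for name, action in actions.items():
    print(name, "->", action)
```

Frequently touched files stay in Hot, older files drift down to the cheaper tiers, and very old files are removed, all without anyone having to remember to do it.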

Summary

Service       | Data Type | Qualities
Azure SQL     | Tabular   | General purpose
Table Storage | Tabular   | Cheap; high performance; limited querying functionality
Cosmos DB     | NoSQL     | Can be expensive; easy to use; guaranteed speed and availability; multi-model and multi-API
Blob Storage  | Big files | Store any kind of file