Cosmos DB is a fantastic database service for many different types of applications. But it can also be quite expensive, especially if you have a number of instances of your database to maintain. For example, in some enterprise development teams you may need to have dev, test, UAT, staging, and production instances of your application and its components. Assuming you're following best practices and keeping these isolated from each other, that means you're running at least five Cosmos DB collections. It's easy for someone to accidentally leave one of these Cosmos DB instances provisioned at a higher throughput than you expect, and before long you're racking up large bills, especially if the higher throughput is left overnight or over a weekend.
In this post I'll describe an approach I've been using recently to ensure the Cosmos DB collections in my subscriptions aren't causing costs to escalate. I've created an Azure Function that will run on a regular basis. It uses a managed service identity to identify the Cosmos DB accounts throughout my whole Azure subscription, and then it looks at each collection in each account to check that they are set at the expected throughput. If it finds anything over-provisioned, it sends an email so that I can investigate what's happening. You can run the same function to help you identify over-provisioned collections too.
Step 1: Create Function App
First, we need to set up an Azure Functions app. You can do this in many different ways; for simplicity, we'll use the Azure Portal for everything here.
Click Create a Resource on the left pane of the portal, and then choose Serverless Function App. Enter the information it prompts for - a globally unique function app name, a subscription, a region, and a resource group - and click Create.
Step 2: Enable a Managed Service Identity
Once we have our function app ready, we need to give it a managed service identity. This will allow us to connect to our Azure subscription and list the Cosmos DB accounts within it, but without us having to maintain any keys or secrets. For more information on managed service identities, check out my previous post.
Open up the Function Apps blade in the portal, open your app, and click Platform Features, then Managed service identity. Switch the feature to On and click Save.
Step 3: Create Authorisation Rules
Now we have an identity for our function, we need to grant it access to the parts of our Azure subscription we want it to examine for us. In my case I'll grant it the rights over my whole subscription, but you could just give it rights on a single resource group, or even just a single Cosmos DB account. Equally you can give it access across multiple subscriptions and it will look through them all.
Open up the Subscriptions blade and choose the subscription you want it to look over. Click Access Control (IAM), and then click the Add button to create a new role assignment.
The minimum role we need to grant the function app is called Cosmos DB Account Reader Role. This allows the function to discover the Cosmos DB accounts, and to retrieve the read-only keys for those accounts, as described here. The function app can't use this role to make any changes to the accounts.
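If you prefer the command line, the same role assignment can be made with the Azure CLI. The placeholders below are yours to fill in; the identity's principal ID is shown on the Managed service identity page:

```shell
# Assign the Cosmos DB Account Reader Role to the function app's identity
# at subscription scope. Replace the placeholders with your own values.
az role assignment create \
  --assignee <principal-id-of-function-app> \
  --role "Cosmos DB Account Reader Role" \
  --scope "/subscriptions/<subscription-id>"
```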
Finally, enter the name of your function app, click it, and click Save.
This will create the role assignment. Your function app is now authorised to enumerate and access Cosmos DB accounts throughout the subscription.
Step 4: Add the Function
Next, we can actually create our function. Go back into the function app and click the + button next to Functions. We'll choose to create a custom function, and then choose a timer trigger. Choose C# for the language, and enter the name CosmosChecker. (Feel free to use a name with more panache if you want.) Leave the timer settings alone for now:
Your function will open up with some placeholder code. We'll ignore this for now. Click the View files button on the right side of the page, and then click the Add button. Create a file named project.json, open it, paste in the following, and then click Save.
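The file isn't reproduced inline here, but a project.json along these lines would do the job for a v1 (.NET Framework) function app; the exact package versions are illustrative rather than prescriptive:

```json
{
  "frameworks": {
    "net46": {
      "dependencies": {
        "Microsoft.Azure.Services.AppAuthentication": "1.0.0-preview",
        "Microsoft.Azure.Management.CosmosDB.Fluent": "1.10.0",
        "Microsoft.Azure.DocumentDB": "1.21.1",
        "Sendgrid": "9.9.0"
      }
    }
  }
}
```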
This will add the necessary package references that we need to find and access our Cosmos DB collections, and then to send alert emails using SendGrid.
Now click on the run.csx file and paste in the following:
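The full script isn't reproduced inline here (it's linked at the end of the post), but its overall shape is roughly this sketch. The helper methods are placeholders of my own invention standing in for the managed-identity and management-API plumbing, not real library calls:

```csharp
#r "SendGrid"

using System;
using System.Threading.Tasks;
using SendGrid.Helpers.Mail;

// Default alert threshold in RU/s.
private const int DefaultMaximumThroughput = 2000;

public static async Task Run(TimerInfo myTimer, IAsyncCollector<Mail> message, TraceWriter log)
{
    // 1. Use the managed service identity to acquire an ARM token and
    //    enumerate the Cosmos DB accounts the identity can see.
    foreach (var account in await ListAccountsAsync())
    {
        // 2. Retrieve the read-only keys and list each account's collections.
        foreach (var collection in await ListCollectionsAsync(account))
        {
            // 3. Read the collection's offer (provisioned throughput) and
            //    compare it against the configured maximum.
            var throughput = await GetOfferThroughputAsync(account, collection);
            if (throughput > DefaultMaximumThroughput)
            {
                // 4. Over-provisioned: send an alert via the SendGrid binding.
                log.Info($"{collection} is provisioned at {throughput} RU/s");
                await message.AddAsync(CreateAlertMail(account, collection, throughput));
            }
        }
    }
}
```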
I won't go through the entire script here, but I have added comments to try to make its purpose a little clearer.
Finally, click on the function.json file and replace the contents with the following:
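Something along these lines would wire up the timer and the SendGrid output binding; the binding names are assumptions, and the schedule shown is the hourly one described below:

```json
{
  "bindings": [
    {
      "name": "myTimer",
      "type": "timerTrigger",
      "direction": "in",
      "schedule": "0 0 * * * *"
    },
    {
      "name": "message",
      "type": "sendGrid",
      "direction": "out",
      "apiKey": "SendGridKey"
    }
  ],
  "disabled": false
}
```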
This will configure the function app with the necessary timer, as well as an output binding to send an email. We'll discuss most of these settings later, but one important setting to note is the schedule setting. The value I've got above means the function will run every hour. You can change it to other values using cron expressions, such as:
Run every day at 9.30am UTC: 0 30 9 * * *
Run every four hours: 0 0 */4 * * *
Run once a week (midnight Sunday UTC): 0 0 0 * * 0
You can decide how frequently you want this to run and replace the schedule value with the appropriate expression from above.
Step 5: Get a SendGrid Account
We're using SendGrid to send email alerts. SendGrid has built-in integration with Azure Functions so it's a good choice, although you're obviously welcome to switch out for anything else if you'd prefer. You might want an SMS message to be sent via Twilio, or a message to be sent to Slack via the Slack webhook API, for example.
If you don't already have a SendGrid account you can sign up for a free account on their website. Once you've got your account, you'll need to create an API key and have it ready for the next step.
Step 6: Configure Function App Settings
Click on your function app name, click Application settings, and then scroll down to the Application settings section. We'll need to enter three settings here:
SendGridKey. This should have a value of your SendGrid API key from step 5.
AlertToAddress. This should be the email address that you want alerts to be sent to.
AlertFromAddress. This should be the email address that you want alerts to be sent from. This can be the same as the 'to' address if you want.
Your Application settings section should look something like this:
Step 7: Run the Function
Now we can run the function! Click on the function name again (CosmosChecker), and then click the Run button. You can expand out the Logs pane at the bottom of the screen if you want to watch it run:
Depending on how many Cosmos DB accounts and collections you have, it may take a minute or two to complete.
If you've got any collections provisioned over 2000 RU/s, you should receive an email telling you this fact:
Configuring Alert Policies
By default, the function is configured to alert whenever it sees a Cosmos DB collection provisioned over 2000 RU/s. However, your situation may be quite different to mine. For example, you may want to be alerted whenever you have any collections provisioned over 1000 RU/s. Or, you may have production applications that should be provisioned up to 100,000 RU/s, but you only want development and test collections provisioned at 2000 RU/s.
You can configure alert policies in two ways.
First, if you have a specific collection that should have a specific policy applied to it - like the production collection I mentioned that should be allowed to go to 100,000 RU/s - then you can create another application setting. Give it a name in the form MaximumThroughput:{accountName}:{databaseName}:{collectionName}, and set the value to the limit you want for that collection.
For example, a collection named customers in a database named customerdb in an account named myaccount-prod would have a setting named MaximumThroughput:myaccount-prod:customerdb:customers. The value would be 100000, assuming you wanted the function to check this collection against a limit of 100,000 RU/s.
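In code, that lookup might look something like the sketch below - the helper name is mine, and the real script on GitHub may differ in detail:

```csharp
using System;

// Look up a per-collection throughput limit from app settings, falling
// back to the default quota when no override is defined.
private static int GetMaximumThroughput(string accountName, string databaseName, string collectionName)
{
    var settingName = $"MaximumThroughput:{accountName}:{databaseName}:{collectionName}";
    var overrideValue = Environment.GetEnvironmentVariable(settingName);
    return int.TryParse(overrideValue, out var limit)
        ? limit
        : 2000; // default quota in RU/s
}
```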
Second, the function has a default quota of 2000 RU/s. You can adjust this to whatever value you want by altering the value on line 17 of the function code file (run.csx).
If you want to deploy this function for yourself, you can also use an ARM template I have prepared. This performs all the steps listed above except step 3, which you still need to do manually.
Of course, you are also welcome to adjust the actual logic involved in checking the accounts and collections to suit your own needs. The full code is available on GitHub and you are welcome to take and modify it as much as you like! I hope this helps to avoid some nasty bill shocks.