John Downs

Building Human-Focused Software

Filtering by Tag: DevOps

Deploying App Services with 'Run From Package', Azure Storage, and Azure Pipelines

This post was originally published on the Kloud blog.

Azure App Service recently introduced a feature called Run From Package. Rather than uploading our application binaries and other files to an App Service directly, we can instead package them into a zip file and provide App Services with the URL. This is a useful feature because it eliminates issues with file locking during deployments, it allows for atomic updates of application code, and it reduces the time required to boot an application. It also means that the 'release' of an application simply involves the deployment of a configuration setting. And because Azure Functions runs on top of App Services, the same technique can be used for Azure Functions too.

While we can store the packages anywhere on the internet, and then provide a URL to App Services to find them, a common approach is to use Azure Storage blobs to host the packages. Azure Storage is a relatively cheap and easy way to host files and get URLs to them, making it perfect for this type of deployment. Additionally, by permanently storing our packages in an Azure Storage account we can keep a permanent record of all deployments - and we can even use feature of Azure Storage like immutable blobs to ensure that the blobs can't be tampered with or deleted.

However there are a few different considerations that it's important to think through when using storage accounts in conjunction with App Services. In this post I'll describe one way to use this feature with Azure DevOps build and release pipelines, and some of the pros and cons of this approach.

Storage Account Considerations

When provisioning an Azure Storage account to contain your application packages, there are a few things you should consider.

First, I'd strongly recommend using SAS tokens in conjunction with a container access policy to ensure that your packages can't be accessed by anyone who shouldn't access them. Typically application packages that are destined for an App Service aren't files you want to be made available to everyone on the internet.

Second, consider the replication options you have for your storage account. For something that runs an App Service I'd generally recommend using the RA-GRS replication type to ensure that even if Azure Storage has a regional outage App Services can still access the packages. This needs to be considered as part of a wider disaster recovery strategy though, and also remember that in the event that the primary region for your Azure Storage account is unavailable, you need to do some manual work to switch your App Service to read your package from the secondary region.

Third, virtual network integration is not currently possible for storage accounts that are used by App Services, although it should be soon. Today, Azure allows joining an App Service into a virtual network, but service endpoints - the feature that allows blocking access to Azure Storage outside of a virtual network - aren't supported by App Services yet. This feature is in preview, though, so I'm hopeful that we'll be able to lock down the storage accounts used by App Services packages soon.

Fourth, consider who deploys the storage account(s), and how many you need. In most cases, having a single storage account per App Service is going to be too much overhead to maintain. But if you decide to share a single storage account across many App Services throughout your organisation, you'll need to consider how to share the keys to generate SAS tokens.

Fifth, consider the lifetime of the application versus the storage account. In Azure, resource groups provide a natural boundary for resources that share a common lifetime. An application and all of its components would typically be deployed into a single resource group. If we're dealing with a storage account that may be used across multiple applications, it would be best to have the storage account in its own resource group so that decommissioning one application won't result in the storage account being removed, and so that Azure RBAC permissions can be separately assigned.

Using Azure Pipelines

Azure Pipelines is the new term for what used to be called VSTS's build and release management features. Pipelines let us define the steps involved in building, packaging, and deploying our applications.

There are many different ways we can create an App Service package from Azure Pipelines, upload them to Azure Storage, and then deploy them. Each option has its own pros and cons, and the choice will often depending on how pure you want your build and release processes to be. For example, I prefer that my build processes are completely standalone, and don't interact with Azure at all if possible. They emit artifacts and the release process then manages the communication with Azure to deploy them. Others may not be as concerned about this separation, though.

I also insist on 100% automation for all of my build and release tasks, and vastly prefer to script things out in text files and commit them to a version control system like Git, rather than changing options on a web portal. Therefore I typically use a combination of build YAMLs, ARM templates, and PowerShell scripts to build and deploy applications.

It's fairly straightforward to automate the creation, upload, and deployment of App Service packages using these technologies. In this post I'll describe one approach that I use to deploy an App Service package, but towards the end I'll outline some variations you might want to consider.

Example

I'll use the hypothetical example of an ASP.NET web application. I'm not going to deploy any actual real application here - just the placeholder code that you get from Visual Studio when creating a new ASP.NET MVC app - but this could easily be replaced with a real application. Similarly, you could replace this with a Node.js or PHP app, or any other language and framework supported by App Service.

You can access the entire repository on GitHub here, and I've included the relevant pieces below.

Step 1: Deploy Storage Account

The first step is to deploy a storage account to contain the packages. My preference is for a storage account to be created separately to the build and release process. As I noted above, this helps with reusability - that account can be used for any number of applications you want to deploy - and it also ensures that you're not tying the lifetime of your storage account to the lifetime of your application.

I've provided an ARM template below that deploys a storage account with a unique name, and creates a blob container called packages that is used to store all of our packages. The template also outputs the connection details necessary to upload the blobs and generate a SAS token in later steps.

Step 2: Build Definition

Our Azure Pipelines build definition is pretty straightforward. All we do here is build our app and publish a build artifact. In Azure Pipelines, 'publishing' a build artifact means that it's available for releases to use - the build process doesn't actually save the package to the Azure Storage account. I've used a build YAML file to define the process, which is provided here:

Step 3: Release Definition

Our release definition runs through the remaining steps. I've provided a PowerShell script that executes the release process:

First, it uploads the build artifacts to our storage account's packages container. It generates a unique filename for each blob, ensuring that every release is fully independent and won't accidentally overwrite another release's package.

Next, it generates a SAS token for the packages it's uploaded. The token just needs to provide read access to the blob. I've used a 100-year expiry, but you could shorten this if you need to - just make sure you don't make it too short, or App Services won't be able to boot your application once the expiry date passes.

Finally, it deploys the App Service instance using an ARM template, and passes the full URL to the package - including its SAS token - into the ARM deployment parameters. The key part for the purposes of this post is on line 51 of the template, where we create a new app setting called WEBSITE_RUN_FROM_PACKAGE and set it to the full blob URL, including the SAS token part. Here's the ARM template we execute from the script:

Note that if you want to use this PowerShell script in your own release process, you'll want to adjust the variables so that you're using the correct source alias name for the artifact, as well as the correct resource group name.

Pros and Cons

The approach I've outlined here has a few benefits: it allows for a storage account to be shared across multiple applications; it keeps the build process clean and simple, and doesn't require the build to interact with Azure Storage; and it ensures that each release runs independently, uploading its own private copy of a package.

However there are a few problems with it. First, the storage account credentials need to be available to the release definition. This may not be desirable if the account is shared by multiple teams or multiple applications. Second, while having independent copies of each package is useful, it also means there's some wasted space if we deploy a single app multiple times.

If these are concerns to you, there are a number of things you could consider, depending on your concern.

If your concern is that credentials are being shared, then you could consider creating a dedicated storage account as part of the release process. The release process can provision the storage account (if it doesn't already exist), retrieve the keys to it, upload the package, generate a SAS token, and then deploy the App Service with the URL to the package. The storage account's credentials would never leave the release process. Of course, this also makes it harder to share the storage account across multiple applications.

Keeping the storage account with the application also makes the release more complicated, since you can no longer deploy everything in a single ARM template deployment operation. You'd need at least two ARM deployments, with some scripting required in between. The first ARM template deployment would deploy the storage account and container. You'd then execute some PowerShell or another script to upload the package and generate a SAS token. Then you could execute a second ARM template deployment to deploy the App Service and point it to the package URL.

Another alternative is to pre-create SAS tokens for your deployments to use. One SAS token would be used for the upload of the blobs (and would therefore need write permissions assigned), while a second would be used for the App Service to access all blobs within the container (and would only need read permissions assigned).

Yet another alternative is to use the preview Azure RBAC feature of Azure Storage to authenticate the release process to the storage account. This is outside the scope of this post, but this approach could be used to delegate permissions to the storage account without sharing any account keys.

If your concern is that the packages may be duplicated, you have a few options. One is to simply not create unique names during each release, but instead use a naming scheme that results in consistent names for the same build artifacts. For example, you might use the convention .zip. Subsequent releases could check if the package already exists and leave it alone if it does.

If you don't want to use Azure Storage at all, you can also upload a package directly to the App Service's d:\home\data\SitePackages folder. This way you gain some of the benefits of the Run From Package feature - namely the speed and atomicity of deployments - but lose the advantage of having a simpler deployment with immutable blobs. This is documented on the official documentation page. Also, you can of course use any file storage system you like, such as Amazon S3, to host your packages.

Also, bear in mind that currently App Services on Linux don't support the Run from Package feature at all currently.

Automatic Key Rotation for Azure Services

This post was originally published on the Kloud blog.

Securely managing keys for services that we use is an important, and sometimes difficult, part of building and running a cloud-based application. In general I prefer not to handle keys at all, and instead rely on approaches like managed service identities with role-based access control, which allow for applications to authenticate and authorise themselves without any keys being explicitly exchanged. However, there are a number of situations where do we need to use and manage keys, such as when we use services that don't support role-based access control. One best practice that we should adopt when handling keys is to rotate (change) them regularly.

Key rotation is important to cover situations where your keys may have compromised. Common attack vectors include keys having been committed to a public GitHub repository, a log file having a key accidentally written to it, or a disgruntled ex-employee retaining a key that had previously been issued. Changing the keys means that the scope of the damage is limited, and if keys aren't changed regularly then these types of vulnerability can be severe.

In many applications, keys are used in complex ways and require manual intervention to rotate. But in other applications, it's possible to completely automate the rotation of keys. In this post I'll explain one such approach, which rotates keys every time the application and its infrastructure components are redeployed. Assuming the application is deployed regularly, for example using a continuous deployment process, we will end up rotating keys very frequently.

Approach

The key rotation process I describe here relies on the fact that the services we'll be dealing with - Azure Storage, Cosmos DB, and Service Bus - have both a primary and a secondary key. Both keys are valid for any requests, and they can be changed independently of each other. During each release we will pick one of these keys to use, and we'll make sure that we only use that one. We'll deploy our application components, which will include referencing that key and making sure our application uses it. Then we'll rotate the other key.

The flow of the script is as follows:

  1. Decide whether to use the primary key or the secondary key for this deployment. There are several approaches to do this, which I describe below.

  2. Deploy the ARM template. In our example, the ARM template is the main thing that reads the keys. The template copies the keys into an Azure Function application's configuration settings, as well as into a Key Vault. You could, of course, output the keys and have your deployment script put them elsewhere if you want to.

  3. Run the other deployment logic. For our simple application we don't need to do anything more than run the ARM template deployment, but for many deployments you might copy your application files to a server, swap the deployment slots, or perform a variety of other actions that you need to run as part of your release.

  4. Test the application is working. The Azure Function in our example will perform some checks to ensure the keys are working correctly. You might also run other 'smoke tests' after completing your deployment logic.

  5. Record the key we used. We need to keep track of the keys we’ve used in this deployment so that the next deployment can use the other one.

  6. Rotate the other key. Now we can rotate the key that we are not using. The way that we rotate keys is a little different for each service.

  7. Test the application again. Finally, we run one more check to ensure that our application works. This is mostly a last check to ensure that we haven't accidentally referenced any other keys, which would break our application now that they've been rotated.

We don't rotate any keys until after we've already switched the application to using the other set of keys, so we should never end up in a situation where we've referenced the wrong keys from the Azure Functions application. However, if we wanted to have a true zero-downtime deployment then we could use something like deployment slots to allow for warming up our application before we switch it into production.

A Word of Warning

If you're going to apply this principle in this post or the code below to your own applications, it's important to be aware of an important limitation. The particular approach described here only works if your deployments are completely self-contained, with the keys only used inside the deployment process itself. If you provide keys for your components to any other systems or third parties, rotating keys in this manner will likely cause their systems to break.

Importantly, any shared access signatures and tokens you issue will likely be broken by this process too. For example, if you provide third parties with a SAS token to access a storage account or blob, then rotating the account keys will cause the SAS token to be invalidated. There are some ways to avoid this, including generating SAS tokens from your deployment process and sending them out from there, or by using stored access policies; these approaches are beyond the scope of this post.

The next sections provide some detail on the important steps in the list above.

Step 1: Choosing a Key

The first step we need to perform is to decide whether we should use the primary or secondary keys for this deployment. Ideally each deployment would switch between them - so deployment 1 would use the primary keys, deployment 2 the secondary, deployment 3 the primary, deployment 4 the secondary, etc. This requires that we store some state about the deployments somewhere. Don’t forget, though, that the very first time we deploy the application we won’t have this state set. We need to allow for this scenario too.

The option that I’ve chosen to use in the sample is to use a resource group tag. Azure lets us use tags to attach custom metadata to most resource types, as well as to resource groups. I’ve used a custom tag named CurrentKeys to indicate whether the resources in that group currently use the primary or secondary keys.

There are other places you could store this state too - some sort of external configuration system, or within your release management tool. You could even have your deployment scripts look at the keys currently used by the application code, compare them to the keys on the actual target resources, and then infer which key set is being used that way.

A simpler alternative to maintaining state is to randomly choose to use the primary or secondary keys on every deployment. This may sometimes mean that you end up reusing the same keys repeatedly for several deployments in a row, but in many cases this might not be a problem, and may be worth the simplicity of not maintaining state.

Step 2: Deploy the ARM Template

Our ARM template includes the resource definitions for all of the components we want to create - a storage account, a Cosmos DB account, a Service Bus namespace, and an Azure Function app to use for testing. You can see the full ARM template here.

Note that we are deploying the Azure Function application code using the ARM template deployment method.

Additionally, we copy the keys for our services into the Azure Function app's settings, and into a Key Vault, so that we can access them from our application.

Step 4: Testing the Keys

Once we've finished deploying the ARM template and completing any other deployment steps, we should test to make sure that the keys we're trying to use are valid. Many deployments include some sort of smoke test - a quick test of core functionality of the application. In this case, I wrote an Azure Function that will check that it can connect to the Azure resources in question.

Testing Azure Storage Keys

To test connectivity to Azure Storage, we run a query against the storage API to check if a blob container exists. We don't actually care if the container exists or not; we just check to see if we can successfully make the request:

Testing Cosmos DB Keys

To test connectivity to Cosmos DB, we use the Cosmos DB SDK to try to retrieve some metadata about the database account. Once again we're not interested in the results, just in the success of the API call:

Testing Service Bus Keys

And finally, to test connectivity to Service Bus, we try to get a list of queues within the Service Bus namespace. As long as we get something back, we consider the test to have passed:

You can view the full Azure Function here.

Step 6: Rotating the Keys

One of the last steps we perform is to actually rotate the keys for the services. The way in which we request key rotations is different depending on the services we're talking to.

Rotating Azure Storage Keys

Azure Storage provides an API that can be used to regenerate an account key. From PowerShell we can use the New-AzureRmStorageAccountKey cmdlet to access this API:

Rotating Cosmos DB Keys

For Cosmos DB, there is a similar API to regenerate an account key. There are no first-party PowerShell cmdlets for Cosmos DB, so we can instead a generic Azure Resource Manager cmdlet to invoke the API:

Rotating Service Bus Keys

Service Bus provides an API to regenerate the keys for a specified authorization rule. For this example we're using the default RootManageSharedAccessKey authorization rule, which is created automatically when the Service Bus namespace is provisioned. The PowerShell cmdlet New-AzureRmServiceBusKey can be used to access this API:

You can see the full script here.

Conclusion

Key management and rotation is often a painful process, but if your application deployments are completely self-contained then the process described here is one way to ensure that you continuously keep your keys changing and up-to-date.

You can download the full set of scripts and code for this example from GitHub.

Deploying Azure Functions with ARM Templates

This post was originally published on the Kloud blog.

There are many different ways in which an Azure Function can be deployed. In a future blog post I plan to go through the whole list. There is one deployment method that isn't commonly known though, and it's of particular interest to those of us who use ARM templates to deploy our Azure infrastructure. Before I describe it, I'll quickly recap ARM templates.

ARM Templates

Azure Resource Manager (ARM) templates are JSON files that describe the state of a resource group. They typically declare the full set of resources that need to be provisioned or updated. ARM templates are idempotent, so a common pattern is to run the template deployment regularly—often as part of a continuous deployment process—which will ensure that the resource group stays in sync with the description within the template.

In general, the role of ARM templates is typically to deploy the infrastructure required for an application, while the deployment of the actual application logic happens separately. However, Azure Functions' ARM integration has a feature whereby an ARM template can be used to deploy the files required to make the function run.

How to Deploy Functions in an ARM Template

In order to deploy a function through an ARM template, we need to declare a resource of type Microsoft.Web/sites/functions, like this:

There are two important parts to this.

First, the config property is essentially the contents of the function.json file. It includes the list of bindings for the function, and in the example above it also includes the disabled property.

Second, the files property is an object that contains key-value pairs representing each file to deploy. The key represents the filename, and the value represents the full contents of the file. This only really works for text files, so this deployment method is probably not the right choice for precompiled functions and other binary files. Also, the file needs to be inlined within the template, which may quickly get unwieldy for larger function files—and even for smaller files, the file needs to be escaped as a JSON string. This can be done using an online tool like this, or you could use a script to do the escaping and pass the file contents as a parameter into the template deployment.

Importantly, in my testing I found that using this method to deploy over an existing function will remove any files that are not declared in the files list, so be careful when testing this approach if you've modified the function or added any files through the portal or elsewhere.

Examples

There are many different ways you can insert your function file into the template, but one of the ways I tend to use is a PowerShell script. Inside the script, we can read the contents of the file into a string, and create a HashTable for the ARM template deployment parameters:
https://gist.github.com/johndowns/9a4506cabe1e3ad0299b604258add69c

Then we can use the New-AzureRmResourceGroupDeployment cmdlet to execute the deployment, passing in $templateParameters to the -TemplateParameterObject argument.

You can see the full example here.

Of course, if you have a function that doesn't change often then you could instead manually convert the file into a JSON-encoded string using a tool like this one, and paste the function right into the ARM template. To see a full example of how this can be used, check out this example ARM template from a previous blog article I wrote.

When to Use It

Deploying a function through an ARM template can make sense when you have a very simple function that is comprised of one, or just a few, files to be deployed. In particular, if you already deploy the function app itself through the ARM template then this might be a natural extension of what you're doing.

This type of deployment can also make sense if you're wanting to quickly deploy and test a function and don't need some of the more complex deployment-related features like control over handling locked files. It's also a useful technique to have available for situations where a full deployment script might be too heavyweight.

However, for precompiled functions, functions that have binary files, and for complex deployments, it's probably better to use another deployment mechanism. Nevertheless, I think it's useful to know that this is a tool in your Azure Functions toolbox.

Deploying Blob Containers with ARM Templates

This post was originally published on the Kloud blog.

ARM templates are a great way to programmatically deploy your Azure resources. They act as declarative descriptions of the desired state of an Azure resource group, and while they can be frustrating to work with, overall the ability to use templates to deploy your Azure resources provides a lot of value.

One common frustration with ARM templates is that certain resource types simply can't be deployed with them. Until recently, one such resource type was a blob container. ARM templates could deploy Azure Storage accounts, but not blob containers, queues, or tables within them.

That has now changed, and it's possible to deploy a blob container through an ARM template. Here's an example template that deploys a container called logs within a storage account:

Queues and tables still can't be deployed this way, though - hopefully that's coming soon.

VSTS Build Definitions as YAML Part 2: How?

This post was originally published on the Kloud blog.

In the last post, I described why you might want to define your build definition as a YAML file using the new YAML Build Definitions feature in VSTS. In this post, we will walk through an example of a simple VSTS YAML build definition for a .NET Core application.

Application Setup

Our example application will be a blank ASP.NET Core web application with a unit test project. I created these using Visual Studio for Mac's ASP.NET Core Web application template, and added a blank unit test project. The template adds a single unit test with no logic in it. For our purposes here we don't need any actual code beyond this. Here's the project layout in Visual Studio:

1-vsproject1.png

I set up a project on VSTS with a Git repository, and I pushed my code into that repository, in a /src folder. Once pushed, here's how the code looks in VSTS:

2-vstscodesrc1.png

Enable YAML Build Definitions

As of the time of writing (November 2017), YAML build definitions are currently a preview feature in VSTS. This means we have to explicitly turn them on. To do this, click on your profile picture in the top right of the screen, and then click Preview Features.

3-vstspreviewfeatures1.png

Switch the drop-down to For this account [youraccountname], and turn the Build YAML Definitions feature to On.

4-vstspreviewyaml1.png

Design the Build Process

Now we need to decide how we want to build our application. This will form the initial set of build steps we'll run. Because we're building a .NET Core application, we need to do the following steps:

  • We need to restore the NuGet packages (dotnet restore).

  • We need to build our code (dotnet build).

  • We need to run our unit tests (dotnet test).

  • We need to publish our application (dotnet publish).

  • Finally, we need to collect our application files into a build artifact.

As it happens, we can actually collapse this down to a slightly shorter set of steps (for example, the dotnet build command also runs an implicit dotnet restore), but for now we'll keep all four of these steps so we can be very explicit in our build definition. For a real application, we would likely try to simplify and optimise this process.

I created a new build folder at the same level as the src folder, and added a file in there called build.yaml.
VSTS also allows you to create a file in the root of the repository called .vsts-ci.yml. When this file is pushed to VSTS, it will create a build configuration for you automatically. Personally I don't like this convention, partly because I want the file to live in the build folder, and partly because I'm not generally a fan of this kind of 'magic' and would rather do things manually.

Once I'd created a placeholder build.yaml file, here's how things looked in my repository:

Writing the YAML

Now we can actually start writing our build definition!

In future updates to VSTS, we will be able to export an existing build definition as a YAML file. For now, though, we have to write them by hand. Personally I prefer that anyway, as it helps me to understand what's going on and gives me a lot of control.

First, we need to put a steps: line at the top of our YAML file. This will indicate that we're about to define a sequence of build steps. Note that VSTS also lets you define multiple build phases, and steps within those phases. That's a slightly more advanced feature, and I won't go into that here.

Next, we want to add an actual build step to call dotnet restore. VSTS provides a task called .NET Core, which has the internal task name DotNetCoreCLI. This will do exactly what we want. Here's how we define this step:

Let's break this down a bit. Also, make sure to pay attention to indentation - this is very important in YAML.

- task: DotNetCoreCLI@2 indicates that this is a build task, and that it is of type DotNetCoreCLI. This task is fully defined in Microsoft's VSTS Tasks repository. Looking at that JSON file, we can see that the version of the task is 2.1.8 (at the time of writing). We only need to specify the major version within our step definition, and we do that just after the @ symbol.

displayName: Restore NuGet Packages specifies the user-displayable name of the step, which will appear on the build logs.

inputs: specifies the properties that the task takes as inputs. This will vary from task to task, and the task definition will be one source you can use to find the correct names and values for these inputs.

command: restore tells the .NET Core task that it should run the dotnet restore command.

projects: src tells the .NET Core task that it should run the command with the src folder as an additional argument. This means that this task is the equivalent of running dotnet restore src from the command line.

The other .NET Core tasks are similar, so I won't include them here - but you can see the full YAML file below.
Finally, we have a build step to publish the artifacts that we've generated. Here's how we define this step:

This uses the PublishBuildArtifacts task. If we consult the definition for this task on the Microsoft GitHub repository, we can see This task accepts several arguments. The ones we're setting are:

  • pathToPublish is the path on the build agent where the dotnet publish step has saved its output. (As you will see in the full YAML file below, I manually overrode this in the dotnet publish step.)

  • artifactName is the name that is given to the build artifact. As we only have one, I've kept the name fairly generic and just called it deploy. In other projects, you might have multiple artifacts and then give them more meaningful names.

  • artifactType is set to container, which is the internal ID for the Artifact publish location: Visual Studio Team Services/TFS option.

Here is the complete build.yaml file:

Set up a Build Configuration and Run

6-vstsbuildnew1.png

Now we can set up a build configuration in VSTS and tell it to use the YAML file. This is a one-time operation. In VSTS's Build and Release section, go to the Builds tab, and then click New Definition.

You should see YAML as a template type. If you don't see this option, check you enabled the feature as described above.

We'll configure our build configuration with the Hosted VS2017 queue (this just means that our builds will run on a Microsoft-managed build agent, which has Visual Studio 2017 installed). We also have to specify the relative path to the YAML file in the repository, which in our case is build/build.yaml.

8-vstsbuildcomplete1.png

Now we can save and queue a build. Here's the final output from the build:

(Yes, this is build number 4 - I made a couple of silly syntax errors in the YAML file and had to retry a few times!)

As you can see, the tasks all ran successfully and the test passed. Under the Artifacts tab, we can also see that the deploy artifact was created:

10-vstsbuildartifacts1.png

Tips and Other Resources

This is still a preview feature, so there are still some known issues and things to watch out for.

Documentation

There is a limited amount of documentation available for creating and using this feature. The official documentation links so far are:

In particular, the properties for each task are not documented, and you need to consult the task's task.json file to understand how to structure the YAML syntax. Many of the built-in tasks are defined in Microsoft's GitHub repository, and this is a great starting point, but more comprehensive documentation would definitely be helpful.

Examples

There aren't very many example templates available yet. That is why I wrote this article. I also recommend Marcus Felling's recent blog post, where he provides a more complex example of a YAML build definition.

Tooling

As I mentioned above, there is limited tooling available currently. The VSTS team have indicated that they will soon provide the ability to export an existing build definition as YAML, which will help a lot when trying to generate and understand YAML build definitions. My personal preference will still be to craft them by hand, but this would be a useful feature to help bootstrap new templates.

Similarly, there currently doesn't appear to be any validation of the parameters passed to each task. If you misspell a property name, you won't get an error - but the task won't behave as you expect. Hopefully this experience will be improved over time.

Error Messages

The error messages that you get from the build system aren't always very clear. One error message I've seen frequently is Mapping values are not allowed in this context. Whenever I've had this, it's been because I did something wrong with my YAML indentation. Hopefully this saves somebody some time!

Releases

This feature is only available for VSTS build configurations right now. Release configurations will also be getting the YAML treatment, but it seems this won't be for another few months. This will be an exciting development, as in my experience, release definitions can be even more complex than build configurations, and would benefit from all of the peer review, versioning, and branching goodness I described above.

Variable Binding and Evaluation Order

While you can use VSTS variables in many places within the YAML build definitions, there appear to be some properties that can't be bound to a variable. One I've encountered is when trying to link an Azure subscription to a property.

For example, in one of my build configurations, I want to publish a Docker image to an Azure Container Registry. In order to do this I need to pass in the Azure service endpoint details. However, if I specify this as a variable, I get an error - I have to hard-code the service endpoint's identifier into the YAML file. This is not something I want to do, and will become a particular issue when release definitions can be defined in YAML, so I hope this gets fixed soon.

Summary

Build definitions and build scripts are an integral part of your application's source code, and should be treated as such. By storing your build definition as a YAML file and placing it into your repository, you can begin to improve the quality of your build pipeline, and take advantage of source control features like diffing, versioning, branching, and pull request review.

VSTS Build Definitions as YAML Part 1: What and Why?

This post was originally published on the Kloud blog.

Visual Studio Team Services (VSTS) has recently gained the ability to create build definitions as YAML files. This feature is currently in preview. In this post, I'll explain why this is a great addition to the VSTS platform and why you might want to define your builds in this way. In the next post I'll work through an example of using this feature, and I'll also provide some tips and links to documentation and guidance that I found helpful when constructing some build definitions myself.

What Are Build Definitions?

If you use a build server of some kind (and you almost certainly should!), you need to tell the build server how to actually build your software. VSTS has the concept of a build configuration, which specifies how and when to build your application. The 'how' part of this is the build definition. Typically, a build definition will outline how the system should take your source code, apply some operations to it (like compiling the code into binaries), and emit build artifacts. These artifacts usually then get passed through to a release configuration, which will deploy them to your release environment.

In a simple static website, the build definition might simply be one step that copies some files from your source control system into a build artifact. In a .NET Core application, you will generally use the dotnet command-line tool to build, test, and publish your application. Other application frameworks and languages will have their own way of building their artifacts, and in a non-trivial application, the steps involved in building the application might get fairly complex, and may even trigger PowerShell or Bash scripts to allow for further flexibility and advanced control flow and logic.

Until now, VSTS has really only allowed us to create and modify build definitions in its web editor. This is a great way to get started with a new build definition, and you can browse the catalog of available steps, add them into your build definition, and configure them as necessary. You can also define and use variables to allow for reuse of information across steps. However, there are some major drawbacks to defining your build definition in this way.

Why Store Build Definitions in YAML?

Build definitions are really just another type of source code. A build definition is just a specification of how your application should be built. The exact list of steps, and the sequence in which they run, is the way in which we are defining part of our system's behaviour. If we adopt a DevOps mindset, then we want to make sure we treat all of our system as source code, including our build definitions, and we want to take this source code seriously.

In November 2017, Microsoft announced that VSTS now has the ability to run builds that have been defined as a YAML file, instead of through the visual editor. The YAML file gets checked into the source control system, exactly the same way as any other file. This is similar to the way Travis CI allows for build definitions to be specified in YAML files, and is great news for those of us who want to treat our build definitions as code. It gives us many advantages.

Versioning

We can keep build definitions versioned, and can track the history of a build definition over time. This is great for audibility, as well as to ensure that we have the ability to consult or roll back to a previous version if we accidentally make a breaking change.

Keeping Build Definitions with Code

We can store the build definitions alongside the actual code that it builds, meaning that we are keeping everything tidy and in one place. Until now, if we wanted to fully understand the way the application was built, we'd have to look at at the code repository as well as the VSTS build definition. This made the overall process harder to understand and reason about. In my opinion, the fewer places we have to remember to check or update during changes the better.

Branching

Related to the last point, we can also use make use of important features of our source control system like branching. If your team uses GitHub Flow or a similar Git-based branching strategy, then this is particularly advantageous.

Let's take the example of adding a new unit test project to a .NET Core application. Until now, the way you might do this is to set up a feature branch on which you develop the new unit test project. At some point, you'll want to add the execution of these tests to your build definition. This would require some careful timing and consideration. If you update the build definition before your branch is merged, then any builds you run in the meantime will likely fail - they'll be trying to run a unit test project that doesn't exist outside of your feature branch yet. Alternatively, you can use a draft version of your build configuration, but then you need to plan exactly when to publish that draft.

If we specify our build definition in a YAML file, then this change to the build process is simply another change that happens on our feature branch. The feature branch's YAML file will contain a new step to run the unit tests, but the master branch will not yet have that step defined in its YAML file. When the build configuration runs on the master branch before we merge, it will get the old version of the build definition and will not try to run our new unit tests. But any builds on our feature branch, or the master branch once our feature branch is merged, will include the new tests.

This is very powerful, especially in the early stages of a project where you are adding and changing build steps frequently.

Peer Review

Taking the above point even further, we can review a change to a build definition YAML file in the same way that we would review any other code changes. If we use pull requests (PRs) to merge our changes back into the master branch then we can see the change in the build definition right within the PR, and our team members can give them the same rigorous level of review as they do our other code files. Similarly, they can make sure that changes in the application code are reflected in the build definition, and vice versa.

Reuse

Another advantage of storing build definitions in a simple format like YAML is being able to copy and paste the files, or sections from the files, and reuse them in other projects. Until now, copying a step from one build definition to another was very difficult, and required manually setting up the steps. Now that we can define our build steps in YAML files, it's often simply a matter of copying and pasting.

Linting

A common technique in many programming environments is linting, which involves running some sort of analysis over a file to check for potential errors or policy violations. Once our build definition is defined in YAML, we can perform this same type of analysis if we wanted to write a tool to do it. For example, we might write a custom linter to check that we haven't accidentally added any credentials or connection strings into our build definition, or that we haven't mistakenly used the wrong variable syntax. YAML is a standard format and is easily parsable across many different platforms, so writing a simple linter to check for your particular policies is not a complex endeavour.

Abstraction and Declarative Instruction

I'm a big fan of declarative programming - expressing your intent to the computer. In declarative programming, software figures out the right way to proceed based on your high-level instructions. This is the idea behind many different types of 'desired-state' automation, and in abstractions like LINQ in C#. This can be contrasted with an imperative approach, where you specify an explicit sequence of steps to achieve that intent.

One approach that I've seen some teams adopt in their build servers is to use PowerShell or Bash scripts to do the entire build. This provided the benefits I outlined above, since those script files could be checked into source control. However, by dropping down to raw script files, this meant that these teams couldn't take advantage of all of the built-in build steps that VSTS provides, or the ecosystem of custom steps that can be added to the VSTS instance.

Build definitions in VSTS are a great example of a declarative approach. They are essentially an abstraction of a sequence of steps. If you design your build process well, it should be possible for a newcomer to look at your build definition and determine what the sequence of steps are, why they are happening, and what the side-effects and outcomes will be. By using abstractions like VSTS build tasks, rather than hand-writing imperative build logic as a sequence of command-line steps, you are helping to increase the readability of your code - and ultimately, you may increase the performance and quality by allowing the software to translate your instructions into actions.

YAML build definitions give us all of the benefits of keeping our build definitions in source control, but still allows us to make use of the full power of the VSTS build system.

Inline Documentation

YAML files allow for comments to be added, and this is a very helpful way to document your build process. Frequently, build definitions can get quite complex, with multiple steps required to do something that appears rather trivial, so having the ability to document the process right inside the definition itself is very helpful.

Summary

Hopefully through this post, I've convinced you that storing your build definition in a YAML file is a much tidier and saner approach than using the VSTS web UI to define how your application is built. In the next post, I'll walk through an example of how I set up a simple build definition in YAML, and provide some tips that I found useful along the way.