John Downs

Building Human-Focused Software


Integration Testing Timer-Triggered Precompiled v2 Azure Functions

This post was originally published on the Kloud blog.

In a recent post, I described a way to run integration tests against precompiled C# Azure Functions using the v2 runtime. In that post, we looked at an example of invoking an HTTP-triggered function from within an integration test.

Of course, there are plenty of other triggers available for Azure Functions too. Recently I needed to write an integration test against a timer-triggered function and decided to investigate the best way to do this.

The Azure Functions runtime provides a convenient API for invoking a timer-triggered function. You can issue an HTTP POST request against an endpoint for your function, and the Functions runtime will start up and trigger the function. I wasn't able to find any proper documentation about this, so this blog post is the result of some of my experimentation.

To invoke a timer-triggered function named MyFunction, you need to issue an HTTP request as follows:
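As a sketch, the request goes to the runtime's admin endpoint for the function; the localhost address and the empty input payload below are assumptions to adjust for your environment:

```http
POST http://localhost:7071/admin/functions/MyFunction
Content-Type: application/json

{ "input": "" }
```

When calling a deployed function app rather than a local host, the admin endpoints also require authentication - for example, by passing the app's master key in an x-functions-key header.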

Replace the hostname with either your real Azure Functions hostname - such as myfunctionsapp.azurewebsites.net - or, if you're running through the Azure Functions CLI, localhost and the port number you're running it on, such as localhost:7071.

Interestingly, invoking this endpoint immediately returns an HTTP 201 response, but the function runs separately. This makes sense, though, since timer-triggered functions are not intended to return data in the same way that HTTP-triggered functions are.

I've created an updated version of the GitHub repository from the previous post with an example of running a test against a timer-triggered function. In this example, the function simply writes a message to an Azure Storage queue, which we can then inspect to confirm the function has run. Normally the function would only run once a week, at 9.30am on Mondays, but our integration test triggers it on demand each time the test runs so we can verify that it works correctly.
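For reference, the function in question looks something like this - the function name, queue name, and message format here are illustrative assumptions rather than the repository's exact code:

```csharp
using System;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class WeeklyTimerFunction
{
    // NCRONTAB schedule: 9.30am every Monday. The queue output binding writes a
    // message that the integration test can later inspect.
    [FunctionName("WeeklyTimerFunction")]
    public static void Run(
        [TimerTrigger("0 30 9 * * MON")] TimerInfo timer,
        [Queue("timer-output", Connection = "AzureWebJobsStorage")] out string message,
        ILogger log)
    {
        message = $"Timer fired at {DateTime.UtcNow:O}";
        log.LogInformation("Wrote a message to the timer-output queue.");
    }
}
```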

In this version of the test fixture we also wait for the app to start before our test runs. We do this by polling the / endpoint with GET requests until it responds. This ensures that we can access the timer invocation HTTP endpoint successfully from our tests.
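A minimal sketch of that wait logic, assuming an HttpClient-based helper (the retry limits and method name are my own):

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

internal static class FunctionHostWaiter
{
    // Polls the Functions host's root endpoint until it responds, so that tests
    // don't run before the host is ready to accept requests.
    public static async Task WaitUntilHostIsRunningAsync(Uri baseAddress)
    {
        using (var client = new HttpClient { BaseAddress = baseAddress })
        {
            for (var attempt = 0; attempt < 30; attempt++)
            {
                try
                {
                    var response = await client.GetAsync("/");
                    if (response.IsSuccessStatusCode)
                    {
                        return;
                    }
                }
                catch (HttpRequestException)
                {
                    // The host isn't listening yet; fall through and retry.
                }

                await Task.Delay(TimeSpan.FromSeconds(1));
            }

            throw new InvalidOperationException("The Azure Functions host did not start in time.");
        }
    }
}
```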

Of course, just like in the previous post's integration tests, a timer-triggered integration test can run from an Azure Pipelines build too, so you can include timer-triggered functions in your continuous integration and testing practices alongside the rest of your code. In fact, the same build.yaml that we used in the previous post can be used to run these tests, too.

Integration Testing Precompiled v2 Azure Functions

This post was originally published on the Kloud blog.

Azure Functions code can often contain important functionality that needs to be tested. The two most common ways of testing code are unit testing and integration testing. Unit testing runs pieces of code in isolation, and this is relatively simple to do with Azure Functions. Integration testing can be a little trickier though, and I haven't found any good documentation about how to do this with version 2 of the Functions runtime. In this post I'll outline the approach I'm using to run integration tests against my Azure Functions v2 code.

In an application with a lot of business logic, unit testing may get us most of the way to verifying the code's quality. But Azure Functions code often involves pieces of functionality that can't be easily unit tested. For example, triggers and input and output bindings are very powerful features that let us avoid writing boilerplate code to bind to HTTP requests, connect to Azure Storage blobs, queues, and tables or Service Bus queues, or build our own timer logic. Similarly, we may need to connect to external services or databases, or work with libraries that can't be easily mocked or faked. If we want to test these parts of our Functions apps then we need some form of integration testing.

Approaches to Integration Testing

Integration tests involve running code in as close to a real environment as is practicable. The tests are generally run from our development machines or build servers. For example, ASP.NET Core lets us host an in-memory server for our application, which we can then connect to real databases, to in-memory providers for systems like Entity Framework, or to emulators for services like Azure Storage and Cosmos DB.

Azure Functions v1 included some features to support integration testing of script-based Functions apps. But to date, I haven't found any guidance on how to run integration tests using a precompiled .NET Azure Functions app running against the v2 runtime.

Example Functions App

For the purposes of this post I've written a very simple Functions App with two functions that illustrate two common use cases. One function (HelloWorld) receives an HTTP message and returns a response, and the second (HelloQueue) receives an HTTP message and writes a message to a queue. The actual functions themselves are really just simple placeholders based on the Visual Studio starter function template:
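They look something like this - the exact routes, queue name, and message contents are assumptions for illustration, so check the repository for the real code:

```csharp
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Extensions.Logging;

public static class HelloWorld
{
    // Returns a simple greeting, much like the Visual Studio starter template.
    [FunctionName("HelloWorld")]
    public static IActionResult Run(
        [HttpTrigger(AuthorizationLevel.Anonymous, "get", "post")] HttpRequest req,
        ILogger log)
    {
        string name = req.Query["name"];
        return new OkObjectResult($"Hello, {name}");
    }
}

public static class HelloQueue
{
    // Writes a message to a storage queue using a queue output binding.
    // The queue name ('hello-queue') is an assumption for illustration.
    [FunctionName("HelloQueue")]
    [return: Queue("hello-queue", Connection = "AzureWebJobsStorage")]
    public static string Run(
        [HttpTrigger(AuthorizationLevel.Anonymous, "get", "post")] HttpRequest req,
        ILogger log)
    {
        string name = req.Query["name"];
        return $"Hello from the queue, {name}";
    }
}
```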

In a real application you're likely to have a lot more going on than just writing to a queue, but the techniques below can be adapted to cover a range of different scenarios.

You can access all the code for this blog post on GitHub.

Implementing Integration Tests

The Azure Functions v2 core tools support local development and testing of Azure Functions apps. One of the components in this set of tools is func.dll, which lets us host the Azure Functions runtime. By automating this from our integration test project we can start our Functions app in a realistic host environment, run our tests, and tear it down again. This is ideal for running a suite of integration tests.

While you could use any test framework you like, the sample implementation I've provided uses xUnit.

Test Collection and Fixture

The xUnit framework provides a feature called collection fixtures. These fixtures let us group tests together; they also let us run initialisation code before the first test runs, and run teardown code after the last test finishes. Here's a placeholder test collection and fixture that will support our integration tests:
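A sketch of that placeholder, using hypothetical names for the collection and fixture classes:

```csharp
using System;
using Xunit;

// Groups the integration tests into a single collection so they share one fixture
// instance, and therefore one running Functions host.
[CollectionDefinition("Integration tests")]
public class IntegrationTestCollection : ICollectionFixture<FunctionAppFixture>
{
    // This class is never instantiated; it only associates the collection with the fixture.
}

public class FunctionAppFixture : IDisposable
{
    public FunctionAppFixture()
    {
        // Initialisation: start the Azure Functions host (filled in below).
    }

    public void Dispose()
    {
        // Teardown: stop the Azure Functions host (filled in below).
    }
}
```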

Starting the Functions Host

Now that we have the fixture class definition, we can use it to start and stop the Azure Functions host. We will make use of the System.Diagnostics.Process class to start and stop the .NET Core CLI (dotnet.exe), which in turn starts the Azure Functions host through the func.dll library.

Note: I assume you already have the Azure Functions v2 core tools installed. You may already have them if you've got Visual Studio installed with the Azure Functions feature enabled. If not, you can install them into your global NPM packages by using the command npm install -g azure-functions-core-tools, as per the documentation here.

Our fixture code looks like this:
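Roughly like this, assuming the hypothetical Settings properties introduced in the next section and a fixed local port:

```csharp
using System;
using System.Diagnostics;

public class FunctionAppFixture : IDisposable
{
    private readonly Process _funcHostProcess;

    public FunctionAppFixture()
    {
        var settings = ConfigurationHelper.Settings;

        // Start the .NET Core CLI, which in turn hosts the Functions runtime via func.dll.
        // The 'host start' arguments and the port are assumptions for illustration.
        _funcHostProcess = new Process
        {
            StartInfo = new ProcessStartInfo
            {
                FileName = settings.DotNetExecutablePath,
                Arguments = $"\"{settings.FunctionHostPath}\" host start --port {Port}",
                WorkingDirectory = settings.FunctionApplicationPath
            }
        };
        _funcHostProcess.Start();
    }

    // The port the Functions host listens on; tests build their request URLs from this.
    public int Port { get; } = 7071;

    public void Dispose()
    {
        if (!_funcHostProcess.HasExited)
        {
            _funcHostProcess.Kill();
        }

        _funcHostProcess.Dispose();
    }
}
```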

The code is fairly self-explanatory: during initialisation it reads various paths from configuration settings and starts the process; during teardown it kills the process and disposes of the Process object.

I haven't added any Task.Delay code, or anything to poll the function app to check if it's ready to receive requests. I haven't found this to be necessary. However, if you find that the first test run in a batch fails, this might be something you want to consider adding at the end of the fixture initialisation.

Configuring

Some of the code in the above fixture file won't compile yet because we have some configuration settings that need to be passed through. Specifically, we need to know the paths to the dotnet.exe file (the .NET Core CLI), to the func.dll file (the Azure Functions host), and to our Azure Functions app binaries.

I've created a class called ConfigurationHelper that initialises a static property that will help us:
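A sketch of such a helper, assuming the Microsoft.Extensions.Configuration and Microsoft.Extensions.Configuration.Binder packages and an appsettings.json file copied to the test output directory:

```csharp
using System.IO;
using Microsoft.Extensions.Configuration;

public static class ConfigurationHelper
{
    // Loaded once and shared by the fixture; environment variables can override
    // values from appsettings.json, which is useful on build agents.
    public static Settings Settings { get; }

    static ConfigurationHelper()
    {
        var configuration = new ConfigurationBuilder()
            .SetBasePath(Directory.GetCurrentDirectory())
            .AddJsonFile("appsettings.json", optional: false)
            .AddEnvironmentVariables()
            .Build();

        Settings = configuration.Get<Settings>();
    }
}
```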

The Settings class is then defined with the configuration settings we need:
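Something like the following; the property names here are assumptions, apart from FunctionHostPath and StorageConnectionString, which the build definition references later:

```csharp
public class Settings
{
    // Path to dotnet.exe (the .NET Core CLI).
    public string DotNetExecutablePath { get; set; }

    // Path to func.dll (the Azure Functions host), typically inside the global NPM packages folder.
    public string FunctionHostPath { get; set; }

    // Path to the compiled Azure Functions app under test.
    public string FunctionApplicationPath { get; set; }

    // Connection string the tests use to inspect the storage queue.
    public string StorageConnectionString { get; set; }
}
```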

Then we can create an appsettings.json file to set the settings:
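For example - every value here is a placeholder to adjust for your own machine:

```json
{
  "DotNetExecutablePath": "C:\\Program Files\\dotnet\\dotnet.exe",
  "FunctionHostPath": "C:\\Users\\<username>\\AppData\\Roaming\\npm\\node_modules\\azure-functions-core-tools\\bin\\func.dll",
  "FunctionApplicationPath": "..\\..\\..\\..\\MyFunctionApp\\bin\\Debug\\netstandard2.0",
  "StorageConnectionString": "UseDevelopmentStorage=true"
}
```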

That's all we need to run the tests. Now we can write a couple of actual test classes.

Writing Some Tests

You can write your integration tests in any framework you want, whether that be something very simple like pure xUnit tests or a more BDD-oriented framework like SpecFlow.

Personally I like BDDfy as a middle ground - it's simple and doesn't require a lot of extra plumbing like SpecFlow, while letting us write BDD-style tests in pure code.

Here are a couple of example integration tests I've written for the sample app:

Test the HelloWorld Function

This test simply calls the HTTP-triggered HelloWorld function and checks the output is as expected.
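A sketch of the test using BDDfy's fluent API; the URL, port property, and assertion details are assumptions:

```csharp
using System.Net.Http;
using System.Threading.Tasks;
using TestStack.BDDfy;
using Xunit;

[Collection("Integration tests")]
public class HelloWorldTests
{
    private readonly FunctionAppFixture _fixture;
    private HttpResponseMessage _response;
    private string _responseBody;

    public HelloWorldTests(FunctionAppFixture fixture) => _fixture = fixture;

    [Fact]
    public void CanInvokeHelloWorldFunction()
    {
        this.When(x => x.TheHelloWorldFunctionIsInvoked())
            .Then(x => x.TheResponseIsSuccessful())
            .And(x => x.TheResponseContainsTheGreeting())
            .BDDfy();
    }

    private async Task TheHelloWorldFunctionIsInvoked()
    {
        using (var client = new HttpClient())
        {
            _response = await client.GetAsync($"http://localhost:{_fixture.Port}/api/HelloWorld?name=World");
            _responseBody = await _response.Content.ReadAsStringAsync();
        }
    }

    private void TheResponseIsSuccessful() => Assert.True(_response.IsSuccessStatusCode);

    private void TheResponseContainsTheGreeting() => Assert.Contains("Hello", _responseBody);
}
```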

Test the HelloQueue Function

The second test checks that the HelloQueue function posts to a queue correctly. It does this by clearing the queue before it runs, letting the HelloQueue function run, and then confirming that a single message - with the expected contents - has been enqueued.
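A sketch of that flow, assuming the WindowsAzure.Storage SDK for queue access and the hypothetical queue name used earlier:

```csharp
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;
using TestStack.BDDfy;
using Xunit;

[Collection("Integration tests")]
public class HelloQueueTests
{
    private readonly FunctionAppFixture _fixture;
    private CloudQueue _queue;
    private CloudQueueMessage _message;

    public HelloQueueTests(FunctionAppFixture fixture) => _fixture = fixture;

    [Fact]
    public void HelloQueueFunctionEnqueuesMessage()
    {
        this.Given(x => x.TheQueueIsEmpty())
            .When(x => x.TheHelloQueueFunctionIsInvoked())
            .Then(x => x.ASingleMessageIsEnqueued())
            .BDDfy();
    }

    private async Task TheQueueIsEmpty()
    {
        var account = CloudStorageAccount.Parse(ConfigurationHelper.Settings.StorageConnectionString);
        _queue = account.CreateCloudQueueClient().GetQueueReference("hello-queue");
        await _queue.CreateIfNotExistsAsync();
        await _queue.ClearAsync();
    }

    private async Task TheHelloQueueFunctionIsInvoked()
    {
        using (var client = new HttpClient())
        {
            var response = await client.GetAsync($"http://localhost:{_fixture.Port}/api/HelloQueue?name=World");
            response.EnsureSuccessStatusCode();
        }
    }

    private async Task ASingleMessageIsEnqueued()
    {
        // Give the queue output binding a moment to complete before checking.
        await Task.Delay(2000);

        _message = await _queue.GetMessageAsync();
        Assert.NotNull(_message);
        Assert.Contains("World", _message.AsString);

        // No further messages should be on the queue.
        Assert.Null(await _queue.GetMessageAsync());
    }
}
```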

Running the Tests

Now we can compile the integration test project and run it from Visual Studio's Test Explorer. Behind the scenes it runs the .NET Core CLI, starts the Azure Functions host, executes our tests, and then kills the host when they're finished. And we can see the tests pass!

Running from Azure Pipelines

Getting the tests running from our local development environment is great, but integration tests are most useful when they run automatically as part of our continuous integration process. (Sometimes integration tests take so long to run that they get relegated to run on nightly builds instead, but that's a topic that's outside the scope of this post.)

Most build servers should be able to run our integration tests without any major problems. I'll use the example here of Azure Pipelines, which is part of the Azure DevOps service (and was previously part of VSTS's build features). Azure Pipelines lets us define our build process as a YAML file, which is also a very convenient way to document and share it!

Here's a build.yaml for building our Azure Functions app and running the integration tests:
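The full file is in the repository; here's a condensed sketch of its shape (the pool image, paths, and project names are assumptions, and the line numbers referenced in the list below relate to the complete file):

```yaml
pool:
  vmImage: vs2017-win2016

variables:
# Hosted agents keep global NPM packages under a different prefix than a typical
# developer PC, so the FunctionHostPath setting is overridden here (path is an assumption).
- name: FunctionHostPath
  value: C:\npm\prefix\node_modules\azure-functions-core-tools\bin\func.dll
- group: IntegrationTestConnectionStrings

steps:
- script: npm install -g azure-functions-core-tools
  displayName: Install Azure Functions Core Tools

- task: DotNetCoreCLI@2
  displayName: Build Solution
  inputs:
    command: build
    projects: src

- task: DotNetCoreCLI@2
  displayName: Run Integration Tests
  inputs:
    command: test
    projects: src/MyFunctionApp.IntegrationTests
```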

The key parts here are:

  • Lines 3 and 4 override the FunctionHostPath app setting with the location that Azure Pipelines hosted agents use for NPM packages, which is different to the location on most developers' PCs.

  • Line 6 links the build.yaml with the variable group IntegrationTestConnectionStrings. Variable groups are outside the scope of this post, but briefly, they let us create a predefined set of variables that are available as environment variables. Inside the IntegrationTestConnectionStrings variable group, I have set two variables - AzureWebJobsStorage and StorageConnectionString - to a connection string for an Azure Storage account that I want to use when I run from the hosted agent.

  • Lines 16 through 21 install the Azure Functions Core Tools, which gives us the func.dll host that we use. For Azure Pipelines hosted agents we need to run this step every time we run the build since NPM packages are reset after each build completes.

  • Lines 23 through 28 use the dotnet test command, which is part of the .NET Core tooling, to execute our integration tests. This automatically publishes the results to Azure DevOps too.

When we run the build, we can see the tests have run successfully:

[Screenshot: test results in Azure Pipelines]

And not only that, but we can see the console output in the build log, which can be helpful when diagnosing any issues we might see:

I've found this approach to be really useful when developing complex Azure Functions apps, where integration testing is a part of the quality control necessary for functions that run non-trivial workloads.

Remember you can view the complete code for this post on GitHub.

Update: A second post, adapting this method to test timer-triggered functions, is now available too.

Deploying App Services with 'Run From Package', Azure Storage, and Azure Pipelines

This post was originally published on the Kloud blog.

Azure App Service recently introduced a feature called Run From Package. Rather than uploading our application binaries and other files to an App Service directly, we can instead package them into a zip file and provide App Services with the URL. This is a useful feature because it eliminates issues with file locking during deployments, it allows for atomic updates of application code, and it reduces the time required to boot an application. It also means that the 'release' of an application simply involves the deployment of a configuration setting. And because Azure Functions runs on top of App Services, the same technique can be used for Azure Functions too.

While we can store the packages anywhere on the internet, and then provide a URL to App Services to find them, a common approach is to use Azure Storage blobs to host the packages. Azure Storage is a relatively cheap and easy way to host files and get URLs to them, making it perfect for this type of deployment. Additionally, by permanently storing our packages in an Azure Storage account we can keep a permanent record of all deployments - and we can even use features of Azure Storage like immutable blobs to ensure that the blobs can't be tampered with or deleted.

However, there are a few considerations that are important to think through when using storage accounts in conjunction with App Services. In this post I'll describe one way to use this feature with Azure DevOps build and release pipelines, and some of the pros and cons of this approach.

Storage Account Considerations

When provisioning an Azure Storage account to contain your application packages, there are a few things you should consider.

First, I'd strongly recommend using SAS tokens in conjunction with a container access policy to ensure that your packages can't be accessed by anyone who shouldn't access them. Typically application packages that are destined for an App Service aren't files you want to be made available to everyone on the internet.

Second, consider the replication options you have for your storage account. For a storage account that backs an App Service, I'd generally recommend using the RA-GRS replication type to ensure that even if Azure Storage has a regional outage, App Services can still access the packages. This needs to be considered as part of a wider disaster recovery strategy though, and remember that if the primary region for your Azure Storage account is unavailable, you'll need to do some manual work to switch your App Service to read its package from the secondary region.

Third, virtual network integration is not currently possible for storage accounts that are used by App Services, although it should be possible soon. Today, Azure allows joining an App Service into a virtual network, but service endpoints - the feature that allows blocking access to Azure Storage from outside a virtual network - aren't supported by App Services yet. This feature is in preview, though, so I'm hopeful that we'll be able to lock down the storage accounts used by App Services packages soon.

Fourth, consider who deploys the storage account(s), and how many you need. In most cases, having a single storage account per App Service is going to be too much overhead to maintain. But if you decide to share a single storage account across many App Services throughout your organisation, you'll need to consider how to share the keys to generate SAS tokens.

Fifth, consider the lifetime of the application versus the storage account. In Azure, resource groups provide a natural boundary for resources that share a common lifetime. An application and all of its components would typically be deployed into a single resource group. If we're dealing with a storage account that may be used across multiple applications, it would be best to have the storage account in its own resource group so that decommissioning one application won't result in the storage account being removed, and so that Azure RBAC permissions can be separately assigned.

Using Azure Pipelines

Azure Pipelines is the new term for what used to be called VSTS's build and release management features. Pipelines let us define the steps involved in building, packaging, and deploying our applications.

There are many different ways we can create an App Service package from Azure Pipelines, upload it to Azure Storage, and then deploy it. Each option has its own pros and cons, and the choice will often depend on how pure you want your build and release processes to be. For example, I prefer that my build processes are completely standalone and don't interact with Azure at all if possible. They emit artifacts, and the release process then manages the communication with Azure to deploy them. Others may not be as concerned about this separation, though.

I also insist on 100% automation for all of my build and release tasks, and vastly prefer to script things out in text files and commit them to a version control system like Git, rather than changing options on a web portal. Therefore I typically use a combination of build YAMLs, ARM templates, and PowerShell scripts to build and deploy applications.

It's fairly straightforward to automate the creation, upload, and deployment of App Service packages using these technologies. In this post I'll describe one approach that I use to deploy an App Service package, but towards the end I'll outline some variations you might want to consider.

Example

I'll use the hypothetical example of an ASP.NET web application. I'm not going to deploy any actual real application here - just the placeholder code that you get from Visual Studio when creating a new ASP.NET MVC app - but this could easily be replaced with a real application. Similarly, you could replace this with a Node.js or PHP app, or any other language and framework supported by App Service.

You can access the entire repository on GitHub here, and I've included the relevant pieces below.

Step 1: Deploy Storage Account

The first step is to deploy a storage account to contain the packages. My preference is for a storage account to be created separately to the build and release process. As I noted above, this helps with reusability - that account can be used for any number of applications you want to deploy - and it also ensures that you're not tying the lifetime of your storage account to the lifetime of your application.

I've provided an ARM template below that deploys a storage account with a unique name, and creates a blob container called packages that is used to store all of our packages. The template also outputs the connection details necessary to upload the blobs and generate a SAS token in later steps.
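A condensed sketch of that template - resource API versions, the naming scheme, and the SKU are assumptions, and the outputs are trimmed to the essentials:

```json
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "variables": {
    "storageAccountName": "[concat('pkg', uniqueString(resourceGroup().id))]"
  },
  "resources": [
    {
      "type": "Microsoft.Storage/storageAccounts",
      "apiVersion": "2018-07-01",
      "name": "[variables('storageAccountName')]",
      "location": "[resourceGroup().location]",
      "kind": "StorageV2",
      "sku": {
        "name": "Standard_RAGRS"
      },
      "resources": [
        {
          "type": "blobServices/containers",
          "apiVersion": "2018-07-01",
          "name": "default/packages",
          "dependsOn": [
            "[variables('storageAccountName')]"
          ]
        }
      ]
    }
  ],
  "outputs": {
    "storageAccountName": {
      "type": "string",
      "value": "[variables('storageAccountName')]"
    },
    "storageAccountKey": {
      "type": "string",
      "value": "[listKeys(resourceId('Microsoft.Storage/storageAccounts', variables('storageAccountName')), '2018-07-01').keys[0].value]"
    }
  }
}
```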

Step 2: Build Definition

Our Azure Pipelines build definition is pretty straightforward. All we do here is build our app and publish a build artifact. In Azure Pipelines, 'publishing' a build artifact means that it's available for releases to use - the build process doesn't actually save the package to the Azure Storage account. I've used a build YAML file to define the process, which is provided here:

Step 3: Release Definition

Our release definition runs through the remaining steps. I've provided a PowerShell script that executes the release process:
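A sketch of the script's overall shape, using the AzureRM and Azure.Storage PowerShell modules; the variable names, artifact alias, and template filename are assumptions:

```powershell
# Values like these would normally come from release variables rather than being hard-coded.
$ResourceGroupName = 'MyApp-Production'
$StorageAccountName = 'mypackagestorage'
$StorageAccountKey = $env:StorageAccountKey
$ContainerName = 'packages'
$PackagePath = "$env:SYSTEM_ARTIFACTSDIRECTORY\BuildArtifact\deploy\package.zip"

# 1. Upload the package with a unique blob name, so each release gets its own copy.
$blobName = "$([Guid]::NewGuid()).zip"
$context = New-AzureStorageContext -StorageAccountName $StorageAccountName -StorageAccountKey $StorageAccountKey
Set-AzureStorageBlobContent -File $PackagePath -Container $ContainerName -Blob $blobName -Context $context | Out-Null

# 2. Generate a read-only SAS token with a long expiry.
$sasToken = New-AzureStorageBlobSASToken -Container $ContainerName -Blob $blobName -Permission r `
    -ExpiryTime (Get-Date).AddYears(100) -Context $context
$packageUrl = "https://$StorageAccountName.blob.core.windows.net/$ContainerName/$blobName$sasToken"

# 3. Deploy the App Service via an ARM template, passing in the package URL (including the SAS token).
New-AzureRmResourceGroupDeployment -ResourceGroupName $ResourceGroupName `
    -TemplateFile "$PSScriptRoot\template.json" `
    -packageUrl $packageUrl
```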

First, it uploads the build artifacts to our storage account's packages container. It generates a unique filename for each blob, ensuring that every release is fully independent and won't accidentally overwrite another release's package.

Next, it generates a SAS token for the packages it's uploaded. The token just needs to provide read access to the blob. I've used a 100-year expiry, but you could shorten this if you need to - just make sure you don't make it too short, or App Services won't be able to boot your application once the expiry date passes.

Finally, it deploys the App Service instance using an ARM template, and passes the full URL to the package - including its SAS token - into the ARM deployment parameters. The key part for the purposes of this post is on line 51 of the template, where we create a new app setting called WEBSITE_RUN_FROM_PACKAGE and set it to the full blob URL, including the SAS token part. Here's the ARM template we execute from the script:
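A condensed sketch of that template; in the full version in the repository the app setting appears around line 51, but the key piece is the WEBSITE_RUN_FROM_PACKAGE setting bound to the packageUrl parameter (names, SKU, and API versions here are assumptions):

```json
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "packageUrl": {
      "type": "string"
    }
  },
  "variables": {
    "appName": "[concat('app-', uniqueString(resourceGroup().id))]",
    "planName": "[concat('plan-', uniqueString(resourceGroup().id))]"
  },
  "resources": [
    {
      "type": "Microsoft.Web/serverfarms",
      "apiVersion": "2016-09-01",
      "name": "[variables('planName')]",
      "location": "[resourceGroup().location]",
      "sku": {
        "name": "S1"
      },
      "properties": {
        "name": "[variables('planName')]"
      }
    },
    {
      "type": "Microsoft.Web/sites",
      "apiVersion": "2016-08-01",
      "name": "[variables('appName')]",
      "location": "[resourceGroup().location]",
      "dependsOn": [
        "[resourceId('Microsoft.Web/serverfarms', variables('planName'))]"
      ],
      "properties": {
        "serverFarmId": "[resourceId('Microsoft.Web/serverfarms', variables('planName'))]",
        "siteConfig": {
          "appSettings": [
            {
              "name": "WEBSITE_RUN_FROM_PACKAGE",
              "value": "[parameters('packageUrl')]"
            }
          ]
        }
      }
    }
  ]
}
```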

Note that if you want to use this PowerShell script in your own release process, you'll want to adjust the variables so that you're using the correct source alias name for the artifact, as well as the correct resource group name.

Pros and Cons

The approach I've outlined here has a few benefits: it allows for a storage account to be shared across multiple applications; it keeps the build process clean and simple, and doesn't require the build to interact with Azure Storage; and it ensures that each release runs independently, uploading its own private copy of a package.

However there are a few problems with it. First, the storage account credentials need to be available to the release definition. This may not be desirable if the account is shared by multiple teams or multiple applications. Second, while having independent copies of each package is useful, it also means there's some wasted space if we deploy a single app multiple times.

If these are concerns to you, there are a number of things you could consider, depending on your concern.

If your concern is that credentials are being shared, then you could consider creating a dedicated storage account as part of the release process. The release process can provision the storage account (if it doesn't already exist), retrieve the keys to it, upload the package, generate a SAS token, and then deploy the App Service with the URL to the package. The storage account's credentials would never leave the release process. Of course, this also makes it harder to share the storage account across multiple applications.

Keeping the storage account with the application also makes the release more complicated, since you can no longer deploy everything in a single ARM template deployment operation. You'd need at least two ARM deployments, with some scripting required in between. The first ARM template deployment would deploy the storage account and container. You'd then execute some PowerShell or another script to upload the package and generate a SAS token. Then you could execute a second ARM template deployment to deploy the App Service and point it to the package URL.

Another alternative is to pre-create SAS tokens for your deployments to use. One SAS token would be used for the upload of the blobs (and would therefore need write permissions assigned), while a second would be used for the App Service to access all blobs within the container (and would only need read permissions assigned).

Yet another alternative is to use the preview Azure RBAC feature of Azure Storage to authenticate the release process to the storage account. This is outside the scope of this post, but this approach could be used to delegate permissions to the storage account without sharing any account keys.

If your concern is that the packages may be duplicated, you have a few options. One is to simply not create unique names during each release, but instead use a naming scheme that results in consistent names for the same build artifacts - for example, basing each package's filename on the build number. Subsequent releases could check if the package already exists and leave it alone if it does.

If you don't want to use Azure Storage at all, you can also upload a package directly to the App Service's d:\home\data\SitePackages folder. This way you gain some of the benefits of the Run From Package feature - namely the speed and atomicity of deployments - but lose the advantage of having a simpler deployment with immutable blobs. This is documented on the official documentation page. Also, you can of course use any file storage system you like, such as Amazon S3, to host your packages.

Also, bear in mind that App Services on Linux don't currently support the Run From Package feature at all.

VSTS Build Definitions as YAML Part 2: How?

This post was originally published on the Kloud blog.

In the last post, I described why you might want to define your build definition as a YAML file using the new YAML Build Definitions feature in VSTS. In this post, we will walk through an example of a simple VSTS YAML build definition for a .NET Core application.

Application Setup

Our example application will be a blank ASP.NET Core web application with a unit test project. I created these using Visual Studio for Mac's ASP.NET Core Web application template, and added a blank unit test project. The template adds a single unit test with no logic in it. For our purposes here we don't need any actual code beyond this. Here's the project layout in Visual Studio:

[Screenshot: the project layout in Visual Studio]

I set up a project on VSTS with a Git repository, and I pushed my code into that repository, in a /src folder. Once pushed, here's how the code looks in VSTS:

[Screenshot: the code in the VSTS repository]

Enable YAML Build Definitions

As of the time of writing (November 2017), YAML build definitions are currently a preview feature in VSTS. This means we have to explicitly turn them on. To do this, click on your profile picture in the top right of the screen, and then click Preview Features.

[Screenshot: the Preview Features menu in VSTS]

Switch the drop-down to For this account [youraccountname], and turn the Build YAML Definitions feature to On.

[Screenshot: enabling the Build YAML Definitions preview feature]

Design the Build Process

Now we need to decide how we want to build our application. This will form the initial set of build steps we'll run. Because we're building a .NET Core application, we need to do the following steps:

  • We need to restore the NuGet packages (dotnet restore).

  • We need to build our code (dotnet build).

  • We need to run our unit tests (dotnet test).

  • We need to publish our application (dotnet publish).

  • Finally, we need to collect our application files into a build artifact.

As it happens, we can actually collapse this down to a slightly shorter set of steps (for example, the dotnet build command also runs an implicit dotnet restore), but for now we'll keep all of these steps so we can be very explicit in our build definition. For a real application, we would likely try to simplify and optimise this process.

I created a new build folder at the same level as the src folder, and added a file in there called build.yaml.

VSTS also allows you to create a file in the root of the repository called .vsts-ci.yml. When this file is pushed to VSTS, it will create a build configuration for you automatically. Personally I don't like this convention, partly because I want the file to live in the build folder, and partly because I'm not generally a fan of this kind of 'magic' and would rather do things manually.

Once I'd created a placeholder build.yaml file, here's how things looked in my repository:

Writing the YAML

Now we can actually start writing our build definition!

In future updates to VSTS, we will be able to export an existing build definition as a YAML file. For now, though, we have to write them by hand. Personally I prefer that anyway, as it helps me to understand what's going on and gives me a lot of control.

First, we need to put a steps: line at the top of our YAML file. This will indicate that we're about to define a sequence of build steps. Note that VSTS also lets you define multiple build phases, and steps within those phases. That's a slightly more advanced feature, and I won't go into that here.

Next, we want to add an actual build step to call dotnet restore. VSTS provides a task called .NET Core, which has the internal task name DotNetCoreCLI. This will do exactly what we want. Here's how we define this step:
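Putting that together, the step looks like this (each of these values is explained below):

```yaml
- task: DotNetCoreCLI@2
  displayName: Restore NuGet Packages
  inputs:
    command: restore
    projects: src
```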

Let's break this down a bit. Also, make sure to pay attention to indentation - this is very important in YAML.

- task: DotNetCoreCLI@2 indicates that this is a build task, and that it is of type DotNetCoreCLI. This task is fully defined in Microsoft's VSTS Tasks repository. Looking at that JSON file, we can see that the version of the task is 2.1.8 (at the time of writing). We only need to specify the major version within our step definition, and we do that just after the @ symbol.

displayName: Restore NuGet Packages specifies the user-displayable name of the step, which will appear on the build logs.

inputs: specifies the properties that the task takes as inputs. This will vary from task to task, and the task definition will be one source you can use to find the correct names and values for these inputs.

command: restore tells the .NET Core task that it should run the dotnet restore command.

projects: src tells the .NET Core task that it should run the command with the src folder as an additional argument. This means that this task is the equivalent of running dotnet restore src from the command line.

The other .NET Core tasks are similar, so I won't include them here - but you can see the full YAML file below.

Finally, we have a build step to publish the artifacts that we've generated. Here's how we define this step:
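A sketch of the step - the pathToPublish value assumes the dotnet publish step was pointed at the artifact staging directory:

```yaml
- task: PublishBuildArtifacts@1
  displayName: Publish Build Artifact
  inputs:
    pathToPublish: $(Build.ArtifactStagingDirectory)
    artifactName: deploy
    artifactType: container
```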

This uses the PublishBuildArtifacts task. If we consult the definition for this task on the Microsoft GitHub repository, we can see that this task accepts several arguments. The ones we're setting are:

  • pathToPublish is the path on the build agent where the dotnet publish step has saved its output. (As you will see in the full YAML file below, I manually overrode this in the dotnet publish step.)

  • artifactName is the name that is given to the build artifact. As we only have one, I've kept the name fairly generic and just called it deploy. In other projects, you might have multiple artifacts and then give them more meaningful names.

  • artifactType is set to container, which is the internal ID for the Artifact publish location: Visual Studio Team Services/TFS option.

Here is the complete build.yaml file:
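A sketch of the whole file, with the four .NET Core steps and the artifact publish step; the src paths and the publish output location are assumptions based on the project layout described above:

```yaml
steps:
- task: DotNetCoreCLI@2
  displayName: Restore NuGet Packages
  inputs:
    command: restore
    projects: src

- task: DotNetCoreCLI@2
  displayName: Build
  inputs:
    command: build
    projects: src

- task: DotNetCoreCLI@2
  displayName: Run Unit Tests
  inputs:
    command: test
    projects: src

- task: DotNetCoreCLI@2
  displayName: Publish
  inputs:
    command: publish
    publishWebProjects: true
    arguments: --output $(Build.ArtifactStagingDirectory)

- task: PublishBuildArtifacts@1
  displayName: Publish Build Artifact
  inputs:
    pathToPublish: $(Build.ArtifactStagingDirectory)
    artifactName: deploy
    artifactType: container
```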

Set up a Build Configuration and Run

[Screenshot: creating a new build definition in VSTS]

Now we can set up a build configuration in VSTS and tell it to use the YAML file. This is a one-time operation. In VSTS's Build and Release section, go to the Builds tab, and then click New Definition.

You should see YAML as a template type. If you don't see this option, check you enabled the feature as described above.

We'll configure our build configuration with the Hosted VS2017 queue (this just means that our builds will run on a Microsoft-managed build agent, which has Visual Studio 2017 installed). We also have to specify the relative path to the YAML file in the repository, which in our case is build/build.yaml.

[Screenshot: the completed build configuration]

Now we can save and queue a build. Here's the final output from the build:

(Yes, this is build number 4 - I made a couple of silly syntax errors in the YAML file and had to retry a few times!)

As you can see, the tasks all ran successfully and the test passed. Under the Artifacts tab, we can also see that the deploy artifact was created:

[Screenshot: the deploy artifact under the Artifacts tab]

Tips and Other Resources

This is still a preview feature, so there are still some known issues and things to watch out for.

Documentation

There is a limited amount of documentation available for creating and using this feature, and only a handful of official documentation links have been published so far.

In particular, the properties for each task are not documented, and you need to consult the task's task.json file to understand how to structure the YAML syntax. Many of the built-in tasks are defined in Microsoft's GitHub repository, and this is a great starting point, but more comprehensive documentation would definitely be helpful.

Examples

There aren't very many example templates available yet. That is why I wrote this article. I also recommend Marcus Felling's recent blog post, where he provides a more complex example of a YAML build definition.

Tooling

As I mentioned above, there is limited tooling available currently. The VSTS team have indicated that they will soon provide the ability to export an existing build definition as YAML, which will help a lot when trying to generate and understand YAML build definitions. My personal preference will still be to craft them by hand, but this would be a useful feature to help bootstrap new templates.

Similarly, there currently doesn't appear to be any validation of the parameters passed to each task. If you misspell a property name, you won't get an error - but the task won't behave as you expect. Hopefully this experience will be improved over time.

Error Messages

The error messages that you get from the build system aren't always very clear. One error message I've seen frequently is Mapping values are not allowed in this context. Whenever I've had this, it's been because I did something wrong with my YAML indentation. Hopefully this saves somebody some time!

Releases

This feature is only available for VSTS build configurations right now. Release configurations will also be getting the YAML treatment, but it seems this won't be for another few months. This will be an exciting development, as in my experience, release definitions can be even more complex than build configurations, and would benefit from all of the peer review, versioning, and branching goodness I described above.

Variable Binding and Evaluation Order

While you can use VSTS variables in many places within the YAML build definitions, there appear to be some properties that can't be bound to a variable. One I've encountered is when trying to link an Azure subscription to a property.

For example, in one of my build configurations, I want to publish a Docker image to an Azure Container Registry. In order to do this I need to pass in the Azure service endpoint details. However, if I specify this as a variable, I get an error - I have to hard-code the service endpoint's identifier into the YAML file. This is not something I want to do, and will become a particular issue when release definitions can be defined in YAML, so I hope this gets fixed soon.

Summary

Build definitions and build scripts are an integral part of your application's source code, and should be treated as such. By storing your build definition as a YAML file and placing it into your repository, you can begin to improve the quality of your build pipeline, and take advantage of source control features like diffing, versioning, branching, and pull request review.

VSTS Build Definitions as YAML Part 1: What and Why?

This post was originally published on the Kloud blog.

Visual Studio Team Services (VSTS) has recently gained the ability to create build definitions as YAML files. This feature is currently in preview. In this post, I'll explain why this is a great addition to the VSTS platform and why you might want to define your builds in this way. In the next post I'll work through an example of using this feature, and I'll also provide some tips and links to documentation and guidance that I found helpful when constructing some build definitions myself.

What Are Build Definitions?

If you use a build server of some kind (and you almost certainly should!), you need to tell the build server how to actually build your software. VSTS has the concept of a build configuration, which specifies how and when to build your application. The 'how' part of this is the build definition. Typically, a build definition will outline how the system should take your source code, apply some operations to it (like compiling the code into binaries), and emit build artifacts. These artifacts usually then get passed through to a release configuration, which will deploy them to your release environment.

For a simple static website, the build definition might be a single step that copies some files from your source control system into a build artifact. In a .NET Core application, you will generally use the dotnet command-line tool to build, test, and publish your application. Other application frameworks and languages will have their own way of building their artifacts, and in a non-trivial application, the steps involved in building the application might get fairly complex, and may even trigger PowerShell or Bash scripts to allow for further flexibility and advanced control flow and logic.

Until now, VSTS has really only allowed us to create and modify build definitions in its web editor. This is a great way to get started with a new build definition, and you can browse the catalog of available steps, add them into your build definition, and configure them as necessary. You can also define and use variables to allow for reuse of information across steps. However, there are some major drawbacks to defining your build definition in this way.

Why Store Build Definitions in YAML?

Build definitions are really just another type of source code. A build definition is just a specification of how your application should be built. The exact list of steps, and the sequence in which they run, is the way in which we are defining part of our system's behaviour. If we adopt a DevOps mindset, then we want to make sure we treat all of our system as source code, including our build definitions, and we want to take this source code seriously.

In November 2017, Microsoft announced that VSTS now has the ability to run builds that have been defined as a YAML file, instead of through the visual editor. The YAML file gets checked into the source control system, exactly the same way as any other file. This is similar to the way Travis CI allows for build definitions to be specified in YAML files, and is great news for those of us who want to treat our build definitions as code. It gives us many advantages.

Versioning

We can keep build definitions versioned, and can track the history of a build definition over time. This is great for auditability, as well as to ensure that we have the ability to consult or roll back to a previous version if we accidentally make a breaking change.

Keeping Build Definitions with Code

We can store the build definitions alongside the actual code that they build, meaning that we are keeping everything tidy and in one place. Until now, if we wanted to fully understand the way the application was built, we'd have to look at the code repository as well as the VSTS build definition. This made the overall process harder to understand and reason about. In my opinion, the fewer places we have to remember to check or update during changes the better.

Branching

Related to the last point, we can also make use of important features of our source control system like branching. If your team uses GitHub Flow or a similar Git-based branching strategy, then this is particularly advantageous.

Let's take the example of adding a new unit test project to a .NET Core application. Until now, the way you might do this is to set up a feature branch on which you develop the new unit test project. At some point, you'll want to add the execution of these tests to your build definition. This would require some careful timing and consideration. If you update the build definition before your branch is merged, then any builds you run in the meantime will likely fail - they'll be trying to run a unit test project that doesn't exist outside of your feature branch yet. Alternatively, you can use a draft version of your build configuration, but then you need to plan exactly when to publish that draft.

If we specify our build definition in a YAML file, then this change to the build process is simply another change that happens on our feature branch. The feature branch's YAML file will contain a new step to run the unit tests, but the master branch will not yet have that step defined in its YAML file. When the build configuration runs on the master branch before we merge, it will get the old version of the build definition and will not try to run our new unit tests. But any builds on our feature branch, or the master branch once our feature branch is merged, will include the new tests.

This is very powerful, especially in the early stages of a project where you are adding and changing build steps frequently.

Peer Review

Taking the above point even further, we can review a change to a build definition YAML file in the same way that we would review any other code changes. If we use pull requests (PRs) to merge our changes back into the master branch, then we can see the change in the build definition right within the PR, and our team members can give it the same rigorous level of review as they give our other code files. Similarly, they can make sure that changes in the application code are reflected in the build definition, and vice versa.

Reuse

Another advantage of storing build definitions in a simple format like YAML is being able to copy and paste the files, or sections from the files, and reuse them in other projects. Until now, copying a step from one build definition to another was very difficult, and required manually setting up the steps. Now that we can define our build steps in YAML files, it's often simply a matter of copying and pasting.

Linting

A common technique in many programming environments is linting, which involves running some sort of analysis over a file to check for potential errors or policy violations. Once our build definition is defined in YAML, we can perform this same type of analysis if we wanted to write a tool to do it. For example, we might write a custom linter to check that we haven't accidentally added any credentials or connection strings into our build definition, or that we haven't mistakenly used the wrong variable syntax. YAML is a standard format and is easily parsable across many different platforms, so writing a simple linter to check for your particular policies is not a complex endeavour.

Abstraction and Declarative Instruction

I'm a big fan of declarative programming - expressing your intent to the computer. In declarative programming, software figures out the right way to proceed based on your high-level instructions. This is the idea behind many different types of 'desired-state' automation, and in abstractions like LINQ in C#. This can be contrasted with an imperative approach, where you specify an explicit sequence of steps to achieve that intent.

One approach that I've seen some teams adopt in their build servers is to use PowerShell or Bash scripts to do the entire build. This provided the benefits I outlined above, since those script files could be checked into source control. However, by dropping down to raw script files, this meant that these teams couldn't take advantage of all of the built-in build steps that VSTS provides, or the ecosystem of custom steps that can be added to the VSTS instance.

Build definitions in VSTS are a great example of a declarative approach. They are essentially an abstraction of a sequence of steps. If you design your build process well, it should be possible for a newcomer to look at your build definition and determine what the sequence of steps are, why they are happening, and what the side-effects and outcomes will be. By using abstractions like VSTS build tasks, rather than hand-writing imperative build logic as a sequence of command-line steps, you are helping to increase the readability of your code - and ultimately, you may increase the performance and quality by allowing the software to translate your instructions into actions.

YAML build definitions give us all of the benefits of keeping our build definitions in source control, while still allowing us to make use of the full power of the VSTS build system.

Inline Documentation

YAML files allow for comments to be added, and this is a very helpful way to document your build process. Frequently, build definitions can get quite complex, with multiple steps required to do something that appears rather trivial, so having the ability to document the process right inside the definition itself is very helpful.

Summary

Hopefully through this post, I've convinced you that storing your build definition in a YAML file is a much tidier and saner approach than using the VSTS web UI to define how your application is built. In the next post, I'll walk through an example of how I set up a simple build definition in YAML, and provide some tips that I found useful along the way.