Simon Online

2023

2023-03-30

Alerting on Blob Storage Throttling

Blob storage is the workhorse of Azure. It is one of the original services and has grown with the times to allow storing data in a variety of formats. It is able to scale perhaps not to the moon but certainly to objects in low earth orbit(LEO).

One of my clients has a fair bit of data stored in a file share hosted in Azure Storage. They do nightly processing on this data using a legacy IaaS system. We were concerned that we might saturate the blob storage account with our requests. Fortunately, there are metrics we can use to understand what’s going on inside blob storage. Nobody wants to monitor these all the time so we set up some alerting rules for the storage account.

Alert rules can easily be created by going to the file share in the storage account and clicking on metrics. Then in the top bar click on New Alert Rule

The typical rules we applied were

Alerting if we reach a certain % of capacity. We set this to about 90%
Alerting if we see the number of transactions fall outside a typical range. We used a dynamic rule for this to account for how the load on this batch processing system changes overnight.

However there was one additional metric we wanted to catch: when we have hit throttling. This was a bit trickier to set up because we’ve never actually hit this threshold. This means that the dimensions to filter on don’t actually show up in the portal. They must be entered by hand.

These are the normal values we see

By clicking on add custom value we were able to add 3 new response codes

ClientAccountBandwidthThrottlingError
ClientShareIopsThrottlingError
ClientThrottlingError

With these in place we can be confident that should these ever occur we’ll be alerted to it

2023-03-15

Importing Vuetify via ESM

If you want a quick way to add Vue 3 and Vuetify to your project via the UMD CDN then you can do so using ESM. ESM are ECMAScript Modules and are now supported by the majority of browsers. This is going to look like

<script type="importmap">
    { "imports": 
        { "vue": "https://unpkg.com/vue@3.2.47/dist/vue.esm-browser.js" }
          "vuetify": "https://unpkg.com/vuetify@3.1.10/dist/vuetify.esm.js"
        }
</script>
<script type="module">
    import {createApp} from 'vue'
    import { createVuetify } from 'vuetify'
    const vuetify = createVuetify()

    createApp({
      data() {
        return {
          message: 'Hello Vue!'
        }
      }
    }).use(vuetify).mount('#app')
</script>

The import map is a way to map a module name to a URL. This is necessary because the Vuetify ESM module imports from Vue. Don’t forget you’ll also need to add in the CSS for Vuetify

2023-03-04

App Service Quota Issue

I was deploying an app service in a new region today and ran into a quota issue. The error message was:

Error: creating Service Plan: (Serverfarm Name "***devplan" / Resource Group "***_dev"): web.AppServicePlansClient#CreateOrUpdate: Failure sending request: StatusCode=401 -- Original Error: Code="Unauthorized" Message="This region has quota of 0 instances for your subscription. Try selecting different region or SKU."

This was a pretty simple deployment to an S1 app service plan. I’ve run into this before and it’s typically easy to request a bump in quota in the subscription. My problem today was that it isn’t obvious what CPU quota I need to request. I Googled around and found some suggestion that S1 ran on A series VMs but that wasn’t something I had any limits on.

Creating in the UI gave the same error

I asked around and eventually somebody in the know was able to look into the consumption in that region. The cloud was full! Well not full but creation of some resources was restricted. Fortunately this was just a dev deployment so I was able to move to a different region and get things working. It would have been pretty miserable if this was a production deployment or if I was adding onto an existing deployment.

2023-03-04

Azure KeyVault Reference Gotcha

I was working on a deployment today and ran into an issue with a keyvault reference. In the app service the keyvault reference showed that it wasn’t able to get the secret. The reference seemed good but I wasn’t seeing what I wanted to see which was a couple of green checkmarks

The managed identity on the app service had only GET access to the keyvault. I added LIST access and the reference started working. I’m not sure why this is but I’m guessing that the reference is doing a LIST to get the secret and then a GET to get the secret value.

2023-02-16

Allow Comments in JSON Payload in ExpressJS

Officially comments are not supported in the JSON format. In fact this lack of ability to comment is one of the reasons that lead to the downfall of the JSON based project system during the rewrite of the .NET some years back. However they sure can be useful to support. In my case I wanted to add some comments to the body of a request to explain a parameter in Postman. I like to keep comments as close to the thing they describe as possible so I didn’t want this on a wiki somewhere nobody would ever find.

The content looked something like

{
    "data": {
        "form_data": {
            "effective_date": "2023-02-23",
            "match_on_per_pay_period_basis": 0, /* 0 if yes, 1 if no */
            "simple_or_tiered": 1, /* 0 if simple 1 if tiered */
        }
    }
}

This was going to an ExpressJS application which was parsing the body using body-parser. These days we can just use express.json() and avoid taking on that additional dependency. The JSON parsing in both these is too strict to allow for comments. Fortunately, we can use middleware to resolve the issue. There is a swell package called strip-json-comments which does the surprisingly difficult task of stripping comments. We can use that.

The typical json paring middleware looks like

app.use(express.json())

or 

app.use(bodyParser.json())

Instead we can do

import stripJsonComments from 'strip-json-comments';

...

app.use(express.text{
    type: "application/json" // 
}) //or app.use(bodyParser.text({type: "application/json}))
app.use((req,res,next)=> {
    if(req.body){
        req.body = stripJsonComments(req.body);
    }
    next();
})

This still allows us to take advantage of the compression and character encoding facilities in the original parser while also intercepting and cleaning up the JSON payload.

2023-02-15

Excel and Ruby

Excel is the king of spreadsheets and I often find myself in situation where I have to write our Excel files in an application. I’d say that as an application grows the probability of needing Excel import or export approaches 1. Fortunately, there are lots of libraries out there to help with Excel across just about every language. The quality and usefuleness of these libraries varies a lot. In Ruby land there seem to be a few options.

Spreadsheet

https://github.com/zdavatz/spreadsheet/

As the name suggests this library deals with Excel spreadsheets. It is able to both read and write them by using Spreadsheet::Excel Library and the ParseExcel Library. However it only supports the older XLS file format. While this is still widely used it is not the default format for Excel 2007 and later. I try to stay clear of the format as much as possible. There have not been any releases of this library in about 18 months but there haven’t been any releases of the XLS file format for decades so it doesn’t seem like a big deal.

The library can be installed using

gem install spreadsheet

Then you can use it like so

require 'spreadsheet'

workbook = Spreadsheet.open("test.xls")
worksheet = workbook.worksheet 0
worksheet.rows[1][1] = "Hello there!"
workbook.write("test2.xls")

There are some limitations around editing files such as cell formats not updating but for most things it should be fine.

RubyXL

https://github.com/weshatheleopard/rubyXL

This library works on the more modern XLSX file formats. It is able to read and write files with modifications. However there are some limitations such as being unable to insert images

require 'rubyXL'

  # only do this if you don't care about memory usage, otherwise you can load submodules separately
  # depending on what you need
require 'rubyXL/convenience_methods'

workbook = RubyXL::Parser.parse("test.xlsx")
worksheet = workbook[0]
cell = worksheet.cell_at('A1')
cell.change_contents("Hello there!")
workbook.write("test2.xlsx")

CAXLSX

https://github.com/caxlsx/caxlsx

This library is the community supported version of AXLSX. It is able to generate XLSX files but not read them or modify them. There is rich support for charts, images and other more advanced excel features. The

Install using

gem install caxlsx

And then a simple example looks like

require 'axlsx'

p = Axlsx::Package.new
workbook = p.workbook

wb.add_worksheet(name: 'Test') do |sheet|
  sheet.add_row ['Hello there!']
end

p.serialize "test.xlsx"

Of all the libraries mentioned here the documentation for this one is the best. It is also the most actively maintained. The examples directory https://github.com/caxlsx/caxlsx/tree/master/examples gives a plethora of examples of how to use the library.

Fast Excel

https://github.com/Paxa/fast_excel

This library focuses on being the fastest excel library for ruby. It is actually written in C to speed it up so comes with all the caveats about running native code. Similar to CAXLSX it is only able to read and write files and not modify them.

require 'fast_excel'

  # constant_memory: true streams changes to disk so it means that you cannot
  # modify an already written record
workbook = FastExcel.open("test.xlsx", constant_memory: true)
worksheet = workbook.add_worksheet("Test")

bold = workbook.bold_format
worksheet.set_column(0, 0, FastExcel::DEF_COL_WIDTH, bold)
worksheet << ["Hello World"]
workbook.close

As you can see here the library really excels at adding consistently shaped rows. You’re unlikely to get a complex spreadsheet with headers and footers built using this tooling.

2022

2022-11-17

Bulk Insert SQL Geometry on .NET Core

I have been updating an application from full framework to .NET 6 this week. One of the things this app does is bulk load data into SQL Server. Normally this works just fine but some of the data is geography data which requires a special package to be installed: Microsoft.SqlServer.Types. This package is owned by the SQL server team so, as you’d expect, it is ridiculously behind the times. Fortunately, they are working on updating it and it is now available for Netstandard 2.1 in a preview mode.

The steps I needed to take to update the app were:

Install the preview package for Microsoft.SqlServer.Types
Update the SQL client package from System.Data.SqlClient to Microsoft.Data.SqlClient

After that the tests we had for inserting polygons worked just great. This has been a bit of a challenge over the years but I’m delighted that we’re almost there. We just need a non-preview version of the types package and we should be good to go.

Gotchas

When I’d only done step 1 I ran into errors like

System.InvalidOperationException : The given value of type SqlGeometry from the data source cannot be converted to type udt of the specified target column.
---- System.ArgumentException : Specified type is not registered on the target server. Microsoft.SqlServer.Types.SqlGeometry, Microsoft.SqlServer.Types, Version=16.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91.

I went down a rabbit hole on that one before spotting a post from MVP Erik Jensen https://github.com/ErikEJ/EntityFramework6PowerTools/issues/103 which sent me in the right direction.

2022-11-11

Removing Azure Backups in Terraform

If you have a VM backup in your Terraform state and need to get rid of it be aware that it is probably going to break your deployment pipeline. The reason is that Terraform will delete the item but then find that the resource still there. This is because backup deletion takes a while (say 14 days). Eventually the backup will delete but not before Terraform times out.

The solution I’m using is to just go in an manually delete the backup from the terraform state to unblock my pipelines.

terraform state list | grep <name of your backup>
-- make note of the resource identifier --
terraform state rm <found resource identifier>

Editing Terraform state seems scary but it’s not too bad after you do it a bit. Take backups!

2022-11-01

Dealing with Set-Output Depreciation Warnings in Terraform github-actions

I’ve got a build that is running terraform on github actions (I actually have loads of them) and I’ve been noticing that they are very chatty about warnings now.

The warning is

The `set-output` command is deprecated and will be disabled soon. Please upgrade to using Environment Files. For more information see: https://github.blog/changelog/2022-10-11-github-actions-deprecating-save-state-and-set-output-commands/

The history here without reading that link is basically that github are changing how we push variables to the pipeline for use in later steps. There were some security implications with the old approach and the new approach should be better

- name: Save variable
  run: echo "SOMENAME=PICKLE" >> $GITHUB_STATE

- name: Set output
  run: echo "SOMENAME=PICKLE" >> $GITHUB_OUTPUT

Problem was that the steps on which I was having trouble didn’t obviously use the set-output command.

 ...
- name: Init Terraform
  run: terraform init 
- name: Validate Terraform
  run: terraform validate
...

I had to dig a bit to find out that it was actually the terraform command that was causing the problem. See as part of the build I install the terraform cli using the

  - name: HashiCorp - Setup Terraform
    uses: hashicorp/setup-terraform@v2.0.2
    with:
        terraform_version: 1.1.9
        terraform_wrapper: true

Turns out that as of writing the latest version of the wrapper installed by the setup-terraform task makes use of an older version of the @actions/core package. This package is what is used to set the output and before version 1.10 it did so using set-output. A fix has been merged into the setup-terraform project but no update released yet.

For now I found that I had no need for the wrapper so I disabled it with

  - name: HashiCorp - Setup Terraform
    uses: hashicorp/setup-terraform@v2.0.2
    with:
        terraform_version: 1.1.9
        terraform_wrapper: false

but for future readers if there is a more recent version of setup-terraform than 2.0.2 then you can update to that to remove the warnings. Now my build is clean

2022-11-01

My Theory of GitHub Actions and IaC

I do all sorts of random work and one of those is helping out on some infrastructure deployments on Azure. Coming from a development background I’m allergic to clicking around inside an Azure website to configure things in a totally non-repeatable way. So I’ve been using Terraform to do the deployments. We have built up a pretty good history of using Terraform - today I might use Pulumi instead but the actual tool isn’t all that important as opposed to the theory.

What I’m looking to achieve is a number of things

Make deployments easy for people to do
Make deployments repeatable - we should be able to us the same deployment scripts to set up a dev enviornment or recover from a disaster with minimal effort
Ensure that changes are reviewed before they are applied

To meet these requirements a build pipeline in GitHub actions (or Azure DevOps, for that matter) is an ideal fit. We maintain our Terraform scripts in a repository. Typically we use one repository per resource group but your needs may vary around that. There isn’t any monetary cost to having multiple repositories but there can be some cognitive load to remembering where the right repository is (more on that later).

Source Code

Changes to the infrastructure definition code are checked into a shared repository. Membership over this code is fairly relaxed. Developers and ops people can all make changes to the code. We strive to make use of normal code review approaches when checking in changes. We’re not super rigorous about changes which are checked in because many of the people checking in changes have ops backgrounds and aren’t all that well versed in the PR process. I want to make this as easy for them as possible so they aren’t tempted to try to make changes directly in Azure.

In my experience there is a very strong temptation for people to abandon rigour when a change is needed at once to address a business need. We need to change a firewall rule - no time to review that let’s just do it. I’m not saying that this is a good thing but it is a reality. Driving people to the Terraform needs to be easy. Having their ad-hoc changes overwritten by a Terraform deploy will also help drive the point home. Stick and carrot.

Builds

A typical build pipeline for us will include 3 stages.

The build step runs on a checkin trigger. This will run an initial build step which validates the terraform scripts are syntactically correct and well linted. A small number of our builds stop here. Unlike application deployments we typically want these changes to be live right away or at most during some maintenance window shortly after the changes have been authored. That deployments are run close to the time the changes were authored helps with our lack of rigour around code reviews.

The next stage is to preview what changes will be performed by Terraform. This stage is gated such that it need somebody to actually approve it. It is low risk because no changes to made - we run a terraform plan and see what changes will be made. Reading over these changes is very helpful because we often catch unintended consequences here. Accidentally destroying and recreating a VM instead of renaming it? Caught here. Removing a tag that somebody manually applied to a resource and that should be preserved? Caught here.

The final stage in the pipeline is to run the Terraform changes. This step is also gated to prevent us from deploying it without proper approvals. Depending on the environment we might need 2 approvals or at least one approval that isn’t the person writing the change. More eyes on a change will catch problems more easily and also socialize changes so that it isn’t a huge shock to the entire ops team that we now have a MySQL server in the enviornment or whatever it may be.

Permissions

I mentioned earlier that we like to guide ops people to make enviornment changes in Terraform rather than directly in Azure. One of the approaches we take around that is to not grant owner or writer permissions to resources directly to people be they ops or dev. Instead we have a number of permission restricted service principals that are used to make changes to resources. These service principals are granted permissions to specific resource groups and these service principals are what’s used in the pipeline to make the changes. This means that if somebody wants to make a change to a resource they need to go through the Terraform pipeline.

We keep the client id and secret in the secrets of the github pipeline

In this example we just keep a single repository wide key because we only have one enviornment. We’d make use of enviornment specific secrets if we had more than one environment.

This approach has the added bonus of providing rip stops in the event that we leak some keys somewhere. At worst that service principal has access only to one resource group so an attacker is limited to being able to mess with that group and not escape to the larger enviornment.

Achieving our Goals

To my mind this approach is exactly how IaC was meant to be used. We have a single source of truth for our infrastructure. We can make changes to that infrastructure in a repeatable way. We can review those changes before they are applied. All this while keeping the barrier to entry low for people who are not familiar with the code review process.

Future Steps

We already make use of Terraform modules for most of our deployment but we’re not doing a great job of reusing these modules from project to project. We’re hoping to keep a library of these modules around which can help up standardize things. For instance our VM module doesn’t just provision a VM - it sets up backups and uses a standardized source image.

I also really like the idea of using the build pipeline to annotate pull requests with the Terraform changes using https://github.com/marketplace/actions/terraform-pr-commenter. Surfacing this directly on the PR would save the reviewers the trouble of going through the pipeline to see what changes are being made. However it would be added friction for our ops team as they’d have to set up PRs.

Archives

A blog about computer programming and technology.

My Books