2023-11-23

Docker COPY not Finding Files

My dad once told me that there are no such things a problems just solutions waiting to be applied. I don’t know what book he’d just read or course he’d just been on to spout such nonsense but I’ve never forgotten it.

Today my not problem was running a docker build wasn’t copying the files I was expecting it to. In particular I had a themes directory which was not ending up in the image and in fact the build was failing with something like

ERROR: failed to solve: failed to compute cache key: failed to calculate checksum of ref b1f3faa4-fdeb-41ed-b016-fac3862d370a::pjh3jwhj2huqmcgigjh9udlh2: "/themes": not found

I was really confused because themes absolutly did exist on disk. It was as if it wasn’t being added to the build context. In fact it wasn’t being added and, as it turns out, this was because my .dockerignore file contained

**

Which ignores everything from the local directory. That seemed a bit extreme so I changed it to

** 
!themes

With this in place the build worked as expected.

2023-11-20

Connect to a Service Container in Github Actions

Increasingly there is a need to run containers during a github action build to run realistic tests. In my specific scenario I had a database integration test that I wanted to run against a postgres database with our latest database migrations applied.

We run our builds inside a multi-stage docker build so we actually need to have a build container communicate with the database container during the build phase. This is easy enough in the run phase but in the build phase there is just a flag you can pass to the build called network which takes an argument but the arguments it can take don’t appear to be documented anywhere. After significant trial and error I found that the argument it takes that we want is host. This will build the container using the host networking. As we surfaced the ports for postgres in our workflow file like so

postgres:
    image: postgres:15.3
    ports:
        - 5432:5432
    env:
        POSTGRES_DB: default
        POSTGRES_USER: webapp_user
        POSTGRES_PASSWORD: password
    options: >-
        --health-cmd pg_isready
        --health-interval 10s
        --health-timeout 5s
        --health-retries 5

We are able to access the database from the build context with 127.0.0.1. So we can pass in a variable to our container build

docker build --network=host . --tag ${{ env.DOCKER_REGISTRY_NAME }}/${{ env.DOCKER_IMAGE_NAME }}:${{ github.run_number }} --build-arg 'DATABASE_CONNECTION_STRING=${{ env.DATABASE_CONNECTION_STRING }}'

With all this in place the tests run nicely in the container during the build. Phew.

2023-10-30

Setting up SMTP for Keycloak Using Mailgun

Quick entry here about setting up Mailgun as the email provider in Keycloak. To do this first you’ll need to create SMTP credentials in Mailgun and note the generated password

Next in Keycloak set the credentials up in the realm settings under email.

Check both the SSL boxes and give it port 465.

2023-10-14

Load Testing with Artillery

Load testing a site or an API can be a bit involved. There are lots of things to consider like what the traffic on your site typically looks like, what peaks look like and so forth. That’s mostly outside the scope of this article which is just about load testing with artillery.

Scenario

We have an API that we call which is super slow and super fragile. We were recently told by the team that maintains it that they’d made improvements and increased our rate limit from something like 200 requests per minute to 300 and could we test it. So sure, I guess we can do your job for you. For this we’re going to use the load testing tool artillery.

Getting started

Artillery is a node based tool so you’ll need to have node installed. You can install artillery with npm install -g artillery.

You then write a configuration file to tell artillery what to do. Here’s the one I used for this test (with the names of the guilty redacted):

config: 
  target: https://some.company.com
  phases:
    - duration: 1
      arrivalRate: 1
  http:
    timeout: 100
scenarios:
 - flow:
    - log: "Adding new user
    - post:
        url: /2020-04/graphql
        body: |
          {"query":"query readAllEmployees($limit: Int!, $cursor: String, $statusFilter: [String!]!) {\n company {\n employees(filter: {status: {in: $statusFilter}}, pagination: {first: $limit, after: $cursor}) {\n pageInfo {\n hasNextPage\n startCursor\n endCursor\n hasPreviousPage\n }\n nodes {\n id\n firstName\n lastName\n\t\tmiddleName\n birthDate\n displayName\n employmentDetail {\n employmentStatus\n hireDate\n terminationDate\n }\n taxIdentifiers {\n taxIdentifierType\n value\n }\n payrollProfile {\n preferredAddress {\n streetAddress1\n streetAddress2\n city\n zipCode\n county\n state\n country\n }\n preferredEmail\n preferredPhone\n compensations {\n id\n timeWorked {\n unit\n value\n }\n active\n amount\n multiplier\n employerCompensation {\n id\n name\n active\n amount\n timeWorked {\n unit\n value\n }\n multiplier\n }\n }\n }\n }\n }\n }\n}\n","variables":{
          "limit": 100,
          "cursor": null,
          "statusFilter": [
          "ACTIVE",
          "TERMINATED",
          "NOTONPAYROLL",
          "UNPAIDLEAVE",
          "PAIDLEAVE"
          ]
          }}
        headers:
          Content-Type: application/json
          Authorization: Bearer <redacted>

As you can see this is graphql and it is a private API so we need to pass in a bearer token. The body I just stole from our postman collection so it isn’t well formatted.

Running this is as simple as running artillery run <filename>.

At the top you can see arrival rates and duration. This is saying that we want to ramp up to 1 requests per second over the course of 1 second. So basically this is just proving that our request works. The first time I ran this I only got back 400 errors. To get the body of the response to allow me to see why I was getting a 400 I set

export DEBUG=http,http:capture,http:response

Once I had the simple case working I was able to increase the rates to higher levels. To do this I ended up adjusting the phases to look like

  phases:
    - duration: 30
      arrivalRate: 30
      maxVusers: 150

This provisions 30 users a second up to a maximum of 150 users - so that takes about 5 seconds to saturate. I left the duration higher because I’m lazy and artillery is smart enough to not provision more. Then to ensure that I was pretty constantly hitting the API with the maximum number of users I added a loop to the scenario like so:

scenarios:
 - flow:
    - log: "New virtual user running"
    - loop:
      - post:
          url: /2020-04/graphql
          body: |
            {"query":"query readAllEmployeePensions($limit: Int!, $cursor: String, $statusFilter: [String!]!) {\n company {\n employees(filter: {status: { in: $statusFilter }}, pagination: {first: $limit, after: $cursor}) {\n pageInfo {\n hasNextPage\n startCursor\n endCursor\n hasPreviousPage\n }\n nodes {\n id\n displayName\n payrollProfile {\n pensions {\n id\n active\n employeeSetup {\n amount {\n percentage\n value\n }\n cappings {\n amount\n timeInterval\n }\n }\n employerSetup {\n amount {\n percentage\n value\n }\n cappings {\n amount\n timeInterval\n }\n }\n employerPension {\n id\n name\n statutoryPensionPolicy\n }\n customFields {\n name\n value\n }\n }\n }\n }\n }\n }\n}\n","variables":{
            "limit": 100,
            "statusFilter": [
            "ACTIVE"
            ]
            }}
          headers:
            Content-Type: application/json
            Authorization: Bearer <redacted>
      count: 100

Pay attention to that count at the bottom.

I was able to use this to fire thousands of requests at the service and prove out that our rate limit was indeed higher than it was before and we could raise our concurrency.

2023-03-30

Alerting on Blob Storage Throttling

Blob storage is the workhorse of Azure. It is one of the original services and has grown with the times to allow storing data in a variety of formats. It is able to scale perhaps not to the moon but certainly to objects in low earth orbit(LEO).

One of my clients has a fair bit of data stored in a file share hosted in Azure Storage. They do nightly processing on this data using a legacy IaaS system. We were concerned that we might saturate the blob storage account with our requests. Fortunately, there are metrics we can use to understand what’s going on inside blob storage. Nobody wants to monitor these all the time so we set up some alerting rules for the storage account.

Alert rules can easily be created by going to the file share in the storage account and clicking on metrics. Then in the top bar click on New Alert Rule

The typical rules we applied were

  1. Alerting if we reach a certain % of capacity. We set this to about 90%
  2. Alerting if we see the number of transactions fall outside a typical range. We used a dynamic rule for this to account for how the load on this batch processing system changes overnight.

However there was one additional metric we wanted to catch: when we have hit throttling. This was a bit trickier to set up because we’ve never actually hit this threshold. This means that the dimensions to filter on don’t actually show up in the portal. They must be entered by hand.

These are the normal values we see

By clicking on add custom value we were able to add 3 new response codes

  • ClientAccountBandwidthThrottlingError
  • ClientShareIopsThrottlingError
  • ClientThrottlingError

With these in place we can be confident that should these ever occur we’ll be alerted to it

2023-03-15

Importing Vuetify via ESM

If you want a quick way to add Vue 3 and Vuetify to your project via the UMD CDN then you can do so using ESM. ESM are ECMAScript Modules and are now supported by the majority of browsers. This is going to look like

<script type="importmap">
    { "imports": 
        { "vue": "https://unpkg.com/vue@3.2.47/dist/vue.esm-browser.js" }
          "vuetify": "https://unpkg.com/vuetify@3.1.10/dist/vuetify.esm.js"
        }
</script>
<script type="module">
    import {createApp} from 'vue'
    import { createVuetify } from 'vuetify'
    const vuetify = createVuetify()

    createApp({
      data() {
        return {
          message: 'Hello Vue!'
        }
      }
    }).use(vuetify).mount('#app')
</script>

The import map is a way to map a module name to a URL. This is necessary because the Vuetify ESM module imports from Vue. Don’t forget you’ll also need to add in the CSS for Vuetify

2023-03-04

App Service Quota Issue

I was deploying an app service in a new region today and ran into a quota issue. The error message was:

Error: creating Service Plan: (Serverfarm Name "***devplan" / Resource Group "***_dev"): web.AppServicePlansClient#CreateOrUpdate: Failure sending request: StatusCode=401 -- Original Error: Code="Unauthorized" Message="This region has quota of 0 instances for your subscription. Try selecting different region or SKU."

This was a pretty simple deployment to an S1 app service plan. I’ve run into this before and it’s typically easy to request a bump in quota in the subscription. My problem today was that it isn’t obvious what CPU quota I need to request. I Googled around and found some suggestion that S1 ran on A series VMs but that wasn’t something I had any limits on.

Creating in the UI gave the same error

I asked around and eventually somebody in the know was able to look into the consumption in that region. The cloud was full! Well not full but creation of some resources was restricted. Fortunately this was just a dev deployment so I was able to move to a different region and get things working. It would have been pretty miserable if this was a production deployment or if I was adding onto an existing deployment.

2023-03-04

Azure KeyVault Reference Gotcha

I was working on a deployment today and ran into an issue with a keyvault reference. In the app service the keyvault reference showed that it wasn’t able to get the secret. The reference seemed good but I wasn’t seeing what I wanted to see which was a couple of green checkmarks

The managed identity on the app service had only GET access to the keyvault. I added LIST access and the reference started working. I’m not sure why this is but I’m guessing that the reference is doing a LIST to get the secret and then a GET to get the secret value.

2023-02-16

Allow Comments in JSON Payload in ExpressJS

Officially comments are not supported in the JSON format. In fact this lack of ability to comment is one of the reasons that lead to the downfall of the JSON based project system during the rewrite of the .NET some years back. However they sure can be useful to support. In my case I wanted to add some comments to the body of a request to explain a parameter in Postman. I like to keep comments as close to the thing they describe as possible so I didn’t want this on a wiki somewhere nobody would ever find.

The content looked something like

{
    "data": {
        "form_data": {
            "effective_date": "2023-02-23",
            "match_on_per_pay_period_basis": 0, /* 0 if yes, 1 if no */
            "simple_or_tiered": 1, /* 0 if simple 1 if tiered */
        }
    }
}

This was going to an ExpressJS application which was parsing the body using body-parser. These days we can just use express.json() and avoid taking on that additional dependency. The JSON parsing in both these is too strict to allow for comments. Fortunately, we can use middleware to resolve the issue. There is a swell package called strip-json-comments which does the surprisingly difficult task of stripping comments. We can use that.

The typical json paring middleware looks like

app.use(express.json())

or 

app.use(bodyParser.json())

Instead we can do

import stripJsonComments from 'strip-json-comments';

...

app.use(express.text{
    type: "application/json" // 
}) //or app.use(bodyParser.text({type: "application/json}))
app.use((req,res,next)=> {
    if(req.body){
        req.body = stripJsonComments(req.body);
    }
    next();
})

This still allows us to take advantage of the compression and character encoding facilities in the original parser while also intercepting and cleaning up the JSON payload.

2023-02-15

Excel and Ruby

Excel is the king of spreadsheets and I often find myself in situation where I have to write our Excel files in an application. I’d say that as an application grows the probability of needing Excel import or export approaches 1. Fortunately, there are lots of libraries out there to help with Excel across just about every language. The quality and usefuleness of these libraries varies a lot. In Ruby land there seem to be a few options.

Spreadsheet

https://github.com/zdavatz/spreadsheet/

As the name suggests this library deals with Excel spreadsheets. It is able to both read and write them by using Spreadsheet::Excel Library and the ParseExcel Library. However it only supports the older XLS file format. While this is still widely used it is not the default format for Excel 2007 and later. I try to stay clear of the format as much as possible. There have not been any releases of this library in about 18 months but there haven’t been any releases of the XLS file format for decades so it doesn’t seem like a big deal.

The library can be installed using

gem install spreadsheet

Then you can use it like so

require 'spreadsheet'

workbook = Spreadsheet.open("test.xls")
worksheet = workbook.worksheet 0
worksheet.rows[1][1] = "Hello there!"
workbook.write("test2.xls")

There are some limitations around editing files such as cell formats not updating but for most things it should be fine.

RubyXL

https://github.com/weshatheleopard/rubyXL

This library works on the more modern XLSX file formats. It is able to read and write files with modifications. However there are some limitations such as being unable to insert images

require 'rubyXL'

  # only do this if you don't care about memory usage, otherwise you can load submodules separately
  # depending on what you need
require 'rubyXL/convenience_methods'

workbook = RubyXL::Parser.parse("test.xlsx")
worksheet = workbook[0]
cell = worksheet.cell_at('A1')
cell.change_contents("Hello there!")
workbook.write("test2.xlsx")

CAXLSX

https://github.com/caxlsx/caxlsx

This library is the community supported version of AXLSX. It is able to generate XLSX files but not read them or modify them. There is rich support for charts, images and other more advanced excel features. The

Install using

gem install caxlsx

And then a simple example looks like

require 'axlsx'

p = Axlsx::Package.new
workbook = p.workbook

wb.add_worksheet(name: 'Test') do |sheet|
  sheet.add_row ['Hello there!']
end

p.serialize "test.xlsx"

Of all the libraries mentioned here the documentation for this one is the best. It is also the most actively maintained. The examples directory https://github.com/caxlsx/caxlsx/tree/master/examples gives a plethora of examples of how to use the library.

Fast Excel

https://github.com/Paxa/fast_excel

This library focuses on being the fastest excel library for ruby. It is actually written in C to speed it up so comes with all the caveats about running native code. Similar to CAXLSX it is only able to read and write files and not modify them.

require 'fast_excel'

  # constant_memory: true streams changes to disk so it means that you cannot
  # modify an already written record
workbook = FastExcel.open("test.xlsx", constant_memory: true)
worksheet = workbook.add_worksheet("Test")

bold = workbook.bold_format
worksheet.set_column(0, 0, FastExcel::DEF_COL_WIDTH, bold)
worksheet << ["Hello World"]
workbook.close

As you can see here the library really excels at adding consistently shaped rows. You’re unlikely to get a complex spreadsheet with headers and footers built using this tooling.