Simon Online

2025-11-05

Jujutsu Cheat Sheet

I’ve started playing around a bit with the source control tool Jujutsu, which is commonly referred to as jj. Git has been my go-to tool for what seems like decades now, but in the before times I worked as a release engineer and made use of a huge stable of source control tools, since our code base was spread over many versions and had been built up by purchasing lots of other companies. For a while there I was working on a daily basis with:

  • ClearCase
  • Perforce
  • Subversion
  • CVS
  • Visual Source Safe
  • Mercurial
  • Git
  • CCC/Harvest

I’m using Jujutsu at my day job now because it layers transparently on top of git, so I don’t need to go seeking permission. I’m only a few days into using it and I’m not thoroughly convinced yet that it is better than git, but I’m willing to keep trying.

Here are some of the commands I’m using so far:

Get the latest version of the code from a central repository locally

jj git fetch

Start new work from the latest mainline

jj new main@origin -m "Whatever I'm going to work on"

Bookmark the work with a name I’m going to use as a branch in git

jj bookmark create my-feature-branch

Push my work up to GitHub

jj git push --allow-new

Create a new commit before my current one that I can squash into

jj new -B @ -m "Some description of the work"

Squash individual files into the parent change

jj squash path/to/file1 path/to/file2
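
Putting those together, the loop I find myself repeating looks roughly like this (the message and bookmark name are just placeholders):

jj git fetch
jj new main@origin -m "Whatever I'm going to work on"
# ...edit files...
jj bookmark create my-feature-branch
jj git push --allow-new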

I’ll keep expanding this document with new commands as I uncover them.

2025-07-12

Postgres Data Masking and Anonymization

Predicting the performance of a web application is always a little bit difficult. If it’s a question of how the site will perform under load then you can use things like Artillery to throw requests at it. But sometimes problems arise from increasing amounts of data. This was the case for a site I helped develop a little while back.

It had been in production for a couple of years and was starting to have problems with some of the queries. I profiled some of them and found some query optimizations which cleared it up. But it was annoying that this hadn’t been caught earlier and that it was difficult to replicate in lower environments because they simply didn’t have enough data.

We could have generated data but for any sort of complex data model it’s difficult to create realistic data. Fortunately I knew of a place we could get really well structured data which would have the same performance profile as production: production. But this data contains sensitive information like phone numbers, addresses, names and salaries. I didn’t want to just copy that over to the lower environments so I needed a way to clean up the data.

Initially I started with just throwing FakerJS at the data, but the performance of updating every row with new values was not great. After some research I found the Postgres extension PostgreSQL Anonymizer which looked like it would fit the bill.

PostgreSQL Anonymizer

PostgreSQL Anonymizer is a Postgres extension which allows you to mask data in a variety of ways. It can be used to anonymize data or to mask it. It has a number of built-in functions for common tasks like replacing names with random names, addresses with random addresses and so on.

Getting it installed on Azure Postgres Flexible Server was a little tricky and I’ll probably post on that in a separate article. But once it was installed it was easy to use.

There are a couple of modes it can run in: dynamic and static. In the dynamic mode it will leave the data in the table as-is but will apply rules based on the current user role to mask the data when queried. This is a pretty handy thing and you could use it for something like masking SSNs for any user other than the admin. Static mode will actually update the data in the table to a new value, per your rules. This is what I opted for, as by default the masking wasn’t deterministic.
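
For reference, turning on the dynamic mode looks roughly like this (a sketch based on my reading of the extension’s docs; the database and role names here are made up):

# Mark a role as masked, then start dynamic masking so that role only sees masked values
psql -d mydb -c "SECURITY LABEL FOR anon ON ROLE qa_user IS 'MASKED';"
psql -d mydb -c "SELECT anon.start_dynamic_masking();"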

Using PostgreSQL Anonymizer

The script I put together to run after I had restored a backup into the test environment looked like this:

-- Start up the extension
CREATE EXTENSION anon;

-- Rules
SECURITY LABEL FOR anon ON COLUMN household.contact_email
  IS 'MASKED WITH FUNCTION anon.fake_email()';
SECURITY LABEL FOR anon ON COLUMN household.contact_phone
  IS 'MASKED WITH FUNCTION anon.random_int_between(10000000,90000000)';

SECURITY LABEL FOR anon ON COLUMN address.street1
  IS 'MASKED WITH FUNCTION anon.fake_address()';
SECURITY LABEL FOR anon ON COLUMN address.street2
  IS 'MASKED WITH FUNCTION anon.fake_address()';

...

-- Anonymize the database statically
SELECT anon.anonymize_database();

You can see the sections there: first enable the extension, then create a series of rules which will replace the data in the columns with fake data. Finally SELECT anon.anonymize_database(); will actually kick off the changes. A few seconds of crunching later the data is all faked up and we can hand over the database to QA or developers without having to worry about sensitive data leaking.
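
I run that script with plain psql after restoring the backup, something along these lines (the file and database names are placeholders):

# Apply the masking rules and rewrite the data in place
psql -d my_test_db -f anonymize.sql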

2025-03-30

Open API Generator for C#

Every once in a while I run into the need to generate a C# client for some API which has been nice enough to provide me with OpenAPI specifications. But it’s one of those things that I do so infrequently that I always forget how to do it. So I thought I would document it here.

The first thing to do is to install the OpenAPI generator. It’s written in Java, which is obviously a decision I’m solidly against. As I run a Mac most of the time I prefer to use brew, since it’s easier than trying to figure out how to build stuff with Maven.

brew install openapi-generator

Now comes the fun part: figuring out the options to use. There are a bunch of different generators for different languages and, on top of that, each generator has its own options. The C# generator is called csharp and the options are described in some detail here: https://openapi-generator.tech/docs/generators/csharp. In my case I was looking to generate a client for a US Government API. They seem to be really snippy about people getting their hands on documentation without going through a heap of hoops, so we’ll just call it USGovAPI, because this administration is not one I want to be on the wrong side of.
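
If you’d rather not dig through the website, the generator can list what’s available and the options for a given generator itself:

# List the generators, then show the config options for the C# one
openapi-generator list
openapi-generator config-help -g csharp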

In addition I wanted to generate one for .NET 8 because the project is still running on the long-term support version of .NET. So the command ended up being

openapi-generator generate -i swagger.json -g csharp -o out/usgoveapi --additional-properties=packageName=USGovAPI,targetFramework=net8.0

And with that we get a nice little C# client that we can use to call the USGovAPI. It even includes some unit tests to go along with it.

2024-11-20

Limit Dependabot to .NET 8

Just last week .NET 9 was released to much fanfare. There are a ton of cool and exciting things in it, but for my current project I want to stick to a long term support version of .NET, which is .NET 8. We might update later but for now 8 is great. Unfortunately dependabot isn’t able to read my mind, so it was continually proposing updates to .NET 9 packages.

Fixing this is easy enough. I needed to add a couple of lines to my dependabot file to limit the sorts of updates it did to just minor and patch updates. Notice the update-types section.

updates:
  - package-ecosystem: "nuget"
    directory: "/."
    groups:
      all_packages:
        update-types:
          - "minor"
          - "patch"
        patterns:
          - "*"
    open-pull-requests-limit: 50
    schedule:
      interval: "weekly"

With this in place dependabot is only proposing minor and patch updates to my .NET packages. It does mean that if there are major version updates to non-Microsoft packages we’ll have to update them manually.

2024-11-14

RavenDB on Kubernetes

I needed to get Particular Service Control up and running on our k8s cluster this week. Part of that is getting an instance of RavenDB running in the cluster, and this actually caused me a bit of trouble. I kept running into problems where RavenDB would start up but then report that it could not access the data directory. What was up?

I tried overriding the entry point for the container and attaching to it to see what was going on, but I couldn’t see anything wrong. I was able to write to the directory without issue. Eventually I stumbled on a note in the RavenDB documentation which mentioned a change in the 6.x version of RavenDB: Raven no longer runs as root inside the container.

K8S has the ability to change the group ownership of a volume to match the container, which is done by setting the fsGroup property in the pod spec. In this case Raven runs as UID 999. So I updated my Tanka spec to include the fsGroup property and the problem was solved.

...
deployment: deployment.new($._config.containers.ravendb.name) {
      metadata+: {
        namespace: 'wigglepiggle-' + $._config.environment,
        labels: {
          app: $._config.containers.ravendb.name,
        },
      },
      spec+: {
        replicas: 1,
        selector: {
          matchLabels: $._config.labels,
        },
        template: {
          metadata+: {
            labels: $._config.labels,
          },
          spec+: {
            securityContext: {
              fsGroup: 999,
              fsGroupChangePolicy: 'OnRootMismatch',
            },
            containers: [
              {
                name: $._config.containers.ravendb.name,
                image: $._config.containers.ravendb.image,
                ports: [{ containerPort: 8080 }, { containerPort: 38888 }],
                volumeMounts: [
                  {
                    name: 'data',
                    mountPath: '/var/lib/ravendb/data',
                  },
                ],
                env: [
                  {
                    name: 'RAVEN_Setup_Mode',
                    value: 'None',
                  },
                  {
                    name: 'RAVEN_License_Eula_Accepted',
                    value: 'true',
                  },
                  {
                    name: 'RAVEN_ARGS',
                    value: '--log-to-console',
                  },
                  {
                    name: 'RAVEN_Security_UnsecuredAccessAllowed',
                    value: 'PrivateNetwork',
                  },
                ],
              },
            ],
            volumes: [
              {
                name: 'data',
                persistentVolumeClaim: {
                  claimName: $._config.containers.ravendb.name,
                },
              },
            ],
          },
        },
      },
    },
...

This generated YAML like:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: service-control-ravendb
  name: service-control-ravendb
  namespace: wigglepiggle-dev
spec:
  replicas: 1
  selector:
    matchLabels:
      app: service-control
      environment: dev
  template:
    metadata:
      labels:
        app: service-control
        environment: dev
    spec:
      containers:
      - env:
        - name: RAVEN_Setup_Mode
          value: None
        - name: RAVEN_License_Eula_Accepted
          value: "true"
        - name: RAVEN_ARGS
          value: --log-to-console
        - name: RAVEN_Security_UnsecuredAccessAllowed
          value: PrivateNetwork
        image: ravendb/ravendb:6.0-latest
        name: service-control-ravendb
        ports:
        - containerPort: 8080
        - containerPort: 38888
        volumeMounts:
        - mountPath: /var/lib/ravendb/data
          name: data
      securityContext:
        fsGroup: 999
        fsGroupChangePolicy: OnRootMismatch
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: service-control-ravendb
2024-11-10

Where is the disk space?

I had a report today from somebody unable to log into our Grafana instance. Super weird because this has been running fine for months and we haven’t touched it. So I jumped onto the machine to see what was up. First up was just looking at the logs from Grafana.

docker logs -f --tail 10 5efd3ee0074a

There in the logs was the culprit: No space left on device. Uh oh, what’s going on here? Sure enough the disk was full.

df -h
Filesystem                         Size  Used Avail Use% Mounted on
tmpfs                              1.6G  1.9M  1.6G   1% /run
/dev/mapper/ubuntu--vg-ubuntu--lv   96G   93G     0 100% /
tmpfs                              7.8G     0  7.8G   0% /dev/shm
tmpfs                              5.0M     0  5.0M   0% /run/lock
/dev/sda2                          2.0G  182M  1.7G  10% /boot
tmpfs                              1.6G   12K  1.6G   1% /run/user/1001

This stuff is always annoying because you get to a point where you can’t run any commands because there is no space left. I started by cleaning up some of the smaller parts of docker

docker system prune -a

Then cleaned up docker logs

sudo truncate -s 0 /var/lib/docker/containers/**/*-json.log

This then gave me enough space to run docker system df and see where the space was being used. Containers were the culprit. So next was to run

docker ps --size

Which showed me the web scraper container had gone off the rails and was using over 100 GiB of space.

7cf14084c56a   webscraper:latest       "wsce start -v -y"       7 weeks ago   Up 20 minutes               0.0.0.0:7002->9924/tcp, [::]:7002->9924/tcp                                webscraper       123GB (virtual 125GB)

This thing is supposed to be stateless so I just killed and removed it.

docker kill 7cf14084c56a
docker rm 7cf14084c56a
docker compose up -d webscraper

After a few minutes these completed and all was good again. So we’ll keep an eye on that service and perhaps reboot it every few months to keep it in check.
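
If it keeps doing this, one option is to recreate the container on a schedule rather than waiting for the disk to fill up again. A rough sketch as a cron entry (the compose project path is a placeholder, and this assumes the container really is stateless):

# At 04:00 on the first of the month, throw away the container and start a fresh one
0 4 1 * * cd /path/to/compose/project && docker compose rm -sf webscraper && docker compose up -d webscraper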

2024-11-05

Consuming Github Packages in Yarn

My life is never without adventure but unfortunately it isn’t the living-on-a-beach sort of adventure. No, it’s the installing-yarn-packages sort. I wanted to have a package installed in my project which was one I’d published from another repository. In this case the package was called @stimms/uicomponents. There were a few tricks to getting GitHub Actions to be able to pull the package: first I needed to create a .yarnrc.yml file. This gives yarn instructions about where it should look for packages.

nodeLinker: node-modules

npmScopes:
  stimms:
    npmRegistryServer: "https://npm.pkg.github.com"

npmRegistries:
  "https://npm.pkg.github.com":
    npmAlwaysAuth: true

Now in the build I needed to add a step to populate the GITHUB_TOKEN which can be used for authentication. I found quite a bit of documentation which suggested that the .yarnrc.yml file would be able to read the environment variable, but I had no luck with that approach. Instead I added a step to the build to write the GITHUB_TOKEN into the .npmrc file.

- name: Configure GitHub Packages Auth
  run: echo "//npm.pkg.github.com/:_authToken=${GITHUB_TOKEN}" > ~/.npmrc
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- name: Build
  run: |
    yarn install
    yarn lint
    yarn build

The final thing to remember is that by default the GITHUB_TOKEN here doesn’t have read permission over your packages. You’ll need to go into the package settings and add the repository to the list of repositories which can use the package. You just need read access. If you don’t do this step you’re going to see an error like error Error: https://npm.pkg.github.com/@stimms%2fuicomponents: authentication token not provided

2024-10-12

Fast Endpoints Listen Port

In order to set the listening port for Fast Endpoints you can use the same mechanism as a regular ASP.NET application. This involves setting the Urls setting in the appsettings.json file. My file looks like this:

{
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft.AspNetCore": "Warning"
    }
  },
  "Urls": "http://0.0.0.0:8080",
  "AllowedHosts": "*"
}
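
Since Fast Endpoints sits on top of a regular ASP.NET Core host, the usual overrides should work here too, for example from the command line or an environment variable (standard ASP.NET Core behaviour rather than anything Fast Endpoints specific):

# Either of these overrides the Urls value from appsettings.json
dotnet run --urls "http://0.0.0.0:8080"
ASPNETCORE_URLS=http://0.0.0.0:8080 dotnet run
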
2024-09-15

Update outdated Nuget packages

If you’re using Visual Studio Code to develop C# applications, you might need to update outdated Nuget packages. You can do that without having to do each one individually on the command line using dotnet outdated

Install it with

dotnet tool install --global dotnet-outdated-tool

Then you can run it in the root of your project to list the packages which will be updated with

dotnet outdated

Then, if you’re happy, run it again with

dotnet outdated -u

to actually get everything updated.

2024-09-14

NServiceBus Kata 6 - When things go wrong

So far in this series things have been going pretty well. We’ve looked at sending messages, publishing messages, switching transports, long running processes, and timeouts. But what happens when things go wrong? In this kata we’re going to look at how to handle errors in NServiceBus.

Read More