2015-10-04

Yet another intro to docker

You would think that there were enough introductions to Docker out there already to convince me that the topic is well covered and another one is unnecessary. Unfortunately the sickening mix of hubris and stubbornness that endears me so to rodents also makes me believe I can contribute.

In my case I want to play a bit with the ELK stack: that’s Elasticsearch, Logstash and Kibana. I could install these all directly on the macbook that is my primary machine but I actually already have a copy of Elasticsearch installed and I don’t want to pollute my existing environment. Thus the very 2015 solution that is docker. If you’ve missed hearing the noise about docker over the last year then you’re in for a treat.

The story of docker is the story of isolating your software so that one piece of software doesn’t break another. This isn’t a new concept and one could argue that really that’s what kernel controlled processes do. Each process has its own memory space and, as far as the process is concerned, the memory space is the same as the computer’s memory space. However the kernel is lying to the process and is really remapping the memory addresses the program is using into the real memory space. If you consider the speed of processors today and the ubiquity of systems capable of running more than one process at a time then, as a civilization, we are lying at a rate several orders of magnitude greater than any other point in human history.

Anyway, docker extends the process isolation model such that the isolation is stronger. Docker is a series of tools built on top of the Linux kernel. The entire file system is now abstracted away, networking is virtualized, other processes are hidden and, in theory, it is impossible to break out of a container and damage other processes on the same machine. In practice everybody is very open about how it might be possible to break out of a container or, at the very least, gather information from the system running the container. Containers are a weaker form of isolation than virtual machines.

http://imgur.com/ntGolVE.png

On the flip side processes are more performant than containers which are, in turn, more performant than virtual machines. The reason is simple: with more isolation more things need to run in each context, bogging the machine down. Choosing an isolation level is an exercise in deciding how much you trust the processes you run not to interfere with other things. In the scenario where you’ve written all the services then you can have a very high level of trust in them and run them with minimal isolation in a process. If it is SAP then you probably want the highest level of isolation possible: put the computer in a box and fire it to the moon.

Another nice feature of docker is that the containers can be shipped as a whole. They tend not to be prohibitively large as you might see with a virtual machine. This vastly improves the ease of deployment. In a world of micro-services it is easy to bundle up your services and ship them off as images. You can even have the result of your build process be a docker image.

The degree to which docker will change the world of software development and deployment remains an open question. While I feel like docker is a fairly disruptive technology the impact is still a couple of years out. I’d like to think that it is going to put a bunch of system administrators out of a job but in reality it is just going to change their job. Everybody needs a little shakeup now and then to keep them on their toes.

Anyway back to docker on OSX:

If you read carefully to this point you might have noticed that I said that docker runs on top of the Linux kernel. Of course OSX doesn’t have a Linux kernel on which you can run docker. To solve this we actually run docker on top of a small virtual machine. To manage this we used to use a tool called boot2docker but this has, recently, been replaced with docker-machine.

I had an older install of docker on my machine but I thought I might like to work a bit with docker compose as I was running a number of services. Docker compose allows for coordinating a number of containers to set up a whole environment. In order to keep the theme of isolating services it is desirable to run each service in its own container. So if you imagine a typical web application we would run the web server in one container and the database in another one. These containers can be on the same machine.

Thus I grabbed the installation package from the docker website then followed the installation instructions at http://docs.docker.com/mac/step_one/. With docker installed I was able to let docker-machine create a new virtual machine in virtual box.

http://i.imgur.com/5uQjfq8.jpg

All looks pretty nifty. I then kicked off the ubiquitous hello-world image:

~/Projects/western-devs-website/_posts$ docker run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world

535020c3e8ad: Pull complete 
af340544ed62: Pull complete 
Digest: sha256:a68868bfe696c00866942e8f5ca39e3e31b79c1e50feaee4ce5e28df2f051d5c
Status: Downloaded newer image for hello-world:latest

Hello from Docker.
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker Hub account:
 https://hub.docker.com

For more examples and ideas, visit:
 https://docs.docker.com/userguide/
 

It is shocking how poorly implemented this image is: notice that at no point does it actually just print “Hello World”. Don’t worry, though, not everything in docker land is so poorly implemented.

This hello world demo is kind of boring so let’s see if we can find a more exciting one. I’d like to serve a web page from the container. To do this I’d like to use nginx. There is already an nginx image so I can create a new Dockerfile that builds on it. A Dockerfile gives docker instructions about how to build a new image, typically on top of an existing one. The Dockerfile here contains

FROM nginx
COPY *.html /usr/share/nginx/html/

The first line sets the base image on which we want to base our container. The second line copies the local files with the .html extension to the web server directory inside the nginx image. To use this file we’ll have to build a docker image

/tmp/nginx$ docker build -t nginx_test .
Sending build context to Docker daemon 3.072 kB
Step 0 : FROM nginx
latest: Pulling from library/nginx
843e2bded498: Pull complete 
8c00acfb0175: Pull complete 
426ac73b867e: Pull complete 
d6c6bbd63f57: Pull complete 
4ac684e3f295: Pull complete 
91391bd3c4d3: Pull complete 
b4587525ed53: Pull complete 
0240288f5187: Pull complete 
28c109ec1572: Pull complete 
063d51552dac: Pull complete 
d8a70839d961: Pull complete 
ceab60537ad2: Pull complete 
Digest: sha256:9d0768452fe8f43c23292d24ec0fbd0ce06c98f776a084623d62ee12c4b7d58c
Status: Downloaded newer image for nginx:latest
 ---> ceab60537ad2
Step 1 : COPY *.html /usr/share/nginx/html/
 ---> ce25a968717f
Removing intermediate container c45b9eb73bc7
Successfully built ce25a968717f

The docker build command starts by pulling down the already built nginx image. Then it copies our files over and reports a hash for the new image which makes it easily identifiable. To run a container from this image we need to do

/tmp/nginx$ docker run --name simple_html -d -p 3001:80 -p 3002:443 nginx_test

This instructs docker to run a container from the nginx_test image and call it simple_html. The -d tells docker to run the container in the background and finally the -p flags give the ports to forward. In this case we would like our local machine’s port 3001 to be mapped to port 80 inside the container, the normal web server port. So now we should be able to connect to the web server. If we open up chrome and go to localhost:3001 we get

http://i.imgur.com/8Hdq9hN.jpg

Well that doesn’t look right! The problem is that docker doesn’t realize that it is being run in a virtual machine so we need to forward the port from the vm to our local machine

Docker container:80 -> vm host:3001 -> OSX:3001

This is easily done from the virtual machine manager

http://i.imgur.com/cGXHwRZ.jpg

Now we get

http://i.imgur.com/h8UJTSN.jpg

This is the content of the html file I put into the container. Perfect! I’m now ready to start playing with more complex containers.
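When a forwarded port misbehaves like this it can help to probe each hop in the chain separately. Here is a minimal Python sketch of that idea. It is not part of my original setup: the helper name `port_is_open` is made up for illustration, and you would need to supply the VM's address yourself (I believe `docker-machine ip` followed by the machine name prints it).

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when nothing is reachable on the other end.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `port_is_open("localhost", 3001)` checks the OSX end of the chain, and the same call with the VM's address checks the middle hop. If the VM answers and localhost doesn't, the missing piece is most likely the VirtualBox forwarding rule.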

Tip

One thing I have found is that running docker in virtual box at the same time as running parallels causes the whole system to hang. I suspect that running two different virtual machine tools is too much for something and a conflict results. I believe there is an effort underway to bring parallels support to docker-machine for the 0.5 release. Until then you can read http://kb.parallels.com/en/123356 and look at the docker-machine fork at https://github.com/Parallels/docker-machine.

2015-09-02

Running Process As A Different User On Windows

As part of a build pipeline I’m working on the octopus deploy process needs to talk to the database using roundhouse as a different user from the one running the deployment. This is done because the database uses integrated AD authentication, which I quite like. If this build were running on Linux then it would be as simple as editing the sudoers file and calling the command using sudo. Unfortunately this is Windows and the command line has long been a secondary concern.

I started by asking on the western devs slack channel to see if anybody else had done this and how. Dave Paquette suggested using psexec. This is a tool designed for running commands on a remote computer but if you leave the computer name off it will run on the local machine. This sounded perfect.

However I had a great deal of trouble getting psexec to work in the way I wanted. The command I wanted to run seemed to fail all the time giving a confusing error code -1073741502. The fix provided didn’t seem to work for me so after an afternoon of bashing my head against psexec I went looking for another solution. Running remote processes gave me an idea: what about powershell remoting?
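As an aside, negative exit codes like that one are usually Windows status codes printed as signed 32-bit integers, and they are much easier to search for in hex. A small Python sketch of the conversion follows; the function name `exit_code_to_hex` is my own invention.

```python
def exit_code_to_hex(code):
    """Reinterpret a signed 32-bit exit code as an unsigned hex string."""
    # Masking with 0xFFFFFFFF gives the two's complement bit pattern.
    return f"0x{code & 0xFFFFFFFF:08X}"


print(exit_code_to_hex(-1073741502))  # prints 0xC0000142
```

As far as I know 0xC0000142 is what Windows calls STATUS_DLL_INIT_FAILED, which at least gives you a better search term than the raw number.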

Some investigation suggested that the command I wanted to run would look like

Invoke-command localhost -scriptblock { rh.exe --some-parameters }

This would remote to localhost and run the roundhouse command as the current user. To get it to work as a different user the command needed credentials passed into it. I had the credentials stored as sensitive variables in Octopus which set them up as variables in powershell. To turn these into credentials you need to do

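# $pwd is an automatic variable (the current directory) so use a different name
$securePassword = ConvertTo-SecureString $deployPassword -AsPlainText -Force
$cred = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $deployUser,$securePassword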

Now these can be passed into invoke command as

Invoke-command localhost -authentication credssp -Credential $cred -scriptblock { rh.exe --some-parameters }

You might notice the authentication flag. This tells powershell the sort of authentication to use, and for credssp you also need to enable the Credential Security Support Provider. To do this we run

Enable-WSManCredSSP -Role server
Enable-WSManCredSSP -Role client -DelegateComputer "*"

Run these from an admin powershell session on the machine. Normally you would run these on different machines but we’re remoting to localhost so it is both the client and the server.

Finally I needed to pass some parameters to roundhouse proper.

Invoke-command localhost -authentication credssp -Credential $cred -scriptblock { param($roundHouseExe,$databaseServer,$targetDb,$databaseEnvironment,$fName) & "$roundHouseExe" --s=$databaseServer --d="${targetDb}"  --f=$fName /ni --drop } -argumentlist $roundHouseExe,$databaseServer,$targetDb,$databaseEnvironment,$fName

2015-08-28

Ooops, Repointing Git Head

I screwed up. I force pushed a branch but I forgot to tell git which branch to push so it clobbered another branch.

C:\code\project [feature/feature27]> git push -f
Password for 'http://simon@remote.server.com:7990':
Counting objects: 63, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (61/61), done.
Writing objects: 100% (63/63), 9.25 KiB | 0 bytes/s, done.
Total 63 (delta 50), reused 0 (delta 0)
To http://simon@remote.server.com:7990/scm/ev/everest.git
 + 0baa5b8...e9a1c19 develop -> develop (forced update)  <--oops!
 + dbe6fce...5557ae7 feature/feature27 -> feature/feature27 (forced update)

Drat, since I hadn’t updated develop in a few hours there were a bunch of changes in it that I just killed. Fortunately I know that git is really just a glorified linked list and that nothing is ever deleted. I just needed to update where the head pointer was pointing. I grabbed the SHA of the latest develop commit from the build server knowing that it was late at night and nobody else was likely to have snuck a commit into develop that the server missed.

Then I just force updated my local develop and pushed it back up

git branch -f develop bbff5b810a19383fb11950a5d1e36676dd3ca85d  <-- sha from build server
git push

All was good again.
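If the linked list remark seems glib, here is a toy model of it in Python. This is only a sketch of the idea, not how git is actually implemented, and the SHAs and names are made up.

```python
class Commit:
    """A commit that only knows its parent, like a node in a linked list."""

    def __init__(self, sha, parent=None):
        self.sha = sha
        self.parent = parent


def history(commit):
    """Walk parent pointers from a commit back to the root."""
    shas = []
    while commit is not None:
        shas.append(commit.sha)
        commit = commit.parent
    return shas


# Made-up SHAs standing in for real ones.
root = Commit("a1")
stale = Commit("b2", parent=root)    # my out-of-date local develop
latest = Commit("c3", parent=stale)  # what the build server last saw

commits = {c.sha: c for c in (root, stale, latest)}
branches = {"develop": latest}

# The bad force push: the branch pointer moves, the commits stay put.
branches["develop"] = stale

# The fix, in the spirit of `git branch -f develop <sha>`.
branches["develop"] = commits["c3"]
print(history(branches["develop"]))  # prints ['c3', 'b2', 'a1']
```

A branch is just a name pointing at a commit, so moving it back loses nothing. Real git does eventually garbage collect commits that nothing points to, so this sort of rescue is best done sooner rather than later.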

2015-08-14

Azure Point in Time Restore Is Near Useless

About a year ago Microsoft rolled out Azure point in time restore on their SQL databases. The idea is that you can restore your database to any point in time from the last little while (how long ago you can restore from is a function of the database scale). This means that if something weird happened to your data 8 hours ago you can restore back to that point. It even supports restoring databases that have recently been deleted.

My reading of the marketing material around this feature is that it is meant to replace full database backups in a number of scenarios. In fact if you go to do a database export you’re warned about the performance implications and that point in time restore is much preferred. The problem is that it is slow.

Cripplingly. Shockingly. Amazingly. Slow.

The database I’m working with is about 140MiB as a backup file and just shy of 700MiB when deployed on a database server. Downloading and restoring the database on my laptop, a 3 year old macbook pro running an ancient version of Parallels, takes between 6 and 10 minutes. Not a huge deal.

On azure I have some great statistics because restoring the database is part of our QA process. Since I switched from restoring nightly backups to using point in time restores I’ve done 45 builds. Of these, 6 failed to complete the restore before I gave up, which usually takes a day. The rest are distributed like this, in minutes:

[Scatter plot of restore times, in minutes]

As you can see, 23 of the restores, or 59% of those that completed, took more than 50 minutes. There are a few there that are creeping up on 5 hours. That is insane. This is a very small database when you consider that these S1 databases scale to 250gig. Even if we round our database up to a gig, take our fastest restore at 7 minutes and scale it up linearly, then this is a 29 hour restore process. What sort of a business can survive a 29 hour outage? If we take the longest then it is 47 days. By that time the company’s assets have been sold at auction and the shareholders have collected 10 cents on the dollar.
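For anybody who wants to check my back-of-the-envelope math, here it is as a few lines of Python. Two things in it are assumptions rather than measurements: that restore time scales linearly with database size (with our database rounded up to a gig), and that the slowest restore took about 270 minutes, a figure I picked to match "creeping up on 5 hours".

```python
TOTAL_BUILDS = 45
FAILED = 6
SLOW = 23  # restores that took more than 50 minutes

completed = TOTAL_BUILDS - FAILED
slow_share = SLOW / completed
print(f"{slow_share:.0%} of completed restores were slow")  # prints 59% of completed restores were slow

# Assumption: our database rounds up to 1 gig, so a full 250 gig database
# is 250 times larger and restore time grows linearly with size.
SCALE = 250


def extrapolate_minutes(observed_minutes, scale=SCALE):
    """Scale an observed restore time up to a full-size database."""
    return observed_minutes * scale


fastest_hours = extrapolate_minutes(7) / 60
print(f"fastest: {fastest_hours:.0f} hours")  # prints fastest: 29 hours

# Assumption: 270 minutes stands in for the slowest restore.
slowest_days = extrapolate_minutes(270) / 60 / 24
print(f"slowest: {slowest_days:.0f} days")  # prints slowest: 47 days
```

Linear scaling is almost certainly too simple a model, but even if it is off by a wide margin the conclusion doesn't change much.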

When I first set this process up it was on a web scale database and used a backup file. The restore consistently took 15 minutes. Then standard databases were released and the restore time increased to a consistent 40 minutes. Now I’m unable to tell the QA team to within 4 hours when the build will be up.

Fortunately I have a contact on the Azure SQL team who I pinged about the issue. Apparently this is a known issue and a fix is being rolled out in the next few weeks. I really hope that is the case because in the current configuration point in time restores are so slow and inconsistent that they’re in effect useless for disaster recovery scenarios and even for testing scenarios.

2015-08-12

Setting up an IIS Site Using PowerShell

The cloud has been such an omnipresent force in my development life that I’d kind of forgotten that IIS even existed. There are, however, some companies that either aren’t ready for the cloud or have legitimate legal limitations that make using the cloud difficult.

This doesn’t mean that we should abandon some of the niceties of deploying to the cloud such as being able to promote easily between environments. As part of being able to deploy automatically to new environments I wanted to be able to move to a machine that had nothing but IIS installed and run a script to do the deployment.

I was originally thinking about looking into PowerShell Desired State Configuration but noted brain-box Dylan Smith told me not to bother. His feeling was that it was a great idea whose time had come but the technology wasn’t there yet. Instead he suggested just using PowerShell proper.

Well okay. I had no idea how to do that.

So I started digging. I found that PowerShell is really pretty good at setting up IIS. It isn’t super well documented, however. The PowerShell documentation is crummy in comparison with stuff in the .net framework. I did hear on an episode of Dot Net Rocks that the UI for IIS calls out to PowerShell for everything now. So it must be possible.

The first step is to load in the powershell module for IIS

Import-Module WebAdministration

That gives us access to all sorts of cool IIS stuff. You can get information on the current configuration by cding into the IIS namespace.

C:\WINDOWS\system32> cd IIS:
IIS:\> ls

Name
----
AppPools
Sites
SslBindings

Well that’s pretty darn cool. From here you can poke about and look at the AppPools and sites. I was told by fellow Western Dev Don Belcham that I should have one AppPool for each application so the first step is to create a new AppPool. I want to be able to deploy over my existing deploys so I have to turf it first.

if(Test-Path IIS:\AppPools\CoolWebSite)
{
    echo "App pool exists - removing"
    Remove-WebAppPool CoolWebSite
    gci IIS:\AppPools
}
$pool = New-Item IIS:\AppPools\CoolWebSite

This particular site needs to run as a particular user instead of the AppPoolUser or LocalSystem or anything like that. The user name and password will be passed in as variables. We need to set the identity type to the confusing value of 3. This maps to using a specific user. The documentation on this is near impossible to find.

$pool.processModel.identityType = 3
$pool.processModel.userName = $deployUserName
$pool.processModel.password = $deployUserPassword
$pool | set-item

Opa! We have an app pool. Next up a website. We’ll follow the same model of deleting and adding. Really this delete block should be executed before adding the AppPool.

if(Test-Path IIS:\Sites\CoolWebSite)
{
    echo "Website exists - removing"
    Remove-WebSite CoolWebSite
    gci IIS:\Sites
}

echo "Creating new website"
New-Website -name "CoolWebSite" -PhysicalPath $deploy_dir -ApplicationPool "CoolWebSite" -HostHeader $deployUrl    

The final step for this site is to change the authentication to turn off anonymous and turn on windows authentication. This requires using a setter to set individual properties.

Set-WebConfigurationProperty -filter /system.webServer/security/authentication/windowsAuthentication -name enabled -value true -PSPath IIS:\Sites\CoolWebSite

Set-WebConfigurationProperty -filter /system.webServer/security/authentication/anonymousAuthentication -name enabled -value false -PSPath IIS:\Sites\CoolWebSite

I’m not completely sure but I would bet that most other properties can also be set via this cmdlet.

Well that’s all pretty cool. I think I will still investigate PowerShell DSC because I really like the idea of specifying the state I want IIS to be in and having something else figure out how to get there. This is especially true for finicky things like setting authentication.

2015-08-07

Change Management for the Evolving World

I’ve had this blog post percolating for a while. When I started it I was working for a large company that had some internal projects I was involved with deploying. I came to the project with a background in evolving projects rapidly. It has been my experience that people are not upset that software doesn’t work so much as they are upset that when they discover a bug it isn’t fixed promptly.

Velocity is the antidote to toxic bugs

Unfortunately the company had not kept up with the evolution of thinking in software deployment. Any change that needed to go in had to pass through the dreaded change management board. This slowed down deployments like crazy. Let’s say that somebody discovered a bug on a Tuesday morning. I might have the fix figured out by noon. Well that’s a problem because noon is the cut off for the change management meeting which is held at 5pm local time. So we’ve missed the change management for this week, but we’re on the agenda for next week.

Day 7.

The change management meeting comes around again and a concern is raised that the change might have a knock on effect on another system. Unfortunately the team responsible for that system isn’t on this call so this change is shelved until that other team can be contacted. We’ll put this change back on the agenda for next week.

Day 14.

Change management meeting number 2. The people responsible for the other system are present and confirm that their system doesn’t depend on the altered functionality. We can go ahead with the change! Changes have to go in on Fridays after noon, giving the weekend to solve any problems that arise. This particular change can only be done by Liz and Liz has to be at the dentist on Friday. So we’ll miss that window and have to deploy during the next window.

Day 24.

Deployment day has arrived! Liz runs the deployment and our changes are live. The minor problem has been solved in only 24 days. Of course during that time the user has been hounding the team on a daily basis, getting angrier and angrier. Everybody is pissed off and the business has suffered.

##Change management is a difficult problem.

There is a great schism between development and operations. The cause of this is that the teams have seemingly contradictory goals. Development is about changing existing applications to address a bug or a changing business need. For the development team to be successful they must show that they are improving the product. Everything about development is geared towards this. Think of the metrics we might use around development: KLoCs, issues resolved, time to resolve an issue, and so forth. All of these are about improving the rate of change. Developers thrive on rapid change.

The goal of operations, on the other hand, is to keep everything running properly. Mail servers need to keep sending mail, web servers need to keep serving web pages and domain controllers need to keep doing whatever it is that they do, control domains one would assume. Every time there is a change to this system there is a good chance that something will break. This is why, if you wander into a server room, you’ll likely see a few machines that look like they were hand built by Grace Hopper herself. Most operations people see any change as a potential disturbance to the carefully crafted system they have built up. This is one of the reasons that change management boards and change management meetings have been created. They are perceived as gatekeepers around the system.

Personally I’ve never seen a change management board or meeting that really added any value to the process. Usually it slowed down deploying changes without really improving the testing around whether the changes would have a deleterious effect.

The truth of the matter is that figuring out what a change will do is very difficult. Complex systems are near impossible to model and predict. There is a whole bunch of research on the concept but it is usually easier to just link to

Let’s dig a bit deeper into the two sides of this issue.

##Why do we even want rapid change?

There are a number of really good reasons we’d like to be able to change our applications quickly

  1. Every minute spent with undesirable behaviour is costing the business money
  2. If security holes are uncovered then our chances of being attacked increase the longer it takes us to get a fix deployed
  3. Making smaller changes means that when something does go wrong the list of potential culprits is quite short

On the other hand we have pushing back

  1. We don’t know the knock on effect of this change
  2. The problem is costing the business money but is it costing more money than the business being shut down totally due to a big bug?

Secretly also pushing back is the fact that the ops team are really busy keeping things going. If a deployment takes a bunch of their time then they will be very likely to try to avoid doing it. I sure can’t blame them. Often “I’m too busy” is not an acceptable excuse in corporate culture so it is replaced with bogus technical restrictions or even readings of the corporate policies that preclude rapid deployments.

If we look at the push back there is a clear theme: deployments are not well automated and we don’t have good trust that things won’t break during a deployment.

##How can we remove the fear?

The fear that ops people have of moving quickly is well founded. It is these brave souls who are up at oh-my-goodness O’clock fixing issues in production. So the fear of deploying needs to be removed from the process. I’m sure there are all sorts of solutions based in hypnosis but to me the real solution is

If something hurts do it more often

Instead of deploying once a month or once every two weeks let’s deploy every single day, perhaps even more than once a day. After every deploy everybody should sit down and identify one part of the process that was painful. Take that one painful part and fix it for the next deploy. Repeat this process, involving everybody, after each deploy. Eventually you’ll pay off the difficult parts and all of a sudden you can deploy more easily and more often. It doesn’t take many successes before everybody becomes a believer.

##What do the devs need to do?
As a developer I find myself falling into the trap of believing that it is the ops people who need to change. This is only half the story. Developers need to become much more involved in the running of the system. This can take many forms:

  • adding better instrumentation and providing understanding of what this instrumentation does
  • being available and involved during deploys
  • assisting with developing tooling
  • understanding the sorts of problems that are faced in operations

Perhaps the most important thing for developers to do is to be patient. Change on this sort of a scale takes time and there is no magic way to just make everything perfect right away.

I firmly believe that the sort of change management we talked about at the start of the article is more theatre than practical. Sometimes it is desirable to show management that proper care and attention is being paid when making changes. Having really good test environments and automated tests is a whole lot better than the normal theatre, though.

It is time to remove the drama from deployments and close the Globe Theatre of deployments.

2015-07-30

Casting in Telerik Reports

Short post as I couldn’t find this documented anywhere. But if you need to cast a value inside the expression editor inside a Telerik Report then you can use the conversion functions

  • CBool
  • CDate
  • CDbl
  • CInt
  • CStr

I used it to cast the denominator here to get a percentage complete:

http://imgur.com/LE1hUUP.png

I also used the Format function to format it as a percentage. I believe the Format string here is passed directly into .net’s string format function so anything that works there will work here.

2015-07-22

Unit Conversions Done (Mostly) Right

Thanks to a certain country, which for the purposes of this blog let’s call Backwardlandia, and which uses a different unit system, there is frequently a need to use two wildly different units for some value. Temperature is a classic one: it could be represented in Centigrade, Fahrenheit, Kelvin or Rankine (that’s the absolute temperature scale, same as Kelvin, but using Fahrenheit degrees). Centigrade is a great, well devised unit that is based on the freezing and boiling points of water at one standard atmosphere. Fahrenheit is a temperature system based on the number of pigs’ heads you can fit in a copper kettle sold by some bloke on Fleet Street in 1832. Basically it is a disaster. Nonetheless Backwardlandia needs it and they have so many people and so much money that we can’t ignore them.

I cannot count the number of terrible approaches there are to doing unit conversions. Even the real pros get it wrong from time to time. I spent a pretty good amount of time working with a system that put unit conversions in between the database and the data layer in the stored procedures. The issue with that was that it wasn’t easily testable and it meant that directly querying the table could yield you units in either metric or imperial. You needed to explore the stored procedures to have any idea what units were being used. It also meant that any other system that wanted to use this database had to be aware of the, possibly irregular, units used within.

Moving the logic a layer away from the database puts it in the data retrieval logic. There could be a worse place for it but it does mean that all of your functions need to have the unit system in which they are currently operating passed into them. Your nice clean database retrievals become polluted with knowing about the units.

It would likely end up looking something like this:

public IEnumerable<Pipes> GetPipesForWell(int wellId, UnitSystem unitSystem)
{
    using(var connection = GetConnection()){
        var result = connection.Query<Pipes>("select id, boreDiameter from pipes where wellId=@wellId", new { wellId});
        return NormalizeForUnits(result, unitSystem);
    }
}

I’ve abstracted away some of the complexity with a magic function that accounts for the units and it is still a complex mess.

##A View Level Concern
I believe that unit conversion should be treated as a view level concern. This means that we delay doing unit conversions until the very last second. By doing this we don’t have to pass down the current unit information to some layer deep in our application. All the data is persisted in a known unit system (I recommend metric) and we never have any confusion about what the units are. This is the exact same approach I suggest for dealing with times and time zones. Everything that touches my database or any persistent store is in a common time zone, specifically UTC.
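To make the idea concrete, here is a minimal sketch in Python rather than the C# used elsewhere in this post. The function names and the "metric"/"imperial" labels are made up for illustration. The point is only that the stored value never changes and the conversion happens at the moment of display.

```python
def celsius_to_fahrenheit(celsius):
    """Standard Celsius to Fahrenheit conversion."""
    return celsius * 9 / 5 + 32


def format_temperature(celsius, unit_system):
    """Convert for display only; stored values always stay in Celsius."""
    if unit_system == "imperial":
        return f"{celsius_to_fahrenheit(celsius):.1f} F"
    if unit_system == "metric":
        return f"{celsius:.1f} C"
    raise ValueError(f"unknown unit system: {unit_system}")


stored_celsius = 21.5  # as persisted, always metric
print(format_temperature(stored_celsius, "metric"))    # prints 21.5 C
print(format_temperature(stored_celsius, "imperial"))  # prints 70.7 F
```

Nothing below the view needs to know which unit system the user prefers, which is exactly the property we were after.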

If you want to feel extra confident then stop treating your numbers as primitives and treat them as a value and a unit. Just by having the name of the type contain the unit system you’ll make future developers, including yourself, think twice about what unit system they’re using.

public class TemperatureInCentigrade{
    private readonly double _value;
    public TemperatureInCentigrade(double value){
        _value = value;
    }

    // Expose the raw number for arithmetic and for display at the view level
    public double AsNumeric()
    {
        return _value;
    }

    public TemperatureInCentigrade Add(TemperatureInCentigrade toAdd) 
    {
        return new TemperatureInCentigrade(_value + toAdd.AsNumeric());
    }
}

You’ll also notice in this class that I’ve made the value immutable. By doing so we save ourselves from a whole bunch of potential bugs. This is the same approach that functional programming languages take.

Having a complex type keep track of your units also protects you from taking illogical actions. For instance consider a unit that holds a distance in meters. The DistanceInMeters class would likely not contain a Multiply function or, if it did, the function would return AreaInSquareMeters. The compiler would protect you from making a lot of mistakes and this sort of thing would likely eliminate a bunch of manual testing.
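Here is roughly what that distance example might look like, sketched in Python with frozen dataclasses standing in for the immutable C# class. The method names are my own. Python has no compiler to enforce the types, so you would need a type checker such as mypy to get the protection described above; the immutability, however, is enforced at runtime.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AreaInSquareMeters:
    value: float


@dataclass(frozen=True)
class DistanceInMeters:
    value: float

    def add(self, other: "DistanceInMeters") -> "DistanceInMeters":
        # Length plus length is still a length.
        return DistanceInMeters(self.value + other.value)

    def multiply(self, other: "DistanceInMeters") -> AreaInSquareMeters:
        # Length times length is an area, so the return type changes.
        return AreaInSquareMeters(self.value * other.value)


width = DistanceInMeters(3.0)
height = DistanceInMeters(4.0)
print(width.multiply(height))  # prints AreaInSquareMeters(value=12.0)
```

Trying to assign to `width.value` after construction raises an error, which is the Python equivalent of the readonly field above.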

The actual act of converting units is pretty simple and there are numerous libraries out there which can do a very effective job for us. I am personally a big fan of the js-quantities library. This lets you push your unit conversions all the way down to the browser. Of course math in JavaScript can, from time to time, be flaky. For the vast majority of non-scientific applications the level of resolution that JavaScript’s native math supports is wholly sufficient. You generally don’t even need to worry about it.

If you’re not doing a lot of your rendering in JavaScript then there are libraries for .net which can handle unit conversions (disclaimer, I stole this list from the github page for QuantityType and haven’t tried them all).

Otherwise this might be a fine time to try out F#, which supports units of measure natively.

The long and short of it is that we’re trying to remove unit system confusion from our application and to do that we want to expose as little of the application to divergent units as possible. Catch the units as they are entered, normalize them and then pass them on to the rest of your code. You’ll save yourself a lot of headaches by taking this approach, trust a person who has done it wrong many times.
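The "catch and normalize" step might look something like the sketch below: whatever unit the user typed is converted to meters at the boundary and only the normalized number travels further. The function name and the set of supported units are assumptions for illustration.

```javascript
// Hypothetical boundary helper: convert user input to meters on the way in.
// 1 ft = 0.3048 m and 1 mi = 1609.344 m by definition.
const METERS_PER_UNIT = { m: 1, km: 1000, ft: 0.3048, mi: 1609.344 };

function normalizeToMeters(value, unit) {
  if (!Object.prototype.hasOwnProperty.call(METERS_PER_UNIT, unit)) {
    throw new Error('Unknown unit: ' + unit);
  }
  return value * METERS_PER_UNIT[unit];
}

console.log(normalizeToMeters(2, 'km'));
```

Rejecting unknown units at this point keeps a typo in a form field from turning into a wrong value in the database.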

2015-06-09

Getting Lookup Data Into Your View ASP.net MVC 6 Version

This is a super common problem I encounter when building ASP.net MVC applications. I have a form that has a drop down box. Not only do I need to select the correct item from the edit model to pick from the drop down but I need to populate the drop down with the possible values.

Over the years I’ve used two approaches to doing this. The first is to push into the ViewBag a list of values in the controller action. That looks like

public ActionResult Edit(int id){
    var model = repository.get(id);

    ViewBag.Provinces = provincesService.List();

    return View(model);
}

Then in the view you can retrieve this data and use it to populate the drop down. If you’re using the HTML helpers then this looks like

@Html.DropDownListFor(x=>x.province, (IEnumerable<SelectListItem>)ViewBag.Provinces)

This becomes somewhat messy when you have a lot of drop downs on a page. For instance consider something like

public ActionResult Edit(int id){
    var model = repository.get(id);

    ViewBag.Provinces = provincesService.List();
    ViewBag.States = statesService.List();
    ViewBag.StreetDirections = streetDirectionsService.List();
    ViewBag.Countries = countriesService.List();
    ViewBag.Counties = countiesService.List();

    return View(model);
}

The work of building up the lookup data becomes the primary focus of the action. We could extract it to a method but then we have to go hunting to find the different drop downs that are being populated. An approach I’ve taken in the past is to annotate the methods with an action filter to populate the ViewBag for me. This makes the action look like

[ProvincesFilter]
[StatesFilter]
[StreetDirectionsFilter]
[CountriesFilter]
[CountiesFilter]
public ActionResult Edit(int id){
  var model = repository.get(id);
  return View(model);
}

One of the filters might look like

public override void OnActionExecuting(ActionExecutingContext filterContext)
{
    var countries = new List<SelectListItem>();
    if ((countries = (filterContext.HttpContext.Cache.Get(GetType().FullName) as List<SelectListItem>)) == null)
    {
        countries = countriesService.List();
        filterContext.HttpContext.Cache.Insert(GetType().FullName, countries);
    }
    filterContext.Controller.ViewBag.Countries = countries;
    base.OnActionExecuting(filterContext);
}

This filter also adds a degree of caching to the request so that we don’t have to keep bugging the database.

Keeping a lot of data in the view bag presents a lot of opportunities for error. We don’t have any sort of intellisense with the dynamic view object and I frequently use the wrong name in the controller and view, by mistake. Finally building the drop down box using the HTML helper requires some nasty looking casting. Any time I cast I feel uncomfortable.

@Html.DropDownListFor(x=>x.province, (IEnumerable<SelectListItem>)ViewBag.Provinces)

Now a lot of people prefer transferring the data as part of the model; this is the second approach. There is nothing special about this approach: you just put some collections into the model.

I’ve always disliked this approach because it mixes the data needed for editing with the data for the drop downs which is really incidental. This data seems like a view level concern that really doesn’t belong in the view model. This is a bit of a point of contention and I’ve challenged more than one person to a fight to the death over this very thing.

So neither option is particularly palatable. What we need is a third option and the new dependency injection capabilities of ASP.net MVC open up just such an option: we can inject the data services directly into the view. This means that we can consume the data right where we retrieve it without having to hammer it into some bloated DTO. We also don’t have to worry about annotating our action or filling it with junk view specific code.

To start let’s create a really simple service to return states.

public interface IStateService
{
    IEnumerable<State> List();
}

public class StateService : IStateService
{
    public IEnumerable<State> List() {
        return new List<State>
        {
            new State { Abbreviation = "AK", Name = "Alaska" },
            new State { Abbreviation = "AL", Name = "Alabama" }
        };
    }
}

Umm, looks like we’re down to only two states, sorry Kentucky.

Now we can add this to our container. I took a singleton approach and just registered a single instance in the Startup.cs.

services.AddInstance(typeof(IStateService), new StateService());

This is easily added to the view by adding

@inject ViewInjection.Services.IStateService StateService

As the first line in the file. Then the final step is to actually make use of the service to populate a drop down box:

<div class="col-lg-12">
        @Html.DropDownList("States", StateService.List().Select(x => new SelectListItem { Text = x.Name, Value = x.Abbreviation }))
</div>

That’s it! Now we have a brand new way of getting the data we need to the view without having to clutter up our controller with anything that should be contained in the view.

What do you think? Is this a better approach? Have I brought fire down upon us all with this? Post a comment. Source is at https://github.com/stimms/ViewInjection

2015-06-07

Building a Simple Slack Bot

A couple of friends and I have a slack channel we use to discuss deep and powerful questions like “should we make a distilled version of the ASP.net community standup that doesn’t waste everybody’s time?” or “could we create a startup whose business model was to create startups?”. We have so many terrible earth-shatteringly brilliant ideas that we needed a place to keep them. Fortunately Trello provides just such list functionality. There is already a Trello integration for Slack but it can’t create cards; it only notifies about changes to existing cards.

Lame.

Thus began our quest to build a slackbot. We wanted to be able to use /commands for our bot, like so

/trellobot add Buy a cheese factory and replace the workers with robotic rats

The bot should then reply to us with a link to the card should we need to fill in more details like the robot rat to worker ratio.

We started by creating what slack calls a slash integration. This means that it will respond to IRC style commands (/join, /leave, …). This can be done from the slack webapp. Most of the fields were intuitive to fill out but then we got to the one for a URL. This is the address to which slack sends an HTTP request when it sees a slash command matching yours.

This was a bit tricky as we were at a conference on wifi without the ability to route to our machines. We could have set up a server in the cloud but this would have slowed down our iterating. So we used http://localtunnel.me/ to tunnel requests to us. What a great service!

The service was going to be simple so we opted for nodejs. This let us get up and running without ceremony. You can build large and impressive applications with node too but I always feel it excels at rapid iteration. In other words we just hacked something out and you shouldn’t base your banking software on the terrible code here.

To start we needed an http server

var http = require('http');
var Uri = require('jsuri');

// port and startResponse are assumed to be defined elsewhere in the script
http.createServer(function (req, res) {
  req.setEncoding('utf8');
  req.on('data', function(data){

    startResponse(res);

    // The body is URL encoded, so treat it as a query string
    var uri = new Uri();
    uri.setQuery(data);

    var text = uri.getQueryParamValue('text');

    var responseSettings = {
      channelId: uri.getQueryParamValue('channel_id'),
      userId: uri.getQueryParamValue('user_id')
    };
    if(text.split(' ')[0] === "add")
    {
      performAdd(text, res, responseSettings);
    }
  });

}).listen(port);
console.log('Server running at http://127.0.0.1:' + port + '/');

The information passed to us by slack is URL encoded so we can just parse it out using the jsuri package. We’re looking for any message that starts with “add”. When we find it we run the performAdd function giving it the message, the response to write to and the response settings extracted from the request. We want to know the channel in which the message was sent and the user who sent it so we can reply properly.
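If you would rather not pull in a package for this, Node's built-in `querystring` module can do the same parsing. The sketch below is not what we ran; the sample body and the `parseSlashCommand` name are made up for illustration.

```javascript
const querystring = require('querystring');

// Made-up example of the form-encoded body of a slash command request.
const sampleBody = 'channel_id=C123&user_id=U456&command=%2Ftrellobot&text=add+Buy+a+cheese+factory';

// Hypothetical helper: pull out the pieces the bot cares about.
function parseSlashCommand(rawBody) {
  const fields = querystring.parse(rawBody);
  const words = (fields.text || '').split(' ');
  return {
    action: words[0],                    // e.g. "add"
    argument: words.slice(1).join(' '),  // everything after the action
    channelId: fields.channel_id,
    userId: fields.user_id
  };
}

const parsed = parseSlashCommand(sampleBody);
console.log(parsed.action, '|', parsed.argument);
```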

If your bot doesn’t need to reply to the whole channel and instead needs to whisper back to the person sending the command that can be done by just writing back to the response. The contents will be shown in slack.
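A whisper could be as small as the sketch below. The `whisper` helper is hypothetical; `res` is the response object Node hands to the request callback.

```javascript
// Hypothetical helper: reply only to the user who issued the command by
// writing plain text into the HTTP response to Slack's request.
function whisper(res, message) {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end(message);
}
```

Inside the request handler that would be something like `whisper(res, 'Working on it')`.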

Now we need to create our Trello card. I can’t help but feel that coupling a bunch of APIs together is going to be a big thing in the future.

Trello uses OAuth to allow authentication. This is slightly problematic, as we need to have a user agree to allow our bot to interact with it as them. This is done using a prompt on a website, which we don’t really have. If this was a fully-fledged bot we could find a way around it but instead we’re going to take advantage of Trello permitting a key that never expires. This is kind of a security problem on their end but for our purposes it is great.

Visit https://trello.com/1/appKey/generate and generate a key pair for Trello integration. I didn’t find a need for the private one but I wrote it down anyway, might need it in the future.

With that key visit https://trello.com/1/authorize?key=PUBLIC_KEY_HERE&name=APPLICATION_NAME_HERE&expiration=never&response_type=token&scope=read,write in a browser logged in using the account you want to use to post to Trello. The resulting key will never expire and we can use it in our application.

We’ll use this key to find the list to which we want to post. I manually ran

curl "https://trello.com/1/members/my/boards?key=PUBLIC_KEY_HERE&token=TOKEN_GENERATED_ABOVE_HERE"

Which gave me back a list of all of the boards in my account. I searched through the content using the powerful combination of less and my powerful reading eyes finding, in short order, the ID of a board I had just created for this purpose. Using the ID of the board I wanted I ran

curl "https://api.trello.com/1/boards/BOARD_ID_HERE?lists=open&list_fields=name&fields=name,desc&key=PUBLIC_KEY_HERE&token=TOKEN_GENERATED_ABOVE_HERE"

Again using my reading eyes I found the ID of the list in the board I wanted. (It wasn’t very hard, there was only one). Now I could hard code that into the script along with all the other bits and pieces (I mentioned not writing your banking software like this, right?). I put everything into a config object, because that sounded at least a little like something a good programmer would do - hide the fact I’m using global variables by putting them in an object, stored globally.
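For reference, the config object has roughly the shape below, inferred from how the code in this post reads it. This sketch fills it from environment variables rather than hard coded strings; the `loadConfig` name and the variable names are made up.

```javascript
// Hypothetical config loader; the environment variable names are made up.
// The shape matches how the code reads it: config.trello.key,
// config.trello.token, config.trello.listId and config.slack.token.
function loadConfig(env) {
  return {
    trello: {
      key: env.TRELLO_KEY,
      token: env.TRELLO_TOKEN,
      listId: env.TRELLO_LIST_ID
    },
    slack: {
      token: env.SLACK_TOKEN
    }
  };
}

const config = loadConfig(process.env);
```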

var https = require('https');
var querystring = require('querystring');

function performAdd(text, slackResponse, responseSettings){
  // Everything after the leading "add" becomes the card name
  var cardName = text.split(' ').splice(1).join(" ");
  var pathParameters = "key=" + config.trello.key + "&token=" + config.trello.token + "&name=" + querystring.escape(cardName) + "&desc=&idList=" + config.trello.listId;
  var post_options = {
      host: 'api.trello.com',
      port: '443',
      path: '/1/cards?' + pathParameters,
      method: 'POST'
  };

  // Set up the request
  var post_req = https.request(post_options, function(res) {
      res.setEncoding('utf8');
      // Assumes the whole JSON response arrives in a single chunk
      res.on('data', function (chunk) {
          var trelloResponse = JSON.parse(chunk);
          getUserProperties(responseSettings.userId, function(user){
              responseSettings.user = user;
              postToSlack(["/" + text, "I have created a card at " + trelloResponse.shortUrl], responseSettings);
          });
      });
  });

  // post an empty body to trello, content is in url
  post_req.write("");
  post_req.end();
}

Here we send a request to Trello to create the card. Weirdly, despite the request being a POST, we put all the data in the URL. I honestly don’t know why smart people like those at Trello design APIs like this…
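Gluing the query string together by hand is easy to get wrong, so here is a sketch that lets `querystring.stringify` do the escaping. The `buildCardPath` helper is hypothetical; the parameter names are the same ones used in the request above.

```javascript
const querystring = require('querystring');

// Hypothetical helper: build the path for the card creation request.
function buildCardPath(trelloConfig, commandText) {
  // Drop the leading "add" and keep the rest as the card name.
  const cardName = commandText.split(' ').slice(1).join(' ');
  return '/1/cards?' + querystring.stringify({
    key: trelloConfig.key,
    token: trelloConfig.token,
    name: cardName,
    desc: '',
    idList: trelloConfig.listId
  });
}

// Placeholder credentials, not real ones.
const examplePath = buildCardPath(
  { key: 'KEY', token: 'TOKEN', listId: 'LIST' },
  'add Buy a cheese factory & robots'
);
console.log(examplePath);
```

Every value is escaped this way, so an ampersand in the card name can't cut the name short.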

Anyway the callback will send a message to Slack with the short URL extracted from the response from Trello. We want to make the response from the bot seem like it came from one of the people in the channel, specifically the one who sent the message. So we’ll pull the user information from Slack and set the bot’s name to be theirs as well as matching the icon.


function postToSlack(messages, responseSettings){
  console.dir(responseSettings);
  for(var i = 0; i < messages.length; i++)
  {

    var pathParameters = "username=" + querystring.escape(responseSettings.user.name) + "&icon_url=" + querystring.escape(responseSettings.user.image) + "&token=" + config.slack.token + "&channel=" + responseSettings.channelId + "&text=" + querystring.escape(messages[i]);
    var post_options = {
        host: 'slack.com',
        port: '443',
        path: '/api/chat.postMessage?' + pathParameters,
        method: 'POST'
    };

    // Set up the request
    var post_req = https.request(post_options, function(res) {
        res.setEncoding('utf8');
        res.on('data', function (chunk) {

          console.log(chunk);
        });
    });
    post_req.on('error', function(e) {
      console.log('problem with request: ' + e);
    });
    // post the data
    post_req.write("");
    post_req.end();
  }
}

function getUserProperties(userId, callback){
  var pathParameters = "user=" + userId + "&token=" + config.slack.token;
  var post_options = {
      host: 'slack.com',
      port: '443',
      path: '/api/users.info?' + pathParameters,
      method: 'GET'
  };
  var get_req = https.request(post_options, function(res) {
      res.setEncoding('utf8');
      res.on('data', function (chunk) {
        var json = JSON.parse(chunk);
        callback({name: json.user.name, image: json.user.profile.image_192});
      });
  });
  get_req.end();
}

This is all we need to make a Slack bot that can post a card to Trello. As it turns out this was all made rather more verbose by the use of callbacks and API writers’ inability to grasp what the body of a POST request is for. Roll on ES7 async/await, I say.
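For a taste of what that might look like, here is a sketch that wraps `https.request` in a promise and waits for the whole body before parsing it, since a 'data' event is not guaranteed to carry the complete response. The helper names `readBody`, `requestJson` and `addCard` are made up and this is not the code we ran.

```javascript
const https = require('https');

// Collect a whole response body; 'data' may fire more than once,
// so nothing is parsed until 'end'.
function readBody(stream) {
  return new Promise((resolve, reject) => {
    let body = '';
    stream.setEncoding('utf8');
    stream.on('data', chunk => { body += chunk; });
    stream.on('end', () => resolve(body));
    stream.on('error', reject);
  });
}

// Promise wrapper around https.request that resolves with parsed JSON.
function requestJson(options) {
  return new Promise((resolve, reject) => {
    const req = https.request(options, res => {
      readBody(res)
        .then(body => resolve(JSON.parse(body)))
        .catch(reject);
    });
    req.on('error', reject);
    req.end();
  });
}

// Hypothetical rewrite of the add flow: create the card, return its short URL.
async function addCard(cardPath) {
  const card = await requestJson({
    host: 'api.trello.com',
    path: cardPath,
    method: 'POST'
  });
  return card.shortUrl;
}
```

The nesting disappears and errors from any step end up in one place, which is most of what I wanted from async/await.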

It should be simple to apply this same sort of logic to any number of other Slack bots.

Thanks to Canadian James for helping me build this bot.