Simon Online

2013-04-19

Slurp - CoffeeScript Part 1

I’m doing a couple of talks at Prairie DevCon in Winnipeg in early May. One of them is about data visualizations and the other is about next generation JavaScript. This is the first time I’ve talked about JavaScript in front of an audience which actually knows about JavaScript. Scary.

CoffeeScript is a language which compiles down to JavaScript instead of down to some binary format. In that way it is not unlike TypeScript, about which I have written a few things. To get started the easiest thing to do is download and install node.js from their site. Node provides an implementation of the V8 JavaScript engine which can be run headless, and it includes the npm package manager. Once you have node installed it is as simple as running

npm install -g coffee-script

That will even add CoffeeScript to your path. What a friendly fellow that node package manager is! If you’re running this on OSX or Linux and you want to install it globally (that’s what the -g flag does) then you’ll need to use sudo.

We can now get started with a simple CoffeeScript program. By convention CoffeeScript files end in .coffee but for maximum confusion you could end them with .java or .c (please don’t). The lengthening of file extensions is kind of funny, don’t you think?

.c -> .cpp -> .java -> .coffee -> .heythisisajavascriptfile

So let’s start with a really simple CoffeeScript program.

I’ve spoken before about how to arrange large code bases using namespaces and classes. CoffeeScript doesn’t have a built-in concept of namespaces or modules but it does have a concept of classes. You can simulate modules but that’s a lesson for another day; classes will do just fine for now.
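
Here is the sort of thing I mean; the class and method names are mine, purely for illustration:

class Person
  # @name in the parameter list assigns the argument straight onto the instance
  constructor: (@name) ->

  greet: ->
    console.log "Hello, #{@name}"

simon = new Person "Simon"
simon.greet()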

The first thing you’ll notice is that the syntax is different from JavaScript. CoffeeScript uses a Python-style syntax, which means that whitespace is significant. The constructor keyword creates a constructor which takes a single parameter, name. You’ll notice that name is prefaced by an @ symbol. This makes name an automatically assigned property.
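
Branching follows the same indentation rules; again the names here are made up:

greetPolitely = (name) ->
  if name is "Simon"
    console.log "Hello, boss"
  else if name?
    console.log "Hello, #{name}"
  else
    console.log "Hello, mysterious stranger"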

Here you can see the syntax for setting up ifs and elses. Again, notice the lack of braces; code blocks are denoted by indentation.

Over the next week I’ll be delving more into CoffeeScript in preparation for my talk.

2013-04-18

A Spike on SQL Server Backups

I frequently feel that if there is a hole in my knowledge of my development stack it is SQL Server. Honestly the thing is a mystery to me for the most part. I put data in it and data comes out. From time to time I add an index and data comes out faster. Sometimes I look at query plans but mostly I do that in front of other developers so they believe that I know SQL Server. I tell you looking at a query plan then tapping your monitor with a pen and going “yep, yep” gives you a lot of street cred in development shops.

Street Cred

But my lack of understanding of SQL Server backups ends this day!

Okay so the first thing is that there are 3 different recovery models for SQL Server: Simple, Full and Bulk logged.

Simple - In this mode no log files are backed up, only a snapshot of the data. This means that there is no need to keep transaction log files around. However, if you lose the database then you also lose all the data since your last backup. Depending on your use case this might be okay, but I doubt it. I would, personally, stay away from simple recovery.

Full - The log file is backed up in this mode, which means that you can recover up to the last transaction. In addition you can stop the restore at any point in the log, essentially giving you a snapshot of the database at any point in time. That’s pretty awesome for many applications which are not even disaster recovery related.

Bulk logged - This is a special case of full recovery which allows for better performance by not logging the details of certain bulk operations. For the most part this mode is as recoverable as the previous one, but it saves you log space if you make use of a lot of bulk operations. You probably shouldn’t make use of a lot of bulk operations anyway.
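
As a rough sketch of how the pieces fit together (the YourDataBase name matches the shrink example below; the file paths and timestamp are invented):

-- Switch the database to the full recovery model so log backups are possible
ALTER DATABASE YourDataBase SET RECOVERY FULL;

-- Take a full backup, then periodic log backups
BACKUP DATABASE YourDataBase TO DISK = 'E:\Backups\YourDataBase.bak';
BACKUP LOG YourDataBase TO DISK = 'E:\Backups\YourDataBase_Log.trn';

-- Restore to a point in time: the full backup first, then the logs with STOPAT
RESTORE DATABASE YourDataBase FROM DISK = 'E:\Backups\YourDataBase.bak' WITH NORECOVERY;
RESTORE LOG YourDataBase FROM DISK = 'E:\Backups\YourDataBase_Log.trn'
    WITH STOPAT = '2013-04-18 12:00:00', RECOVERY;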

Great that clears things up, except: what’s a transaction log?

A transaction log is an append only file which contains a record of every transaction, or change, which occurs on a SQL Server. Information is written during the lifetime of a transaction, which means that if the server crashes it can finish up or roll back half finished transactions using the log. Replaying a full log can restore a database. The log can also be shipped to other instances which can replay it to bring themselves up to date. You might want to do that if you have a number of read only mirrors of your database for scalability or high availability purposes.

As you might imagine these log files can grow out of control pretty quickly. Fortunately you can truncate them from time to time. From what I’m reading the log file is truncated as soon as it has been backed up. However, the log file may not actually be reduced in size. This is because allocating disk space is expensive, so SQL Server keeps hold of what is, in effect, an empty file. If you really need the space back you can run

DBCC SHRINKFILE ('YourDataBase_Log', 1000)

There are also some corner cases where the log file will not be truncated during a backup - such as when a restore is running using that transaction log.

The log file should be stored on a different disk from the mdf file. And when I say a different disk I mean a different storage system entirely. So if the mdf is on a SAN then the log file shouldn’t be on that SAN, unless you’re willing to lose data should the entire SAN fail (hey, it happens).

2013-04-17

Building Your Source Controlled Access Database

Last week I published an oddly popular post about version controlling Access databases. Could it be that other people are also responsible for maintaining an Access database? Let’s pause for a moment and sigh collectively.


If you read this blog with any frequency (hi, mom!) then you’ll know that I’m big on producing automatic builds. I’m not the only one; it is even part of the Joel test. How can we get Access to build from the source files automatically? Unsurprisingly, it turns out it sucks.

Because the source files are separated from the compiled database, something needs to assemble the files into the accdb file. In an ideal world the assembly would be done by a command line build tool. No such tool exists. I tried to create one but I ended up deep in a COM hole attempting to call the source control bindings in Access without having to open up the UI. Turns out that this COM stuff, which largely disappeared before I started programming on Windows, is kind of hard. I backed away from that approach and instead made use of a UI automation library called White.

White is an open source project which I believe came out of ThoughtWorks back in the day. Since then it has been picked up by a group of people known as TestStack. There was a flurry of activity when they first started maintaining the tool but that’s died off since. I’ll do a more detailed post on White and how to use it at a later date.

Unfortunately, I can’t release everything I did to get Access to build the source controlled version of my database, as my employer isn’t big on releasing things through open source. I think I can get away with some general hints, though.

The first thing is to launch Access without any flags which, once you’ve included White.Core, looks like

var application = Application.Launch(@"c:\Program Files (x86)\Microsoft Office\Office14\MSACCESS.EXE");

Then you basically just drive the UI as if it were you at the keyboard. Selecting a ribbon tab requires using the Window.Tabs collection and firing a Click event on the appropriate tab, in this case Source Control. Then it is just a question of selecting Create from Team Foundation and driving the rest of the UI.
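
Roughly, the skeleton looks something like this. The window title, and exactly how the ribbon shows up in White’s UI tree, are guesses on my part, so treat it as a sketch rather than working code:

using White.Core;
using White.Core.UIItems;
using White.Core.UIItems.Finders;

class BuildAccessDatabase
{
    static void Main()
    {
        // Launch Access with no flags, exactly as if a user had double clicked it
        var application = Application.Launch(@"c:\Program Files (x86)\Microsoft Office\Office14\MSACCESS.EXE");

        // Attach to whatever the empty Access instance calls its main window
        var window = application.GetWindow("Microsoft Access");

        // The ribbon surfaces through the Window.Tabs collection; switch to the Source Control tab
        window.Tabs[0].SelectTabPage("Source Control");

        // From there, find the ribbon buttons by their text and click through the dialogs
        window.Get<Button>(SearchCriteria.ByText("Create from Team Foundation")).Click();
    }
}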

With this approach I could easily generate the required accdb file. I haven’t got there yet but I’m going to tie the building application into TFS as a build step so that full packages can be generated.

2013-04-16

Setting up a PhoneGap Development Environment

There’s probably already a zillion posts out there about how to set up a PhoneGap development environment on OSX. I wanted my own because I did it once and now I’ve forgotten how. Typical idiot thing to do.

First download the Android Development Tools from here. There are some installation instructions but they basically amount to “run unzip”. This bundle includes Eclipse and a swack of Androidy stuff. I moved the contents to a Tools directory, because that’s how I roll.

Into that directory I also put the latest version of PhoneGap. At the time of writing that was 2.6.0.

I added all the tools directories to my PATH in .bash_profile

export PATH=${PATH}:~/Tools/sdk/tools:~/Tools/sdk/platform-tools:~/Tools/phonegap/lib/android/bin

Here, if there was a need to do iOS, I would replace android in the last part of the path with ios. Or perhaps it is opposite day and BlackBerry is a popular platform with a strong future, in which case do s/android/blackberry/. But, let’s be honest, if it’s opposite day then I’m a Russian oil billionaire and I’m probably out hunting with Putin.

That’s me, on Putin’s right.

It is kind of a pain that the scripts to create PhoneGap projects are all named the same but just in different directories. If I were making a lot of new apps then probably I would just alias a ton of them instead of adding to my path.

alias create-ios=~/Tools/phonegap/lib/ios/bin/create
alias create-android=~/Tools/phonegap/lib/android/bin/create
alias create-hahahaha=~/Tools/phonegap/lib/blackberry/bin/create

That’s pretty much it. New projects are created by running the create script and then pointing Eclipse at the folder.
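
With the aliases above, spinning up a new Android project looks roughly like this (the path, package name and app name are all made up):

create-android ~/Projects/HelloPhoneGap com.example.hello HelloPhoneGap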

2013-04-15

What I'm Excited About at Prairie DevCon

Every year I go to Prairie DevCon which is put on by the stylish and dapper fellow who is D’Arcy Lussier. Every year he schedules me in the worst speaker’s spot and laughs and laughs. Once he put me in the last spot of the conference and gave away free beer in the room next to me. 3 people came to my talk. 1 left when I mentioned the free beer. Come to think of it, he really isn’t so much dapper as an ass.

Nonetheless, he finds interesting topics and great speakers. I wanted to highlight the talks which I think are going to be really interesting. Probably after reading this D’Arcy will change my schedule so I can’t attend any of them, but I’m going to take the risk:

1. Machine Learning for Developers - In university I tried some machine learning and it was terrible. That being said it is a really interesting field, and if it really is possible, with modern tools, to get something built in an hour then I’m adding machine learning to everything.

2. A Primer on LESS and Sass - The last time I tried to design a website 3 large men broke down my door and broke my fingers. They said that I had killed all the beauty in the world and that if I ever designed another site they would be back and they would bring pizza. I don’t know what that means nor am I anxious to find out. Eden, on the other hand, can design things and she is a computer scientist. So when you can learn about CSS from a designer who knows how to program you should jump at it.

3. Building Cross-Platform Mobile Apps with Sencha Touch and PhoneGap - I’m starting to really think that PhoneGap is the way to go for developing the vast majority of mobile applications. It is the only real way to get cross-platform compatibility and it’s also quick to develop. I’ve played a bit with it but I really am excited to see how a skilled person can use it. I don’t really know anything about Sencha Touch, so that should be fun to learn too.

4. Git Magic - A couple of conferences back David Alpert taught me a bunch about git. I haven’t used any of it. Frankly I felt like the advanced stuff he was teaching me would only be used if you had somehow failed to use git properly and had messed up. I’m starting to change my mind about that so I’m going to give James Kovacs a chance to solidly change my mind.

5. Hacking Hardware - I talked a couple of months back with Donald Belcham about his experiments with hardware hacking and was inspired. Not inspired enough to actually do anything, but still inspired that a software guy could pick up hardware. I’m super curious to see what he has built.

There are many other interesting looking talks out there but these were my top 5. If you’re near Winnipeg, or if you can run there at the speed of light, you should pick up your conference tickets now and come out. It will be a riot.

2013-04-12

So You Want to Version Your Access Database

First off, let me say that I’m sorry that you’re in that position. It happens to the best of us. There is still a terrifying amount of business logic written in MS Access and MS Excel. One of the things I’ve found is that working with Access is greatly improved if you use source control. This is because Access has a couple of serious flaws which can be alleviated by using source control.

The first is that Access is monolithic: it is a single file which contains forms, queries, logic and, sometimes, data. This makes shipping the database easy and doesn’t confuse users with a bunch of dlls and stuff. It also means that exactly one person can work on designing the database at any time. Hello, scalability nightmare.

Next up is that Access has a tendency to change things you didn’t change. As soon as you open a form in design mode Access makes some sort of a change. Who knows why, but it worries me. If I’m not changing anything then why is Access changing something?

Finally Access files grow totally out of control. Every time you open the database its size increases seemingly at random. This is probably an extension of the previous point.

Access is a nightmare to work with, an absolute nightmare. I have no secret inside knowledge about what Microsoft is doing with Access and Office in general but I suspect that desktop versions of Office have a limited future. There have been no real updates to the programming model in... well, ever, as far as I can tell.

Okay, let’s put the project under source control and then I’ll talk a bit about how this improves our life. I’ll be using TFS for the source control because we might as well give ourselves a challenge and have two nightmares to deal with.

The first thing you’ll need is the Access MSSCCI extensions, followed by the MS Access Developer Tools. Now when you open up Access there should be a new tab available to you in the menu strip: Source Control. Yay!

Menu bar additions

Open up your current database and click the button marked Add Database to Team Foundation. You’ll be prompted for your TFS information. Once that’s been entered Access will spool up and create a zillion files in source control for you. This confused us a lot when we first did it because none of the files created were mdb or accdb files: the actual database. Turns out the way it works is that the files in source control are mapped, one to one, to objects in the database. To create a “build” of the database you have to click on the “Create from Team Foundation” button. This pulls down all the files and recombines them into the database you love.

Selecting the TFS source (identifying information removed)

You’ll now see that the object browser window has hints on it telling you what’s checked out. Unfortunately you need to go and check out objects explicitly when you work on them. At first it is a pain but it becomes just part of your process in short order. One really important caveat is that you have to do the source control operations through the Access integrations; you can’t just use TFS from Visual Studio. This is because the individual source files are not updated until you instruct Access to check them in. Before that, changes remain part of the mdb file and are not reflected in the individual files.

Right, so what does this do for us? First, having the code and objects split over many files improves the ability to work on a database collaboratively. While the individual objects are a total disaster of serialization, individuals can still work on different parts of the same database at once. That’s a huge win. Second, we’re protected from Access’ weird changes to unrelated files. If we didn’t change something then we just revert the file and shake our heads at Access. Finally, because the mdb file is recreated each time we open it, there is no longer unexpected growth.

This doesn’t make working with Access painless, but it sure helps.

2013-04-11

A Day of Azure

Have you guys heard about this Azure thing but aren’t sure what it is? Did your boss read an article about “The Cloud” in a 3 year old copy of CIO Magazine he found while waiting for Tip Top Tailors to fashion a suit of sufficient size to contain his gigantic, non-technical girth? Know all about Azure but haven’t had a chance yet to try it out?

Make me a cloud and waffles, mostly waffles. (Source: http://www.itstrulyrandom.com/2008/02/07/obese-man-kills-wife-by-sitting-on-her/)

Then boy do we have the event for you! The Calgary .NET Users Group is putting on an Azure camp to coincide with the global Azure Bootcamp on April the 27th.

Been through the Azure training before? Don’t write us off; we’ve created a special Calgary themed project which showcases the parts of Azure most applicable to the Calgary IT market. It might even have a bit of a Calgary theme to it.

I’ll be speaking, as will the daring, dynamic and dutiful David Paquette.

Where: ICT 122 at the University of Calgary

When: April the 27th, starting at 9 and going on until we run out of stuff to talk about. Before 4pm if nobody asks me what’s wrong with Java, 4am if somebody does.

Cost: Free

Tell me more: Go to the official site: http://dotnetcalgary.com/

2013-04-10

Getting Started With Table Storage

In the last post we started looking at table storage and denormalization, but we didn’t actually use table storage at all. The first thing to know is that your data objects need to extend

Microsoft.WindowsAzure.Storage.Table.TableEntity

This base class is required to provide the two key properties: the partition key and the row key. These two keys are important for identifying each record in the table. The row key provides a unique value within the partition. The partition key provides a hint to table storage about where the table data should be stored. All entries which share a partition key will be stored on the same node, whereas different partition keys within the table may be stored on different nodes. As you might expect, this provides for some interesting query optimizations.

If you have a large amount of data or frequent queries you may wish to use multiple partition keys to allow for load balancing queries. If you’re used to just slapping a monotonically increasing number in for your ID you might want to rethink that strategy. The combination of knowing the partition key and the row key allows for very fast lookups. If you don’t know the row key then a scan of the entire partition may need to be performed, which is relatively slow. This plays well into our idea of denormalizing. If you commonly need to look up user information using e-mail addresses then you might want to set your partition key to be the e-mail domain and the row key to be the user name. In this way queries are fast and distributed. Picking keys is a non-trivial task.

A key does not need to be constructed from a single field either; you might want to build your keys by concatenating several fields together. A great example might be if you had a set of geocoded points of interest. You could build a row key by joining the latitude and longitude into a single field.
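
Putting the e-mail example above into code, an entity might look something like this. The class and property names are mine, purely for illustration:

using Microsoft.WindowsAzure.Storage.Table;

public class UserEntity : TableEntity
{
    // The SDK needs a parameterless constructor to rehydrate entities
    public UserEntity() { }

    public UserEntity(string email)
    {
        var parts = email.Split('@');
        PartitionKey = parts[1]; // e.g. "example.com"
        RowKey = parts[0];       // e.g. "simon"
    }

    public string DisplayName { get; set; }
    public string City { get; set; }
}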

To actually get access to the table storage you just need to use the table client API provided in the latest Azure SDK.
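
Getting hold of a table goes something like this; the connection string and table name are placeholders:

// CloudStorageAccount lives in Microsoft.WindowsAzure.Storage,
// the table types in Microsoft.WindowsAzure.Storage.Table
var account = CloudStorageAccount.Parse("DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...");
var tableClient = account.CreateCloudTableClient();
var table = tableClient.GetTableReference("users");
table.CreateIfNotExists(); // harmless if the table already exists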

Inserting records is simple. I use AutoMapper to map between my domain objects and the table entities.
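
Skipping the AutoMapper step and newing up the hypothetical UserEntity from above directly, an insert boils down to:

var user = new UserEntity("simon@example.com") { DisplayName = "Simon" };
table.Execute(TableOperation.Insert(user));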

It is actually the same process for updating an existing record.
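
The only difference is the operation handed to Execute; InsertOrReplace will create the row if it doesn’t exist yet and overwrite it if it does:

user.City = "Calgary";
table.Execute(TableOperation.InsertOrReplace(user));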

A simple retrieve operation needs both the partition key and the row key.
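
Again using the made-up keys from above, it looks something like this:

var operation = TableOperation.Retrieve<UserEntity>("example.com", "simon");
var result = table.Execute(operation);
var fetched = (UserEntity)result.Result; // null if nothing matched those keys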

Those are pretty much the basics of table storage. There is also support for a limited set of LINQ operations against the table, but for the most part they are ill-advised because they fail to take full advantage of the key based architecture.

2013-04-09

Azure Table Storage

Azure table storage is another data persistence option when building applications on Azure. If you’re used to SQL Server or another pure SQL storage solution then table storage isn’t all that different. There is still a concept of a table made up of columns and rows. What you lose is the ability to join tables. This makes for some interesting architectural patterns but it actually ends up being not that big of a leap.

We have been conditioned to believe that databases should be designed such that data is well partitioned into its own tiny corner. We build a whole raft of tables, each one representing a bit of data which we don’t want to duplicate. Need an address for your users? Well, that goes into the address table. As we continue down that line we typically end up with something like this:

Typical Relational Database Structure

Here each entity is given its own table. This example might be a bit contrived, I hope. But I’m sure we’ve all seen databases which are overly normalized like this. Gosh knows that I’ve created them in the past. How to arrange data is still one of the great unsolved problems of software engineering as far as I’m concerned (hey, there is a blog topic, right there).

With table storage you would have two options:

  1. Store the IDs just as you would in a fully relational database and retrieve the data in a series of operations.
  2. Denormalize the database

I like number 2. Storing IDs is fine, but really it adds complexity to your retrieval and storage operations. If you use table storage like that then you’re not really using it properly, in my mind. It is better to denormalize your data. I know this is a scary concept to many because of the chance that data updates will be missed. However, if you maintain a rigorous approach to updating and creating data this concern is largely minimized.

Denormalized

To do so it is best to centralize your data persistence logic in a single area of code. In CQRS parlance this might be called a denormalizer, or a disruptor in LMAX terms. If you have confidence that you can capture all the sources of change then you should have confidence that your denormalized views have correct data in them. By directing all change through a central location you can build that confidence.
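
A sketch of what that central area of code might look like; every name here is invented for the example, and the actual persistence behind each view is whatever you pick:

using System.Collections.Generic;

public class CustomerMoved
{
    public string CustomerId { get; set; }
    public string NewAddress { get; set; }
}

public interface IDenormalizedView
{
    void Apply(CustomerMoved change);
}

public class CustomerDenormalizer
{
    private readonly IEnumerable<IDenormalizedView> views;

    public CustomerDenormalizer(IEnumerable<IDenormalizedView> views)
    {
        this.views = views;
    }

    public void Handle(CustomerMoved change)
    {
        // The single choke point: every denormalized copy of the address is
        // updated here, so nothing is left holding stale data.
        foreach (var view in views)
            view.Apply(change);
    }
}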

In tomorrow’s post I’ll show how to make use of table storage in your web application.

2013-04-08

Hollywood Teaches Distributed Systems

I’m sitting here watching the movie Battle: LA and thinking about how it relates to distributed systems. The movie is available on Netflix if you haven’t seen it. You should stop reading now if you don’t want me spoiling it.

The movie takes place, unsurprisingly, in Los Angeles where an alien attack force is invading the city. As Aaron Eckhart and his band of marines struggle to take back the city they discover that the ships the aliens are using are actually unmanned drones.

Unmanned, except for the man in front, obviously

This is not news to the member of the signal corps they have hooked up with. She believes that all the drones are being controlled from a central location, a command and control center. The rest of the movie follows the marines as they attempt to find this structure which, as it turns out, is buried in the ground. It is, seemingly, very secret and very secure, hidden in the ruined city. Fortunately, as seems to happen frequently in these sorts of movies, the US military prevails and blows up this central structure.

The destruction of the command ship causes all the drones, which were previously holding human forces at bay, to crash and explode. It is a total failure as a distributed system. Destroying one central node had the effect of taking out a whole fleet of automated ships. The invasion failed because some tentacled alien failed to read a book on distributed systems.

See, the key is that you never know what is going to fail. Having a single point of failure is a huge weakness. Most people, when they’re designing systems, don’t even realize that what they’ve got is a distributed system. I’ve seen this cost a lot of people a lot of time. I’ve seen a lone SAN failure take out an entire company’s ability to work. I’ve seen failures in data centers on the other side of the planet take out Citrix here. If there is one truth to the information systems in large companies it is that they are complicated. However, the people working on them frequently fail to realize that what they have on their hands is a single, large, distributed system.

For sure some services are distributed by default (Active Directory, DNS, ...) but many are not. Think about file systems: most companies have files shared from a single point, a single server. Certainly that server might have multiple disks in a RAID but the server itself is still a single point of failure. This is why it’s important to invest in technologies like Microsoft’s Distributed File System, which uses replication to ensure availability. Storage is generally cheap, much cheaper than dealing with downtime from a failed node in Austin.

Everything is a distributed system, let’s start treating it that way.