2013

2013-10-01

I do not get browser statistics

I don’t understand what the heck is going on with browser usage statistics. I honestly have no clue which browser has the most market share. If you look at different sources you get radically different numbers.

Browser	W3 Schools	Net Market Share	Wikipedia	Global Stats counter	W3 Counter
Internet Explorer	11.80%	57.79%	20.47%	28.56%	23.90%
Firefox	28.20%	18.58%	17.71%	18.36%	17.80%
Chrome	52.90%	15.98%	46.02%	40.80%	31.60%
Safari	3.90%	5.77%	3.10%	8.52%	14.20%
Opera	1.80%	1.47%	5.45%	1.16%	2.40%

That’s hard to understand so let’s throw out some graphs. The one which caught my eye right away was Net Market Share. They show an overwhelming lead for Internet Explorer.

On the other hand everybody else shows a lead, of various degrees, for Chrome.

The divergence is because of the different methods used to get the data. For instance Wikipedia and W3 Schools look only at the statistics on their own sites. Because both of them are used by people with a fair degree of technical ability they reflect a higher degree of Chrome usage. The interesting ones are the other three: W3 Counter, Global Stats Counter and Net Market Share. They are all aggregators of a large number of sites. I’m shocked to see such a high degree of variability. Each of these sources uses millions of page views to gather its information so a variation of more than a couple of points seems unusual.
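To put a number on that variability, here is a small Python sketch that works out the gap between the highest and lowest figure for each browser. The figures are copied straight from the table above; the function names and the choice of which columns count as aggregators are mine.

```python
# Market share figures (percent) copied from the table above, in column order:
# W3 Schools, Net Market Share, Wikipedia, Global Stats counter, W3 Counter.
SHARES = {
    "Internet Explorer": [11.80, 57.79, 20.47, 28.56, 23.90],
    "Firefox": [28.20, 18.58, 17.71, 18.36, 17.80],
    "Chrome": [52.90, 15.98, 46.02, 40.80, 31.60],
    "Safari": [3.90, 5.77, 3.10, 8.52, 14.20],
    "Opera": [1.80, 1.47, 5.45, 1.16, 2.40],
}

# Column positions of the three aggregators:
# Net Market Share, Global Stats counter and W3 Counter.
AGGREGATORS = [1, 3, 4]


def spread(values):
    """Gap in percentage points between the highest and lowest value."""
    return max(values) - min(values)


def aggregator_spread(browser):
    """Spread for one browser using only the aggregator columns."""
    return spread([SHARES[browser][i] for i in AGGREGATORS])


if __name__ == "__main__":
    for browser, values in SHARES.items():
        print(f"{browser}: all sources {spread(values):.2f} points, "
              f"aggregators only {aggregator_spread(browser):.2f} points")
```

Even restricted to the three aggregators, the sources disagree by nearly 34 points on Internet Explorer and nearly 25 points on Chrome.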

It feels like the take-away here is that the browser usage statistics are garbage. As an industry we’re totally failing to measure the most basic of statistics about how people interact with the Internet as a whole. We should be ashamed of ourselves and we should do something about it. In the meantime it seems like we’re going to have to continue to support at least 3, possibly 4, different browsers, to say nothing of the various versions of those browsers. The trends don’t reveal anything of any use either. There seems to be some momentum behind Chrome and IE and less for Firefox, Opera and Safari, but who really knows. The Internet is not homogeneous so we see different browser statistics when we slice our data geographically and topically. I bet the usage statistics on Hacker News are interesting.

Unfortunately all of this means that you’re going to have to look at the statistics on your own website to see which browsers should be concentrated upon. I hate supporting old browsers but if your market is 6 guys sitting in their Unix holes* using Lynx then you’re supporting Lynx. Best of luck to you!

*Unix hole: it is a thing, trust me.

2013-09-25

What's a good metric for programming language usage?

The whole “which programming language is most popular” debate was kicked off in my mind today by a tweet from @kellabyte. She tweeted

“X is dead” usually derived from small samples of our industry. http://t.co/0XFN0RQBrI greater growth than JS in last 12mo. Think about that

- Kelly Sommers (@kellabyte) September 25, 2013

I was outraged that a well respected blogger/tweeter such as kellabyte would tweet horrific lies of this sort. “This is exactly”, I thought, “the problem with our industry: too many people corrupted by fame and supporting their own Visual Basic .NET related agendas.” Of course I was wrong: kellabyte has no interest in VB and her numbers were not wrong.

I have always relied on TIOBE’s measurement of programming language popularity to give me an idea of what the top languages are. I think this is likely kellabyte’s source also. The methodology used is quite extensively outlined at http://www.tiobe.com/index.php/content/paperinfo/tpci/tpci_definition.htm. If you don’t fancy reading all that, the gist is that they use a series of search engines and count the number of results. The ebb and flow of these numbers is what makes up the rankings.

Obviously there are a number of flaws in this methodology:

The algorithms used by the search engines are not static
Not all programming languages are equally likely to be written about
Languages and technologies are often conflated

Let’s look at each one of those. The search engine market is a constantly changing landscape. Google and Bing are always working towards improving ranking and how results are reported. There is going to be some necessary churn around ranking changes. TIOBE average out a number of search engines in the hopes they can normalize that problem. They use 23 different search engines which is a good number but many of them are very specialized search engines such as Deviant Art. Certain search engines are also given a higher weighting; for instance Google accounts for 28% of the final score. In fact the top 3 search engines account for 69% of the score. I’m no statistician but that doesn’t seem like a good distribution. Interestingly 4 out of the top 5 sources are Google properties with the 5th (Wikipedia) being heavily sponsored by Google.

The second point is that programming languages are not all equally likely to be written about. My feeling is that newer languages and “cooler” languages will gain an unfair advantage here. People are much more likely to be blogging about them than something boring like VBA. I would say half the code I’ve written in the last 6 months has been VBA but I don’t believe I have more than 2 blog posts on that topic.

The third point is the conflation of languages and technologies, and I’m guilty of this: when I talk about .net in most cases I’m really talking about C#. Equally when people talk about Rails they’re talking about Ruby. I’m not convinced that this information is well captured in TIOBE. It is a difficult problem because a search for “rails” is likely to return far more hits than just those related to programming. Context is important and without some natural language processing capabilities I don’t see how TIOBE can be accurate.

The alternatives to TIOBE are not particularly promising. James McKay suggested that looking at job postings and GitHub projects would be a better metric. He specifically mentioned the job aggregator http://www.itjobswatch.co.uk/. I’ve been thinking about this and it seems like a pretty good metric. The majority of development is likely done inside companies so looking at a job site gives a window into the inner workings of companies. Where it falls down is in looking at companies which are too small to post jobs and at open source software. The counterbalance to that is found in GitHub statistics. These statistics are likely to have the opposite bias, favoring upstart languages and open source contributions. I think we’re at the point where if you’re running an open source project you’re running it on GitHub, which makes it an invaluable source of data.

To the mix I would add Stack Overflow as a source of numbers. They are a large enough question and answer site now that they’re a great source of data. I’m not sure what the biases would be there. C# perhaps?

Combining these statistics would be an interesting exercise, perhaps one for a quickly approaching winter’s day.
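As a starting point for that exercise, here is a rough Python sketch of one way the sources might be combined: turn each source's raw counts into shares so that a big source can't swamp a small one, then take a weighted average. The counts below are made up purely for illustration, and the source names, function names and equal default weights are all assumptions of mine rather than anything TIOBE or the sites themselves do.

```python
def to_shares(counts):
    """Convert one source's raw counts into fractions that sum to 1."""
    total = sum(counts.values())
    return {lang: n / total for lang, n in counts.items()}


def combine(sources, weights=None):
    """Weighted average of per-source shares; equal weights by default."""
    if weights is None:
        weights = {name: 1.0 for name in sources}
    weight_total = sum(weights[name] for name in sources)
    combined = {}
    for name, counts in sources.items():
        for lang, share in to_shares(counts).items():
            # A language missing from a source simply contributes nothing.
            combined[lang] = combined.get(lang, 0.0) + share * weights[name] / weight_total
    return combined


def ranking(sources, weights=None):
    """Languages ordered from most to least popular by combined share."""
    combined = combine(sources, weights)
    return sorted(combined, key=combined.get, reverse=True)


# Made-up counts, purely for illustration; these are not real measurements.
EXAMPLE = {
    "job_postings": {"C#": 50, "JavaScript": 30, "Ruby": 20},
    "github_repos": {"C#": 10, "JavaScript": 60, "Ruby": 30},
    "stackoverflow_questions": {"C#": 40, "JavaScript": 40, "Ruby": 20},
}

if __name__ == "__main__":
    print(ranking(EXAMPLE))
```

Passing a different `weights` dictionary is where the biases discussed above would be argued out: weighting the job site more heavily favors corporate languages, weighting GitHub favors the upstarts.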

2013-09-18

Document control and DDD/CQRS - solving similar problems

I had the good fortune to have a two hour introduction to the world of document control the other day. It was refreshing to see that we programmers aren’t the only ones who don’t have things figured out yet. The entire document control process is an exercise in managing the flow and ownership of data. I spent a lot of time thinking about how closely the document control problem and the data flow problem mirror each other.

Document control is really interested in documents and doesn’t care at all about the contents of these documents. Their concerns are largely around:

who owns this document?
what is the latest version of this document?
how is this document identified?
how long do I have to keep this document?
is this document superseded by some other document?

These sound a lot like issues with which we deal when using DDD. Document ownership is simply a problem of knowing in which aggregate root a document belongs. Document versioning is similar to maintaining an event stream. Document identification is typically done through numbering; however the flow of documents is slow enough that sequential numbering isn’t a problem, so there is no need for a randomly generated GUID.

Document retention isn’t something on which we typically spend much time in CQRS land. Storage is cheap so we just keep every version around or at least we’re able to generate every version through event sourcing. Perhaps the most congruent concept is taking snapshots of aggregates, but we’re typically only interested in the most recent version of the aggregate. With document control there is always some degree of manual intervention with documents so there is a significant cost to retaining all documents indefinitely. I’m only talking about digital copies of documents here, Zuul protect you if you need to track paper copies of things too. I can’t even keep track of my keys let alone tens of thousands of documents. My strategy for paper documents would be to burn them as soon as I got them and refer people to the digital version.

Superseding documents also doesn’t seem like a problem we typically have in CQRS. In document control one or more documents may be superseded by one or more documents. For instance we may have a lot of temporary documents which are created by the business, things like requests to move offices. They have value but only in a transitory way. Every week the new office seating chart is built from these office move documents and the documents are discarded. Their purpose is complete and we no longer care about them as we have a summary document.

Many documents become one. I call it a Voltron operation.

In the opposite operation a document can be replaced by a series of documents. This activity is prevalent when adding detail to documents. A single data sheet may become several documents when examined in more detail.

Reverse Voltron? Fan-out? The name may need some work.

This was originally going to be a post about how much we in the DDD/CQRS community have to learn from document control. I imagined that document control was a pretty old and well defined problem. There would surely be well defined solutions. I did not get that impression.

The problem of canonical source of truth or “who owns the data” is a very difficult one in document control. We’re spoiled in DDD because it is rare indeed that the owner of a piece of data can change during its lifespan. Typically the data would remain within an AR and never be updated without the involvement of the AR. With document control it is probable that responsibility could jump from your AR to some other, possibly unknown, AR. It could then jump back. At any point in time it would be impossible to know, without querying every AR, who had control of the data. Of course with a distributed system like many people working on a document it is possible that there will be disagreement about which AR has responsibility at any one time. Yikes!

What we can learn from document control

I think that looking at document control gives us a window into what can happen when you relax some of the constraints around DDD. Data life-cycle is well defined in DDD and we know who owns data. If you don’t then you end up in trouble with knowing who is the source of truth. Document control must solve this problem constantly and it can only be done by going out and asking stakeholders a lot of questions, a time consuming exercise.

The introduction of splitting and combining documents, or in our case aggregates, over their lifetime is disastrous. You lose out on the history of information and knowing where to apply events becomes difficult. Instead we should retain aggregates as unchanged as possible (in terms of what fields they have, obviously the data can change) and rely on projections of the data to create different views of information. This is basically impossible to apply to formatted documents as you would have in document control.

What I think would help out document control

The first thing which comes to mind as being directly applicable to document control is removing meaning from the document identifiers. The documents document control manages tend to be numbered and the temptation to add meaning to a document number is too strong to resist. For instance you might get a number like

P334E-TT-6554

In our imaginary scheme all documents which start with P are piping diagrams. The 334 denotes the system to which it belongs, E the operating pressure and TT the substance inside the pipe. The final digits are just incrementally assigned. The problem is just what you would expect: things change. When they do a decision must be made to either leave the number intact and damage its reliability or to renumber the document and lose the history. Instead document control would do well to maintain an identifier whose sole purpose is to identify the document. The number can be retained but only as a field.
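A minimal Python sketch of that idea follows. The class and field names are hypothetical, and the plain sequential counter is just one reasonable choice of meaningless identifier. The meaningful number lives on the document as an ordinary field, so it can change without the document losing its identity or its history.

```python
import itertools
from dataclasses import dataclass, field

# Sequential and meaningless: the identifier only ever identifies.
_next_id = itertools.count(1)


@dataclass
class Document:
    title: str
    number: str  # the human-facing number, e.g. "P334E-TT-6554"; free to change
    doc_id: int = field(default_factory=lambda: next(_next_id))
    previous_numbers: list = field(default_factory=list)

    def renumber(self, new_number):
        """Change the display number; the identity and the history stay intact."""
        self.previous_numbers.append(self.number)
        self.number = new_number


if __name__ == "__main__":
    diagram = Document("Piping diagram", "P334E-TT-6554")
    # The operating pressure class changes, so the meaningful number changes too.
    diagram.renumber("P334F-TT-6554")
    print(diagram.doc_id, diagram.number, diagram.previous_numbers)
```

Anything that refers to the document does so by `doc_id`, so renumbering never breaks a reference.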

A more controversial assertion is that document control should retain all documentation. We retain a full history of messages used to build an entity, even if it is offline and used in favor of a snapshot. I believe that document control should do the same thing. Merging and splitting documents is problematic and complicated. It is easier to just create a new document and reference the source documents. Ideally the generation of these new documents can be treated as a projection and the original documents retained.
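Here is a rough Python sketch of what that could look like, using the office move example from earlier. The `Archive` class and its methods are hypothetical names of mine. Combining documents produces a brand new document which records its sources, and nothing is ever removed from the archive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: int
    content: str
    sources: tuple = ()  # ids of the documents this one was derived from


class Archive:
    """Append-only store: documents are added and never removed or edited."""

    def __init__(self):
        self._docs = {}
        self._next_id = 1

    def add(self, content, sources=()):
        doc = Document(self._next_id, content, tuple(sources))
        self._docs[doc.doc_id] = doc
        self._next_id += 1
        return doc

    def get(self, doc_id):
        return self._docs[doc_id]

    def combine(self, doc_ids, summarise):
        """Many documents become one: a projection which keeps the originals."""
        originals = [self._docs[i] for i in doc_ids]
        return self.add(summarise(originals), sources=doc_ids)


if __name__ == "__main__":
    archive = Archive()
    first = archive.add("Move Ann to desk 12")
    second = archive.add("Move Bob to desk 7")
    chart = archive.combine(
        [first.doc_id, second.doc_id],
        lambda docs: "; ".join(d.content for d in docs),
    )
    print(chart)
```

Splitting a document is the same operation pointed the other way: several new documents which each list the single original in `sources`.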

In the end it is interesting to see how similar problem domains are solved by different people. That’s the beauty in learning a new development language; every language has different features and practices. I’m not, however, prepared to be the guy who learns document control in depth to bring their knowledge back to the community.

2013-09-11

The City of Calgary doesn't get open data

It seems that the City of Calgary has updated its open data portal. I was alerted to it not by some sort of announcement but by a tweet from Grant Neufeld who isn’t a city employee and shouldn’t be my source of information on open data in Calgary.

Nice new City of Calgary Open Data website was quietly deployed a couple weeks ago! https://t.co/y1PnIuJg0o #yycdata #yyccc

- Grant Neufeld (@grant) September 10, 2013

The new site is better than the old one. They have done away with the concept of having to add data to a shopping cart and then check out with it. They have also made the data sets more obvious by putting them all in one table. They have also opened up an app showcase which is a fantastic feature. It can’t hurt to cross promote apps which make use of your data. There are also a few links to Google and Bing maps which do an integration with the city’s provided KML files. As I’ve said before I’m not a GIS guy so most of that is way over my head.

It is a big step forward... well it is a step forward. I know the city is busy with more important things than open data but the improvements to the site are a couple of days’ worth of work at best. What frustrates me about the process is that despite having several years of lead time on this stuff the city is still not sure about what open data is. I draw your attention to the FoIP requests CSV. First thing you’ll notice is that despite being listed as a CSV it isn’t, it is an Excel document. Second is that the format is totally not machine readable, at least not without some painful parsing of different rows. Third the data is a summary and not the far more useful raw data. I bet there is some supposed reason that they can’t release detailed information. However if FoIP requests aren’t public knowledge then I don’t know what would be.
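A file labelled as one format and shipped as another is easy to catch automatically. The Python sketch below is only a heuristic, with function names and return labels I made up, but the two signatures are the standard ones: an .xlsx file is a zip archive and starts with `PK`, while a legacy .xls file starts with the OLE2 container header.

```python
XLSX_MAGIC = b"PK\x03\x04"  # .xlsx files are zip archives
XLS_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # legacy .xls (OLE2 container)


def sniff_format(first_bytes):
    """Guess what a download really is from its leading bytes (heuristic)."""
    if first_bytes.startswith(XLSX_MAGIC):
        return "xlsx"
    if first_bytes.startswith(XLS_MAGIC):
        return "xls"
    if b"\x00" in first_bytes:
        # NUL bytes are a strong hint of some other binary format.
        return "unknown"
    return "text"


def sniff_file(path, sample_size=512):
    """Apply sniff_format to the first few hundred bytes of a file."""
    with open(path, "rb") as handle:
        return sniff_format(handle.read(sample_size))
```

A publishing pipeline could refuse to list anything as CSV unless `sniff_file` returns `"text"`, which would at least stop spreadsheets from masquerading as machine readable data.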

Open data is not that difficult. I’ve reproduced here the 8 principles of open data from http://www.opengovdata.org/home/8principles

1. Data Must Be Complete

All public data are made available. Data are electronically stored information or recordings, including but not limited to documents, databases, transcripts, and audio/visual recordings. Public data are data that are not subject to valid privacy, security or privilege limitations, as governed by other statutes.

2. Data Must Be Primary

Data are published as collected at the source, with the finest possible level of granularity, not in aggregate or modified forms.

3. Data Must Be Timely

Data are made available as quickly as necessary to preserve the value of the data.

4. Data Must Be Accessible

Data are available to the widest range of users for the widest range of purposes.

5. Data Must Be Machine processable

Data are reasonably structured to allow automated processing of it.

6. Access Must Be Non-Discriminatory

Data are available to anyone, with no requirement of registration.

7. Data Formats Must Be Non-Proprietary

Data are available in a format over which no entity has exclusive control.

8. Data Must Be License-free

Data are not subject to any copyright, patent, trademark or trade secret regulation. Reasonable privacy, security and privilege restrictions may be allowed as governed by other statutes.

The city is failing to meet a number of these. They are so simple, I just don’t get what they’re missing. The city employees aren’t stupid so all I can conclude is that there is either a great deal of resistance to open data somewhere in the government or nobody is really convinced of the value of it yet. In either case we need a good push from the top to get going.

2013-09-10

So I wrote a book

I’ve been pretty quiet on the old blog front as of late. This is largely attributable to me being busy with other things. The most interesting of which, in my mind, is that I wrote a book. It isn’t a very long book and it isn’t a very exciting book but I’m still proud of having written the little guy. This post is less about the book itself and more about what it was like to write a book.

First off it is a lot of work. Far more work than I was originally expecting. I’ve written lengthy things before, most notably about 100 pages during my master’s. This was different because I didn’t feel like I knew the content as well as I did for the master’s paper. Having restrictions on the length of the chapters was the most difficult part. Due to some confusion about the margins for a page I started by writing the equivalent of 15 pages for a 10 page chapter. I did this for 4 chapters before my editor caught it. I agreed to cut the content down and get back on track in accordance with the outline.

This was a mistake. It was really hard to cut content to that degree. A few words here or there was easy enough but what amounted to a third of the chapter content? Tough. Later in the project I realized that keeping within the outline pages was not nearly as important as I had been led to believe. After throwing the limits out the window the writing process became much easier.

In order to treat writing a book with the same agile approach one might use for developing software it seems crucial to not involve page counts at all. A page count is a poor metric and I have no idea why one would optimize for it. Obviously there should be some rough guidelines for the whole thing: you don’t want to end up with a 1000 page book when you only set out to write 200, nor do you want 200 pages when you set out to write 1000. But writing to within 50% of the target length is reasonable.

To put too much emphasis on length is to lose sight of the goals of the book. These are much more along the lines of education or entertainment or something like that. The goal isn’t to kill X trees.

Are books still relevant?

Umm, I don’t know to be honest. I don’t read many programming books these days; I spend my time reading blogs and tutorials instead. I think there is still a space for paper form technical books even in a fast moving world like computer programming. There is certainly a place for books about techniques or styles or about the craft of programming in general. I have some well thumbed copies of Code Complete and The Pragmatic Programmer and even Clean Code. I do not, however, think there is a place for technology specific paper books. That target moves too quickly.

The long form technical document is not dead it just needs to remain spry. If you’re going to publish a longer book style document then publishing it in a form which can be changed and updated easily is key. This is where wikis and services like leanpub come into their own. As an author you need to keep updating the book or open it to a community which will do updates for you.

Would I do it again?

Not at the moment. Not through a traditional publisher. Not on my own.

I’ve had enough of writing books for now. I’m going to take a break from that, likely a long break. I might come back to it in a year or two but no sooner. I think I can understand why authors frequently have long breaks between their books. It is an exhausting slog, a death march really.

There was nothing wrong, per se, with Packt Publishing. They did pretty good work and I liked my editor or editors or many many editors... I’m not sure how many edits we had on some chapters. Frequently it would be edited by person X and then those edits reversed by person Y. There didn’t seem to be an overall guiding hand which was responsible for ensuring a quality product. Good editing has to be the selling feature for publishers, the way they attract both authors and purchasers. It is the only thing which sets them apart from self publishers.

Self publishing and micro-printing are coming into their own now. By micro-printing I mean being able to produce small runs of books economically rather than printing in very small text. If I were to do it again I would take this route. I would also hire a top notch editor who would stay with the project the whole time, somebody like @SocksOnBackward (she would tell me that top notch should be top-notch).

I would also like to work with somebody. Writing alone is difficult because there is nobody off of whom you can bounce ideas. I certainly could have reached out to random people I know in the community but it is a lot to ask of them. I would be much happier having somebody who could share the whole endeavour with me.

I guess watch this spot to see if I end up writing another book.

2013-08-28

Configuration Settings in an Azure Worker Role

I have found that developing an Azure worker role is somewhat poorly documented. Perhaps I am just not good at googling for what I need but that has not been my experience in the past. Anyway I have a few worker roles in a project and I’ve been struggling with how to get connection strings into them. I need two strings: a database connection string and an Azure storage connection string. Typically I would set this up by having an app.config file but that doesn’t scale particularly well out to the cloud. You have to redeploy to change any of the settings. Instead I thought I would make use of the settings mechanism provided by Azure.

The first step is to set up the settings in your Azure project. This is done by opening up the properties for the role and going to the settings tab.

Good try, Harriet the Spy, you can’t see anything secret in this screenshot.

I added two settings: StorageConnectionString and DefaultConnection. While writing this I decided that I hate both those names, drat. You can pick the environment in the service configuration drop down. Cloud is used in the cloud and local is used during a locally emulated cloud.

In my code I created a static helper class to access these settings

You can see that I’m checking two sources for the information. If the role environment provider works then I use that connection string otherwise I use the fallback to the configuration file. For some reason the role environment bails with a full on exception if you try to get configuration information out of it without it running in the cloud or emulator. That seems like overkill, especially because the exception thrown is super general and provides no helpful information.
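The shape of that helper is simple enough to sketch. What follows is a rough Python rendering of the pattern rather than the C# helper itself: every name in it is hypothetical, the role environment is faked with a function which always throws, and the database connection string is a placeholder. It tries each source in order and falls through to the next one whenever a source blows up.

```python
class SettingUnavailable(Exception):
    """Raised when no provider can supply the requested setting."""


def get_setting(name, providers):
    """Return the value from the first provider which can supply it."""
    for provider in providers:
        try:
            return provider(name)
        except Exception:
            # The role environment throws a very general exception outside the
            # cloud or emulator, so catch broadly and try the next source.
            continue
    raise SettingUnavailable(name)


def role_environment(name):
    # Hypothetical stand-in for the Azure role environment when the code is
    # running outside both the cloud and the emulator.
    raise RuntimeError("role environment is not available")


# Hypothetical stand-in for the values held in the local configuration file.
CONFIG_FILE = {
    "StorageConnectionString": "UseDevelopmentStorage=true",
    "DefaultConnection": "Server=localhost;Database=example",  # placeholder
}


def config_file(name):
    return CONFIG_FILE[name]


if __name__ == "__main__":
    print(get_setting("StorageConnectionString", [role_environment, config_file]))
```

Keeping the providers in a list makes the precedence obvious: role environment first, configuration file second.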

The config file settings are used when I’m running tests on the service locally outside of the emulator. Frequently the emulator is overkill for the simple debugging I’m doing, so I have a unit test which can be enabled that just launches the service. It is quick and close enough to production for most purposes.

In Azure proper you can configure overrides for the cloud settings in the configure tab of your cloud service.

In Azure you can update the configuration settings.

All this seems to work pretty well.

2013-08-01

C# Contracts

A few weeks ago I stumbled on an excellent video of Greg Young talking at Øredev back in 2010. The topic was object oriented programming and, basically, how I’m an idiot. Not me in particular, it would be somewhat upsetting if Greg had taken the time to do an hour talk on how Simon Timms is an idiot. Upsetting or flattering, I’m not sure which. It is a very worthwhile video and you should make time to watch it. One of the takeaways was about code contracts.

I’ve never given much thought to code contracts before. I was never too impressed by what I considered to be a bunch of noise which tools like Resharper add to your code.

http://gist.github.com/stimms/6133393

“Asserting stuff is all nice and good but it should be caught by unit tests anyway” was my thought. I have a lot of respect for Greg so I thought I would look into code contracts. I look on them as a sort of extension to interfaces. Interfaces are a programmatic way of describing how an implementation should look. For instance a common interface in the projects I build is ILog which is an interface for logging. It is typically modeled after the ILog interface from Log4Net although it now includes some practices I picked up from my preferred logging framework, NLog.

The compiler guarantees that anything which implements that interface has at least some sort of implementation in place for each one of the defined methods. The compiler doesn’t care what the implementation is so long as there is one there. This allows me to create a “valid” implementation which looks like

This implementation doesn’t actually do what I had intended when I specified the interface. Unfortunately, there is no way, through interfaces, to require that functions actually do what they claim to do. Code contracts add another layer of requirements to implementations and allow for the enforcing of some additional conditions. Having contracts in place allows you to replace many of your unit tests with static checking. Want to ensure that null isn’t passed in? Build a contract. I decided to dig a bit more into how code contracts were working for C#.
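The same gap shows up in any language with interfaces. As a rough analogue in Python rather than C#, with made-up class and method names, an abstract base class forces subclasses to define the methods but has no opinion on whether they do anything.

```python
from abc import ABC, abstractmethod


class Log(ABC):
    """Hypothetical logging interface, loosely in the spirit of ILog."""

    @abstractmethod
    def debug(self, message):
        ...

    @abstractmethod
    def error(self, message):
        ...


class SilentLog(Log):
    """A 'valid' implementation: every method exists and none does anything."""

    def debug(self, message):
        pass

    def error(self, message):
        pass


class ListLog(Log):
    """An implementation which does what the interface author intended."""

    def __init__(self):
        self.entries = []

    def debug(self, message):
        self.entries.append(("DEBUG", message))

    def error(self, message):
        self.entries.append(("ERROR", message))


if __name__ == "__main__":
    for log in (SilentLog(), ListLog()):
        log.error("disk is full")
        print(type(log).__name__, getattr(log, "entries", "nothing recorded"))
```

Both classes satisfy the interface check equally well, which is exactly the problem contracts are meant to address.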

As it turns out finding information on code contracts for C# is really difficult. There have been a couple of efforts over the years to bring code contracts into the .net world. The latest and, seemingly, most successful is as part of the PEX project. There was a burst of videos and activity on that project in 2010 but since then activity seems to have fallen off rather dramatically. Most everything in the code contracts works but it is somewhat flaky on Visual Studio 2012.

To get started you need to install two Visual Studio extensions: Code Contracts Tools and Code Contracts Editor Extensions VS2012. You can also install the Code Digger which displays a table of inputs which are checked for your methods. It is useful but is crippleware compared to how it is shown to work in videos like this one. The tool used to have the ability to generate unit tests but as I understand it this functionality is limited to Visual Studio Ultimate. I’m not fabricated from money so I don’t have that. Boo. (well not “Boo” for not being made from money rather “Boo” for the restriction. I’m glad I’m not made from money. Money is filthy)

Code contract extensions

Once you have these extensions installed you can start playing around with code contracts. When you come across a method which has contracts attached to it they will be shown in the IntelliSense hint. Some parts of the .net BCLs have received code contracts treatment. However it is wildly inconsistent which parts have contracts associated with them. Some places where I think they would be useful have been missed and other places are oddly over specified. For instance System.Math:

Missing contracts

Overly complicated contracts

The contracts on Math.Ceiling are pretty obvious yet they don’t seem to have been implemented. Irritating!

If you would like to specify contracts on your own code then, as far as I’m concerned, you should do it at the interface level. Always. You can put contracts on your concrete classes but then you’re all coupled to implementation and that sucks.

Because code contracts are implemented as a library instead of being part of the language syntax as in Eiffel you need to set them up in buddy classes next to your interfaces. It is a real shame that they went this way and perhaps, once Roslyn gets going, there will be a way to modify the language with new key words to deal with contracts.

Let’s say you have a class which does some math, specifically it takes a square root of a number.

This class is an implementation of the IMath interface

Here I’ve added an annotation which points to another class as containing the contracts. I actually really like that the contracts are split out into another class. It keeps the code short and still allows communicating the information about the contracts via IntelliSense. The buddy class looks like:

For some reason I don’t really understand you need to specify the class for which it is a contract in an annotation. I think that pollutes the idea of a contract. The implementer should know about what contract it implements but the contract shouldn’t care at all. Each method on which you want a contract is specified and you can put in requires (preconditions) and ensures (postconditions). We’ll ignore the existence of i to make a point. The method is never executed so the remainder of the body is not important.
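To make requires and ensures concrete without the C# tooling, here is a small Python analogue with names of my own invention. It is a different animal in one important way: the C# tools can check contracts statically, whereas these decorators only check at run time, when the function is actually called.

```python
import functools
import math


class ContractViolation(Exception):
    """Raised when a precondition or postcondition does not hold."""


def requires(predicate, message):
    """Check a precondition against the arguments before the call."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if not predicate(*args, **kwargs):
                raise ContractViolation("precondition failed: " + message)
            return func(*args, **kwargs)
        return wrapper
    return decorate


def ensures(predicate, message):
    """Check a postcondition against the result after the call."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            if not predicate(result):
                raise ContractViolation("postcondition failed: " + message)
            return result
        return wrapper
    return decorate


@requires(lambda value: value >= 0, "value must not be negative")
@ensures(lambda result: result >= 0, "result must not be negative")
def square_root(value):
    return math.sqrt(value)


if __name__ == "__main__":
    print(square_root(9))
    try:
        square_root(-9)
    except ContractViolation as problem:
        print(problem)
```

Because `requires` is the outermost decorator the precondition is checked before anything else runs, so a negative argument is rejected by the contract rather than by `math.sqrt`.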

You can try the contract out by attempting to pass in an illegal value.

This will result in errors like

A failing contract

This isn’t very exciting because, of course, -9 is a negative number. Where things get interesting is when you start coupling together contracts.

This will also fail because the contract checker will actually go out and build up a representation of how data moves around the application. It is able to spot the conflicting contracts and warn about them.

The checking won’t actually be run unless you enable it in the properties of your project. I couldn’t find any setting which showed IntelliSense for the contracts I had created. I believe that is just supposed to work but it didn’t on the machine I used.

Settings for contract checking

If you run into a contract which is failing and you can’t quite figure out what’s going on then the PEX Code Digger can come in handy. You can right click on the method with the contract and it will show you the paths through the method which caused a contract failure. By default it only works on portable class libraries, I understand you can reconfigure that but I don’t know what the repercussions are of that. So I created a portable class library.

Portable Class Library

The System.Diagnostics.Contracts namespace in which the contracts code lives is not part of any of the 4.0 portable subsets. You’ll need to get one of the .net 4.5 portable subsets. That’s not an obvious task. To do it you need to add a brand new library to your project and it needs to use the portable class library template.

New portable library

You’re then given a choice of platforms. Many of these platforms are not natively .net 4.5 and will result in a 4.0 library. It took some playing around but I found that this combination worked:

Only contract killers on the xbox, no code contracts

Conclusions

I don’t know about contracts. They have the potential to speed up unit testing by creating your tests for you. Well some of your tests. The simple boiler plate tests that everybody skips doing because they’re mind numbing are largely eliminated. Anything which removes a barrier to the adoption of TDD is a good thing in my mind.

However I don’t think the implementation for C# is ready yet. Maybe it never will be. I asked around a bit but nobody seems to know what happened to code contracts. Are they still being developed? If so where is the activity? How come the editor stuff doesn’t work for my code contracts? Contract checking is also super slow. Even on this small application running the checks took a minute. I cannot imagine what it must do on a large project. Contract checking seems like it might be the sort of thing you run on that build which runs over the weekend. That sort of long feedback cycle is terrible. The better solution is to run the contracts, generate unit tests from them and run the unit tests. However, like I said, that feature seems to have been moved to the elite SKUs.

I won’t be using contracts but I will be keeping an eye out for news of continued work on them.

2013-07-30

How I broke the Linux

Years ago I was big into the Linux. Heck I was big into all Unix stuff. I had a 3-node cluster of Solaris 9 servers in my basement once which, having been built from old hardware, was probably slower than any other single machine on my network. But then I got tired of screwing around with Linux and FreeBSD and OpenBSD and (I was young, I swear) OpenVMS. I got old and I just wanted things to work. If I buy a new video card I don’t want to recompile my fricking kernel from sources I downloaded using an FTP client I wrote myself based on an argument I had with RMS in which I accused him of being a Microsoft shill. Just work, damn it.

That being said I keep a few Linux boxes around to do things like serve files and do DHCP and such. It was one of these boxes I rebooted after some updates last week. Now this box is amazingly stable and I have it on a UPS so its uptime was over 500 days. When it came back up a drive was missing. “That’s weird” I thought and dug into it. This is my primary drive which contains terabytes of completely legally obtained videos. In a fit of anti-police sentiment I had encrypted the snot out of the drive with TrueCrypt. I had no idea what the passphrase was. I just remembered it was long. Like mind-bogglingly long. So long that, were he still alive, Robert Jordan would be impressed. This drive was never going to be cracked and I didn’t have the passphrase.

Well shoot.

So I decided I would throw the whole thing out and start over. All my important files were backed up to CrashPlan using a key I actually remembered. I would reformat and start over. A fresh start! I could get higher res versions of the stuff I had lost. Name them sequentially. It would be gloriously well ordered. Then I made my second mistake. I decided to upgrade the OS to the latest while I was in there.

Turns out that when I set up that machine I had used a software RAID1. It had, of course, never really worked properly. During boot md (the software RAID) would complain about being degraded. It never really seemed to be a big deal and I had run out of time when setting it up so I had left it. Turns out that one of the changes in the new version of the OS was to make this warning a fatal error. Now the system won’t boot.

I get dropped into a recovery console and I sigh. Fortunately I still had some memory of how Linux works so I started to debug

>dmesg | less
less: command not found
>god damn it you stupid recovery shell
god: command not found
>tail dmesg

The final couple of lines of the output pointed to md as the culprit. I was going to rebuild the array anyway so I moved mdadm.conf out of /etc/mdadm to a backup so the system wouldn’t try to mount any md drives. However, as it turns out, that does nothing now in Ubuntu. Since I stopped knowing about Linux they seem to have created an init ram disk into which a subset of files is loaded. I have no recollection of this existing so it may be new or I may have just never run into it before. Anyway it holds a protected, secret copy of your mdadm.conf file so you can change the one in /etc forever and your system still won’t boot. I call this the “you stupid newb” ram disk.
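For what it is worth, here is a rough sketch of how one might peek inside that ram disk to see whether it really is carrying its own mdadm.conf. It assumes Ubuntu's initramfs-tools package is installed (it ships the lsinitramfs command) and that the image for the running kernel sits at /boot/initrd.img-<kernel version>. The find_mdadm_conf helper is just a name made up for this example.

```shell
#!/bin/sh
# Filter a file listing (one path per line on stdin) down to mdadm.conf entries.
# find_mdadm_conf is a hypothetical helper name used only in this sketch.
find_mdadm_conf() {
    grep -E '(^|/)mdadm\.conf$'
}

# Assumed Ubuntu layout: the image for the running kernel lives in /boot.
image="/boot/initrd.img-$(uname -r)"

if command -v lsinitramfs >/dev/null 2>&1 && [ -r "$image" ]; then
    # lsinitramfs prints every path packed into the image.
    lsinitramfs "$image" | find_mdadm_conf || echo "no mdadm.conf found in $image"
else
    echo "lsinitramfs or $image not available; nothing to inspect" >&2
fi
```

If a path ending in mdadm.conf shows up in the listing, the ram disk has its own copy and edits under /etc will not reach it until the image is rebuilt.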

By this point I’d discovered that you could append

bootdegraded=true

to the kernel line to at least get a system up with a degraded array. I did that and managed to get into the system long enough to delete the array

mdadm --stop /dev/md0
mdadm --zero-superblock /dev/sd[bc]1

create a new array (RAID 0 this time)

mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdb1 /dev/sdc1

update the mdadm.conf file

head -n -1 /etc/mdadm/mdadm.conf > /etc/mdadm/mdadm.conf.new
mdadm --detail --scan >> /etc/mdadm/mdadm.conf.new
mv /etc/mdadm/mdadm.conf.new /etc/mdadm/mdadm.conf

and set the init ram disk back up

update-initramfs -u

I rebooted and found everything to be in working order. Thank goodness.
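The mdadm.conf rewrite above leans on GNU head's negative line count: head -n -1 prints everything except the last line, which is how the stale array definition gets dropped before the fresh one is appended. Below is a small dry run of that same pattern on a scratch file, so it can be tried without touching a real array. The rewrite_last_line function name and the ARRAY/UUID strings are made up for illustration and are not real mdadm output.

```shell
#!/bin/sh
# Dry run of the "drop the last line, append a new one" edit on a scratch file.
# rewrite_last_line is a hypothetical helper; the ARRAY lines are placeholders.
rewrite_last_line() {
    conf="$1"       # file to edit
    new_line="$2"   # line that replaces the final line
    head -n -1 "$conf" > "$conf.new"          # GNU head: all but the last line
    printf '%s\n' "$new_line" >> "$conf.new"  # stands in for mdadm --detail --scan
    mv "$conf.new" "$conf"
}

scratch=$(mktemp)
printf '%s\n' 'DEVICE partitions' 'MAILADDR root' \
    'ARRAY /dev/md0 UUID=old-placeholder' > "$scratch"

rewrite_last_line "$scratch" 'ARRAY /dev/md0 UUID=new-placeholder'
cat "$scratch"
rm -f "$scratch"
```

Note that this only works if the old array definition really is the last line of the file, and that the negative count is a GNU extension which other versions of head may not accept.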

So the lesson here is don’t touch anything which is working. Don’t touch it ever or you will break it and have to spend all evening fixing it instead of churning butter. Which would have been more fun.

Fun riot!

2013-07-28

Exciting Year for Calgary .net

For ages I’ve been meaning to get more involved in the local .net community and really the whole tech community in Calgary. This last year was my year of effort and I’ve been out to a couple of activities and a couple of groups which made me feel old and stupid (I’m looking at you YYC.js). As it turned out the Entity Framework Demi-God David Paquette picked this year to move to hotter climates leaving the presidency of the .net group here in Calgary open. Upon discovering this I immediately called Bradley Whitford and we launched an exploratory committee. I knew that Alan Alda was gunning for the same position but Bradley and I dug up some dirt on him and I coasted the rest of the way.

It wasn’t a clean campaign but I won. We’ve also had a couple of other people join the .net user group executive and we’ve managed to retain most of the old team to boot. It is a perfect mixture of the seasoned and the new.

I am really excited for our talks this year. Already we’ve got two talks set up and I’m sure we’ll have a bunch more in no time. Typically the theme of our talks has been “What’s new and awesome”. That’s pretty much going to continue this year but with a bit more emphasis on “awesome” than on “new”. We’re looking to do talks on the likes of TypeScript, F# and NoSQL databases. We’re also partnering with a couple of other groups in town to do a showdown between Ruby on Rails and ASP.net MVC and something with the JavaScript group which hasn’t yet been fleshed out.

We’re actively looking to increase our membership and our sponsors as well as our relationships with people looking to bring specialized training into town.

This season is going to ~~rock~~ be a jolly good time.

If you want to have a say in our topics or come out to any events be sure to join our brand new meetup site over at http://www.meetup.com/Calgary-net-User-Group/.

2013-07-19

Southern Alberta Flooding

In the past couple of weeks there have been two big rain storms in Canada which have caused a great deal of flooding. The first was the Southern Alberta floods and the second was the flood in Toronto. I was curious about how the amount of rain we have had stacks up against some other storms. I was always struck by the floods in India during the monsoon season so I looked up some numbers on that and also on the world record for most rain in 24 hours.

Of course I wanted to create a visualization of it because that’s what I do. Click on the picture to get through to the full visualization

Click for details

Now I know that the amount of rain is just one part of the flood story but the numbers are still interesting. Can you imagine being around to see 1.8m of rain fall in 24 hours? I guess it was the result of a major hurricane. Incidentally Foc-Foc is on the island of Réunion near Madagascar. I’d never heard of it, despite 800 000 people living there.

I used these as the data sources:

Toronto 126mm - http://www.cbc.ca/news/canada/toronto/story/2013/07/09/toronto-rain-flooding-power-ttc.html

Calgary 45mm - http://www.cbc.ca/news/canada/calgary/story/2013/06/21/f-alberta-floods.html

Mumbai 181.1mm - http://www.dnaindia.com/mumbai/1845996/report-mumbai-gets-its-third-highest-rainfall-for-june-in-a-decade-at-181-1-mm
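To put those totals side by side, here is a quick back-of-the-envelope sketch that expresses each one as a share of the Foc-Foc record. It takes the record as 1800mm, which is just the rounded 1.8m figure mentioned above rather than an exact measurement, and rain_share is a name invented for this example.

```shell
#!/bin/sh
# Express each 24-hour rainfall total as a share of the Foc-Foc record.
# Assumption: the record is taken as 1800 mm, the rounded 1.8 m quoted in the post.
# rain_share is a hypothetical helper; input is "<city> <millimetres>" per line.
rain_share() {
    awk -v record_mm=1800 '{ printf "%s %.1f%%\n", $1, $2 / record_mm * 100 }'
}

printf '%s\n' 'Calgary 45' 'Toronto 126' 'Mumbai 181.1' | rain_share
```

On those numbers Calgary comes out at 2.5% of the record, Toronto at 7.0% and Mumbai at 10.1%.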
