2013-06-21

The Big Problem in Development

I frequently hear people saying “the difficult problem in computing is X” or “nothing is hard in computing but Y” for some value of X and Y. The most common one I hear is Phil Karlton’s quote:

There are only two hard things in Computer Science: cache invalidation and naming things.

I think that both naming and cache invalidation are hard but I don’t think they are the hardest problems in computing science. Nor do I think “Is P == NP?” is the hard problem. It is certainly hard but it isn’t all that important except for a certain class of problems. Maybe I’m doing the wrong things with my life but I can count on two hands the number of times I’ve run into a problem where the solution was a linear approximation or a search in NP space.

No, I think the big problem is: how is this going to change in the future?

Every design decision is informed by how much you’re going to need to change the code in future and in what ways it is going to change. Should you use dependency injection? The answer lies in whether you expect new implementations to be needed in future. Should you use a message based architecture? Well, are other services going to be interested in receiving notifications in future? How about deploying your reports to the client or to the server? If the reports are going to change frequently then deploy to the server, but if they’re static then client deployment provides a better user experience. Even problems like designing for scalability are informed by a knowledge of just how much the application is going to need to scale in the future.

The worst part about this problem is that it is clearly unsolvable. All you can do is guess about how maintainable and future proof your code base needs to be and make a decision. Whether your guess is too pessimistic or too optimistic, there is going to be a cost for guessing wrong. I’m a software guy more than a business guy so I think that the best bet is to guess pessimistically and build your application to be more resilient to change. Business people might side with building the application as quickly and easily as possible because you don’t know if it is going to succeed in the first place.

Make no mistake, it is a conundrum. When you pay for senior developers and architects you’re paying for people who’ve been around enough to make good guesses. In my experience good guessing more than pays for the cost of hiring good developers.

If you happen to come up with a way to tell the future and solve this problem then let me know. You could just let me know by sending me the winning lotto numbers for the next 5 or 6 major lottery jackpots.

2013-06-20

I don't hate OAuth

It’s weird that I don’t hate OAuth. It is a combination of lots of things I hate: it is a complicated protocol and it is supported by Facebook (whom I strongly dislike). Yet OAuth and OpenID are both technologies I support fully.

OpenID is a method of delegating authentication to a third party. So say I wanted to have user accounts on my site but I didn’t want to go through the trouble of hashing passwords (you’re not hashing with SHA are you?). Instead I can delegate all the messiness of storing password information to a third party. When a user signs in to my site I’ll actually have them sign in with their choice of account. This third party will pass a token back to my site to let me know that the user did sign in successfully. Any time you see those “sign in with” buttons chances are that it is implemented using OpenID.

OAuth is a similar concept to OpenID except that instead of the third party site giving me an authentication token I ask it for permission to access a protected resource. So if I was writing a tool for displaying tweets in your timeline I would need to access the protected information held by Twitter. My application would refer you to the server (Twitter) which would ask for your password and then refer the session back to my application. My application never sees your password; even while my app is accessing Twitter you can remain confident that only Twitter is getting your password.

I really like the idea that only tokens are passed around and never passwords. It also means you can revoke the access of an application to a protected resource at any time without invalidating your password.

OAuth has a reputation for being not just difficult to implement but also inconsistent from one implementation to another. This is not wholly an undeserved reputation. The fact that there exist two competing standards, 1.0a and 2.0, doesn’t help at all. There is some argument that 2.0 is less secure and probably should not be used. I’m not versed enough to give an opinion on that.

If you want to provide API access to your data then OAuth is probably worth looking into even if implementations are a bit spotty.

2013-06-19

Browser Performance Counters

I was reading a blog post the other day on dealing with memory leaks in Gmail. I’ve completely lost the link now but it put me onto a lot of interesting client side tools of which I was only tangentially aware. Readers of this blog will know that I’m on this constant push towards shifting functionality from the server to the client. My push is just part of the great cycle of architecture.

Hooray! It’s another one of Simon’s MS Paint graphs

In another 10 years we’ll start moving all our functionality back to the server as a new generation discovers the power of centralizing your logic. It is a funny old world.

Anyway this post is about the information available to you within Chrome’s performance namespace. If you open up any page in Chrome and hit F12 you’ll get the developer tools. In there grab the console and type ‘performance’. The object returned has a number of properties; largely these properties are filled with counters for memory and load times.

There are a lot of counters in there so let’s take it one by one. The first one is performance.memory. This contains a MemoryInfo object which in turn contains 3 counters. jsHeapSizeLimit is the maximum size available to the heap, usedJSHeapSize is the amount of memory currently used and totalJSHeapSize is the total currently reserved heap space including segments not currently occupied. These counters can be used to find pages which use a lot of memory or find memory leaks in long living single page applications.
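As a rough sketch of how those three counters might be boiled down into something readable: summarizeHeap is a name I made up for this example, and the sample object is only a stand-in so the snippet runs outside Chrome.

```javascript
// Summarize a MemoryInfo-like object (the shape Chrome exposes as performance.memory).
// summarizeHeap is a hypothetical helper, not part of any browser API.
function summarizeHeap(memory) {
  // Convert bytes to megabytes, rounded to one decimal place
  const toMb = (bytes) => Math.round((bytes / (1024 * 1024)) * 10) / 10;
  return {
    usedMb: toMb(memory.usedJSHeapSize),
    reservedMb: toMb(memory.totalJSHeapSize),
    limitMb: toMb(memory.jsHeapSizeLimit),
    // Fraction of the heap limit currently in use
    usedFraction: memory.usedJSHeapSize / memory.jsHeapSizeLimit,
  };
}

// Stand-in values (assumed, not measured); in Chrome's console you would
// pass performance.memory instead.
const sampleMemory = {
  jsHeapSizeLimit: 200 * 1024 * 1024,
  totalJSHeapSize: 50 * 1024 * 1024,
  usedJSHeapSize: 25 * 1024 * 1024,
};
const heapSummary = summarizeHeap(sampleMemory);
console.log(heapSummary);
```

Logging usedFraction every few minutes in a long lived single page application is one cheap way to spot a heap that only ever grows.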

The next set of counters are timing records. Typically you would do timings on the server side but the client side is really far more interesting as it gives a true reflection of your user’s experience. The timing stuff is pretty complete and is specified in the W3C’s Navigation Timing specification. They have this awesome picture to give you an idea of when the timings are taken:

Timeline

The timing information is only available in the client but you can easily wrap it up and send it back to your server. This performance information gives you a great way to gather as much information as possible about what’s happening for your users. You can find slow DNS lookups, inefficient edge caching and complicated DOM events. I’m really excited to see what ideas I can come up with for this information.
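Here is a minimal sketch of the “wrap it up” part, assuming you only care about a handful of durations. The property names come from the Navigation Timing specification; summarizeTiming, the sample numbers and the /timings endpoint mentioned in the comment are all my own inventions.

```javascript
// Turn a PerformanceTiming-like object into durations in milliseconds.
// summarizeTiming is a hypothetical helper; the fields it reads are
// defined by the Navigation Timing specification.
function summarizeTiming(t) {
  return {
    dns: t.domainLookupEnd - t.domainLookupStart,
    connect: t.connectEnd - t.connectStart,
    timeToFirstByte: t.responseStart - t.requestStart,
    download: t.responseEnd - t.responseStart,
    domReady: t.domContentLoadedEventEnd - t.navigationStart,
    pageLoad: t.loadEventEnd - t.navigationStart,
  };
}

// Made-up timestamps so the snippet runs anywhere. In a browser you would
// pass performance.timing once the load event has finished and POST the
// JSON to an endpoint of your own (say, a hypothetical /timings URL).
const sampleTiming = {
  navigationStart: 1000,
  domainLookupStart: 1005,
  domainLookupEnd: 1025,
  connectStart: 1025,
  connectEnd: 1060,
  requestStart: 1061,
  responseStart: 1161,
  responseEnd: 1201,
  domContentLoadedEventEnd: 1500,
  loadEventEnd: 1900,
};
const timingSummary = summarizeTiming(sampleTiming);
console.log(JSON.stringify(timingSummary));
```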

The timing counters are available on all the modern web browsers and also Internet Explorer (ooooohhh, BURN). The memory counters are a Chrome-specific extension, so don’t count on them elsewhere.

2013-06-17

Selector Selection

In code review last week the topic of efficient CSS selectors in your JavaScript came up. I have always been a fan of using as simple selectors as possible. Typically I use a class name on the elements I’m selecting. Others on the team are fans of using more complex selectors such as

body > div > div:first > span:first > p

I hate this. I think that it is confusing for a human to figure out what’s being selected there without hunting through the DOM. I also think it is incredibly brittle. All it takes is adding one element somewhere in that chain and your selectors stop working. With dynamic pages that hierarchy may well be open to change even during the life of the page. I think this example is particularly bad because it descends right from the root of the body. Even something simple like adding a notification box is going to break this.

My argument was not well received. The counter argument was that any changes to the page would require a change to the JavaScript anyway so it isn’t a big deal. Humm… okay, different approach.

Well I didn’t have the stats on speed of selectors at the time but today I found some. The benchmark somebody built at jsperf.com is fantastic for my purposes.

Benchmark! Way better than a benchmatt.

Ah ha! Looks like my method of selecting using just a class is way more efficient than parent child selectors (51% slower vs 80% slower than optimal). However my method is still quite a bit slower than having a context and using find. I was a bit surprised by that because I expected that, given how common selecting by class is, it would have been optimized to the umpteenth degree. I was even more shocked to see that specifying the tag and then a class wasn’t faster than just selecting a class.

I think the take away here is to use classes over the confusing mess above (sorry, coworkers) and that having an id at least on a parent container allows for more efficient selection. I’m also not convinced that these aren’t micro-optimizations which should be ignored, but as the approaches above can be used in place of each other you might as well use the more optimal version.

2013-06-14

Automatic Credit Card Type Selection

I can’t tell you the number of times I go to a website to buy something and I have to select the type of credit card being used. It is a minor step but by gum you want to make checking out of your website as painless as possible for people. That’s why Amazon’s one click was such an amazing idea. I frequently get to that last step and abandon the order because I’ve thought better of it (sorry, WebStorm, one of these days).

Instead of asking for a credit card type why not use these simple rules (taken from eHow):

The first two to six digits of a credit card determine who the credit provider is. For example, Master Card begins with the numbers 51-55, American Express begins with 34 or 37 and Discover begins with 6011

So it looks like from the first few digits you can tell what the card type is without having to ask your users. There is a much more complete list of the prefixes on Wikipedia. According to that article these card numbers are actually part of a larger scheme which can identify the major industry as well as the sort of card.

As an example of detecting the card type I threw together a little script. You can try it out here. It currently detects Visa, MasterCard, American Express and China UnionPay cards (hey, they have 1.2 million ATMs, they’re doing something right). Just type in a couple of digits.

  • 4 Visa
  • 34 American Express
  • …

So much better than making people type and so quickly done.
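The script itself isn’t reproduced here, so what follows is only a sketch of how the prefix rules could be applied, not the code behind the demo. The prefix table and the name detectCardType are assumptions for illustration; real issuer ranges are broader than this and change over time, so check a current list before trusting it with anything that matters.

```javascript
// Hypothetical prefix table built from the rules quoted above, plus the
// 62 prefix used by China UnionPay. Deliberately incomplete.
const CARD_PREFIXES = [
  { type: "American Express", pattern: /^3[47]/ },
  { type: "MasterCard", pattern: /^5[1-5]/ },
  { type: "Discover", pattern: /^6011/ },
  { type: "China UnionPay", pattern: /^62/ },
  { type: "Visa", pattern: /^4/ },
];

// Returns the card type for the digits typed so far, or null when the
// prefix is unknown or not enough digits have been entered yet.
function detectCardType(input) {
  const digits = String(input).replace(/\D/g, ""); // ignore spaces and dashes
  const match = CARD_PREFIXES.find((rule) => rule.pattern.test(digits));
  return match ? match.type : null;
}

console.log(detectCardType("4"));         // Visa
console.log(detectCardType("34"));        // American Express
console.log(detectCardType("5105 1051")); // MasterCard
```

Hook that up to the input’s keyup event and the card type can be shown as soon as the first digit or two are typed.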

2013-06-13

SVG Lines

I know what you’re thinking: you’re thinking that the lines in an SVG image are pretty boring and how did I get a whole post out of this? Actually lines have some interesting properties in SVG which make them super-cool (you have to say super-cool with a French accent; the oxymoronic nature of being both French and cool is delicious).

Up first is that you can specify line endings for any line. If you’ve done work with PowerPoint or any sort of graph software then you’ll have seen lines with arrows at the end. In SVG you can actually end a line with any shape you want. First you need to specify a marker, then apply it to a line. For a normal arrow ending you just need to specify a quick triangular path.

There are also some fun attributes hanging off the marker itself. The id specifies a name; this is used later to attach the marker to the line. Next a view box is simply a way of defining how big of an object we’re creating. This is a small arrow so we’ve set the offset to (0,0) and it is 10 pixels by 10 pixels. The refX and refY will offset the arrow from the attachment point. We’re setting a value here just enough to center it on the line. Next is the actual size of the marker and the final orientation attribute rotates the marker so it follows the same slope as the final segment of the line.

This marker can easily be attached to a line like so:
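The markup for this step is not shown here, so here is a sketch of what the marker and the line could look like together. I’ve written it as a small JavaScript function that returns the SVG as a string; the function name, the id “Triangle”, the marker size and the coordinates are all assumed values rather than the ones behind the picture.

```javascript
// Builds an SVG document containing a triangular arrow marker and a line
// that uses it. arrowLineSvg and the id "Triangle" are hypothetical names.
function arrowLineSvg(x1, y1, x2, y2) {
  return [
    '<svg xmlns="http://www.w3.org/2000/svg" width="200" height="100">',
    '  <defs>',
    // viewBox: a 10 by 10 drawing area starting at (0,0)
    // refX/refY: attach at the middle of the triangle's flat edge
    // orient="auto": rotate to follow the slope of the line
    '    <marker id="Triangle" viewBox="0 0 10 10" refX="0" refY="5"',
    '            markerWidth="6" markerHeight="6" orient="auto">',
    '      <path d="M 0 0 L 10 5 L 0 10 z" />',
    '    </marker>',
    '  </defs>',
    // marker-end refers to the marker by its id
    `  <line x1="${x1}" y1="${y1}" x2="${x2}" y2="${y2}"`,
    '        stroke="black" stroke-width="2" marker-end="url(#Triangle)" />',
    '</svg>',
  ].join("\n");
}

const svgMarkup = arrowLineSvg(10, 80, 150, 20);
console.log(svgMarkup);
```

Swapping the path inside the marker for any other shape is all it takes to get a different line ending.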

And it ends up looking like

Slanty!

Arrows are pretty boring so let’s make something more interesting.

Will get you

That stop sign is a little sickly but it gives you a good idea of what you can do.

The next cool thing is that you can specify a dotted line. This is easily done with the stroke-dasharray attribute.

This line is created with a value of 10,10. That’s 10 pixels of line then 10 pixels of space. The hilarious thing is that you can specify very complex patterns of dots and dashes. The numbers alternate between a visible part of the line and a space. For instance this is how it looks with a value of stroke-dasharray="3,3,3,3,3,3,10,10,10,10,10,10,3,3,3,3,3,3".

The very observant may recognize this pattern as SOS in Morse code.
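If you’d rather not count commas by hand, the pattern can be generated. This is a sketch which assumes the same convention as the value above: a dot is 3 units of line followed by 3 of space and a dash is 10 followed by 10. The function name is made up.

```javascript
const DOT = 3;   // length of a dot and of the gap that follows it
const DASH = 10; // length of a dash and of the gap that follows it

// Converts a string of Morse symbols into a stroke-dasharray value.
// morseToDashArray is a hypothetical helper; anything other than "." or "-"
// (such as the spaces between letters) is skipped.
function morseToDashArray(morse) {
  const parts = [];
  for (const symbol of morse) {
    if (symbol === ".") parts.push(DOT, DOT);
    else if (symbol === "-") parts.push(DASH, DASH);
  }
  return parts.join(",");
}

const sos = morseToDashArray("... --- ...");
console.log(sos); // 3,3,3,3,3,3,10,10,10,10,10,10,3,3,3,3,3,3
```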

Don’t say I’ve never given you information you can use to signal planes from a life raft using only a mirror.

2013-06-12

Is SQL an Assembly Language for Data?

I talk a lot about alternatives to writing JavaScript (TypeScript, CoffeeScript, Dart, …) and that JavaScript is really just an assembly language for the web. That got me thinking about SQL and whether we should be considering it an assembly language for databases.

SQL has a lot of problems which make it more difficult to use. Just look at the list of keywords for it: the list is huge and the list of future reserved words brings the list to crazy levels. It would be nicer if the keywords were moved from being keywords to being library functions. The syntax is also not conducive to providing autocompletes. Consider the simple SQL

select Id, Name, Number from tblThingies

The autocomplete domain is the set of columns in tblThingies. Because this is naturally typed from left to right (at least in English, how does this work in Arabic?) the domain isn’t defined until after the terms Id, Name and Number are written. This is not at all conducive to autocomplete.

The concepts in SQL are sound. Largely functions in SQL are set operations and if you treat the language as a set manipulation language instead of a procedural language then it is really good at it. Modern languages have a different approach to syntax and formatting from which SQL could benefit.

There are actually a number of languages which do compile to SQL. LINQ is one example and the HQL language from Hibernate and NHibernate is another. I think HQL gives a good example of what a more modern query language looks like. From the HQL community documentation I took this example:

You can see the select ordering still doesn’t provide us autocomplete capabilities and the keywords are still very SQLly. I don’t think that HQL goes far enough in fixing the problems of SQL; LINQ does a better job. That same query in LINQ would look something like

(although I haven’t tested this).

You can see the select is moved to the end and at the same time the select is transfigured into a more object oriented syntax. I like the idea of returning objects by default. It makes the language more powerful and it reduces the friction in talking to OO languages which are going to be the majority of languages now.

I don’t think we’ve explored the idea of SQL as an assembly language enough yet. LINQ is available through LINQPad but I’m not aware of any standalone system which allows for HQL queries to be run against a server. I like the idea of compiling languages to SQL. The syntax is going to take some work but it would be great to see something adopted by the major vendors.

2013-06-11

How Chrome Changed the World

Some years back Google decided to build a browser. As I understand it Chrome came about not as a 20% project like so many other Google products but as an effort by the founders Sergey Brin and Larry Page. It was a brilliant move which has changed the face not just of browsers but of how we use computers in general.

Is it that Chrome ushered in the era of fast JavaScript? Is it that Chrome wasn’t afraid to innovate by including new features? Is it that Chrome introduced process per tab to eliminate the crashing, hanging monstrosity that was Macromedia Flash? No, the big innovation in Chrome was that it updated itself.

I know what you’re saying: lots of applications update themselves. Well that’s true. I remember building an automatic update system for a desktop application I worked on back in 2005. It was a great system but it had a flaw which I didn’t even realize until Chrome solved the problem. My application asked users if they wanted to update before the update ran. The update was not silent. Chrome just updates in the background. This results in adoption curves which look like this:

Within a few weeks almost everybody is using the latest and greatest. For years we’ve been dealing with the legacy of slow adoption of Internet Explorer versions. That should not be anywhere near as bad a problem with Chrome.

While it is impressive how easily Chrome is updated, what is more impressive to me is that there are so few problems with this continual upgrade cycle. As was pointed out to me in a meeting today IT departments tend to be very resistant to this sort of update model. They are terrified that new versions of software are going to break things for their users. This has proven to be false. It was always amusing to me that IT departments would claim that they need to do some sort of comprehensive testing on all their software when a new version of a browser came out. There is so much to test I really don’t know how you would ever do it.

Instead it is easier to just be spry and update your apps to account for new versions of browsers. The continual updates are likely to be small and not break much. If the entire internet can handle Chrome updates it seems like a single IT department could do it too.

Chrome’s big revolution has been to prove the viability of continual updates.

2013-06-10

Building a GitHub page

I hear that the way GitHub as a company works is that if you work there and think that GitHub needs a feature you just build it. One of the features I’ve just recently discovered is GitHub Pages. Pages allow projects and people to put up simple static pages to communicate about the project.

I happen to do some development on a little project called AngelaSmith and it needs a website. At the moment all the documentation is scattered over a series of blog posts by the various contributors. It would be great to bring that all together.

It was pretty simple to set up the most basic of pages. I checked out the latest version of AngelaSmith and created a new branch.

git branch gh-pages
git checkout gh-pages

The gh-pages branch is a special branch which is monitored by a build process at GitHub. When it detects a new checkin it will build the site and deploy it to .github.io/. In my case that means http://stimms.github.io/AngelaSmith/.

That’s not a very friendly name so it is possible to set up a custom url to point at that.

You can also use the Jekyll static page generator to build pages from templates and using includes. There is a bunch of documentation out there on how to use Jekyll. I have to say that while the documentation looks good it isn’t super informative. It took me quite some time to figure out that the character encoding was very important. I ended up backing my content down to code page 437 and not UTF-8.

Once I get Jekyll figured out I’ll put up more posts on it.

2013-06-07

Improving the Best of Visualization

If you read the comments on yesterday’s post there was a bit of a discussion about some alternatives to the number of boxes in the best of visualization I created. When I built that visualization I based it on the number of games + 1. This would prevent a situation where a full series was required and the final score was ambiguous.

Sorry, who won that last game?

By using colours it is possible to make this look a bit better. This also offers us an opportunity to generalize some of the code written yesterday.

The most obvious improvement is around the function shouldBeFilled

First thing is that the function name is lying. It isn’t telling if the box should be filled or not. It is setting the fill color. So let’s rename it. Next up the use of hard coded values for the colours bugs me. If we’re going to be using a different colour for each team then we have to change this.

Here I’ve gone ahead and assumed that I have an options object in my class. I don’t so one will need to be added. I’d like to maintain backwards compatibility so I’ll set up some default values for options.

Nobody calling my code needs to change their signature but they can still override the options if they wish. $.extend is a jQuery helper which merges the properties on two objects. One could even say that it composes a new object.
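As a sketch of that defaults-plus-overrides idea: the option names and colours below are invented for illustration, and I’ve used the built-in Object.assign in place of $.extend so the snippet runs without jQuery loaded.

```javascript
// Hypothetical defaults; the option names used by the real visualization
// are not shown in this post.
const DEFAULT_OPTIONS = {
  homeColour: "steelblue",
  awayColour: "firebrick",
};

// Merges caller-supplied options over the defaults into a fresh object,
// leaving both inputs untouched. With jQuery available,
// $.extend({}, DEFAULT_OPTIONS, options) does much the same job.
function buildOptions(options) {
  return Object.assign({}, DEFAULT_OPTIONS, options);
}

// Existing callers pass nothing and get the defaults.
const defaults = buildOptions();

// New callers override only what they care about.
const customised = buildOptions({ awayColour: "pink" });
console.log(defaults, customised);
```

Merging into an empty object first matters: it keeps one caller’s overrides from leaking into the shared defaults.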

It is clear which team won now

To set the options a caller can just do

Guess which one uses pink

There was also talk of adding some emphasis to the center block. I do have some ideas for that which we’ll explore in a future post. You can see the results of today’s post at http://bl.ocks.org/stimms/5734160