Simon Online

2013

2013-06-28

Turning on a dime

I don’t think it is any secret that Windows 8 is not doing well. The ModernUI or MetroUI was a bold move but not one which was well received by consumers. Big changes of that sort are seldom well received. It isn’t necessarily because the new UI is wrong or broken(I think it is broken but that’s not the point) it is just that people have invested a lot of time into learning the old way of doing things and change is scary. Remember when Office 2007 came out with the ribbon? Total disaster. At the time I watched a little documentary that Microsoft put out about the research they put into the productivity of the ribbon vs the old tool bars. It was amazing, they spent hours and hours on the thing doing A/B testing in a user interface labritory. I don’t remember the exact stats but they found the ribbon to be far more productive than the tool bar any only take a few hours to learn. I think the stat was that within 3 days users were more productive on the ribbon than the tool bars. Still the outcry was palpable and to this day my dad complains about not being able to find things on the ribbon(Office 1997 was really the high water mark for him).

I imagine similar testing went on with ModernUI and we’re seeing the same sort of backlash. Only this time users have alternatives: tablets and Macs. In 2007 there was no alternative to MS Office, I think you can argue that remains true. The Microsoft of today is a different beast from that of 2007: they are more responsive to user complaints. So this summer they are launching Windows 8.1 which is designed to fix many of the problems in Windows 8. Well fix the perceived problems with Windows 8. I never felt there was a big problem with Windows 8 which needed fixing in the first place. Already I’m seeing complaints that the new Start button is junk and that Windows 8.1 is no better than Windows 8. However the point is that Microsoft, a huge multi-zillion dollar company with more momentum than the ship in Speed 2 changed their whole Windows strategy.

I remember this being a terrible movie. I bet it is better than Numbers Station, though.

Good on them! Now it cost them a few executives to make the change but if Microsoft can do this then what is stopping you and your company from making big changes? See change isn’t that hard, it just requires that you value the end user more than the process.

2013-06-27

What happened to thin clients?

Somewhere at home I have this awesome little thin client computer called a Sun Ray. It doesn’t have any disk in it and has limited memory ““ basically it is a front end for a server. A visual terminal if you like. I think I got it for about $20 on eBay 7 or 8 years ago.When you boot it up it makes a DHCP request and, in addition to the normal data returned by DHCP, the server gives it the location of an image to boot. All the actual computing a user does is performed on a server and the Sun Ray simply shows images. The idea was that instead of buying a bunch of desktop computers which quickly become outdated you buy these thin clients and just upgrade the server. You might not even need to upgrade the server so much as add more servers as the old server could be upgraded easily. IT management costs would be reduced as there was no reason to send out techs to people’s offices except to replace a keyboard or mouse. The Sun Ray had no moving parts so the chances of failure were pretty damned small. Also it was built by Sun which had a reputation for making the most bullet proof hardware imaginable.

It was a well deserved reputation. Years ago we had some E450s and used them as go-carts around the office. We never broke one. Walls? Yes. A door. Yep. But the machine? Never.

Not even a purple shell could break one.

Without the need for onsite techs you could outsource all your desktop support to SE Asia or a country with more stans in its name than Syd Hoff’s famous book

More popular that the sequel “Stanley is Devoured”

For me to spend $20 on this thing I must have been convinced at the time that thin clients were the future. But look around today and do you see any thin clients in your offices or your house? I don’t know what happened. They are a great idea and one which seems even more sensible in a cloud computing world.

While I don’t see any thin clients on my desk I do see a Windows XP workstation. It hasn’t been upgraded by the IT group yet because they are too busy doing their normal jobs without running around upgrading thousands of work stations. Any ever-greening programme which existed has beencrippled by the financial crisis and even those getting new machines are getting Windows XP workstations. Upgrading an entire company is a huge undertaking both in terms of time an money. If the business is persisting on Windows XP then what’s the advantage in upgrading? It is a good question and the only answers are

Older OSes are not being supported any more and if you find an issue you’re on your own
Newer technology is more efficient from a power perspective and from a workflow perspective
People like having new hardware/software. It is a low cost way of keeping people happy (at least people like me)

Instead of upgrading all the workstations upgrading the servers is much easier. You can even do it remotely using virtual machines and have an easy downgrade path should it be needed. Why thin client computing hasn’t taken off in business I don’t know. Looking around the office here it seems that almost everybody just uses excel which can easily be put on a thin client without worrying about network latency. I suppose you could argue that Google’s Chromebook is a thin client but I don’t think it is the sort of product which is going to be on corporate desks any time soon.

So I’m asking: What happened to thin clients?

2013-06-26

3D printing is here!

A few years back I read the Cory Doctorow book Makers. It is a very interesting look at the future of technology and one of the things which made a big difference in the future of technology was 3D printing. People no longer needed to go to Wal-Mart to buy small things like kids’s toys or even tools. Instead they just printed whatever they needed on commercially available 3D printers. Need a new wrench at 3am to replace some car part I’ve never heard of? Print it. Heck, need the car part? Print it. Even if you need a cup holder for your car you can print it.

Printers are available which print with all sorts of different materials. I’m particularly excited by the ones which print in food. Yummy. However more practical are the ones which can print in metal or ceramic. Perhaps soon you’ll no longer go to Canadian Tyre to get that 5/16th socket you’re missing. Even if you can’t afford your own 3D printer a local print store will soon crop up.

When the book came out it really was a futuristic technology. I think that has changed today. Microsoft announced that Windows 8.1 will have support for 3D printing out of the box.

Microsoft is not really a very exciting company for the most part so if they’re adding support for 3D printing that means that 3D printing has moved from an experiment to a consumer-grade technology. Once the world of rapid prototyping 3D printing is starting to become actually useful.

Printers are still pretty expensive. This MakerBot Replicator 2 is over $2000

However that isn’t going to last very long. Every week on Kickstarter I see new printers which arepoppingup and are far cheaper than MakerBot’s offerings. I was particularly drawn to this Buccaneer printer.

Smaller than the MakerBot printer the rest of the specifications seem to be very similar.

Local manufacturing using 3D printers and milling machines is going to change the world. Mass production reduced the cost of goods but also removed the ability to customize the output. 3D printing brings back the customization while keeping the low price of mass production.

Exciting. 58 hours left on the buccaneer kickstart at the time of writing. Do you think my wife will mind if I buy one, or four?

2013-06-25

Retrieving Documents from Blob Storage with Shared Access Signatures

Azure blob storage provides a place where you can put large binary objects and then retrieve them relatively quickly. You can think of it it as a key-value store, if you like. I don’t know how many people use it for storing serialized objects from their applications but I bet hardly any. SQL storage is a far more common approach and even table storage with its built in serialization seems better suited for most storage of serializable data. For the most part people use it as a file system, at least I do. When there is a need to upload a document in my program the server side application takes the multi-part form data and shoves it into blob storage. When it needs to be retrieved, a far more common activity, then I like to let blob storage itself to the heavy lifting.

Because blob storage is accessed over HTTP you can actually give out the document name from blob storage and let access blob storage directly. This saves you on bandwidth and server load since you don’t have to transfer it to your server first and it is almost certainly faster for your clients.

From blob storage, to cloud server to end user

You can set up access control on blob storage to allow for anonymous read of the blob contents. However that isn’t necessarily what you want because now everybody can read the contents of that blob. What we need is a way to instruct blob storage to let people read a file but only specific groups of people. The common solution to this problem is to give people a one time key to access the blob storage. In azure world this is called a Shared Access Signature Token. I wouldn’t normally blog about this because there are a million other sources but I found that the terminology has changed since the 2.0 release of the Azure tools and now all the other tutorials are out of date. It took me a while to figure it all out so I thought I would save some time.

The first set is to generate a token.

Here I set up the blob security to remove public access then I generate aSharedAccessBlobPolicy to have an expiration time 1 minute in the past and 2 minutes in the future. The time in the past is to account for minor deviations in the clocks on the server and on blob storage. This policy is then assigned to the container.

The URL returned looks something likehttp://127.0.0.1:10000/devstoreaccount1/documents/cdb8335d-f001-46ca-84da-de12ac57157b?sv=2012-02-12&sr=c&si=temporaryaccesspolicy&sig=BC4YjrFtVfpkZry9VHCB9qDMKqGS4%2B46rcNt30kjH4o%3D

Fun! Let’s break it down

This part is the blob storage url(this one is against local dev storage and my blob id is a guid to prevent conflicts)

http://127.0.0.1:10000/devstoreaccount1/documents/cdb8335d-f001-46ca-84da-de12ac57157b

The second part here is the SAS Token which is valid for the next 2 minutes. Hurry up!

?sv=2012-02-12&sr=c&si=temporaryaccesspolicy&sig=BC4YjrFtVfpkZry9VHCB9qDMKqGS4%2B46rcNt30kjH4o%3D

If you attempt to retrive the document after the time window then you’ll get this error

While this method of preventing people from getting the protected documents is effective it isn’t flawless. It relies on providing a narrow window during which an attack can function. If you have several classes of documents, say one for each customer of your site, then it would be advisable to isolate each customer in their own blob container.

2013-06-24

Elasticsearch

I have previously written about how I think that search is the killer interface of the future. We’re producing more and more data every day and search is the best way to get at it. In the past I’ve implemented search both in a naÃ¯ve fashion and using lucene, a search engine. In thenaÃ¯ve approach I’ve just done wildcard searches inside a SQL database using a like clause

select * from table_to_search where search_column like ‘%’ + @searchTerm + ‘%’

This is a very inefficient way to search but when we benchmarked it the performance was well within our requirements even on data sets 5x the size of our current DB. A problem with this sort of search is that it requires that people be very precise with their search. So if the column being searched contains “I like to shop at the duty free shop” and they search for “duty-free” then they’re not going to get the results for which they’re looking. It is also very difficult to scale this over a large number of search columns you have to keep updating the query.

Lucene is an Apache project to provide a real search engine. It has support for everything you would expect in a search engine: stemming, synonyms, ranking,”¦ It is, however, a bit of a pain tointegratewith your existing application. That’s why I like Elasticsearch so much. It is an HTTP front end for Lucene.

I like having it as a search appliance on projects because it is just somewhere I can dump documents to be indexed for future search even if I don’t plan on searching the data right away.

Setting up a basic Elasticsearch couldn’t be simpler. You just need to download the search engine and start it with the binary in the bin directory.

bin/elasticsearch

This will start an HTTP server on port 9200(this can be configured, of course). Add documents to the collection using HTTP PUT like so

curl -PUT ‘http://localhost:9200/documents/tag/1' -d ‘{ “user” : “simon”, “post_date” : “2013-06-24T11:46:21”, “number” : “PT-0093-01A”, “description” : “Pressure transmitter” }’

The URL contains the index (documents) and the type (tag) as well as the id(1). To that we can put an arbitrary json document. If the index documents doesn’t already exist it will be created automatically with a set of defaults. These are fine defaults but lack any sort of stupport for stemming orlemming. You can set these up using the index creation API. I found a good example which demonstrates some more advanced index options:

In this example a snowball stemming filter is used to find word stems. This means that searching for walking in an index which contains only documents with walk will actually result in results. However stemming does not look words up in a dictionary, unlike lemming, so it won’t find good if you search for better. Stemming simply modified words according to an algorithm.

There are a number of ways to retrieve this document. The one we’re interested in is, of course, search. A very simple search looks like

curl -XGET ‘http://localhost:9200/documents/_search?pretty=true' -d ‘{ “query”: { “fuzzy” : {“_all”: “pressure”} } }

This will perform a fuzzy search of all the fields in our document index. You can specify an individual field in the fuzzy object by specifying a field instead of _all. You’ll also notice that I append pretty=true to the query, this just produces more readable json in the result.

Because everything is HTTP driven you might be tempted to have clients query directly against the ElasticSearch end point. However that isn’t recommend, instead it is suggested that the queries be run by a server against an ElasticSearch instance behind a firewall.

Adding search to an existing application is as easy as setting up an ElasticSearch instance. ElasticSearch can scale up over mulitple nodes if needed or you can use multiple indicies for different applications. So whether you’re big or small ElasticSearch can be a solution for searching.

2013-06-21

The Big Problem in Development

I frequently hear people saying “the difficult problem in computing is X” or “nothing is hard in computing but Y” for some value of X and Y. The most common one I hear is Phil Karlton’s quote

There are only two hard things in Computer Science: cache invalidation and naming things.

I think that both naming and cache invalidation are hard but I don’t think they are the hardest problems in computing science. Nor do I think “Is P == NP?” is the hard problem. It is certainly hard but it isn’t all that important except for a certain class of problems. Maybe I’m doing the wrong things with my life but I can count on two hands the number of times I’ve run into a problem where the solution was a linear approximation or asearchin NP space.

No I think the big problem is: how is this going to change in the future?

Every design decision is informed by how much you’re going to need to change the code infuture and in what ways it is going to change. Should you use dependency injection? The answer lies in whether you expect new implementations to be needed in future. Should you use a message based architecture? Well are other services going to be interested inreceivingnotifications in future? How about deploying your reports to the client or to the server? If the reports are going to change frequently then to the server but if they’re static then client deployment provides a better user experience.Even problems like designing for scalability are informed by a knowledge of just how much the application is going to need to scale in the future.

The worst part about this problem is that it is clearly unsolvable. All you can do is guess about how maintainable and future proof your code base needs to be and make a decision. No matter should you be toopessimisticor too positive about your guess there is going to be a cost for guessing wrong. I’m a software guy more than a business guy so I think that the best bet is to guesspessimisticand build your application to be moreresilientto change. Business people might side with building the application as quickly and easily as possible because you don’t know if it is going to succeed in the first place.

Make no mistake, it is aconundrum. When you pay for senior developers and architects you’re paying for people who’ve been around enough to make good guesses. In my experience good guessing more than pays for the cost of hiring good developers.

If you happen to come up with a way to tell the future and solve this problem then let me know. You could just let me know by sending me the winning lotto numbers for the next 5 or 6 major lottery jackpots.

2013-06-20

I don't hate OAuth

It’s weird that I don’t hate OAuth. It is a combination of lots of things I hate: A complicated protocol and supported by Facebook(who I strongly dislike). Yet OAuth and OpenID are both technologies I support fully.

OpenID is a method of delegating authentication to a third party. So say I wanted to have user accounts on my site but I didn’t want to go through the trouble of hashing passwords (you’re not hashing with SHA are you?). Instead I can delegate all themessinessof storing password information with a third party. When a user signs in to my site I’llactuallyhave them sign in with their choice of account. This third party will pass a token back to my site to let me know that the user did sign in successfully. Any time you see those sign in with buttons chances are that it is implemented using OpenID.

OAuth is a similar concept to OpenID except that instead of the third party site giving me an authentication token I ask it for permission to access a protected resource. So if I waswritinga tool for displaying tweets in your timeline I would need to access the protected information held by twitter. My application would refer you to the server(Twitter) which would ask for your password and then refer the session back to my application. My application never sees your password, instead if my app is accessing Twitter you can remain confident that only Twitter is getting your password.

I really like the idea that only tokens are passed around and never passwords. Being able to revoke the access of applications to a protected resource at any time without invalidating your password.

OAuth has a reputation for being not just difficult to implement but also inconsistent from one implementation to another. This is notwhollyan undeserved reputation. The fact that there exist two competing standards 1.0a and 2.0 doesn’t help at all. There is some argument that 2.0 is less secure and probably should not be used. I’m not versed enough to give anopinionon that.

If you want to provide API access to your data then OAuth is probably worth looking into even ifimplementationsare a bit spotty.

2013-06-19

Browser Performance Counters

I was reading a blog post the other day on dealing with memory leaks in gmail, I’vecompletelylost the link now but it put me onto a lot ofinterestingclient side tools of which I was onlytangentiallyaware. Readers of this blog will know that I’m on this constant push towards shifting functionality from the server to the client. My push is just part of the great cycle of architecture.

Hooray! It’s another one of Simon’s MS Paint graphs

In another 10 years we’ll start moving all our functionality back to the server as a new generation discovers the power of centralizing your logic. It is a funny old world.

Anyway this post is about the information available to you within chrome’s performance namespace. If you open up any page in chrome and hit F12 you’ll get the developer tools. In there grab the console and type “˜performance’. The object returned has a number of properties, largely these properties are filled with counters for memory and load times.

There are a lot of counters in there so let’s take it one by one. The first one is performance.memory. This contains a MemoryInfo object which in turn contains 3 counters. jsHeapSizeLimit is the maximum sizeavailableto the heap, usedJSHeapSize is the amount of memory currently used and totalJsHeapSize is the total currently reserved heap space including segments not currently occupied. These counters can be used to find pages which use a lot of memory or find memory leaks in long living single page applications.

The next set of counters are timing records. Typically you would do timings on the server side but the client side is really far moreinteresting as it gives a true reflection of your user’s experience. The timing stuff is pretty complete and is specified in the W3”²s Navigation Timing specification. They have this awesome picture to give you an idea of when the timings are taken

Timeline

The timing information is only available in the client but you can easily wrap it up and send it back to your server. This performance information gives you a great way to gather as much information as possible about what’s happening for your users. You can find slow DNS lookups, inefficient edge caching, complicated dom events. I’m really excited to see what ideas I can come up with for this information.

These counters are available on all the modern web browsers and also Internet Explorer(ooooohhh, BURN).

2013-06-17

Selector Selection

In code review last week the topic of efficient CSS selectors in your JavaScript came up. I have always been a fan of using as simple selectors as possible. Typically I use a class name on the elements I’m selecting. Others on the team are fans of using more complex selectors such as

body > div > div:first > span:first > p

I hate this. I think that it is confusing for a human to figure out what’s being selected there without hunting through the DOM. I also think it isincrediblybrittle. All it takes is adding one element somewhere in that chain and your selectors stop working. With dynamic pages that hierarchy may well be open to change even during the life of the page. I think this example is particularly bad because it descends right from the root of the body. Even something simple like adding a notification box is going to break this.

My argument was not wellreceived. The counter argument was that any changes to the page would require a change to the JavaScript anyway so it isn’t a big deal. Humm”¦ okay, different approach.

Well I didn’t have the stats on speed of selectors at the time but today I found some. The benchmark somebody build at jsperf.comis fantastic for my purposes.

Benchmark! Way better than a benchmatt.

Ah ha! Looks like my method of selecting using just a class is way more efficient than parent child selectors (51% slower vs 80% slower than optimal). However my method is still quite a bit slower than having a context and using find. I was a bit surprised by that because I expected that with the commonality of selecting with a class would have been optimized to theumpteenthdegree. I was even more shocked to see that specifying the tag and then a class wasn’t faster than just selecting a class.

I think the take away here is to use classes over the confusing mess above(sorry, coworkers) and that having an id at least on a parent container allows for more efficient selection. I’m also not convinced that these aren’t micro-optimizations which should be ignored but as the rules above can be used in place of each other you might as well use the more optimal version.

2013-06-14

Automatic Credit Card Type Selection

I can’t tell you the number of times I go to a website to buy something and I have to select the type of credit card being used. It is a minor step but by gum you want to make checking out of your website as painless as possible for people. That’s why Amazon’s one click was such an amazing idea. I frequently get to that last step and abandon the order because I’ve thought better of it (sorry, WebStorm, one of these days).

Instead of asking for a credit card type why not use these simple rules(Taken from eHow)

The first two to six digits of a credit card determine who the credit provider is. For example, Master Card begins with the numbers 51-55, American Express begins with 34 or 37 and Discover begins with 6011

So it looks like from the first few digits you can tell what the card type is without having to ask your users. There is a much more complete list of the prefixes on Wikipedia. According to that article these card numbers are actually part of a larger scheme which can identify the major industry as well as the sort of card.

As an example of detecting the card type I threw together a little script. You can try it out here. It currently detects Visa, MasterCard, American Express and China Union cards(hey, they have 1.2 million ATMs, they’re doing something right). Just type in a couple of digits.

4 Visa
34 American Express
“¦

So much better than making people type and so quickly done.

Archives

A blog about computer programming and technology.

My Books