2009-12-13

Measuring Language Productivity

I recently asked a question over at stackoverflow about the productivity gains in various languages.

Does anybody know of any research or benchmarks of how long it takes to develop the same
application in a variety of languages? Really I’m looking for Java vs. C++ but any
comparisons would be useful. I have the feeling there is a section in Code Complete
about this but my copy is at work.

I was really asking because I wanted to help justify my use of the Pellet semantic reasoner over the FaCT++ reasoner in a paper.

What emerged from the question was that there really was not much good research into the topic of language productivity, and that any research which had been done was from the 2000 time-frame. What makes research like this difficult is finding a large sample size and finding problems which don’t greatly favour one class of language over another. That got me thinking: what better source of programmers is there than stackoverflow? There are developers from all across the spectrum of languages and abilities; there is even a pretty good geographic distribution.

Let’s do this research ourselves! I propose a stackoverflow language programming contest. We’ll develop a suite of programming tasks which try as hard as possible not to favour one particular language, and we’ll gather metrics. I think we should gather

  • Time taken to develop
  • Lines of code required
  • Runtime over the same input
  • Memory usage over the same input
  • Other things I haven’t thought of

I’ll set up a site to gather people’s solutions to the problems and collate statistics, but the problems should be proposed by the community. We’ll allow people to check out the problem set, time how long it takes them to complete it, and then submit the code for their answers. I’ll run the code and benchmark the results and, after say two weeks of having the contest open, publish my results as well as the dataset for anybody else to analyze.

2009-12-09

Abuse of Extension Methods

In the code base I’m working with we have a number of objects which augment existing objects. For instance I have a User object which is generated by my ORM so it looks like

 string userName;  
 string firstName;  
 string lastName;  
 int companyID;  
 int locationID;

In order to display user objects it is useful to have the name of the company and the location which are stored in another table. To limit the amount of stuff being passed around we defined an ExtendedUser which extends User and adds the fields

 string companyName;  
 string locationName;

Creating these extended classes requires passing in a base class and then pulling all the properties off of it and assigning them to the extended class. This is suboptimal because it means that when a new property is added to the base class it also has to be added to the code which copies the properties into the extended class. To address this I created a method which iterates over the properties in the base class and assigns them to the extended class.

public static void CopyProperties(this object destination, object source, List<string> ignoreList)
{
    foreach (var property in destination.GetType().GetProperties())
    {
        if (ignoreList.Contains(property.Name))
            continue;

        var sourceProperty = source.GetType().GetProperty(property.Name);
        if (sourceProperty != null && property.CanWrite
            && sourceProperty.PropertyType == property.PropertyType
            && sourceProperty.GetValue(source, null) != null)
        {
            property.SetValue(destination, sourceProperty.GetValue(source, null), null);
        }
    }
}

If you have sharp eyes you’ll notice that I’ve defined this method as an extension method. This allows me to do insane things like

ExpandedClass expandedClass = new ExpandedClass();  
expandedClass.CopyProperties(cls, new List<string>());
expandedClass.LocationName = GetLocationNameFromID(cls.LocationID);  
expandedClass.CourseName = GetCourseNameFromID(cls.CourseID);  
expandedClass.InstructorName = GetInstructorNameFromID(cls.InstructorID);  
expandedClass.CompanyName = GetCompanyNameFromID(cls.CompanyID);

I can also do this for any other two classes which share property names.
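
For instance, here is a quick sketch with two made-up classes (neither exists in my project; they are purely for illustration). Nothing about CopyProperties is specific to the User objects above: any two types with matching property names and types work the same way.

public class OrderRow
{
    public int OrderID { get; set; }
    public decimal Total { get; set; }
}

public class OrderViewModel
{
    public int OrderID { get; set; }
    public decimal Total { get; set; }
    public string CustomerName { get; set; }
}

// Somewhere in a method:
var row = new OrderRow { OrderID = 42, Total = 99.95m };
var viewModel = new OrderViewModel();

// OrderID and Total are copied because the names and types match;
// CustomerName is left for us to fill in by hand.
viewModel.CopyProperties(row, new List<string>());
viewModel.CustomerName = "Some Customer";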

2009-11-14

Persisting in MVC

Rob Conery, who is a giant in the ASP.net MVC world (he wrote the ASP.net storefront and is also the author of a 200 page book on inversion of control), is calling for suggestions about an active record persistence engine. I wanted to present how I think it should be done, which is just a bit too long for a comment on the tail end of Rob’s blog. I’ve been reading a lot lately about areas in MVC2 and the portable areas project which is part of the MVC Contrib project. Now the portable areas aren’t yet finalized, but the idea is that an area can be dropped into a project to provide a set of controllers and views delivering a whole swack of functionality.

The example I’ve seen bandied about is that of a forum. Everybody has written a forum or two in their time; now you can just drop in an area and get the functionality for free. I can see a lot of these components becoming available on CodePlex or GitHub. Component-based software like this is “the future” just like SOA was the future a few years ago. The problem with components like this is that it is difficult to keep things consistent across the various components. At one end of the spectrum of self-containment, each component is entirely self-contained: it has to provide its own data persistence as well as any other services it consumes.

I have helpfully harnessed the power of MS Paint to create an illustration of the spectrum between a component being self-contained and being reliant on services being provided for it. If anybody is interested, my artistic skills are available for hire. The further to the left, the more portable the component; the further to the right, the more the component relies on services being provided for it and the less portable it becomes. We want to be towards the left, because left is right.

If these components really are the future then we need to find a way to couple the components and provide communication between them. This is where MEF steps up to the plate. What I’m proposing is that rather than spending our time creating unified interfaces for storing data, we create a persistence-method-agnostic object store. Components would call out to MEF for a persistence engine and then pass in whatever it was they wanted to save. The engine should handle the creation of database tables on the fly, or files, or web service callouts to a cloud. That is what I believe should exist instead of a more concrete IActiveRecordEngine.
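
To make the idea concrete, here is a rough sketch of what the component side might look like. Every name in it is made up for illustration (there is no IObjectStore interface anywhere today); the only real piece is MEF’s Import attribute. The component simply imports whatever object store the host application exports and never learns how the data is actually stored.

using System;
using System.Collections.Generic;
using System.ComponentModel.Composition;

// Hypothetical contract the host application would export an implementation of.
public interface IObjectStore
{
    void Save(object item);                       // engine decides tables, files or cloud on the fly
    IEnumerable<T> Load<T>(Func<T, bool> filter); // querying is the engine's problem too
}

// Hypothetical data class owned by a portable forum component.
public class ForumPost
{
    public string Author { get; set; }
    public string Body { get; set; }
}

// Hypothetical portable-area component that consumes the store.
public class ForumComponent
{
    // MEF fills this in with whichever persistence engine the host exports.
    [Import]
    public IObjectStore Store { get; set; }

    public void SavePost(ForumPost post)
    {
        Store.Save(post);
    }
}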

What’s the advantage? We keep the standard interface for which Rob is striving, but we can now have that interface implemented by a pluggable component rather than having it hard-coded into a web.config.

The second part of Rob’s post is about creating opinionated controllers. I have to say I’m dead against that. I agree with the goal of implementing the basic CRUD operations for people; in fact I’m in love with it. What I don’t like is that it is implemented in a base class from which my controllers descend. If I’m reading the post correctly then the base controller is implementing actual actions. Implementing actions willy-nilly like that is dangerous: people wouldn’t even realize the actions exist. Chances are very good that users are just going to leave the actions implemented rather than overriding them with no-op actions.

Another option is that I’m reading this post incorrectly and the methods in the base class are private and not actions. I like that a lot more, but even more I like generating template controllers. SubSonic 3 follows this methodology and it is really nice to be able to twiddle with bits of the implementation. What’s more, the generation doesn’t have to stop at the controller. If the implementation in the controller is known then why not generate the views as well?

All in all I like the idea of improving the object store support in ASP.net MVC but I would like it to be as flexible as possible.

2009-10-28

Quick post on getting node information from Umbraco with IronPython

I was just working with the IronPython page type in umbraco and needed to get a property from the page I was on. This can be done by accessing the Node API found in umbraco.presentation.nodeFactory. In order to be able to pull a value you will need to pull in that part of the API

import umbraco.presentation.nodeFactory
from umbraco.presentation.nodeFactory import *

Now you can get the current node and query its properties

print Node.GetCurrent().GetProperty("Address").Value

2009-10-07

xVal PostSharp 1.0 Demo Project

After my last post I thought I would look at the demo project for xVal 1.0 and see if I could get it working with PostSharp. It was set up a little bit differently from my projects, but I figured it could still be improved with PostSharp. My first issue was that the method I was intercepting was in the entity itself rather than in a repository. This meant that there were methods in the entity which I didn’t wish to intercept. There were two classes of these:

  1. Accessor methods: we don’t need to intercept getters
  2. Internal methods: ASP.net MVC uses reflection to examine the internals of the data classes in order to bind form results to them. We can avoid intercepting these by ignoring methods whose names start with "."

Next, because the entity already contained all of the data it needed to persist, the persistence method didn’t have any arguments. In the previous post I assumed that this would always be the case. You know what they say about assuming: if you assume you make a jerk out of everybody in Venice. Pretty sure that is the saying.

This required an expansion of the current validator

public override void OnEntry(MethodExecutionEventArgs eventArgs)
{
    if (IsAccessor(eventArgs) || IsInternal(eventArgs))
        return;

    if (HasArguments(eventArgs))
    {
        CheckPassedInformation(eventArgs);
    }
    else
    {
        CheckSelfContainedEntity(eventArgs);
    }
    base.OnEntry(eventArgs);
}

private static bool HasArguments(MethodExecutionEventArgs eventArgs)
{
    return eventArgs.GetReadOnlyArgumentArray() != null && eventArgs.GetReadOnlyArgumentArray().Count() > 0;
}

private static void CheckPassedInformation(MethodExecutionEventArgs eventArgs)
{
    var toValidate = eventArgs.GetReadOnlyArgumentArray()[0];
    var errors = DataAnnotationsValidationRunner.GetErrors(toValidate);
    if (errors.Any())
        throw new RulesException(errors);
}

private static void CheckSelfContainedEntity(MethodExecutionEventArgs eventArgs)
{
    var toValidate = eventArgs.Instance;
    var errors = DataAnnotationsValidationRunner.GetErrors(toValidate);
    if (errors.Any())
        throw new RulesException(errors);
}

public bool IsAccessor(MethodExecutionEventArgs eventArgs)
{
    if (eventArgs.Method.Name.StartsWith("get_"))
        return true;
    return false;
}

public bool IsInternal(MethodExecutionEventArgs eventArgs)
{
    if (eventArgs.Method.Name.StartsWith("."))
        return true;
    return false;
}

You can see I did a little bit of clean code refactoring in there to extract some methods. Now two different methods of saving information are checked.

The lesson here seems to be that the way in which you construct your data persistence layer affects the construction of the validator, such that there is no generic aspect which you can download and use.

You can download my modified xVal project here. In order to get it running you’ll need to have PostSharp installed but you’re going to want it for lots of other stuff so get going on installing it.

Oh, just one more note: when you’re trying it out, be sure to disable javascript so that the page actually posts back instead of validating on the client with the javascript validation.

2009-09-16

Cleaning Up xVal Validation With PostSharp

Even though the new ASP.net MVC 2.0 framework comes with built-in validation, it is useful to look at some alternatives. After all, you don’t want Microsoft telling you how to do everything, do you? One of the better validation frameworks is xVal, which just went 1.0. This release adds support for a number of new features, probably the coolest of which is AJAX-based validation for complex input.

However, xVal does have one drawback: it is quite verbose to implement. In every method which alters data you will have to put something like

var errors = DataAnnotationsValidationRunner.GetErrors(this).ToList();

if (errors.Any())
    throw new RulesException(errors);

This is a bit repetitive and kind of tiresome. Sure, we could extract a method from this and just call that each and every time we edit data, but that doesn’t really solve the underlying problem of having code which is repeated.

Enter PostSharp.

PostSharp is an aspect-oriented add-on for .net languages. It actually does most of its weaving in a post-build step, modifying the MSIL the compiler generates. We can extract the validation into an aspect.

[Serializable]
public class ValidateAttribute : OnMethodBoundaryAspect
{
    public ValidateAttribute() { }

    public override void OnEntry(MethodExecutionEventArgs eventArgs)
    {
        if (eventArgs.GetReadOnlyArgumentArray() != null && eventArgs.GetReadOnlyArgumentArray().Count() > 0)
        {
            var toValidate = eventArgs.GetReadOnlyArgumentArray()[0];
            var errors = DataAnnotationsValidationRunner.GetErrors(toValidate);
            if (errors.Any())
                throw new RulesException(errors);
        }
    }
}

Here we create an OnEntry method which is called before the method we are intercepting. We skip any method with no arguments since it isn’t likely to be updating data. Then we extract the arguments and pass them into the validator for it to do its business.

Finally we give the PostSharp framework a bit of information about where to use this aspect

AssemblyInfo.cs

[assembly: CertificateSearch.Aspects.Validate(AttributeTargetTypes = "CertificateSearch.Models.*Repository")]

I have applied it to all the methods in types whose names end with Repository in the Models namespace. That covers all the data modification methods in the project. I can now add new methods without the cost of ensuring that I validate each one.
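
To illustrate what that buys, here is a rough sketch of a repository method under this scheme. The types below are placeholders I’ve made up rather than code from the demo project; the point is simply that the method body contains no validation of its own.

using System.Collections.Generic;

// Hypothetical entity and repository, named only for illustration.
public class Certificate
{
    public string Name { get; set; }
}

public class CertificateRepository
{
    private readonly List<Certificate> store = new List<Certificate>();

    // No explicit validation here: the Validate aspect registered in AssemblyInfo.cs
    // intercepts the call, runs DataAnnotationsValidationRunner over the argument and
    // throws a RulesException before this body ever executes if the data is invalid.
    public void SaveCertificate(Certificate certificate)
    {
        store.Add(certificate);
    }
}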

2009-09-16

IE Caches JSON

I ran into an interesting problem today on everybody’s favorite browser, Internet Explorer. At issue was a page of mine which was partially populated using jQuery’s getJSON function. As it turns out, even though I had caching set to no-cache on the server, IE was perfectly happy to cache the document because it was fetched using GET. Apparently this is OK to do. Obviously this ruins my site’s functionality, so I instructed jQuery to override it by setting

$.ajaxSetup({ cache: false });

This setting works by adding a nonsense value to the end of the request. Looking at the actual jQuery source we can see that the current time is appended to the request, which makes the browser believe it is a new URL.

if ( s.cache === false && type === "GET" ) {
    var ts = now();

    // try replacing _= if it is there
    var ret = s.url.replace(rts, "$1_=" + ts + "$2");

    // if nothing was replaced, add timestamp to the end
    s.url = ret + ((ret === s.url) ? (rquery.test(s.url) ? "&" : "?") + "_=" + ts : "");
}

Problem solved

2009-08-28

Caching and Why You Shouldn't Listen to Blogs

A while ago I wrote an article entitled “HTML Helper for Including and Compressing Javascript”. No, don’t click on that link, because it is all wrong. The gist of the article was that, to save clients from opening a bunch of costly connections to download several javascript files each time they visited the site, a handler would inline all those files in the HTML page and, as an added bonus, compress them. I forgot one key thing, and leddt was good enough to point it out: because every page you load on the site has the javascript inlined, there is no way to cache it.

So what do we do? I think the best solution is to stop compressing the javascript on request. Instead, combine and compress the javascript as part of the build process and then serve it up as a separate request. The disadvantage here is that if you use a library like jquery-ui on only one page, you end up downloading it for any page the user visits. However, that price is a one-time cost, while with the terrible solution I suggested before you pay it again and again and again. In that way it is much like not taking out the trash: I have to hear about it every day, plus it smells. You wouldn’t believe how hard it is to get the smell out of the fur of the 40 dogs in the puppy mill I run in my garage.
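
Just to show how little is involved, here is a bare-bones sketch of that build step. It only concatenates the files (a real setup would also run a minifier such as YUI Compressor over the output), and the paths are placeholders of my own invention.

using System.IO;

// Bare-bones build step: concatenate every script in a folder into one file.
// A real build would also minify the combined output.
class CombineScripts
{
    static void Main(string[] args)
    {
        var scriptDirectory = args.Length > 0 ? args[0] : "Scripts";
        var outputPath = Path.Combine(scriptDirectory, "combined.js");

        using (var combined = new StreamWriter(outputPath))
        {
            foreach (var file in Directory.GetFiles(scriptDirectory, "*.js"))
            {
                // Don't fold a previous combined file back into the new one.
                if (Path.GetFileName(file) == "combined.js")
                    continue;

                combined.WriteLine(File.ReadAllText(file));
            }
        }
    }
}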

How things are cached in a browser has always been a mystery to me, and there isn’t a whole lot on the internet about the technicalities of what browsers do and do not cache. Basically it seems to come down to the Cache-Control header, which governs how devices retain content. The ever so verbose W3C HTTP 1.1 spec defines the grammar for Cache-Control as


Cache-Control   = "Cache-Control" ":" 1#cache-directive

cache-directive = cache-request-directive
                | cache-response-directive

cache-request-directive =
      "no-cache"                          ; Section 14.9.1
    | "no-store"                          ; Section 14.9.2
    | "max-age" "=" delta-seconds         ; Section 14.9.3, 14.9.4
    | "max-stale" [ "=" delta-seconds ]   ; Section 14.9.3
    | "min-fresh" "=" delta-seconds       ; Section 14.9.3
    | "no-transform"                      ; Section 14.9.5
    | "only-if-cached"                    ; Section 14.9.4
    | cache-extension                     ; Section 14.9.6

cache-response-directive =
      "public"                            ; Section 14.9.1
    | "private" [ "=" 1#field-name ]      ; Section 14.9.1
    | "no-cache" [ "=" 1#field-name ]     ; Section 14.9.1
    | "no-store"                          ; Section 14.9.2
    | "no-transform"                      ; Section 14.9.5
    | "must-revalidate"                   ; Section 14.9.4
    | "proxy-revalidate"                  ; Section 14.9.4
    | "max-age" "=" delta-seconds         ; Section 14.9.3
    | "s-maxage" "=" delta-seconds        ; Section 14.9.3
    | cache-extension                     ; Section 14.9.6

cache-extension = token [ "=" ( token | quoted-string ) ]

Pretty simple. The directives you have to watch out for are max-age and public/private. max-age is how long a cache is permitted to retain the document before it must re-request it, and public indicates the response may be cached and shared between all users, while private restricts it to a single user’s cache. How browsers actually implement these directives varies from browser to browser, so all you can do is make sure your site obeys the standard.
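
As a taste of the ASP.net side before that next post, here is a minimal sketch of setting those directives on a response. The handler and the file path are made up for illustration, but the HttpCachePolicy calls are the standard framework ones.

using System;
using System.Web;

// Hypothetical handler serving the combined script file from the previous paragraphs.
public class CombinedScriptHandler : IHttpHandler
{
    public bool IsReusable
    {
        get { return true; }
    }

    public void ProcessRequest(HttpContext context)
    {
        // Emits "Cache-Control: public, max-age=604800"; any cache may keep this for a week.
        context.Response.Cache.SetCacheability(HttpCacheability.Public);
        context.Response.Cache.SetMaxAge(TimeSpan.FromDays(7));

        context.Response.ContentType = "application/x-javascript";
        context.Response.WriteFile(context.Server.MapPath("~/Scripts/combined.js"));
    }
}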

But don’t trust me, I’m just a blogger and I already lied to you once. In my next post I’ll talk a bit about caching in ASP.net and how to save database trips.

2009-08-19

Bookmarklet for MSDN

Today I was cruising the old MSDN using their much better low bandwidth version when I stumbled across a page on events in C#. What got my attention was the example code: all grey and boring, not to mention hard to follow. What this page needed was a little bit of SyntaxHighlighter, Alex Gorbatchev’s glorious javascript library which adds syntax highlighting to source code. I use it right here on my blog, as does everybody else who passes the coolness test. The test, of course, being the use of SyntaxHighlighter. I hacked at the jQueryify bookmarklet and managed to get it to load the correct SyntaxHighlighter libraries and stylesheets. The next time you’re on MSDN squinting at a piece of code, try hitting MSDN Style. It only works on Firefox at the moment but I’ll update it to work on IE as well. Simply drag this link to your bookmarks bar and you’re good to go:

MSDN Style

Source

javascript:%20(function(){
    function getScript(url, success){
        var script = document.createElement('script');
        script.src = url;
        var head = document.getElementsByTagName("head")[0], done = false;
        script.onload = script.onreadystatechange = function(){
            if ( !done && (!this.readyState ||
                this.readyState == "loaded" || this.readyState == "complete") ) {
                done = true;
                success();
            }
        };
        head.appendChild(script);
    };

    getScript('http://alexgorbatchev.com/pub/sh/2.0.320/scripts/shCore.js', function(){});
    getScript('http://alexgorbatchev.com/pub/sh/2.0.320/scripts/shBrushCSharp.js', function(){});
    getScript('http://ajax.googleapis.com/ajax/libs/jquery/1/jquery.min.js', function() {
        loadStyle('http://alexgorbatchev.com/pub/sh/2.0.320/styles/shCore.css');
        loadStyle('http://alexgorbatchev.com/pub/sh/2.0.320/styles/shThemeDefault.css');
        return completeLoad();
    });

    function loadStyle(url)
    {
        style = document.createElement('link');
        style.href = url;
        style.rel = "stylesheet";
        style.type = "text/css";
        document.getElementsByTagName("head")[0].appendChild(style);
    };

    function completeLoad() {
        $(".libCScode:not(div)").addClass("brush: csharp");
        SyntaxHighlighter.highlight();
    };
})();

2009-08-07

A watershed moment

News in the twitterverse today is all about this article at InfoWorld or, more precisely, the guidelines produced by the American Law Institute. In a small section they suggest that software companies should be held liable for shipping software with known bugs. Damn right they should.

The software world has long survived protected from its own mistakes through the EULA. Let’s look at a license agreement, perhaps the Windows XP license agreement, as a typical example. Here is an excerpt from section 2.15:

LIMITATION ON REMEDIES; NO CONSEQUENTIAL OR OTHER DAMAGES. Your exclusive remedy for any breach of this Limited Warranty is as set forth below. Except for any refund elected by Microsoft, YOU ARE NOT ENTITLED TO ANY DAMAGES, INCLUDING BUT NOT LIMITED TO CONSEQUENTIAL DAMAGES

So if Windows XP crashes and you lose an assignment, or if your battleship sinks, Microsoft is sorry but they aren’t going to stand up and take responsibility. At least not for more than the purchase price of Windows, and I’ll bet you it is a trial to get them to cough up even that. A lot of people are upset about this guideline. By a lot of people I, of course, mean software vendors. I suppose I too would be upset if I shipped known bad software, but I don’t, because my customers deserve more than that. I’m not saying my software is perfect, it isn’t, however it has no known bugs. And that’s the key right there: known bugs.

All software is going to have bugs in it; even with the most rigorous testing and test driven development there are still going to be issues. Most of these bugs are not covered under the guidelines because a concerted effort was made to find bugs and fix them during the development process. This doesn’t mean that you can’t ship software with known issues, it just means that now the risk of doing so is spread more evenly between you and your client. Which is only right.

Drawing again on the tired car analogy, people would be outraged if Ford shipped a car which they knew imploded in the rain. Sure, the car industry isn’t a perfect analogy, but it is pretty good. Lots of software is responsible for our lives in the same way as cars; heck, there is a huge amount of software in cars. Why should software vendors be any less liable?

What does this really mean for companies? In a word: transparency. Let’s say that I ship some software with an issue, it hurts somebody, and they decide to sue. It is going to cost me money in lawyers even if I had no idea that defect existed. I’ve got to show that I didn’t know the defect existed at shipping time. That is going to require a bunch of e-discovery and searching of e-mails: costly. What can I do to try to avoid being sued in the first place? Easy: publish all the bugs in my software in a system the public can see. I mean internal bugs as well as external bugs, everything. Next, I fix bugs before I write new code and I fix them promptly. If I can gain a reputation as open and responsive, people are far more likely to write off my mistakes as just that.

Now somebody is going to argue that publishing every defect puts me at a PR disadvantage compared to the guys down the street who don’t advertise their bugs. I don’t believe that for a second. People who buy software are generally not dumb, they know that bugs exist and they know that they’re probably going to find some in the software they buy. Do you want to deal with a company which won’t admit that they have bugs and even threatens to sue people who bring them issues or with one which has shown itself to be responsive to customer issues? I’ll take responsive every time.

The time has passed when software was used only for esoteric purposes by men in white coats and it is time the industry grew up and learned that if you get a paper route you can’t dump the papers in the garbage without consequences. Stop dumping bad code on the world.