Open Data - Countdown to Open Data Day 3
This is day 3 of my countdown to Open Data Day. That sneaky Open Data Day is slithering up on us like a greased snake on an ice rink. So far we’ve looked at data from the provincial government and the city. That leaves us just one level of government: the federal government. I actually found that the feds had the best collection of data. Their site, data.gc.ca, has a huge number of data sets. What’s more, the government has announced that it will be adopting the same open government system that has been developed jointly by India and the US: http://www.opengovtplatform.org/.
One of the really interesting things you can do with open data is to merge multiple data sets from different sources and pull out conclusions nobody has ever looked at before. That’s what I’ll attempt to do here.
Data.gc.ca has an amazing number of data sets available, so if you’re like me and you’re just browsing for something fun to play with, then you’re in for a bit of a challenge. I eventually found a couple of data sets related to farming in Canada which looked like they could be fun. The first was a set of data about farm incomes and net worth between 2001 and 2010. The second was a collection of data about yields of various crops in the same time frame.
I started off in Excel, summarizing and linking these data sets. I was interested to see if there was a correlation between high grain yields per hectare and an increase in farm revenue. This would be a reasonable assumption, as getting more grain per hectare should allow you to sell more and earn more money. Using the power of Excel I merged and cut up the data sets to get this table:
Year | Farm Revenue | Yield Per Hectare | Production in tonnes |
2001 | 183267 | 2200 | 5864900 |
2002 | 211191 | 1900 | 3522400 |
2003 | 194331 | 2600 | 6429600 |
2004 | 238055 | 3100 | 7571400 |
2005 | 218350 | 3200 | 8371400 |
2006 | 262838 | 2900 | 7503400 |
2007 | 300918 | 2600 | 6076100 |
2008 | 381597 | 3200 | 8736200 |
2009 | 381250 | 2800 | 7440700 |
2010 | 356636 | 3200 | 8201300 |
2011 | 480056 | 3300 | 8839600 |
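If you would rather do the merge in code than in Excel, a minimal sketch might look like the following. It joins two lists on their year column, keeping only the years that appear in both. The array names, field names, and the `mergeByYear` helper are all made up for illustration; the values are just the first three rows of the table above, split back into two sources.

```javascript
// Hypothetical stand-ins for the two downloaded data sets. The values are
// the first three rows of the table above; the field names are assumptions.
const farmIncome = [
  { year: 2001, revenue: 183267 },
  { year: 2002, revenue: 211191 },
  { year: 2003, revenue: 194331 },
];
const cropYields = [
  { year: 2001, yieldPerHectare: 2200, productionTonnes: 5864900 },
  { year: 2002, yieldPerHectare: 1900, productionTonnes: 3522400 },
  { year: 2003, yieldPerHectare: 2600, productionTonnes: 6429600 },
];

// Index one data set by year, then walk the other and keep only the years
// present in both (an inner join on the year column).
function mergeByYear(income, yields) {
  const yieldsByYear = new Map(yields.map((row) => [row.year, row]));
  return income
    .filter((row) => yieldsByYear.has(row.year))
    .map((row) => ({ ...row, ...yieldsByYear.get(row.year) }));
}

const merged = mergeByYear(farmIncome, cropYields);
console.log(merged);
```

Each element of `merged` carries the year, revenue, yield, and production together, which is the same shape as one row of the table.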
I threw it up against d3.js and produced some code which was very similar to my previous bar chart example in HTML 5 Data Visualizations - Part 5 - D3.js.
I didn’t bother with any scales because it is immediately apparent that there does not seem to be any correlation. Huh. I would have thought the opposite.
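Eyeballing a chart is one way to look for a relationship; another is to compute a correlation coefficient. The sketch below is not part of the original analysis: it swaps in the Pearson coefficient as one possible numeric check, using the revenue and yield columns from the table. The `rows` layout and the `pearson` helper are assumptions made for this example.

```javascript
// Rows copied from the table: [year, farm revenue, yield per hectare].
const rows = [
  [2001, 183267, 2200],
  [2002, 211191, 1900],
  [2003, 194331, 2600],
  [2004, 238055, 3100],
  [2005, 218350, 3200],
  [2006, 262838, 2900],
  [2007, 300918, 2600],
  [2008, 381597, 3200],
  [2009, 381250, 2800],
  [2010, 356636, 3200],
  [2011, 480056, 3300],
];

// Pearson correlation: covariance of the two series divided by the
// product of their standard deviations. Returns a value in [-1, 1].
function pearson(xs, ys) {
  const n = xs.length;
  const meanX = xs.reduce((sum, x) => sum + x, 0) / n;
  const meanY = ys.reduce((sum, y) => sum + y, 0) / n;
  let cov = 0;
  let varX = 0;
  let varY = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    cov += dx * dy;
    varX += dx * dx;
    varY += dy * dy;
  }
  return cov / Math.sqrt(varX * varY);
}

const revenue = rows.map((row) => row[1]);
const yieldPerHectare = rows.map((row) => row[2]);
const r = pearson(revenue, yieldPerHectare);
console.log(r.toFixed(2));
```

This gives a single number to set beside the chart. With only eleven yearly rows, and revenue climbing over the decade for reasons that may have nothing to do with yield, it should be treated as a rough check rather than a verdict.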
You can see a live demo and the code for the chart over at http://bl.ocks.org/stimms/5008627