Wednesday, December 26, 2012

R, ah

Today I was back in the office, but it was very much denuded of people, which gave me a chance to buckle down and work on getting to grips with R.

R is a statistical workbench of a computer programme, a heinously complicated yet very capable set of tools for analysing data. It's designed more for a programmer who's good at stats than the other way round, though: it laughs in the face of easy user interfaces, and although there are a million and one things it can do, everything involves installing an extra library here or there.

I'm therefore not sure if it's been efficient for me to spend eight hours doing something that would take about four hours in Excel. The answer sounds obvious, but once you've done something once in R, it's much easier to do it again, and a competent R programmer can have it happily sucking in data and belching out information very quickly. I'm not that programmer, so I've spent hours fighting with it, trying to get it to play nice with the other tools installed on my laptop, and occasionally having glimmers of hope when it produces a nice-looking box plot or a nicely formatted graph.

There are some really cool things it will do with mapping data, which allow you to mash together any geo-encoded data you like, whether it's crime rates, house prices or delivery fees for Pizza Hut, and then output a beautiful-looking map. That is, if you can get the mapping data in there: I'm fighting with documentation that's five years old, software that was written a month ago, and a grumpy man in Norway who believes everything in the world would be alright if we just RTFM.

We'll see. It's been interesting to discover that Singapore gives away mapping data for free to anyone with an internet connection. I just need a problem for my solution now. Then the people of Singapore will wonder at the glory of my correlation chart between MRT pricing and number of children in a family. We're on the edge of a great era.

Or was that 'error'?

