Friday, 31 May 2013

Brave new world... are we there yet?

Following recent high-profile terrorist activities, one of my friends asked me whether it was possible to leverage the latest IT solutions to monitor every email, chat room and internet message for terrorist-related activity.

Setting the legal, privacy and personal freedom issues aside, let's take a pragmatic approach. It is easy to look at the new big data solutions and the amazing hardware that is available and say "in theory, it can be done".

In theory......

Firstly, have a good honest look at the data on your systems. How good is it? Perhaps 80-90% correct?

Let's assume that there are about 1,000 terrorists in the UK. We have approximately 64 million inhabitants. If we have access to every piece of information that is exchanged, even if the data is 99.9% accurate, the remaining 0.1% error rate would wrongly flag roughly 64,000 people. We would probably need to arrest and question all of them, just to capture 1,000 terrorists.
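For anyone who wants to check the sums, here is a rough sketch of the arithmetic in Python. The population, the number of terrorists and the 99.9% accuracy are the assumed figures from the paragraph above, not measured values. The function name `screening_outcome` is made up for illustration, and I am assuming the same accuracy applies to both the guilty and the innocent.

```python
# Back-of-envelope base-rate calculation using the assumed figures
# from the text: 64 million people, 1,000 terrorists, 99.9% accuracy.

def screening_outcome(population, targets, accuracy):
    """Return (correctly flagged, wrongly flagged, share of flags that are right)."""
    innocents = population - targets
    # Assumption: the system is equally accurate on both groups.
    true_positives = targets * accuracy
    false_positives = innocents * (1 - accuracy)
    flagged = true_positives + false_positives
    precision = true_positives / flagged
    return round(true_positives), round(false_positives), precision


caught, wrongly_flagged, precision = screening_outcome(64_000_000, 1_000, 0.999)
print(f"Terrorists flagged:       {caught:,}")
print(f"Innocent people flagged:  {wrongly_flagged:,}")
print(f"Chance a flag is correct: {precision:.1%}")
```

Under these assumptions, fewer than two in every hundred people flagged would actually be terrorists.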

That would annoy a lot of people!

Now given that your computer has perhaps 90% correct data and that others are far worse than you... How much data do you think is actually correct on the internet? How much of the information that people exchange through the internet is true? It's a darn sight less than 90%.

Then you have people using code words, different languages and encryption.

We can build the most powerful computers in the world, and the most sophisticated solutions, but unless the data is correct, we don't stand a cat in hell's chance of getting anything right. A system is only as good as the data that it holds.

So don't expect Minority Report-style analysis of our data, with predictions of where bad things will happen next, any time soon. For if the truth is known, these amazing new systems will probably fall apart when we put real-world, inconsistent data into them.

Tuesday, 28 May 2013

Defection to Linux

It's official. I have changed my home computer's operating system over to Linux. I have had an iMac for the last 6 years. 

It has been a great piece of kit, and has delivered stable performance throughout that time. There is no doubt that when I bought my iMac, it delivered the very best computing experience that was available at that time.

But things change.

The new version of OS X, Mountain Lion, cannot be installed on my machine because its specification is too low. This was to be expected, as my iMac only just runs Lion, and has become extremely slow and more than a little buggy. I decided to check out the new range of Apple computers. Frankly, I was disappointed.

None of the machines in the new range has an optical drive any more. For machines costing over a thousand pounds, you should expect them to be bristling with features.

I have watched over the years as OS X has become increasingly restrictive. The way iTunes' digital rights management locks you into the Apple revenue stream is the clearest example of this. With the advent of the software centre and the removal of the optical CD/DVD drive, they are moving ever closer to dictating to users what they can and cannot do with their own computers.

For me, removing the CD/DVD drive from all of their new range of home computers was the last straw. I decided that I wasn't going to cave in to Apple's plan for the designed obsolescence of my machine and the locking of me into their revenue stream.

I first experimented with Linux operating systems running on virtual machines (Mint, Ubuntu, CrunchBang, Mageia, Debian and Fedora, to name a few). When I found the one I liked, I partitioned my iMac and dual booted into it. The performance is outstanding compared to Lion, and it is rock-stable. I have TOTAL control over my computer, and if I don't like what is on it, I can make any changes I want.

Linux has breathed new life into a computer that I thought I would need to trade in for a newer model. I am back to being happy with my iMac, and I predict that I have extended the useful life of my computer by another 3-5 years. 

Saturday, 25 May 2013

Data Quality - when the goalposts move

Abbreviating, shortening or simplifying language is not new. In England, Pitman shorthand was introduced in 1837. It was designed to record meetings and dictation, and quickly found favour amongst secretaries all over the world. Businesses are always trying to find ways of speeding up communication with acronyms. And who could forget the CB radio craze of the 1970s and 80s?

The internet and text speak have introduced some real changes in grammar, spelling and syntax. For instance, text speak shortens words, such as "U" instead of "you". Joining the words of a phrase into one word delimited by capital letters, and then adding a popular file extension to make it look like a file name, is an internet forum trick: "can't believe what's happening" becomes CantBelieveWhatsHappening.jpg.

Twitter's 140-character limit has forced the use of hashtags, where a # precedes a search term, e.g. #DataQuality. Similarly, when referring to people's user names, you precede them with an @, like @TheDataGeek (which is my user name - follow me).
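To show how a system might pick these markers out of a message, here is a minimal sketch in Python. The patterns are my own simplification (a # or @ followed by letters, digits or underscores) and not Twitter's actual rules, and the function name `extract_tags` is made up for illustration.

```python
import re

# Simplified patterns: a marker that does not follow a word character,
# then one or more letters, digits or underscores. The lookbehind stops
# email addresses such as someone@example.com being read as mentions.
HASHTAG = re.compile(r"(?<!\w)#(\w+)")
MENTION = re.compile(r"(?<!\w)@(\w+)")


def extract_tags(message):
    """Return the hashtags and user names found in a message, without the markers."""
    return {
        "hashtags": HASHTAG.findall(message),
        "mentions": MENTION.findall(message),
    }


print(extract_tags("Thoughts on #DataQuality from @TheDataGeek"))
```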

What makes text speak stand out is the sheer speed and scale at which it has been taken up by the international community. Integrate this with the internet, and you have one of the most significant changes in international communications in modern times.

While the greater corporate interests are well known, some interesting social trends are beginning to emerge. People are starting to use text speak while filling in formal documents like CVs and business letters. Presently, this is frowned upon, but soon we will have to amend our algorithms to allow for it.

How long will it be before some people will want to put a delimiting character before their name? The possibilities are endless:

My name could become @RichardNorthwood or @richardnorthwood or even #RichardNorthwood

Could the &, @ or # become new gender neutral salutations? Will people start using other symbols in their names? Could we see the removal of spaces between words? Could we see a further simplification of the spelling of words? It's possible. Whatever happens, our information systems must evolve to cope with the biggest change in the way we communicate since the Gutenberg printing press.
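As a rough idea of what coping might look like, here is a minimal Python sketch of a name-cleansing step. It assumes the only markers we need to handle are a leading &, @ or #, and that capital letters mark word boundaries. The function name `normalise_name` is made up for illustration.

```python
import re


def normalise_name(raw):
    """Strip a leading &, @ or # and split a joined name on its capital letters."""
    stripped = raw.strip().lstrip("&@#")
    # Insert a space wherever a lowercase letter is directly followed by a capital.
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", stripped)


for name in ("@RichardNorthwood", "#RichardNorthwood", "@richardnorthwood"):
    print(name, "->", normalise_name(name))
```

The first two come out as "Richard Northwood". The all-lowercase version stays as "richardnorthwood", because there is nothing left to tell the system where one word ends and the next begins. That is the data quality problem in a nutshell.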