Delighted to have a post from Advent IM Operations Director, Julia McCarron.
Ellie has been asking me for a while now to do a blog piece on ‘big’ data, and I must confess to dragging my heels because I wasn’t really sure what it was. I guess if I had put my mind to it, I would have said it was essentially the aggregation of information that made it ‘big’, and I’m not far off with that. But last night’s edition of Bang Goes the Theory made me think about what it means … and the fact that ‘big’ is probably too small a word to describe its reach.
If we want to be specific about it, big data is defined as a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. But it seems to me that this 2-D definition doesn’t do it justice. From what I can see, it’s about taking these large data sets and analysing them to find patterns – that’s what makes it ‘useful’. What you do with those patterns can be for good or bad and can range from diagnostic to research to marketing to preventative in nature, and affect people, places, processes, objects … you name it basically.
I know this kind of analysis goes on because I have a ‘loyalty’ card that regularly sends me money-off vouchers for the things I buy on a frequent basis. I know internet banner ads show me handbags for a reason, usually because I’ve just purchased another one online. I understand that it’s the accumulation of data about my buying habits that is profiled to appeal to me; but I hadn’t realised just how far this can go. On the programme in question a big data collection company said that, following the release of DfT data on bicycle accidents, someone had within days written an app that told people where to avoid riding their bicycles and so minimise the risk of having such an accident. Who would have thought that was possible? Rolls-Royce engines contain computers that analyse their activity whilst in the air and report in real time on peaks and troughs outside the ‘norm’, which enables airlines to do maintenance work before a problem occurs.
But if you think about it, big data isn’t new. Einstein’s theory of relativity came about because he carried out hundreds of experiments and analysed them painstakingly by hand. Intelligence services cracked Hitler’s codes by looking for recurring patterns, at first relying entirely on the human brain, before that human brain created machines to make the analysis easier and quicker. I only get 100 free ‘bonus’ points with my next purchase of Warburton’s crumpets because a computer looks at my buying habits and has identified that I buy them every week. (Other crumpets are available – actually no they aren’t.) All that has changed is the scale, speed, selectiveness and sensitivity of the collection and review of that data.
The issue comes, though, when that big data is also personal data, and this is probably where most of us start to question whether it’s a good thing or a bad thing. The BGTT Team demonstrated how easy it is to profile individuals from their online data footprints. It’s not just about what you put on various social media; it could also be an innocent publication of contact details by your local golf club. I’m a security-conscious person, for obvious reasons, but I’m sure that if someone really wanted to they could find out more about me than I thought was possible, just by running a few scripts and analysing trends. I’m a genealogy enthusiast and within minutes I could potentially find out when you were born within a three-month window, the names of your siblings, your mother and father … and the answers to those all-important security questions: your mother’s maiden name and the town of your birth. So should we attempt to simply lock everything down?
At the same time as all this personal big data is being analysed, it’s also being put to good use. Researchers are creating medical devices that can analyse brain activity and detect when a second brain trauma is occurring … and they’ve done this by analysing patterns and trends from hundreds of thousands of scan outputs to create a simple, non-intrusive device that monitors pressures, electrical current and stimulus. If I opt out of having my NHS patient record shared, I could make it that bit harder to find a cure … or be cured.
Ultimately, we wouldn’t be where we are today without big data, and there is no doubt that in a digital age big data will just keep growing exponentially. I don’t think we can avoid big data and I don’t think we should, but from a security perspective I think we all just need to think about what we post, what we agree to make available, what we sign up to and what we are prepared to say about ourselves in public forums. If a field isn’t mandatory, don’t fill it in; don’t agree to your location being published; and maybe tell a little white lie about your age (girls, we are good at that!). We can never be 100% secure – it’s not possible. Even our fridge can go rogue on us now and order food we’ve run out of but don’t actually want to replenish. But having a security-conscious mind can protect us, whilst still letting us make a big data contribution.
Some images courtesy of freedigitalphotos.net