Saturday, July 20, 2013

Big Data: Big, Fat, Stupid Data

I don't think it can be plausibly denied that there is an incredible amount of data available for analysis, and that there is great potential in discovering patterns of behavior that can be leveraged to turbocharge customer experience and drive amazing financial results for businesses. It's just that nobody seems to have quite figured out how to get from point A to point B in spite of millions of dollars spent on doing exactly that.

The amount of data being collected is massive: because storage has become very cheap (a terabyte drive costs less than $100 at retail), companies can now afford to hang onto the mountains of data that they used to throw away or simply regard as not being worth collecting in the first place. Every click or keystroke of every visit to every site can be stored for analysis. And this doesn't even begin to touch the vast amounts of data that can be gathered from the Internet, particularly from social media forums in which over half a billion people declare which products they like and speak about them.

It's a huge mountain of data and there is the automatic assumption that anything that is very big is very important - but this is not always so. If measured solely by the number of bytes, most data is in the form of images and videos - one image can take as many bytes to store as a thick book, and one video can represent as many bytes as a small library. But images and videos are not easy to analyze, nor has anyone suggested that most snapshots and home videos contain much content that is meaningful at all.

But even for the data that is ready-to-eat (ASCII text that can be easily parsed and analyzed), we're not doing a very good job of it. Much of the data is locked up in databases that are stubbornly unavailable outside of the source or for any other purpose than to support the system to which they belong. Moreover, there are no standard formats and little interoperability, making it a chore to extract the data that is available.

And once the data is extracted, there's not much intelligence applied to the analysis.   Raw statistical analysis is practically pointless, as it tracks correlation without causation, rendering reports that do not serve to guide meaningful decisions.  There merely happens to be a statistical correlation between two bits of data and little can be done with it.

As a result, firms are data rich and knowledge poor, focused entirely on the numbers without considering their meaning.   At best, data analysis can tell you what people are doing, but not the reason why they are doing it.   The result of all of this is reports that provide random trivia or completely self-evident conclusions.   People who own dogs tend to buy a lot more dog food than people who do not.  Thanks a million for that brilliant insight.

All in all, I'm dour on the prospect of big data.   I cannot deny that there's a lot of information, and it stands to reason that there is likely to be highly valuable insight to be gleaned, but the approach that has been taken thus far has not produced anything particularly meaningful.

My experience dealing with numbers geeks is that they are reluctant to tell you anything until they have analyzed everything - which means that progress will likely continue to be painfully slow.   It also occurs to me that since it's going to take years to get anything useful out of the analyses of big data, perhaps it's worth taking a second look at small data.  Firms that have not quite figured out what to do with the limited subset of information in their order database (which is to say, most firms) likely have no business attempting to tackle the much larger task of aggregating information from all internal systems as well as harvesting it from the world outside the firewall.

And yet, that seems to be exactly what is happening, leaving me with the distinct sense that many firms are using "big data" as an excuse for their failure to make good use of the smaller collections of data that have been at their disposal all along - and hoping they can use the bigness of the task to convince others to refrain from expecting any meaningful results, at least until they are ready to retire or move on.

Seems to me that this has turned into something of a pessimistic rant, but it is not without a reason: very little seems to have been accomplished, and there aren't many signs that anything will be accomplished in the near future. Until some results have been rendered, even a sufficiency of small "quick wins," I will remain highly skeptical: I don't doubt that there is valuable information to be found, but I do doubt that the prospectors have a clue as to how to go about finding it.
