Getting to the Big Data Problem

The amount of data in your organization is big. You just won’t believe how vastly, hugely, mind-bogglingly big it is. I mean, you might think that it takes a long time to read your inbox, but that’s just peanuts to how much your organization touches in a single day.

– Mangling of a quote by Douglas Adams in The Hitchhiker’s Guide to the Galaxy

The amount of data in your organization is massive. Anyone who has been in the Content Management industry for more than a few years can tell you that much. All those content repositories are nothing more than messy, poorly structured data warehouses.

The part that I didn’t realize until watching Clay Shirky’s keynote at AIIM 2012 was that the amount of data that many organizations are amassing isn’t always enough. Many organizations just aren’t dealing with what I will now shockingly classify as a “traditional” Big Data issue. They don’t have the volume, variability, variety, or velocity of data. (Your actual “V”s may vary.)

This is actually an even bigger problem. All those organizations are leaving useful insights on the table. How can I tell what my customers, employees, or members need without actually tracking what they are doing?

Let’s talk about you, the reader. If you are reading this, you are likely an Information Professional and the kind of person that AIIM wants to help become better at solving all these information challenges.

Maybe you’ve already been to the website. Maybe you found it through a search on how to solve a problem. Maybe you found one resource and left when you couldn’t find another. Maybe you stayed a while and left a series of intelligent comments on blogs and in discussions.

Do I know this? Some of it. I can dig in and surface all sorts of web traffic information. What I can’t do is track your engagement on the site and use it to provide you with more content. I can’t use it to identify good content for others. I can’t use it to surface your contributions based upon the quality of the information.


Because the data isn’t there. I don’t have a big data problem. It isn’t a matter of coming up with questions, monitoring trends, and uncovering new questions. Without all that data, I can’t even begin to determine what questions I really need to be asking.

Should I “gamify” the website? Hate or love the term, some people do respond when they are recognized for their contributions. Some people do respond to leaderboards. Some people use those tools to FIND quality, not just measure it.

The problem is, how can I reward behavior that I don’t measure? How can I determine what behavior should be rewarded?

I have to collect data. I have to start amassing data so I can then find out what I have been missing. I have to do all of this in order to better serve my fellow members of AIIM.

I have to go out and create my own big data problem.

If I don’t, I’m making the most important question unanswerable.

I don’t even know what that question is.

5 thoughts on “Getting to the Big Data Problem”

  1. Yes, the problem with Big Data is that it brings on many questions:

    1) Is the data model that we use correct?
    2) Is how we collect data correct?
    3) Is data correlation linked to some causality?
    4) When we act because of the data will our actions cause intended changes?
    5) Is how the changes will be measured related to our actions?
    6) Do we understand all the cross-influences of data and action?
    7) Will human behavior be reflected in the data?
    8) Will a human response even be reflected in the data?
    9) Are my statistical methods filtering out relevant detail?
    10) Can Big Data actually produce any sensible questions?
    11) And worse, given the above, can it actually produce sensible answers about people?

    You are stumbling into the same trap as everyone else.


    • Not sure I am stumbling into a trap. I have a reason for collecting the information. If I collect the data but can’t make sense of it, I know as much as I know today. I understand that it requires some real analysis and planning but I actually have a few data geeks to lean on during this process. I’m sure I won’t collect everything I need and will collect some things that I don’t need. The goal is to start collecting, analyzing, and then refining.

      I know for a fact that there is data that I am not collecting that I can use to make things better. I’m going to start down that path. Now (or yesterday actually).


      • Lawrence, you simply need to be aware that the data are created by YOU and your assumptions and are not simply there … assuming that the data is there AS-IS is the trap!

        The usual response to available data is that they are treated as a fact, because they were measured and are not just intuitive opinions. But they are still based on the 10+ assumptions that I pointed out.

        Knowing how many clicks you had on your website doesn’t really say much beyond a lot of activity. Why, and what that means, is assumption.

        Taking it slow and iteratively is a good approach. You will learn something in the process, even if it is what doesn’t work. All the best.


  2. Michele Kersey says:

    Knowing what questions to ask … analytics can help with that of course. And as Douglas Adams advised, we may not know the question, but the answer is always 42!

