[Originally published on the TeraThink blog]
I recently attended the annual AIIM Conference in New Orleans. As expected, it was a great event with a lot of interesting presentations. I spent a lot of time talking to people, learning what they were doing, how they were achieving success, and hearing about what wasn’t working. I may also have had a beignet or two.
My chief interest was content analytics. There has been a lot of buzz in the industry regarding this capability, and I wanted to learn how real it was among practitioners. It seems like a simple concept: take the classification technology from eDiscovery tools and apply it at the front end of the business process. Instead of reacting, become proactive in analyzing and acting upon content.
I learned that it is going to take some pioneers to make this a reality.
Making the Business Case
Many vendors openly asked where the money would come from before they spent it on R&D. While I thought that having the best solution might be argument enough, they need an actual business case for the technology first.
At one roundtable, we talked about making the business case for developing solid content analytics tools. The simple answer is to not start with the end state of analyzing content and finding value. The quick win for vendors is to create analytics tools that can auto-classify content. Making adoption of the tools easy for staff while mitigating risk for the business is a straightforward sell.
For example, if we can tell a system that 100 documents are related to a specific business area, the system can take that information and find all the other documents that fit the same pattern. The value is knowing what you have and not being surprised down the road. The value is confidence that information is being properly managed because you know what you have. The value is in categorizing existing information in unmanaged locations.
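At its simplest, that seed-based approach boils down to measuring how closely unlabeled documents resemble the labeled examples. Here is a minimal sketch of my own (not any vendor's implementation) using bag-of-words cosine similarity; the function names and sample documents are invented for illustration:

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term counts for a document."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors (0.0 to 1.0)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def score_against_seeds(seed_docs, candidate):
    """Average similarity of a candidate document to the labeled seed set."""
    cand = vectorize(candidate)
    return sum(cosine(vectorize(s), cand) for s in seed_docs) / len(seed_docs)

# Two seed documents labeled as belonging to an "invoicing" business area.
seeds = ["invoice payment terms net 30", "invoice amount due remittance"]
print(score_against_seeds(seeds, "remittance advice for invoice payment"))  # scores high
print(score_against_seeds(seeds, "company picnic this saturday"))           # scores near zero
```

Production tools use far richer models than this, but the core idea is the same: documents that score above a threshold get flagged as candidates for the same classification.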
Once tools solve that problem, they can look for patterns and anomalies across all content. By leveraging data analysts, and maybe the mythical data scientist, organizations can realize even more value.
Is it ROT or Money?
There was a lot of debate on what to do with all of this information. Many want to get it correctly classified in order to remove duplicates and start purging “useless” information. This is best captured by the term ROT, which stands for Redundant, Obsolete, and Trivial. Let’s break this concept down.
Redundant is simple. I do not need over 200 copies of the invitation to the company picnic in my systems. The fact that there are 200 copies of anything may be of interest to me, but I don’t need all 200 copies.
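Finding those 200 copies is the mechanically easy part: hash each document's bytes and group identical digests. A minimal sketch, with invented file names and content:

```python
import hashlib
from collections import defaultdict

def find_duplicates(files):
    """Group documents whose bytes are identical, keyed by SHA-256 digest."""
    groups = defaultdict(list)
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        groups[digest].append(name)
    # Only groups with more than one member are duplicates.
    return [names for names in groups.values() if len(names) > 1]

files = {
    "picnic_invite.docx": b"You're invited to the company picnic!",
    "picnic_copy.docx": b"You're invited to the company picnic!",
    "q3_forecast.xlsx": b"Q3 revenue forecast",
}
print(find_duplicates(files))  # [['picnic_invite.docx', 'picnic_copy.docx']]
```

Exact-hash matching only catches byte-identical copies; near-duplicates (a re-saved Word file, a forwarded email) need fuzzier techniques, which is where the analytics tools earn their keep.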
Obsolete is a little trickier. Do you really need the first 20 versions of a requirements document? Maybe keep the versions that were final for a period of time, but not every single version qualifies. How about a 15-year-old policy manual for the finance department? That information is clearly past its useful life and may not be needed anymore.
Trivial seems clear. An email about leftover lunch in the kitchen is important until that lunch is gone. At that point it becomes trivia. The leftover-food email is among the first you can delete when you return from vacation. There is no apparent value in keeping it.
What if there were a direct correlation between the leftover-food emails and your company’s revenue? The more emails, the better revenue is going to be against goals. How can you determine that relationship? You would need a few years of emails to measure the rate of those emails against revenue. If you don’t know how many are sent in a bad quarter, you cannot determine how many it takes to identify a good quarter.
That’s the thing about ROT: you don’t know what you will find until you analyze the patterns, and you cannot identify patterns in content that no longer exists. This debate is one that accelerated at AIIM and is going to continue for a while.
Next Year in Orlando
Next year the AIIM Conference will be in Orlando. It will be interesting to see how things evolve in this space over that timeframe. In the meantime, I’ll be talking to everyone I can about content analytics to learn more about what people, organizations, and vendors are doing. There is going to be a lot of change out there.
If you’d like to talk about what is happening, or just share something you’ve observed, drop us a line and we can chat. I’d love to hear your stories, be they successes or warnings.
[Editor’s Note: This post was originally published on the blog of Dominion Consulting. On November 1, 2017, Dominion Consulting merged with TeraThink, and the combined company now operates as TeraThink. All blog posts migrated from the Dominion Consulting website have been updated to refer to ourselves as TeraThink.]