Data, Content, Information, and Records Management


Information Coalition's initial view on the relationship between data, content, information, records, knowledge, and documents

There are so many terms for the things that we manage every day. Most people’s understanding of them is a remnant of what they learned when they entered the industry, expanded by how they use the terms in daily life. The Information Coalition is working on its InfoBok, which seeks to finally define these disciplines.

Recently, I was part of a Twitter discussion with several people, primarily hailing from the web side of the content management world. It has been many years since I made the argument that the world of Enterprise Content Management (ECM) should include the Web Content Management (WCM) space. The worlds turned out to be connected but distinct. The uses of the word “content” and how it relates to information are evidence of that difference.

I thought I would take the time to share my thoughts somewhere with more than 280 characters to frame them. Hopefully, this will stir some more discussions.

Data to Information

The one thing we all agreed upon was that data is the base unit. All the other terms are built upon data. A piece of data by itself doesn’t mean much. It takes context to turn it into information.

  1. Laurence: Likely a name but of what? A person? A street?
  2. First Name=Laurence: Okay, now we know it is a person’s first name but still just data about a person.
  3. My first name is Laurence: That is information. The data is labeled and assigned a context.

A row in a database is data. Sure, with good table and field names it can be information but that is typically only true of primary entities, and not always then. If you extract the data and place it into a form, you then have information. One could argue that it is a collection of multiple pieces of information.
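The progression from bare data to information can be sketched in a few lines of Python. This is only an illustration of the idea above, not a formal model: the `Datum` class, its `label` and `context` fields, and the rule in `is_information` are all hypothetical names and assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Datum:
    """A single value, optionally labeled and placed in a context (hypothetical model)."""

    value: str
    label: Optional[str] = None    # e.g. "First Name"
    context: Optional[str] = None  # e.g. whom or what the value is about

    def is_information(self) -> bool:
        # Assumption for this sketch: data becomes information
        # once it carries both a label and a context.
        return self.label is not None and self.context is not None


# The three steps from the list above.
bare = Datum("Laurence")
labeled = Datum("Laurence", label="First Name")
informative = Datum("Laurence", label="First Name", context="the author of this post")

for datum in (bare, labeled, informative):
    print(datum.value, datum.label, datum.is_information())
```

Under these assumptions, only the third object counts as information. A database row with poor field names resembles the first or second object: the value is present, but the label or the context is missing.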

Now what if you insert those pieces of information into a piece of content?

Information to Content

The logic put forth by those in the web industry was that a piece of content has multiple pieces of information in it, and as such, information is a subset of content. This makes sense in the world of the web. The goal is to publish and produce content for people to consume. The content’s goal is typically to inform and drive an action.

It totally makes sense.

And they are correct. Content does contain information and when done correctly, a single piece of content addresses a single purpose or goal for the consumer.

But content doesn’t stop there.

Information as a Broader Construct

In the Enterprise Content Management (ECM) world, content is part of a bigger picture. In the typical case management application, you will have content submitted by a requester, content and data referenced by the reviewer, content produced by the reviewer, and data captured about the entire process.

When we collaborate on a document, be it a design, a proposal, or a quarterly report, we are creating a piece of content. That content is surrounded by pieces of information that need to be included and discussions about how the content should be constructed.

An image is content but it requires metadata for better understanding of what it contains, when it was created, and the proper usage rights. With video it is even more complicated.

In the ECM world, content never stands alone. It is a piece of information that leads to a business decision, which then leads to an action. Information in many ways is like data. It can be both singular and plural. When ECM people think about content, they see a container that is similar to how data people see a database table. It can wholly contain information but it is just part of the larger information landscape.

What About Records?

This is the simple bit. Records are collections of information that describe an entity, a transaction, a process, or any discrete thing.

  • My customer information at Amazon is a record.
  • My order of five books is a record that is also part of my larger record.
  • Each of the two shipments that were used to send me my books is a record.
  • My Amazon Smile donation to Girls Who Code is a record.

There is no content there. If I decided to send a letter to Amazon telling them that I was going to bring legal action against them for not adequately protecting my information, that letter and any legal response would be part of both my record and a new legal record.

Two things to notice. First, the same information, be it data or content, can be part of multiple records. Second, a record can be just data, a gap that needs to be crossed by both Information Governance and Data Management professionals.
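Both observations can be shown with a small Python sketch. It is a toy model under stated assumptions, not how any records system actually works: the `Item` and `Record` classes, the `kind` field, and the record names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """One piece of information; kind is either "data" or "content" (hypothetical)."""

    kind: str
    description: str


@dataclass
class Record:
    """A named collection of information items describing one discrete thing."""

    name: str
    items: List[Item] = field(default_factory=list)

    def has_content(self) -> bool:
        # A record may be made up of data only.
        return any(item.kind == "content" for item in self.items)


order = Item("data", "order of five books")
letter = Item("content", "letter about bringing legal action")

order_record = Record("order", [order])
customer_record = Record("customer", [order])  # the order is also part of the larger record
legal_record = Record("legal", [letter])
customer_record.items.append(letter)           # the same letter belongs to two records

print(order_record.has_content())     # data-only record
print(customer_record.has_content())  # now includes the letter
```

The same `order` object sits in two records and the same `letter` object sits in two others, while `order_record` remains a record with no content at all.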

All About Context

There are a few lessons to be learned. The first is to let the web people believe what they want to believe. They are working towards a different goal and their paradigm works for them. I still fight to get many of them to understand that there are Content Management Systems (CMS) that have nothing to do with the web. Finally, you will never win an argument with them on Twitter. They outnumber you and for many of them, it is their job to use and understand social media.

The second is that your relationship with content and information is firmly rooted in how you use them. For web and communication professionals, a piece of content is a collection of information that represents what they are delivering. They deliver collections of content in brochures, websites, and other publications.

For ECM professionals, content is just one part of the information we capture. For us, content is just another piece of information that makes up the entity we are managing. Sometimes we are just managing the big picture in our systems but supporting bigger information systems with content provided by content services.

The real difference is that in the web world, content is the means to their ends. In the enterprise, content is just the starting point.