Old Documentum Architecture Habits are Hard to Break


A while back, John Kominetz wrote a nice post on The Elephant and the Blind Man. I’ve been following John for a while and looking for an excuse to link to his stuff, but I always get sidetracked. Aside from his fun habit of referencing Douglas Adams, he has been working with Documentum for a very long time and has developed a healthy skepticism about the product.

In his post on the Elephant, John talks about the load of Junk DNA in Documentum. As the product has evolved over the last 15 years, some things have been left behind and other things that worked haven’t evolved. My recent post on the Audit Trail has led to a couple of posts addressing both of these aspects.

User Names versus Object ID

I mentioned in the Audit Trail post that I add an index on either user_name or user_id. Rajendra wrote a post about the merits of each field in the system and how both are used. It is a nicely detailed post and comes to the same conclusion that I always had in my mind: the name is better for performance, and the ID is better from a space and data consistency perspective. Denormalization of the database can be good for end-user performance as long as you test it thoroughly, which Documentum has done for this feature.

Then came a comment from Andrew Goodale that explains why both are used. The ID was not guaranteed to be immutable because of how the LDAP synchronization worked. That has been resolved in D6, so going forward it will be useful. The user name is still everywhere though and it will be for years.

Just one more piece of junk DNA in Documentum.

Is the Audit Trail Secure and Complete?

As always when I talk about the internal workings of Documentum, James McGovern chimed into the conversation. He raised a few good points that I wanted to share and comment upon here.

First is security. I don’t always address security, in the same way I don’t always address breathing when I tell people about my day. It is something that I do but don’t always elaborate on. James is correct on the need for maintaining security on the logs. This isn’t just about limiting access to the logs; it also means separating the users who can change the logs from those who can only view them.

Documentum does a fairly good job at this. While the Audit Logs can only be stored in the same database as the rest of Documentum, they are locked down by default: super users do not have access by default, nor can they grant themselves access. You can even divide your system administration in such a way that the Documentum administrators can’t access the logs. There comes a point, though, where you have to trust your admins, as someone will always have access on some level.

As it stands now, changing the audit trail is only available through direct SQL access using either a database system user or the schema owner account, which does not necessarily map to an actual Documentum user. Remember that database access does not equate to access to the content as that is stored elsewhere, unlike SharePoint.

Having the Audit Log stored in a second system as a feature, and not in a custom Data Shack, would be a nice extension that should be simple to implement.

Second on his list was the concern that not everything was in the log. He raised the excellent point regarding Web Services. Now, I know that in DFS a session has to be established by the calling system. In architectures that are direct integrations to systems, users usually map one-to-one. What about other systems?

Take the scenario where objects in the repository are being updated by users of a remote system that do not correspond to users in Documentum. How do you map these? We created custom audit events within Documentum and then, in our Data Shack, we transform them into one coherent trail of actions. We track the calling system and the user from that system. We also have users defined specifically to operate our Web Services so we can correlate all of the actions.
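As a rough illustration of that transformation step, here is a minimal Python sketch of merging native and custom audit events into one trail. It is not our actual Data Shack code. The field names, the `ws_update` event name, and the `system:user` label format are all assumptions made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class AuditEvent:
    """One audit row. All field names here are hypothetical."""
    time_stamp: datetime
    event_name: str
    object_id: str
    user_name: str                         # repository user that owned the session
    calling_system: Optional[str] = None   # filled in only by the custom events
    remote_user: Optional[str] = None      # user as known to the calling system


def effective_actor(event: AuditEvent) -> str:
    """Return the remote user when one was recorded, else the repository user."""
    if event.calling_system and event.remote_user:
        return f"{event.calling_system}:{event.remote_user}"
    return event.user_name


def build_trail(events):
    """Order events by time and pair each one with its effective actor."""
    ordered = sorted(events, key=lambda e: e.time_stamp)
    return [
        (e.time_stamp, effective_actor(e), e.event_name, e.object_id)
        for e in ordered
    ]
```

The dedicated Web Services user still shows up in user_name, so actions taken through the service account can be correlated even when the remote user is missing.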

It seems like a lot of work, but it is necessary and saved us a lot of work after we went live.

This, of course, can lead to a big licensing issue. Many clients have licenses based upon users. In theory, these remote users count against the total, even if they don’t exist within the system. There are many different licensing models out there, so when planning, check that you aren’t in violation. This applies to any system, not just Documentum.

James, and everyone else, should remember that DFS is still at version 1.0. Version 2.0 is being released as part of D6.5. I think the version that comes out with D7 is the one to keep our eyes on, as it should incorporate feedback from the adopters.

5 thoughts on “Old Documentum Architecture Habits are Hard to Break”

  1. Laurence, my comment isn’t about trust but more about security practices that should be about least privilege, separation of duties, etc.

    I could “trust” someone to have access to both accounts payable and accounts receivable but that still doesn’t make it a good idea….

  2. James, I agree with your points. I added the trust piece. I work with many organizations that have a limited staff and they can’t separate duties as much as the average security person would like. For them, trust is important because they don’t have a budget to spread the duties out.

    The key is validation. The concept of a key that validates the log entries is a good one that should be investigated. If purges from the log are strictly limited, a simple validation check could see whether an entry was missing or altered.
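One way to picture such a validating key is a keyed hash chain, where each entry's digest also covers the digest of the entry before it. The Python sketch below is only an illustration of that general technique, not a Documentum feature. The function names and the entry layout are hypothetical, and it assumes the secret key is held outside the database.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous digest" for the first entry


def entry_digest(key: bytes, previous_digest: str, entry: dict) -> str:
    """HMAC-SHA256 over the previous digest plus the serialized entry."""
    payload = previous_digest + json.dumps(entry, sort_keys=True)
    return hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def seal(key: bytes, entries):
    """Return (entry, digest) pairs, each digest chained to the one before."""
    sealed = []
    previous = GENESIS
    for entry in entries:
        previous = entry_digest(key, previous, entry)
        sealed.append((entry, previous))
    return sealed


def first_broken_index(key: bytes, sealed):
    """Return the index of the first entry that fails validation, or None."""
    previous = GENESIS
    for index, (entry, digest) in enumerate(sealed):
        expected = entry_digest(key, previous, entry)
        if not hmac.compare_digest(expected, digest):
            return index
        previous = digest
    return None
```

An altered entry fails at its own position, and a removed entry makes the one after it fail. Trimming entries off the end of the log is not caught by the chain alone, so the most recent digest would also need to be kept somewhere else.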

    -Pie

  3. Chris Campbell says:

    No mention of the “junk DNA” that is still floating around in the workflow sequencing? With more emphasis on TCM in this iteration, I’m surprised that an update to the core objects hasn’t been done. My number one gripe with this today is the fact that dm_activity doesn’t have an instance object. Workflow objects, workitems, packages, and even queue items have instances, but not activities?!? Oh, the things I could do if only these shackles were removed!

    A close second is the lack of two-way referencing between workflow objects. Often it’s just one-way, or you have to construct methods to find out the relationship between a package and a workitem. (Note to self: consider constructing a custom database view to take care of this problem.) Call me lazy, but I’d be a happy guy if I could get all the workflow info I needed without having to fetch and query ten objects.

    There’s something to be said for backward compatibility, but at some point you need to cut the purse strings. I think the move to a lightweight sysobject is a HUGE step in the right direction. Personally, I like the way it’s designed and from Victor’s talk at EMC World, it seemed to me that the design also encouraged people to make the change willingly.

    One last thing, before I turn this comment into its own mini-blog. The 6.5 documentation that I’ve peeked at has the first mention that Docbasic is being deprecated. It’s only one sentence, but it carries a lot of weight. So much of 6.0 and 6.5 still runs on Docbasic to handle behind the scenes stuff that I’m wondering how EMC is going to pull this off. As much as I love Docbasic (and it’s a guilty pleasure to quickly script something in Docbasic), that little bit of software is the definition of Junk DNA. How do you get rid of Docbasic? Do you replace it and with what? That might be something to look at…

  4. Chris, I agree with you 100%. A book could be written on the junk DNA in Documentum and some of the older ECM systems.

    -Pie

  5. Chris, your 3rd point is an interesting one. I suspect that deprecating Docbasic is not quite the same thing as supporting Documentum’s own legacy code. D6/6.5 provides a DMCL/API layer as part of DFC and there is no reason why that won’t continue to be provided for the foreseeable future (I pity the developer who has to convert the object replication code into Java!); it’s just that you won’t get support if you log calls when trying to develop against the unsupported API.

    What are the alternatives? Well I’ve been playing round with jython as documented on my blog. It could do with a dctm package to make it really as ‘user-friendly’ as docbasic or iapi. Maybe I’ll start one some time but right now I don’t have the time (anyone else?)
