So the other day I was looking at why one of the lookups in my system was always providing me a blank value. The keyword DISTINCT seemed to keep it down to one, but it didn’t do more than limit the amount of blanks to one. It is a required field, so the user can’t save it with that value, but even with a valid default value, it is something that annoys me. So I began trying to banish it from my sight.
Populating Lookup Values
There are several ways to have lookup values for a piece of metadata in Content Server. One is a typed list. Simple, efficient, and fool-proof. However, the results sit in the data dictionary, making the lookup not very dynamic. The ability to make it conditional is also limited by the capabilities of Docbasic.
The second is to use a registered table. This is quite flexible, but there is no easy way to transfer this from one repository to another without using native database interfaces.
The third is to use a DQL Query against objects in the repository. This option seems to have everything. It is dynamic, so that as the data changes, the lookup values change. It is native to Documentum, which allows users to manipulate the data without knowing DQL or SQL. It can also be transferred quite readily from one Repository to another. However, there is one little problem, NULLS.
How Object Data is Stored
Experienced Documentum professionals can skip this part.
Documentum is an object-oriented ECM platform. However, it relies on a relational database to store data, such as Oracle or SQL Server. Each “object” consists of two database tables, one for normal attributes and one for repeating attributes. In addition, for each object, data is stored in the tables for the parent type as well. So if I create a new document, rows are created in dm_sysobject_s, dm_sysobject_r, and dm_document_s. There are no repeating attributes specific to dm_document, so it doesn’t have a repeating attribute table. Actually, dm_document doesn’t have any specific attributes, so the dm_document_s table is just a list of Object Ids (r_object_id).
To allow users to query data, EMC created DQL and an invisible layer to translate any DQL statement into SQL. When I issue a DQL query for all documents, it performs all the database joins for me behind the scenes. Quite useful. This does lead to a lot of joins as your object model grows in depth, but I have heard that for those on SQL Server, the upgrade to SQL Server 2005 (supported as of Content Server 5.3 sp4) is well worth the upgrade from a searching performance perspective.
Queries Against Repeating Attributes
So, back to my dilemma. The original design created a custom object type, foo_lookup, that had one custom attribute that was repeating. The object, when created, would be named after a valid user group in LDAP and the repeating attribute would have the valid values that members of this group could select. Users could be in more than one group and the groups where hierarchical. As each higher group had more options and selected them, the lower groups lost their ability to edit the object’s metadata. However, everyone was seeing a blank in the lookup.
This appears to be an inherent problem with querying against repeating attributes. You go to the boards and you see people talk about it. I worked on how to get rid of it to no avail. Then I realized something. That wasn’t a blank, it was a NULL.
This table illustrates the default values for a standard dm_document in the dm_sysobject_r table assuming no optional metadata is stored. Note that there are two rows by default.
|r_object_id||i_position||authors||i_folder_id||r_version_label||[13 other fields]|
For every object, there is at least one repeating attribute table, dm_sysobject_r. For my custom object, I had a second repeating table, foo_lookup_r. If I had four values in foo_attribute, then I had four rows in both of those table for that object. The repeating attributes with less than four values had a NULL stored in the value. When Documentum retrieved the values through the DFC, it ignored the NULLs and brought back only the valid values.
If you run a DQL query against a repeating attribute that has less values than the max repeating attribute count, the NULLs are returned. However, why was I having problems when I only had one repeating attribute? Simple. One group, to which everyone belonged, had only one valid value. However, there were two rows in my repeating tables. I finally found the culprit, r_version_label. As shown above, every object has two version labels by default, the numeric label (1.0) and the label CURRENT. That gave me two rows from the start. So any query returning a repeating attribute only one entered value returned a NULL value as well.
This table illustrates the values for a my foo_lookup object the foo_lookup_r table.
You can see the NULL value that was the source of my consternation.
I cheated. I went to my single-valued lookup and duplicated the single entry. Instead of Value1 being in there once, I had Value1, Value1,replacing the NULL. When I queried with my lookup and used DISTINCT, I didn’t have any problems. A registered table would work more cleanly, and I think I have another way that will work better for my specific circumstance, but duplicating a valid value should work for everyone in this situation. It is simple, easy, and doesn’t require any design changes.
The lesson, nothings perfect, so remember to improvise.
[Edit: Be sure to check the comments for alternate approaches.]