EMC World 2011: Documentum Performance and Scalability, Part 2

This is the second half of Ed Bueche’s annual performance session. It’s a standard must-attend at every Momentum, and I’m not going anywhere.

  • This talk focuses on mission-critical apps and data modeling
  • Containment models
    • Object in a folder: intuitive, but cumbersome for modeling many-to-many and peer relationships
    • dm_relation: intuitive for peer-to-peer relationships, but limited UI support
    • Virtual documents: good UI support, ordered, versioned; many-to-many and peer relationships aren’t easy
    • Parent object to lightweight sysobject: good for master-detail relationships, space efficient; limited UI support and lightweight-object restrictions
    • Multiple content objects (a single sysobject with multiple content objects): master-detail, space efficient, but extremely limited support for per-content operations
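As a hedged sketch of the dm_relation option above (the relation name and object IDs here are made up for illustration, not from the talk), a peer relationship between two documents can be created and queried in DQL roughly like this:

```sql
-- Create a peer-to-peer relation between two documents
-- ('review_of' and the r_object_id values are hypothetical placeholders)
CREATE dm_relation OBJECT
SET relation_name = 'review_of',
SET parent_id = '0900000180000001',
SET child_id = '0900000180000002'

-- Find all children related to a given parent
SELECT child_id FROM dm_relation
WHERE relation_name = 'review_of'
  AND parent_id = '0900000180000001'
```

The modeling is straightforward; as noted above, the trade-off is that out-of-the-box UI support for dm_relation is limited.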
  • Modeling Structured Data
    • Registered tables: simple to query, but little UI support
    • Process variables: simple, but hard to create optimal queries against
    • Structured data types: allow efficient, complex structured data in a workflow process; the method for constructing an index is less than intuitive
    • CenterStage data tables: easy for users, but can lead to an over-abundance of types
  • Structured Content
    • Single sysobject/multiple content objects: lowers the RDBMS footprint
    • Virtual documents
    • Single content object/multi-paged PDF document: for linearized PDF with viewers; random page access limited by type
  • Tips for RDBMS Indexes
    • Test app queries with production-like metadata volumes
    • The model shouldn’t be considered complete until its queries have been tested
    • Aside from those supporting security models, indexes will likely be needed for application queries
    • If the DBA hasn’t been involved in confirming index selection, slow production queries can result
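As a hedged illustration of these tips (the type and attribute names are hypothetical, not from the talk), Documentum can create an attribute index through DQL’s MAKE_INDEX administration method rather than raw RDBMS DDL, which keeps the index registered with the Documentum model:

```sql
-- Create an index on a custom attribute via the MAKE_INDEX
-- administration method ('my_invoice' and 'invoice_number' are
-- hypothetical names)
EXECUTE make_index WITH type_name = 'my_invoice', attribute = 'invoice_number'
```

Per the tips above, index selection should still be confirmed with the DBA against production-like metadata volumes.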
  • Automatic indexing through xPlore; online index redefinition in xPlore 1.1
  • Why not use xPlore for all searches? It isn’t transactional, database metadata queries are instantaneous, and dm_relation and other persistent objects are not carried over
  • Establish RTO / RPO for deployment
    • The RTO (recovery time objective) needs to be set, and may vary by service: viewing is important, search not as much, transformation less so… think feature by feature
    • The RPO (recovery point objective) defines how much data you can lose in a failure. A short RPO implies either very frequent incremental backups or completely duplicated systems. For some services, the RPO is defined in terms of the RPO of the Content Server (as with RTO)
    • Data loss on xPlore can be recovered from the Content Server
  • Leverage loosely synchronized backups for “hot backup” environments
    • Tightly synchronized backups of everything make for a simple restore, but typically require a cold backup or paying $$$ for third-party products
    • Objects in the database with lost content are not desirable
    • Using the RPO on metadata, orphaned content can be removed from the file server
    • Remember that indexes are easy to rebuild, but rebuilding takes time
  • Test your HA/DR choices. Make sure that the plan works. Failover is not always transparent.
  • Plan for new and upcoming capabilities
    • D6.6 Method Servers now have better support for Active/Active HA; load balancers are no longer needed for failover
  • xPlore allows sparing, Active/Active, and Active/Passive configurations
  • xPlore sparing can be tricky if the primary node crashes; it assumes a shared-storage environment
  • D7 target: rolling upgrades for patch releases; still evaluating for service packs
    • Assumes multi-node environments, load balancers, and proper sizing
    • Tests and limitations to be published

Next it is time to relax and talk to people, followed by a trip to the Momentum party.


All information in this post was gathered from the presenters and presentation. It does not reflect my opinion unless clearly indicated (Italics in parenthesis). Any errors are most likely from my misunderstanding a statement or imperfectly recording the information. Updates to correct information are reflected in red, but will not be otherwise indicated.