EMC World 2009: Documentum Performance, Scalability, and Sizing, Part 2


This is the second of two sessions by Ed Bueche. His sessions are must-attends.  His sessions last year, part one and part two, are useful references, but as always, this year’s information takes precedence. Be sure to read the first session notes.

  • An xDB Collection is just a group of documents and Libraries that may have indexes on them
    • Can be associated with its own physical storage
    • Libraries are hierarchical in nature
    • Indexes have a scope that includes one to many libraries
  • xDB 9.0 added new features
    • XQFT support
      • Proposed extension to XQuery
      • Logical full-text operators
      • Wildcard option
      • Any/all options
      • Positional filters
      • Score variables
    • Advanced Data Management
      • Read-Only Libraries
      • Library-Level Backup/Restore
      • Library-Level Attach/Detach (disposal or offline storage)
    • Parallel queries
      • Execution can be parallelized; the unit of work is a Library to probe
      • Exposed through Java concurrency package
    • Parallel query and Sort avoidance
      • By default XQuery returns results in document order
      • If the underlying index returns the results in score order, then sorting is avoided
      • Allows for efficient return of most relevant results across a large store
    • Low-level query result filtering
      • Useful for security filtering
      • Invoked after initial index probe or after initial document probe if query doesn’t use index
    • Node-level compression
      • Text and CDATA nodes can be selectively compressed
      • Can save significant storage space for large elements
      • Can be enabled at initial ingestion time or after ingest
      • Compression penalty is offset by space savings
      • Compression penalty minimized if done at initial ingest
  • xDB 10 release
    • Lucene & Index improvements
    • Multi-node
    • In-line with ESS release which should be GA by end of year
    • Lucene is replacing the native xDB full-text index
      • Transactional support
      • Support for single and multi-key queries
      • Index can service full text and value probes
    • Any front-end server can query any back-end page server
    • Read-only libraries can be accessed by multiple page servers at a time, as opposed to one at a time with read/write libraries
  • Enterprise Search Server Performance and Scalability
    • Built as a standalone search infrastructure, no Documentum dependencies
    • Built over xDB and Lucene
    • The Content Processing Service can be multi-threaded and spread over nodes, or can be located by itself within a node
    • Vertical Scaling of Ingestion
      • ESS will support policy-based setup of sub-collections
      • Each sub-collection can be inserted into independently of the others
      • Queries will be unaware of sub-collections
      • Each ESS node supports multi-threaded ingestion
    • Faceted navigation
      • Useful to “peer” into results without having to look at each one
      • Refine queries are issued by navigating through the meta-data groupings
      • Available in Webtop and CenterStage
      • Goal is to break past the current limit, where facets are computed only over the display window consumed from Content Server, and expand computation to a much larger result set
      • Storing facet information in the index directly
      • Integrated security makes this possible as the facet calculation shouldn’t include documents the user cannot see
      • Full Documentum security supported; ACLs are stored as an XML representation in xDB
    • Ed just said BI in relation to ESS
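
To make the parallel query, sort avoidance, and low-level filtering notes above a bit more concrete, here is a minimal sketch in Python. It is not xDB or ESS code and uses none of their APIs: the library names, the hit lists, the `READABLE` table, and the functions `probe_library` and `top_hits` are all hypothetical stand-ins. It assumes each library's index already returns hits in descending score order, probes the libraries in parallel (one unit of work per library), drops hits the user cannot see right after the probe, and then lazily merges the streams so the top results come out without a full sort.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from itertools import islice

# Hypothetical stand-in for libraries: each holds (score, doc_id) hits that
# are already in descending score order, as a scored index would return them.
LIBRARIES = {
    "lib_2007": [(0.91, "doc-a"), (0.40, "doc-b")],
    "lib_2008": [(0.97, "doc-c"), (0.55, "doc-d"), (0.10, "doc-e")],
    "lib_2009": [(0.73, "doc-f")],
}

# Hypothetical security table: which documents each user is allowed to read.
READABLE = {"alice": {"doc-a", "doc-c", "doc-d", "doc-f"}}


def probe_library(name, user):
    """Probe one library, then filter out hits the user may not see."""
    allowed = READABLE.get(user, set())
    return [hit for hit in LIBRARIES[name] if hit[1] in allowed]


def top_hits(user, k):
    """Return the k best visible hits across all libraries."""
    # One probe per library, run in parallel on a thread pool.
    with ThreadPoolExecutor() as pool:
        per_library = list(pool.map(lambda n: probe_library(n, user), LIBRARIES))
    # Each stream is already in score order, so a lazy k-way merge yields
    # the best hits first and no global sort is needed.
    merged = heapq.merge(*per_library, key=lambda hit: hit[0], reverse=True)
    return list(islice(merged, k))


if __name__ == "__main__":
    for score, doc_id in top_hits("alice", 3):
        print(score, doc_id)
```

Because the merge is lazy, asking for the top 3 only pulls a handful of entries from each stream, which is the point of sort avoidance on a large store. Filtering inside the probe also means anything computed downstream, such as facet counts, never sees documents the user cannot read.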

Off to talk to Mark Lewis in the Momentum Lounge.

Disclaimer

All information in this post was gathered from the presenters and presentation. It does not reflect my opinion unless clearly indicated (Italics in parentheses). Any errors are most likely from my misunderstanding a statement or imperfectly recording the information. Updates to correct information are reflected in red, but will not be otherwise indicated.

All statements about the future of EMC products and strategy are subject to change at any time due to a large variety of factors.