EMC World 2008: Implementing DFS Search Services

My Day Two thoughts will be coming up in a subsequent post. Right now, I am listening to Pierre-Yves Chevalier give some examples and demos of the Search Service in action. This is Marc Brette’s presentation, but he apparently canceled at the last minute, so the Q&A may be a little weak (though Pierre-Yves knows this stuff well from what I can tell).

Presentation and source code will be available on the Developer Network
D6 DFS has federated search to search against both Documentum repositories and external repositories
D6.5 adds:
- Non-blocking search
- Clustering of Search Results
- Saved Searches
- Classification Service
The basic Search Service has execute and getRepository as capabilities (structured as methods in the client library)
- Can pass structured queries and DQL pass-through
- Result includes Query Status and a Data Package
- Results are stateless and cached on the server
The code for the first, simple search is short. This is the client code making a DFS call.
- All examples are using the client libraries that Documentum provides and not something built directly from the WSDL. (Would prefer more WSDL interaction, but to be fair, a large chunk of developers in this room will either use the client libraries for developing with DFS or not at all).
- Called from a servlet in the user interface that is initiated from JSON in the client’s browser (I think I got that right)
- The servlet formats the response for JSON
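The servlet's formatting step can be pictured with a small sketch. This is not the presenter's code and it does not use the DFS client library; `format_search_response`, the `query_status` dictionary, and the `data_objects` list are all hypothetical stand-ins for the Query Status and Data Package that come back from a search.

```python
import json


def format_search_response(query_status, data_objects):
    """Shape a search result into a JSON string for the browser.

    Both arguments are hypothetical stand-ins, not real DFS types:
    `query_status` mimics the Query Status part of a result and
    `data_objects` mimics the objects carried in the Data Package.
    """
    payload = {
        "status": {
            "complete": bool(query_status.get("complete", False)),
            "hasMoreResults": bool(query_status.get("has_more_results", False)),
        },
        "results": [
            {"id": obj["id"], "properties": dict(obj.get("properties", {}))}
            for obj in data_objects
        ],
    }
    # sort_keys keeps the output stable, which makes it easier to test.
    return json.dumps(payload, sort_keys=True)
```

The browser-side script would then parse this string and render the `results` array, checking `status` to decide whether to ask for more.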
Federated Search to external repositories requires Enterprise Content Integration (ECI) on the Documentum back-end
Non-blocking search is asynchronous and supports multiple calls to get results
- Each query has an id used as a key in the cache
- Cache policy is size and time based
- Each call must contain the full query definition so that the query can be re-executed if the cache entry is gone
- Cache is configured in DFS directly
- Each call will tell you in Query Status if it has completed executing and if there are more results to be returned. Also includes status of the different search sources. (Ah, XML)
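To make the cache behavior described above concrete, here is a minimal sketch of a cache keyed by query id with a size limit and a time limit, plus the fallback of re-executing the query when the entry is gone. It deliberately leaves out the asynchronous execution itself. Every name in it (`QueryResultCache`, `get_results`, the `execute` callable, the status keys) is an assumption made for illustration, not the DFS implementation or its configuration.

```python
import time
from collections import OrderedDict


class QueryResultCache:
    """Size- and time-bounded cache keyed by query id (illustrative only)."""

    def __init__(self, max_entries=100, ttl_seconds=300.0, clock=time.monotonic):
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = OrderedDict()  # query_id -> (stored_at, results)

    def put(self, query_id, results):
        self._entries.pop(query_id, None)
        self._entries[query_id] = (self._clock(), results)
        # Size policy: evict the oldest entries once over capacity.
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)

    def get(self, query_id):
        entry = self._entries.get(query_id)
        if entry is None:
            return None
        stored_at, results = entry
        # Time policy: treat entries older than the TTL as gone.
        if self._clock() - stored_at > self._ttl:
            del self._entries[query_id]
            return None
        return results


def get_results(cache, query_id, query_definition, execute, start, count):
    """Return one page of results plus a small status dictionary.

    `execute` is a hypothetical callable that runs the full query
    definition and returns a list of results.
    """
    results = cache.get(query_id)
    re_executed = results is None
    if re_executed:
        # The cache entry is gone, so fall back on the definition
        # that was sent along with this call.
        results = execute(query_definition)
        cache.put(query_id, results)
    page = results[start:start + count]
    status = {
        "re_executed": re_executed,
        "has_more_results": start + count < len(results),
    }
    return page, status
```

Because the caller always sends the full definition, the server side never has to fail a follow-up call just because an entry aged out or was evicted.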
Structured Queries are abstract and independent of the Indexer and Content Server syntax and features
- Full Text expressions and Property Expressions, combined in an Expression set in a boolean fashion
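The shape of such a query can be sketched as a small tree. The class names below echo the terms used in the session but are hypothetical, not the DFS data model, and the `matches` methods exist only to show the boolean semantics. A real structured query would be translated for the indexer or the Content Server rather than evaluated in the client like this.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class FullTextExpression:
    """Matches when the text occurs in the document content (case-insensitive)."""
    text: str

    def matches(self, doc):
        return self.text.lower() in doc.get("content", "").lower()


@dataclass
class PropertyExpression:
    """Matches when a named property equals the given value."""
    name: str
    value: str

    def matches(self, doc):
        return doc.get("properties", {}).get(self.name) == self.value


@dataclass
class ExpressionSet:
    """Combines child expressions (or nested sets) with AND or OR."""
    operator: str  # "AND" or "OR"
    expressions: List[
        Union["ExpressionSet", FullTextExpression, PropertyExpression]
    ] = field(default_factory=list)

    def matches(self, doc):
        outcomes = (expr.matches(doc) for expr in self.expressions)
        if self.operator == "AND":
            return all(outcomes)
        if self.operator == "OR":
            return any(outcomes)
        raise ValueError(f"unsupported operator: {self.operator}")


# Usage: full text "budget" AND (owner is pie OR owner is ed).
query = ExpressionSet("AND", [
    FullTextExpression("budget"),
    ExpressionSet("OR", [
        PropertyExpression("owner_name", "pie"),
        PropertyExpression("owner_name", "ed"),
    ]),
])
```

Keeping the query as a tree like this is what lets it stay independent of any one back-end syntax: each search source can translate the same tree in its own way.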
Clustering helps to organize search results
- Dynamic grouping of results into clusters
- Based on result properties, not content
- Linguistic rules
- Requires an SBO that comes with Webtop Extended Search, not any other product
- Supports hierarchical Clustering
- Can retrieve cluster and sub clusters
- Retrieve results by cluster
- Topics can be defined in code as the clustering strategy (Go Ed! Infodata alumni strike again!) Date is a pre-defined strategy
- Clusters can use multiple strategies
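As a rough illustration of property-based, hierarchical clustering with more than one strategy, the sketch below groups results by applying a list of strategies in order, each level nesting inside the previous one. The `cluster` function, the two strategy functions, and the result fields (`modified`, `format`) are assumptions for the example. It does not reflect how the SBO or the Search Service actually computes clusters, and it ignores linguistic rules entirely.

```python
def cluster(results, strategies):
    """Group results into nested clusters, one level per strategy.

    Each strategy is a callable that maps a result to a cluster label
    using only the result's properties. With no strategies left, the
    plain list of results is returned as the leaf of the hierarchy.
    """
    if not strategies:
        return list(results)
    first, rest = strategies[0], strategies[1:]
    groups = {}
    for result in results:
        groups.setdefault(first(result), []).append(result)
    return {label: cluster(members, rest) for label, members in groups.items()}


def by_year(result):
    """Date-style strategy: cluster on the year of an ISO date string."""
    return result["modified"][:4]


def by_format(result):
    """Property strategy: cluster on the content format."""
    return result["format"]


results = [
    {"name": "a", "modified": "2008-05-20", "format": "pdf"},
    {"name": "b", "modified": "2008-01-02", "format": "doc"},
    {"name": "c", "modified": "2007-11-30", "format": "pdf"},
]
tree = cluster(results, [by_year, by_format])
```

Retrieving results by cluster is then a lookup such as `tree["2008"]["pdf"]`, and sub-clusters are simply the keys one level down.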
Can retrieve a list of the saved queries
Classification using Content Intelligence Services (CIS), or a pre-built taxonomy, to return results based upon the taxonomy in the system

Ran late this morning, so I am off in search of food and, wait for it….COFFEE!!! Debating on the next session, but I know I’m not doing an 11:30 as I have some demos to attend. Going to spend a lot of time in the booths this afternoon. Find me on the floor if you like.

Disclaimer

All information in this post was gathered from the presenters and presentation. It does not reflect my opinion unless clearly indicated (Italics in parentheses). Any errors are most likely from my misunderstanding a statement or imperfectly recording the information. Updates to correct information are reflected in red, but will not be otherwise indicated.

All statements about the future of EMC products and strategy are subject to change due to a large variety of factors.

5 thoughts on “EMC World 2008: Implementing DFS Search Services”

James says:

Did anyone talk about integrating DFS search into a google appliance? I know that EMC almost never talks about integration unless it is with another EMC product. Would love to know why the Documentum community doesn’t keep them honest?

27 May 2008 at 6:03 am
Pie says:

Nobody talked about it in the sessions that I attended, but I know that it has been done before and wasn’t seeking that information. It uses the ECI product.

Partial list including Google Appliance:
http://www.emc.com/products/documentum-platform/eci-services-adapter-library.htm

Google Desktop announcement:
http://www.emc.com/about/news/press/us/2006/01172006-3811.htm

27 May 2008 at 8:23 am
Pingback: Implementing DFS Search Services
Mike Hancock says:

Hi Pie,

JSON is an alternative to XML. Many JavaScript developers prefer it for exchanging data/messages across the net because it is much less verbose.

http://www.json.org/fatfree.html

My personal preference is contract-based development using WSDL/XML for the backend. And for client code I’m fine with any of WSDL, REST/POX, or REST/JSON.

29 May 2008 at 9:04 am
Pie says:

Thanks Mike. I am familiar with JSON, but I’m sure not everyone is. I didn’t take time to define it while I was busy typing that day.

If you were referring to my comment on using WSDLs instead of the client libraries, that stems from the fact that my teams are usually interacting with Web Services from multiple systems. It is also referring to the code in the servlet, not the client. I prefer my teams to be consistent across the board since they are conversant with the technology.

-Pie

29 May 2008 at 9:58 am

Word of Pie

Ponderings on Life, the Universe, and Information
