Thursday, September 18, 2008

An Inverse Inference Engine for High-Precision Web Search

RES790 – DARPA “Inverse II”

The Phase I work demonstrated the precision and scalability of the inverse inference algorithm, and its ability to perform latent semantic analysis. In Phase II, we will extend the algorithm to cover cross-language document retrieval, tracking of document clusters over time, and fast hierarchical clustering of large document databases. The indexing structure will evolve from an information matrix to an information tensor, which will accommodate multidimensional term attributes such as word position, part of speech, and taxonomical and syntactic tags. We will embed this richer indexing structure and all search functionality in the Oracle interMedia cartridge. New query operators will support word n-grams, ordered phrases, term broadening, cross-document entity tracking, and extraction of entity relationships. We will also improve the performance of the soft hyperlink navigation tool. We will validate the precision of our search technology by participating regularly in the TREC and CLEF competitions for the duration of the contract.
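To make the move from an information matrix to an information tensor more concrete, here is a minimal sketch in Python. It is not the inverse inference algorithm and is not taken from the project; the toy corpus, the function names (`build_matrix`, `build_tensor`, `collapse`), and the choice of part of speech as the only extra attribute are all assumptions made for illustration. It only shows how adding one term attribute turns a two-way term-by-document index into a three-way one.

```python
from collections import defaultdict

# Hypothetical toy corpus: each document is a list of (term, part_of_speech) pairs.
DOCS = {
    "d1": [("search", "NOUN"), ("engine", "NOUN"), ("search", "VERB")],
    "d2": [("engine", "NOUN"), ("index", "VERB")],
}


def build_matrix(docs):
    """Two-way index: matrix[term][doc_id] is the count of term in the document."""
    matrix = defaultdict(lambda: defaultdict(int))
    for doc_id, tokens in docs.items():
        for term, _pos in tokens:
            matrix[term][doc_id] += 1
    # Return plain dicts so lookups of missing keys do not silently create entries.
    return {term: dict(row) for term, row in matrix.items()}


def build_tensor(docs):
    """Three-way index: tensor[(term, doc_id, pos)] is the count of that
    term used with that part of speech in the document (sparse storage)."""
    tensor = defaultdict(int)
    for doc_id, tokens in docs.items():
        for term, pos in tokens:
            tensor[(term, doc_id, pos)] += 1
    return dict(tensor)


def collapse(tensor):
    """Sum the tensor over the part-of-speech axis to recover the matrix."""
    matrix = defaultdict(lambda: defaultdict(int))
    for (term, doc_id, _pos), count in tensor.items():
        matrix[term][doc_id] += count
    return {term: dict(row) for term, row in matrix.items()}


matrix = build_matrix(DOCS)
tensor = build_tensor(DOCS)
```

In this sketch the matrix cannot tell "search" the noun from "search" the verb, while the tensor keeps them apart. Summing the tensor over the attribute axis gives back the original matrix, so the richer structure loses nothing the simpler one had.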

