Analysis of Signature Generation Schemes for Multiterm Queries in Partitioned Signature File Environments

Date

1993-05-01

Abstract

Our analysis explores the performance of three superimposed signature generation schemes as applied to a dynamic signature file organization based on linear hashing: Linear Hashing with Superimposed Signatures (LHSS). The first scheme (SM) lets all terms set the same number of bits, whereas the second and third schemes (MMS and MMM) emphasize the terms with high discriminatory power. In addition, MMM considers the probability distribution of the number of query terms. The main contribution of the study is the combination of signature generation and signature file organization concepts, together with the relaxation of the single-term query and uniform frequency assumptions. Derivations of the performance evaluation formulas are provided, along with an analysis of various experimental settings. Results indicate that MMM outperforms the others as terms become more distinctive in their discriminatory power. MMM achieves the highest savings in retrieval efficiency for the high query weight case. We also discuss the applicability of the derivations to other partitioned signature organizations, providing a detailed analysis of Fixed Prefix Partitioning (FPP) as an example. Finally, an approximate performance evaluation formula that works for both FPP and LHSS is modified to account for the multiterm case.
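
For readers unfamiliar with superimposed coding, the general idea behind the schemes named in the abstract can be sketched as follows. This is a minimal illustration, not the paper's method: the helper names (`term_positions`, `make_signature`, `may_match`), the SHA-256 based bit selection, the 64-bit signature length, and the example bit counts are all assumptions made for this sketch. It shows a uniform scheme in which every term sets the same number of bits (in the spirit of SM) and a hypothetical weighted scheme in which terms marked as highly discriminatory set more bits; the actual MMS and MMM bit allocations are derived in the paper and are not reproduced here.

```python
import hashlib


def term_positions(term, sig_bits, n_bits):
    """Choose n_bits distinct bit positions in [0, sig_bits) for a term.

    Assumption: positions come from repeated SHA-256 hashing; any
    deterministic hash family would serve for this illustration.
    """
    if n_bits > sig_bits:
        raise ValueError("n_bits cannot exceed the signature length")
    positions = []
    counter = 0
    while len(positions) < n_bits:
        digest = hashlib.sha256(f"{term}:{counter}".encode()).digest()
        pos = int.from_bytes(digest[:8], "big") % sig_bits
        if pos not in positions:
            positions.append(pos)
        counter += 1
    return positions


def make_signature(terms, sig_bits, bits_for_term):
    """Superimpose (bitwise OR) the term signatures into one integer.

    bits_for_term is a callable mapping a term to the number of bits
    that term sets.
    """
    signature = 0
    for term in terms:
        for pos in term_positions(term, sig_bits, bits_for_term(term)):
            signature |= 1 << pos
    return signature


def may_match(query_sig, record_sig):
    """True if every bit set in the query is also set in the record.

    A True result can still be a false drop; a False result is definite.
    """
    return query_sig & record_sig == query_sig


SIG_BITS = 64  # illustrative signature length


# Uniform scheme: every term sets the same number of bits.
def uniform(term):
    return 4


# Hypothetical weighted scheme: terms assumed to be highly
# discriminatory set more bits than the rest.
DISCRIMINATORY = {"hashing"}


def weighted(term):
    return 8 if term in DISCRIMINATORY else 2


record_terms = ["signature", "file", "hashing"]
record_sig = make_signature(record_terms, SIG_BITS, weighted)
query_sig = make_signature(["file", "hashing"], SIG_BITS, weighted)
print(may_match(query_sig, record_sig))
```

Because the query terms are a subset of the record terms and bit positions are deterministic, the multiterm query signature is always contained in the record signature. Queries that include heavily weighted terms set more bits, which is the intuition behind giving discriminatory terms more weight.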
