Biword index
A biword index treats every pair of consecutive terms in a document as a phrase (Manning et al., 2009, p.76). Each of these biwords becomes a dictionary term.
For example:
"Stanford University"
"Student Stanford"
"Singer Adela"
"Singing bowl"
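A biword index cannot directly handle a phrase that contains more than two words. In this case, the phrase can be broken into its overlapping biwords and these combined into a Boolean biword query. For example, "stanford university palo alto" becomes "stanford university" AND "university palo" AND "palo alto".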
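The decomposition of a longer phrase into biwords can be sketched in Python as follows. This is a minimal illustration, not the book's implementation: the function names `biwords`, `build_biword_index`, and `phrase_query`, the whitespace tokenisation, and the two sample documents are all assumptions made for the example.

```python
from collections import defaultdict

def biwords(text):
    """Return every pair of consecutive terms in text as one string."""
    terms = text.lower().split()  # assumption: simple whitespace tokenisation
    return [f"{a} {b}" for a, b in zip(terms, terms[1:])]

def build_biword_index(docs):
    """Map each biword (dictionary term) to the set of doc ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for bw in biwords(text):
            index[bw].add(doc_id)
    return index

def phrase_query(index, phrase):
    """Answer a phrase query by AND-ing the postings of all its biwords."""
    parts = biwords(phrase)
    if not parts:
        return set()  # a single word has no biword; use an ordinary term index
    result = set(index.get(parts[0], set()))
    for bw in parts[1:]:
        result &= index.get(bw, set())
    return result

# Hypothetical sample documents.
docs = {
    1: "stanford university palo alto campus",
    2: "palo alto is near stanford university and university palo street",
}
index = build_biword_index(docs)

print(biwords("stanford university palo alto"))
# ['stanford university', 'university palo', 'palo alto']
print(sorted(phrase_query(index, "stanford university palo alto")))
# [1, 2]
```

Document 2 is returned even though it never contains the four words in sequence, because each of the three biwords occurs somewhere in it. A Boolean biword query can therefore return false positives, which would need a further check against the document text.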
Answering phrase queries with a biword index means fast indexing but less efficient querying.
An inverted index is a list of words together with the documents in which each word occurs. It consists of two main parts: the vocabulary and the occurrences. An example is looking up a list of words in a Google search. An inverted index is slower to build but fast to query.
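The two parts of an inverted index can be sketched in Python as below. This is only an illustration under stated assumptions: the function name `build_inverted_index`, the whitespace tokenisation, and the sample documents are hypothetical, and "occurrences" is interpreted here as the positions of each word within each document.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Return (vocabulary, occurrences) for a dict of doc_id -> text."""
    # occurrences: word -> {doc_id: [positions of the word in that document]}
    occurrences = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            occurrences[term][doc_id].append(pos)
    # vocabulary: the sorted list of distinct words
    vocabulary = sorted(occurrences)
    return vocabulary, occurrences

# Hypothetical sample documents.
docs = {1: "singing bowl", 2: "singer adela singing"}
vocabulary, occurrences = build_inverted_index(docs)

print(vocabulary)                    # ['adela', 'bowl', 'singer', 'singing']
print(dict(occurrences["singing"]))  # {1: [0], 2: [2]}
```

Most of the work happens while the index is built; a single-word query is then one dictionary lookup.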
- A biword index can be applied to web search.
- Phrase support matters for most web searches, since many more queries are implicit phrase queries than explicit ones.
- The biword index is not the standard solution, but it is simple and easy for users to understand.
- Biwords can form part of a combined strategy for longer phrase queries, such as Boolean biword queries.
- It aims to optimise speed and performance in finding documents relevant to the search query.
Reference:
Manning, C.D., Raghavan, P., & Schütze, H. (2009). An Introduction to Information Retrieval (Online ed.). Cambridge, England: Cambridge University Press. Available at http://nlp.stanford.edu/IR-book/information-retrieval-book.html