Simple search engine for the ICS domain
- Map each term to a unique termID
- Map each doc to a unique docID (doc1: 1, doc2: 2, doc3: 3, doc4: 4)
- Map each docID to a list of termIDs based on terms contained in the doc
- Build the inverse of the term-to-termID map above, that is, map each termID back to its term
- termID -> (docID, term frequency) pairs
- termID -> (docID, tf-idf weight) pairs
- termID -> document list
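
The maps listed above could be sketched in Java roughly as follows. This is only an illustrative sketch, not the project's actual code: the class name SimpleIndex, its field and method names, the lowercase alphanumeric tokenizer, and the tf-idf variant (raw term frequency times ln(N / document frequency)) are all assumptions, since the outline does not fix any of them. It also assumes each document is added exactly once.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical container for the maps in the outline; all names are illustrative.
class SimpleIndex {
    final Map<String, Integer> term2termId = new HashMap<>();          // term -> termID
    final Map<Integer, String> termId2term = new HashMap<>();          // termID -> term (inverse map)
    final Map<String, Integer> doc2docId = new HashMap<>();            // doc name -> docID
    final Map<Integer, List<Integer>> docId2termIds = new HashMap<>(); // docID -> termIDs in order of occurrence
    final Map<Integer, Map<Integer, Integer>> termId2tf = new HashMap<>(); // termID -> (docID -> term frequency)

    // Assigns the next docID (starting at 1) and indexes the text.
    // Assumes docName has not been added before.
    int addDocument(String docName, String text) {
        int docId = doc2docId.size() + 1;
        doc2docId.put(docName, docId);
        List<Integer> termIds = new ArrayList<>();
        // Assumed tokenizer: lowercase, split on anything that is not a-z or 0-9.
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (token.isEmpty()) {
                continue; // split can yield an empty leading token
            }
            Integer termId = term2termId.get(token);
            if (termId == null) {
                termId = term2termId.size() + 1; // termIDs also start at 1
                term2termId.put(token, termId);
                termId2term.put(termId, token);
            }
            termIds.add(termId);
            // TreeMap keeps each posting list sorted by docID.
            termId2tf.computeIfAbsent(termId, k -> new TreeMap<>())
                     .merge(docId, 1, Integer::sum);
        }
        docId2termIds.put(docId, termIds);
        return docId;
    }

    // termID -> (docID -> tf-idf), using the assumed variant tf * ln(N / df).
    Map<Integer, Map<Integer, Double>> tfIdf() {
        int n = doc2docId.size();
        Map<Integer, Map<Integer, Double>> result = new HashMap<>();
        for (Map.Entry<Integer, Map<Integer, Integer>> e : termId2tf.entrySet()) {
            double idf = Math.log((double) n / e.getValue().size());
            Map<Integer, Double> weights = new TreeMap<>();
            for (Map.Entry<Integer, Integer> posting : e.getValue().entrySet()) {
                weights.put(posting.getKey(), posting.getValue() * idf);
            }
            result.put(e.getKey(), weights);
        }
        return result;
    }

    // termID -> document list, looked up by term; empty if the term is unknown.
    List<Integer> documentList(String term) {
        Integer termId = term2termId.get(term);
        if (termId == null) {
            return new ArrayList<>();
        }
        return new ArrayList<>(termId2tf.get(termId).keySet());
    }
}
```

With this sketch, calling addDocument once per document and then documentList("information") would return the sorted docIDs of the documents containing that term. Note that under the assumed tf-idf variant, a term that appears in every document gets a weight of zero.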
Parse JSON
- Set up and test Jsoup