A search engine that takes keyword queries as input and returns a ranked list of relevant results. It scrapes a few thousand pages, starting from the seed Wiki page "List of Marvel Cinematic Universe films", and uses Elasticsearch for full-text search. On top of Elasticsearch, a search portal built with React.js and Node.js lets users enter a query and view the retrieved results.
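To illustrate how a keyword query from the portal might be handed to Elasticsearch, here is a minimal sketch that builds a request body for a `match` query. The function name `buildSearchBody`, the `QueryMode` type, and the field name `content` are assumptions made for illustration and are not taken from this project's code; only the `match` query, its `operator` option, and the `size` parameter are standard Elasticsearch query DSL.

```typescript
// Hypothetical helper: the names and the "content" field are illustrative assumptions.
type QueryMode = "disjunctive" | "conjunctive";

interface SearchBody {
  size: number;
  query: {
    match: { [field: string]: { query: string; operator: "or" | "and" } };
  };
}

function buildSearchBody(
  keywords: string,
  mode: QueryMode,
  size: number,
  field: string = "content" // assumed name of the indexed text field
): SearchBody {
  return {
    // Number of hits to return, e.g. the value chosen in the portal.
    size,
    query: {
      match: {
        [field]: {
          query: keywords,
          // "and" requires every keyword to match (conjunctive);
          // "or" accepts documents matching any keyword (disjunctive).
          operator: mode === "conjunctive" ? "and" : "or",
        },
      },
    },
  };
}

const body = buildSearchBody("iron man armor", "conjunctive", 10);
console.log(JSON.stringify(body, null, 2));
```

The resulting object could be sent as the body of a search request by the Node.js backend; how the actual project wires this up is not shown here.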
- Cleaning and pre-processing of the scraped data
- Proper visualization of the ranked list of pages that hold the relevant answers
- Support for the Okapi BM25 and LM-Dirichlet scoring models
- Query keyword suggestions based on Levenshtein edit distance
- Support for both disjunctive and conjunctive keyword queries
- A configuration window for users to choose any of the scoring models and the number of results to show on the result page
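As an illustration of the keyword-suggestion feature, the sketch below computes the Levenshtein edit distance with the standard dynamic-programming recurrence and ranks candidate words by that distance. The `suggest` helper, its `maxDistance` threshold of 2, and the sample vocabulary are assumptions for the example; the project may compute its suggestions differently (for instance, inside Elasticsearch).

```typescript
// Levenshtein distance: minimum number of single-character insertions,
// deletions, and substitutions needed to turn `a` into `b`.
function levenshtein(a: string, b: string): number {
  // prev[j] = distance between the first i-1 characters of a and the first j of b.
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i]; // turning i characters into "" takes i deletions
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // delete a[i-1]
        curr[j - 1] + 1, // insert b[j-1]
        prev[j - 1] + cost // substitute, or keep when the characters match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: returns vocabulary words within maxDistance edits of
// `term`, closest first. Exact matches are skipped because they need no suggestion.
function suggest(term: string, vocabulary: string[], maxDistance: number = 2): string[] {
  const needle = term.toLowerCase();
  return vocabulary
    .map((word) => ({ word, distance: levenshtein(needle, word.toLowerCase()) }))
    .filter((c) => c.distance > 0 && c.distance <= maxDistance)
    .sort((x, y) => x.distance - y.distance || x.word.localeCompare(y.word))
    .map((c) => c.word);
}

// Illustrative vocabulary; a real one would come from the indexed terms.
console.log(suggest("hulc", ["hulk", "thor", "loki"]));
```

This version keeps only two rows of the distance table, so memory use grows with the length of one word rather than the product of both lengths.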
Mayank Singla