Academic search engine exploiting Neural Information Retrieval approaches

simonebenitozzi/academic-search-engine

Academic Search Engine

This repository contains the code and documentation for our Information Retrieval project, which focuses on the "computer science" dataset. The goal is to improve retrieval effectiveness on this dataset by implementing several retrieval models, from classical term-based models to neural ones.

Dataset

The dataset used for this project can be found at the following link: Computer Science Dataset. It is based on the paper "A Multi-Domain Benchmark for Personalized Search Evaluation" (link to paper).

Project Structure

The repository is organized as follows:

  • IR_Project.ipynb: This Jupyter Notebook contains the code for the project. It covers the analysis, preprocessing, and indexing of the dataset, and implements several retrieval models: TF-IDF, BM25, a Dirichlet Language Model, a Neural Information Retrieval Model, and a Neural Ranker.

  • IR_project_report_2022_2023.pdf: This PDF document provides a comprehensive report on the project, detailing the methodology, experiments, results, and conclusions.

  • Information Retrieval - project presentation.pptx: This PowerPoint presentation summarizes the key aspects of the project, including the problem statement, methodology, and findings.
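
To give a feel for what one of the term-based models listed above computes, here is a minimal BM25 sketch in plain Python. It is not taken from IR_Project.ipynb: the function name `bm25_scores`, the parameter defaults `k1=1.2` and `b=0.75`, the non-negative IDF variant, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Return one BM25 score per tokenized document (hypothetical helper)."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # A common non-negative IDF variant (an assumption, not the notebook's choice).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy corpus, invented for this example.
docs = [
    "neural ranking models for search".split(),
    "graph algorithms and data structures".split(),
    "search engines and ranking".split(),
]
scores = bm25_scores("neural search".split(), docs)
```

In this toy corpus the first document matches both query terms and so ranks first, the third matches only "search", and the second matches nothing and scores zero.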

Usage

To use the code in the Jupyter Notebook, follow these steps:

  1. Clone the repository to your local machine.
  2. Open the IR_Project.ipynb file using Jupyter Notebook or any compatible environment.
  3. Run the code cells sequentially to perform the dataset analysis, preprocessing, and indexing.
  4. Experiment with different retrieval models by executing the corresponding code sections.
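
The preprocessing and indexing mentioned in step 3 can be pictured with a small sketch like the one below. It only illustrates the general idea of an inverted index and does not reproduce the notebook's pipeline: the stopword list, the `preprocess` and `build_index` helpers, and the sample documents are hypothetical.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list; a real pipeline would likely use a fuller one.
STOPWORDS = {"a", "an", "and", "for", "of", "the", "to"}


def preprocess(text):
    """Lowercase, split on non-alphanumeric characters, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_index(docs):
    """Map each term to a {doc_id: term_frequency} posting dictionary."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for token in preprocess(text):
            index[token][doc_id] = index[token].get(doc_id, 0) + 1
    return index


# Sample documents, invented for this example.
docs = {
    "d1": "Neural ranking for academic search",
    "d2": "The design of search engines",
}
index = build_index(docs)
```

Looking up `index["search"]` then returns the documents that contain the term along with their term frequencies, which is the information term-based models such as TF-IDF and BM25 need at query time.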

Conclusion

This project aims to improve the retrieval of relevant information from the "computer science" dataset. It analyzes and preprocesses the dataset, builds an index, and applies both baseline and neural retrieval models to make search results more accurate and relevant. For more details, please refer to the project report and presentation provided in this repository.

Feel free to explore the code, experiment with different retrieval models, and provide feedback. We hope this project contributes to the field of information retrieval and inspires further advancements.
