📽️ Search for Movie Plots

Baseline models for searching movie plots drawn from Wikipedia articles. Techniques include BM25 (lexical search), bi-encoders and cross-encoders (semantic search), and retrieval-augmented generation (RAG) using Mistral 7B through Fireworks.ai.
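
To make the lexical baseline concrete, here is a minimal pure-Python sketch of BM25 scoring. It is not the code from the notebooks: the whitespace tokenizer, the `k1` and `b` defaults, the function names, and the toy plots are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on whitespace; real pipelines usually do more.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document for the given query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency weight.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization dampens scores for long documents.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy plots, invented for illustration (not taken from the dataset).
plots = [
    "a tramp adopts an abandoned child",
    "a vampire count terrorizes a small town",
    "a detective hunts a jewel thief in paris",
]
scores = bm25_scores("vampire town", plots)
best = max(range(len(plots)), key=scores.__getitem__)
```

Here `best` is the index of the highest-scoring plot. Documents that share no terms with the query score zero, which is the main weakness that semantic search is meant to address.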

What's included?

3 notebooks
executive analysis writeup [ pdf | docx ]
requirements.txt

Objective

Develop a prototype for a search tool that helps users find relevant movies based on their queries.

Dataset

Use the following code to import the dataset, if needed.

from datasets import load_dataset
# Load only the first 1,000 rows of the train split.
ds = load_dataset("Coder-Dragon/wikipedia-movies", split="train[:1000]")

The dataset includes movie titles, plots, genres, actors, and other relevant information, mined from Wikipedia articles. For this experiment we focus only on the first 1,000 movies, all of which date from the 1920s or earlier. We also embed and query only the movie titles and their plots.
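
As a rough sketch of how a title and plot might be combined into a single string before indexing or embedding, consider the helper below. The helper name and the column names `Title` and `Plot` are assumptions, not confirmed by this README; check `ds.column_names` for the actual fields.

```python
def build_search_text(record, title_key="Title", plot_key="Plot"):
    """Join title and plot into one string for indexing or embedding."""
    # Column names are assumptions; missing or None values become "".
    title = (record.get(title_key) or "").strip()
    plot = (record.get(plot_key) or "").strip()
    # Fall back to whichever field is present if the other is empty.
    return f"{title}. {plot}" if title and plot else title or plot


# Hypothetical record shaped like one dataset row; values are invented.
row = {"Title": "Example Film", "Plot": "  A sailor returns home.  ", "Genre": "drama"}
text = build_search_text(row)
```

Other fields, such as genre, are ignored here, in line with the choice to search over titles and plots only.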

Usage

You should be able to run the notebooks in Colab seamlessly. If there are dependency-related errors or if you'd like to run the notebooks locally, you can use the included requirements.txt.

Run the notebooks in the following order: semantic search, then reranker, then RAG. This matches the order in which they were developed, and the methods grow in complexity along the way. Evaluation metrics are calculated in the notebooks where applicable.
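
The step from semantic search to a reranker follows a common two-stage pattern: a fast scorer narrows the corpus to a few candidates, and a slower, more precise scorer reorders them. The sketch below shows only that control flow. The function names are hypothetical, and the two toy scorers are stand-ins for a bi-encoder and a cross-encoder, not the models used in the notebooks.

```python
def retrieve_then_rerank(query, docs, first_stage, second_stage, k=3, top_n=1):
    """Score all docs cheaply, keep the top k, then reorder them with a costlier scorer."""
    candidates = sorted(docs, key=lambda d: first_stage(query, d), reverse=True)[:k]
    reranked = sorted(candidates, key=lambda d: second_stage(query, d), reverse=True)
    return reranked[:top_n]


def overlap(query, doc):
    # Stand-in for a fast retriever: count of shared lowercase tokens.
    return len(set(query.lower().split()) & set(doc.lower().split()))


def phrase_match(query, doc):
    # Stand-in for a cross-encoder: reward an exact phrase match, then overlap.
    return (query.lower() in doc.lower(), overlap(query, doc))


# Toy plots, invented for illustration (not taken from the dataset).
docs = [
    "thief steals a jewel from the museum",
    "a jewel thief is chased across paris",
    "a farmer wins the lottery",
]
top = retrieve_then_rerank("jewel thief", docs, overlap, phrase_match, k=2, top_n=1)
```

The first two plots tie in the first stage, and the second stage breaks the tie in favor of the plot that contains the exact phrase. Keeping `k` small limits how often the expensive scorer runs.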

Finally, feel free to review the executive analysis writeup [ pdf | docx ] for experiment findings and recommendations. An appendix is included with all experiment metrics and results neatly organized into tables.

ericphann/search-for-movie-plots
