diff --git a/README.md b/README.md
index 92ac10c..75a5b7d 100644
--- a/README.md
+++ b/README.md
@@ -12,6 +12,7 @@
 - [Requirements](#requirements)
 - [Installation](#installation)
 - [Usage](#usage)
+- [Example: Textbook Q&A Generation](#example-textbook-qa-generation)
 - [Monitoring Dashboard](#monitoring-dashboard)
 - [Advanced Features](#advanced-features)
 - [Documentation](#documentation)
@@ -83,6 +84,10 @@
 initial_data = RawData(data="Some raw data")
 graph.run(initial_tasks=[(data_processor, initial_data)])
 ```
 
+## Example: Textbook Q&A Generation
+
+PlanAI has been used to create a system for generating high-quality question and answer pairs from textbook content. This example demonstrates PlanAI's capability to manage complex, multi-step workflows involving AI-powered text processing and content generation. The application processes textbook content through a series of steps including text cleaning, relevance filtering, question generation and evaluation, and answer generation and selection. For a detailed walkthrough of this example, including code and explanation, please see the [examples/textbook](examples/textbook) directory. The resulting dataset, generated from "World History Since 1500: An Open and Free Textbook," is available in our [World History 1500 Q&A repository](https://github.com/provos/world-history-1500-qa), showcasing the practical application of PlanAI in educational content processing and dataset creation.
+
 ## Monitoring Dashboard
 
 PlanAI includes a built-in web-based monitoring dashboard that provides real-time insights into your graph execution. This feature can be enabled by setting `run_dashboard=True` when calling the `graph.run()` method.
diff --git a/examples/textbook/README.md b/examples/textbook/README.md
new file mode 100644
index 0000000..7182278
--- /dev/null
+++ b/examples/textbook/README.md
@@ -0,0 +1,42 @@
+## Example: Textbook Question and Answer Generation
+
+PlanAI has been used to create a system for generating question and answer pairs from textbook content. This example demonstrates PlanAI's capabilities in processing educational material and automating complex workflows.
+
+### Project Overview
+
+The application processes textbook content to create question and answer pairs suitable for educational purposes or model training. It uses a series of AI-powered workers to:
+
+1. Clean and format text
+2. Identify relevant content
+3. Generate questions
+4. Evaluate question quality
+5. Generate and select answers
+
+The workflow is managed using the PlanAI framework, which allows for parallel processing of tasks while maintaining control over LLM API usage.
+
+### Key Components
+
+- **Text Cleaning (CleanText)**: Removes irrelevant content and improves text formatting.
+- **Relevance Filtering (InterestingText)**: Identifies text chunks suitable for Q&A generation.
+- **Question Generation (CreateQuestions)**: Produces multiple questions from each relevant text chunk.
+- **Question Evaluation (QuestionEvaluationWorker)**: Assesses and improves question quality.
+- **Answer Generation (QuestionAnswer)**: Creates multiple potential answers for each question.
+- **Answer Evaluation (AnswerEvaluator)**: Selects the best answer from the generated options.
+- **Output Handling (PrintOutput)**: Manages the final output of Q&A pairs.
+
+### Workflow
+
+1. The input text is divided into chunks and processed by the CleanText worker.
+2. InterestingText worker filters out irrelevant content.
+3. CreateQuestions generates multiple questions for each relevant chunk.
+4. QuestionEvaluationWorker assesses each question and suggests improvements if needed.
+5. QuestionAnswer generates two potential answers for each approved question.
+6. AnswerEvaluator selects the best answer based on accuracy and clarity.
+7. PrintOutput handles the final Q&A pairs, printing them and saving to a file.
+
+### Usage
+
+The application can be run from the command line, specifying the input file:
+
+```bash
+python textbook_app.py --file path/to/your/textbook.pdf