Welcome to the Academic Text Simplification Model project. This repository contains the implementation and analysis of a machine learning model designed to simplify complex academic texts. The goal is to make academic content more accessible by transforming it into a simpler form while retaining the original meaning.
Academic writing can be challenging for many readers due to its complex structure and specialized vocabulary. This project aims to simplify such texts using advanced natural language processing (NLP) techniques, making them more accessible to a broader audience including students, non-native English speakers, and individuals with reading difficulties.
- Data Preprocessing: Efficient methods for cleaning and preparing academic texts.
- Model Training: Utilizes state-of-the-art NLP models for text simplification.
- Evaluation Metrics: Comprehensive evaluation using metrics such as BLEU, ROUGE, and SARI.
- User Interface: A simple interface for users to input text and receive simplified versions.
To get started with this project, follow the steps below:
1. Clone the repository:

   ```bash
   git clone https://github.com/yourusername/academic-text-simplification.git
   cd academic-text-simplification
   ```

2. Create a virtual environment:

   ```bash
   python3 -m venv venv
   source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
   ```

3. Install the required dependencies:

   ```bash
   pip install -r requirements.txt
   ```
To use the academic text simplification model, follow these steps:
1. Prepare your dataset: ensure your data is in the correct format as specified in the `data/` directory.

2. Run the Jupyter notebook:

   ```bash
   jupyter notebook Academic_Text_Simplification_Model.ipynb
   ```

   This notebook includes code cells for preprocessing data, training the model, and evaluating its performance.

3. Input text for simplification: use the provided user interface to input academic text and receive simplified output.
The dataset used in this project includes a collection of academic texts along with their simplified versions. Detailed information about the dataset can be found in the `data/` directory.
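For illustration, a parallel corpus of complex/simplified pairs could be loaded as shown below. The file name and column names here are hypothetical; the actual schema is documented in the `data/` directory.

```python
# Hypothetical sketch of loading aligned complex/simplified sentence pairs.
# File name and column names are assumptions, not the repository's actual schema.
import pandas as pd

df = pd.read_csv("data/academic_pairs.csv")   # hypothetical file
complex_texts = df["complex"].tolist()        # original academic sentences (assumed column)
simple_texts = df["simple"].tolist()          # human-simplified references (assumed column)
print(f"Loaded {len(df)} aligned pairs")
```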
The model leverages advanced NLP techniques, including transformer-based architectures such as BERT and GPT, to effectively simplify text. Key components include:
- Tokenization: Breaking down text into manageable pieces.
- Embedding: Representing text in numerical format.
- Attention Mechanisms: Capturing contextual relationships within the text.
- Decoder: Generating simplified text.
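As a minimal sketch of this pipeline, the snippet below runs a generic sequence-to-sequence transformer from the Hugging Face `transformers` library. The checkpoint name and prompt are placeholders, not the project's fine-tuned model, which is configured in the notebook.

```python
# Minimal sketch: tokenize -> encode with attention -> decode simplified text.
# "t5-small" is a placeholder checkpoint, not the model trained in this project.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "t5-small"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

text = "simplify: The proliferation of heterogeneous methodologies complicates replication."
inputs = tokenizer(text, return_tensors="pt", truncation=True)       # tokenization + embedding lookup
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)   # attention-based encoding + decoding
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```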
The model's performance is evaluated using several metrics:
- BLEU: Measures the overlap between the generated text and the reference text.
- ROUGE: Measures n-gram and longest-common-subsequence overlap between the generated text and reference texts.
- SARI: Designed specifically for text simplification; it compares the output against both the source and the reference simplifications, rewarding words that are correctly added, kept, and deleted.
The results demonstrate the model's ability to produce simplified text that retains the original meaning while being easier to understand.
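These metrics can be computed, for example, with the Hugging Face `evaluate` library, as sketched below; the notebook may compute them differently, and the example sentences are purely illustrative.

```python
# Sketch of scoring a simplified sentence with BLEU, ROUGE, and SARI.
import evaluate  # pip install evaluate (plus sacrebleu, rouge_score, nltk backends)

sources = ["The committee's deliberations were unnecessarily protracted."]
predictions = ["The committee's discussions took too long."]
references = [["The committee talked for longer than needed."]]

bleu = evaluate.load("bleu").compute(predictions=predictions, references=references)
rouge = evaluate.load("rouge").compute(predictions=predictions, references=references)
sari = evaluate.load("sari").compute(sources=sources, predictions=predictions, references=references)

print(bleu, rouge, sari)
```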
For any inquiries or further information, please contact:
- Name: Daniyal Ahmed
- Email: daniyalahmedmj@gmail.com
- GitHub: yourusername