This project was completed as a self-chosen research assignment for the CSE 256 NLP course (Spring 2024) at the University of California, San Diego. The goal is to implement and experiment with different parts of the transformer architecture (i.e., the encoder and the decoder) from scratch, without using any existing transformer-related libraries (such as nn.MultiheadAttention), and to complete the following tasks:
- Part 1. Encoder with Classifier: Implement a transformer encoder and train it jointly with a feedforward classifier on a downstream task of predicting the speaker of a given speech segment (a sketch of the encoder follows this list).
- Part 2. Decoder for Language Modeling: Implement a GPT-like transformer decoder, pretrain it on an autoregressive language modeling task, and report perplexity on speeches from different politicians (a sketch of the causal mask and perplexity computation follows this list).
- Part 3. Exploration: Experiment with different parts of the architecture, such as alternative positional encodings and sparse attention patterns, to improve the performance of the classifier or the decoder (a sketch of both variations follows this list).
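
The Part 1 model can be hard to picture from the task description alone. Below is a minimal, hypothetical sketch of a bidirectional self-attention encoder block with a mean-pooled feedforward classifier head, written directly in PyTorch (no nn.MultiheadAttention). The class names and default hyperparameters (n_embd, n_head, n_layer, block_size, n_classes) are illustrative assumptions and may not match the ones used in transformer.py.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttentionHead(nn.Module):
    """One head of bidirectional scaled dot-product self-attention (no causal mask)."""
    def __init__(self, n_embd, head_size):
        super().__init__()
        self.key = nn.Linear(n_embd, head_size, bias=False)
        self.query = nn.Linear(n_embd, head_size, bias=False)
        self.value = nn.Linear(n_embd, head_size, bias=False)

    def forward(self, x):                                    # x: (B, T, n_embd)
        k, q, v = self.key(x), self.query(x), self.value(x)
        att = q @ k.transpose(-2, -1) / (k.size(-1) ** 0.5)  # (B, T, T)
        att = F.softmax(att, dim=-1)                         # every token attends to every token
        return att @ v                                       # (B, T, head_size)

class EncoderBlock(nn.Module):
    """Multi-head self-attention followed by a feedforward sublayer, with residual connections."""
    def __init__(self, n_embd, n_head):
        super().__init__()
        head_size = n_embd // n_head
        self.heads = nn.ModuleList(SelfAttentionHead(n_embd, head_size) for _ in range(n_head))
        self.proj = nn.Linear(n_embd, n_embd)
        self.ffwd = nn.Sequential(nn.Linear(n_embd, 4 * n_embd), nn.ReLU(),
                                  nn.Linear(4 * n_embd, n_embd))
        self.ln1, self.ln2 = nn.LayerNorm(n_embd), nn.LayerNorm(n_embd)

    def forward(self, x):
        x = x + self.proj(torch.cat([h(self.ln1(x)) for h in self.heads], dim=-1))
        x = x + self.ffwd(self.ln2(x))
        return x

class EncoderClassifier(nn.Module):
    """Token + positional embeddings -> encoder blocks -> mean pool -> speaker logits."""
    def __init__(self, vocab_size, n_embd=64, n_head=2, n_layer=4, block_size=32, n_classes=3):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, n_embd)
        self.pos_emb = nn.Embedding(block_size, n_embd)      # learned positions as the baseline choice
        self.blocks = nn.Sequential(*[EncoderBlock(n_embd, n_head) for _ in range(n_layer)])
        self.classifier = nn.Sequential(nn.Linear(n_embd, n_embd), nn.ReLU(),
                                        nn.Linear(n_embd, n_classes))

    def forward(self, idx):                                  # idx: (B, T) token ids
        B, T = idx.shape
        x = self.tok_emb(idx) + self.pos_emb(torch.arange(T, device=idx.device))
        x = self.blocks(x)
        return self.classifier(x.mean(dim=1))                # average over positions, then classify
```

Training then amounts to minimizing cross-entropy between these logits and the speaker labels from train_CLS.tsv.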
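What distinguishes the Part 2 decoder from the encoder above is the causal attention mask, and the reported metric is perplexity. The sketch below illustrates both; the batch format (x, y) and the model returning per-token logits are assumed interfaces, not necessarily the repository's actual API.

```python
import math
import torch
import torch.nn.functional as F

def causal_attention(q, k, v):
    """Scaled dot-product attention with a lower-triangular mask, so each position
    attends only to itself and earlier positions (the autoregressive constraint)."""
    T = q.size(-2)
    att = q @ k.transpose(-2, -1) / math.sqrt(k.size(-1))            # (B, T, T)
    mask = torch.tril(torch.ones(T, T, device=q.device)).bool()
    att = att.masked_fill(~mask, float('-inf'))                      # block attention to future tokens
    return F.softmax(att, dim=-1) @ v

@torch.no_grad()
def perplexity(model, loader, device="cpu"):
    """Perplexity = exp(mean token-level cross-entropy) over a held-out speech file."""
    losses = []
    for x, y in loader:                                              # y is x shifted by one token
        logits = model(x.to(device))                                 # (B, T, vocab_size)
        loss = F.cross_entropy(logits.view(-1, logits.size(-1)), y.view(-1).to(device))
        losses.append(loss.item())
    return math.exp(sum(losses) / len(losses))
```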
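For Part 3, two of the explorations mentioned above are a different positional encoding and a sparse attention pattern. Below is a hedged sketch of both: a fixed sinusoidal positional encoding and a sliding-window attention mask. The window size and the exact way the mask is plugged into the attention scores are assumptions for illustration, not a description of the final implementation.

```python
import math
import torch

def sinusoidal_positional_encoding(block_size, n_embd):
    """Fixed sin/cos positional encoding (an alternative to a learned position table).
    Assumes n_embd is even."""
    pos = torch.arange(block_size, dtype=torch.float32).unsqueeze(1)             # (T, 1)
    div = torch.exp(torch.arange(0, n_embd, 2, dtype=torch.float32)
                    * (-math.log(10000.0) / n_embd))                             # (n_embd/2,)
    pe = torch.zeros(block_size, n_embd)
    pe[:, 0::2] = torch.sin(pos * div)
    pe[:, 1::2] = torch.cos(pos * div)
    return pe                                                                    # (T, n_embd), added to token embeddings

def sliding_window_mask(T, window, causal=True):
    """Sparse attention pattern: each position may attend only to the `window`
    nearest positions instead of the full sequence."""
    idx = torch.arange(T)
    dist = idx.unsqueeze(0) - idx.unsqueeze(1)                                   # dist[i, j] = j - i
    if causal:
        allowed = (dist <= 0) & (dist > -window)                                 # only the most recent past positions
    else:
        allowed = dist.abs() < window                                            # symmetric local neighbourhood
    return allowed                                                               # bool (T, T)
```

The boolean mask is applied the same way as the causal mask in the Part 2 sketch, i.e. scores.masked_fill(~mask, float('-inf')) before the softmax.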
PA2_code/
│
├── speechesdataset/
│ ├── test_CLS.tsv
│ ├── test_LM_hbush.txt
│ ├── test_LM_obama.txt
│ ├── test_LM_wbush.txt
│ ├── train_CLS.tsv
│ └── train_LM.txt
│
├── Attention maps (PDF)
│
├── dataset.py
├── main.py
├── tokenizer.py
├── transformer.py
└── utilities.py
Ensure you have the following installed:
- Python 3.7+
- PyTorch
- NLTK
- NumPy
- Matplotlib
Part 1: Encoder with Classifier
python main.py part1
Part 2: Decoder for Language Modeling
Choose the intended test set (test_LM_obama.txt, test_LM_wbush.txt, or test_LM_hbush.txt) by changing the input path before running:
python main.py part2
Part 3: Exploration
python main.py part3
Model initialization with hyperparameters, optimizer setup, pretraining, and evaluation are handled in main.py.
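
The main.py workflow just described (model construction, optimizer setup, pretraining, evaluation) typically reduces to a loop like the one below. This is only an illustrative outline under assumed interfaces (a model that returns (logits, loss) when targets are passed, and a dataloader yielding (inputs, targets) pairs); the actual script may be organized differently.

```python
import torch

def pretrain(model, train_loader, max_iters=500, lr=1e-3, device="cpu"):
    """Illustrative pretraining loop; hyperparameter values are placeholders."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.to(device).train()
    it = 0
    while it < max_iters:
        for x, y in train_loader:                            # (inputs, targets) batches from dataset.py
            _, loss = model(x.to(device), y.to(device))      # assumes the model returns (logits, loss) given targets
            optimizer.zero_grad(set_to_none=True)
            loss.backward()
            optimizer.step()
            it += 1
            if it >= max_iters:
                break
    return model
```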
The required transformer models and the improved variants are implemented in transformer.py.