AnnotatedGPT

This is a simple version of the GPT model in 300 lines of code.

The project is adapted from The Annotated Transformer.

The Transformer uses an encoder-decoder architecture, whereas GPT is decoder-only. This is the main difference between the two projects. Other than that, most of the code is the same as in The Annotated Transformer.
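
To illustrate what decoder-only means in practice, here is a minimal sketch of causal (masked) self-attention weights in plain Python. It is not this project's code. The name `subsequent_mask` echoes the helper in The Annotated Transformer, while `masked_softmax` and the toy scores are assumptions made for this example. Each position may attend only to itself and earlier positions, and there is no cross-attention to an encoder.

```python
import math

def subsequent_mask(size):
    """Lower-triangular mask: entry [i][j] is True when position i may attend to j."""
    return [[j <= i for j in range(size)] for i in range(size)]

def masked_softmax(scores, mask):
    """Row-wise softmax that gives zero weight to masked-out (future) positions.

    Hypothetical helper for illustration only.
    """
    weights = []
    for row, mask_row in zip(scores, mask):
        # Subtract the largest allowed score for numerical stability.
        peak = max(s for s, m in zip(row, mask_row) if m)
        exps = [math.exp(s - peak) if m else 0.0 for s, m in zip(row, mask_row)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Toy attention scores for a 3-token sequence (all equal, so weights are uniform
# over the positions each token is allowed to see).
scores = [[0.0] * 3 for _ in range(3)]
weights = masked_softmax(scores, subsequent_mask(3))
```

With uniform scores, the first token attends only to itself, the second splits its weight evenly over the first two tokens, and the third spreads it over all three. An encoder-decoder model would add a second attention step over the encoder output, which a decoder-only model omits.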

The Decoder-Only Architecture

The Encoder-Decoder Architecture