
Nano GPT-jax

An implementation of nanoGPT in JAX from scratch (aside from Optax for optimization and Equinox for handling PyTrees), based on Andrej Karpathy's "Let's build GPT" lecture.

Usage

  • The Shakespeare dataset is in the data folder. Configure the hyper-parameters in nanogpt-jax/train.py for your setup, then run:
$ python train.py

TODOS

  • Write dropout layers.
  • Implement LayerNorm.
  • Apply weight initializers.
  • Implement Adam.
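The first two TODOs above could be sketched in pure JAX roughly as follows. This is a minimal illustration, not the repository's code: the function names `layer_norm` and `dropout`, their signatures, and the default hyper-parameters are assumptions for the sketch.

```python
import jax
import jax.numpy as jnp

def layer_norm(x, gamma, beta, eps=1e-5):
    # Normalize over the last axis, then apply the learned affine transform.
    mean = jnp.mean(x, axis=-1, keepdims=True)
    var = jnp.var(x, axis=-1, keepdims=True)
    x_hat = (x - mean) / jnp.sqrt(var + eps)
    return gamma * x_hat + beta

def dropout(x, key, rate=0.1, train=True):
    # At train time, zero activations with probability `rate` and rescale
    # the survivors so the expected activation value is unchanged.
    if not train or rate == 0.0:
        return x
    keep = jax.random.bernoulli(key, 1.0 - rate, x.shape)
    return jnp.where(keep, x / (1.0 - rate), 0.0)
```

Note that dropout takes an explicit PRNG key, following JAX's functional random-number convention, so the layer stays a pure function and composes with `jax.jit`.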

References