This repository contains a simple PyTorch implementation of a GPT (Generative Pre-trained Transformer) model. GPT models are designed for natural language processing tasks such as text generation. In this project, we build a simplified GPT, train it on a small dataset, and generate text from a given prompt.
## Table of Contents

- Introduction
- Features
- Installation
- Usage
- Model Architecture
- Training the Model
- Generating Text
- Contributing
- License
## Introduction

GPT is a transformer-based model for generating human-like text, with applications across NLP tasks such as text completion, translation, and summarization. This implementation demonstrates the basic structure and functionality of a GPT model.
## Features

- Custom dataset handling for text inputs
- Simplified GPT architecture with transformer blocks
- Training loop with loss calculation and optimization
- Text generation from a trained model
## Installation

To run this code, you need Python 3.6 or higher and the following libraries:
- torch
- transformers
You can install these libraries using pip:
```bash
pip install torch transformers
```
## Usage

- Clone this repository:

  ```bash
  git clone https://github.com/yourusername/simple-gpt.git
  cd simple-gpt
  ```

- Run the script:

  ```bash
  python understanding-gpt.py
  ```
## Model Architecture

The model consists of the following components (a minimal sketch follows this list):

- `SimpleDataset`: a custom dataset class that handles text inputs and tokenization.
- `GPTBlock`: a single transformer block combining multi-head self-attention and a feed-forward neural network.
- `SimpleGPT`: the main GPT model class, which stacks multiple `GPTBlock`s on top of token and position embeddings.
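For orientation, here is a minimal sketch of what these three components might look like. The exact signatures and hyperparameters (`embed_dim`, `num_heads`, `ff_dim`, and so on) are assumptions for illustration; the real definitions live in `understanding-gpt.py`:

```python
import torch
import torch.nn as nn
from torch.utils.data import Dataset

class SimpleDataset(Dataset):
    """Tokenizes raw text and returns (input_ids, attention_mask) pairs."""
    def __init__(self, texts, tokenizer, max_length=64):  # assumed signature
        self.enc = tokenizer(texts, truncation=True, padding='max_length',
                             max_length=max_length, return_tensors='pt')

    def __len__(self):
        return self.enc['input_ids'].size(0)

    def __getitem__(self, idx):
        return self.enc['input_ids'][idx], self.enc['attention_mask'][idx]

class GPTBlock(nn.Module):
    """One transformer block: masked multi-head self-attention + feed-forward net."""
    def __init__(self, embed_dim, num_heads, ff_dim, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim, num_heads,
                                          dropout=dropout, batch_first=True)
        self.ln1 = nn.LayerNorm(embed_dim)
        self.ff = nn.Sequential(nn.Linear(embed_dim, ff_dim), nn.GELU(),
                                nn.Linear(ff_dim, embed_dim))
        self.ln2 = nn.LayerNorm(embed_dim)

    def forward(self, x, key_padding_mask=None):
        # Causal mask: position i may only attend to positions <= i
        seq_len = x.size(1)
        causal = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool,
                                       device=x.device), diagonal=1)
        attn_out, _ = self.attn(x, x, x, attn_mask=causal,
                                key_padding_mask=key_padding_mask)
        x = self.ln1(x + attn_out)    # residual connection + layer norm
        x = self.ln2(x + self.ff(x))  # feed-forward with residual
        return x

class SimpleGPT(nn.Module):
    """Token + position embeddings, a stack of GPTBlocks, and a vocab-sized head."""
    def __init__(self, vocab_size, embed_dim=256, num_heads=4, ff_dim=1024,
                 num_layers=4, max_len=512):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, embed_dim)
        self.pos_emb = nn.Embedding(max_len, embed_dim)
        self.blocks = nn.ModuleList([GPTBlock(embed_dim, num_heads, ff_dim)
                                     for _ in range(num_layers)])
        self.lm_head = nn.Linear(embed_dim, vocab_size)

    def forward(self, input_ids, attention_mask=None):
        positions = torch.arange(input_ids.size(1), device=input_ids.device)
        x = self.token_emb(input_ids) + self.pos_emb(positions)
        # attention_mask uses 1 for real tokens, 0 for padding
        pad_mask = (attention_mask == 0) if attention_mask is not None else None
        for block in self.blocks:
            x = block(x, key_padding_mask=pad_mask)
        return self.lm_head(x)  # logits over the vocabulary at every position
```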
## Training the Model

The `train` function handles the training process: the forward pass, loss calculation, backpropagation, and optimization. Note the one-position shift between logits and labels, which teaches the model to predict the next token at every position.
```python
def train(model, dataloader, optimizer, criterion, epochs=5, device='cuda'):
    model.train()
    for epoch in range(epochs):
        total_loss = 0
        for input_ids, attention_mask in dataloader:
            input_ids, attention_mask = input_ids.to(device), attention_mask.to(device)
            optimizer.zero_grad()
            outputs = model(input_ids, attention_mask)
            # Shift so each position's logits are scored against the *next* token
            shift_logits = outputs[..., :-1, :].contiguous()
            shift_labels = input_ids[..., 1:].contiguous()
            loss = criterion(shift_logits.view(-1, shift_logits.size(-1)),
                             shift_labels.view(-1))
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
        print(f"Epoch {epoch + 1}/{epochs}, Loss: {total_loss / len(dataloader)}")
```
## Generating Text

The `generate_text` function generates text from a trained model given a prompt. It decodes greedily: at each step it appends the single most likely next token, stopping at the end-of-sequence token or after `max_length` new tokens.
```python
def generate_text(model, tokenizer, prompt, max_length=50, device='cuda'):
    model.eval()
    input_ids = tokenizer.encode(prompt, return_tensors='pt').to(device)
    generated = input_ids
    with torch.no_grad():  # no gradients needed at inference time
        for _ in range(max_length):
            outputs = model(generated)
            next_token_logits = outputs[:, -1, :]  # logits for the last position
            # Greedy decoding: pick the most likely token (assumes batch size 1)
            next_token = torch.argmax(next_token_logits, dim=-1).unsqueeze(0)
            generated = torch.cat((generated, next_token), dim=1)
            if next_token.item() == tokenizer.eos_token_id:
                break
    generated_text = tokenizer.decode(generated[0], skip_special_tokens=True)
    return generated_text
```
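A hypothetical call, reusing the model and tokenizer from the training example:

```python
print(generate_text(model, tokenizer, "Once upon a time", device=device))
```

Because decoding is greedy, the output is deterministic for a given prompt. For more varied text, the argmax line could be swapped for temperature sampling, e.g.:

```python
# Hypothetical drop-in replacement for the argmax line in generate_text
probs = torch.softmax(next_token_logits / 0.8, dim=-1)  # temperature 0.8
next_token = torch.multinomial(probs, num_samples=1)    # shape [1, 1]
```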
## Contributing

Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.
## License

This project is licensed under the MIT License. See the LICENSE file for details.