This repository contains a collection of toy implementations and examples of key components from modern Transformer architectures. Each example is designed to be educational, well-documented, and easy to understand.
| Component | Description | Paper |
|---|---|---|
| Multi-Head Latent Attention (MLA) | An attention mechanism from DeepSeek V2 that compresses keys and values into a low-rank latent vector to shrink the KV cache, combined with decoupled Rotary Position Embeddings | DeepSeek V2 Technical Report |
| Multi-Head Attention | The original attention mechanism from the Transformer paper | Attention Is All You Need |
| Relative Multi-Head Attention | Attention with relative position representations | Self-Attention with Relative Position Representations |
| Absolute Positional Encoding | Sinusoidal positional encoding from the original Transformer | Attention Is All You Need |
| Rotary Position Embedding | Positional encoding that rotates query and key vectors by position-dependent angles | RoFormer |
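As a small taste of what the notebooks cover, here is a minimal sketch of the sinusoidal (absolute) positional encoding listed above. It is a generic NumPy illustration of the formula from the original Transformer paper, not the code in this repository:

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Return the (seq_len, d_model) sinusoidal encoding from 'Attention Is All You Need'."""
    positions = np.arange(seq_len)[:, None]           # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]          # (1, d_model/2)
    angles = positions / np.power(10000.0, dims / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                      # even dimensions use sine
    pe[:, 1::2] = np.cos(angles)                      # odd dimensions use cosine
    return pe

pe = sinusoidal_positional_encoding(seq_len=8, d_model=16)
print(pe.shape)  # (8, 16)
```

Even and odd embedding dimensions alternate between sine and cosine at different frequencies, so each position gets a unique, smoothly varying pattern.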
- Create and activate a virtual environment (optional but recommended):

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```

- Install the package in development mode:

  ```bash
  pip install -e .
  ```

- Install additional dependencies:

  ```bash
  pip install -r requirements.txt
  ```
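If the installation succeeded, a quick import check should work. This is a minimal sketch that assumes PyTorch is among the dependencies in `requirements.txt`; adjust the import to whatever framework the repository actually uses:

```python
# Quick sanity check after installation.
# Assumption: PyTorch is listed in requirements.txt; swap in the actual
# framework if this repository uses something else.
import torch

x = torch.randn(2, 4, 8)  # dummy (batch, seq_len, d_model) tensor
print(torch.__version__, x.shape)
```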
Each component has its own directory with:
- Implementation code
- Jupyter notebook with examples (and visualizations)
To run a notebook:
- Make sure Jupyter is installed:

  ```bash
  pip install jupyter
  ```

- Start Jupyter:

  ```bash
  jupyter notebook
  ```

- In your browser, navigate to the component you want to explore (e.g., `attention/mla_attention.ipynb`).
- Click on the notebook to open it.
- You can run cells individually by pressing `Shift+Enter`, or run all cells from the `Cell` menu.
For example, to explore Multi-Head Latent Attention from DeepSeek:

```bash
cd attention
jupyter notebook mla_attention.ipynb
```
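For reference, the core operation that all of the attention variants above build on is scaled dot-product attention. The following is a minimal, self-contained NumPy sketch for illustration only; it is not the repository's implementation, which the notebooks develop with multi-head projections and the positional schemes listed above:

```python
import numpy as np

def scaled_dot_product_attention(q, k, v, mask=None):
    """Generic single-head attention: q, k of shape (seq_len, d_k), v of shape (seq_len, d_v)."""
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)                 # pairwise similarity, (seq_len, seq_len)
    if mask is not None:
        scores = np.where(mask, scores, -1e9)       # hide masked positions before softmax
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v                              # weighted sum of value vectors

rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(4, 8)) for _ in range(3))
print(scaled_dot_product_attention(q, k, v).shape)  # (4, 8)
```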
Contributions are welcome! If you'd like to add a new component or improve an existing one, please feel free to submit a pull request.
This project is licensed under the MIT License - see the LICENSE file for details.