
Commit 0ad75e1

Update readme (#6)

* update readme with setup
* remove experiment parts of readme
* add link to toploc experiments
* remove experiment scripts

1 parent e60eeb1 commit 0ad75e1

File tree: 3 files changed, +37 -428 lines changed


README.md (+37 -47)
[TOPLOC](https://arxiv.org/abs/2501.16007) is a novel method for verifiable inference that enables users to verify that LLM providers are using the correct model configurations and settings. It leverages locality sensitive hashing for intermediate activations to detect unauthorized modifications.
For code used in our experiments, check out: https://github.com/PrimeIntellect-ai/toploc-experiments

### Installation

```bash
pip install -U toploc
```

### Features

- Detect unauthorized modifications to models, prompts, and precision settings
- 1000x reduction in storage requirements compared to full activation storage
- Validation speeds up to 100x faster than original inference
- Robust across different hardware configurations and implementations
- Zero false positives/negatives in empirical testing

*Removed in this commit:*

## Key Components

### Proof Generation

- Extracts top-k values from the last hidden state
- Uses polynomial encoding for compact storage
- Generates verifiable proof during inference
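The polynomial-encoding step above can be illustrated loosely. TOPLOC's actual scheme is specified in the paper; the sketch below only shows the general idea of packing a set of committed indices into polynomial coefficients, with the field modulus, function names, and membership check all being illustrative choices rather than the library's API:

```python
P = (1 << 61) - 1  # a Mersenne prime; the field choice here is illustrative

def encode_indices(indices):
    """Pack a set of indices into the coefficients of a monic polynomial
    whose roots (mod P) are exactly those indices."""
    coeffs = [1]  # coefficients in ascending order of degree
    for r in indices:
        # multiply the running polynomial by (x - r)
        nxt = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            nxt[i] = (nxt[i] - c * r) % P
            nxt[i + 1] = (nxt[i + 1] + c) % P
        coeffs = nxt
    return coeffs

def contains(coeffs, idx):
    """idx is a committed index iff the polynomial evaluates to 0 at idx."""
    acc = 0
    for c in reversed(coeffs):  # Horner evaluation
        acc = (acc * idx + c) % P
    return acc == 0
```

Storing the k+1 coefficients commits to all k indices at once, and membership can be checked with a single polynomial evaluation.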
### Validation

- Recalculates top-k features
- Compares exponent and mantissa differences
- Validates against predefined error thresholds
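The validation steps above can be sketched as follows. This is a minimal illustration with assumed function names and an assumed tolerance, not the repository's implementation:

```python
import struct

def fp32_parts(x: float) -> tuple[int, int]:
    # Reinterpret the float32 bit pattern and split out the exponent
    # and mantissa bit fields.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return (bits >> 23) & 0xFF, bits & 0x7FFFFF

def validate_topk(committed: dict[int, float], recomputed: list[float],
                  mean_tol: float) -> bool:
    # Recompute activations at the committed indices, then require exact
    # exponent agreement and a small mean mantissa difference.
    errors = []
    for idx, val in committed.items():
        e1, m1 = fp32_parts(val)
        e2, m2 = fp32_parts(recomputed[idx])
        if e1 != e2:          # exponent mismatch: reject immediately
            return False
        errors.append(abs(m1 - m2))
    return sum(errors) / len(errors) <= mean_tol
```

An honest recomputation reproduces the committed values and passes, while activations from a modified model or configuration drift enough to shift exponents or inflate mantissa error.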
### Storage Requirements

- Only 258 bytes per 32 new tokens
- Compared to 262KB for full token embeddings (Llama-3.1-8B-Instruct)
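These figures check out arithmetically, assuming Llama-3.1-8B-Instruct's hidden size of 4096 and 2-byte (bfloat16) activations:

```python
hidden_size = 4096      # Llama-3.1-8B model dimension
bytes_per_act = 2       # bfloat16
new_tokens = 32

full_storage = new_tokens * hidden_size * bytes_per_act
proof_size = 258        # bytes per TOPLOC proof covering 32 new tokens

print(full_storage)                 # 262144 bytes, i.e. 262KB
print(full_storage // proof_size)   # 1016, roughly the 1000x reduction cited above
```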
*Removed in this commit:*

## Integrations

### vLLM

TOPLOC is integrated with vLLM for efficient inference and validation as part of this repository. The integration allows you to leverage vLLM's optimized inference pipeline while maintaining verification capabilities.

### SGLang

We maintain a [fork of SGLang](https://github.com/PrimeIntellect-ai/sglang) that includes TOPLOC integration, enabling verifiable inference with SGLang's framework.
### Development Setup

1. Install uv:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.local/bin/env
```

2. Set up the virtual environment (this can take a while because of the C extensions):

```bash
uv venv --python 3.12
source .venv/bin/activate
uv sync --dev
```

### Running Tests

Run the test suite:

```bash
uv run pytest tests
```

Run a single file:

```bash
uv run pytest tests/test_utils.py
```

Run a single test:

```bash
uv run pytest tests/test_utils.py::test_get_fp32_parts
```

### Code Quality

Install pre-commit hooks:

```bash
uv run pre-commit install
```

Run linting and formatting on all files:

```bash
pre-commit run --all-files
```

*Removed in this commit:*

## How to use the code

### Installation

```bash
git clone https://github.com/PrimeIntellect/toploc.git
pip install -r requirements.txt
```

### Run Experiments

This is an example of running validation with Llama-3.1-8B-Instruct over the ultrachat dataset.

First, generate the polynomial encodings for the model:

```bash
python vllm_generate_poly.py --model_name meta-llama/Llama-3.1-8B-Instruct --tp 1 --n_samples 4 --save_dir signatures --max_decode_tokens 512 --dataset_name stingning/ultrachat --dtype bfloat16
```

This should create a directory called `signatures` with the polynomial encodings for the model.

You can then run validation with:

```bash
python vllm_validate_poly.py --decode_model_name meta-llama/Llama-3.1-8B-Instruct --validate_model_name meta-llama/Llama-3.1-8B-Instruct --tp 1 --n_samples 4 --save_dir just4 --max_decode_tokens 512 --dataset_name stingning/ultrachat --dtype bfloat16 --attn flash
```

If the verification passes, you should see:

```
VERIFICATION PASSED: Mantissa error mean: 0. below 10 and median: 0. below 8 and exp intersections: 100 below 90
```

And if it fails, you should see something like:

```
VERIFICATION FAILED: Mantissa error mean: 11.000000 above 10 or median: 10.000000 above 8 or exp intersections: 0 above 90
```
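The pass/fail rule implied by those verification messages can be sketched as below, with the thresholds taken from the messages themselves. Treating a higher exponent-intersection count as passing is an assumption here, since the printed comparison text is ambiguous:

```python
from statistics import mean, median

def verification_passed(mantissa_errors: list[float], exp_intersections: int,
                        mean_max: float = 10, median_max: float = 8,
                        min_intersections: int = 90) -> bool:
    # Illustrative reconstruction of the criteria behind the printed messages,
    # not the actual code in vllm_validate_poly.py.
    return (mean(mantissa_errors) < mean_max
            and median(mantissa_errors) < median_max
            and exp_intersections > min_intersections)
```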

# Citing

```bibtex
@misc{ong2025toploclocalitysensitivehashing,

vllm_generate_poly.py (-146)

This file was deleted.
