feat: fundamental codebase refactor #156

Merged · 87 commits into main · Feb 12, 2025

Conversation

b4rtaz (Owner) commented Feb 6, 2025

This PR introduces fundamental changes to Distributed Llama. Work in progress.

List of changes:

  • ✅ Introduced an abstract neural network model that describes network behavior with opcodes (sketched after this list)
  • ✅ Completely restructured the project
  • ✅ Added batch processing to support evaluation and prediction #138
  • ✅ Sped up the matmul operation for evaluation (using llamafile's sgemm)
  • ✅ Improved the tokenizer
  • ✅ Fixed obvious memory leaks (detected by -fsanitize=address)
  • ✅ Added ARM and AVX2 optimizations for all opcodes (a minimal AVX2 example follows this list)
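
The PR text doesn't spell out the new abstraction, so here is a minimal C++ sketch of what an opcode-driven model description typically looks like; every name in it (`OpCode`, `Op`, `Net`, `execute`) is hypothetical, not the actual Distributed Llama API:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical opcode set; the real project defines its own.
enum class OpCode { MATMUL, RMS_NORM, ROPE, SILU, MUL, ADD, SOFTMAX };

// One instruction of the abstract network: which kernel to run and
// which buffers it reads/writes (indices into a buffer table).
struct Op {
    OpCode code;
    size_t input;
    size_t weight;
    size_t output;
};

// The whole model becomes data: a flat list of ops that a root or
// worker node can walk and execute with device-specific kernels.
using Net = std::vector<Op>;

void execute(const Net &net) {
    for (const Op &op : net) {
        switch (op.code) {
        case OpCode::MATMUL: /* dispatch ARM NEON / AVX2 matmul */ break;
        case OpCode::ROPE:   /* apply rotary position embedding */ break;
        default:             /* remaining kernels */ break;
        }
    }
}
```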
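Similarly, the sgemm kernel itself lives in llamafile, but the AVX2 inner loop such a matmul builds on is essentially a fused multiply-add dot product. A sketch under the assumptions that `n` is a multiple of 8 and the code is compiled with `-mavx2 -mfma`:

```cpp
#include <immintrin.h>
#include <cstddef>

// Illustrative only: an AVX2 dot product over float32, the kind of
// inner loop a matmul kernel is built on.
float dotF32Avx2(const float *a, const float *b, size_t n) {
    __m256 acc = _mm256_setzero_ps();
    for (size_t i = 0; i < n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        acc = _mm256_fmadd_ps(va, vb, acc); // acc += va * vb
    }
    // Horizontal sum of the 8 accumulator lanes.
    __m128 lo = _mm256_castps256_ps128(acc);
    __m128 hi = _mm256_extractf128_ps(acc, 1);
    __m128 sum = _mm_add_ps(lo, hi);
    sum = _mm_hadd_ps(sum, sum);
    sum = _mm_hadd_ps(sum, sum);
    return _mm_cvtss_f32(sum);
}
```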

Fixes:

  • Fixed a bug in RoPE scaling (sketched below)
  • Fixed a bug in the tokenizer that caused special tokens to be tokenized incorrectly
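
The comment doesn't show the RoPE scaling bug itself; for context, the frequency scaling applied by Llama 3.1-style models looks roughly like the following. The function name and parameter defaults are illustrative, not the project's actual code:

```cpp
#include <cmath>

// Scales one base rotary frequency. High-frequency components pass
// through, low-frequency components are fully scaled, and the band
// in between is smoothly interpolated.
float scaleRopeFreq(float freq,
                    float scaleFactor = 8.0f,
                    float lowFreqFactor = 1.0f,
                    float highFreqFactor = 4.0f,
                    float origContextLength = 8192.0f) {
    const float pi = 3.14159265358979f;
    float wavelen = 2.0f * pi / freq;
    float lowFreqWavelen = origContextLength / lowFreqFactor;
    float highFreqWavelen = origContextLength / highFreqFactor;
    if (wavelen < highFreqWavelen)
        return freq;               // high-frequency band: unscaled
    if (wavelen > lowFreqWavelen)
        return freq / scaleFactor; // low-frequency band: fully scaled
    // Smooth interpolation between the two bands.
    float smooth = (origContextLength / wavelen - lowFreqFactor)
                 / (highFreqFactor - lowFreqFactor);
    return (1.0f - smooth) * (freq / scaleFactor) + smooth * freq;
}
```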

b4rtaz marked this pull request as ready for review on February 12, 2025 at 22:59
b4rtaz merged commit 121bc8c into main on Feb 12, 2025
6 checks passed