Hi there
I'm busy converting Llama 3 70B to the distributed format, but I get the following output:
Target float type: q40
Target file: D:\Meta-Llama-3-70B-Instruct-Distributed\dllama_original_q40.bin
💿 Chunking model 1/16...
Unknown header key: ffn_dim_multiplier
Unknown header key: multiple_of
Unknown header key: norm_eps
Unknown header key: head_size
{'dim': 8192, 'ffn_dim_multiplier': 1.3, 'multiple_of': 4096, 'n_heads': 64, 'n_kv_heads': 8, 'n_layers': 80, 'norm_eps': 1e-05, 'vocab_size': 128256, 'rope_theta': 500000, 'head_size': 128.0, 'max_seq_len': 2048, 'arch_type': 11259136, 'n_experts': 0, 'n_active_experts': 0, 'hidden_dim': 28672}
🔶 Exporting tok_embeddings.weight torch.Size([16032, 65536])...
Saved f32 tensor in 72.36s, 4202692608 bytes
🔶 Exporting layers.0.attention.wq.weight torch.Size([8192, 8192])...
Saved q40 tensor in 15.90s, 37748736 bytes
🔶 Exporting layers.0.attention.wk.weight torch.Size([1024, 8192])...
Saved q40 tensor in 1.99s, 4718592 bytes
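The "Unknown header key" lines suggest the converter warns about config keys it does not map into the output header and then continues, rather than failing. A minimal illustrative sketch of that pattern (`KNOWN_KEYS` and `load_params` are hypothetical names, not the actual converter code):

```python
# Hypothetical sketch: warn on unrecognized config keys and continue,
# keeping only the keys the output header format understands.
KNOWN_KEYS = {'dim', 'n_heads', 'n_kv_heads', 'n_layers', 'vocab_size',
              'rope_theta', 'max_seq_len', 'hidden_dim'}

def load_params(params: dict) -> dict:
    config = {}
    for key, value in params.items():
        if key not in KNOWN_KEYS:
            # Non-fatal: the key is simply not written to the header.
            print(f'Unknown header key: {key}')
            continue
        config[key] = value
    return config
```

Under this reading, the warnings are informational and the conversion proceeds normally.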
Would it still work fine? The conversion so far is really slow on my machine; it should be done in a couple of hours.
Hello @DifferentialityDevelopment, yes, it should be fine. The converter is slow; that part is completely unoptimized for now.
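As a sanity check, the byte counts in the log are internally consistent, assuming `q40` uses a ggml-style Q4_0 block layout (32 weights per block, stored as 16 bytes of packed 4-bit values plus a 2-byte scale, i.e. 18 bytes per block) and `f32` stores 4 bytes per weight:

```python
def q40_bytes(rows: int, cols: int) -> int:
    # Assumed Q4_0 layout: 32 weights -> 18 bytes
    # (16 bytes of 4-bit nibbles + 2-byte scale).
    return rows * cols // 32 * 18

def f32_bytes(rows: int, cols: int) -> int:
    # Plain float32: 4 bytes per weight.
    return rows * cols * 4
```

For example, `q40_bytes(8192, 8192)` gives 37748736 and `f32_bytes(16032, 65536)` gives 4202692608, matching the "Saved ... bytes" lines in the log, so the export appears to be writing the expected amount of data.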