Hello everyone!
I am currently doing a small research project for my studies on sparse transfer learning, and the SparseML library looks like a good fit for my work. My topic is applying sparse transfer learning to different architectures. Before that, the Transformer needs to be made sparse (e.g. pruned with GMP). Transformer architectures include encoder-only (e.g. BERT), decoder-only (e.g. GPT) and encoder-decoder (e.g. T5). Based on your documentation on GitHub, which I have looked at, the first two are technically possible. For encoder-decoder models, it is still unclear to me whether this is technically possible with SparseML. In theory I could just adjust/customize the recipe, but it is unclear to me what I have to do. What do I have to set in “params”? Is it just “All Prunable”, or could you give an example recipe for T5 (encoder-decoder)? That would be helpful for me. I look forward to your feedback, and thank you in advance for your help!
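To make the “params” question concrete, here is a minimal sketch of the kind of selection I have in mind. It is not SparseML code: the parameter names are hypothetical examples modelled on Hugging Face T5 naming, the patterns are my own guess, and `select_params` is an illustrative helper. The only thing taken from SparseML is my understanding that recipe entries prefixed with `re:` are treated as regular expressions.

```python
import re

# Hypothetical parameter names, modelled on Hugging Face T5 naming (assumption).
PARAM_NAMES = [
    "shared.weight",
    "encoder.block.0.layer.0.SelfAttention.q.weight",
    "encoder.block.0.layer.0.layer_norm.weight",
    "encoder.block.0.layer.1.DenseReluDense.wi.weight",
    "decoder.block.0.layer.0.SelfAttention.k.weight",
    "decoder.block.0.layer.1.EncDecAttention.v.weight",
    "decoder.block.0.layer.2.DenseReluDense.wo.weight",
    "decoder.final_layer_norm.weight",
]

# Hypothetical recipe-style patterns (assumption): attention projections and
# feed-forward weights in both the encoder and the decoder stacks.
PATTERNS = [
    r"re:(encoder|decoder)\.block\.\d+\.layer\.\d+\.(SelfAttention|EncDecAttention)\.(q|k|v|o)\.weight",
    r"re:(encoder|decoder)\.block\.\d+\.layer\.\d+\.DenseReluDense\.(wi|wo)\.weight",
]


def select_params(names, patterns):
    """Return the names matched by any pattern (illustrative helper, not a SparseML API)."""
    selected = []
    for name in names:
        for pattern in patterns:
            if pattern.startswith("re:"):
                # Entries with the "re:" prefix are matched as regular expressions.
                matched = re.fullmatch(pattern[3:], name) is not None
            else:
                # Entries without the prefix are compared as exact names.
                matched = pattern == name
            if matched:
                selected.append(name)
                break
    return selected


selected = select_params(PARAM_NAMES, PATTERNS)
```

With these example names, the embedding and layer-norm weights are left out, while the attention and feed-forward weights of both stacks are selected. My question is whether a list like this is needed for T5, or whether “All Prunable” already covers it.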