Issue with Decoding Preprocessed Data #1691
I am using the axolotl library to preprocess data, which is then saved in the .arrow format. The process involves several data sources that are processed and integrated into a unified data structure.

Data Configuration: yaml

axolotl: 0.4.1

Expectations: I expected the file to be saved in a valid Arrow format and that it could be successfully loaded and decoded for further use.

Questions for Developers
Replies: 1 comment
Hello, I'm very sorry for missing this post. Thank you for all the details. I'm leaving this message in case anyone else comes across this issue as well.
The way to load the processed data is to use `load_from_disk`. There's a short explanation of how to do so here: https://axolotl-ai-cloud.github.io/axolotl/docs/input_output.html#check-the-prompts
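
For anyone landing here, a minimal sketch of what that can look like. It assumes the prepared data was written with the Hugging Face `datasets` library's `save_to_disk`, so that each dataset directory holds a `state.json` next to its `.arrow` shards, and that `load_from_disk` should be pointed at that directory rather than at an individual `.arrow` file. The helper names and the `last_run_prepared` path are hypothetical; substitute your own prepared-data path.

```python
from pathlib import Path


def find_prepared_dirs(root):
    """Return directories under `root` that look like a saved dataset.

    Assumption: a dataset written by `Dataset.save_to_disk` contains a
    `state.json` file alongside its `.arrow` shards.
    """
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(p.parent for p in root.rglob("state.json"))


def load_prepared(dataset_dir):
    """Load a prepared dataset directory (not a single .arrow file)."""
    # Imported lazily so find_prepared_dirs works without `datasets` installed.
    from datasets import load_from_disk

    return load_from_disk(str(dataset_dir))


if __name__ == "__main__":
    # Hypothetical location; replace with your own prepared-data directory.
    for dataset_dir in find_prepared_dirs("last_run_prepared"):
        ds = load_prepared(dataset_dir)
        print(dataset_dir, ds)
        # To get text back from token ids, decode with the same tokenizer
        # used for preprocessing, e.g. tokenizer.decode(ds[0]["input_ids"]),
        # assuming the dataset has an "input_ids" column.
```

If the directory search comes back empty, the data was probably saved somewhere else or in a different layout, so check the prepared path in your config first.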