Refactor baseline long document classification approaches to subdirectories #27

Merged
merged 2 commits into from
Aug 2, 2024
2 changes: 2 additions & 0 deletions models/README.md
@@ -9,6 +9,8 @@ Code to continue pretraining and fine-tune LMs from Hugging Face Transformers

This subdirectory contains code to continue pretraining and fine-tune LMs from Hugging Face Transformers using the [Transformers Python library](https://github.com/huggingface/transformers). The code assumes that the LMs for continued pretraining and fine-tuning exist on your file system at the paths specified in `.yml`, but it can easily be modified to load models over HTTP by replacing these paths with the corresponding LM names from Hugging Face. To see how to download a model over HTTP, click a model on the [models page](https://huggingface.co/models) and check its "Use in Transformers" tab.
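
As a rough illustration of swapping a local path for a Hugging Face model name, the sketch below picks whichever source is available and passes it to `from_pretrained`, which accepts either a directory or a Hub name. The helper names `resolve_model_source` and `load_model`, the example path, and the `roberta-base` name are assumptions for illustration only; they are not taken from this repository's code or `.yml` files.

```python
import os


def resolve_model_source(configured_path: str, hub_name: str) -> str:
    """Return the local directory if it exists, otherwise the Hub model name.

    Both arguments are hypothetical stand-ins for values that would
    normally come from the `.yml` configuration.
    """
    return configured_path if os.path.isdir(configured_path) else hub_name


def load_model(source: str):
    """Load a tokenizer and model from a local path or a Hub model name."""
    # Imported lazily so resolve_model_source works without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(source)
    model = AutoModel.from_pretrained(source)
    return tokenizer, model


if __name__ == "__main__":
    # Hypothetical path; falls back to the Hub name when the directory is missing.
    source = resolve_model_source("/path/to/local/roberta-base", "roberta-base")
    print(f"Loading model from: {source}")
```

Calling `load_model(source)` with a Hub name downloads the weights over HTTP on first use, while a local directory is read straight from disk.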

The `llama` and `long_roberta` directories were added later. They contain code for additional long document classification baselines, which can be compared against the long encoder-based LMs.

### Data Prep

Data should be provided as a [Hugging Face Dataset](https://huggingface.co/datasets). To create a Hugging Face Dataset, check out their documentation [here](https://huggingface.co/docs/datasets/index).
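
A minimal sketch of preparing such a dataset is shown below, assuming a simple classification layout with `text` and `label` columns. Those column names, the helper names, and the output directory are assumptions rather than requirements stated in this README; check the training scripts for the fields they actually expect.

```python
def records_to_columns(records):
    """Convert row-oriented records into the column-oriented dict
    that `datasets.Dataset.from_dict` expects.

    The `text` and `label` keys are assumed for illustration.
    """
    columns = {"text": [], "label": []}
    for row in records:
        columns["text"].append(row["text"])
        columns["label"].append(row["label"])
    return columns


def build_dataset(records, out_dir):
    """Build a Hugging Face Dataset from records and save it to disk."""
    # Imported lazily so records_to_columns works without datasets installed.
    from datasets import Dataset

    dataset = Dataset.from_dict(records_to_columns(records))
    dataset.save_to_disk(out_dir)
    return dataset


if __name__ == "__main__":
    example_records = [
        {"text": "First long document ...", "label": 0},
        {"text": "Second long document ...", "label": 1},
    ]
    print(records_to_columns(example_records))
```

A dataset saved this way can later be reloaded with `datasets.load_from_disk`.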
22 files renamed without changes.