Replace the_od_bods/data/ with every pipeline run #2

Closed
1 of 2 tasks
KarenJewell opened this issue Oct 21, 2022 · 0 comments · Fixed by #3

KarenJewell commented Oct 21, 2022

Is your feature request related to a problem? Please describe.
When the opendata.scot_pipeline is run, the data/ folder in the the_od_bods repo isn't replaced. This means the folder goes out of date unless someone replaces it manually.
This causes issues when publishers or sources are removed from the pipeline refresh, because their old data files remain in the folder. It also causes compatibility issues on contributors' local machines, especially when structural changes to the data files are expected downstream.

Describe the solution you'd like

  • Add a cleanup step to the pipeline that nukes all existing CSVs before any of the scripts run
  • Also start committing to the the_od_bods repo on the weekly sync to keep it up to date
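
A minimal sketch of what the cleanup step could look like, assuming the pipeline scripts are Python and write their CSVs somewhere under a single data/ folder. The function name `remove_stale_csvs` and the recursive search are illustrative assumptions, not existing pipeline code.

```python
from __future__ import annotations

from pathlib import Path


def remove_stale_csvs(data_dir: Path) -> list[Path]:
    """Delete every .csv file under data_dir (recursively).

    Returns the paths that were removed so the caller can log them.
    Non-CSV files and the directory structure are left untouched.
    """
    # Nothing to clean if the folder does not exist yet (e.g. a fresh clone).
    if not data_dir.is_dir():
        return []

    removed = []
    for csv_path in sorted(data_dir.rglob("*.csv")):
        if csv_path.is_file():
            csv_path.unlink()
            removed.append(csv_path)
    return removed
```

Calling `remove_stale_csvs(Path("data"))` as the first pipeline step would clear the folder before the scripts repopulate it. The `data` path here is an assumption about where the pipeline is started from.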

Describe alternatives you've considered

  • Write all the CSVs to a temp directory that goes away after the pipeline run
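
For comparison, a rough sketch of the temp directory idea using Python's standard `tempfile` module. `write_csv` is a hypothetical stand-in for a pipeline script, and `run_in_temp_dir` is an illustrative name, not anything that exists in the repo.

```python
import csv
import tempfile
from pathlib import Path


def write_csv(out_dir: Path, name: str, rows) -> Path:
    """Stand-in for a pipeline script: write rows to out_dir/name."""
    path = out_dir / name
    with path.open("w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)
    return path


def run_in_temp_dir(rows):
    """Write a CSV to a temp directory, read it back, then let the directory vanish."""
    with tempfile.TemporaryDirectory() as tmp:
        path = write_csv(Path(tmp), "example.csv", rows)
        with path.open(newline="", encoding="utf-8") as f:
            loaded = list(csv.reader(f))
    # The directory and everything in it is deleted when the with-block exits.
    return loaded, path.exists()
```

The mechanism itself is small. The effort would be in repointing every script's output path at the temp directory and making sure all downstream steps finish before it is cleaned up.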

Additional context
Rejected writing to a temp directory because it’s a significant effort; I don’t think it’s worth it when we already have our eyes on issue OpenDataScotland/the_od_bods#163.

Originally discussed on Slack on 21 Oct 2022: https://opendatascotland.slack.com/archives/C02HEHDL8AY/p1666356294060289

@KarenJewell KarenJewell transferred this issue from OpenDataScotland/the_od_bods Jan 7, 2023
@KarenJewell KarenJewell linked a pull request Jan 7, 2023 that will close this issue
9 tasks
@KarenJewell KarenJewell moved this from Todo to In Progress in Christmas Sprint 2022 Jan 7, 2023
@KarenJewell KarenJewell moved this from In Progress to Waiting Review in Christmas Sprint 2022 Jan 7, 2023
@github-project-automation github-project-automation bot moved this from Backlog to Done in Open Data Scotland 2024 Jan 26, 2023
@github-project-automation github-project-automation bot moved this from Waiting Review to Done in Christmas Sprint 2022 Jan 26, 2023