2024-12-05 - Green Software Playbooks agenda #19
Comments
I'm going to be a few minutes late to the meeting.
Create a suggestion for: "Move archived data to appropriate storage (maybe cold storage is enough for some archived data)"
Will add a writeup on "minimizing the frequency of batch jobs" and determine how much this overlaps with "only load data where changes occurred, maybe think about event based triggers (only load delta, but also only start followup ETL processes, if some source data changed)"
Green Software Playbooks – Data Engineering

Improvements to existing projects

Move data to appropriate storage (hot vs. cold storage)

Analyse your existing data and move all rarely accessed data that still needs to be stored (e.g. for compliance or legal reasons) to cold storage.

Green IT Advantages: Moving data to cold storage reduces storage costs (though access costs increase). It also saves energy, because the servers on which the data is stored do not need to be constantly available.

Considerations during setup of a new project

Move data to appropriate storage (hot vs. cold storage)

Define a data strategy at the start of your project that specifies which storage option (hot vs. cold) should be used for which data.

Green IT Advantages: Moving data to cold storage reduces storage costs (though access costs increase). It also saves energy, because the servers on which the data is stored do not need to be constantly available.
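As a rough illustration of the hot vs. cold decision described above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Dataset` fields, the 90-day threshold for "rarely accessed", and the extra `"review"` outcome for rarely accessed data with no retention requirement are not part of the playbook entry.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical threshold: the playbook entry does not define "rarely accessed".
COLD_AFTER = timedelta(days=90)


@dataclass
class Dataset:
    name: str
    last_accessed: date
    must_retain: bool  # e.g. kept for compliance or legal reasons


def choose_tier(ds: Dataset, today: date) -> str:
    """Return 'hot', 'cold', or 'review' for a dataset."""
    rarely_accessed = (today - ds.last_accessed) > COLD_AFTER
    if not rarely_accessed:
        return "hot"
    # Rarely accessed data that still needs to be stored goes to cold storage.
    # Data with no retention requirement is only flagged for a manual decision
    # (an assumption of this sketch, not something the playbook prescribes).
    return "cold" if ds.must_retain else "review"


if __name__ == "__main__":
    today = date(2024, 12, 5)
    for ds in [
        Dataset("daily_sales", date(2024, 12, 1), must_retain=False),
        Dataset("audit_logs_2019", date(2021, 3, 14), must_retain=True),
    ]:
        print(ds.name, "->", choose_tier(ds, today))
```

In a real project the threshold and the tiers would come from the data strategy agreed for that project, and the actual move would be done with the storage provider's own lifecycle tooling.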
Green Software Playbooks – Data Engineering

Optimize the frequency of batch jobs

Batch processing handles large volumes of data at once, which makes efficient use of resources. Where possible, batch jobs can be run at times when the electricity is cleanest. Additionally, minimizing the number of batch jobs that need to be run reduces the overall amount of resources spent running them.

Improvements to existing projects

Start by examining how often these jobs run and at what times. Based on when your clients need the data, see if it's possible to run the jobs when the energy is cleanest. Depending on what the batch jobs collect, see if it's possible to reduce the number of times they need to run. In particular, determine whether the jobs can be set to run only when there are changes to pick up. More generally, look for other ways to optimize the batch processing.

Considerations during setup of a new project

When setting up a new project, make sure you have a clear understanding of your stakeholders' needs in receiving the data from the batch process. You can then determine how much flexibility you have in setting when these jobs run and how often they need to run. Running the jobs when the energy is cleanest, minimizing their frequency, and running them only when there are changes to pick up can reduce electricity consumption and the associated emissions.

Green IT Advantages: Optimizing your batch jobs has a number of advantages. At the very least, you can minimize the amount of electricity and hardware needed for the jobs. You can also minimize the greenhouse gas emissions associated with the jobs by running them when the electricity mix in the grid is cleanest.
I've added the entry above for "minimizing the frequency of batch jobs", although I've modified it slightly and rolled in "only load data where changes occurred, maybe think about event based triggers (only load delta, but also only start followup ETL processes, if some source data changed)". Let me know if this captures the topic well.
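The two ideas in this entry (run when the grid is cleanest, and run only when there is something new to pick up) could be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the hour-to-intensity forecast mapping, the version counters used to detect changes, and the numbers in the usage example are all hypothetical, and a real forecast would have to come from a grid data provider.

```python
from typing import Mapping


def pick_start_hour(forecast: Mapping[int, float], earliest: int, latest: int) -> int:
    """Return the hour in [earliest, latest] with the lowest forecast
    carbon intensity (gCO2e/kWh). The forecast maps hour of day to intensity."""
    candidates = {h: g for h, g in forecast.items() if earliest <= h <= latest}
    if not candidates:
        raise ValueError("no forecast data inside the allowed window")
    return min(candidates, key=candidates.get)


def should_run(last_loaded_version: int, current_source_version: int) -> bool:
    """Run the job only if the source has changed since the last load.
    Version counters are a stand-in for whatever change signal the source offers."""
    return current_source_version > last_loaded_version


if __name__ == "__main__":
    # Made-up intensities for illustration only.
    forecast = {0: 420.0, 3: 180.0, 9: 150.0, 14: 310.0}
    # Stakeholders need the data by 07:00, so the job may start between 00:00 and 06:00.
    if should_run(last_loaded_version=5, current_source_version=6):
        print("schedule job at hour", pick_start_hour(forecast, earliest=0, latest=6))
    else:
        print("no changes in source, skipping this run")
```

The allowed window is where the stakeholders' needs enter: the wider the window they can accept, the more freedom there is to pick a cleaner hour.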
Date
2024-12-05 - 15:00 UTC - See the time in your timezone https://everytimezone.com
Roll Call
Please add a comment to this issue during the meeting to denote attendance.
Any untracked attendees will be added by the GSF team below:
Previous Meeting
Notes from the previous meeting:
Agenda
Any Other Business