Description
We want to run this updating job as a Kubernetes CronJob, but our current docker-compose setup, which runs a local Postgres as a second service, complicates that. Rather than trying to stand up a second k8s job for that database, we should just refactor the level-2 branch to do all of the initial loading of the parsed data into the same remote AWS DB we use for the final data. That way we also don't have to rebuild from pg_dump every time. So the new steps would be something like (rough sketches of each step, plus the CronJob wiring, follow the list):
- load the parsed XML data directly into the AWS DB (maybe under a separate staging schema, though it could possibly all live in the same one)
- export the AWS DB tables to CSV files directly into the S3 bucket: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/postgresql-s3-export.html
- download the address CSV from S3, perform the geocoding, and upload the new CSV back to S3
- import the updated geocoded address CSV from S3 directly into the DB (we already do this)
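A minimal sketch of the first step, assuming psycopg2 and that the parsed XML records have already been flattened to a CSV. The `staging` schema, `addresses` table, column names, file path, and `DATABASE_URL` env var are all placeholders, not anything currently in the repo:

```python
import os

import psycopg2

# Connect to the remote AWS RDS instance; DATABASE_URL is an assumed env var,
# e.g. postgres://user:pass@our-instance.example.rds.amazonaws.com:5432/ourdb
conn = psycopg2.connect(os.environ["DATABASE_URL"])
with conn, conn.cursor() as cur:
    # Keep the raw parsed load separate from the final tables
    cur.execute("CREATE SCHEMA IF NOT EXISTS staging")
    cur.execute("""
        CREATE TABLE IF NOT EXISTS staging.addresses (
            id text PRIMARY KEY,
            street text,
            city text,
            state text,
            zip text
        )
    """)
    # COPY the flattened parsed rows straight into the remote table
    with open("parsed_addresses.csv") as f:
        cur.copy_expert(
            "COPY staging.addresses FROM STDIN WITH (FORMAT csv, HEADER true)", f
        )
conn.close()
```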
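For the second step, a sketch of the `aws_s3.query_export_to_s3` call from the linked AWS doc, wrapped in the same psycopg2 connection. The bucket name, key, and region are placeholders, the options string is passed through to Postgres `COPY`, and the RDS instance needs an IAM role that allows writing to the bucket (per the doc):

```python
import os

import psycopg2

conn = psycopg2.connect(os.environ["DATABASE_URL"])  # same assumed env var as above
with conn, conn.cursor() as cur:
    # The aws_s3 extension ships with RDS Postgres
    cur.execute("CREATE EXTENSION IF NOT EXISTS aws_s3 CASCADE")
    cur.execute(
        """
        SELECT * FROM aws_s3.query_export_to_s3(
            'SELECT * FROM staging.addresses',
            aws_commons.create_s3_uri('our-data-bucket', 'exports/addresses.csv', 'us-east-1'),
            options := 'format csv, header true'
        )
        """
    )
conn.close()
```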
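For the third step, a sketch assuming boto3 for the S3 transfers; `geocode()` is a stand-in for whatever geocoding call the job already makes, and the bucket, keys, and column names are placeholders matching the sketches above:

```python
import csv

import boto3

s3 = boto3.client("s3")

# Pull down the export from step 2
s3.download_file("our-data-bucket", "exports/addresses.csv", "/tmp/addresses.csv")


def geocode(street, city, state, zip_code):
    """Stand-in for the existing geocoding call."""
    return None, None  # replace with the real lat/lon lookup


with open("/tmp/addresses.csv", newline="") as src, \
        open("/tmp/addresses_geocoded.csv", "w", newline="") as dst:
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames + ["lat", "lon"])
    writer.writeheader()
    for row in reader:
        row["lat"], row["lon"] = geocode(row["street"], row["city"], row["state"], row["zip"])
        writer.writerow(row)

# Put the geocoded file back where step 4 imports it from
s3.upload_file("/tmp/addresses_geocoded.csv", "our-data-bucket", "exports/addresses_geocoded.csv")
```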
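And a sketch of the CronJob wiring itself, shown via the official kubernetes Python client (a plain YAML manifest would work just as well). The schedule, image, namespace, and secret name are placeholders; the DB credentials would come from a k8s Secret rather than docker-compose env vars:

```python
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when run from a pod

cronjob = client.V1CronJob(
    api_version="batch/v1",
    kind="CronJob",
    metadata=client.V1ObjectMeta(name="data-update"),
    spec=client.V1CronJobSpec(
        schedule="0 3 * * 0",  # placeholder: weekly, Sunday 03:00
        job_template=client.V1JobTemplateSpec(
            spec=client.V1JobSpec(
                template=client.V1PodTemplateSpec(
                    spec=client.V1PodSpec(
                        restart_policy="Never",
                        containers=[
                            client.V1Container(
                                name="updater",
                                image="our-registry/updater:latest",  # placeholder image
                                env_from=[
                                    client.V1EnvFromSource(
                                        secret_ref=client.V1SecretEnvSource(name="db-credentials")
                                    )
                                ],
                            )
                        ],
                    )
                )
            )
        ),
    ),
)
client.BatchV1Api().create_namespaced_cron_job(namespace="default", body=cronjob)
```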