Address ECCC version data in Object Storage #45

Open
franTarkenton opened this issue Nov 10, 2023 · 0 comments

Comments

@franTarkenton
Member

Background

The ECCC script pulls hourly data from the federal government's datamart, reformats it, and writes the resulting files to the following directory in the object storage bucket: RFC_DATA/ECCC/hourly/csv

The script currently runs every hour. Each run creates a new version of the files in object storage.

Task

Modify the ECCC code so that there can never be more than two versions. If more than two versions exist, the oldest ones are automatically deleted.

The best place to implement this is the upstream nr-objectstore-util lib. Add an argument to the put operations that defines the maximum number of versions to maintain. If the argument is not populated, the put does nothing extra and just creates a new version. If you specify an argument of version=2, it deletes any versions older than the 2 newest ones.
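A rough sketch of the pruning rule, just to pin down the behaviour. This is not the nr-objectstore-util API: `ObjectVersion`, `versions_to_delete` and the `max_versions` parameter are hypothetical names made up for illustration, and the sketch assumes the version listing for a single object is already in hand with a timestamp per version. Actually listing and deleting versions would go through whatever client the lib wraps.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class ObjectVersion:
    """Hypothetical stand-in for one entry in an object's version listing."""
    version_id: str
    last_modified: datetime


def versions_to_delete(
    versions: List[ObjectVersion],
    max_versions: Optional[int] = None,
) -> List[ObjectVersion]:
    """Return the versions that fall outside the newest `max_versions`.

    With max_versions=None nothing is pruned, which matches the
    "not populated" case described in the task.
    """
    if max_versions is None:
        return []
    if max_versions < 1:
        raise ValueError("max_versions must be at least 1")
    # Sort newest first, then everything past the cut-off is a candidate
    # for deletion.
    newest_first = sorted(versions, key=lambda v: v.last_modified, reverse=True)
    return newest_first[max_versions:]
```

The put operation would then call something like `versions_to_delete(listing, max_versions=2)` after writing the new version and delete each returned version id. Keeping the selection separate from the delete calls makes the rule easy to unit test without touching object storage.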

@franTarkenton franTarkenton moved this to In Progress in RFC Backlog Dec 5, 2023
@franTarkenton franTarkenton moved this from In Progress to Sprint Backlog in RFC Backlog Dec 5, 2023