Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[IBCDPE-946] Adds Optional Uploading #133

Merged
merged 6 commits into from
May 30, 2024
Merged

Conversation

BWMac
Copy link
Contributor

@BWMac BWMac commented May 29, 2024

Problem:

Every time the agora-data-tools processing pipeline is run, new output data files and GX report files are produced and uploaded to Synapse. This even happens when developers run the pipeline locally for testing. This causes an unnecessarily large number of files to be added to Synapse making it harder than it needs to be to find the output files from a particular run.

Solution:

Add an upload option to the CLI command and make it so that files are not uploaded by default. Additionally, add a platform argument to the CLI command so that the user can indicate whether their run is LOCAL or being executed on GITHUB or NEXTFLOW (Defaults to LOCAL).

The idea now is that when the pipeline is being run locally, it can be run exactly as before:

adt test_config.yaml

This command does not change at all because the default behavior is platform = LOCAL and upload = False

When we want file uploading to occur for testing purposes, we can make the following changes:

adt my_dev_config.yaml --upload

Note that I have provided a different configuration file this time. That is because in this instance I am a developer who needs to test the file uploading capabilities of the pipeline and I am providing a different configuration file with the destination and gx_folder fields pointing to a testing Synapse project that I have so that my test runs do not create new versions of the files in the Agora Synapse project.

Finally, for the Nextflow Tower execution, the command will look like (nf-agora PR):

adt {config} --upload --platform NEXTFLOW

and the GitHub Action command is now (this PR):

adt test_config.yaml --upload --platform GITHUB --token ${{secrets.SYNAPSE_PAT}}

Notes:

  • I considered using a destination argument where a developer could change the upload destination on Synapse without changing the configuration file, but this way is simpler to implement and sticks with our current paradigm where the configuration file dictates pipeline behavior.
  • Right now, the platform argument is only used to throw a warning if you have platform set to LOCAL and upload is True to remind users to not upload files from a local run unnecessarily. This will be used more extensively in IBCDPE-947.
  • Tests are updated/added as needed.
  • Documentation is updated as needed.
  • Apologies for the large formatting diff on CONTRIBUTING.md. The actual change I made is lines 78-88.

@BWMac BWMac marked this pull request as ready for review May 29, 2024 22:53
Copy link

sonarcloud bot commented May 29, 2024

Quality Gate Passed Quality Gate passed

Issues
9 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
9.6% Duplication on New Code

See analysis details on SonarCloud

Copy link
Contributor

@jaclynbeck-sage jaclynbeck-sage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I'm less good at knowing what to look for in the testing functions so maybe someone else can weigh in on that part of it.

Copy link
Member

@thomasyu888 thomasyu888 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 LGTM! Excellent - just to confirm, the upload parameter just limits to the GX reports? Or is it also the output files?

@BWMac
Copy link
Contributor Author

BWMac commented May 29, 2024

@thomasyu888 It prevents both from being uploaded.

@BWMac BWMac merged commit 05eaf61 into dev May 30, 2024
9 checks passed
@BWMac BWMac deleted the bwmac/run_linking_prototyping branch May 30, 2024 16:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants