General disclaimer: This repository was created for use by CDC programs to collaborate on public health related projects in support of the CDC mission. GitHub is not hosted by the CDC, but is a third party website used by CDC and its partners to share information and collaborate on software. CDC use of GitHub does not imply an endorsement of any one particular service, product, or enterprise.
The Public Health Data Infrastructure (PHDI) projects are part of the Pandemic-Ready Interoperability Modernization Effort (PRIME), a multi-year collaboration between CDC and the U.S. Digital Service (USDS) to strengthen data quality and information technology systems in state and local health departments. Under the PRIME umbrella, the PHDI project seeks to develop tools, often referred to as Building Blocks, that State, Tribal, Local, and Territorial public health agencies (STLTs) can use to better handle the public health data they receive. The purpose of this repository is to implement the Building Blocks developed from the PHDI SDK on Google Cloud Platform (GCP). This will allow users to easily begin using these Building Blocks in their own GCP environment. For more information on using this repository beyond what is contained in this document, please refer to our Getting Started doc.
To deploy this pipeline to your own Google Cloud environment, follow these steps.
Be sure to replace all instances of myuser in GitHub URLs with your user or organization name.
- Install the GitHub CLI (optional)
- Fork this repository into your personal or organization account
- Clone your newly forked repository to your local machine by running:
  git clone https://github.com/myuser/phdi-google-cloud.git
- Navigate to the new repository directory with:
  cd phdi-google-cloud
- Authenticate the gcloud CLI by running:
  for Unix-based systems
  ./quick-start.sh
  for Windows-based systems
  quick-start.ps1
- Follow these steps to set the secrets output by the previous step in your repository.
- Set up a storage bucket for Terraform state by running the GitHub Action at this URL:
  https://github.com/myuser/phdi-google-cloud/actions/workflows/terraformSetup.yaml
- Create an environment named dev in your repository at this URL:
  https://github.com/myuser/phdi-google-cloud/settings/environments/new
- Deploy to your newly created dev environment by running the GitHub Action at this URL, selecting dev as the environment input:
  https://github.com/myuser/phdi-google-cloud/actions/workflows/deployment.yaml
Success! You should now see resources in your GCP project ready for data ingestion.
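The deployment steps reference several URLs that contain the placeholder myuser. As a convenience, the minimal Python sketch below prints those URLs with the placeholder swapped for your own account. The function name `personalize_urls` and the account name `my-health-dept` are hypothetical and are not part of this repository.

```python
# URLs copied from the deployment steps; "myuser" is the placeholder to replace.
TEMPLATE_URLS = [
    "https://github.com/myuser/phdi-google-cloud.git",
    "https://github.com/myuser/phdi-google-cloud/actions/workflows/terraformSetup.yaml",
    "https://github.com/myuser/phdi-google-cloud/settings/environments/new",
    "https://github.com/myuser/phdi-google-cloud/actions/workflows/deployment.yaml",
]


def personalize_urls(account: str, urls=TEMPLATE_URLS) -> list[str]:
    """Replace the "myuser" path segment with a GitHub user or organization name."""
    if not account or "/" in account:
        raise ValueError("account must be a non-empty GitHub user or organization name")
    return [url.replace("/myuser/", f"/{account}/") for url in urls]


if __name__ == "__main__":
    # "my-health-dept" is a made-up account name used only for illustration.
    for url in personalize_urls("my-health-dept"):
        print(url)
```

Matching on `/myuser/` with the surrounding slashes keeps the replacement limited to the account segment of each URL.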
This repository has four major components.
The PHDI Building Blocks are implemented as Google Cloud Functions. Google Cloud Functions are GCP's version of serverless functions, similar to Lambda in Amazon Web Services (AWS) and Azure Functions in Microsoft Azure. Serverless functions provide a relatively simple way to run services with modest runtime duration, memory, and compute requirements in the cloud. Since they are serverless, GCP abstracts all aspects of the underlying infrastructure, allowing us to simply write and execute our Building Blocks without worrying about the computers they run on. The cloud-functions directory contains Python source code for Google Cloud Functions that implement Building Blocks from the PHDI SDK.
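To make the idea of a Building Block running as a Cloud Function more concrete, here is a minimal sketch of an HTTP-triggered function. It assumes the Flask-style request object that GCP's Python runtime passes to HTTP functions. The `standardize_name` logic, the `http_handler` name, and the `FakeRequest` helper are hypothetical illustrations, not code from this repository or from the PHDI SDK.

```python
import json


def standardize_name(name: str) -> str:
    """Hypothetical stand-in for a Building Block: collapse whitespace and upper-case a name."""
    return " ".join(name.split()).upper()


def http_handler(request):
    """HTTP entry point taking a Flask-style request.

    Returns a (body, status code, headers) tuple, one of the response
    shapes Flask (and therefore the Cloud Functions runtime) accepts.
    """
    headers = {"Content-Type": "application/json"}
    payload = request.get_json(silent=True)  # None when the body is not valid JSON
    if not isinstance(payload, dict) or not isinstance(payload.get("name"), str):
        body = {"error": "Request body must be JSON with a string 'name' field."}
        return json.dumps(body), 400, headers
    return json.dumps({"name": standardize_name(payload["name"])}), 200, headers


class FakeRequest:
    """Minimal local stand-in for the Flask request, for trying the handler without deploying."""

    def __init__(self, payload):
        self._payload = payload

    def get_json(self, silent=False):
        return self._payload


if __name__ == "__main__":
    print(http_handler(FakeRequest({"name": "  jane   doe "})))
```

Keeping the transformation in a plain function separate from the HTTP handler makes it possible to exercise the logic locally, as `FakeRequest` does here, before anything is deployed.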
Since the Building Blocks are designed to be composable, users may want to chain several together into pipelines. We use the Google Workflow resource to define processes that require the use of multiple Building Blocks. These workflows are defined using YAML configuration files found in the workflows directory.
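The workflows in this repository are written in YAML for Google Workflows. The Python below is only a local analogy, not Workflows syntax, meant to illustrate what chaining composable Building Blocks means: each step receives the previous step's output. Both step functions and `run_pipeline` are hypothetical.

```python
from functools import reduce
from typing import Callable

Record = dict
Step = Callable[[Record], Record]


def trim_fields(record: Record) -> Record:
    """Hypothetical step: strip surrounding whitespace from string values."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}


def upper_case_state(record: Record) -> Record:
    """Hypothetical step: upper-case the 'state' field when it is a string."""
    if isinstance(record.get("state"), str):
        return {**record, "state": record["state"].upper()}
    return record


def run_pipeline(steps: list[Step], record: Record) -> Record:
    """Apply each step in order, feeding every step the previous step's output."""
    return reduce(lambda current, step: step(current), steps, record)


if __name__ == "__main__":
    print(run_pipeline([trim_fields, upper_case_state], {"name": " Jane ", "state": " ca "}))
```

Because every step has the same record-in, record-out shape, steps can be reordered or reused across pipelines, which is the property that makes composition practical.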
Every resource required to use the Building Blocks and pipelines implemented in this repository is defined using Terraform. This makes it simple for users to deploy all of the functionality provided in this repository to their own GCP environments. The Terraform code can be found in the terraform directory.
To ensure high code quality and reliability, we have implemented a Continuous Integration (CI) pipeline consisting of a suite of tests that all new contributions must pass before they are merged into main. We have also built a Continuous Deployment (CD) pipeline that automatically deploys the code in the repository to linked GCP environments when changes are made. The combined CI/CD pipeline is implemented with GitHub Actions in the .github directory.
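As an illustration of the kind of check a CI suite can run on each contribution, the sketch below uses Python's built-in unittest module against a hypothetical helper. The helper `is_valid_zip` and the test names are assumptions made for this example; the repository's actual tests and test runner are not described in this document.

```python
import unittest


def is_valid_zip(value: str) -> bool:
    """Hypothetical helper under test: accept 5-digit US ZIP codes only."""
    return len(value) == 5 and value.isascii() and value.isdigit()


class IsValidZipTest(unittest.TestCase):
    def test_accepts_five_digits(self):
        self.assertTrue(is_valid_zip("30333"))

    def test_rejects_wrong_length_and_letters(self):
        self.assertFalse(is_valid_zip("3033"))
        self.assertFalse(is_valid_zip("3033A"))


if __name__ == "__main__":
    unittest.main()
```

A CI job would typically run a suite like this on every pull request and block the merge if any test fails.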
Target users of this system include:
- Public Health Departments
  - Epidemiologists who rely on health data to take regular actions
  - Senior stakeholders who make executive decisions using aggregate health data
  - IT teams who have to support epidemiologists and external stakeholders integrating with the PHD
  - PHDs may include state, county, city, and tribal organizations
- CDC
  - Employees and contractors working on CDC projects with access to a GCP environment and interest in using PHDI Building Blocks
- CDC GitHub Open Project Request Form [Requires a CDC Office365 login. If you do not have a CDC Office365 login, please ask a friend who does to submit the request on your behalf. If you're looking for access to the CDCEnt private organization, please use the GitHub Enterprise Cloud Access Request form.]
- Open Practices
- Rules of Behavior
- Thanks and Acknowledgements
- Disclaimer
- Contribution Notice
- Code of Conduct
This repository constitutes a work of the United States Government and is not subject to domestic copyright protection under 17 USC § 105. This repository is in the public domain within the United States, and copyright and related rights in the work worldwide are waived through the CC0 1.0 Universal public domain dedication. All contributions to this repository will be released under the CC0 dedication. By submitting a pull request you are agreeing to comply with this waiver of copyright interest.
This project is in the public domain within the United States, and copyright and related rights in the work worldwide are waived through the CC0 1.0 Universal public domain dedication. All contributions to this project will be released under the CC0 dedication. By submitting a pull request or issue, you are agreeing to comply with this waiver of copyright interest and acknowledge that you have no expectation of payment, unless pursuant to an existing contract or agreement.
This repository contains only non-sensitive, publicly available data and information. All material and community participation is covered by the Disclaimer and Code of Conduct. For more information about CDC's privacy policy, please visit http://www.cdc.gov/other/privacy.html.
Anyone is encouraged to contribute to the repository by forking and submitting a pull request. (If you are new to GitHub, you might start with a basic tutorial.) By contributing to this project, you grant a world-wide, royalty-free, perpetual, irrevocable, non-exclusive, transferable license to all users under the terms of the Apache Software License v2 or later.
All comments, messages, pull requests, and other submissions received through CDC including this GitHub page may be subject to applicable federal law, including but not limited to the Federal Records Act, and may be archived. Learn more at http://www.cdc.gov/other/privacy.html.
This repository is not a source of government records, but is a copy to increase collaboration and collaborative potential. All government records will be published through the CDC web site.
Please refer to CDC's Template Repository for more information about contributing to this repository, public domain notices and disclaimers, and code of conduct.