The htsget-lambda crate is a cloud-based implementation of htsget-rs. It uses AWS Lambda as the ticket server, and AWS S3 as the data block server.
This is an example that deploys htsget-lambda using aws-cdk. It is deployed as an AWS HTTP API Gateway Lambda proxy integration. The stack uses RustFunction in order to integrate htsget-lambda with API Gateway. It also has the option to use a JWT authorizer with AWS Cognito as the issuer. The JWT authorizer automatically verifies JWT tokens issued by Cognito. Routing for the server is done using AWS Route 53.
The CDK code in this directory constructs a CDK app from HtsgetLambdaStack, and uses a settings file under bin/settings.ts. To configure the deployment, change these settings in bin/settings.ts:
These are general settings for the CDK deployment.
| Name | Description | Type |
|---|---|---|
| config | The location of the htsget-rs server config file. This must be specified. See htsget-config for a list of available server configuration options. | string |
| domain | The domain name for the Route 53 hosted zone that the htsget-rs server will be under. This must be specified. A hosted zone with this name is either looked up or created, depending on the value of lookupHostedZone?. | string |
|  | Deployment options related to the authorizer. This option allows specifying an AWS JWT authorizer, which automatically verifies tokens issued by a Cognito user pool. | HtsgetJwtAuthSettings |
| subDomain? | The domain name prefix to use for the htsget-rs server. Together with domain, this specifies the URL that the htsget-rs server will be reachable under. Defaults to "htsget". | string |
| s3BucketResources | The buckets to serve data from. Defaults to [] if not specified. This determines which buckets the policy actions ["s3:List*", "s3:Get*"] are allowed on. This option does not create buckets, it only grants permission to access them; see the createS3Bucket option. It must be specified to allow htsget-rs to access data in buckets that are not created in this stack. | string[] |
| lookupHostedZone? | Whether to look up the hosted zone with the domain name. Defaults to true. If true, attempts to look up an existing hosted zone using the domain name. Set this to false to create a new hosted zone with the domain name. | boolean |
| createS3Bucket? | Whether to create a test bucket. Defaults to true. Buckets are created with RemovalPolicy.RETAIN. The correct access permissions are automatically added. | boolean |
| bucketName? | The name of the bucket created using createS3Bucket. Defaults to an automatically generated CDK name; use this option to override it. This option only has an effect if createS3Bucket is true. | string |
| copyTestData? | Whether to copy test data into the bucket. Defaults to true. This copies the example data under the data directory to the bucket. This option only has an effect if createS3Bucket is true. | boolean |
| copyExampleKeys? | Whether to create secrets corresponding to C4GH public and private keys that can be used with C4GH storage. This copies the private and public keys in the data directory. Note that private keys copied here are visible in the CDK template. This is not considered secure and should only be used for test data. Real secrets should be manually provisioned or created outside the CDK template. Defaults to false. Secrets are created with RemovalPolicy.RETAIN. | boolean |
| secretArns? | The Secrets Manager secrets which htsget-rs needs access to. Secrets specified here are added as resources in the Lambda role's policy statement for the secretsmanager:GetSecretValue action. Permissions are automatically added if copyExampleKeys is specified, even if this option is set to []. | string[] |
| features? | Additional features to compile htsget-rs with. Defaults to []. s3-storage is always enabled. | string[] |
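To make the interplay of these options concrete, here is a minimal sketch of a settings object with the documented defaults filled in. ExampleSettings and withDocumentedDefaults are hypothetical names written for this illustration, not the types or helpers exported by the stack; the authorizer entry is left out, and the domain, config path, and bucket entry are placeholders. Whether bucket entries are names or ARNs is not stated above, so a plain placeholder string is used.

```typescript
// Hypothetical, trimmed-down mirror of the options in the table above.
// The authorizer entry is left out; the real type is defined by the stack.
interface ExampleSettings {
  config: string;
  domain: string;
  subDomain?: string;
  s3BucketResources?: string[];
  lookupHostedZone?: boolean;
  createS3Bucket?: boolean;
  bucketName?: string;
  copyTestData?: boolean;
  copyExampleKeys?: boolean;
  secretArns?: string[];
  features?: string[];
}

// Apply the defaults documented in the table. bucketName and secretArns
// have no documented literal default, so they are passed through unchanged.
function withDocumentedDefaults(s: ExampleSettings) {
  return {
    ...s,
    subDomain: s.subDomain ?? "htsget",
    s3BucketResources: s.s3BucketResources ?? [],
    lookupHostedZone: s.lookupHostedZone ?? true,
    createS3Bucket: s.createS3Bucket ?? true,
    copyTestData: s.copyTestData ?? true,
    copyExampleKeys: s.copyExampleKeys ?? false,
    features: s.features ?? [],
  };
}

// Placeholder values: serve data from an existing bucket instead of
// creating a test bucket in the stack.
const exampleSettings = withDocumentedDefaults({
  config: "config/public_umccr.toml",
  domain: "example.com",
  s3BucketResources: ["my-existing-bucket"],
  createS3Bucket: false,
});
```

Since copyTestData and bucketName only take effect when createS3Bucket is true, they can be left unset in a configuration like this one.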
These settings are used to determine if the htsget API gateway endpoint is configured to have a JWT authorizer or not.
| Name | Description | Type |
|---|---|---|
| public | Whether this deployment is public. If this is true then no authorizer is present on the API gateway and the options below have no effect. | boolean |
| jwtAudience? | A list of the intended recipients of the JWT. A valid JWT must provide an aud that matches at least one entry in this list. | string[] |
| cogUserPoolId? | The Cognito user pool ID for the authorizer. If this is not set, then a new user pool is created. No user pool is created if public is true. | string |
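The rules in this table can be sketched as a small decision function. This is an illustration of the documented behaviour only, not the stack's code: ExampleJwtAuthSettings, planAuthorizer, and audienceMatches are hypothetical names, and the actual token verification is performed by the API Gateway JWT authorizer rather than by code like this.

```typescript
// Hypothetical shape mirroring the three options in the table above.
interface ExampleJwtAuthSettings {
  public: boolean;
  jwtAudience?: string[];
  cogUserPoolId?: string;
}

type AuthPlan =
  | { authorizer: false }
  | { authorizer: true; userPool: "existing" | "new"; audience: string[] };

// Decide what the deployment would need, following the table:
// public => no authorizer and no user pool; otherwise reuse the given
// Cognito user pool, or create a new one when no ID is set.
function planAuthorizer(s: ExampleJwtAuthSettings): AuthPlan {
  if (s.public) {
    return { authorizer: false };
  }
  return {
    authorizer: true,
    userPool: s.cogUserPoolId === undefined ? "new" : "existing",
    audience: s.jwtAudience ?? [],
  };
}

// A JWT is acceptable only if its aud claim matches at least one
// configured entry. The aud claim may be a single string or a list.
function audienceMatches(aud: string | string[], allowed: string[]): boolean {
  const claims = Array.isArray(aud) ? aud : [aud];
  return claims.some((a) => allowed.includes(a));
}
```

For example, planAuthorizer({ public: false }) describes a private deployment for which a new user pool would be created.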
The HtsgetSettings are passed into HtsgetLambdaStack in order to change the deployment config. An example of a public instance deployment can be found under bin/htsget-lambda.ts. This uses the config/public_umccr.toml server config. See htsget-config for a list of available server configuration options.
- aws-cli should be installed and authenticated in the shell.
- Node.js and npm should be installed.
- Rust should be installed.
- Zig should be installed. Zig can be installed by running cargo lambda build at least once.
After installing the basic dependencies, complete the following steps:
- Log in to AWS and define the CDK_DEFAULT_* env variables (if not defined already). You must be authenticated with your AWS cloud to run this step.
- Install cargo-lambda, as it is used to compile artifacts that are uploaded to AWS Lambda.
- Define which configuration to use for htsget-rs as stated in the configuration section.
Below is a summary of commands to run in this directory:
cargo install cargo-lambda
## Install zig if not already installed.
#cd .. && cargo lambda build && cd deploy
export CDK_DEFAULT_ACCOUNT=$(aws sts get-caller-identity --query Account --output text)
export CDK_DEFAULT_REGION=$(aws configure get region)
npm install
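
Before deploying, it can help to confirm that the two CDK_DEFAULT_* variables exported above are actually set. The following is a minimal sketch of such a check; missingCdkEnv is a hypothetical helper written for this example and is not part of the deployment code.

```typescript
// The two variables exported in the commands above.
const REQUIRED_CDK_ENV = ["CDK_DEFAULT_ACCOUNT", "CDK_DEFAULT_REGION"] as const;

// Return the names of required variables that are unset or empty.
// The environment is passed in explicitly, e.g. missingCdkEnv(process.env)
// when running under Node.js.
function missingCdkEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_CDK_ENV.filter((name) => !env[name]);
}
```

An empty result means both variables are defined. An empty string counts as missing, which covers the case where aws configure get region prints nothing.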
Important
The default deployment is designed to work out of the box. A bucket with a CDK-generated name is created with test data from the data directory. All deployment settings can be tweaked using settings.ts.
The only option that must be specified is the domain, which determines the domain name to serve htsget-rs at.
CDK should be bootstrapped once, if this hasn't been done before:
npx cdk bootstrap
Then to deploy the stack, run:
npx cdk deploy
Warning
By default this deployment will create a public instance of htsget-rs. Anyone will be able to query the server without authorizing unless you modify the HtsgetJwtAuthSettings settings.
When the deployment is finished, the htsget endpoint can be tested by querying it. If a JWT authorizer is configured, a valid JWT must be obtained in order to access the endpoint. This token should be obtained from AWS Cognito using the configured audience parameters. Then curl can be used to query the endpoint:
curl -H "Authorization: <JWT Token>" "https://<htsget_domain>/reads/service-info"
With a possible output:
{
  "id": "",
  "name": "",
  "version": "",
  "organization": {
    "name": "",
    "url": ""
  },
  "type": {
    "group": "",
    "artifact": "",
    "version": ""
  },
  "htsget": {
    "datatype": "reads",
    "formats": ["BAM", "CRAM"],
    "fieldsParametersEffective": false,
    "tagsParametersEffective": false
  },
  "contactUrl": "",
  "documentationUrl": "",
  "createdAt": "",
  "updatedAt": "",
  "environment": ""
}
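The same query can be prepared programmatically. The sketch below builds the URL and headers matching the curl command above and performs a light structural check on the response body. serviceInfoRequest and looksLikeServiceInfo are hypothetical helpers written for this example; the host is a placeholder, and the "variants" endpoint comes from the htsget protocol rather than from this document.

```typescript
interface ServiceInfoRequest {
  url: string;
  headers: Record<string, string>;
}

// Build the request for the service-info endpoint. The token is sent as the
// raw Authorization header value, as in the curl example; omit it for a
// public deployment.
function serviceInfoRequest(
  host: string,
  endpoint: "reads" | "variants",
  jwt?: string,
): ServiceInfoRequest {
  const headers: Record<string, string> = {};
  if (jwt !== undefined) {
    headers["Authorization"] = jwt;
  }
  return { url: `https://${host}/${endpoint}/service-info`, headers };
}

// Check that a parsed response has an htsget section with the expected
// datatype and a list of formats. This is not a full schema validation.
function looksLikeServiceInfo(body: unknown, datatype: string): boolean {
  if (typeof body !== "object" || body === null) return false;
  const htsget = (body as { htsget?: unknown }).htsget;
  if (typeof htsget !== "object" || htsget === null) return false;
  const section = htsget as { datatype?: unknown; formats?: unknown };
  return section.datatype === datatype && Array.isArray(section.formats);
}
```

With a runtime that provides fetch, the result can be used as fetch(request.url, { headers: request.headers }), and the parsed JSON passed to looksLikeServiceInfo.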
The Lambda function can also be run locally using cargo-lambda. From the root project directory, execute the following command:
cargo lambda watch
Then, in a separate terminal session, run:
cargo lambda invoke htsget-lambda --data-file data/events/event_get.json
Examples of different Lambda events are located in the data/events directory.
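For orientation, the following sketch builds a minimal GET event in the API Gateway HTTP API payload format (version 2.0). It is an assumption that this format is the relevant one here, based on the deployment being an HTTP API Lambda proxy integration; the files in data/events may use a different layout or carry more fields, and a real event contains more request context than shown. ExampleHttpApiEvent and exampleGetEvent are hypothetical names.

```typescript
// Minimal subset of the API Gateway HTTP API payload (format version 2.0).
interface ExampleHttpApiEvent {
  version: "2.0";
  routeKey: string;
  rawPath: string;
  rawQueryString: string;
  headers: Record<string, string>;
  queryStringParameters?: Record<string, string>;
  requestContext: {
    http: { method: string; path: string };
    routeKey: string;
    stage: string;
  };
  isBase64Encoded: boolean;
}

// Build a GET event for the given path and optional query parameters.
function exampleGetEvent(
  path: string,
  query: Record<string, string> = {},
): ExampleHttpApiEvent {
  const rawQueryString = Object.entries(query)
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
  const built: ExampleHttpApiEvent = {
    version: "2.0",
    routeKey: "$default",
    rawPath: path,
    rawQueryString,
    headers: {},
    requestContext: {
      http: { method: "GET", path },
      routeKey: "$default",
      stage: "$default",
    },
    isBase64Encoded: false,
  };
  // Only include the parsed parameters when a query string is present.
  if (rawQueryString !== "") {
    built.queryStringParameters = query;
  }
  return built;
}

const serviceInfoEvent = exampleGetEvent("/reads/service-info");
```

Serialized with JSON.stringify and saved to a file, an object like serviceInfoEvent could be passed to cargo lambda invoke through --data-file, in the same way as the provided event files.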
There are example deployments using Docker under the examples directory. These include a LocalStorage deployment and a MinIO deployment.