diff --git a/content/docs/reference/best-practices/terraform-best-practices.md b/content/docs/reference/best-practices/terraform-best-practices.md index c2eb49fdb..a3b7dc1ad 100644 --- a/content/docs/reference/best-practices/terraform-best-practices.md +++ b/content/docs/reference/best-practices/terraform-best-practices.md @@ -17,9 +17,232 @@ and borrow on the many other helpful resources like those by [HashiCorp](https:/ See our general [Best Practices](/category/best-practices/) which also apply to Terraform. -# Language +## Variables -## Use indented `HEREDOC` syntax +### Use upstream module or provider variable names where applicable + +When writing a module that accepts `variable` inputs, make sure to use the same names as the upstream to avoid confusion and ambiguity. + +### Use all lower-case with underscores as separators + +Avoid introducing any other syntaxes commonly found in other languages such as CamelCase or pascalCase. For consistency we want all variables +to look uniform. This is also inline with the [HashiCorp naming conventions](https://www.terraform.io/docs/extend/best-practices/naming.html). + +### Use positive variable names to avoid double negatives + +All `variable` inputs that enable/disable a setting should be formatted `...._enabled` (e.g. `encryption_enabled`). It is acceptable for +default values to be either `false` or `true`. + +### Use description field for all inputs + +All `variable` inputs need a `description` field. When the field is provided by an upstream provider (e.g. `terraform-aws-provider`), use same wording as the upstream docs. + +### Use sane defaults where applicable + +Modules should be as turnkey as possible. The `default` value should ensure the most secure configuration (E.g. with encryption enabled). + +### Use `nullable = false` where appropriate + +When passing an argument to a resource, passing `null` means to use the default +value. Prior to Terraform version 1.1.0, passing `null` to a module input +set that value to `null` rather than to the default value. + +Starting with Terraform version 1.1.0, variables can be declared as +`nullable = false` which: + +1. Prevents the variable from being set to `null`. +2. Causes the variable to be set to the default value if `null` is passed in. + +You should always use `nullable = false` for all variables which should never +be set to `null`. This is particularly important for lists, +maps, and objects, which, if not required, should default to empty values (i. +e. `{}` or `[]`) rather than `null`. It can also be useful to set strings +to default to `""` rather than `null` and set `nullable = false`. This will +simplify the code since it can count on the variable having a non-null value. + +The default `nullable = true` never needs to be explicitly set. Leave +variables with the default `nullable = true` if a null value is acceptable. + +### Use feature flags, list, or map inputs for optional functionality + +All Cloud Posse modules should respect the [null-label](https://github.com/cloudposse/terraform-null-label) +`enabled` feature flag, and when `enabled` is `false`, create no resources +and generate null outputs (or, in the case of output lists, maps, and objects, +empty values may be acceptable to avoid having other modules consuming the +outputs fail due to having a null rather than empty value). + +Optional functionality should be toggled in either of 2 ways: + +1. Use of a feature flag. Specifically, an input variable of type + `bool` with a name ending in `_enabled`. Use this mechanism if the + option requires no further configuration, e.g. `iam_role_enabled` or + `s3_bucket_enabled`. Feature flags should always be `nullable = false`, + but the default value can be `true` or `false` as desired. +2. If an optional feature requires further configuration, use a `list` or `map` + input variable, with an empty input disabling the option and non-empty + input providing configuration. In this case, only use a separate feature + flag if the list or map input may still cause problems due to relying on + computed problems during the plan phase. See [Count vs. For Each](/reference/terraform-in-depth/terraform-count-vs-for-each) + and [Terraform Errors When Planning](reference/terraform-in-depth/terraform-unknown-at-plan-time) + for more information. +3. It is never acceptable for an optional feature of the _module_ to be + toggled by the value of a string or number input variable, due to the + issues explained in [Terraform Errors When Planning](reference/terraform-in-depth/terraform-unknown-at-plan-time). + However, if an optional feature of a _resource_ may be toggled by such an + input if that is the behavior of the resource and the input has the same + name as the resource argument. + +### Use objects with optional fields for complex inputs + +When a module requires a complex input, use an object with optional fields. +This provides documentation and plan-time validation while avoiding type +conversion errors, and allows for future expansion without breaking changes. +Make as many fields as possible optional, provide defaults at every level of +nesting, and use `nullable = false` if possible. + +Reserve `type = any` for exceptional cases where the input is highly +variable and/or complex, and the module is designed to handle it. For +example, the configuration of a [Datadog synthetic test](https://registry.terraform.io/providers/DataDog/datadog/latest/docs/resources/synthetics_test) +is both highly complex and [the Cloud Posse module](https://github.com/cloudposse/terraform-datadog-platform/tree/main/modules/synthetics) +accepts both an object derived from the `synthetics_test` resource schema or +an object derived from the JSON output of the Datadog API. In this rare case, +attempting to maintain a type definition would not only be overly complex, +it would slow the adoption of future additions to the interface, and so +`type = any` is appropriate. + +### Prefer a single object over multiple simple inputs for related configuration + +When reviewing Cloud Posse modules as examples, you may notice that they +often use a large number of input variables of simple types. This is because +in the early development of Terraform, there was no good way to define +complex objects with defaults. However, now that Terraform supports complex +objects with field-level defaults, we recommend using a single object input +variable with such defaults to group related configuration. This makes the +interface easier to understand and use. + +For example, prefer: + +```hcl +variable "eip_timeouts" { + type = object({ + create = optional(string) + update = optional(string) + delete = optional(string, "30m") + })) + default = {} + nullable = false +} +``` + +rather than: + +```hcl +variable "eip_create_timeout" { + type = string + default = null +} +variable "eip_update_timeout" { + type = string + default = null +} +variable "eip_delete_timeout" { + type = string + default = "30m" +} +``` + +### Use custom validators to enforce custom constraints + +Use the `validation` block to enforce custom constraints on input variables. +A custom constraint is one that, if violated, would **_not_** otherwise +cause an error, but would cause the module to behave in an unexpected way. + +For example, if the module takes an optional IPv6 CIDR block, you might +receive that in a variable of `list(string)` for reasons explained +[here](#use-feature-flags-list-or-map-inputs-for-optional-functionality). +Use a custom validator to ensure that the list has at most one element, because +if it has more than one, the module will ignore the extra elements, while a +reasonable person might expect them to be used in some way. At the same time, is it not necessary to use a custom validator to enforce +that the input is a valid IPv6 CIDR block, because Terraform will already +do that for you. + +This is perhaps better illustrated by an example of a pseudo-enumeration. +Terraform does not support real enumerations, so they are typically +implemented as strings with a limited number of acceptable values. For example, +Say you have a resource that takes a `frequency` input that is a string that +must be either `DAILY` or `WEEKLY`. Even though the value of the string is very +restricted, you should not use a custom validator to enforce that for two reasons. + +1. Terraform (technically, the Terraform resource provider) will already + enforce that the string is one of those two values and produce an + informative error message if it is not. +2. If you use a custom validator to enforce that the string is one of those + two values, and then a later version of the resource adds a new option to the + enumeration, such as "HOURLY", the custom validator will prevent the + module from using the new value, even though the module woudl function + perfectly were it not for the validator. This adds work and delay to the + adoption of underlying enhancements to the resource, without providing + enough benefit to be worth the extra effort. + +### Use variables for all secrets with no `default` value and mark them "sensitive" + +All `variable` inputs for secrets must never define a `default` value. This ensures that `terraform` is able to validate user input. +The exception to this is if the secret is optional and will be generated for the user automatically when left `null` or `""` (empty). + +Use `sensitive = true` to mark all secret variables as sensitive. This ensures that the value is not printed to the console. + +## Outputs + +### Use description field for all outputs + +All outputs must have a `description` set. The `description` should be based on (or adapted from) the upstream terraform provider where applicable. +Avoid simply repeating the variable name as the output `description`. + +### Use well-formatted snake case output names + +Avoid introducing any other syntaxes commonly found in other languages such as CamelCase or pascalCase. For consistency, we want all variables +to look uniform. It also makes code more consistent when using outputs together with terraform [`remote_state`](https://www.terraform.io/docs/providers/terraform/d/remote_state.html) to access those settings from across modules. + +### Never output secrets + +Secrets should never be outputs of modules. Rather, they should be written to secure storage such as AWS Secrets Manager, AWS SSM Parameter Store with KMS encryption, or S3 with KMS encryption at rest. Our preferred mechanism on AWS is using SSM Parameter Store. Values written to SSM are easily +retrieved by other terraform modules, or even on the command-line using tools like [chamber](https://github.com/segmentio/chamber) by Segment.io. + +We are very strict about this in our components (a.k.a root modules), the +top-most module, because these sensitive outputs are easily leaked in CI/CD +pipelines (see [`tfmask`](https://github.com/cloudposse/tfmask) for masking +secrets in output only as a last resort). We are less sensitive to this in +modules that are typically nested inside of other modules. + +Rather than outputting a secret, you may output plain text indicating where +the secret is stored, for example `RDS master password is in SSM parameter +/rds/master_password`. You may also want to have another output just for the +key for the secret in the secret store, so the key is available to other +programs which may be able to retrieve the value given the key. + +:::warning +Regardless of whether the secret is output or not, the fact that a secret is +known to Terraform means that its value is stored in plaintext in the Terraform +state file. Storing values in SSM Parameter Store or other places does not +solve this problem. Even if you store the value in a secure place using some +method other than Terraform, if you read it into Terraform, it will be +stored in plaintext in the state file. + +To keep the secret out of the state file, you must both store and retrieve +the secret outside of Terraform. This is a limitation of Terraform that [has +been discussed](https://github.com/hashicorp/terraform/issues/516) +practically since Terraform's inception, so a near-term solution is unlikely. +::: + +### Use symmetrical names + +We prefer to keep terraform outputs symmetrical as much as possible with the upstream resource or module, with exception of prefixes. This reduces the amount of entropy in the code or possible ambiguity, while increasing consistency. Below is an example of what **not* to do. The expected output name is `user_secret_access_key`. This is because the other IAM user outputs in the upstream module are prefixed with `user_`, and then we should borrow the upstream's output name of `secret_access_key` to become `user_secret_access_key` for consistency. + +![Terraform outputs should be symmetrical](/assets/terraform-outputs-should-be-symmetrical.png) + +## Language + +### Use indented `HEREDOC` syntax Using `<<-EOT` (as opposed to `<0.12.26` or `>= 0.13, < 0.15`. Always use `>=` and enforce Terraform versions in your environment by controlling which CLI you use. -## Use encrypted S3 bucket with versioning, encryption and strict IAM policies +### Use encrypted S3 bucket with versioning, encryption and strict IAM policies -We recommend not commingling state in the same bucket. This could cause the state to get overridden or compromised. Note, the state contains cached values of all outputs. Wherever possible, keep stages 100% isolated with physical barriers (separate buckets, separate organizations) +We recommend not commingling state in the same bucket as other data. This could +cause the state to get overridden or compromised. Note, the state contains +cached values of all outputs. Consider isolating sensitive areas like +production configuration and audit trails (separate buckets, separate +organizations). **Pro Tip:** Using the [`terraform-aws-tfstate-backend`](https://github.com/cloudposse/terraform-aws-tfstate-backend) to easily provision buckets for each stage. -## Use Versioning on State Bucket +### Use Versioning on State Bucket -## Use Encryption at Rest on State Bucket +### Use Encryption at Rest on State Bucket -## Use `.gitignore` to exclude terraform state files, state directory backups and core dumps +### Use `.gitignore` to exclude terraform state files, state directory backups and core dumps ``` .terraform @@ -236,7 +394,7 @@ We recommend not commingling state in the same bucket. This could cause the stat *.tfstate.backup ``` -## Use `.dockerigore` to exclude terraform statefiles from builds +### Use `.dockerigore` to exclude terraform statefiles from builds Example: @@ -244,17 +402,7 @@ Example: **/.terraform* ``` -# Root Module - -# Data Sources - -# Resources - -# AWS Specific - -# Naming Conventions - -## Use a programmatically consistent naming convention +### Use a programmatically consistent naming convention All resource names (E.g. things provisioned on AWS) must follow a consistent convention. The reason this is so important is that modules are frequently composed inside of other modules. Enforcing consistency increases the likelihood that modules can @@ -264,17 +412,17 @@ To enforce consistency, we require that all modules use the [`terraform-null-lab With this module, users have the ability to change the way resource names are generated such as by changing the order of parameters or the delimiter. While the module is opinionated on the parameters, it's proved invaluable as a mechanism for generating consistent resource names. -# DNS Infrastructure +## DNS Infrastructure -## Use lots of DNS zones +### Use lots of DNS zones Never mingle DNS from different stages or environments in the same zone. -## Delegate DNS zones across account boundaries +### Delegate DNS zones across account boundaries Delegate each AWS account a DNS zone for which it is authoritative. -## Distinguish between branded domains and service discovery domains +### Distinguish between branded domains and service discovery domains Service discovery domains are what services use to discover each other. These are seldom if ever used by end-users. There should only be one service discovery domain, but there may be many zones delegated from that domain. @@ -283,37 +431,32 @@ Branded domains are the domains that users use to access the services. These are There may be many branded domains pointing to a single service discovery domain. The architecture of the branded domains won't mirror the service discovery domains. -# Module Design +## Module Design -## Small Opinionated Modules +### Small Opinionated Modules We believe that modules should do one thing very well. But in order to do that, it requires being opinionated on the design. Simply wrapping terraform resources for the purposes of modularizing code is not that helpful. Implementing a specific use-case of those resource is more helpful. -## Composable Modules +### Composable Modules Write all modules to be easily composable into other modules. This is how we're able to achieve economies of scale and stop re-inventing the same patterns over and over again. -## Use `variable` inputs +## Module Usage -Modules should accept as many parameters as possible. Avoid using inputs of `type = object` since they are harder to document. Of course, -this is not a hard rule and sometimes objects just make the most sense. Just be weary of the ability for tools like [`terraform-docs`](https://github.com/segmentio/terraform-docs) to be able to -generate meaningful documentation. - -# Module Usage - -## Use Terraform registry format with exact version numbers +### Use Terraform registry format with exact version numbers There are many ways to express a module's source. Our convention is to use Terraform registry syntax with an explicit version. ``` source = "cloudposse/label/null" - version = "0.22.0" + version = "0.25.0" ``` -The reason to pin to an explicit version rather than a range like `>= 0.22.0` is that any update is capable of breaking something. Any changes to your infrastructure should be implemented and reviewed under your control, not blindly automatic based on when you deployed it. +The reason to pin to an explicit version rather than a range like `>= 0.25. +0` is that any update is capable of breaking something. Any changes to your infrastructure should be implemented and reviewed under your control, not blindly automatic based on when you deployed it. :::info @@ -326,3 +469,4 @@ source = "git::https://github.com/cloudposse/terraform-null-label.git?ref=tags/0 Note that the `ref` always points explicitly to a `tags` pinned to a specific version. Dropping the `tags/` qualifier means it could be a branch or a tag; we prefer to be explicit. ::: +