Commit 39cd989

Merge branch 'current' into remove-ui-elements

mirnawong1 authored Oct 24, 2024
2 parents ebbfc83 + 24107bb
Showing 54 changed files with 521 additions and 328 deletions.
1 change: 1 addition & 0 deletions .github/pull_request_template.md
@@ -9,6 +9,7 @@ To learn more about the writing conventions used in the dbt Labs docs, see the [
- [ ] I have reviewed the [Content style guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md) so my content adheres to these guidelines.
- [ ] The topic I'm writing about is for specific dbt version(s) and I have versioned it according to the [version a whole page](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version) and/or [version a block of content](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#versioning-blocks-of-content) guidelines.
- [ ] I have added checklist item(s) to this list for anything that needs to happen before this PR is merged, such as "needs technical review" or "change base branch."
- [ ] The content in this PR requires a dbt release note, so I added one to the [release notes page](https://docs.getdbt.com/docs/dbt-versions/dbt-cloud-release-notes).
<!--
PRE-RELEASE VERSION OF dbt (if so, uncomment):
- [ ] Add a note to the prerelease version [Migration Guide](https://github.com/dbt-labs/docs.getdbt.com/tree/current/website/docs/docs/dbt-versions/core-upgrade)
@@ -241,7 +241,9 @@ measures:

## Reviewing our work

Our completed code will look like this, our first semantic model!
Our completed code will look like this, our first semantic model! Here are two examples showing different organizational approaches:

<Expandable alt_header="Co-located approach">

<File name="models/marts/orders.yml" />

@@ -288,6 +290,68 @@ semantic_models:
description: The total tax paid on each order.
agg: sum
```
</Expandable>

<Expandable alt_header="Parallel sub-folder approach">

<File name="models/semantic_models/sem_orders.yml" />

```yml
semantic_models:
- name: orders
defaults:
agg_time_dimension: ordered_at
description: |
Order fact table. This table is at the order grain with one row per order.
model: ref('stg_orders')
entities:
- name: order_id
type: primary
- name: location
type: foreign
expr: location_id
- name: customer
type: foreign
expr: customer_id
dimensions:
- name: ordered_at
expr: date_trunc('day', ordered_at)
# use date_trunc(ordered_at, DAY) if using BigQuery
type: time
type_params:
time_granularity: day
- name: is_large_order
type: categorical
expr: case when order_total > 50 then true else false end
measures:
- name: order_total
description: The total revenue for each order.
agg: sum
- name: order_count
description: The count of individual orders.
expr: 1
agg: sum
- name: tax_paid
description: The total tax paid on each order.
agg: sum
```
</Expandable>

As you can see, the content of the semantic model is identical in both approaches. The key differences are:

1. **File location**
- Co-located approach: `models/marts/orders.yml`
- Parallel sub-folder approach: `models/semantic_models/sem_orders.yml`

2. **File naming**
- Co-located approach: Uses the same name as the corresponding mart (`orders.yml`)
- Parallel sub-folder approach: Prefixes the file with `sem_` (`sem_orders.yml`)

Choose the approach that best fits your project structure and team preferences. The co-located approach is often simpler for new projects, while the parallel sub-folder approach can be clearer for migrating large existing projects to the Semantic Layer.

## Next steps

@@ -23,9 +23,6 @@ Is dbt Mesh a good fit in this scenario? Absolutely! There is no other way to sh
- Onboarding hundreds of people and dozens of projects is full of friction! The challenges of a scaled, global organization are not to be underestimated. To start the migration, prioritize teams that have strong dbt familiarity and fundamentals. dbt Mesh is an advancement of core dbt deployments, so these teams are likely to have a smoother transition.

Additionally, prioritize teams that manage strategic data assets that need to be shared widely. This ensures that dbt Mesh will help your teams deliver concrete value quickly.
- Bi-directional project dependencies -- currently, projects in dbt Mesh are treated like dbt resources in that they cannot depend on each other. However, many teams may want to be able to share data assets back and forth between teams.

We've added support for [enabling bidirectional dependencies](/best-practices/how-we-mesh/mesh-3-structures#cycle-detection) across projects. <Lifecycle status="beta"/>

If this sounds like your organization, dbt Mesh is the architecture you should pursue. ✅

@@ -66,7 +66,7 @@ Since the launch of dbt Mesh, the most common pattern we've seen is one where pr

Users may need to contribute models across multiple projects and this is fine. There will be some friction doing this, versus a single repo, but this is _useful_ friction, especially if upstreaming a change from a “spoke” to a “hub.” This should be treated like making an API change, one that the other team will be living with for some time to come. You should be concerned if your teammates find they need to make a coordinated change across multiple projects very frequently (every week), or as a key prerequisite for ~20%+ of their work.

### Cycle detection <Lifecycle status="beta"/>
### Cycle detection

import CycleDetection from '/snippets/_mesh-cycle-detection.md';

2 changes: 1 addition & 1 deletion website/docs/best-practices/how-we-mesh/mesh-5-faqs.md
@@ -215,7 +215,7 @@ There’s model-level access within dbt, role-based access for users and groups

First things first: access to underlying data is always defined and enforced by the underlying data platform (for example, BigQuery, Databricks, Redshift, Snowflake, or Starburst). This access is managed by executing “DCL statements” (namely `grant`). dbt makes it easy to [configure `grants` on models](/reference/resource-configs/grants), which provision data access for other roles/users/groups in the data warehouse. However, dbt does _not_ automatically define or coordinate those grants unless they are configured explicitly. Refer to your organization's system for managing data warehouse permissions.
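As a minimal sketch of the `grants` config described above (the model and role names are illustrative):

```yaml
models:
  - name: orders
    config:
      grants:
        select: ['reporting_role', 'bi_tool_role']  # grants SELECT on the built table to these roles
```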

[dbt Cloud Enterprise plans](https://www.getdbt.com/pricing) support [role-based access control (RBAC)](/docs/cloud/manage-access/enterprise-permissions#how-to-set-up-rbac-groups-in-dbt-cloud) that manages granular permissions for users and user groups. You can control which users can see or edit all aspects of a dbt Cloud project. A user’s access to dbt Cloud projects also determines whether they can “explore” that project in detail. Roles, users, and groups are defined within the dbt Cloud application via the UI or by integrating with an identity provider.
[dbt Cloud Enterprise plans](https://www.getdbt.com/pricing) support [role-based access control (RBAC)](/docs/cloud/manage-access/about-user-access#role-based-access-control-) that manages granular permissions for users and user groups. You can control which users can see or edit all aspects of a dbt Cloud project. A user’s access to dbt Cloud projects also determines whether they can “explore” that project in detail. Roles, users, and groups are defined within the dbt Cloud application via the UI or by integrating with an identity provider.

[Model access](/docs/collaborate/govern/model-access) defines where models can be referenced. It also informs the discoverability of those projects within dbt Explorer. Model `access` is defined in code, just like any other model configuration (`materialized`, `tags`, etc.).
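For example, here's a sketch of opening up a model for cross-project references via its YAML config (model name illustrative):

```yaml
models:
  - name: orders
    access: public  # one of: private, protected, public
```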

53 changes: 36 additions & 17 deletions website/docs/docs/build/cumulative-metrics.md
@@ -18,21 +18,21 @@ Note that we use the double colon (::) to indicate whether a parameter is nested

<VersionBlock firstVersion="1.9">

| Parameter | <div style={{width:'350px'}}>Description</div> | Type |
| --------- | ----------- | ---- |
| `name` | The name of the metric. | Required |
| `description` | The description of the metric. | Optional |
| `type` | The type of the metric (cumulative, derived, ratio, or simple). | Required |
| `label` | Required string that defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). | Required |
| `type_params` | The type parameters of the metric. Supports nested parameters indicated by the double colon, such as `type_params::measure`. | Required |
| `type_params::cumulative_type_params` | Allows you to add a `window`, `period_agg`, and `grain_to_date` configuration. Nested under `type_params`. | Optional |
| `cumulative_type_params::window` | The accumulation window, such as 1 month, 7 days, 1 year. This can't be used with `grain_to_date`. | Optional |
| `cumulative_type_params::grain_to_date` | Sets the accumulation grain, such as `month`, which will accumulate data for one month and then restart at the beginning of the next. This can't be used with `window`. | Optional |
| `cumulative_type_params::period_agg` | Specifies how to aggregate the cumulative metric when summarizing data to a different granularity. Can be used with `grain_to_date`. Options are <br /> - `first` (Takes the first value within the period) <br /> - `last` (Takes the last value within the period) <br /> - `average` (Calculates the average value within the period). <br /> <br /> Defaults to `first` if no `window` is specified. | Optional |
| `type_params::measure` | A dictionary describing the measure you will use. | Required |
| `measure::name` | The measure you are referencing. | Optional |
| `measure::fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | Optional |
| `measure::join_to_timespine` | Boolean that indicates if the aggregated measure should be joined to the time spine table to fill in missing dates. Default `false`. | Optional |

| Parameter | <div style={{width:'350px'}}>Description</div> | Type |
|-------------|---------------------------------------------------|-----------|
| `name` | The name of the metric. | Required |
| `description` | The description of the metric. | Optional |
| `type` | The type of the metric (cumulative, derived, ratio, or simple). | Required |
| `label` | Required string that defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). | Required |
| `type_params` | The type parameters of the metric. Supports nested parameters indicated by the double colon, such as `type_params::measure`. | Required |
| `type_params::measure` | The measure associated with the metric. Supports both shorthand (string) and object syntax. The shorthand is used if only the name is needed, while the object syntax allows specifying additional attributes. | Required |
| `measure::name` | The name of the measure being referenced. Required if using object syntax for `type_params::measure`. | Optional |
| `measure::fill_nulls_with` | Sets a value (for example, 0) to replace nulls in the metric definition. | Optional |
| `measure::join_to_timespine` | Boolean indicating if the aggregated measure should be joined to the time spine table to fill in missing dates. Default is `false`. | Optional |
| `type_params::cumulative_type_params` | Configures the attributes like `window`, `period_agg`, and `grain_to_date` for cumulative metrics. | Optional |
| `cumulative_type_params::window` | Specifies the accumulation window, such as `1 month`, `7 days`, or `1 year`. Cannot be used with `grain_to_date`. | Optional |
| `cumulative_type_params::grain_to_date` | Sets the accumulation grain, such as `month`, restarting accumulation at the beginning of each specified grain period. Cannot be used with `window`. | Optional |
| `cumulative_type_params::period_agg` | Defines how to aggregate the cumulative metric when summarizing data to a different granularity: `first`, `last`, or `average`. Defaults to `first` if `window` is not specified. | Optional |
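
To make these parameters concrete, here's a minimal sketch of a cumulative metric under this spec (the metric name and measure are illustrative):

```yaml
metrics:
  - name: cumulative_order_total
    label: Cumulative order total (7-day window)
    description: The rolling 7-day total of order revenue.
    type: cumulative
    type_params:
      measure:
        name: order_total        # object syntax for the measure
        fill_nulls_with: 0
      cumulative_type_params:
        window: 7 days
        period_agg: last         # take the last cumulative value when re-graining
```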

</VersionBlock>

@@ -45,15 +45,34 @@ Note that we use the double colon (::) to indicate whether a parameter is nested
| `type` | The type of the metric (cumulative, derived, ratio, or simple). | Required |
| `label` | Required string that defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). | Required |
| `type_params` | The type parameters of the metric. Supports nested parameters indicated by the double colon, such as `type_params::measure`. | Required |
| `window` | The accumulation window, such as 1 month, 7 days, 1 year. This can't be used with `grain_to_date`. | Optional |
| `window` | The accumulation window, such as `1 month`, `7 days`, or `1 year`. This can't be used with `grain_to_date`. | Optional |
| `grain_to_date` | Sets the accumulation grain, such as `month`, which will accumulate data for one month and then restart at the beginning of the next. This can't be used with `window`. | Optional |
| `type_params::measure` | A list of measure inputs | Required |
| `measure:name` | The measure you are referencing. | Optional |
| `measure:name` | The name of the measure being referenced. Required if using object syntax for `type_params::measure`. | Optional |
| `measure:fill_nulls_with` | Set the value in your metric definition instead of null (such as zero).| Optional |
| `measure:join_to_timespine` | Boolean that indicates if the aggregated measure should be joined to the time spine table to fill in missing dates. Default `false`. | Optional |
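
Under this older spec, `window` and `grain_to_date` sit directly under `type_params`. A comparable sketch (names illustrative):

```yaml
metrics:
  - name: cumulative_order_total
    label: Cumulative order total (7-day window)
    type: cumulative
    type_params:
      measure:
        name: order_total
        fill_nulls_with: 0
      window: 7 days
```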

</VersionBlock>

<Expandable alt_header="Explanation of type_params::measure">

The `type_params::measure` configuration can be written in two ways:
- Shorthand syntax &mdash; To only specify the name of the measure, use a simple string value. This is a shorthand approach when no other attributes are required.
```yaml
type_params:
measure: revenue
```
- Object syntax &mdash; To add more details or attributes to the measure (such as adding a filter, handling `null` values, or specifying whether to join to a time spine), you need to use the object syntax. This allows for additional configuration beyond just the measure's name.

```yaml
type_params:
measure:
name: order_total
fill_nulls_with: 0
join_to_timespine: true
```
</Expandable>

### Complete specification
The following displays the complete specification for cumulative metrics, along with an example:

8 changes: 4 additions & 4 deletions website/docs/docs/build/measures.md
@@ -102,7 +102,7 @@ semantic_models:
description: A record of every transaction that takes place. Carts are considered multiple transactions for each SKU.
model: ref('schema.transactions')
defaults:
agg_time_dimensions: metric_time
agg_time_dimension: transaction_date
# --- entities ---
entities:
@@ -167,7 +167,7 @@ semantic_models:
# --- dimensions ---
dimensions:
- name: metric_time
- name: transaction_date
type: time
expr: date_trunc('day', ts) # expr refers to underlying column ts
type_params:
@@ -204,15 +204,15 @@ semantic_models:
description: A subscription table with one row per date for each active user and their subscription plans.
model: ref('your_schema.subscription_table')
defaults:
agg_time_dimension: metric_time
agg_time_dimension: subscription_date
entities:
- name: user_id
type: foreign
primary_entity: subscription_table
dimensions:
- name: metric_time
- name: subscription_date
type: time
expr: date_transaction
type_params:
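Both fixes above follow the same rule: the model's default `agg_time_dimension` must name a time dimension that is actually defined in that semantic model. A condensed sketch of the corrected pairing, using names from the first example:

```yaml
semantic_models:
  - name: transactions
    defaults:
      agg_time_dimension: transaction_date  # must match a time dimension defined below
    dimensions:
      - name: transaction_date
        type: time
        expr: date_trunc('day', ts)         # expr refers to the underlying column ts
        type_params:
          time_granularity: day
```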
14 changes: 12 additions & 2 deletions website/docs/docs/build/metricflow-commands.md
@@ -28,18 +28,28 @@ In dbt Cloud, run MetricFlow commands directly in the [dbt Cloud IDE](/docs/clou

For dbt Cloud CLI users, MetricFlow commands are embedded in the dbt Cloud CLI, which means you can immediately run them once you install the dbt Cloud CLI and don't need to install MetricFlow separately. You don't need to manage versioning because your dbt Cloud account will automatically manage the versioning for you.

<!--remove when fixed -->
Note: The **Defer to staging/production** [toggle](/docs/cloud/about-cloud-develop-defer#defer-in-the-dbt-cloud-ide) button doesn't apply when running Semantic Layer commands in the dbt Cloud IDE. To use defer for Semantic Layer commands in the IDE, toggle the button on and manually add the `--defer` flag to the command. This is a temporary workaround; full support is coming soon.
</TabItem>

<TabItem value="core" label="MetricFlow with dbt Core">

You can install [MetricFlow](https://github.com/dbt-labs/metricflow#getting-started) from [PyPI](https://pypi.org/project/dbt-metricflow/). You need to use `pip` to install MetricFlow on Windows or Linux operating systems:

<VersionBlock lastVersion="1.7">

1. Create or activate your virtual environment `python -m venv venv`
2. Run `pip install dbt-metricflow`
* You can install MetricFlow from PyPI as an extension of your dbt adapter from the command line. To do so, run `python -m pip install "dbt-metricflow[your_adapter_name]"`, replacing `your_adapter_name` with the name of your adapter. For example, for a Snowflake adapter run `python -m pip install "dbt-metricflow[snowflake]"`

</VersionBlock>

<VersionBlock firstVersion="1.8">

1. Create or activate your virtual environment `python -m venv venv`
2. Run `pip install dbt-metricflow`
* You can install MetricFlow from PyPI as an extension of your dbt adapter from the command line. To do so, run `python -m pip install "dbt-metricflow[adapter_package_name]"`, replacing `adapter_package_name` with your adapter's package name. For example, for a Snowflake adapter run `python -m pip install "dbt-metricflow[dbt-snowflake]"`

</VersionBlock>

**Note**: you'll need to manage versioning between dbt Core, your adapter, and MetricFlow.

Also note that MetricFlow `mf` commands return an error if you have the MetaFont LaTeX package installed, because both provide an `mf` executable. To run `mf` commands, uninstall the package.
15 changes: 14 additions & 1 deletion website/docs/docs/build/metricflow-time-spine.md
@@ -463,9 +463,22 @@ For example, if you use a custom calendar in your organization, such as a fiscal

- This is useful for calculating metrics based on a custom calendar, such as fiscal quarters or weeks.
- Use the `custom_granularities` key to define a non-standard time period for querying data, such as a `retail_month` or `fiscal_week`, instead of standard options like `day`, `month`, or `year`.
- Ensure the `standard_granularity_column` is a datetime type.
- This feature provides more control over how time-based metrics are calculated.

<Expandable alt_header="Data types and time zone considerations">

When working with custom calendars in MetricFlow, it's important to ensure:

- Consistent data types &mdash; Both your dimension column and the time spine column should use the same data type to allow accurate comparisons. Functions like `DATE_TRUNC` don't change the data type of the input in some databases (like Snowflake). Using different data types can lead to mismatches and inaccurate results.

We recommend using `DATETIME` or `TIMESTAMP` data types for your time dimensions and time spine, as they support all granularities. The `DATE` data type may not support smaller granularities like hours or minutes.

- Time zones &mdash; MetricFlow currently doesn't perform any timezone manipulation. When working with timezone-aware data, inconsistent time zones may lead to unexpected results during aggregations and comparisons.

For example, if your time spine column is `TIMESTAMP` type and your dimension column is `DATE` type, comparisons between these columns might not work as intended. To fix this, convert your `DATE` column to `TIMESTAMP`, or make sure both columns are the same data type.

</Expandable>
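
As a sketch of that fix (column name illustrative), you can cast the dimension in its `expr` so its type matches a `TIMESTAMP` time spine:

```yaml
dimensions:
  - name: ordered_at
    type: time
    expr: cast(ordered_at as timestamp)  # align the dimension's type with the time spine
    type_params:
      time_granularity: day
```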

### Add custom granularities

To add custom granularities, the Semantic Layer supports custom calendar configurations that allow users to query data using non-standard time periods like `fiscal_year` or `retail_month`. You can define these custom granularities (all lowercased) by modifying your model's YAML configuration like this:
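
A sketch of what this configuration can look like (the model, column, and granularity names are illustrative):

```yaml
models:
  - name: all_days
    description: A time spine with one row per day.
    time_spine:
      standard_granularity_column: date_day  # must be a datetime type
      custom_granularities:
        - name: fiscal_year
          column_name: fiscal_year_column    # column holding the custom grain values
```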