fix website build issue
lmouhib committed Sep 23, 2024
1 parent fb6ea74 commit 88b7962
Showing 2 changed files with 84 additions and 92 deletions.
24 changes: 8 additions & 16 deletions website/docs/intro.md
The sections below will take you through the steps of creating the CDK application.
```bash
mkdir dsf-example && cd dsf-example
```
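The scaffolding command itself is collapsed in this diff view; assuming the standard CDK `init` workflow for each language tab, it looks like:

```bash
# Scaffold a new CDK app in the current (empty) directory.
# For the Python tab the equivalent is: cdk init app --language python
npx cdk init app --language typescript
```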


We can now install DSF on AWS:

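The install commands are collapsed in this view; a sketch assuming DSF's published package names (`@cdklabs/aws-data-solutions-framework` on npm, `cdklabs.aws-data-solutions-framework` on PyPI — treat both as assumptions):

```bash
# TypeScript project: add the DSF construct library from npm (package name assumed)
npm install @cdklabs/aws-data-solutions-framework

# Python project: install the DSF distribution from PyPI (package name assumed)
pip install cdklabs.aws-data-solutions-framework
```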


### Create a data lake storage

We will now use [***DataLakeStorage***](constructs/library/02-Storage/03-data-lake-storage.mdx) to create a storage layer for our data lake on AWS.

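The construct code is collapsed in this view; a minimal sketch of the instantiation, assuming the `dsf.storage.DataLakeStorage` construct and an import alias `dsf` (package name and defaults are assumptions, not the doc's exact code):

```typescript
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
// DSF construct library; package name assumed
import * as dsf from '@cdklabs/aws-data-solutions-framework';

export class DsfExampleStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // Provisions the data lake S3 buckets (bronze/silver/gold)
    // with encryption and lifecycle defaults managed by DSF
    const storage = new dsf.storage.DataLakeStorage(this, 'DataLakeStorage');
  }
}
```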


### Create the EMR Serverless Application and execution role

We will now use [***SparkEmrServerlessRuntime***](constructs/library/Processing/spark-emr-serverless-runtime). In this step, we create an EMR Serverless application and an execution IAM role, to which we grant read/write access on the created S3 buckets.


<Tabs>
<TabItem value="typescript" label="TypeScript" default>

```typescript
// Provide access for the execution role to read and write data in the created buckets
storage.grantReadWrite(executionRole);
```

</TabItem>
<TabItem value="python" label="Python">

In `dsf_example/dsf_example_stack.py`

```python
# Use DSF on AWS to create Spark EMR serverless runtime
spark_runtime = dsf.processing.SparkEmrServerlessRuntime(
    self, "SparkProcessingRuntime", name="WordCount",
)

# Provide access for the execution role to write data to the created bucket
storage.grant_read_write(processing_exec_role)
```

</TabItem>
</Tabs>

### Output resource IDs and ARNs

Last, we will output the ARNs of the execution role and the EMR Serverless application, as well as the EMR Serverless application ID. These will be passed to the AWS CLI when executing the `StartJobRun` command.
<Tabs>
<TabItem value="typescript" label="TypeScript" default>

In `lib/dsf-example-stack.ts`

```typescript
new cdk.CfnOutput(this, "EMRServerlessApplicationARN", { value : runtimeServerless.application.attrArn });
new cdk.CfnOutput(this, "EMRServelessExecutionRoleARN", { value : executionRole.roleArn });
new cdk.CfnOutput(this, "BucketURI", { value : `s3://${storage.bucketName}` });
```
</TabItem>

<TabItem value="python" label="Python">

In `dsf_example/dsf_example_stack.py`
```python

CfnOutput(self, "EMRServerlessApplicationId", value=spark_runtime.application.attr_application_id)
CfnOutput(self, "EMRServerlessApplicationARN", value=spark_runtime.application.attr_arn)
CfnOutput(self, "EMRServelessExecutionRoleARN", value=processing_exec_role.role_arn)
CfnOutput(self, "BucketURI", value=f"s3://{storage.bucket_name}")
```
</TabItem>
</Tabs>
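With these outputs in hand, submitting a job might look like the following AWS CLI sketch (the entry-point path and placeholder values are assumptions):

```bash
# Replace the placeholders with the values from the CDK outputs above
APPLICATION_ID="<EMRServerlessApplicationId>"
EXECUTION_ROLE_ARN="<EMRServelessExecutionRoleARN>"
BUCKET_URI="<BucketURI>"

# Submit a Spark job to the EMR Serverless application
aws emr-serverless start-job-run \
  --application-id "$APPLICATION_ID" \
  --execution-role-arn "$EXECUTION_ROLE_ARN" \
  --job-driver "{\"sparkSubmit\": {\"entryPoint\": \"$BUCKET_URI/wordcount.py\"}}"
```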

## Deploy the CDK app

If this is the first time you deploy an AWS CDK app into an environment (account/region), you can install a “bootstrap stack”.
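Bootstrapping and deploying are done with the CDK CLI:

```bash
# One-time per target account/region: provision the CDK bootstrap stack
npx cdk bootstrap

# Synthesize and deploy the stack, confirming the IAM changes when prompted
npx cdk deploy
```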
