Step-37-Redshift-and-Step-42-lambda-power-tuning #22

Open: wants to merge 2 commits into base `main`
5 changes: 5 additions & 0 deletions step37-redshift/README.md
# Amazon Redshift

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. You can start with just a few hundred gigabytes of data and scale to a petabyte or more. This enables you to use your data to acquire new insights for your business and customers.

The first step to create a data warehouse is to launch a set of nodes, called an Amazon Redshift cluster. After you provision your cluster, you can upload your data set and then perform data analysis queries. Regardless of the size of the data set, Amazon Redshift offers fast query performance using the same SQL-based tools and business intelligence applications that you use today.
*.js
!jest.config.js
*.d.ts
node_modules

# CDK asset staging directory
.cdk.staging
cdk.out
*.ts
!*.d.ts

# CDK asset staging directory
.cdk.staging
cdk.out
125 changes: 125 additions & 0 deletions step37-redshift/redshift-with-dynamodb-as-a-datasource/README.md
# Amazon Redshift

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. You can start with just a few hundred gigabytes of data and scale to a petabyte or more. This enables you to use your data to acquire new insights for your business and customers.

## Set Up a Cluster

To set up a Redshift cluster, define a `Cluster` construct. The cluster is launched in a VPC; you can pass an existing VPC, otherwise one is created for you. Nodes are always placed in private subnets and are encrypted by default.

```typescript
const vpc = new ec2.Vpc(this, "VPC");

const cluster = new redshift.Cluster(this, "RedshiftCluster", {
masterUser: {
masterUsername: "admin",
},
vpc,
roles: [role],
});
```

By default, the master password is generated for you and stored in AWS Secrets Manager.

A default database named `default_db` is created in the cluster. To change its name, set the `defaultDatabaseName` property in the constructor props.

By default, the cluster is not publicly accessible. Depending on your use case, you can expose it with the `publiclyAccessible` property.
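
As a sketch, both properties can be set on the cluster (the database name here is illustrative):

```typescript
const cluster = new redshift.Cluster(this, "RedshiftCluster", {
  masterUser: {
    masterUsername: "admin",
  },
  vpc,
  // Illustrative values: rename the default database and expose the endpoint
  defaultDatabaseName: "analytics_db",
  publiclyAccessible: true,
});
```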

## Connecting to the Cluster

To control who can access the cluster, use the `.connections` attribute. Redshift clusters have a default port (5439), so you don't need to specify it. The example below opens all traffic, which is convenient for development but too permissive for production:

```typescript
cluster.connections.allowFromAnyIpv4(ec2.Port.allTraffic());
```
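
A tighter alternative (a sketch; the CIDR range is illustrative) restricts access to the default Redshift port from a specific network:

```typescript
// Allow only the default Redshift port (5439) from a specific CIDR range
cluster.connections.allowFrom(
  ec2.Peer.ipv4("10.0.0.0/16"),
  ec2.Port.tcp(5439)
);
```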

## Give Redshift Permission to Access the Data Sources

The role below grants broad DynamoDB and EC2 access; scope the actions and resources down for production use. If you also COPY from S3, add the relevant `s3:Get*` and `s3:List*` permissions.

```typescript
const role = new Role(this, "redshift", {
assumedBy: new ServicePrincipal("redshift.amazonaws.com"),
});

const policy = new PolicyStatement({
effect: Effect.ALLOW,
actions: ["dynamodb:*", "ec2:*"],
resources: ["*"],
});

role.addToPolicy(policy);
```

## Create a DynamoDB Table

```typescript
const lollyTable = new ddb.Table(this, "lollyTable", {
billingMode: ddb.BillingMode.PAY_PER_REQUEST,
partitionKey: {
name: "id",
type: ddb.AttributeType.STRING,
},
});
```

Then add some sample data to the table, e.g. an item with `id: '123'`, `name: 'name'`, `age: 123`.
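
As a sketch, such an item can be inserted with the AWS CLI (CDK generates the physical table name, so substitute the actual name from the DynamoDB console or stack output):

```shell
aws dynamodb put-item \
  --table-name <generated-lolly-table-name> \
  --item '{"id": {"S": "123"}, "name": {"S": "name"}, "age": {"N": "123"}}'
```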

## Connect to the Cluster with the Query Editor

- Go to AWS Secrets Manager and retrieve the generated database credentials.

- Then go to Redshift -> Query Editor and connect to your database using those credentials.

## Create a Table in the Database

To create a table, run the following statements in the query editor:

1. Create a schema:

```sql
CREATE SCHEMA myinternalschema;
```

2. Create a table in the schema:

```sql
CREATE TABLE myinternalschema.event(
  name varchar(200),
  age integer not null,
  city varchar(200));
```

3. Load the sample data from S3 (replace the placeholder with your IAM role ARN):

```sql
COPY myinternalschema.event FROM 's3://aws-redshift-spectrum-sample-data-us-east-1/spectrum/event/allevents_pipe.txt'
iam_role 'REPLACE THIS PLACEHOLDER WITH THE IAM ROLE ARN'
readratio 50;
```
4. Then check your data using:

```sql
SELECT * FROM myinternalschema.event
LIMIT 10;
```
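
Since this project uses DynamoDB as a data source, you can also COPY directly from the DynamoDB table. A sketch (substitute the generated table name and your role ARN; the DynamoDB attribute names must match the Redshift column names):

```sql
COPY myinternalschema.event FROM 'dynamodb://<generated-lolly-table-name>'
iam_role 'REPLACE THIS PLACEHOLDER WITH THE IAM ROLE ARN'
readratio 50;
```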


# Welcome to your CDK TypeScript project!

This is a blank project for TypeScript development with CDK.

The `cdk.json` file tells the CDK Toolkit how to execute your app.

## Useful commands

* `npm run build` compile typescript to js
* `npm run watch` watch for changes and compile
* `npm run test` perform the jest unit tests
* `cdk deploy` deploy this stack to your default AWS account/region
* `cdk diff` compare deployed stack with current state
* `cdk synth` emits the synthesized CloudFormation template
#!/usr/bin/env node
import 'source-map-support/register';
import * as cdk from '@aws-cdk/core';
import { BasicRedshiftStack } from '../lib/basic_redshift-stack';

const app = new cdk.App();
new BasicRedshiftStack(app, 'BasicRedshiftStack');
12 changes: 12 additions & 0 deletions step37-redshift/redshift-with-dynamodb-as-a-datasource/cdk.json
{
"app": "npx ts-node --prefer-ts-exts bin/basic_redshift.ts",
"context": {
"@aws-cdk/core:enableStackNameDuplicates": "true",
"aws-cdk:enableDiffNoFail": "true",
"@aws-cdk/core:stackRelativeExports": "true",
"@aws-cdk/aws-ecr-assets:dockerIgnoreSupport": true,
"@aws-cdk/aws-secretsmanager:parseOwnedSecretName": true,
"@aws-cdk/aws-kms:defaultKeyPolicies": true,
"@aws-cdk/aws-s3:grantWriteWithoutAcl": true
}
}
module.exports = {
roots: ['<rootDir>/test'],
testMatch: ['**/*.test.ts'],
transform: {
'^.+\\.tsx?$': 'ts-jest'
}
};
import * as cdk from "@aws-cdk/core";
import * as redshift from "@aws-cdk/aws-redshift";
import * as ec2 from "@aws-cdk/aws-ec2";
import * as ddb from "@aws-cdk/aws-dynamodb";
import {
Effect,
PolicyStatement,
Role,
ServicePrincipal,
} from "@aws-cdk/aws-iam";

export class BasicRedshiftStack extends cdk.Stack {
constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
super(scope, id, props);

    // Create a DynamoDB table to act as the data source
const lollyTable = new ddb.Table(this, "lollyTable", {
billingMode: ddb.BillingMode.PAY_PER_REQUEST,
partitionKey: {
name: "id",
type: ddb.AttributeType.STRING,
},
});

    // IAM role that Redshift will assume to access DynamoDB
const role = new Role(this, "redshift", {
assumedBy: new ServicePrincipal("redshift.amazonaws.com"),
});

    // Policy granting DynamoDB and EC2 access (scope down for production)
const policy = new PolicyStatement({
effect: Effect.ALLOW,
actions: ["dynamodb:*", "ec2:*"],
resources: ["*"],
});

    // Attach the policy to the role
role.addToPolicy(policy);

    // VPC in which the Redshift cluster will be launched
const vpc = new ec2.Vpc(this, "VPC");

const cluster = new redshift.Cluster(this, "RedshiftCluster", {
masterUser: {
masterUsername: "admin",
},
vpc,
roles: [role],
});

    // Allow inbound traffic from any IPv4 address on all ports (development only)
cluster.connections.allowFromAnyIpv4(ec2.Port.allTraffic());
}
}