Set Up a Custom Snowflake Log Search Warehouse

George Alpizar

Overview

You can use this document to learn how to set up a custom Snowflake log search warehouse. 


Step 1: Create an S3 Bucket 

You must create an S3 bucket (or a prefix within an existing bucket) to store uploaded agent archive logs.

To learn how to create a bucket, please review this article from AWS.

Note the name of the newly created bucket; you will need this information in a later step. 
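Bucket names must follow the S3 naming rules (3 to 63 characters; lowercase letters, digits, hyphens, and dots; must begin and end with a letter or number; must not look like an IP address). The following is a minimal sketch for sanity-checking a candidate name before you create the bucket; the names used are hypothetical:

```python
import re

def is_valid_bucket_name(name: str) -> bool:
    """Check a candidate S3 bucket name against the core naming rules."""
    if not 3 <= len(name) <= 63:
        return False
    # Lowercase letters, digits, hyphens, and dots; must start and end alphanumeric.
    if not re.fullmatch(r"[a-z0-9][a-z0-9.-]*[a-z0-9]", name):
        return False
    # No consecutive dots, and the name must not be formatted like an IP address.
    if ".." in name or re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", name):
        return False
    return True

print(is_valid_bucket_name("edgedelta-archive-logs"))  # True
print(is_valid_bucket_name("Bad_Bucket_Name"))         # False
```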


Step 2: Create an S3 Archiving Integration

In addition to the Snowflake integration (created in a later step), you must create an S3 archiving integration to send logs to your S3 location.

  1. In the Edge Delta App, on the left-side navigation, click Data Pipeline, and then click Integrations
  2. Under Add Integration, locate and select the AWS S3 archiving integration.
  3. Enter your desired parameters, and then click Save.
    • To learn more about the parameters that you can configure, see AWS S3 Archiving Output and Integration.  
    • Note the name of the integration; you will need this information in a later step. 
    • After you create an S3 archiving integration, you must add the integration into an existing configuration, specifically under the workflow settings. 
  4. On the left-side navigation, click Data Pipeline, and then select Agent Settings
  5. Locate the desired configuration, then under Actions, click the vertical ellipses, and then click Edit
  6. In the YAML file, you must add the newly created S3 integration as an archive (under outputs) and as a destination (under workflows). Review the following example for the archive-snowflake integration:
    • outputs:
        triggers:

        streams:
          - integration_name: datadog-dev
          - integration_name: elastic-dev

        archives:
          - integration_name: archive-snowflake
    • workflows:
        archive-workflow:
          input_labels:
            - mobile-app-service
          destinations:
            - archive-snowflake
  7. Click Save

Step 3: Create a Database, Schema, and Table 

  1. To create a database, review the following command:
    • create database "warehouse_integration_test" comment = "test for custom snowflake scenario";
  2. To create a schema, review the following command:
    • create schema "warehouse_integration_test"."test_schema" with managed access comment = "test schema";
  3. To create a table, review the following command:
    • create table "warehouse_integration_test"."test_schema"."test_table" (
          host string,
          source_name string,
          source_type string,
          tag string,
          timestamp int,
          date datetime,
          raw string
      );

Step 4: Create a Role with the Required Permissions and Assign a User 

  1. To create a role, review the following command:
    • create role "read_only" comment = "read only role for warehouse integration";
  2. To add permissions to the role, review the following commands: 
    • grant usage
        on warehouse COMPUTE_WH
        to role "read_only";
    • grant usage
        on database "warehouse_integration_test"
        to role "read_only";
    • grant usage
        on schema "warehouse_integration_test"."test_schema"
        to role "read_only";
    • grant select on all tables 
        in schema "warehouse_integration_test"."test_schema"
        to role "read_only";
  3. To create a user, review the following command: 
    • create user "read_only_user" password='Test1234' default_role = "read_only";
  4. To assign the user to the role, review the following command:
    • grant role "read_only" to user "read_only_user";

Step 5: Create an IAM Policy

You must create an IAM policy that gives access to the newly created bucket (or prefix). 

  1. Access the AWS console, and then navigate to the IAM console. 
  2. In the left-side navigation, click Policies.  
  3. Click Create policy
  4. Click JSON
  5. Copy and paste the policy below. You must update the policy to include the name of the newly created S3 bucket (or prefix). 
  6. Click Next: Tags, and then click Next: Review
  7. Complete the missing fields.
    • Note the name of the policy; you will need this information in a later step. 
  8. Click Create Policy
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
              "s3:GetObject",
              "s3:GetObjectVersion"
            ],
            "Resource": "arn:aws:s3:::<bucket>/<prefix>/*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetBucketLocation"
            ],
            "Resource": "arn:aws:s3:::<bucket>",
            "Condition": {
                "StringLike": {
                    "s3:prefix": [
                        "<prefix>/*"
                    ]
                }
            }
        }
    ]
}
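If you script your AWS setup, the policy above can also be generated with the bucket and prefix substituted in. The following is a minimal Python sketch; the bucket and prefix values passed at the bottom are placeholders:

```python
import json

def make_archive_read_policy(bucket: str, prefix: str) -> dict:
    """Build the IAM policy document that grants read access to the
    archive bucket/prefix, mirroring the JSON policy shown above."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:GetObjectVersion"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
        ],
    }

# Placeholder bucket and prefix names for illustration.
policy = make_archive_read_policy("edgedelta-archive-logs", "snowflake")
print(json.dumps(policy, indent=2))
```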

Step 6: Create an IAM role

  1. Access the AWS console, and then navigate to the IAM console. 
  2. In the left-side navigation, click Roles.  
  3. Click Create role
  4. Under Trusted entity type, select the AWS account box, and then below, select Another AWS account
  5. For Account ID, enter your own 12-digit AWS account ID. (You will update the trust relationship with Snowflake's values in a later step.)
  6. Under Options, select Require external ID, and then enter an external ID.   
  7. Click Next
  8. Under Permissions policies, locate and select the newly created policy.
  9. Click Next
  10. Enter a role name and description, and then click Create role
  11. In the summary page, note the Role ARN value; you will need this information in a later step. 

Step 7: Create a Storage Integration

  1. To create a storage integration, review the following command.
    • For storage_aws_role_arn, enter the Role ARN value you copied earlier.
    • create storage integration "warehouse_storage_integration"
        type = external_stage
        storage_provider = s3
        enabled = true
        storage_allowed_locations = ('s3://<bucket name>/<optional prefix>/')
        storage_aws_role_arn = '<ARN of the role you created in the previous step>';
      
  2. To describe a storage integration, review the following command: 
    • desc integration "warehouse_storage_integration";
  3. Note the following values; you will need this information in a later step:
    • STORAGE_AWS_IAM_USER_ARN: The AWS IAM user created for your Snowflake account, such as arn:aws:iam::123456789001:user/abc1-b-self1234.
    • STORAGE_AWS_EXTERNAL_ID: The external ID that is needed to establish a trust relationship. 

Step 8: Modify the IAM Role

You must modify the IAM role to give Snowflake permission. 

  1. In the AWS console, access the newly created role. 
  2. Click the Trust relationships tab.
  3. Click Edit trust relationship
  4. Update the policy document with the output values you recorded earlier (STORAGE_AWS_IAM_USER_ARN and STORAGE_AWS_EXTERNAL_ID). 
  5. Review the following example: 
    • {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Sid": "",
            "Effect": "Allow",
            "Principal": {
              "AWS": "<snowflake_user_arn>"
            },
            "Action": "sts:AssumeRole",
            "Condition": {
              "StringEquals": {
                "sts:ExternalId": "<snowflake_external_id>"
              }
            }
          }
        ]
      }
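The two values recorded from desc integration can be substituted into the trust policy programmatically. The following is a minimal sketch; the ARN and external ID passed at the bottom are illustrative, not real values:

```python
import json

def make_trust_policy(snowflake_user_arn: str, external_id: str) -> dict:
    """Build the role trust policy that lets the Snowflake-managed IAM
    user assume this role, gated on the external ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "",
                "Effect": "Allow",
                "Principal": {"AWS": snowflake_user_arn},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }

# Illustrative values; use the STORAGE_AWS_IAM_USER_ARN and
# STORAGE_AWS_EXTERNAL_ID output from `desc integration`.
doc = make_trust_policy(
    "arn:aws:iam::123456789001:user/abc1-b-self1234",
    "MY_EXTERNAL_ID",
)
print(json.dumps(doc, indent=2))
```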
      

Step 9: Create a Stage and Pipe

  1. To create a stage, review the following command:
    • create stage "warehouse_stage"
      url = 's3://<bucket name>/<optional prefix>'
      storage_integration = "warehouse_storage_integration";
  2. To create a pipe, review the following command: 
    • create pipe "warehouse_integration_test"."test_schema"."test_pipe" auto_ingest=true as
        copy into "warehouse_integration_test"."test_schema"."test_table" from (
          select
            parse_json($1):host,
            parse_json($1):source_name,
            parse_json($1):source_type,
            parse_json($1):tag,
            parse_json($1):timestamp,
            to_timestamp(parse_json($1):timestamp::string),
            parse_json($1):raw as raw
      	from
           @"warehouse_integration_test"."test_schema"."warehouse_stage"   
        ) file_format = (type = 'JSON');
  3. To obtain the ARN of the SQS queue for the stage (listed in the notification_channel column), run the following command: 
    • show pipes;
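The parse_json expressions in the pipe map each archived JSON log line onto the table columns created in Step 3. The sample record below is illustrative; this Python sketch shows the same mapping:

```python
import json
from datetime import datetime, timezone

# An illustrative archived log record; field names match the
# parse_json paths used in the pipe definition above.
line = json.dumps({
    "host": "ip-10-0-0-1",
    "source_name": "mobile-app-service",
    "source_type": "docker",
    "tag": "prod",
    "timestamp": 1650000000,
    "raw": "GET /health 200",
})

record = json.loads(line)
row = (
    record["host"],
    record["source_name"],
    record["source_type"],
    record["tag"],
    record["timestamp"],
    # The pipe's to_timestamp(...) call; rendered here as a UTC datetime.
    datetime.fromtimestamp(record["timestamp"], tz=timezone.utc),
    record["raw"],
)
print(row)
```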

Step 10: Configure Event Notifications 

To configure an event notification for your S3 bucket, please review this article from AWS.

Additionally, during this process, complete the following fields:

  • Name: Enter a name for the event notification, such as Auto-ingest Snowflake.
  • Events: Select the ObjectCreate (All) option.
  • Send to: Select SQS Queue from the drop-down menu.
  • SQS: Select Add SQS queue ARN from the drop-down menu.
  • SQS queue ARN: Enter the SQS queue ARN (the notification_channel value) you obtained from the show pipes output.
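If you manage the bucket with boto3 rather than the console, the same event notification can be expressed as a configuration document. The following is a minimal sketch; the queue ARN is a placeholder for the notification_channel value from show pipes:

```python
# Notification configuration equivalent to the console fields above.
# The queue ARN is a placeholder; substitute the notification_channel
# value returned by `show pipes`.
notification_config = {
    "QueueConfigurations": [
        {
            "Id": "Auto-ingest Snowflake",
            "QueueArn": "arn:aws:sqs:us-east-1:123456789012:sf-snowpipe-queue",
            "Events": ["s3:ObjectCreated:*"],
        }
    ]
}

# With boto3 this would be applied as:
# boto3.client("s3").put_bucket_notification_configuration(
#     Bucket="<bucket>", NotificationConfiguration=notification_config)
print(notification_config["QueueConfigurations"][0]["Events"])
```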

Step 11: Create a Snowflake Integration 

  1. In the Edge Delta App, on the left-side navigation, click Data Pipeline, and then click Integrations
  2. Under Add Integration, locate and select Snowflake
  3. Complete the missing fields.
    • For Endpoint, enter your Snowflake account locator. For example, if your account locator is ESL22010, then for endpoint, you would enter ESL22010.snowflakecomputing.com
  4. Click Save
    • After you create the integration, you can access the Log Search page of the Edge Delta App and select your integration from the Select a Source drop-down menu.
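The endpoint value is derived directly from the account locator; a trivial sketch using the example locator from above:

```python
def snowflake_endpoint(account_locator: str) -> str:
    """Build the Snowflake endpoint host from an account locator."""
    return f"{account_locator}.snowflakecomputing.com"

print(snowflake_endpoint("ESL22010"))  # ESL22010.snowflakecomputing.com
```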
