Deploy an additional Data Storage Unit

This manual explains the process for deploying an additional Data Storage Unit (DSU) within Sidra. The initial step involves setting up multiple subnets, which is essential for supplying the API endpoint with the details required for the new DSU deployment.

1. Configure subnets

In a Sidra installation, within the primary Resource Group (RG), there is a Virtual Network (VNET) resource. This resource will be named using the format <deploymentPrefix>-<environmentId>-vnet (example: sds-test-vnet).
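As a quick illustration of the naming format, the sketch below builds the expected VNET name from the two placeholders. The function name `vnet_name` is hypothetical and only mirrors the pattern stated above.

```python
def vnet_name(deployment_prefix: str, environment_id: str) -> str:
    """Return the VNET name in the form <deploymentPrefix>-<environmentId>-vnet."""
    return f"{deployment_prefix}-{environment_id}-vnet"


print(vnet_name("sds", "test"))  # sds-test-vnet
```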

In the Settings section, the Subnets panel lists all the subnets of the VNET. Clicking + Subnet opens an Azure form to create a new subnet.

The following sections list the required subnets and their configurations.

1.1 Databricks private subnet

The new DSU requires a new subnet on the VNET: the Databricks cluster private subnet, used by the Core/Sidra Service.

The subnet configuration must take into account:

  1. Service Endpoints:

    • Microsoft.KeyVault
    • Microsoft.Storage
  2. Subnet Delegation:

    • Microsoft.Databricks/workspaces
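The settings above can be sketched as a subnet definition. The example below is a minimal illustration, assuming the ARM-style subnet layout (`serviceEndpoints` entries with a `service` field, `delegations` entries with a `serviceName` property). The function name, the delegation name and the address range are hypothetical placeholders, not values defined by Sidra.

```python
import json

DATABRICKS_SERVICE_ENDPOINTS = ["Microsoft.KeyVault", "Microsoft.Storage"]
DATABRICKS_DELEGATION = "Microsoft.Databricks/workspaces"


def databricks_subnet_properties(address_prefix: str) -> dict:
    """Build an ARM-style 'properties' object for a Databricks subnet.

    The field layout is an assumption based on the Azure subnet resource schema.
    """
    return {
        "addressPrefix": address_prefix,
        "serviceEndpoints": [{"service": s} for s in DATABRICKS_SERVICE_ENDPOINTS],
        "delegations": [
            {
                "name": "databricks-delegation",  # hypothetical delegation name
                "properties": {"serviceName": DATABRICKS_DELEGATION},
            }
        ],
    }


# 10.0.10.0/24 is a placeholder address range, not a Sidra default.
print(json.dumps(databricks_subnet_properties("10.0.10.0/24"), indent=2))
```

The printed object can be compared against what the Azure form shows after the subnet is saved.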

1.2 Network Security Group for private subnet

A new Network Security Group (NSG) resource must be associated with the private subnet.

The NSG must be created in the same RG as the VNET. Create the NSG resource and name it <deploymentPrefix>-<environmentId>-vnet-nsg-dbw-cluster-<location> (example name: sds-test-vnet-nsg-dbw-cluster-ne).

1.2.1 Associate the subnet with the NSG

Once the NSG has been created, go to the Subnets section under the resource Settings. From there, associate the NSG with the previously created private subnet.

1.3 Databricks public subnet

The new DSU also requires a second new subnet: the Databricks cluster public subnet, used by the Core/Sidra Service.

The subnet configuration must take into account:

  1. Service Endpoints:

    • Microsoft.KeyVault
    • Microsoft.Storage
  2. Subnet Delegation:

    • Microsoft.Databricks/workspaces

1.4 Network Security Group for public subnet

A second new NSG resource must be associated with the public subnet.

The NSG must be created in the same RG as the VNET. Create the NSG resource and name it <deploymentPrefix>-<environmentId>-vnet-nsg-dbw-host-<location> (example name: sds-test-vnet-nsg-dbw-host-ne).

Once the NSG resource has been deployed, repeat the steps in section 1.2.1 to associate the public subnet with this NSG.
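The two NSG names differ only in one segment (cluster for the private subnet, host for the public one). The sketch below derives both from the same inputs. The function name `nsg_name` and its `role` parameter are hypothetical helpers introduced for illustration.

```python
def nsg_name(deployment_prefix: str, environment_id: str, role: str, location: str) -> str:
    """Return <deploymentPrefix>-<environmentId>-vnet-nsg-dbw-<role>-<location>.

    role is "cluster" for the private subnet NSG and "host" for the public one.
    """
    if role not in ("cluster", "host"):
        raise ValueError("role must be 'cluster' or 'host'")
    return f"{deployment_prefix}-{environment_id}-vnet-nsg-dbw-{role}-{location}"


print(nsg_name("sds", "test", "cluster", "ne"))  # sds-test-vnet-nsg-dbw-cluster-ne
print(nsg_name("sds", "test", "host", "ne"))     # sds-test-vnet-nsg-dbw-host-ne
```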

1.5 Check existing VNET subnets

By default, Sidra deploys some subnets on the VNET resource. To be used with the additional DSU, they must have the following configurations.

1.5.1 Default subnet

This is the default subnet used by the additional DSU. It can be the one already used by the first DSU or a new one called app.

The subnet must have these Service Endpoints selected:

  • Microsoft.KeyVault
  • Microsoft.Web
  • Microsoft.Storage
  • Microsoft.CognitiveServices

1.5.2 Storage subnet

This is the storage subnet of the new DSU. It can be the one already used by the first DSU or a new one called storage.

The subnet must have these Service Endpoints selected:

  • Microsoft.KeyVault
  • Microsoft.Storage
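The checks in this section can be sketched as a comparison between the required and the enabled Service Endpoints. The example below assumes the subnet description has the shape returned by `az network vnet subnet show` (a `serviceEndpoints` list whose items carry a `service` field). The names `REQUIRED_ENDPOINTS` and `missing_endpoints`, and the sample subnet, are hypothetical.

```python
REQUIRED_ENDPOINTS = {
    "default": {
        "Microsoft.KeyVault",
        "Microsoft.Web",
        "Microsoft.Storage",
        "Microsoft.CognitiveServices",
    },
    "storage": {"Microsoft.KeyVault", "Microsoft.Storage"},
}


def missing_endpoints(kind: str, subnet: dict) -> set:
    """Return the required Service Endpoints that the subnet does not have enabled.

    `subnet` is assumed to contain a "serviceEndpoints" list of {"service": ...} items.
    """
    enabled = {e["service"] for e in subnet.get("serviceEndpoints") or []}
    return REQUIRED_ENDPOINTS[kind] - enabled


# Sample subnet with only two of the four endpoints enabled.
app_subnet = {
    "name": "app",
    "serviceEndpoints": [
        {"service": "Microsoft.KeyVault"},
        {"service": "Microsoft.Storage"},
    ],
}
print(sorted(missing_endpoints("default", app_subnet)))
# ['Microsoft.CognitiveServices', 'Microsoft.Web']
```

An empty result means the subnet already meets the requirements listed for its kind.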

2. Request Supervisor API to deploy the additional DSU

Deploy the additional DSU by calling the following endpoint, filling in the request body as shown in this example:

# POST /api/Plugin/dsu
{
  "resource_name_suffix": "dsuN",
  "dsu_resource_group_name_suffix": "dsuN",
  "databricks_private_subnet": "databricks_cluster_dsuN",
  "databricks_public_subnet": "databricks_host_dsuN",
  "default_subnet": "app",
  "storage_subnet": "storage",
  "resource_group_location": "westeurope"
}
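One possible way to issue this call from a script is sketched below using only the Python standard library. The base URL, the token placeholder and the bearer-token authorization header are assumptions, since the page does not state how the Supervisor API is addressed or authenticated. The request is built but not sent.

```python
import json
import urllib.request

# Hypothetical values: replace with the real Supervisor API URL and a valid token.
BASE_URL = "https://example.invalid"
TOKEN = "<access-token>"

body = {
    "resource_name_suffix": "dsuN",
    "dsu_resource_group_name_suffix": "dsuN",
    "databricks_private_subnet": "databricks_cluster_dsuN",
    "databricks_public_subnet": "databricks_host_dsuN",
    "default_subnet": "app",
    "storage_subnet": "storage",
    "resource_group_location": "westeurope",
}


def build_dsu_request(base_url: str, token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST /api/Plugin/dsu request."""
    return urllib.request.Request(
        url=f"{base_url}/api/Plugin/dsu",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed authentication scheme
        },
        method="POST",
    )


request = build_dsu_request(BASE_URL, TOKEN, body)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```

Keeping the request construction separate from sending makes it easy to inspect the payload before triggering a deployment.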

The body has required and optional parameters, which are explained in the following sections.

2.1 Endpoint required parameters

These are the descriptions of each property that must be filled in:

  • resource_name_suffix:
    • Used as the suffix for the resource names within the RG, following the pattern: <deploymentPrefix>-<environmentId>-<resourceIdName>-<resource_name_suffix> (example: sds-test-kva-dsu2).
    • A maximum of 4 characters is allowed.

  • dsu_resource_group_name_suffix: DSU Resource group name suffix. To name the new RG, this suffix is appended to the primary RG name with the following pattern: <primaryRGName>-<dsu_resource_group_name_suffix> (example: Sidra.Test.DSU2).

  • databricks_private_subnet: Name of the private subnet created for the Databricks cluster of the new DSU (configured in section 1.1).

  • databricks_public_subnet: Name of the public subnet created for the Databricks cluster of the new DSU (configured in section 1.3).

  • default_subnet: Name of the default subnet checked in section 1.5.1.

  • storage_subnet: Name of the storage subnet checked in section 1.5.2.
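Before sending the request, the body can be checked on the client side against the rules listed above. The sketch below is illustrative only: the function name `validate_dsu_body` is hypothetical, and the API may apply additional validation that is not described on this page.

```python
REQUIRED_KEYS = (
    "resource_name_suffix",
    "dsu_resource_group_name_suffix",
    "databricks_private_subnet",
    "databricks_public_subnet",
    "default_subnet",
    "storage_subnet",
)
MAX_SUFFIX_LENGTH = 4  # limit stated for resource_name_suffix


def validate_dsu_body(body: dict) -> list:
    """Return a list of problems found in the request body (empty if none)."""
    problems = [f"missing or empty: {key}" for key in REQUIRED_KEYS if not body.get(key)]
    suffix = body.get("resource_name_suffix") or ""
    if len(suffix) > MAX_SUFFIX_LENGTH:
        problems.append(f"resource_name_suffix exceeds {MAX_SUFFIX_LENGTH} characters")
    return problems


example = {
    "resource_name_suffix": "dsu2",
    "dsu_resource_group_name_suffix": "dsu2",
    "databricks_private_subnet": "databricks_cluster_dsu2",
    "databricks_public_subnet": "databricks_host_dsu2",
    "default_subnet": "app",
    "storage_subnet": "storage",
}
print(validate_dsu_body(example))  # []
```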

2.2 Endpoint optional parameters

The following optional parameter can also be added:

  • resource_group_location: DSU Resource group location. By default, the new RG for the DSU is created in the same location as the primary RG. Specifying this parameter overrides that default.

The Sidra API endpoints related to Plugins are described in the related documentation page.


Last update: 2024-04-08