Deploy a Data Product

The deployment of Data Products requires their integration into Sidra's Virtual Network (VNet). This step is essential to ensure seamless operation and connectivity within the network infrastructure.

The configuration of the Data Product depends on whether Sidra was set up with the default VNet settings or integrated into an existing VNet. If Sidra is installed within an external VNet, you must first configure the subnets for the Databricks resource. The Sidra Web Manager asks for the names of these subnets when deploying the Data Product, so they must exist beforehand.

However, if Sidra was installed using the default VNet configuration, this initial step is unnecessary and you can proceed directly to step 2, Create a Data Domain.

1. Configure subnets in the existing Sidra VNet

In a Sidra installation, within the primary Resource Group (RG), there is a Virtual Network (VNet) resource. This resource will be named using the format <deploymentPrefix>-<environmentId>-vnet (example: sds-test-vnet).
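As a small illustration of the naming format above, the expected VNet name can be derived from the two values. This is only a sketch; the function name is hypothetical and not part of Sidra.

```python
def sidra_vnet_name(deployment_prefix: str, environment_id: str) -> str:
    """Build the VNet name in the form <deploymentPrefix>-<environmentId>-vnet."""
    return f"{deployment_prefix}-{environment_id}-vnet"


# Values taken from the example in the text.
print(sidra_vnet_name("sds", "test"))
```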

In the Settings section, the Subnets panel lists all the subnets of the VNet. Clicking + Subnet opens an Azure form for creating a new subnet. The example below shows the two subnets required for the Data Product deployment, already created.

[Image: subnet_configuration]

The following sections list the required subnets and their configuration.

1.1 Databricks private and public subnets

The Data Product requires two new subnets in the VNet:

  • The Databricks cluster private subnet. The name may be <dataproductname>_databricks_cluster.
  • The Databricks cluster public subnet. The name may be <dataproductname>_databricks_host.

Both subnets must be configured with:

  1. Service Endpoints:

    • Microsoft.KeyVault
    • Microsoft.Storage
  2. Subnet Delegation:

    • Microsoft.Databricks/workspaces
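The subnet requirements above can be written down as data, for example to review them before filling in the Azure form. The sketch below builds both subnet definitions in the property layout used by Azure Resource Manager subnets (`serviceEndpoints`, `delegations`). The Data Product name, the address ranges, and the delegation label are placeholders chosen for illustration, not values prescribed by Sidra.

```python
import json

SERVICE_ENDPOINTS = ["Microsoft.KeyVault", "Microsoft.Storage"]
DELEGATION_SERVICE = "Microsoft.Databricks/workspaces"


def databricks_subnet(name: str, address_prefix: str) -> dict:
    """Describe one subnet with the required service endpoints and delegation."""
    return {
        "name": name,
        "properties": {
            "addressPrefix": address_prefix,
            "serviceEndpoints": [{"service": s} for s in SERVICE_ENDPOINTS],
            "delegations": [
                {
                    # The delegation label is arbitrary (hypothetical value).
                    "name": "databricks-delegation",
                    "properties": {"serviceName": DELEGATION_SERVICE},
                }
            ],
        },
    }


def databricks_subnets(data_product_name: str, private_prefix: str, public_prefix: str) -> list:
    """Return the private (cluster) and public (host) subnet definitions."""
    return [
        databricks_subnet(f"{data_product_name}_databricks_cluster", private_prefix),
        databricks_subnet(f"{data_product_name}_databricks_host", public_prefix),
    ]


# Placeholder name and address ranges; use ranges that are free in your VNet.
subnets = databricks_subnets("dp01", "10.0.10.0/24", "10.0.11.0/24")
print(json.dumps(subnets, indent=2))
```

Keeping the endpoints and the delegation in one helper makes it harder for the two subnets to drift apart, since both must carry the same settings.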

1.2 Associate subnets to the Sidra Network Security Groups

Open each of the Sidra NSGs and, in the Settings section, select the Subnets option. From there, associate the NSG with the private and public subnets created previously.

[Image: associate_subnet_nsg]
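In Azure Resource Manager terms, the association made in the portal corresponds to a `networkSecurityGroup` reference on the subnet. The sketch below shows that shape only. The subscription ID, resource group, and NSG name are hypothetical placeholders, and which Sidra NSG belongs to which subnet must be taken from your own installation.

```python
def nsg_resource_id(subscription_id: str, resource_group: str, nsg_name: str) -> str:
    """Build the Azure resource ID of a Network Security Group."""
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Network/networkSecurityGroups/{nsg_name}"
    )


def associate_nsg(subnet: dict, nsg_id: str) -> dict:
    """Return a copy of the subnet definition that references the given NSG."""
    properties = {**subnet.get("properties", {}), "networkSecurityGroup": {"id": nsg_id}}
    return {**subnet, "properties": properties}


# Hypothetical values, for illustration only.
nsg_id = nsg_resource_id("00000000-0000-0000-0000-000000000000", "example-rg", "example-nsg")
subnet = {"name": "dp01_databricks_cluster", "properties": {"addressPrefix": "10.0.10.0/24"}}
print(associate_nsg(subnet, nsg_id))
```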

1.3 Check existing VNet subnets

The remaining resources of the Data Product connect to the VNet through the existing Sidra subnets.

2. Create a Data Domain

  1. In Sidra Web Manager, go to the Data Product section, click New Data Product, and select the Data Product type to deploy (in this case, Data Domain):

    [Image: Data Domain Created]

  2. The Configure Data Product section contains the following fields:

    • Data Product name. Required field. Choose a name that represents the purpose of the Data Product. Length: 1 to 15 characters.
    • Data Product description. Optional field describing the Data Product. Length: 1 to 200 characters.
    • Resource name prefix. Required field. It is used to name the resources created for this Data Product (<prefix>-<environment>-<resourceTypeSufix>). The value must not start with a number or a special character. Length: up to 4 characters. For example, DP01.
    • Resource deployment environment. Required field. The environment in which the Data Product will be deployed, for example: qa, dev, test, prod, etc. Length: up to 4 characters.
    • Data Product resource group name. Required field. The name of the new resource group that is created during the Data Product deployment and holds all its resources. Length: 1 to 15 characters. Although the name in our example is Data.Domain.DP01, the recommended naming convention is Sidra.DP.<DPName>, which makes the resource group easier to identify.
  3. If Sidra was installed using the default VNet configuration, this step can be skipped. Otherwise, in the next section, Configure Data Product VNet, enter the names of the Databricks private subnet and Databricks public subnet created previously.

  4. Check your new Data Product in the Data Products section of Sidra Web.

    [Image: Data Product Created]
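The length and prefix rules for the Configure Data Product fields can be checked before submitting the form. The sketch below is not part of Sidra. It covers the name, description, prefix, and environment fields, and it rests on three assumptions: an empty description is allowed because the field is optional, "limited to 4 characters" means at most 4, and a valid prefix starts with an ASCII letter.

```python
import re
from dataclasses import dataclass


@dataclass
class DataProductForm:
    """Hypothetical container for the form values; not a Sidra type."""
    name: str
    resource_prefix: str
    environment: str
    description: str = ""


def validate(form: DataProductForm) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if not 1 <= len(form.name) <= 15:
        errors.append("name must be 1 to 15 characters")
    if len(form.description) > 200:
        errors.append("description must be at most 200 characters")
    if not 1 <= len(form.resource_prefix) <= 4:
        errors.append("resource prefix must be 1 to 4 characters")
    elif not re.match(r"[A-Za-z]", form.resource_prefix):
        # Assumption: "no numbers or special characters" means a leading letter.
        errors.append("resource prefix must start with a letter")
    if not 1 <= len(form.environment) <= 4:
        errors.append("environment must be 1 to 4 characters")
    return errors


print(validate(DataProductForm(name="Sales", resource_prefix="DP01", environment="qa")))
print(validate(DataProductForm(name="Sales", resource_prefix="1DP", environment="qa")))
```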

3. Check your Data Product in the Azure Portal

After completing the process above, check your resource groups. Approximately 15 minutes after the Data Product is created in Sidra Web, all resources are fully provisioned within their respective resource groups. The structure is as follows:

  • Data.Domain.DP01

    [Image: data product rg]

  • Data.Domain.DP01-dp01-qa-dbr (the resource group managed by the Azure Databricks resource)
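To know which resource groups to look for, the two names can be derived from the form values. The pattern below is inferred from the single example shown here (the prefix in lowercase, followed by the environment and a dbr suffix). It is an assumption, not a documented Sidra rule, so treat the result as a hint rather than a guarantee.

```python
def expected_resource_groups(resource_group: str, prefix: str, environment: str) -> list:
    """Return the Data Product resource group and the Databricks-managed one.

    The managed name pattern is inferred from the example in the text.
    """
    managed = f"{resource_group}-{prefix.lower()}-{environment}-dbr"
    return [resource_group, managed]


# Values taken from the example: prefix DP01, environment qa.
print(expected_resource_groups("Data.Domain.DP01", "DP01", "qa"))
```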