Deploy a Data Product
The deployment of Data Products requires their integration into Sidra's Virtual Network (VNet). This step is essential to ensure seamless operation and connectivity within the network infrastructure.
The configuration of the Data Product depends on whether Sidra was set up with the default VNet settings or integrated into an existing VNet. When Sidra is installed within an external VNet, the subnets for the Databricks resource must be configured first. The Sidra Web Manager asks for the names of these subnets when deploying the Data Product, so they must exist beforehand.
If Sidra was installed using the default VNet configuration, this initial step is unnecessary, and you can proceed directly to step 2, Create a Data Domain.
1. Configure subnets in the existing Sidra VNet
In a Sidra installation, within the primary Resource Group (RG), there is a Virtual Network (VNet) resource. This resource is named using the format `<deploymentPrefix>-<environmentId>-vnet` (example: `sds-test-vnet`).
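As a small illustration of the naming format above, the sketch below builds the expected VNet name from its two parts. The function name `sidra_vnet_name` is hypothetical and not part of Sidra; only the format and the `sds-test-vnet` example come from this page.

```python
def sidra_vnet_name(deployment_prefix: str, environment_id: str) -> str:
    """Build the VNet name using the <deploymentPrefix>-<environmentId>-vnet format."""
    return f"{deployment_prefix}-{environment_id}-vnet"


# The example from the text: prefix "sds", environment "test".
print(sidra_vnet_name("sds", "test"))  # sds-test-vnet
```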
In the Settings section, the Subnets panel lists all the subnets of the VNet. Clicking + Subnet opens an Azure form to create a new subnet. The example below displays the two subnets, already created, that are required for the Data Product deployment.
The following sections list the required subnets and their configurations.
1.1 Databricks private and public subnets
The Data Product requires new subnets on the VNet:
- The Databricks cluster private subnet. The name may be `<dataproductname>_databricks_cluster`.
- The Databricks cluster public subnet. The name may be `<dataproductname>_databricks_host`.
The configuration of both subnets must include:
- Service Endpoints:
    - Microsoft.KeyVault
    - Microsoft.Storage
- Subnet Delegation:
    - Microsoft.Databricks/workspaces
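The subnet requirements above can be written down as plain data before creating anything in Azure, for example to review them or to feed them into a deployment script. The following is a minimal sketch, not a Sidra or Azure API: `SubnetSpec` and `databricks_subnets` are hypothetical names, and address ranges are left out because this page does not specify them.

```python
from dataclasses import dataclass

# Values taken from the requirements listed above.
SERVICE_ENDPOINTS = ("Microsoft.KeyVault", "Microsoft.Storage")
DELEGATION = "Microsoft.Databricks/workspaces"


@dataclass(frozen=True)
class SubnetSpec:
    """Hypothetical description of one subnet required by the Data Product."""
    name: str
    service_endpoints: tuple = SERVICE_ENDPOINTS
    delegation: str = DELEGATION


def databricks_subnets(data_product_name: str) -> dict:
    """Return the private and public subnet specs using the suggested names."""
    return {
        "private": SubnetSpec(f"{data_product_name}_databricks_cluster"),
        "public": SubnetSpec(f"{data_product_name}_databricks_host"),
    }


# "dp01" is an illustrative Data Product name.
for role, spec in databricks_subnets("dp01").items():
    print(role, spec.name, spec.service_endpoints, spec.delegation)
```

Both specs share the same service endpoints and delegation, so only the name differs between the private and public subnet.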
1.2 Associate subnets to the Sidra Network Security Groups
Click on each of the NSGs, go to the Settings section, and select the Subnets option. From there, the Sidra NSGs can be associated with the previously created private and public subnets.
1.3 Check existing VNet subnets
The remaining resources of the Data Product are connected to the VNet through the existing Sidra subnets.
2. Create a Data Domain
- In Sidra Web Manager, go to the Data Product section, click New Data Product, and select the Data Product type to deploy, in this case Data Domain.
- The Configure Data Product section contains the following fields:
    - Data Product name. Required field. It must be a name that represents the purpose of the Data Product. Length is limited to between 1 and 15 characters.
    - Data Product description. Optional field for the description of the Data Product. Length is limited to between 1 and 200 characters.
    - Resource name prefix. Required field. It is used to name the resources created for this Data Product (`<prefix>-<environment>-<resourceTypeSufix>`). Do not start the value with numbers or special characters. Length is limited to 4 characters. For example, `DP01`.
    - Resource deployment environment. Required field. Environment in which the Data Product will be deployed, for example: `qa`, `dev`, `test`, `prod`, etc. Length is limited to 4 characters.
    - Data Product resource group name. Required field. It refers to the new Data Product resource group that will be created during the Data Product deployment with all the resources. Length is limited to between 1 and 15 characters. Although the name in our example is `Data.Domain.DP01`, a recommended naming convention is `Sidra.DP.<DPName>`, which makes the resource group easier to identify.
- If Sidra was installed using the default VNet configuration, this step can be skipped. Otherwise, in the next section, Configure Data Product VNet, set the Databricks private subnet and Databricks public subnet created previously.
- Check your new Data Product in the Data Products section in Sidra Web.
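The field constraints from the Configure Data Product step can be checked before filling in the form. The sketch below is illustrative only and is not part of Sidra Web Manager: the function names are hypothetical, "limited to 4 characters" is assumed to mean at most 4, and "do not start with numbers or special characters" is assumed to mean the prefix must start with a letter.

```python
def validate_data_product_config(name, prefix, environment, resource_group, description=""):
    """Return a list of constraint violations; an empty list means the values fit."""
    errors = []
    if not 1 <= len(name) <= 15:
        errors.append("Data Product name must be 1 to 15 characters long")
    # The description is optional, so only its upper bound is checked.
    if len(description) > 200:
        errors.append("Data Product description must be at most 200 characters long")
    if not 1 <= len(prefix) <= 4:
        errors.append("Resource name prefix must be 1 to 4 characters long")
    elif not prefix[0].isalpha():
        # Assumption: "no numbers or special characters" means a leading letter.
        errors.append("Resource name prefix must start with a letter")
    if not 1 <= len(environment) <= 4:
        errors.append("Resource deployment environment must be 1 to 4 characters long")
    if not 1 <= len(resource_group) <= 15:
        errors.append("Resource group name must be 1 to 15 characters long")
    return errors


def resource_name(prefix, environment, resource_type_suffix):
    """Compose a name in the <prefix>-<environment>-<resourceTypeSufix> format."""
    return f"{prefix}-{environment}-{resource_type_suffix}"


# Illustrative values; "Sales" and the "kv" suffix are made up for this example.
print(validate_data_product_config("Sales", "DP01", "dev", "Sidra.DP.DP01"))  # []
print(resource_name("DP01", "dev", "kv"))  # DP01-dev-kv
```

Returning a list of messages instead of raising on the first problem lets all invalid fields be reported at once.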
3. Check your Data Product in the Azure Portal
After completing the process above, check your resource groups. Approximately 15 minutes after the creation in Sidra Web, all resources are fully provisioned within their respective resource groups. The structure is as follows: