How to create a new Sidra Data Product¶
This tutorial provides information about how to create, deploy and setup a Data Product from scratch. The overall steps to follow are:
- Step 1. Get all the required information - the parameter values - covered in the Prerequisites section below.
- Step 2. Create the Data Product with the .NET SDK tooling: install the application template, then create a new project based on it.
- Step 3. Populate the deployment variables, so that the deployed app is integrated with its corresponding Sidra Core and DSU deployments.
- Step 4. Set up the Data Factory pipelines to load the data from the DSU into the Data Product, and thus keep the data in sync.
Prerequisites¶
Creating a Data Product means essentially creating a new .NET application, based on a Sidra-specific project template.
Sidra provides some Visual Studio templates to create the basic solution for a Data Product. Among the Sidra-provided Visual Studio templates are the following:

- Basic Data Product template, available in the NuGet package `Sidra.DotNetCore.ClientAppTemplate`.
- DataLab Data Product template, available in the NuGet package `Sidra.DotNetCore.DataLabAppTemplate`.
These .NET app project templates also include Continuous Integration/Continuous Deployment (CI/CD) pipelines or pipeline templates. These pipelines/templates need their parameters populated; then the deployment pipeline must be run to create and configure the Azure resources.
For this, the main prerequisites are:
- The .NET SDK - version 3.1 or later - installed on the machine.
- The Git client installed.
- A Sidra installation, along with its Azure DevOps space, and a Git repository for the Data Product that you will be creating.
Below, you will find the values that are required to follow this tutorial at each of its stages. When checking the code, please replace the values inside curly braces `{}` or angle brackets `<>` with the proper variables defined below.
Step 1. The Git repository¶
- Make sure you have - or create - a Git repository in Azure DevOps, where the Data Product code will be uploaded, and from where the deployment pipelines will be run.
- Consider the Data Product naming. The name of the folder where the app will be created matters: it determines the name of the solution and of some of the Azure elements.
- Create a folder on the local disk where the Azure DevOps repository will be cloned. Clone the repository into your local machine, inside the created folder.
Note that there will be one specific branch per environment on the Azure DevOps repository where we want to deploy a Data Product (e.g. dev, test, prod).
Step 2. The Azure Resource Group¶
The Azure resources needed by the Data Product will be created in a Resource Group, in the same Azure subscription where Sidra's Core and DSU Resource Groups are found. The Resource Group should follow any applicable naming convention.
The Resource Group should grant permissions to the Azure AD applications that Sidra requires, which are created at Sidra installation time through the CLI tool. In theory, these should be inherited from the current subscription:
- VSTS application: the Azure AD App Registration whose name ends in `.VSTS`, as Owner/Contributor.
- Manager application: the Azure AD App Registration whose name ends in `.Manager`, as Contributor.
Step 3. The .NET SDK project template¶
This is a one-time step: once the template is installed, all subsequent Data Products can be created from it.
If a previous version of the template exists, you will need to uninstall it first.
1. Uninstall previous version(s), if applicable¶
- Quickly check whether any Sidra Data Product template is installed alongside your .NET SDK:
- If there is a Sidra app project template, you can check its details, such as the version number, using:
- Uninstall a template by specifying its full NuGet package name:
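As a sketch, these checks and the uninstall can be done with the standard .NET CLI template commands (the package name shown is the Basic template mentioned in the Prerequisites; adjust to the package actually installed):

```shell
# List the templates currently installed alongside the .NET SDK
dotnet new --list

# Running uninstall without arguments prints the installed template
# packages, their versions, and the exact uninstall command for each
dotnet new --uninstall

# Uninstall a template package by its full NuGet package name
dotnet new --uninstall Sidra.DotNetCore.ClientAppTemplate
```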
2. Install the desired version of the project template¶
The template code project can be available in a NuGet feed (for example, for the Sidra-provided DataLabs or ClientApp Data Products). The template code project can also live in a customer Azure DevOps repository, in the Sidra project. A Sidra engineer will provide the NuGet/MyGet source URL for the project template; it will look like the `--nuget-source` parameter below. The `{FeedId}` parameter of that URL needs to be replaced with its actual value.
Also, note that there are multiple project templates for Sidra Data Products. The names (and versions) of these project templates will be provided by a Sidra engineer.
From this template project, an actual Data Product code project (an instantiation of the template) will be created with a single `dotnet` command, as seen below.
Obtaining the Data Product template (download and installation)
- The templates may be located within the `Sidra.DotNetCore` Git repository and released as NuGet packages.
- Assuming that the NuGet feed contains the NuGet package with the template, the template can be installed by running the following command in a CMD or PowerShell window:
where:
- `NuGet Package Id`: the NuGet Package Id for a specific template, optionally adding the desired version. Note that if the version is not specified, the latest one will be installed.
- `NuGet Feed Url`: the URL of the NuGet feed.
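As an illustration, the install command would look like the following. The version number and the feed URL here are placeholders, not confirmed values; use those supplied by the Sidra engineer:

```shell
# Install a specific version of a Sidra template package from the feed.
# Package name, version, and feed URL are examples only.
dotnet new -i Sidra.DotNetCore.ClientAppTemplate::1.9.2 --nuget-source "https://www.myget.org/F/{FeedId}/api/v3/index.json"
```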
- Alternatively, if the template code project is in an Azure DevOps repository, we will first need to clone that repository onto our machine.
- Once this is done, we will execute the following command:
where:
- `PATH_TO_CLIENT_APP_TEMPLATE`: the local path where the repository that we just cloned is.
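A sketch of the local installation, using the placeholder from the list above:

```shell
# Install the template from the local clone of the template repository.
# PATH_TO_CLIENT_APP_TEMPLATE is the folder where the repository was cloned.
dotnet new -i PATH_TO_CLIENT_APP_TEMPLATE
```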
The installation of a new VS template is done for a specific .NET Core SDK version: the one configured for the path where the installation command is executed, which can be checked with this command:
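For example, the SDK version in effect can be checked with:

```shell
# Shows the .NET SDK version selected for the current directory
dotnet --version

# Lists every SDK installed on the machine
dotnet --list-sdks
```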
Once the templates are installed, a project can be created based on them. To list the available installed templates, the following command can be executed:
A .NET Core template can include parameters to populate the created solution. A set of default parameters is available and can be listed using the command:
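A hedged sketch of these listing and help commands (the `sidra-app` short name is the one used later in this tutorial):

```shell
# List all installed templates with their short names
dotnet new --list

# Show the parameters (default and template-specific) of a given template
dotnet new sidra-app --help
```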
Additionally, every template can define its own list of additional parameters, which can be shown using the following command:

Step 4. Create the Data Product on your disk¶
- Navigate to the folder where the Git repository from Step 1 has been cloned, selecting the right branch.
- Create a new .NET app project, based on the selected Sidra Data Product project template, using a command line like:

```
C:\> cd ExampleAnalysisClient
C:\ExampleAnalysisClient> dotnet new sidra-app `
    --appName "<App-Name>" `
    --serviceConnectionName "<Service-Connection-Name>" `
    --environment "<Environment-Name>" `
    --force
```

where:
- `App Name`: the internal application name; note that only alphanumeric characters are allowed. Example: `ExampleAnalysisClient`. It affects, for example, the name of the app's deployment variable group.
- `Service Connection Name`: the name of the Service Connection in Azure DevOps.
- `Environment`: the name of the environment where the Data Product will be deployed. Maximum 5 characters. Commonly `Dev`, `Test`, or `Prod`.
- If the command has been successfully executed, the result should be:
Data Products solution structure¶
When the solution created by the Data Product template is opened in Visual Studio, the code content should look like the list of components below. Depending on the Data Product template there may be additional or different components; this one is for the Basic Data Product (template `sidra-app`). Generally, there should be a set of folders, the main source code folder (`src`), plus other additional files.
Under the `src` folder we can see the different projects of the .NET solution:
- A deployment project named `Solution name.Deployment`, e.g. `Sidra.Dev.Client.Deployment`. This is the same project that can be seen in the Sidra Core solutions; it contains the deployment scripts for all the infrastructure of the Data Product. It includes more or fewer PowerShell scripts depending on the complexity of the deployed infrastructure.
- A Sync webjob project named `Solution name.DataFactory.Sync`. This webjob is used to transparently synchronize metadata between Core and the Data Product database. This is the mechanism by which the Data Product is made aware of the Assets ingested into Sidra Core.
- A DatabaseBuilder webjob project named `Solution name.Webjobs.DatabaseBuilder`. This is the same project that can be seen in the Sidra Core solutions; its main purpose is to create the needed tables and seed data in the Data Product database.
- A DataFactoryManager webjob project named `Solution name.Webjobs.DataFactoryManager.Client`. This project is very similar to the DataFactoryManager project in the Sidra Core solutions; its main purpose is to run the processes that synchronize the Data Factory metadata in the database with the actual Azure Data Factory infrastructure.
- A Database project named `Solution name.Database`. This project contains the Data Product database.
- A WebApi project named `Solution name.WebApi`, e.g. `Sidra.Dev.Client.WebApi`. This project contains the Data Product API logic.
Below is a depiction of such project structure:
Apart from the `src` folder, other important files present are:
- `azure-pipelines.yml`. The YAML file for build and release orchestration (see the section below). It is used to create a valid Azure DevOps YAML pipeline, compiling the solution and generating the required artifacts in the release steps. This YAML may also require additional YAML files contained in the `Templates` folder.
- `*.props`. The required .NET Core files that include references to the packages used to build the solution. They ensure that the latest Sidra package versions are used.
- `.sln`. The .NET solution file.
Step 5. Deployment variables¶
Sidra Data Product pipelines are designed to perform installation and deployment of the Data Products in a highly automated way, following a Continuous Integration/Continuous Deployment (CI/CD) process.
The deployment project described in this page usually includes the following:
- One or several .ps1 PowerShell scripts. There is generally one orchestrator script, and there may be several other PowerShell scripts to deploy specific infrastructure components.
- Possibly an environment data .psd1 file with parameters about the infrastructure components and settings. These settings are then used internally to populate an Azure DevOps variable group, which is required by the release process. Depending on the Sidra version when the Data Product template was created, the deployment may not need the .psd1 file and may just require configuring the Azure DevOps variable group.

Versions
From Sidra version 1.9 (2021.R2), new templates are created without the need for such a .psd1 file (it will be empty); they directly use a DevOps variable group populated with those parameters.
Deploying a Data Product essentially means running its deployment pipeline, after the Git repository of the created Data Product is pushed into the Azure DevOps space in the respective branch where the Data Product needs to be deployed.
The deployment pipeline is parameterized with a variable group, in the Azure DevOps space. These deployment pipeline variables should be prepared before attempting to run the deployment pipeline.
The Data Product templates help automate the creation of these variables: they contain PowerShell scripts that create and populate the variable groups with the required parameters. How these PowerShell scripts work depends on the type of Data Product template. The scripts for this specific Data Product template are explained below.
PowerShell script¶
- For our illustrated `sidra-app` - short for `Sidra.DotNetCore.ClientAppTemplate` - the created app contains a script that needs to be edited to contain all the variables that the deployment needs, and then executed. The script is located right in the folder where the app was created in Step 4; look for `ConfigureDevOps.ps1`.

Some of the variables may contain sensitive information, such as passwords. We highly recommend that you do not commit files that contain secrets to the Git repository.
- Optionally, move the above script file from the local Data Product folder to another location on the disk, outside the Git repository (see the security note above).
- Then execute the script from that location. The command used for the execution is:

```
.\ConfigureDevOps.ps1 -organization https://dev.azure.com/ORGANIZATION -project PROJECT -environments ENV
```

where:
- `ORGANIZATION` is the name of the organization in Azure DevOps.
- `PROJECT` is the name of the Azure DevOps project, usually `Sidra`, but it may be a different name.
- `ENV` is the name of the environment where we are deploying, e.g. `dev`, `test`, `prod`.
- Look for the line where the deployment pipeline variable group is created; it contains:
- Around (or in) that line, the deployment variables need to be populated. Edit the script to populate the variables with their values when creating the variable group:
where:
- `NAME` is the name of the variable group. We do not need to indicate it, as it will be created automatically from the environment.
- `VAR_LIST` is the list of variables, where we will specify the values.
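For reference only: what the script does at that line is equivalent to a call like the following with the Azure CLI (this requires the azure-devops extension; the group name and the variables shown are illustrative placeholders, not confirmed values):

```shell
# Create a deployment variable group with some example variables.
# The group name follows the ENVIRONMENT.APP_NAME convention used by Sidra.
az pipelines variable-group create --organization https://dev.azure.com/ORGANIZATION --project PROJECT --name "dev.ExampleAnalysisClient" --variables SubscriptionId="<subscription-id>" TenantId="<tenant-id>"
```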
As introduced above, the continuous integration process requires configuring a build and release using a YAML file (`azure-pipelines.yml`).

Info
- For reference on how this line works, you can consult the inline help of the command:
- In the parameters addendum below, you can find more details about how to obtain the deployment values and what they mean.
- Once the variable values are populated, you may execute the PowerShell script. The script itself requires some parameters too. If the last (optional) one - `environments` - is not provided, the variables will be created for the three default environments: `Dev`, `Test`, and `Prod`. You may execute the script for a specific environment using a command like:
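For instance, to create the variables for only the Dev environment (same script and parameters as in the earlier command):

```shell
# Run from a PowerShell window in the folder containing the script
.\ConfigureDevOps.ps1 -organization https://dev.azure.com/ORGANIZATION -project PROJECT -environments Dev
```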
Step 6. Upload into DevOps¶
- Now that the application has been created in the local copy of the Git repository, and the deployment pipeline has been properly parameterized, the changes can be committed and pushed to the Azure DevOps Git repository.
- This can be done from the command line instead of using Visual Studio or some other tooling:
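A minimal sketch of the Git commands (the branch name `dev` is an example; use the branch of your target environment):

```shell
git checkout dev                            # branch for the target environment
git add --all                               # stage the generated solution
git commit -m "Add Data Product solution"   # commit the changes
git push origin dev                         # push to Azure DevOps
```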
Step 7. Execute deployment pipeline¶
The newly uploaded Data Product has an Azure DevOps deployment pipeline defined. Once this is run, the needed resources for the application are created.
The steps for the execution of this deployment pipeline are detailed below.
Note for old versions
For versions of the Data Product template still using the old mechanism of the environment data file (.psd1), an Azure DevOps variable group still needs to be created manually, even if it is empty; it will later be populated automatically by the deployment script. The variable group cannot be created as part of a deployment script, so without this manual step the deployment will fail. To create it, go to Azure DevOps > Pipelines > Library and click on the button to create a new variable group:
The name of this variable group must be `ENVIRONMENT.APP_NAME`. For example, in the case of DataLab for a dev environment it would be `dev.DataLab`. Once created, enter the variable group and add a variable named `Dummy` with value `Dummy`:
Steps¶
- Step 1. Create the environment that the application will use. This is a must: otherwise, when the pipeline is launched, the release will fail with an error saying the environment cannot be found.
- Step 2. Now the generated solution can be uploaded to the Git repository on DevOps, and the CD pipeline can be associated with it. From the repository, click on Set up build:
- Step 3. Select the option Existing Azure Pipelines YAML file:
- Step 4. Select the branch according to the environment where the installation is being set up, as well as the path to the YAML file with the template pipeline, as seen below:
- Step 5. Confirm the changes and execute the pipeline to proceed with the deployment.

Thus, each time code is pushed to the repository, the build will be launched. If any change is required in the build, the YAML file can be modified and the code pushed to the repository.
The complete continuous workflow for the CI/CD when making changes to the Data Product would then be:
- The developer will push the source code changes to the Git repository.
- The build and release pipeline will be automatically triggered by the changes in the repository. The release will use the artifacts generated by the build to deploy the infrastructure and the metadata for the solution.
Note
Even when everything is well configured, the first time the pipeline is launched one or more permissions will be requested for approval, for example:
- The pipeline must be allowed to use the indicated variable group.
- The pipeline must be allowed to use the created environment.
Step 8. Access permissions¶
Once the Data Product is in place, you will be able to see the created resource group in the Azure portal. In order to access the data from the data lake, access permissions need to be configured for this Data Product. The easiest way to set those permissions is to use the management UI that comes with Sidra:
- Go to the front-end administrative application - the Sidra Manager console - and navigate to the Authorizations section.
- Among the users and principals, identify the newly created Data Product; in our example, ExampleAnalysisClient.
- Give to the Data Product the permissions needed to access the Entities from the Data Lake. This is done by checking the boxes with the respective Entities.
Post-deployment tasks¶
Loading data: SQL database¶
- There is an SQL database project inside the newly created Data Product solution. An example script creates a `Staging` schema and places inside it tables with the same structure defined in the Core database, that is, with the same column types and order as the Attributes defined for the Entity.

Convention: the staging table should be named with the following format: `<providerName>_<entityName>`.

- To add the required business logic that generates the production tables in the Data Product, you need to call this logic from an orchestrator stored procedure. You can create a stored procedure inside the staging schema with the following structure and complete it depending on the Data Product logic.
CREATE PROCEDURE [staging].[orchestrator]
@pipelineRunId NVARCHAR(100)
AS
SELECT 1
-- TODO: POPULATE WITH LOGIC
RETURN 0
GO
Example
The following code is just an example depicting the structure of the orchestrator stored procedure to use in the Data Product.
CREATE PROCEDURE [staging].[orchestrator] (@pipelineRunId NVARCHAR(100))
AS
BEGIN
DECLARE @IdPipeline INT = (SELECT TOP 1 IdPipeline FROM [Sidra].[ExtractPipelineExecution] WHERE [PipelineRunId] = @pipelineRunId)
DECLARE @IdLoad INT;
EXECUTE @IdLoad = [Sidra].[LoadProcessLog] 1, @IdPipeline, NULL, @pipelineRunId;
DECLARE @AssetOrder INT = 1;
DECLARE @MaxAssetOrder INT;
SELECT @MaxAssetOrder = COALESCE(MAX([AssetOrder]), 0) FROM [Sidra].[ExtractPipelineExecution] WHERE [PipelineRunId] = @pipelineRunId
BEGIN TRY
IF @MaxAssetOrder > 0
BEGIN
WHILE (@AssetOrder <= @MaxAssetOrder)
BEGIN
--- YOUR CODE HERE
--- YOUR CODE HERE
--- YOUR CODE HERE
SET @AssetOrder=@AssetOrder + 1
END
EXECUTE [Sidra].[LoadProcessLog] 2, @IdPipeline, @IdLoad, @pipelineRunId;
EXECUTE [Sidra].[UpdateAssetLoadStatus] @pipelineRunId, 101;
END
IF @MaxAssetOrder = 0
BEGIN
EXECUTE [Sidra].[LoadProcessLog] 3, @IdPipeline, @IdLoad, @pipelineRunId;
END
END TRY
BEGIN CATCH
DECLARE @Message nvarchar(4000) = ERROR_MESSAGE()
DECLARE @ErrorSeverity INT = ERROR_SEVERITY();
DECLARE @ErrorState INT = ERROR_STATE();
EXECUTE [Sidra].[LoadProcessLog] 0, @IdPipeline, @IdLoad, @pipelineRunId;
EXECUTE [Sidra].[UpdateAssetLoadStatus] @pipelineRunId, 102;
RAISERROR (@Message, @ErrorSeverity, @ErrorState);
END CATCH
END
Loading data: Pipeline definition¶
The below steps involve creating the actual instance of the pipeline from a Sidra-provided template in the Data Product database. This operation can be done through an SQL script run on the Data Product. For this:
- Create a script inside the `Solution name.Webjobs.DatabaseBuilder/Scripts/ClientContext` folder in the project and name it with the format `<ORDER NUMBER>_`. Ensure that the Data Product has read permissions on the Entities to be ingested, and manually include the EntityPipeline relationship between the pipeline and those Entities. Ensure as well that this file is marked with "Copy Always" in its build properties.

The purpose of this script is to create a pipeline that handles the data movement between the DSU and this Data Product, dropping the content into the Staging tables. If the naming convention was followed, the pipeline will do this automatically.

For further information about how pipelines work, refer to Sidra Client pipelines and Add a new pipeline for a Data Product.
- The pipeline will be created from one of the following templates. With DSUExtractionStorageAndSQLWithSQLAutoscalePipelineTemplate, the pipeline will automatically scale up the database to the plan specified in the Execution Parameters when running the ingestion process; after completing the ingestion, the pipeline will scale the database back down to the original plan. The other template (second script below) will not perform any scaling operation during the ingestion process.
Pipeline creation templates
```sql
--------------------
-- Variables
--------------------
DECLARE @IdProvider INT = (SELECT Id FROM [Sidra].[Provider] WHERE ProviderName = '<PROVIDER NAME>')
DECLARE @ExtractPipelineTemplateId INT = (
    SELECT [Id] FROM [Sidra].[PipelineTemplate]
    WHERE [ItemId] = 'BF367329-ABE2-42CD-BF4C-8461596B961C'
)
DECLARE @ExtractPipelineItemId UNIQUEIDENTIFIER = newid()
DECLARE @IdPipelineSyncBehaviour INT = 1 -- Usually 1 or 4. See documentation for details
--0 Ignored
--1 LoadAllNewDates
--2 LoadFirstNewDate
--3 LoadLastNewDate
--4 LoadPendingUpToValidDate

--------------------
-- Cleanup
--------------------
DELETE FROM [Sidra].[EntityPipeline] WHERE [IdPipeline] IN (SELECT [Id] FROM [Sidra].[Pipeline] WHERE [ItemId] = @ExtractPipelineItemId)
DELETE FROM [Sidra].[Pipeline] WHERE [ItemId] = @ExtractPipelineItemId

--------------------
-- Pipeline creation
--------------------
INSERT INTO [Sidra].[Pipeline] ([ItemId], [Name], [ValidFrom], [ValidUntil], [IdTemplate], [LastUpdated], [LastDeployed], [IdDataFactory], [IsRemoved], [IdPipelineSyncBehaviour], [Parameters], [ExecutionParameters])
VALUES (@ExtractPipelineItemId, N'ExtractScalingDB', GETUTCDATE(), NULL, @ExtractPipelineTemplateId, GETUTCDATE(), NULL, 1, 0, 1, N'{}',
N'{
    "storedProcedureName": "[staging].[orchestrator]",
    "scaleUpCapacity":"<CAPACITY E.G. 50>",
    "scaleUpTierName":"<TIER NAME E.G. S2>"
}')
DECLARE @ExtractPipelineId INT = (
    SELECT [Id] FROM [Sidra].[Pipeline]
    WHERE [ItemId] = @ExtractPipelineItemId
)

--------------------
-- EntityPipeline relation creation
--------------------
-- Relate pipeline with all the Entities from a Provider which the Data Product has access to
INSERT INTO [Sidra].[EntityPipeline]([IdEntity], [IdPipeline], [IsMandatory])
SELECT Id AS [IdEntity], @ExtractPipelineId, 1
FROM [Sidra].[Entity]
WHERE IdProvider = @IdProvider
```
`PROVIDER NAME`, `CAPACITY E.G. 50` and `TIER NAME E.G. S2` must be replaced by their corresponding values.

```sql
--------------------
-- Variables
--------------------
DECLARE @IdProvider INT = (SELECT Id FROM [Sidra].[Provider] WHERE ProviderName = '<PROVIDER NAME>')
DECLARE @ExtractPipelineTemplateId INT = (
    SELECT [Id] FROM [Sidra].[PipelineTemplate]
    WHERE [ItemId] = '19C95A0E-3909-4299-AEE1-15604819E2B0'
)
DECLARE @ExtractPipelineItemId UNIQUEIDENTIFIER = newid()
DECLARE @IdPipelineSyncBehaviour INT = 1 -- Usually 1 or 4. See documentation for details
--0 Ignored
--1 LoadAllNewDates
--2 LoadFirstNewDate
--3 LoadLastNewDate
--4 LoadPendingUpToValidDate

--------------------
-- Cleanup
--------------------
DELETE FROM [Sidra].[EntityPipeline] WHERE [IdPipeline] IN (SELECT [Id] FROM [Sidra].[Pipeline] WHERE [ItemId] = @ExtractPipelineItemId)
DELETE FROM [Sidra].[Pipeline] WHERE [ItemId] = @ExtractPipelineItemId

--------------------
-- Pipeline creation
--------------------
INSERT INTO [Sidra].[Pipeline] ([ItemId], [Name], [ValidFrom], [ValidUntil], [IdTemplate], [LastUpdated], [LastDeployed], [IdDataFactory], [IsRemoved], [IdPipelineSyncBehaviour], [Parameters], [ExecutionParameters])
VALUES (@ExtractPipelineItemId, N'ExtractScalingDB', GETUTCDATE(), NULL, @ExtractPipelineTemplateId, GETUTCDATE(), NULL, 1, 0, 1, N'{}',
N'{
    "storedProcedureName": "[staging].[orchestrator]"
}')
DECLARE @ExtractPipelineId INT = (
    SELECT [Id] FROM [Sidra].[Pipeline]
    WHERE [ItemId] = @ExtractPipelineItemId
)

--------------------
-- EntityPipeline relation creation
--------------------
-- Relate pipeline with all the Entities from a Provider which the Data Product has access to
INSERT INTO [Sidra].[EntityPipeline]([IdEntity], [IdPipeline], [IsMandatory])
SELECT Id AS [IdEntity], @ExtractPipelineId, 1
FROM [Sidra].[Entity]
WHERE IdProvider = @IdProvider
```

`PROVIDER NAME` must be replaced by its corresponding value.

Some key aspects about this script are:
- Complementary to this script, the `[Staging].[Orchestrator]` stored procedure must be already created. In fact, this information is used in both scripts, in the Pipeline creation, as part of the Execution Parameters: `"storedProcedureName": "[staging].[orchestrator]"`.
- For this pipeline to be executed automatically in Data Factory, read permissions must be assigned beforehand to the application over the Entities to be ingested from Core into the client database.
- Once the script for the pipeline creation has been executed and permissions have been given to the application over the Entities, the DataFactory webjob (in the App Service of the Data Product) will have to be executed to deploy the pipeline in Azure Data Factory.

DataFactory webjob execution
To execute the DataFactory webjob, click on the App Service resource inside the Data Product Resource Group. Once there, in the left side menu, within the Settings section, select the Webjobs option. The list of webjobs should appear; the DataFactory webjob can then be executed by clicking on it and then clicking on Run, as shown in the image below:
- A new pipeline will be created with the given ItemId.
- For the pipeline to be automatically triggered from ADF, the relation between the Entities to be ingested and the pipeline itself must be established in EntityPipeline. This is required together with the read permissions on the Entities. In other words, if the Data Product has permissions on an Entity or group of Entities but the relation with the pipeline is not recorded in EntityPipeline, those Entities will not be included as part of the ingestion, and vice versa. The above script creates relationships between the pipeline and all the Entities from the Provider that it has access to.
Delete Data Products¶
To delete a Data Product once it is no longer needed, apply the following steps:
- Remove the Resource Group created in Step 2.
- Clean up the client registration from Sidra Core; otherwise, the next deployment will fail with a "409" error because an application with the same name already exists.
Client registration cleanup
```sql
DELETE FROM [Auth].[ApiKeys] WHERE [Name] = 'ExampleAnalysisClient'
DELETE FROM [Auth].[RoleSubjects] WHERE [SubjectId] IN (SELECT [Id] FROM [Auth].[Subjects] WHERE [Name] LIKE '%ExampleAnalysisClient%')
DELETE FROM [Auth].[Subjects] WHERE [Name] LIKE '%ExampleAnalysisClient%'
DELETE FROM [Apps].[App] WHERE [Name] = 'ExampleAnalysisClient'
DELETE FROM [Auth].[Applications] WHERE [Name] = 'ExampleAnalysisClient'
```
Update Data Products¶
To update a Data Product properly, an example is shown below of an upgrade from version 1.9 to version 1.9.2. The template chosen is the DataLab Data Product template.
- To download the Data Product template in the desired version for the upgrade, the following command will be executed:
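Assuming the DataLab package name given in the Prerequisites and an illustrative feed URL (replace both with the values provided by the Sidra engineer), the command would look like:

```shell
# Install version 1.9.2 of the DataLab template from the feed
dotnet new -i Sidra.DotNetCore.DataLabAppTemplate::1.9.2 --nuget-source "https://www.myget.org/F/{FeedId}/api/v3/index.json"
```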
- Now, generate the solution from that template, the same way as during the installation process. This must be done in the Git repository in Azure DevOps where the application solution is located. If the repository has not been cloned yet, clone it and check out the branch to be updated.
- Execute the following command to create the files that the downloaded template generates:

```
dotnet new sidra-datalab `
    --appName "DataLab" `
    --serviceConnectionName "ServiceConnectionName" `
    --environment "YourEnvironment" `
    --force
```
Where:
- `appName`: must correspond to the name of the original application; in this example, DataLab.
- `serviceConnectionName`: the same service connection used in the Sidra Core environment; in this case, sidradev.
- `environment`: the environment where the Data Product is placed.
- Once the last command has been executed, the modifications of the new version will have been applied. In this step it is necessary to review the modifications with Git, especially the ones in the Database project, to avoid excluding files that should be in that project.
- These modifications must be uploaded to the Git repository; the DevOps pipeline will then be launched automatically, completing the upgrade. This is an example of the files updated:

When upgrading from version 1.9 onwards, verify that the update of the `azure-pipelines.yml` file generates a pipeline definition with a single environment (the one specified in the solution-creation command) instead of three. An example of the result is shown below:
Addendum: Variables¶
Deploying a Sidra Data Product includes the deployment of its needed Azure resources. This is done by running deployment pipelines from the Azure DevOps space. These pipelines need various parameters: the deployment variables.
Below are the deployment variables needed by this specific Data Product type - `sidra-app`, the Basic SQL Data Product, meaning `Sidra.DotNetCore.ClientAppTemplate`. Other Data Product templates (types) may need other deployment variables, though some are common.
Deployment variables | Description |
---|---|
SubscriptionId | Retrieve it from the Azure Portal, in Subscriptions area. |
TenantId | This value can also be taken from Azure Portal, from the Directories + Subscriptions section. |
LogAnalyticsWorkspaceId | Obtain it from the Log Analytics workspace resource in the Core resource group, from the 'Properties' blade, accessible with the left side menu. Look for 'Workspace ID'. |
ApplicationInsightsInstrumentationKey | Obtain it from the Application Insights resource in the Core resource group, from the 'Properties' blade, accessible with the left side menu. Look for 'Instrumentation Key'. Also visible in the 'Overview' blade. |
CoreSqlServerName | Name of the SQL Server resource deployed within the Core resource group. Only the name of the resource itself is needed, not the full Server name (that ends in database.windows.net ). For example, if the server name is sds-core-dev-sdb.database.windows.net , the SQL Server resource name is only sds-core-dev-sdb . |
CoreWebAppName | App Service resource name for Web API, within the Core resource group. Only the name of the resource itself is needed, not the full host name (that ends in azurewebsites.net ). For example, if the host name is sds-core-dev-wst-api.azurewebsites.net , the App Service resource name is only sds-core-dev-wst-api . |
BaleaWebAppName | App Service resource name for Balea server, within the Core resource group. Only the name of the resource itself is needed, not the full host name (that ends in azurewebsites.net ). For example, if the host name is sds-core-dev-wst-bs.azurewebsites.net , the App Service resource name is only sds-core-dev-wst-bs . |
ManagerApplicationId | Application ID of the Manager front-end application, from the Azure Active Directory > App Registrations. This can also be obtained from the Key Vault resource deployed on Sidra's Core resource group. The value is stored as a secret named AADApplications--Manager--ApplicationId . |
ManagerApplicationSecret | Password - or App Registration Secret - for the Manager front-end application. This can be obtained from the Key Vault resource deployed on Sidra's Core resource group. The value is stored as a secret named AADApplications--Manager--Password . |
ManagerServicePrincialId | Service Principal ID of the Manager front-end application, from the Azure Active Directory. This can be obtained from the Key Vault resource deployed on Sidra's Core resource group. The value is stored as a secret named AADApplications--Manager--ServicePrincialId . |
DevOpsClientId | Client Id for the VSTS application registered in Azure AD. This can be obtained from the Key Vault resource deployed on Sidra's Core resource group. The value is stored as a secret named AADApplications--VSTS--ServicePrincialId . |
DevOpsClientSecret | Password - or Client Secret - for the VSTS application registered in Azure AD. This can be obtained from the Key Vault resource deployed on Sidra's Core resource group. The value is stored as a secret named AADApplications--VSTS--Password . |
SidraAPIClientID | Client Id needed to consume the Web API, within the Core resource group. This can be obtained from the Key Vault resource deployed on Sidra's Core resource group. The value is stored as a secret named ClientSettings--coreClient--ClientId . |
SidraAPIClientSecret | Password - or Client Secret - needed to consume the Web API, within the Core resource group. This can be obtained from the Key Vault resource deployed on Sidra's Core resource group. The value is stored as a secret named ClientSettings--coreClient--ClientSecret . |
IdentityServerEndpoint | Endpoint for the Identity Server App Service resource deployed on Sidra's Core resource group. It can be obtained from the 'Overview' blade, labelled 'URL'. Example: https://sds-core-dev-wst-is.azurewebsites.net |
IdentityServerScope | Scopes for Identity Server App Service. This can be obtained from the Key Vault resource deployed on Sidra's Core resource group. The value is stored as a secret named ClientSettings--ClientUrls--IdentityServerScope . |
ProjectAssemblyName | Assembly name for the project. Minimum length of 5 characters. |
Environment | Sidra environment in which the Data Product will be deployed; for example: dev , test , prod , etc. |
SharedBatchAccountAccessKey | If there is no Shared Batch Account, this value can be an empty string. To get the access key for an existing Batch Account, go to the Azure resource and open the 'Keys' blade; either 'Primary Access Key' or 'Secondary Access Key' will do. |
DeployBatchAccount | $False or $True . |
UseElasticPool | $False or $True . |
InstallationSize | Can be one of S , M , L or XL . These tags represent the size of the deployed resources. |
ClientDatabaseName | Name of the Azure SQL Database that will be created for this Data Product. |
ResourceGroupName | Name of the new resource group that will be created during the Data Product deployment to hold the Data Product and its resources. Note: it is recommended to have a separate resource group for each Data Product. |
ResourceGroupLocation | The Azure region - e.g. westeurope - where the Data Product resource group will be created. |
EmailForSecurityAlerts | E-mail address set to receive potential security alerts. |
EnvironmentDescription | Description of the Sidra environment in which the Data Product will be deployed. |
ApplicationName | Name for the Data Product. Minimum length required is 5 characters. It will influence the name of the deployment variable group and deployment pipeline stage. |
ShortProjectName | Identifier for the project. This value will be used as a prefix to compound the resource names, together with the ShortProductName, among other values such as the environment, the acronym identifying the resource type, etc. For example, the project name sds will generate resource names for the Data Product like sds-<ShortProductName>-<Environment>-<ResourceAcronym> . Maximum length of 5 chars. |
ShortProductName | Identifier for the product. It is used just after the ShortProjectName in the resource names. For example, the value bca , together with the ShortProjectName sds , will generate resource names like sds-bca-<Environment>-<ResourceAcronym> . A complete example for this could be the client SQL Server resource in the dev environment: sds-bca-dev-sdb . Maximum length of 5 chars. |
DeploymentOwner | E-mail address of the person doing the Data Product deployment. |
ClientSkuName | It can be passed as an empty string or it can be a particular SKU name; example: standard . |
ClientSkuCapacity | It can be passed as an empty string or it can be a particular SKU capacity. |
ClientSizeInGB | Client database size in GB; example: 250 . |
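The naming convention described for ShortProjectName and ShortProductName can be illustrated with a small shell sketch. All values below are the examples from the table (the sdb acronym stands for the SQL Server resource); real values come from your own deployment variables:

```shell
#!/bin/sh
# Compose an Azure resource name following the convention above.
# Every value here is an illustrative example, not a required value.
SHORT_PROJECT_NAME="sds"   # ShortProjectName, max 5 characters
SHORT_PRODUCT_NAME="bca"   # ShortProductName, max 5 characters
ENVIRONMENT="dev"          # Environment: dev / test / prod ...
RESOURCE_ACRONYM="sdb"     # acronym identifying the resource type

RESOURCE_NAME="${SHORT_PROJECT_NAME}-${SHORT_PRODUCT_NAME}-${ENVIRONMENT}-${RESOURCE_ACRONYM}"
echo "${RESOURCE_NAME}"    # prints sds-bca-dev-sdb
```

Keeping both identifiers short (5 characters or fewer) matters because several Azure resource types impose tight limits on name length.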