How to add a new provider

A provider is a Sidra element that represents a collection of entities. It can map to a single data source or to several sources that together form a logical unit. Providers are stored in the Provider table in the metadata database.

There are two ways to create a new provider:

  • Create a SQL script to insert the new provider in the database.
  • Use the Sidra API endpoint to add new providers.

Which method to use depends, for example, on whether you have access to the source code of the solution (where the SQL scripts must be placed in order to be executed) or need to integrate with a third-party tool. In the latter case, the Sidra API is the recommended approach.

Provider information

The following information must be supplied when a new provider is stored in the metadata database:

Column Description
Id [Required] Provider identifier
ProviderName [Required] Name of the provider. Spaces are not allowed, and it is usually written in upper camel case, e.g. MyNewProvider
DatabaseName [Required] Name of the database in the metastore of the Data Storage Unit (DSU). Spaces are not allowed, and it usually follows the convention "dw_xyz", where "dw" stands for Data Warehouse
Description [Optional] Description of the provider
Owner [Optional] Identification of the person responsible for the provider; it can be a name, an email address, etc.
IdDataLake [Required] Id of the DSU (previously called DataLake) where the entities of the provider will be stored
ParentSecurityPath [Optional] The security path of the parent following the metadata model hierarchy.
CreationDate [Required] Date of creation of the provider
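The rules in the table above can be checked before anything is written to the database. The following is a minimal sketch in Python; the class name ProviderInfo and its validate method are hypothetical helpers made up for illustration and are not part of Sidra. Id, ParentSecurityPath and CreationDate are left out because the SQL script or the API fills them in.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProviderInfo:
    """Hypothetical container for the fields a user supplies for a provider."""
    provider_name: str                 # ProviderName [Required]
    database_name: str                 # DatabaseName [Required]
    id_data_lake: int                  # IdDataLake [Required]
    description: Optional[str] = None  # Description [Optional]
    owner: Optional[str] = None        # Owner [Optional]

    def validate(self) -> List[str]:
        """Return the rule violations found; an empty list means none."""
        problems = []
        for column, value in (("ProviderName", self.provider_name),
                              ("DatabaseName", self.database_name)):
            if not value:
                problems.append(f"{column} is required")
            elif any(ch.isspace() for ch in value):
                problems.append(f"{column} must not contain spaces")
        return problems


ok = ProviderInfo("MyNewProvider", "dw_newprovider", id_data_lake=1)
bad = ProviderInfo("My New Provider", "dw_newprovider", id_data_lake=1)
print(ok.validate())   # no violations
print(bad.validate())  # reports the spaces in ProviderName
```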

Add provider using a SQL script

Add a SQL script that follows the naming conventions to the Scripts\CoreContext folder, or to whichever location the DatabaseBuilder is configured to retrieve scripts from. A sample script is shown below:

-- DECLARE
DECLARE @Id_Provider INT = 
(
    SELECT COALESCE(MAX([Id]) +1, 1)
    FROM [DataIngestion].[Provider]     
)
DECLARE @Id_Dsu INT = 
(
    SELECT TOP 1 [Id] 
    FROM [DataIngestion].[DataLake] 
    WHERE [Name] = 'Name of the DSU where the entities of the provider will be stored'
)

-- ROLLBACK
DELETE FROM [DataIngestion].[Provider]
WHERE [Id] = @Id_Provider

-- SCRIPT
SET IDENTITY_INSERT [DataIngestion].[Provider] ON 
INSERT [DataIngestion].[Provider] (
    [Id],
    [ProviderName], 
    [DatabaseName], 
    [Description], 
    [Owner], 
    [IdDataLake],
    [ParentSecurityPath],
    [CreationDate]) 
VALUES (
    @Id_Provider,
    N'MyNewProvider', 
    N'dw_newprovider', 
    N'Description of MyNewProvider', 
    N'John Doe - jdoe@email.com', 
    @Id_Dsu,
    @Id_Dsu,
    GETUTCDATE())
SET IDENTITY_INSERT [DataIngestion].[Provider] OFF
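
The script computes the new identifier with COALESCE(MAX([Id]) + 1, 1), which yields 1 for an empty table and the highest Id plus one otherwise. The sketch below reproduces only that expression, using an in-memory SQLite database from Python as a stand-in for the metadata database. The simplified Provider table is an assumption made for illustration, and SQLite is not the engine the script above targets.

```python
import sqlite3

# Same expression as in the script above, without the schema prefix and brackets.
NEXT_ID_SQL = "SELECT COALESCE(MAX(Id) + 1, 1) FROM Provider"


def next_provider_id(conn: sqlite3.Connection) -> int:
    """Return the Id that the next inserted provider would receive."""
    return conn.execute(NEXT_ID_SQL).fetchone()[0]


conn = sqlite3.connect(":memory:")
# Simplified stand-in table (assumption): only the columns needed for the demo.
conn.execute("CREATE TABLE Provider (Id INTEGER PRIMARY KEY, ProviderName TEXT)")

# Empty table: MAX(Id) is NULL, so COALESCE falls back to 1.
first_id = next_provider_id(conn)
conn.execute("INSERT INTO Provider (Id, ProviderName) VALUES (?, ?)",
             (first_id, "MyNewProvider"))

# One row with Id 1: MAX(Id) + 1 gives the next identifier.
second_id = next_provider_id(conn)
print(first_id, second_id)
```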

Add provider using Sidra API

The Sidra API requires requests to be authenticated; the section How to use Sidra API explains how to create an authenticated request. The rest of this document assumes that the Sidra API is deployed at the following URL:

https://core-mycompany-dev-wst-api.azurewebsites.net

This is the sequence of requests required to create a new provider. Some of the requests only gather information about the provider; if that information is already available, those requests are not necessary.

Step 1. Get the identifier of the Data Storage Unit (DSU)

Before creating a provider, you need to know the Id of the DSU where the data of the entities associated with the provider will be stored. If the identifier is already known, this step can be skipped. If not, the following request retrieves all the DSUs.

Request

GET https://core-mycompany-dev-wst-api.azurewebsites.net/api/datalake/datalakes?api-version=1.0

It will return an object with the list of DSUs, including their Ids.
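
As a rough illustration, the request above could be prepared from Python with the standard library. This sketch only builds the request object and does not send it. The function name build_list_dsus_request is hypothetical, and passing the credentials as a Bearer token in the Authorization header is an assumption; the section How to use Sidra API is the authority on authentication.

```python
import urllib.request
from urllib.parse import urlencode

# Deployment URL assumed throughout this document.
BASE_URL = "https://core-mycompany-dev-wst-api.azurewebsites.net"


def build_list_dsus_request(token: str, api_version: str = "1.0") -> urllib.request.Request:
    """Build (but do not send) the GET request that lists the DSUs."""
    query = urlencode({"api-version": api_version})
    url = f"{BASE_URL}/api/datalake/datalakes?{query}"
    # Assumption: the access token is sent as a Bearer token.
    headers = {"Authorization": f"Bearer {token}"}
    return urllib.request.Request(url, headers=headers, method="GET")


request = build_list_dsus_request("<access token>")
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing the request to urllib.request.urlopen once a real token is available.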

Response

{
    "TotalItems": "{TOTAL NUMBER OF ITEMS}",
    "Items": [{
            "id": 1,
            "name": "MyDSU",
            "resourceGroupName": "{RESOURCE GROUP NAME}",
            "clusterName": "{CLUSTER NAME}",
            "idClusterType": "{ID CLUSTER TYPE}",
            "idLocation": "{LOCATION ID}"
        }
    ]
}

From the previous response, the Id of the DSU can be retrieved and used in the next step to populate the idDataLake field.
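
Picking the Id out of the response can be sketched as follows. The helper find_dsu_id is hypothetical, and the sample payload is a trimmed version of the response shown above, keeping only the fields the lookup needs.

```python
import json


def find_dsu_id(response_text: str, dsu_name: str) -> int:
    """Return the id of the DSU whose name matches dsu_name."""
    payload = json.loads(response_text)
    for item in payload.get("Items", []):
        if item.get("name") == dsu_name:
            return item["id"]
    raise LookupError(f"No DSU named {dsu_name!r} in the response")


# Trimmed sample of the response body shown above.
sample = '{"TotalItems": 1, "Items": [{"id": 1, "name": "MyDSU"}]}'
dsu_id = find_dsu_id(sample, "MyDSU")
print(dsu_id)
```

Raising an error when the name is not found, rather than returning a default, avoids creating a provider against the wrong DSU.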

Step 2. Create a new provider

When the Sidra API is used, some fields of the provider (such as Id, ParentSecurityPath and CreationDate) are populated automatically. Once the rest of the provider information has been gathered, the provider can be created with this request.

Request

POST https://core-mycompany-dev-wst-api.azurewebsites.net/api/metadata/providers?api-version=1.0

Include the following content in the body of the request.

Request body

{
    "providerName": "MyNewProvider",
    "databaseName": "dw_newprovider",
    "owner": "John Doe - jdoe@email.com",
    "description": "Description of MyNewProvider",
    "idDataLake": 1
}
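
For illustration, the POST request and its body could be assembled from Python as sketched below. The request is built but not sent. The function name build_create_provider_request is hypothetical, and both the Bearer token header and the application/json content type are assumptions that are not confirmed by this document.

```python
import json
import urllib.request
from typing import Optional

# Deployment URL assumed throughout this document.
BASE_URL = "https://core-mycompany-dev-wst-api.azurewebsites.net"


def build_create_provider_request(token: str, provider_name: str, database_name: str,
                                  id_data_lake: int, owner: Optional[str] = None,
                                  description: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a provider."""
    body = {
        "providerName": provider_name,
        "databaseName": database_name,
        "idDataLake": id_data_lake,
    }
    # Optional fields are only included when a value was supplied.
    if owner is not None:
        body["owner"] = owner
    if description is not None:
        body["description"] = description
    headers = {
        "Authorization": f"Bearer {token}",   # assumption: Bearer token
        "Content-Type": "application/json",   # assumption: JSON body
    }
    url = f"{BASE_URL}/api/metadata/providers?api-version=1.0"
    return urllib.request.Request(url, data=json.dumps(body).encode("utf-8"),
                                  headers=headers, method="POST")


request = build_create_provider_request(
    "<access token>", "MyNewProvider", "dw_newprovider", id_data_lake=1,
    owner="John Doe - jdoe@email.com", description="Description of MyNewProvider")
print(request.get_method(), request.full_url)
```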

The response returns the Id of the created provider, which can be used, for example, to add a new entity.

Response

{
    "id": 10,
    "tags": [{
            "name": "string"
        }
    ],
    "providerSize": 0,
    "dataLakeId": 1,
    "creationDate": "2020-01-22T18:03:45.938Z",
    "owner": {
        "name": "John Doe - jdoe@email.com",
        "picture": "string"
    },
    "providerName": "MyNewProvider",
    "databaseName": "dw_newprovider",
    "description": "Description of MyNewProvider",
    "idDataLake": 1
}
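
Reading the new provider's Id from that response could look like the sketch below. The helper created_provider_id is hypothetical, and the check that the echoed providerName matches the requested one is an extra safeguard added here, not something the API requires. The sample is a trimmed version of the response above.

```python
import json


def created_provider_id(response_text: str, expected_name: str) -> int:
    """Return the id of the created provider after a basic sanity check."""
    provider = json.loads(response_text)
    if provider.get("providerName") != expected_name:
        raise ValueError("Response does not describe the requested provider")
    return provider["id"]


# Trimmed sample of the response body shown above.
sample_response = ('{"id": 10, "providerName": "MyNewProvider", '
                   '"databaseName": "dw_newprovider", "idDataLake": 1}')
provider_id = created_provider_id(sample_response, "MyNewProvider")
print(provider_id)
```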