How to create a custom activity in Sidra

As described in Custom activities overview, Sidra provides accelerators to ease the creation of new custom activities. In this document, as in most of the documentation, the term 'custom activity' refers to the custom application, not to the Azure Data Factory (ADF) activity. The accelerators are:

  • The Visual Studio template available in PlainConcepts.Sidra.DotNetCore.CustomActivityTemplate generates the Visual Studio solution for a custom activity.
  • The NuGet package PlainConcepts.Sidra.DataFactory.CustomActivities.Common (referred to below as the Common package) provides a set of base classes and helpers to read and manage the information that the custom activity receives from ADF.

Custom activity template

The section Create new solution explains how to install a new Visual Studio template and use it to generate a Visual Studio solution.

The template for generating a custom activity does not take any template-specific parameters. The generic parameter --name sets the name of the output folder, which is also used as the name of the custom activity.

For example, assuming that the template for custom activities is already installed, the following command can be used to create a custom activity named 'CA'.

dotnet new sidra-ca --name CA

The template "Plain Concepts Sidra Custom Activity" was created successfully.

If the solution is opened with Visual Studio, the content shown in the Solution Explorer will be:

[Image: tutorial-ca-solution-explorer]

The previous solution contains two projects:

  • PlainConcepts.Sidra.DataFactory.CustomActivities.CA is a .NET Core class library project that contains the code executed as the custom activity. It is referred to below as the CA project.
  • PlainConcepts.Sidra.DataFactory.CustomActivities.CA.Tests is a .NET Core unit testing project using xUnit and Moq to unit test the CA project.

The CA project contains some boilerplate code and a reference to the Common package. The only remaining work is to configure the parameters and add the specific logic of the custom activity.

IMPORTANT

At this stage, the solution will not build because some .props files are missing. The projects are meant to be added to an existing solution (either a Core solution or a Client app solution) that already contains those .props files.

Classes and helpers for custom activities

The CA project includes a set of classes that handle the JSON files activity.json, linkedServices.json and datasets.json, which provide the input parameters to the activity. Those parameters are classified into 'referenceObjects' and 'extendedProperties'. The project also defines a set of base parameters that are available to all custom activities.

With those classes, all the parameters are accessible without writing code to deserialize the JSON files and read their properties every time a new custom activity is created.

For example, having the following activity.json:

{
    "name": "MyOwnCustomActivity",
    "type": "Custom",
    "policy": {
        "timeout": "7.00:00:00",
        "retry": 0,
        "retryIntervalInSeconds": 30,
        "secureOutput": false,
        "secureInput": false
    },
    "typeProperties": {
        "command": "powershell.exe -nologo -noprofile -command & { Add-Type -A 'System.IO.Compression.FileSystem'; [IO.Compression.ZipFile]::ExtractToDirectory('PlainConcepts.Sidra.DataFactory.CustomActivities.CA.zip', 'Deploy'); } ; c:\\dotnet\\dotnet Deploy\\PlainConcepts.Sidra.DataFactory.CustomActivities.CA.dll",
        "resourceLinkedService": {
            "referenceName": "AzureStorageLinkedService",
            "type": "LinkedServiceReference"
        },
        "folderPath": "/path_to_custom_activity",
        "extendedProperties": {
            "tableName": "Employee",
            "projectsNumber": "2",
            "role": "developer",
            "AzureKeyVaultLinkedServiceName": "MyAzureKeyVaultLinkedService",
            "AzureSqlLinkedServiceName": "MyAzureSqlLinkedService",
            "AzureSqlTableDatasetName": "MyAzureSqlTable"
        },
        "referenceObjects": {
            "linkedServices": [{
                    "referenceName": "MyAzureSqlLinkedService",
                    "type": "LinkedServiceReference"
                }, {
                    "referenceName": "MyAzureKeyVaultLinkedService",
                    "type": "LinkedServiceReference"
                }
            ],
            "datasets": [{
                    "referenceName": "MyAzureSqlTable",
                    "type": "DatasetReference"
                }
            ]
        }
    },
    "linkedServiceName": {
        "referenceName": "AzureBatchLinkedService",
        "type": "LinkedServiceReference"
    }
}

It can be seen that it includes the following parameters:

  • Six 'extendedProperties': tableName, projectsNumber, role, AzureKeyVaultLinkedServiceName, AzureSqlLinkedServiceName and AzureSqlTableDatasetName
  • Three 'referenceObjects': two linked services MyAzureSqlLinkedService and MyAzureKeyVaultLinkedService and a dataset MyAzureSqlTable
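
For comparison, the sketch below reads the same values by hand with System.Text.Json, which is the kind of code the provided classes make unnecessary. It is not how the Common package is implemented; the class ActivityJsonReader and its two methods are hypothetical names introduced only for this illustration, and the JSON paths are taken from the activity.json shown above.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical helper, not part of Sidra or the Common package.
public static class ActivityJsonReader
{
    // Returns the name/value pairs found under typeProperties.extendedProperties.
    public static Dictionary<string, string> ReadExtendedProperties(string activityJson)
    {
        using var document = JsonDocument.Parse(activityJson);
        var extendedProperties = document.RootElement
            .GetProperty("typeProperties")
            .GetProperty("extendedProperties");

        // ToString() yields the text of string values and the raw JSON text otherwise.
        return extendedProperties.EnumerateObject()
            .ToDictionary(property => property.Name, property => property.Value.ToString());
    }

    // Returns the referenceName of every entry under
    // typeProperties.referenceObjects.linkedServices.
    public static List<string> ReadLinkedServiceNames(string activityJson)
    {
        using var document = JsonDocument.Parse(activityJson);
        var linkedServices = document.RootElement
            .GetProperty("typeProperties")
            .GetProperty("referenceObjects")
            .GetProperty("linkedServices");

        return linkedServices.EnumerateArray()
            .Select(linkedService => linkedService.GetProperty("referenceName").GetString())
            .ToList();
    }
}
```

Applied to the activity.json above, ReadExtendedProperties returns the six extended properties and ReadLinkedServiceNames returns MyAzureSqlLinkedService and MyAzureKeyVaultLinkedService. Both methods throw if an expected section is missing, which is one more detail that would otherwise have to be handled in every custom activity.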

The code snippets below show how to configure the provided classes to get access to the previous parameters.

Constants

The Constants class is used to define the names of the parameters. It can also define default values for parameters that should be treated as optional.

public class Constants
{
    public const string TABLE_NAME_PARAMETER = "tableName";
    public const string PROJECTS_NUMBER_PARAMETER = "projectsNumber";
    public const string ROLE_PARMETER = "role";

    public const string AZURE_SQL_LINKED_SERVICE_NAME_PARMETER = "AzureSqlLinkedServiceName";
    public const string AZURE_SQL_TABLE_DATASET_NAME_PARMETER = "AzureSqlTableDatasetName";

    public const string DEFAULT_ROLE = "developer";
}

ActivityParameters

The ActivityParameters class is used to:

  • Define the C# properties that will receive the values of the parameters.
  • Assign default values to the previous properties.
  • Validate the values.

ActivityParameters inherits from BaseParameter, which is provided by the Common package.

BaseParameter

The BaseParameter class defines:

  • A virtual method called Validate, which should be overridden when the parameters passed to the activity need to be validated, for example to check that mandatory parameters are present.
  • Some common parameters that are already defined for all custom activities:
    • AzureKeyVaultLinkedServiceName is the name of the linked service that references the Azure Key Vault that stores the sensitive information used by other linked services. For example, a linked service to a SQL Server can store its connection string in that Azure Key Vault.
    • IsKeyVaultEnabled is a flag that indicates whether the Key Vault is enabled. It is mandatory: if there are referenced objects that store sensitive data and the Key Vault is not enabled, the custom activity execution throws an error.

The ActivityParameters class for the example parameters is defined as follows:

public class ActivityParameters : BaseParameter
{
    // Add properties with the appropriate type for storing the 'ExtendedProperties'
    public string TableName { get; set; }
    public int ProjectsNumber { get; set; }
    public string Role { get; set; }

    // Add properties for storing the names of the 'ReferenceObjects'
    public string AzureSqlLinkedServiceName { get; set; }
    public string AzureSqlTableDatasetName { get; set; }

    public ActivityParameters()
    {
        // Add default values
        Role = Constants.DEFAULT_ROLE;
    }

    public override void Validate()
    {
        base.Validate();

        // Add validation of the values
        if (string.IsNullOrWhiteSpace(TableName))
        {
            throw new ArgumentException($"{Constants.TABLE_NAME_PARAMETER} parameter cannot be null or empty.");
        }

        if (ProjectsNumber < 1 || ProjectsNumber > 5)
        {
            throw new ArgumentException($"{Constants.PROJECTS_NUMBER_PARAMETER} parameter must be an integer between 1 and 5.");
        }
    }
}
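
The validation rules can be checked in isolation before the activity is wired into ADF, for example from the Tests project. The sketch below is self-contained and therefore uses stand-ins: BaseParameterStandIn replaces the BaseParameter class of the Common package (only its Validate method is reproduced), SampleParameters repeats the rules of the ActivityParameters class above, and ValidationChecks is a hypothetical helper. In the Tests project the same checks would more naturally be written as xUnit tests against the real ActivityParameters class.

```csharp
using System;

// Stand-in for BaseParameter from the Common package (assumption: only the
// virtual Validate method described in this document is reproduced).
public class BaseParameterStandIn
{
    public virtual void Validate() { }
}

// Repeats the validation rules of the ActivityParameters class shown above.
public class SampleParameters : BaseParameterStandIn
{
    public string TableName { get; set; }
    public int ProjectsNumber { get; set; }
    public string Role { get; set; } = "developer";

    public override void Validate()
    {
        base.Validate();

        if (string.IsNullOrWhiteSpace(TableName))
        {
            throw new ArgumentException("tableName parameter cannot be null or empty.");
        }

        if (ProjectsNumber < 1 || ProjectsNumber > 5)
        {
            throw new ArgumentException("projectsNumber parameter must be an integer between 1 and 5.");
        }
    }
}

// Hypothetical helper used only to exercise the rules.
public static class ValidationChecks
{
    // Returns true when Validate rejects the parameters with an ArgumentException.
    public static bool IsRejected(SampleParameters parameters)
    {
        try
        {
            parameters.Validate();
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```

With these definitions, a SampleParameters instance with TableName "Employee" and ProjectsNumber 2 is accepted, while one with ProjectsNumber 9, or one without a TableName, is rejected.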

ActivityParametersResolver

The ActivityParametersResolver class is used to map and cast the raw values received from ADF in the activity.json file into the C# properties defined in ActivityParameters.

It inherits from the ParameterResolver class, which requires an object that implements ICustomActivityReferenceProvider in order to be instantiated. The dependency injector is configured to provide an implementation of ICustomActivityReferenceProvider called CustomActivityReferenceProvider, which is included in the Common package.

ParameterResolver

The ParameterResolver class defines two methods:

  • Resolve must be overridden. It creates an instance of ActivityParameters with all the parameters mapped.
  • GetValue returns a parameter from the activity.json file by its name.

The ActivityParametersResolver class for the example parameters is defined as follows:

public class ActivityParametersResolver : ParameterResolver
{
    public ActivityParametersResolver(ICustomActivityReferenceProvider customActivityReferenceProvider)
    : base(customActivityReferenceProvider)
    { }

    public override T Resolve<T>(string[] args)
    {
        var parameters = base.Resolve<ActivityParameters>(args);

        // Read the ExtendedProperties raw values by name.
        var tableNameRaw = GetValue(args, Constants.TABLE_NAME_PARAMETER);
        var projectsNumberRaw = GetValue(args, Constants.PROJECTS_NUMBER_PARAMETER);
        var roleRaw = GetValue(args, Constants.ROLE_PARMETER);
        var azureSqlLinkedServiceNameRaw = GetValue(args, Constants.AZURE_SQL_LINKED_SERVICE_NAME_PARMETER);
        var azureSqlTableDatasetNameRaw = GetValue(args, Constants.AZURE_SQL_TABLE_DATASET_NAME_PARMETER);

        // Cast the raw values to the appropriate type and assign them
        if (!string.IsNullOrWhiteSpace(tableNameRaw))
        {
            parameters.TableName = tableNameRaw;
        }

        if (!string.IsNullOrWhiteSpace(projectsNumberRaw) && int.TryParse(projectsNumberRaw, out int projectsNumber))
        {
            parameters.ProjectsNumber = projectsNumber;
        }

        if (!string.IsNullOrWhiteSpace(roleRaw))
        {
            parameters.Role = roleRaw;
        }

        if (!string.IsNullOrWhiteSpace(azureSqlLinkedServiceNameRaw))
        {
            parameters.AzureSqlLinkedServiceName = azureSqlLinkedServiceNameRaw;
        }

        if (!string.IsNullOrWhiteSpace(azureSqlTableDatasetNameRaw))
        {
            parameters.AzureSqlTableDatasetName = azureSqlTableDatasetNameRaw;
        }

        return parameters as T;
    }
}

Activity

The Activity class contains the main code executed by the custom activity. It inherits from BaseActivity, a generic class whose type argument must derive from BaseParameter.

BaseActivity

The BaseActivity defines:

  • Execute, a virtual method that must be overridden to include the specific business logic of the custom activity.
  • Methods to obtain the referenced objects by their names. For example, the method GetAzureSqlDatabase receives the name of the Azure SQL linked service and returns the AzureSqlDatabase that can be used to connect to the database.

The members of BaseActivity are declared as follows:

public abstract class BaseActivity<T> where T : BaseParameter
    {
        protected readonly ICustomActivityReferenceProvider _customActivityReferenceProvider;
        protected readonly ILogger<BaseActivity<T>> _logger;
        protected readonly IConfiguration _configuration;

        public BaseActivity(ICustomActivityReferenceProvider customActivityReferenceProvider, ILogger<BaseActivity<T>> logger, IConfiguration configuration);

        public virtual void Execute(T parameters);
        protected AzureDatabricks GetAzureDatabricks(T parameters, string azureLinkedService);
        protected AzureSqlDatabase GetAzureSqlDatabase(T parameters, string azureLinkedService);
        protected AzureStorage GetAzureStorage(T parameters, string azureLinkedService);
        protected BatchAccount GetBatchAccount(T parameters, string azureLinkedService);
        protected string GetClusterType(string linkedServiceName);
        protected AzureDatabricks GetDatabricks(T parameters, string azureLinkedService);
        protected HDInsightCluster GetHdInsightCluster(T parameters, string azureLinkedService);
        protected HttpServer GetHttpServer(T parameters, string azureLinkedService);
        protected AzurePostGresSqlDatabase GetProgresDatabase(T parameters, string azureLinkedService);
        protected SftpServer GetSftpServer(T parameters, string azureLinkedService);
    }

The Activity class of the CA project inherits from it:

internal class Activity : BaseActivity<ActivityParameters>
{
    public Activity(
        ICustomActivityReferenceProvider customActivityReferenceProvider,
        ILogger<Activity> logger,
        IConfiguration configuration)
        : base(customActivityReferenceProvider, logger, configuration)
    {

    }

    public override void Execute(ActivityParameters parameters)
    {
        base.Execute(parameters);

        _logger.LogInformation("Starting Creating or updating CA");

        // Add here the specific logic for the custom activity
        // For example, getting the connection string for the SQL Server referenced by the AzureSqlLinkedServiceName
        var sqlServerDatabase = GetAzureSqlDatabase(parameters, parameters.AzureSqlLinkedServiceName);
        var connectionString = sqlServerDatabase.ConnectionString;

        _logger.LogInformation("Finished CA");
    }
}

ActivityExecutor

The ActivityExecutor class provides the entry point of the custom activity, the Execute method, which is called by the custom activity host project.

public class ActivityExecutor
{
    public static void Execute(IConfiguration configuration,
        Action<ILoggingBuilder> loggingBuilder,
        string[] args)
    {
        var parameters = ResolveParameters(configuration, args);
        var serviceProvider = DependencyInjector.GetServiceCollection(configuration, loggingBuilder, parameters).BuildServiceProvider();
        var activity = serviceProvider.GetService<Activity>();

        activity.Execute(parameters);
    }

    public static ActivityParameters ResolveParameters(IConfiguration configuration, string[] args)
    {
        var parameters = DependencyInjector.ResolveParameters<ActivityParameters>(args, configuration);
        parameters.Validate();
        return parameters;
    }
}

DependencyInjector

The DependencyInjector class configures the dependency injector. If the custom activity does not use any linked service, the configuration provided by the Visual Studio template should be enough. If it uses a linked service, some additional configuration is necessary.

Azure Key Vault configuration

The secrets that linked services use to connect to the data sources (for example, the connection string to a database in an Azure SQL linked service) should be stored in Azure Key Vault.

The custom activity needs access to that Azure Key Vault in order to retrieve the secrets, so it needs an Azure Key Vault linked service for that purpose. The name of that linked service is configured in AzureKeyVaultLinkedServiceName, the parameter inherited from the BaseParameter class.

The code below shows the additional configuration to use the Azure Key Vault linked service:

class DependencyInjector : BaseDependencyInjector
{
    internal static T ResolveParameters<T>(
        string[] args,
        IConfiguration configuration,
        Action<ILoggingBuilder> loggingBuilder = null) where T : BaseParameter, new()
    {
        var serviceProvider = GetServiceCollection(configuration, loggingBuilder).BuildServiceProvider();
        var parametersResolver = serviceProvider.GetService<ParameterResolver>();
        var parameters = parametersResolver.Resolve<T>(args);

        return parameters;
    }

    /// <summary>
    /// Gets the ServiceCollection with the dependencies.
    /// </summary>
    /// <returns></returns>
    internal static IServiceCollection GetServiceCollection(IConfiguration configuration,
        Action<ILoggingBuilder> loggingBuilder,
        ActivityParameters parameters = null)
    {
        var services = CreateServiceCollection(configuration, loggingBuilder);

        services.AddTransient<ICustomActivityDefinitionProvider, CustomActivityDefinitionProvider>();
        services.AddTransient<ICustomActivityReferenceProvider, CustomActivityReferenceProvider>();
        services.AddTransient<ParameterResolver, ActivityParametersResolver>();
        services.AddTransient<Activity, Activity>();

        // Configure these interfaces to use the AzureKeyVaultLinkedServiceName
        services.AddTransient<IConfigurationRepository, ConfigurationRepository>();
        services.AddTransient<IActiveDirectoryCredentialFactory, ADSecurityCredentialsProvider>();
        services.AddTransient<ITokenCredentialProvider, TokenCredentialProvider>();
        services.AddTransient<IAzureKeyVaultHelper, AzureKeyVaultHelper>();

        // Configure this DbContext if the custom activity is used in Core, use ClientContext if used in Client Apps 
        services.AddDbContext<CoreContext>(options =>
        {
            options.UseSqlServer(configuration.GetConnectionString(nameof(CoreContext)), setup =>
            {
                setup.MigrationsAssembly(typeof(CoreContext).Assembly.FullName);
            });
        });

        return services;
    }
}

Additionally, a reference to the NuGet package PlainConcepts.SIDRA.DotNetCore.Common must be added.

Integrating CA projects

CA projects can be managed in two different ways, depending on whether they are going to be included in the set of custom activities provided by Sidra or used exclusively in a specific Sidra installation.

Included in Sidra

CA projects that cover an activity generic enough to be used by several Sidra installations can be included in the set of custom activities provided by Sidra. The source code of those CA projects is added to the Sidra repository and exposed as NuGet packages, so that any Sidra installation can use them by means of a host project, as described in Custom apps in Sidra.

Specific Sidra installation

CA projects that are too specific or too tightly coupled to one Sidra installation can be used directly. In that case, the source code of those CA projects is added to the solution of that installation, and two changes are required:

  1. Change the type of the CA project from class library to .NET Core console application.
  2. Add a Program class that calls the entry point of the custom activity. The Program class is the same as the one included in the host projects, which is described in Custom apps in Sidra.
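
As a rough orientation only, a Program class of this kind could look like the sketch below. It is not the Program class from Custom apps in Sidra, which remains the reference. To keep the sketch compilable on its own, ActivityExecutorStandIn is a placeholder with the same signature as the ActivityExecutor.Execute method shown earlier; the in-memory configuration source, the logging level and the exit codes are assumptions made for the illustration. The code relies on the Microsoft.Extensions.Configuration and Microsoft.Extensions.Logging packages.

```csharp
using System;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Logging;

// Placeholder with the same signature as ActivityExecutor.Execute, so that the
// sketch compiles on its own. In a real CA project, call ActivityExecutor instead.
public static class ActivityExecutorStandIn
{
    public static void Execute(IConfiguration configuration,
        Action<ILoggingBuilder> loggingBuilder,
        string[] args)
    {
        Console.WriteLine($"Custom activity invoked with {args.Length} argument(s).");
    }
}

public static class Program
{
    public static int Main(string[] args)
    {
        // Assumption: an empty in-memory source stands in for the real
        // configuration sources of the installation.
        IConfiguration configuration = new ConfigurationBuilder()
            .AddInMemoryCollection()
            .Build();

        try
        {
            ActivityExecutorStandIn.Execute(
                configuration,
                logging => logging.SetMinimumLevel(LogLevel.Information),
                args);
            return 0;
        }
        catch (Exception exception)
        {
            // Assumption: a non-zero exit code is used to report a failed run.
            Console.Error.WriteLine(exception);
            return 1;
        }
    }
}
```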