Data Intake types

Sidra supports several types of Data Intake, including Data Intake Processes (DIP) via connector plugins.

1. Data Intake via landing zone

In some data flows, such as the generic batch file intake process, the landing zone is the starting point for file ingestion.

Key features of a Data Intake from the landing zone:

  • Sidra includes out-of-the-box pipelines, already deployed, for file ingestion from the landing zone. In this case, there is no need to deploy or run the pipelines manually.

  • This type of Data Intake is usually selected for data sources in semi-structured formats, among others. It covers the following scenarios:

    1. An external data extraction process deposits the data in a file format such as .parquet or .csv.
    2. The data comes from Excel files. This is a specialized sub-type of ingestion from the landing zone that requires additional scripts for metadata configuration and a specialized DSU ingestion script.
    3. A separately developed data extraction pipeline extracts semi-structured data files (e.g., JSON) from services or APIs.
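The three scenarios above can be told apart by file format. The following is a minimal sketch, not part of Sidra: the function name, the scenario labels, and the extension mapping (including .xlsx and .xls for Excel) are assumptions made for illustration only.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical mapping from file extension to the landing-zone scenario
# listed above. The labels are illustrative, not Sidra identifiers.
SCENARIOS = {
    ".parquet": "batch file",
    ".csv": "batch file",
    ".xlsx": "excel",
    ".xls": "excel",
    ".json": "semi-structured extract",
}


def landing_scenario(file_name: str) -> Optional[str]:
    """Return the assumed scenario label for a file, or None if unknown."""
    suffix = PurePosixPath(file_name).suffix.lower()
    return SCENARIOS.get(suffix)


if __name__ == "__main__":
    for name in ("sales/2022/orders.CSV", "budget.xlsx", "scan.pdf"):
        print(name, "->", landing_scenario(name))
```

A file whose extension is not in the mapping returns None, which a caller could treat as "not suitable for this intake type".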

Step-by-step

Asset flow into Sidra's platform via landing zone

For a detailed explanation of a complete Data Intake using the landing zone, see this overview page.

2. Data Intake with document indexing

Sidra includes a separate process for binary file (document) ingestion that differs from the process above.

Some stages, such as file registration and file ingestion, are also performed in this type of Data Intake, but the indexing steps that the files must go through make it significantly different.

Azure Search is the key service, responsible for applying cognitive skills to the binary files (documents). The process usually starts by depositing the files in a special landing zone container called indexlanding.
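As a rough illustration of that first step, the sketch below picks the container a file should be deposited in. Only the container name indexlanding comes from the text; the default container name "landing", the set of document extensions, and the function name are assumptions, and no Azure call is made.

```python
from pathlib import PurePosixPath

INDEX_CONTAINER = "indexlanding"   # container name given in the text
DEFAULT_CONTAINER = "landing"      # assumption: name of the regular landing container

# Assumption: extensions treated as binary documents to be indexed.
DOCUMENT_EXTENSIONS = {".pdf", ".docx", ".pptx"}


def target_container(file_name: str) -> str:
    """Choose the landing zone container for a file based on its extension."""
    suffix = PurePosixPath(file_name).suffix.lower()
    if suffix in DOCUMENT_EXTENSIONS:
        return INDEX_CONTAINER
    return DEFAULT_CONTAINER


if __name__ == "__main__":
    for name in ("contracts/lease.PDF", "orders.csv"):
        print(name, "->", target_container(name))
```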

Step-by-step

How binaries are ingested and the Knowledge Store

The binary file ingestion process is described in detail in this overview page, and information about Knowledge Store ingestion can be found here.

3. Data Intake Process via connector plugins

Other data ingestion flows extract the data directly from the data source (e.g., a SQL database) into the raw format in the Data Storage Unit (DSU), without the intermediate step of depositing it in the landing zone.

This is made possible by the Sidra connector plugins. More information will be available in upcoming documentation.

Step-by-step

About Sidra connector plugins

For a detailed explanation of a complete Data Intake Process using connector plugins, see this overview page.

Last update: 2022-09-29