Data Intake types

Sidra supports several types of Data Intake, described below, including Data Intake Processes via connectors.

1. Data Intake via landing zone

In certain data flows, such as the generic batch file intake process, the landing zone is the starting point for file ingestion.

Some key features of a Data Intake from the landing zone are:

Sidra includes out-of-the-box pipelines, already deployed, for file ingestion from the landing zone. In this case, no explicit deployment or manual execution of the pipelines is necessary.
This type of Data Intake is usually selected for data sources in semi-structured format, among others. It covers the following scenarios:
1. When an external data extraction process can deposit the data as files, e.g. in .parquet or .csv format.
2. When a separate data extraction pipeline has been developed to extract semi-structured data files (e.g. JSON) from services or APIs.
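
As a rough illustration of the first scenario, an external extraction process could write its output as a .csv file into the landing location. The sketch below is only an assumption-laden example: it uses a local directory as a stand-in for the landing zone, and the function name `deposit_csv`, the file name and the temporary-file suffix are hypothetical, not part of Sidra.

```python
import csv
import tempfile
from pathlib import Path


def deposit_csv(rows, header, landing_dir, file_name):
    """Write rows as a .csv file into a directory standing in for the landing zone.

    Hypothetical helper for illustration only; not a Sidra API.
    """
    landing_path = Path(landing_dir)
    landing_path.mkdir(parents=True, exist_ok=True)
    target = landing_path / file_name
    # Write under a temporary name first, then rename, so that a process
    # watching the directory never sees a half-written file.
    partial = target.with_name(target.name + ".partial")
    with partial.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)
    partial.replace(target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = deposit_csv([(1, "Alice"), (2, "Bob")], ["id", "name"], tmp, "customers.csv")
        print(path.read_text(encoding="utf-8"))
```

In a real deployment the target would be the landing zone storage rather than a local folder; the point here is only that the extraction step ends once the file is deposited, and ingestion is then handled by the already deployed pipelines.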

Step-by-step

Asset flow into Sidra's platform via landing zone

For a detailed explanation of a complete Data Intake using the landing zone, see this overview page.

2. Data Intake with document indexing

Sidra incorporates a separate process for binary file (document) ingestion that differs slightly from the process above.

Although some stages, such as file registration and file ingestion, are also performed in this type of Data Intake, there are significant differences in the indexing steps that the files require.

Azure Search is the key service responsible for applying cognitive skills to the binary files (documents). The process usually starts by depositing the files in a special landing zone container called indexlanding.

Step-by-step

How binaries are ingested and the Knowledge Store

The binary file ingestion process is described in detail in this overview page, and information about Knowledge Store ingestion can be found here.

3. Data Intake Process via connectors

Other types of data ingestion flows extract the data directly from the data source (e.g. SQL Database) to the raw format in the Data Storage Unit, without the intermediate step of depositing it in the landing zone.

This is possible thanks to Sidra connectors. More information will be available in upcoming documentation.
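
To make the contrast with the landing zone flow concrete, the following minimal sketch copies a table straight from a database into a raw file, with no intermediate deposit step. It is not how Sidra connectors are implemented: an in-memory SQLite database stands in for the data source, a local directory stands in for raw storage in the Data Storage Unit, and the function name `extract_table_to_raw` is hypothetical.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def extract_table_to_raw(connection, table, raw_dir):
    """Copy every row of `table` into a .csv file under `raw_dir`.

    Hypothetical helper for illustration only; not a Sidra API.
    """
    # Table names cannot be bound as SQL parameters, so validate the
    # name before interpolating it into the query.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    cursor = connection.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    target = Path(raw_dir) / f"{table}.csv"
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(columns)
        writer.writerows(cursor)  # stream rows directly from the source
    return target


if __name__ == "__main__":
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE orders (id INTEGER, total REAL)")
    source.execute("INSERT INTO orders VALUES (1, 9.5)")
    with tempfile.TemporaryDirectory() as raw:
        print(extract_table_to_raw(source, "orders", raw).read_text(encoding="utf-8"))
```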

Step-by-step

About Sidra connectors

For a detailed explanation of a complete Data Intake Process using connectors, see this overview page.