How-to OleDB Dataprovider
Introduction
OleDB Dataprovider allows importing from different sources such as Excel or CSV files, databases and other systems.
Import from CSV files
File structure
The files from which you want to import are ideally located in one directory. This can look as follows, for example:
Import
Personal data.csv
Function data.csv
Organization data.csv
Employment data.csv
Configuration file
So that the OleDB DataProvider or more exactly the OleDB connection can be established cleanly, a schema.ini file must be present in the directory, which contains the metadata over the individual files. In it e.g. the delimeter can be configured.
[Personendaten.csv]
Format=Delimited(|)
HDR=Yes
MaxScanRows=0
[Funktionsdaten.csv]
Format=Delimited(|)
HDR=Yes
MaxScanRows=0
[Organisationsdaten.csv]
Format=Delimited(|)
HDR=Yes
MaxScanRows=0
[Anstellungsdaten.csv]
Format=Delimited(|)
HDR=Yes
MaxScanRows=0
Col1=Anstellungs_Id Integer
Col2=Start_Anstellung DateTime(yyyy.mm.dd)
Col3=Ende_Anstellung DateTime(yyyy.mm.dd)
Col4=Funktion Text
Format=Delimited(|)
Defines the delimiter of the CSV file. Here it’s defined as the pipe |
HDR=Yes
If YES, get column names from the first column to reference them by their name. If NO, columns are simply numbered
MaxScanRows=0
Defines how many rows are considered for defining the type of a column. 0 means it considers all the rows which is recommended, because it increases the precision of defining the data type.
Col3=Ende_Anstellung DateTime(yyyy.mm.dd)
Especially for columns with a date, it may happen that the automatic detection does not work and the date is always interpreted as text instead of as DateTime.
To get around this, you can specify what data type it is for each column.
Col3=Ende_Anstellung DateTime(yyyy.mm.dd)
means that column 3 should be read out as DateTime and is present in the original file in the format yyyy.mm.dd.
Important: You cannot specify a single column. If specification of the DataType for a column is needed, all columns must be defined.
Important: When using csv files AND if one of the column headers contain a dot (e.g. “Acad. Title”) in the column header name, this dot has to be replaced in the SQL query in the C# code with a hash # sign and it is recommended to use then the “as” statement to define a proper column name for further processing.
Examples:
Column name in CSV | SQL query in C# |
---|---|
Acad. Title | "SELECT [Acad# Title] as Akad_Titel FROM [my.csv] WHERE [foreign_id] = '{0}' |
Geb.Datum | "SELECT [Geb#Datum] as Geb_Datum FROM [my.csv] WHERE [foreign_id] = '{0}' |
Table1.Column4 | "SELECT [Table1#Column4] as DateOfBirth FROM [my.csv] WHERE [foreign_id] = '{0}' |
Connection String
Unlike an Excel file, for example, the connection string is not created on the file but on the folder. In the corresponding query, the file is then specified instead of the table name. Here is an example of a OleDbConnectionParameters
:
CustomConnectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\\temp;Extended Properties=\"Text;\""
The corresponding query call is then made as follows using an example for a OleDbCoreIdentityDataProviderConfiguration
:
GetForeignIdsCommand = "SELECT Personalnummer FROM Personendaten.csv"
To import a file in UTF-8 format you need to add CharacterSet=65001
to the end of the connection string. As an example:
Check CSV import locally with Powershell
For quickly checking if your SQL and the results are right, you can use the Powershell script located in C:\Git\DevOps\Scripts\PowerShellScripts\Test_OLEDB_Excel_CSV_Import.ps1 (copy below).
© ITSENSE AG. Alle Rechte vorbehalten. ITSENSE und CoreOne sind eingetragene Marken der ITSENSE AG.