What would be the best approach to process various CSV files on disk (1 GB each) and import them?

  • 25 February 2022
  • 3 replies
  • 58 views

Userlevel 5
Badge +16
  • Thinkwise Local Partner Brasil
  • 389 replies

Hi, I have a question. I have several CSV files (comma-separated, about 1 GB each, updated every month). How would I create a flow that picks up such a file and merges it into a table? Basically only inserts and updates. Does anyone have experience with these amounts of data? In total it will be over 80 million records.


3 replies

Userlevel 7
Badge +23

Hi Freddy,

I do have some experience that can help. It helps if those files are available, for instance, in Azure or on the database server's file storage; that way you can make use of BULK INSERT.
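As a minimal sketch of that approach (the table, column names, and file path are assumptions, not from the thread):

```sql
-- Hypothetical staging table; adjust columns to match your CSV.
CREATE TABLE dbo.staging_import
(
    id     int,
    name   nvarchar(100),
    amount decimal(18, 2)
);

-- BULK INSERT requires the file to be reachable from the SQL Server instance
-- (local disk, UNC share, or Azure Blob via an external data source).
BULK INSERT dbo.staging_import
FROM 'C:\import\monthly_data.csv'
WITH
(
    FORMAT          = 'CSV',   -- SQL Server 2017+; handles quoted fields
    FIRSTROW        = 2,       -- skip the header row
    FIELDTERMINATOR = ',',
    ROWTERMINATOR   = '0x0a',
    TABLOCK                    -- allows minimal logging on large loads
);
```

With files this size, TABLOCK plus loading into an empty, index-free staging table tends to perform best; indexes can be added afterwards if the merge needs them.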

But if that isn't the case and the files are somewhere else, it can still be done. With a Read file connector you can read the data from the file and put it all into one big variable; let's call it @file_data, of type nvarchar(max). Note that nvarchar(max) can hold at most about 2 GB of data.

@file_data can then be processed inside a (process) procedure. You can use the SQL function STRING_SPLIT to split the complete file data: first on the row separator (assuming here that it is ";"), which gives you the rows, and then split each row again on the comma to get the columns.
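A small sketch of that two-level split (the sample data is made up; note that STRING_SPLIT does not guarantee output order, and the third `enable_ordinal` argument used here requires SQL Server 2022 or Azure SQL):

```sql
-- Assumed sample: rows separated by ';', columns by ','.
DECLARE @file_data nvarchar(max) = N'1,Alice,10.50;2,Bob,7.25;3,Carol,3.00';

-- First split into rows, then each row into columns.
-- The ordinal columns preserve row and column positions.
SELECT
    r.ordinal AS row_no,
    c.ordinal AS column_no,
    c.value   AS column_value
FROM string_split(@file_data, ';', 1) AS r
CROSS APPLY string_split(r.value, ',', 1) AS c;
```

On versions before SQL Server 2022 there is no ordinal argument, so an alternative splitter (or a PIVOT on known column positions via CHARINDEX/SUBSTRING) would be needed to keep the columns in order.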

If the table-column structure of these CSVs does not change, you can create the staging table by simply hard-coding the CREATE TABLE statement inside your procedure. If the structure might change over time, you could instead create a place in the application that holds the table-column structure, and build the staging table dynamically from that.

After that is all done, write a MERGE statement that updates existing rows and inserts new ones :grin: . Hope this helps!
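The final merge step could look roughly like this (target table, staging table, and the key column `id` are assumptions for illustration):

```sql
-- Upsert from the hypothetical staging table into the target table:
-- update rows whose key already exists, insert the rest.
MERGE dbo.target_table AS t
USING dbo.staging_import AS s
    ON t.id = s.id
WHEN MATCHED THEN
    UPDATE SET t.name   = s.name,
               t.amount = s.amount
WHEN NOT MATCHED BY TARGET THEN
    INSERT (id, name, amount)
    VALUES (s.id, s.name, s.amount);
```

Since the thread only needs inserts and updates, there is deliberately no WHEN NOT MATCHED BY SOURCE clause, so rows missing from the monthly file are left untouched rather than deleted.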

Userlevel 5
Badge +16

Hi @Mark Jongeling

Any chance you have been playing with the FORMATFILE option?  

I guess I'll try it with the bulk insert and see how it performs.

Userlevel 7
Badge +23


Hey Freddy,

I have not. In my experience with reading in such large files, the structure could change, so I had to create a place in the end application that holds the table-column structure, to keep the creation of the staging table flexible.

Reply