Solved

Read file connector with UTF-8 and UTF-8-BOM: Indicium versus Windows GUI

  • 30 September 2022


I am reading XML files in a system flow with the new Read file connector. A few files are read correctly and a few come out corrupted. The difference between the files is the encoding: the 'good' files are UTF-8 encoded and the others are UTF-8-BOM. The latter are not read correctly, resulting in a string that looks like this:

?<?xml version="1.0" encoding=

The old Read disk file connector does the same. However, my current process flow works correctly.
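
The stray leading ? in the corrupted string is consistent with a UTF-8 byte order mark (the bytes EF BB BF) being passed through as data instead of being stripped. The Python sketch below illustrates that effect outside the Thinkwise platform; it is an assumption about the cause, not a description of how the connector works, and the sample content and variable names are made up for illustration.

```python
import codecs

# Hypothetical file content: the same XML declaration with and without a BOM.
xml_text = '<?xml version="1.0" encoding="UTF-8"?>'
plain_bytes = xml_text.encode("utf-8")
bom_bytes = codecs.BOM_UTF8 + plain_bytes  # BOM_UTF8 is b"\xef\xbb\xbf"

# A plain UTF-8 decode keeps the BOM as the character U+FEFF at position 0.
decoded_plain = bom_bytes.decode("utf-8")
print(repr(decoded_plain[0]))

# The "utf-8-sig" codec drops a leading BOM if one is present and otherwise
# behaves like plain UTF-8, so it copes with both kinds of file.
print(bom_bytes.decode("utf-8-sig") == xml_text)
print(plain_bytes.decode("utf-8-sig") == xml_text)
```

A consumer that cannot display U+FEFF, or that maps it to another code page, would typically show a placeholder such as ? in front of the XML declaration.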

I have the feeling that the connectors behave differently in the Windows GUI (process flow) and in Indicium (system flow). I noticed this before when I was trying to read a PDF file. That time the problem was the other way around: I was unable to read the PDF file correctly in the Windows GUI, but it worked fine in a system flow.

Main question: how do I read both a UTF-8 file and a UTF-8-BOM file correctly using the Read file connector in a system flow?


Best answer by Hugo Nienhuis 30 September 2022, 13:57


3 replies


I have been able to bypass the problem by reading the file as varbinary and converting it to UTF-8 within SQL Server with the statement: select @ubl_bestand_vc = convert(varchar(max), @ubl_bestand_data, 0).

However, it might be something to look into.
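
The idea behind this workaround, reading the raw bytes and handling the encoding yourself, can be sketched in Python as follows. This only illustrates the byte-level logic; it is not the Thinkwise connector and not a translation of the SQL statement, and the function names are hypothetical.

```python
UTF8_BOM = b"\xef\xbb\xbf"  # byte order mark that UTF-8-BOM files start with


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM; unchanged if none is present."""
    if data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]
    return data


def read_xml_bytes(data: bytes) -> str:
    """Decode raw file bytes as UTF-8, whether or not they start with a BOM."""
    return strip_utf8_bom(data).decode("utf-8")


# Hypothetical sample payloads standing in for the two kinds of file.
with_bom = UTF8_BOM + b"<?xml version='1.0'?><root/>"
without_bom = b"<?xml version='1.0'?><root/>"
print(read_xml_bytes(with_bom) == read_xml_bytes(without_bom))
```

Checking for the three BOM bytes before decoding means both kinds of file go through the same code path, which is what the main question asks for.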


Thanks for sharing the solution 😄

We'll select your answer as Best answer. Feel free to create a ticket for this if you think we should look into it.


I am not going to write a ticket, but why can I choose Write preamble Yes/No in a Write file connector and not have a similar option in a Read file connector? Maybe it belongs in the idea section?