Priority task #2809
Closed
Improve import tasks from CSV - handle BOM
100%
Description
When the CSV contains a BOM (https://en.wikipedia.org/wiki/Byte_order_mark), which is very common for files created on Windows, the file can't be imported and the exception is misleading: it says that the column was not found.
The BOM is invisible almost everywhere: not on Windows, not in vim, not in the "less" command. It does show up in a diff.
Please also support files with a BOM, so that the BOM is ignored during import.
Workaround: add a dummy first column to the file; all other columns will then be handled correctly. This only works if the task tolerates dummy columns.
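To illustrate why the error talks about a missing column, here is a minimal sketch using only the Java standard library. It is not the CzechIdM import code; the class name, the helper methods and the column names "username" and "role" are made up for the example. It assumes a UTF-8 BOM and a naive comma split of the header line.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of czechidm-extras.
class BomHeaderDemo {

    // Reads the first line of the CSV as column names, with no BOM handling at all.
    static List<String> headerColumns(byte[] csvBytes) {
        String text = new String(csvBytes, StandardCharsets.UTF_8);
        String firstLine = text.split("\\R", 2)[0];
        return Arrays.asList(firstLine.split(","));
    }

    // Prepends the three UTF-8 BOM bytes (EF BB BF), as many Windows editors do.
    static byte[] withBom(String csv) {
        byte[] body = csv.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[body.length + 3];
        out[0] = (byte) 0xEF;
        out[1] = (byte) 0xBB;
        out[2] = (byte) 0xBF;
        System.arraycopy(body, 0, out, 3, body.length);
        return out;
    }

    public static void main(String[] args) {
        List<String> columns = headerColumns(withBom("username,role\njdoe,admin\n"));
        // The BOM decodes to U+FEFF and sticks to the first column name.
        System.out.println(columns.contains("username")); // false
        System.out.println(columns.contains("role"));     // true
        System.out.println(columns.get(0).length());      // 9, not 8
    }
}
```

Only the first header name is affected, which is consistent with the workaround: a dummy first column absorbs the BOM and the remaining columns are matched normally.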
Related issues
Updated by Alena Peterová over 3 years ago
- Related to Feature #1746: Configurable encoding, support for BOM added
Updated by Tomáš Doischer over 3 years ago
- Assignee set to Tomáš Doischer
- Target version set to 3.2.0
This should be quite straightforward because BOMInputStream handles the BOM only if it is present, so there should be no need for a fallback that catches exceptions when the original reader fails (https://commons.apache.org/proper/commons-io/apidocs/org/apache/commons/io/input/BOMInputStream.html). But this needs to be tested.
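The "only if present" behaviour can be sketched with the standard library alone. This is a stand-in for illustration, not BOMInputStream itself and not the code from the branch; the class and method names are hypothetical, and only the UTF-8 BOM is considered.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of czechidm-extras or Commons IO.
class BomSkippingDemo {

    // Returns a stream positioned after the UTF-8 BOM if there is one,
    // or at the very beginning of the data if there is none.
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int read = 0;
        while (read < 3) {
            int n = pushback.read(head, read, 3 - read);
            if (n < 0) {
                break; // stream is shorter than three bytes
            }
            read += n;
        }
        boolean hasBom = read == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        if (!hasBom && read > 0) {
            pushback.unread(head, 0, read); // no BOM: give the bytes back
        }
        return pushback;
    }

    // Reads the header line through the BOM-skipping stream.
    static String firstLine(byte[] csvBytes) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                skipUtf8Bom(new ByteArrayInputStream(csvBytes)), StandardCharsets.UTF_8))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] plain = "username,role\njdoe,admin\n".getBytes(StandardCharsets.UTF_8);
        byte[] withBom = new byte[plain.length + 3];
        withBom[0] = (byte) 0xEF;
        withBom[1] = (byte) 0xBB;
        withBom[2] = (byte) 0xBF;
        System.arraycopy(plain, 0, withBom, 3, plain.length);

        System.out.println(firstLine(withBom)); // username,role
        System.out.println(firstLine(plain));   // username,role
    }
}
```

Both inputs go through the same code path and yield the same header, so no retry or exception handling is needed to tell the two kinds of file apart.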
Updated by Tomáš Doischer over 3 years ago
- Status changed from New to Needs feedback
- % Done changed from 0 to 80
Support for BOM in CSV was added. I also added a test: there is now a test file with a BOM (created in Sublime Text). I added the test for one of the tasks that extend AbstractCsvImportTask, not for AbstractCsvImportTask itself, because that would be a bit of a pain to do.
I didn't touch ImportAutomaticRoleAttributesFromCSVExecutor because it is obsolete. I also didn't change the documentation, since this was a bug that was not mentioned there before.
Code in branch: https://github.com/bcvsolutions/czechidm-extras/compare/doischer/2809-fix-issues-with-bom-in-csv-imports
@apeterova, can you give me feedback please?
Updated by Peter Štrunc over 3 years ago
- Status changed from Needs feedback to Closed
- Assignee changed from Tomáš Doischer to Peter Štrunc
- % Done changed from 80 to 100
LGTM, thanks for this fix. Merged it to develop.