Fixed
Details
Assignee: Richard Chapman
Reporter: Attila Vamos
Priority: Minor
Compatibility: Minor
Created March 1, 2016 at 11:23 AM
Updated March 23, 2016 at 11:44 AM
Resolved March 23, 2016 at 11:44 AM
From Jose Bello:
Hi, I sprayed a UTF16LE XML file to thor but I receive this error when I try to read it. Any ideas why? Is there an option in the dataset statement that I should be using? Below is a screenshot of the first two bytes of the file in fileview.

System error: 2: Error - syntax error "Unsupported unicode detected in BOM header" [file offset 2] (//1.1.1.129:7100/var/lib/HPCCSystems/hpcc-data/thor/thor_data400/in/aurorapd/crash_xml._1_of_40) (in Xml Read G1 E2)

Here is the dataset statement:

dataset('~thor_data400::in::aurorapd::crash_xml',r,xml('CRASH_XML/CRASH_TABLE/CRASH'));
I got the input file (batch_mCO00A_i340753_d20160226120000_CRASH_2425_20160226_102324527.xml), sprayed it, and ran this code:
r := record
  STRING dataProviderId { XPATH('dataProviderId') };
  STRING caseNumber { XPATH('caseNumber') };
  STRING reportNumber { XPATH('reportNumber') };
  STRING reportDate { XPATH('reportDate') };
  STRING address { XPATH('address') };
  STRING county { XPATH('county') };
  STRING city { XPATH('city') };
  STRING state { XPATH('state') };
  STRING x_coordinate { XPATH('x_coordinate') };
  STRING y_coordinate { XPATH('y_coordinate') };
  STRING coordinate { XPATH('coordinate') };
  STRING hitAndRun { XPATH('hitAndRun') };
  STRING intersectionRelated { XPATH('intersectionRelated') };
  STRING officerName { XPATH('officerName') };
  STRING crashType { XPATH('crashType') };
  STRING locationType { XPATH('locationType') };
  STRING accidentClass { XPATH('accidentClass') };
  STRING specialCircumstance1 { XPATH('specialCircumstance1') };
  STRING specialCircumstance2 { XPATH('specialCircumstance2') };
  STRING specialCircumstance3 { XPATH('specialCircumstance3') };
  STRING lightCondition { XPATH('lightCondition') };
  STRING weatherCondition { XPATH('weatherCondition') };
  STRING surfaceType { XPATH('surfaceType') };
  STRING roadSpecialFeature1 { XPATH('roadSpecialFeature1') };
  STRING roadSpecialFeature2 { XPATH('roadSpecialFeature2') };
  STRING roadSpecialFeature3 { XPATH('roadSpecialFeature3') };
  STRING surfaceCondition { XPATH('surfaceCondition') };
  STRING trafficControlPresent { XPATH('trafficControlPresent') };
  STRING narrative { XPATH('narrative') };
  STRING quarantined { XPATH('quarantined') };
  STRING action { XPATH('action') };
end;

dataset('~batch_mco00a_i340753_d20160226120000_crash_2425_20160226_102324527.xml_copy',r,xml('CRASH_XML/CRASH_TABLE/CRASH'));
The result is the same error message.
@Jose Bello @Gavin Halliday @Jacob Cobbett-Smith I have some questions:
Do we support UTF16LE/BE in our engine, and is that error message caused by a bug?
The XML file has a complex structure:

<CRASH_XML>
  <CRASH_TABLE>
    <CRASH> </CRASH>
    <CRASH> </CRASH>
    ...
  </CRASH_TABLE>
  <PERSON_TABLE> ... </PERSON_TABLE>
  <VEHICLE_TABLE> ... </VEHICLE_TABLE>
</CRASH_XML>

and I'm not sure this simple dataset instruction is adequate for reading a subset (the CRASH records from CRASH_TABLE) of that file.
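
To rule out a malformed BOM in the input itself, the leading bytes of the original file can be checked outside the platform. The following is only a sketch in Python (not ECL), assuming the file is available on a local disk; the helper names `detect_bom` and `detect_file_bom` are hypothetical and not part of any HPCC tooling.

```python
import codecs

# Known byte-order marks. UTF-32 entries come first because the UTF-32 LE
# BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(head: bytes):
    """Return the encoding implied by a BOM at the start of `head`, or None."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return None


def detect_file_bom(path: str):
    """Read the first four bytes of `path` and report the BOM, if any."""
    with open(path, "rb") as f:
        return detect_bom(f.read(4))
```

If `detect_file_bom` reports `utf-16-le` for the original file, the BOM itself is well-formed, which would point at the reader rather than at the data. Note that a UTF-16LE file whose first character after the BOM is NUL would be reported as `utf-32-le`; that cannot happen for an XML document starting with `<`.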