Query using ECL Library results in a Roxie core dump

Environment

Alpha Dev

Description

We are exploring the use of ECL libraries to separate out some of the functionality of our more complicated Scoring queries.

As a proof of concept in the Alpha Dev Roxie environment, we deployed a library and a query that uses it. When we executed the query, the Roxie process crashed with a core dump.

Here are some of the details:

ECLWatch (dev192 100-way Roxie): http://10.194.21.162:8010

Library: http://10.194.21.162:8010/?QuerySetId=roxie_eclcc&Id=iss_products_insurview_core.lib_insurview_bf.4&Widget=QuerySetDetailsWidget

Query using library: http://10.194.21.162:8010/?QuerySetId=roxie_eclcc&Id=iss_core.service_iss_fcra_bf.12&Widget=QuerySetDetailsWidget

Backtrace from the Roxie log:

000091DD 2019-10-03 15:31:56.318 201411 208692 "================================================"
000091DE 2019-10-03 15:31:56.318 201411 208692 "Program: 10.194.192.8:/mnt/disk1/HPCCSystems/bin/roxie"
000091DF 2019-10-03 15:31:56.318 201411 208692 "Signal: 11 Segmentation fault"
000091E0 2019-10-03 15:31:56.318 201411 208692 "Fault IP: 00007FBD67F65454"
000091E1 2019-10-03 15:31:56.318 201411 208692 "Accessing: 00007FBF8A4EDF46"
000091E2 2019-10-03 15:31:56.318 201411 208692 "Backtrace:"
000091E3 2019-10-03 15:31:56.371 201411 208692 " /var/lib/HPCCSystems/queries/roxie_dev192/libW20191003-151228-3.so(+0x431454) [0x7fbd67f65454]"
000091E4 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN39CRoxieServerStrandedInlineTableActivity26InlineTableSimpleProcessor7nextRowEv+0xc4) [0x7fbf076eee84]"
000091E5 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN40CRoxieServerNormalizeLinkedChildActivity7nextRowEv+0x120) [0x7fbf07708c20]"
000091E6 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN26CRoxieServerFilterActivity7nextRowEv+0x3b) [0x7fbf076f126b]"
000091E7 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN26CRoxieServerFilterActivity7nextRowEv+0x3b) [0x7fbf076f126b]"
000091E8 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN30CRoxieServerLookupJoinActivity7nextRowEv+0x109) [0x7fbf07732a59]"
000091E9 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN32CRoxieServerThroughSpillActivity13OutputAdaptor7nextRowEv+0x369) [0x7fbf0772bac9]"
000091EA 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN30CRoxieServerLookupJoinActivity7nextRowEv+0x109) [0x7fbf07732a59]"
000091EB 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN37CRoxieServerStrandedAggregateActivity18AggregateProcessor7nextRowEv+0x66) [0x7fbf076f20e6]"
000091EC 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libthorhelper.so(_ZN16IEngineRowStream7readAllER23RtlLinkedDatasetBuilder+0x24) [0x7fbf06feb014]"
000091ED 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN36CRoxieServerLocalResultWriteActivity9onExecuteEv+0x37) [0x7fbf076dcd17]"
000091EE 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN32CRoxieServerInternalSinkActivity7executeEjPKh+0xaf) [0x7fbf0772abef]"
000091EF 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN20CRoxieServerActivity5startEjPKhb+0xf7) [0x7fbf076f6d07]"
000091F0 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN22CRoxieServerIfActivity5startEjPKhb+0x1e) [0x7fbf076f8ace]"
000091F1 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN20CRoxieServerActivity5startEjPKhb+0x11a) [0x7fbf076f6d2a]"
000091F2 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN32CRoxieServerInternalSinkActivity7executeEjPKh+0x8a) [0x7fbf0772abca]"
000091F3 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN20CRoxieServerActivity5startEjPKhb+0xf7) [0x7fbf076f6d07]"
000091F4 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN28CRoxieServerStrandedActivity5startEjPKhb+0xc) [0x7fbf076bb65c]"
000091F5 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN39CRoxieServerStrandedInlineTableActivity5startEjPKhb+0x1f) [0x7fbf0770c26f]"
000091F6 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN20CRoxieServerActivity5startEjPKhb+0x11a) [0x7fbf076f6d2a]"
000091F7 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN31CRoxieServerLibraryCallActivity13OutputAdaptor5startEjPKhb+0x37) [0x7fbf076d4017]"
000091F8 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN20CRoxieServerActivity5startEjPKhb+0x11a) [0x7fbf076f6d2a]"
000091F9 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN32CRoxieServerThroughSpillActivity5startEjjPKhb+0x188) [0x7fbf0772b298]"
000091FA 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN32CRoxieServerThroughSpillActivity13OutputAdaptor5startEjPKhb+0x5b) [0x7fbf076cd56b]"
000091FB 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN20CRoxieServerActivity5startEjPKhb+0x11a) [0x7fbf076f6d2a]"
000091FC 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN32CRoxieServerInternalSinkActivity7executeEjPKh+0x8a) [0x7fbf0772abca]"
000091FD 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libccd.so(_ZN14CActivityGraph10SinkThread10threadmainEv+0x1e) [0x7fbf076cd67e]"
000091FE 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libjlib.so(_ZN19CThreadedPersistent10threadmainEv+0x4d) [0x7fbf01ba258d]"
000091FF 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libjlib.so(_ZN19CThreadedPersistent8CAThread3runEv+0x10) [0x7fbf01ba8150]"
00009200 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libjlib.so(_ZN6Thread5beginEv+0x2c) [0x7fbf01ba193c]"
00009201 2019-10-03 15:31:56.371 201411 208692 " /opt/HPCCSystems/lib/libjlib.so(_ZN6Thread11_threadmainEPv+0x1e) [0x7fbf01ba300e]"
00009202 2019-10-03 15:31:56.371 201411 208692 " /usr/lib64/libpthread.so.0(+0x7e65) [0x7fbf002c2e65]"
00009203 2019-10-03 15:31:56.371 201411 208692 " /usr/lib64/libc.so.6(clone+0x6d) [0x7fbefffeb88d]"
00009204 2019-10-03 15:31:56.371 201411 208692 "Registers:"
00009205 2019-10-03 15:31:56.371 201411 208692 "EAX:00000000F9874383 EBX:0000000000001343 ECX:0000000086300E41 EDX:00007FBE90C79BC3 ESI:0000000000000004 EDI:0000000000001343"
00009206 2019-10-03 15:31:56.371 201411 208692 "R8 :00000000F9873D7B R9 :0000000000001337 R10:0000000000001333 R11:000000000000132F"
00009207 2019-10-03 15:31:56.371 201411 208692 "R12:000000000000007C R13:00007FBD629A1C08 R14:00007FBD629A1BD0 R15:00007FBF0512CC60"
00009208 2019-10-03 15:31:56.371 201411 208692 "CS:EIP:0033:00007FBD67F65454"
00009209 2019-10-03 15:31:56.371 201411 208692 " ESP:00007FBD629A1B50 EBP:0000000000000054"
0000920A 2019-10-03 15:31:56.371 201411 208692 "Stack[00007FBD629A1B50]: 0039A8A5A59AB246 C00318900039A8A5 00007FBEC0031890 0000134300007FBE 0000000000001343 C008B72000000000 00007FBEC008B720 C008B8C000007FBE"
0000920B 2019-10-03 15:31:56.371 201411 208692 "Stack[00007FBD629A1B70]: 00007FBEC008B8C0 C003036000007FBE 00007FBEC0030360 629A1BF000007FBE 00007FBD629A1BF0 049798B200007FBD 00007FBF049798B2 629A1C0800007FBF"
0000920C 2019-10-03 15:31:56.371 201411 208692 "Stack[00007FBD629A1B90]: 00007FBD629A1C08 C008B72000007FBD 00007FBEC008B720 0512CC7000007FBE 00007FBF0512CC70 629A1BF000007FBF 00007FBD629A1BF0 629A1C0800007FBD"
0000920D 2019-10-03 15:31:56.371 201411 208692 "Stack[00007FBD629A1BB0]: 00007FBD629A1C08 629A1BD000007FBD 00007FBD629A1BD0 0512CC6000007FBD 00007FBF0512CC60 076EEE8400007FBF 00007FBF076EEE84 A59ABA9A00007FBF"
0000920E 2019-10-03 15:31:56.371 201411 208692 "Stack[00007FBD629A1BD0]: 0039A8A5A59ABA9A C008B7480039A8A5 00007FBEC008B748 0000010100007FBE 0000000000000101 C00304B000000000 00007FBEC00304B0 0512CC7000007FBE"
0000920F 2019-10-03 15:31:56.371 201411 208692 "Stack[00007FBD629A1BF0]: 00007FBF0512CC70 CA04608800007FBF 00007FBECA046088 C008B85000007FBE 00007FBEC008B850 00001FF800007FBE 00007FBF00001FF8 5D964CAC00007FBF"
00009210 2019-10-03 15:31:56.371 201411 208692 "Stack[00007FBD629A1C10]: 000000005D964CAC C00304B000000000 00007FBEC00304B0 C0032A6800007FBE 00007FBEC0032A68 629A1C5000007FBE 00007FBD629A1C50 0000000100007FBD"
00009211 2019-10-03 15:31:56.371 201411 208692 "Stack[00007FBD629A1C30]: 0000000000000001 0000000300000000 0000000000000003 629A1D6000000000 00007FBD629A1D60 07708C2000007FBD 00007FBF07708C20 A59ABA1E00007FBF"
00009212 2019-10-03 15:31:56.371 201411 208692 "ThreadList:

Conclusion

None

Activity

Kevin Logemann October 8, 2019 at 2:50 PM

Thanks for the prompt resolution. We will test the suggested workaround ASAP.

However, do the changes in the pull request eliminate the need for the EMBEDDED workaround?

Gavin Halliday October 7, 2019 at 5:54 PM

I have a reasonable idea of what is causing it. Try adding EMBEDDED in front of all the child datasets in t_InsuranceScoreContext, e.g. CIIDAccountWL, ScoreModels, AttributeNames, etc.

I think the caller is serializing the datasets, but the library is assuming they are in the in-memory format. I should be able to create a test case to reproduce it.
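
As an illustration only, the suggested workaround might look like the sketch below. The real layout of t_InsuranceScoreContext is not shown in this ticket, so the child record r_ScoreModel, the field types, and the record names ending in _Before and _After are hypothetical. The placement of EMBEDDED in front of the DATASET field follows the "on the front" wording above and should be checked against the ECL Language Reference.

```ecl
// Hypothetical child layout; the real one is not shown in this ticket.
r_ScoreModel := RECORD
    STRING20 ModelName;
END;

// Before: child datasets declared with no explicit storage format,
// so the caller and the library may disagree on how they are passed.
t_Context_Before := RECORD
    UNSIGNED4 Id;
    DATASET(r_ScoreModel) ScoreModels;
END;

// After: EMBEDDED added in front of each child dataset field, so the
// child rows are stored inline within the parent row.
t_Context_After := RECORD
    UNSIGNED4 Id;
    EMBEDDED DATASET(r_ScoreModel) ScoreModels;
END;
```

The same change would apply to each child dataset named above (CIIDAccountWL, ScoreModels, AttributeNames, and so on), presumably in the record definition shared by the library and the calling query.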

 

Richard Chapman October 7, 2019 at 4:16 PM

I suspect it will be hard to debug this without being able to reproduce it inside a debugger. Anything you can do to produce a cut-down standalone example would greatly improve our ability to do so.

Kevin Logemann October 7, 2019 at 3:20 PM

Any initial feedback on this one? Is there any more info we need to provide?

This is close to blocking our progress on this effort, so any help in diagnosing or resolving the issue (or providing a viable workaround) would be most appreciated.

Fixed
Created October 3, 2019 at 7:53 PM
Updated October 21, 2019 at 9:34 AM
Resolved October 21, 2019 at 9:34 AM