Roxie may core attempting to calculate required translation for a null subfile

Environment

10.173.160.101:8010

Description

On the 160 cluster, Roxie is coring when it can't load a query because the platform version the query was compiled against is no longer compatible.

"WARNING: Warning: Could not load /var/lib/HPCCSystems/queries/roxie_160/libW20190916-173655.so: /var/lib/HPCCSystems/queries/roxie_160/libW20190916-173655.so: undefined symbol: _Z23createKeySegmentMonitorbP10IStringSetjj"
00044CA8 USR 2020-05-23 01:15:49.472 87029 28757 "ERROR: 0: ccdstate.cpp(1002) : Failed to load query driversvtsa_services.dlsearchservice.3.1 from libW20190916-173655.so : Error loading /var/lib/HPCCSystems/queries/roxie_160/libW20190916-173655.so: /var/lib/HPCCSystems/queries/roxie_160/libW20190916-173655.so: undefined symbol: _Z23createKeySegmentMonitorbP10IStringSetjj"
00044CA9 PRG 2020-05-23 01:15:49.477 87029 28787 "Dali lookup regress::single::DG_CSV returned match in 7 ms"
00044CAA OPR 2020-05-23 01:15:49.480 87029  1045 "WARNING: Warning: Could not load /var/lib/HPCCSystems/queries/roxie_160/libW20191111-173054.so: /var/lib/HPCCSystems/queries/roxie_160/libW20191111-173054.so: undefined symbol: _Z23createKeySegmentMonitorbP10IStringSetjj"
00044CAB USR 2020-05-23 01:15:49.481 87029  1045 "ERROR: 0: ccdstate.cpp(1002) : Failed to load query personsearch_services.checksearchservice.1 from libW20191111-173054.so : Error loading /var/lib/HPCCSystems/queries/roxie_160/libW20191111-173054.so: /var/lib/HPCCSystems/queries/roxie_160/libW20191111-173054.so: undefined symbol: _Z23createKeySegmentMonitorbP10IStringSetjj"
00044CAC OPR 2020-05-23 01:15:49.485 87029 28648 "WARNING: Warning: Could not load /var/lib/HPCCSystems/queries/roxie_160/libW20191111-170654.so: /var/lib/HPCCSystems/queries/roxie_160/libW20191111-170654.so: undefined symbol: _ZN26RtlSafeLinkedDatasetCursorC1EjPPh"
00044CAD USR 2020-05-23 01:15:49.485 87029 28648 "ERROR: 0: ccdstate.cpp(1002) : Failed to load query frauddefensenetwork_services.searchservice.1 from libW20191111-170654.so : Error loading /var/lib/HPCCSystems/queries/roxie_160/libW20191111-170654.so: /var/lib/HPCCSystems/queries/roxie_160/libW20191111-170654.so: undefined symbol: _ZN26RtlSafeLinkedDatasetCursorC1EjPPh"
00044CAE PRG 2020-05-23 01:15:49.489 87029 28787 "Dali lookup regress::single::DG_FLAT_EVENS returned match in 7 ms"
00044CAF PRG 2020-05-23 01:15:49.492 87029 28787 "Dali lookup regress::single::DG_FLAT returned match in 3 ms"
00044CB0 OPR 2020-05-23 01:15:49.496 87029 28760 "WARNING: Warning: Could not load /var/lib/HPCCSystems/queries/roxie_160/libW20191028-171150.so: /var/lib/HPCCSystems/queries/roxie_160/libW20191028-171150.so: undefined symbol: _ZN23RtlLinkedDatasetBuilder10appendRowsEjPPh"
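
The undefined symbols in the log above are mangled C++ names, which makes it hard to see which platform functions the stale query libraries expect. As a rough aid, they can be demangled with the `abi::__cxa_demangle` routine that GCC and Clang provide in `<cxxabi.h>`. The `demangle` wrapper below is a hypothetical helper written for this ticket, not part of the platform, and it assumes a GCC- or Clang-compatible toolchain.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Hypothetical helper: returns the demangled form of an Itanium-ABI mangled
// name, or the input unchanged when demangling fails.
std::string demangle(const std::string &mangled)
{
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with
    // malloc, so the caller must release it with free.
    char *out = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || out == nullptr)
        return mangled;
    std::string result(out);
    std::free(out);
    return result;
}
```

Applied to the three symbols in the log, this yields `createKeySegmentMonitor(bool, IStringSet*, unsigned int, unsigned int)`, `RtlSafeLinkedDatasetCursor::RtlSafeLinkedDatasetCursor(unsigned int, unsigned char**)` and `RtlLinkedDatasetBuilder::appendRows(unsigned int, unsigned char**)`.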

It probably shouldn't be coring if it can't load a query.

 

Conclusion

None

Activity

Richard Chapman May 29, 2020 at 1:49 PM

I don't think it's anything to do with older versions. It seems to be coring when referencing a subfile, within a superfile, that is null.

Stuart Ort May 26, 2020 at 7:59 PM

From node 33. I'm not sure this is related to old queries (7.6, from 9/2019) not loading on 7.8. Not all nodes cored (e.g. node 1 didn't).

Core was generated by `roxie --topology=RoxieTopology.xml --logfile --restarts=0 --stdlog=0'.
Program terminated with signal 11, Segmentation fault.
#0 isFileKey (f=0x0) at /var/lib/jenkins/workspace/LN-with-Plugins-Spark_Candidate-7.8.x-Nightly-Build/LN/centos-el7-x86-64-aws-ris/HPCC-Platform/dali/base/dadfs.hpp:819
819 /var/lib/jenkins/workspace/LN-with-Plugins-Spark_Candidate-7.8.x-Nightly-Build/LN/centos-el7-x86-64-aws-ris/HPCC-Platform/dali/base/dadfs.hpp: No such file or directory.
Missing separate debuginfos, use: debuginfo-install hpccsystems-platform-with-spark-7.8.17-closedown05262020014710.x86_64
(gdb) where
#0 isFileKey (f=0x0) at /var/lib/jenkins/workspace/LN-with-Plugins-Spark_Candidate-7.8.x-Nightly-Build/LN/centos-el7-x86-64-aws-ris/HPCC-Platform/dali/base/dadfs.hpp:819
#1 getMode (fileDesc=0x0) at /var/lib/jenkins/workspace/LN-with-Plugins-Spark_Candidate-7.8.x-Nightly-Build/LN/centos-el7-x86-64-aws-ris/HPCC-Platform/roxie/ccd/ccdfile.cpp:2102
#2 CResolvedFile::getTranslators (this=0x7f89589635a0, projectedFormatCrc=-574812267, projected=0x7f88e85ef520 <mx7>, expectedFormatCrc=-661293735, expected=0x7f88e85ef5e0 <mx4>, mode=None,
fileMode=flat) at /var/lib/jenkins/workspace/LN-with-Plugins-Spark_Candidate-7.8.x-Nightly-Build/LN/centos-el7-x86-64-aws-ris/HPCC-Platform/roxie/ccd/ccdfile.cpp:2131
#3 0x00007f8aecd3943f in CRoxieDiskReadBaseActivity::setVariableFileInfo (this=0x7f895895f6a0)
at /var/lib/jenkins/workspace/LN-with-Plugins-Spark_Candidate-7.8.x-Nightly-Build/LN/centos-el7-x86-64-aws-ris/HPCC-Platform/roxie/ccd/ccdactivities.cpp:909
#4 0x00007f8aecd37147 in CRoxieSlaveActivity::onCreate (this=this@entry=0x7f895895f6a0)
at /var/lib/jenkins/workspace/LN-with-Plugins-Spark_Candidate-7.8.x-Nightly-Build/LN/centos-el7-x86-64-aws-ris/HPCC-Platform/roxie/ccd/ccdactivities.cpp:358
#5 0x00007f8aecd371f0 in CRoxieDiskReadBaseActivity::onCreate (this=0x7f895895f6a0)
at /var/lib/jenkins/workspace/LN-with-Plugins-Spark_Candidate-7.8.x-Nightly-Build/LN/centos-el7-x86-64-aws-ris/HPCC-Platform/roxie/ccd/ccdactivities.cpp:871
#6 0x00007f8aecd3e2ad in non-virtual thunk to CRoxieDiskReadActivityFactory::createActivity(SlaveContextLogger&, IRoxieQueryPacket*) const ()
at /var/lib/jenkins/workspace/LN-with-Plugins-Spark_Candidate-7.8.x-Nightly-Build/LN/centos-el7-x86-64-aws-ris/HPCC-Platform/roxie/ccd/ccdactivities.cpp:1058
#7 0x00007f8aece2d741 in CRoxieWorker::doActivity (this=this@entry=0xa476940)
at /var/lib/jenkins/workspace/LN-with-Plugins-Spark_Candidate-7.8.x-Nightly-Build/LN/centos-el7-x86-64-aws-ris/HPCC-Platform/roxie/ccd/ccdqueue.cpp:1261
#8 0x00007f8aece2e128 in CRoxieWorker::threadmain (this=0xa476940)
at /var/lib/jenkins/workspace/LN-with-Plugins-Spark_Candidate-7.8.x-Nightly-Build/LN/centos-el7-x86-64-aws-ris/HPCC-Platform/roxie/ccd/ccdqueue.cpp:1350
#9 0x00007f8ae69ed4d5 in non-virtual thunk to CPooledThreadWrapper::run() () from /opt/HPCCSystems/lib/libjlib.so
#10 0x00007f8ae69e4688 in Thread::begin (this=0x7959bc0)
at /var/lib/jenkins/workspace/LN-with-Plugins-Spark_Candidate-7.8.x-Nightly-Build/LN/centos-el7-x86-64-aws-ris/HPCC-Platform/system/jlib/jthread.cpp:291
#11 0x00007f8ae69e3d0d in Thread::_threadmain (v=0x7959bc0)
at /var/lib/jenkins/workspace/LN-with-Plugins-Spark_Candidate-7.8.x-Nightly-Build/LN/centos-el7-x86-64-aws-ris/HPCC-Platform/system/jlib/jthread.cpp:137
#12 0x00007f8aec055e65 in start_thread () from /lib64/libpthread.so.0
#13 0x00007f8ae4e2288d in clone () from /lib64/libc.so.6
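
Frames #0 to #2 of the backtrace above show `getTranslators` passing a null `fileDesc` to `getMode`, which forwards it to `isFileKey`, where it is dereferenced. The sketch below only illustrates that failure shape and one possible guard. Every type and function in it (`FileDescStub`, `getModeGuarded`, `subfilesNeedingTranslationCheck`) is a hypothetical stand-in, not the platform's code, and the ticket does not record what the actual fix looked like.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a file descriptor; NOT the platform's real type.
struct FileDescStub
{
    bool isKey;  // true for an index (key) file, false for a flat file
};

enum class FileMode { Unknown, Flat, Key };

// Mirrors the shape of frame #0: dereferences f unconditionally, so a null
// pointer here is undefined behaviour (a segmentation fault in practice).
inline bool isFileKeyUnchecked(const FileDescStub *f)
{
    return f->isKey;
}

// Guarded counterpart of frame #1: report Unknown for a null descriptor
// instead of forwarding it to the dereferencing helper.
inline FileMode getModeGuarded(const FileDescStub *fileDesc)
{
    if (!fileDesc)
        return FileMode::Unknown;
    return isFileKeyUnchecked(fileDesc) ? FileMode::Key : FileMode::Flat;
}

// Walks a superfile's subfiles and returns the indexes of those that have a
// usable descriptor; null entries are skipped (assumption: a null subfile
// has no data and therefore needs no translation).
inline std::vector<std::size_t> subfilesNeedingTranslationCheck(
    const std::vector<const FileDescStub *> &subfiles)
{
    std::vector<std::size_t> result;
    for (std::size_t i = 0; i < subfiles.size(); ++i)
    {
        if (getModeGuarded(subfiles[i]) != FileMode::Unknown)
            result.push_back(i);
    }
    return result;
}
```

Whether a null subfile should be skipped, as here, or reported as a query error is a design decision this sketch does not settle.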

Fixed
Details

Created May 26, 2020 at 7:56 PM
Updated June 1, 2020 at 11:33 AM
Resolved June 1, 2020 at 11:32 AM