Slave killed by OOM killer due to excessive heap usage by hundreds of compressed readers

Environment

Boca Dataland 6.0.4-rc1

Description

Upon executing the Business Linking test cases, I am receiving "System error: 4: MP link closed" errors for the following 2 scripts. This started after the upgrade of Boca Dataland from 6.0.2-rc4 to 6.0.4-rc1.

W20160719-141803
W20160719-095741

Conclusion

None

Activity

Jacob Cobbett-Smith September 14, 2016 at 1:03 PM

Please close this issue (https://hpccsystems.atlassian.net/browse/HPCC-15972#icft=HPCC-15972 is the issue that came out of this discussion).

Jacob Cobbett-Smith July 21, 2016 at 9:21 AM

I have confirmed that you can hit this problem by writing a query with a lot of highly compressible data that spills hundreds of times, as this query did.

This is not a new symptom, but the default compression may expose the issue more readily (since LZ4's compression ratio is higher).
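
To make the scaling concrete, here is a rough back-of-the-envelope sketch. It assumes every spill file is read back through its own compressed reader and that each reader holds a fixed-size buffer on the heap. The function name and the 1 MiB buffer size are hypothetical, chosen only for illustration; they are not taken from the platform's code or configuration.

```python
MIB = 1024 * 1024


def estimated_reader_heap_bytes(num_spill_files: int, buffer_bytes_per_reader: int) -> int:
    """Estimate heap held by readers if every spill file is open at once.

    Assumption (not from the platform source): one reader per spill file,
    each holding one fixed-size buffer.
    """
    return num_spill_files * buffer_bytes_per_reader


if __name__ == "__main__":
    # Hypothetical per-reader buffer size, for illustration only.
    buffer_bytes = 1 * MIB
    for spills in (10, 100, 500):
        total = estimated_reader_heap_bytes(spills, buffer_bytes)
        print(f"{spills} spill files -> {total / MIB:.0f} MiB of reader buffers")
```

Under these assumptions the heap cost grows linearly with the number of spills, so a query that spills hundreds of times pays that cost hundreds of times over, regardless of how small each compressed file is on disk.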

I am marking this issue as 6.2 and 'Awaiting Information' re. questions asked above.
I've opened a new issue (see https://hpccsystems.atlassian.net/browse/HPCC-15972#icft=HPCC-15972) for future work to improve the handling of stream compressed files.
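
As an illustration only: one generic way to cap the number of simultaneously open readers is a multi-pass merge with a bounded fan-in. This discussion does not say whether HPCC-15972 takes this approach, so treat the sketch below as a textbook technique rather than the planned fix. In-memory sorted lists stand in for spill files, and `bounded_merge` and `max_open` are hypothetical names.

```python
import heapq
from typing import List


def bounded_merge(runs: List[List[int]], max_open: int) -> List[int]:
    """Merge sorted runs without reading more than max_open runs at once.

    Each pass merges groups of up to max_open runs into longer runs, and
    passes repeat until a single run remains.
    """
    if max_open < 2:
        raise ValueError("max_open must be at least 2")
    if not runs:
        return []
    while len(runs) > 1:
        next_pass = []
        for start in range(0, len(runs), max_open):
            group = runs[start:start + max_open]
            # heapq.merge lazily merges already-sorted inputs.
            next_pass.append(list(heapq.merge(*group)))
        runs = next_pass
    return runs[0]


if __name__ == "__main__":
    spills = [[1, 4], [2, 5], [3, 6], [0, 7]]
    print(bounded_merge(spills, max_open=2))
```

The trade-off is extra passes over the data: with a fan-in of `max_open`, intermediate runs are written and read again, which costs I/O in exchange for a fixed ceiling on reader memory.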

For now, and even after that improvement, the ECL/data should be scrutinized and fixed to avoid the root cause, since as it stands it will be very inefficient at best.

Manjunath Venkataswamy July 20, 2016 at 6:57 PM

Could you also see if this is in any way linked to https://hpccsystems.atlassian.net/browse/HPCC-15698#icft=HPCC-15698?

Manjunath Venkataswamy July 20, 2016 at 6:56 PM

Thanks for the detailed explanation of what caused the issue. However, could you take a look at the following WUID (for the same script), which did run to completion on 6.0.2-rc4: W20160707-134752

Meanwhile, I will get with the developers who provided me with this script to find out whether the compression type could be checked, given the change from LZW to LZ4.

Jacob Cobbett-Smith July 20, 2016 at 4:38 PM

The one thing that has changed between 5.6.x and 6.0 (not 6.0.2 and 6.0.4) is the default compression type.
It used to be LZW and now it's LZ4.
That is likely to have increased the compression ratio and may have exacerbated the problem.

Duplicate

Created July 20, 2016 at 1:12 PM
Updated September 28, 2017 at 9:39 AM
Resolved September 28, 2017 at 9:39 AM