Poor code for count(x) > n

Description

Spotted in example addressclean1b.

I'm not sure what the underlying ECL is, but activity 392 contains the following code at the end of the onStart():

byte * rowJ39;
rowJ39 = (byte *)vQ19.getbytes();
byte * endK39;
endK39 = rowJ39+vR19;
for (;rowJ39 < endK39;) {
    crN19.createRow();
    rtlWriteInt3(crN19.rowBuilder().row() + 0U,*((int *)(rowJ39 + 0U)));
    crN19.finalizeRow(3U);
    rowJ39 += 4U;
}
byte * rowL39;
byte * * curM39 = crN19.queryrows();
unsigned vN39 = crN19.getcount();
for (;vN39--;) {
    rowL39 = *curM39++;
    vM19++;
    if (vM19 > 251LL) {
        break;
    }
    vL19++;
}

Why wasn't that optimized to remove the project(), and why wasn't the count executed inside the previous loop?
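
As a rough illustration of what is being asked for, the two loops could in principle collapse into a single pass that counts the fixed-size input elements directly, without building the intermediate 3-byte rows. This is a hand-written sketch, not output from the code generator: the name `countUpTo` and its parameters are hypothetical, and it assumes the projected rows are needed for nothing except the count.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of the generated code or the runtime).
// Walks a buffer of `sizeInBytes` bytes in steps of `elementSize` and
// counts the elements, stopping once `limit` elements have been counted.
// Assumes elementSize > 0.
inline std::uint64_t countUpTo(std::size_t sizeInBytes,
                               std::size_t elementSize,
                               std::uint64_t limit)
{
    std::uint64_t count = 0;
    for (std::size_t offset = 0; offset + elementSize <= sizeInBytes; offset += elementSize)
    {
        if (count >= limit)
            break;          // same early exit as the limit check in the second loop
        ++count;            // no row is created or projected
    }
    return count;
}
```

With the values from the code above, the call would be `countUpTo(vR19, 4, 251)`. For fixed-size elements the loop itself is redundant, since the result is simply the smaller of `vR19 / 4` and 251.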

Conclusion

None

Activity


Richard Chapman December 2, 2016 at 9:50 AM
Edited

Optimizing count(ds) > n to count(choosen(...)) feels like it could be a bad idea in other circumstances too, for example where ds is an index read or an unfiltered diskread, where the count is very cheaply available.

Gavin Halliday December 2, 2016 at 9:34 AM
Edited

It is caused by

count(x) > 250

where x is

dataset(IF-set, record);

However, that gets transformed to count(choosen(x,251)) > 250, which would be even harder to optimize.

Linking to https://hpccsystems.atlassian.net/browse/HPCC-15034#icft=HPCC-15034 which highlights that transformation causing problems in other situations.
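
To make the rewrite discussed in this comment concrete, the sketch below models both forms of the comparison on a plain row count. It is only an illustration under stated assumptions: the function names are hypothetical, `rowCount` stands in for the number of rows in x, and n is assumed to be smaller than the maximum 64-bit value so that n + 1 does not overflow.

```cpp
#include <algorithm>
#include <cstdint>

// Models count(x) > n, where rowCount is the number of rows in x.
inline bool countExceeds(std::uint64_t rowCount, std::uint64_t n)
{
    return rowCount > n;
}

// Models count(choosen(x, n + 1)) > n: choosen caps the number of rows
// at n + 1, so at most n + 1 rows ever need to be read.
inline bool limitedCountExceeds(std::uint64_t rowCount, std::uint64_t n)
{
    const std::uint64_t limited = std::min<std::uint64_t>(rowCount, n + 1);
    return limited > n;
}
```

Both functions return the same answer for every input, which is why the rewrite is safe. It only pays off when reading rows is the expensive part; if the full count is already available cheaply, the extra choosen adds work rather than saving it.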

Richard Chapman December 1, 2016 at 2:58 PM

a count(choosen()) perhaps?

Fixed

Created December 1, 2016 at 2:53 PM
Updated June 1, 2017 at 2:20 PM
Resolved January 3, 2017 at 9:28 AM
