Keyed Join in containerized deployments does not map parts to workers (and consequently performs no remote lookups)

Description

In k8s, keyed join currently accesses all index parts directly using local handlers, i.e. it does not farm requests out to other workers.
This occurs because, when mapping parts to workers, it currently sees all parts as residing on 'localhost' but the workers as residing on arbitrary k8s IPs, so it concludes that no parts map to any workers.
As a result, all parts are considered 'off cluster' and are dealt with directly by local handlers.
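
The failure mode can be illustrated with a small sketch. This is not the Thor code; the function name `map_parts_by_host` and the sample worker IPs are hypothetical, and the only assumption taken from the ticket is that parts report 'localhost' while workers have arbitrary k8s IPs.

```python
def map_parts_by_host(part_hosts, worker_hosts):
    """Map each part index to the worker whose host matches the part's host.

    Parts with no matching worker are returned separately as 'off cluster'.
    Hypothetical helper for illustration only.
    """
    host_to_worker = {host: w for w, host in enumerate(worker_hosts)}
    mapped, off_cluster = {}, []
    for part, host in enumerate(part_hosts):
        if host in host_to_worker:
            mapped[part] = host_to_worker[host]
        else:
            off_cluster.append(part)
    return mapped, off_cluster


# In k8s every part appears to live on 'localhost' (per the ticket),
# while the workers sit on arbitrary pod IPs (made-up values here).
part_hosts = ["localhost"] * 4
worker_hosts = ["10.0.1.5", "10.0.2.7"]
mapped, off_cluster = map_parts_by_host(part_hosts, worker_hosts)
```

Under these assumptions no host ever matches, so `mapped` stays empty and every part lands in `off_cluster`, which leaves nothing to route to a remote worker.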

Options that force remote keyed lookups (e.g. 'forceRemoteKeyedLookup') have no effect, because there is no known mapping from parts to workers.

The fix is to treat the k8s case differently when mapping parts to workers, as if any part can map to any worker.
With that being true, the parts will be round-robined onto workers, in effect partitioning them amongst the workers.
Each worker will then have a proportion of the index parts to handle with local handlers, with the rest being farmed out to remote handlers.
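
A minimal sketch of that round-robin partitioning follows. It assumes a simple `part % num_workers` assignment; the actual distribution used by the fix is not stated in this ticket, and the names `assign_parts_round_robin` and `split_local_remote` are hypothetical.

```python
def assign_parts_round_robin(num_parts, num_workers):
    """Assign each index part to a worker in round-robin order.

    Assumes any part may map to any worker (the k8s case in this ticket).
    """
    return {part: part % num_workers for part in range(num_parts)}


def split_local_remote(assignment, my_worker):
    """Split parts into those this worker handles locally and those it
    must request from other workers via remote handlers."""
    local = [p for p, w in assignment.items() if w == my_worker]
    remote = [p for p, w in assignment.items() if w != my_worker]
    return local, remote


# Example: 5 index parts spread over 2 workers, seen from worker 0.
assignment = assign_parts_round_robin(5, 2)
local, remote = split_local_remote(assignment, 0)
```

With 5 parts and 2 workers, worker 0 handles parts 0, 2 and 4 locally and farms lookups for parts 1 and 3 out to worker 1.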

Conclusion

None

Attachments

4

Activity

Jacob Cobbett-Smith November 29, 2023 at 2:00 PM

- if you could update this ticket with results of any keyed join tests once you are running on >=9.4.16. Thanks.

Jacob Cobbett-Smith November 28, 2023 at 1:32 PM

- when us-linkinghpcc-dev is upgraded to 9.4.16, could you retest?

Jacob Cobbett-Smith November 28, 2023 at 1:29 PM
Edited

With this fix deployed, I performed some very basic testing

Aleida Lima November 16, 2023 at 5:27 PM

I ran the same workunit in Thor Cloud (https://eclwatch-hpcc.us-linkinghpcc-dev.azure.lnrsg.io/) and AlphaDev (https://alpha_dev_thor_esp.risk.regn.net:18010).

 

Testing InstantID    Timing     Workunit            ZAP
With option          16 mins    W20231116-150802    IID_forceRemoteKeyedLookup
Without option       18 mins    W20231116-152947    IID_noforceRemoteKeyedLookup
AlphaDev             4 mins     W20231116-103139    IID_onPrem

Fixed


Created November 15, 2023 at 2:40 PM
Updated November 29, 2023 at 2:00 PM
Resolved November 27, 2023 at 2:03 PM