Keyed Join in containerized deployments does not map parts to workers (consequently performs no remote lookups)
Description
In k8s, keyedjoin currently accesses all index parts directly using local handlers, i.e. it does not farm out requests to other workers. This occurs because, when mapping parts to workers, it sees all parts as residing on 'localhost' but the workers as residing on arbitrary k8s IPs, so it concludes that no parts map to any worker. As a result, all parts are considered 'off cluster' and are dealt with directly by local handlers.
Options to force remote keyed lookups (e.g. 'forceRemoteKeyedLookup') do not help, because there is no known mapping from parts to workers.
The fix is to treat the k8s case differently when mapping parts to workers, as if every part could map to any worker. With that being true, the parts are round-robined onto workers, in effect partitioning them amongst the workers. Each worker then has a proportion of the index parts to handle with local handlers, with the rest being farmed out to remote handlers.
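To make the difference concrete, here is a minimal Python sketch of the two mapping strategies described above. It is an illustrative model only, not the platform's actual code: the function names (`map_parts_by_host`, `map_parts_round_robin`), the part count and the worker IPs are all assumptions made up for the example.

```python
from collections import defaultdict


def map_parts_by_host(part_hosts, worker_hosts):
    """Hypothetical host-matching strategy: a part maps to a worker only
    when the part's host equals that worker's host."""
    worker_by_host = {host: idx for idx, host in enumerate(worker_hosts)}
    mapping = defaultdict(list)
    off_cluster = []
    for part, host in enumerate(part_hosts):
        worker = worker_by_host.get(host)
        if worker is None:
            # No worker lives on this host, so the part is 'off cluster'
            # and would be read directly by a local handler.
            off_cluster.append(part)
        else:
            mapping[worker].append(part)
    return dict(mapping), off_cluster


def map_parts_round_robin(num_parts, num_workers):
    """Hypothetical k8s strategy: treat every part as mappable to any
    worker and deal the parts out round-robin."""
    mapping = defaultdict(list)
    for part in range(num_parts):
        mapping[part % num_workers].append(part)
    return dict(mapping)


# Assumed example data: 6 index parts all reported as 'localhost',
# 3 workers on arbitrary (made-up) pod IPs.
part_hosts = ["localhost"] * 6
worker_hosts = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]

by_host, off_cluster = map_parts_by_host(part_hosts, worker_hosts)
print(by_host)      # {}
print(off_cluster)  # [0, 1, 2, 3, 4, 5]

round_robin = map_parts_round_robin(len(part_hosts), len(worker_hosts))
print(round_robin)  # {0: [0, 3], 1: [1, 4], 2: [2, 5]}
```

In this model, host matching leaves every part 'off cluster', which is why an option that forces remote lookups has nothing to route to. Round-robin gives each worker a distinct subset to serve locally, and any other part would be requested from the worker that owns it.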
Conclusion
None
Attachments
4
Activity
Jacob Cobbett-Smith November 29, 2023 at 2:00 PM
- if you could update this ticket with results of any keyed join tests once you are running on >=9.4.16. Thanks.
Jacob Cobbett-Smith November 28, 2023 at 1:32 PM
- when us-linkinghpcc-dev is upgraded to 9.4.16, could you retest?
Jacob Cobbett-Smith November 28, 2023 at 1:29 PM
Edited
With this fix deployed, I performed some very basic testing