Suspect code when handling exception within receiveDaFsBuffer in dafilesrv client.

Description

Within receiveDaFsBuffer, if a DAFSERR_protocol_failure exception is raised, it calls flushDaFsSocket from within the exception handlers to try to clear the socket.

It would be safer to move the handling outside of the exception handlers.
flushDaFsSocket also seems a bit odd, sleeping for 1 second for every (max) 1k it reads.

We have seen the client get stuck in this code, seemingly looping indefinitely. We're not sure how much it was reading each time, but it had been running for ~24hrs.
It looks like it should break out if min_size is not read.

It should probably at least have a max timeout and close the socket.
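
To make the suggestion concrete, here is a rough sketch of a flush that is bounded by both a deadline and a byte limit and tells the caller when to give up and close the socket. This is not the dafilesrv code: ReadFn, DrainResult and drainBounded are hypothetical names invented for illustration, and the read callback stands in for whatever socket read the client actually uses.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical read callback: reads up to maxLen bytes into buf and returns the
// number of bytes read; 0 means nothing more was available (or the read timed out).
using ReadFn = std::function<std::size_t(char *buf, std::size_t maxLen)>;

struct DrainResult
{
    std::size_t bytesDiscarded = 0;
    bool closeSocket = false; // true when the caller should give up and close the socket
};

// Discard pending data with no sleep between reads, but stop once maxDuration
// has elapsed or maxBytes have been thrown away.
inline DrainResult drainBounded(const ReadFn &readSome,
                                std::chrono::milliseconds maxDuration,
                                std::size_t maxBytes,
                                std::size_t chunkSize = 1024)
{
    assert(chunkSize > 0);
    using Clock = std::chrono::steady_clock;
    const Clock::time_point deadline = Clock::now() + maxDuration;
    std::vector<char> scratch(chunkSize);
    DrainResult result;
    for (;;)
    {
        const std::size_t got = readSome(scratch.data(), scratch.size());
        if (got == 0)
            return result; // nothing left to discard
        result.bytesDiscarded += got;
        if (result.bytesDiscarded >= maxBytes || Clock::now() >= deadline)
        {
            result.closeSocket = true; // peer keeps sending: stop and close
            return result;
        }
    }
}
```

With something along these lines the caller would close the socket whenever closeSocket comes back true, instead of looping for as long as the peer keeps sending.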

Conclusion

None

Activity


Mark Kelly July 30, 2020 at 5:35 PM

A safe and small change is to eliminate the sleep before each read.
This way, 100+ MB sent by a rogue sending agent can be read quickly instead of taking > 24 hr.
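
For reference, a back-of-envelope check of the > 24 hr figure, assuming every read returns a full 1k and the read itself takes negligible time, so only the 1 second sleep per read counts. readsNeeded and flushSeconds are hypothetical helpers written for this estimate, not functions from the code base.

```cpp
#include <cassert>
#include <cstdint>

// Number of reads needed to consume totalBytes when each read returns at most
// maxBytesPerRead (rounded up).
inline std::uint64_t readsNeeded(std::uint64_t totalBytes, std::uint64_t maxBytesPerRead)
{
    assert(maxBytesPerRead > 0);
    return (totalBytes + maxBytesPerRead - 1) / maxBytesPerRead;
}

// Minimum time spent sleeping if the flush sleeps sleepSecondsPerRead before each read.
inline std::uint64_t flushSeconds(std::uint64_t totalBytes,
                                  std::uint64_t maxBytesPerRead,
                                  std::uint64_t sleepSecondsPerRead)
{
    return readsNeeded(totalBytes, maxBytesPerRead) * sleepSecondsPerRead;
}
```

For 100 MB (100 * 1024 * 1024 bytes) at 1024 bytes per read, this gives 102,400 reads and therefore 102,400 seconds of sleeping, roughly 28 hours. Reads that return less than 1k would only make it longer.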

Mark Kelly November 14, 2019 at 8:07 PM

Perhaps the solution is to simply flush without pause/delay so that if a huge msg was sent and it went down this code path the data is read in and thrown away quickly.

Mark Kelly November 14, 2019 at 7:41 PM
Edited

Perhaps this was an issue when Dafilesrv was configured for SSLFirst or SSLOnly,
and was thus resolved by https://hpccsystems.atlassian.net/browse/HPCC-22156?
I have not seen this in some cluster logs in the past 3+ months.

Mark Kelly November 14, 2019 at 7:30 PM

I agree flushDaFsSocket() seems heavy.

But SOCKREADTMS() should throw an IJSOCK_Exception of JSOCKERR_timeout_expired if it does not read at least 1 byte within 60 sec.

Jacob Cobbett-Smith May 9, 2019 at 5:01 PM

Can you take a look at this?

Fixed


Created May 9, 2019 at 3:43 PM
Updated August 7, 2020 at 8:51 AM
Resolved August 7, 2020 at 8:51 AM