Starting the Thor master fails because frunssh treats an ssh warning as an error

Environment

Amazon EC2, Ubuntu 14.04. I haven't tested in other environments, but the failure is very consistent on Amazon Cloud. I tried both HPCC Instance Cloud and the HPCC juju charm. The code runs OK in HPCC 5.2.6-1 but not in 5.4.2-1.

Description

The error from the frunssh log:
ssh result(0):
ERR: Warning: Permanently added '10.197.82.122' (ECDSA) to the list of known hosts.
ERROR: /var/lib/jenkins/workspace/CE-Candidate-withplugins-5.4.2-1/CE/ubuntu-14.04-amd64/HPCC-Platform/services/runagent/frunssh.cpp(73) : frunssh : ERR: Warning: Permanently added '10.197.82.122' (ECDSA) to the list of known hosts.
ssh result(0):

This should be just a warning, but exec() in common/remote/rmtssh.cpp throws an exception.
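
To illustrate the behaviour the description asks for, here is a minimal shell sketch that judges a command by its exit status and only logs stderr as a warning. It is not the HPCC code: `run_checked` and `fake_ssh` are hypothetical names invented for this example, and `fake_ssh` merely stands in for an ssh call that prints the known-hosts warning while still succeeding.

```shell
#!/bin/sh
# Hypothetical helper, not part of HPCC: run a command, keep its stderr for
# logging, and decide success or failure from the exit status alone.
run_checked() {
    err_file=$(mktemp) || return 1
    rc=0
    "$@" 2>"$err_file" || rc=$?
    if [ -s "$err_file" ]; then
        # Report stderr as a warning without changing the result.
        sed 's/^/WARN: /' "$err_file" >&2
    fi
    rm -f "$err_file"
    return "$rc"
}

# Stand-in for an ssh call that emits a known-hosts warning but succeeds.
fake_ssh() {
    echo "Warning: Permanently added '10.197.82.122' (ECDSA) to the list of known hosts." >&2
    return 0
}
```

With this approach, `run_checked fake_ssh` returns 0 even though stderr is not empty, while `run_checked false` still returns a nonzero status.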

Conclusion

None

Activity

Jacob Cobbett-Smith March 2, 2016 at 4:05 PM

- Definite candidate for the RedBook. I actually thought it had already been discussed or added there.

Michael Gardner October 19, 2015 at 8:04 PM

The issue here was that init_thor was making a call (via frunssh) to init_thorslave. Within the init_thorslave start function there is an rsync call that runs over ssh. That specific call was not properly squelching warnings and banners. As a result, the CFRunSSH code appended the stderr output from the rsync and threw an exception, which caused frunssh to fail with a nonzero return code. When this happens, we kill off all the slaves and the master and exit, because we assume an error occurred.
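
As a rough sketch of what squelching warnings and banners on an rsync-over-ssh call can look like: the options below are standard OpenSSH and rsync options, but the exact flags used in init_thorslave are not shown in this ticket, so treat the choice as an assumption. `quiet_ssh` and `quiet_rsync` are hypothetical helper names.

```shell
#!/bin/sh
# Illustrative only: the real init_thorslave invocation is not shown here.

# Print an ssh command line whose log level is raised to ERROR, which
# typically suppresses messages such as
# "Warning: Permanently added ... to the list of known hosts" and banners.
quiet_ssh() {
    echo "ssh -o LogLevel=ERROR -o BatchMode=yes -o StrictHostKeyChecking=no"
}

# Hypothetical wrapper: copy a path from a remote host over the quiet transport.
# Arguments: <remote_host> <remote_path> <local_path>
quiet_rsync() {
    rsync -e "$(quiet_ssh)" -a "$1:$2" "$3"
}
```

Real ssh errors are still written to stderr at this log level, so a caller that inspects stderr continues to see genuine failures.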

The following ticket was used to clean up all ssh calls in 6.0, because I had a feeling something like this might happen. This is also why I never ran into this regression: I'm almost always working on 6.0 issues. https://hpccsystems.atlassian.net/browse/HPCC-14132

Fixed

Created October 19, 2015 at 12:29 AM
Updated March 2, 2016 at 4:31 PM
Resolved October 20, 2015 at 7:50 AM