Additional "best practices" recommendations for bare metal configs

Description

A recent issue with a bare metal configuration prompted me to look a little more closely at the configuration recommendations in the "best practices" section of the Administrator's Guide. The specific situation I ran into was not covered, and it may be that additional, similar situations should be addressed as well.

The specific issue involved Dali's metadata store being located on the same disk partition as the cluster data built and managed by Thor. If a condition such as "out of disk space" occurs on that shared partition, there is a chance that Dali's metadata will become corrupted. The obvious recommendation is to not place Dali's metadata on the same disk partition as highly variable cluster data.
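As a rough illustration of how an operator might detect this condition, here is a minimal Python sketch. It assumes that comparing the device IDs reported by `os.stat` is an adequate test for "same partition" (this may not hold for bind mounts or some filesystems with subvolumes). The function names, the 10% free-space threshold, and the directory arguments are hypothetical, chosen for illustration only; they are not taken from HPCC Systems tooling or documentation.

```python
import os
import shutil
from typing import List


def same_filesystem(path_a: str, path_b: str) -> bool:
    """Return True if both paths report the same device ID (st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def free_fraction(path: str) -> float:
    """Return the free fraction (0.0 to 1.0) of the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def check_dali_placement(dali_dir: str, data_dir: str,
                         min_free: float = 0.10) -> List[str]:
    """Collect warnings about where the Dali metadata directory lives.

    dali_dir and data_dir are supplied by the caller; the 10% default
    threshold is an arbitrary illustrative value, not a recommendation.
    """
    warnings = []
    if same_filesystem(dali_dir, data_dir):
        warnings.append(
            f"{dali_dir} shares a filesystem with {data_dir}; "
            "heavy data writes could exhaust space needed by Dali"
        )
    if free_fraction(dali_dir) < min_free:
        warnings.append(
            f"filesystem holding {dali_dir} has less than "
            f"{min_free:.0%} free space"
        )
    return warnings
```

A check like this could be run with the actual Dali and Thor data directories of a given installation, for example from a periodic monitoring job, and any returned warnings forwarded to whoever maintains the cluster.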

From an email thread with Jake:

Right, agree it could be more explicit about other reasons why Dali should be given enough resources (including disk) to perform reliably, and the impact of it not being able to do so, e.g. if it is low on network or CPU throughput, or if it runs out of disk space because it's using a shared disk with heavy disk-consuming processes (not necessarily just HPCC components).

This feels like something that our folks in ops would have good input on, i.e. this and other best practices.

This issue is therefore a request to expand our Best Practices along the lines outlined above, and perhaps in other directions that our ops folks can suggest.

Conclusion

None

Activity

Jim DeFabia February 16, 2023 at 2:28 PM

Sys admin manual

Fixed
Details

Roadmap

Not applicable

Created February 16, 2023 at 1:09 PM
Updated February 27, 2023 at 4:46 PM
Resolved February 27, 2023 at 4:46 PM