HDDS-14990. Show failed volumes in ozone admin datanode list output and SCM metrics#10058
HDDS-14990. Show failed volumes in ozone admin datanode list output and SCM metrics#10058xichen01 wants to merge 4 commits intoapache:masterfrom
ozone admin datanode list output and SCM metrics#10058Conversation
adoroszlai
left a comment
There was a problem hiding this comment.
Thanks @xichen01 for the patch.
| Integer healthyVolumeCount = node.hasHealthyVolumeCount() ? node.getHealthyVolumeCount() : null; | ||
| BasicDatanodeInfo singleNodeInfo = new BasicDatanodeInfo.Builder( | ||
| DatanodeDetails.getFromProtoBuf(node.getNodeID()), node.getNodeOperationalStates(0), | ||
| node.getNodeStates(0)).withVolumeCounts(totalVolumeCount, healthyVolumeCount).build(); | ||
| node.getNodeStates(0)).withVolumeCounts(totalVolumeCount, healthyVolumeCount) | ||
| .withFailedVolumes(getFailedVolumes(node)).build(); |
There was a problem hiding this comment.
I'd like to suggest a small refactoring to reduce duplication of code related to BasicDatanodeInfo creation.
| } | ||
| } else { | ||
| pipelineListInfo.append("No pipelines in cluster."); | ||
| pipelineListInfo.append(System.getProperty("line.separator")); |
There was a problem hiding this comment.
Why is this needed? pipelineListInfo is output with println below.
There was a problem hiding this comment.
This is to maintain consistency with the pipelineListInfo in the previous branches, where pipelineListInfos always have line breaks.
There was a problem hiding this comment.
Thanks for the info. Then please change to:
pipelineListInfo.append("No pipelines in cluster.")
.append(System.getProperty("line.separator"));due to recently added PMD rule (HDDS-14934).
(BTW, I think using \n would be fine for now, it is already used in few places instead of System.getProperty("line.separator"), both in this file and in 122 others.)
| assertTrue(output.contains("Failed volume")); | ||
| assertTrue(output.contains("/data/disk2")); | ||
| assertTrue(output.contains("/data/disk1")); | ||
| assertTrue(output.contains("/data/disk5")); |
There was a problem hiding this comment.
nit: please use assertThat(output).contains(...) for better failure message (see HDDS-9951)
| @CommandLine.Option(names = {"--failed-volumes"}, | ||
| description = "Only show datanodes that have at least one failed volume.", |
There was a problem hiding this comment.
Would --with-failed-volume be better?
There was a problem hiding this comment.
how about --show-failed-volumes ? --with-failed-volume seems mean that "display along with bad disk information".
There was a problem hiding this comment.
--show-failed-volumes implies failed volumes are normally hidden, and included in the output only if this flag is used.
I think the command
ozone admin datanode list --with-failed-volume
reads better.
If we want to be extra clear, it can be --nodes-with-failed-volume.
There was a problem hiding this comment.
Update to --nodes-with-failed-volume
ozone admin datanode list show failed volumes and add metrics
ozone admin datanode list show failed volumes and add metricsozone admin datanode list output and SCM metrics
adoroszlai
left a comment
There was a problem hiding this comment.
Thanks @xichen01 for updating the patch. Let's wait to see if others have comments.
|
|
sreejasahithi
left a comment
There was a problem hiding this comment.
Thanks @xichen01 for working on this.
| @CommandLine.Option(names = {"--nodes-with-failed-volumes"}, | ||
| description = "Only show datanodes that have at least one failed volume.", | ||
| defaultValue = "false") | ||
| private boolean nodeWithFailedVolumes; | ||
|
|
There was a problem hiding this comment.
The --nodes-with-failed-volumes filter is silently ignored when --node-id is used.
We should make these options mutually exclusive to avoid confusion when a user provides both in the command but receives the result for the node whose ID was specified, regardless of whether it has failed volumes.
What changes were proposed in this pull request?
ozone admin datanode listcan dispaly "Total volume count" and "Healthy volume count", Datanode JMX can display the NumFailedVolumesAdd:
ozone admin datanode list --failed-volumesto disply failed disk Datanode onlyExample
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-14990
How was this patch tested?
new test