Q: Any way to reset/clear SMART attributes (i.e. 199 UDMA_CRC_Error_Count) #172
Hi @stevecs, Thanks for the question! Since SMART attributes are obsolete and being replaced with Device Statistics, I checked, and there is a feature that can be used to reset some statistics. The SATA Phy event counters log also has something like this. The phy event counters log supports the CRC counter, and device statistics has both a CRC counter and command timeouts (called "Number of Resets Between Command Acceptance and Command Completion"). I will test a few different products for these features and update this issue as I find out more.
@vonericsen Thanks for taking a look; I will be interested in what you find. Yes, I've been seeing the 'slow demise' of SMART attributes over the years (not to mention that they were never really standardized or enforced), but they did at least provide a lot of data that was very useful (and I have always wanted similar details in SAS/SCSI/FC devices over the last ~40 years). I was not aware of "Device Statistics" for SATA drives (to be fair, I only have a couple hundred SATA drives; most are SAS/FC or NVMe), so that's interesting. I would be interested if you could point to any URLs for specs or standards on that for "bedtime reading".
Yeah, there are multiple reasons for this, some dating back to when SMART was released in ATA-3. Seagate's firmware group has not given any timeline in which SMART attributes will be removed, but the device statistics log has been supported for quite a while now.
ACS-3 was the first spec to define the majority of the device statistics log on SATA and it is very similar to the standardized outputs from SAS/FC log pages. The SAT specs even translate many of these statistics to these log pages today as well. We support showing that page with One other part of device statistics added to the standard is |
Refactored the code to simplify it and start reading which statistics support reinitialization. Will need to identify a product that supports statistic reinitialization for full testing. [Seagate/openSeaChest#172] Signed-off-by: Tyler Erickson <[email protected]>
Adding function to set the date and time timestamp and updating how it is displayed to convert it to a more human readable timestamp. Adding a function to reset/reinitialize supported device statistics. Cleaned up more about how a statistic is looked up from its page and offset to be more readable and maintainable. [Seagate/openSeaChest#172] Signed-off-by: Tyler Erickson <[email protected]>
Adding support for issuing the read log ext with the feature field set to 1 to trigger a reset of the phy event counters. [Seagate/openSeaChest#172] Signed-off-by: Tyler Erickson <[email protected]>
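For readers following along, here is a rough sketch of what "read log ext with the feature field set to 1" looks like at the register level. The opcode (2Fh) and the SATA Phy Event Counters log address (11h) come from the ATA/SATA specifications; the `ExampleTaskFile` struct and `build_phy_event_counter_read` function are hypothetical names made up for illustration and are not openSeaChest's actual types or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values defined by the ATA/SATA specifications. */
#define ATA_CMD_READ_LOG_EXT           0x2Fu
#define ATA_LOG_SATA_PHY_EVENT_COUNTERS 0x11u

/* Hypothetical register layout for illustration only;
 * this is NOT an openSeaChest structure. */
typedef struct {
    uint16_t feature; /* log-specific; bit 0 requests a counter reset for log 11h */
    uint16_t count;   /* number of 512-byte log pages to transfer */
    uint64_t lba;     /* LBA bits 7:0 carry the log address */
    uint8_t  command; /* ATA opcode */
} ExampleTaskFile;

/* Build the registers for reading the phy event counters log.
 * When reset_after_read is true, feature bit 0 is set so the device
 * returns the current counters and then resets them. */
static ExampleTaskFile build_phy_event_counter_read(bool reset_after_read)
{
    ExampleTaskFile tf = {0};
    tf.command = ATA_CMD_READ_LOG_EXT;
    tf.feature = reset_after_read ? 0x0001u : 0x0000u;
    tf.count   = 1u;                              /* the log is a single page */
    tf.lba     = ATA_LOG_SATA_PHY_EVENT_COUNTERS; /* page number 0 */
    return tf;
}
```

The same read command is used in both cases; only the feature field differs, which is why a reset always returns the pre-reset counter values as a side effect.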
… options Adding options to set the device timestamp as well as new options to issue the command to reset the SATA phy event counters and supported device statistics that support resetting. [#172] Signed-off-by: Tyler Erickson <[email protected]>
I have created the branch feature/SATA_Dev_Stats_and_Phy_Counters_Refresh and have added the initial code to support issuing the commands that reset supported device statistics and the SATA phy event counters log. I am still working on testing, but feel free to pull this and test it out in the meantime. |
…e stats Pulling in the library bug fix from populating device statistics. Also added in additional Seagate device erase statistics that were already in openSeaChest_Info [#172] Signed-off-by: Tyler Erickson <[email protected]>
Adding CDL device statistics on SATA drives. This code supports concurrent ranges 0-3 per ACS-6 statistics. This also handles the difference between whole device policies and concurrent range policies depending on what the drive populates when it is read. All these statistics are defined in the spec as supporting the read then initialize feature as well according to the standards. [Seagate/openSeaChest#172] Signed-off-by: Tyler Erickson <[email protected]>
[#172] Signed-off-by: Tyler Erickson <[email protected]>
Hello, I have cloned the /SATA_Dev_Stats_and_Phy_Counters_Refresh branch and ran OpenSeaChest_Info --help, but I did not find any instructions regarding the reset function for SATA Phy event counters. Does this mean that the reset functionality is currently unavailable? Thank you for your assistance! |
Hi @LAN007w, The option should be there. It would be under the "SATA Only" section of the help. The option is Also, just to verify, you have cloned the feature/SATA_Dev_Stats_and_Phy_Counters_Refresh branch for openSeaChest and opensea-operations? (If you did a recursive clone of openSeaChest on this branch, it would have pulled the operations branch as well) |
Hi @vonericsen , Thank you very much for your suggestions! I followed your advice and re-cloned the /SATA_Dev_Stats_and_Phy_Counters_Refresh branch. After running --resetATAPhyEvents, I received the response: “Successfully reinitialized SATA Phy event counter log.” Additionally, after running the --resetDevStats transport, I received: “Successfully reinitialized Device Statistics. NOTE: Only statistics marked with the read then initialize supported bit were reinitialized.” However, when I tried to view the results using --deviceStatistics, I encountered an error with code 22. To further investigate, I switched to openSeaChest-v24.08.1 and ran the check again. Unfortunately, I found that the "Number Of Interface CRC Errors" still shows as 14. Any insights or suggestions you may have would be greatly appreciated! Thank you again for your help. |
Hi @LAN007w, The device statistics log command to reset statistics can only reset the ones that are marked as reset capable. Resetting the phy event counters log will only reset that page.
Can you share the full output you received when this error happened?
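To make "marked as reset capable" concrete, below is a minimal sketch of checking the flag bits on a single device statistic qword. The bit positions are an assumption based on my reading of the ACS-4 and later device statistics flags (bit 63 supported, bit 62 valid, bit 58 read then initialize supported), so please verify them against the spec. The macro and function names are hypothetical and are not taken from opensea-operations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ASSUMED flag positions in the top byte of each statistic qword;
 * check these against the ACS spec before relying on them. */
#define STAT_SUPPORTED           (UINT64_C(1) << 63)
#define STAT_VALID               (UINT64_C(1) << 62)
#define STAT_READ_THEN_INIT_SUPP (UINT64_C(1) << 58)

/* The low 7 bytes hold the value; individual statistics may use fewer. */
#define STAT_VALUE_MASK          UINT64_C(0x00FFFFFFFFFFFFFF)

/* True only when the drive both supports the statistic and marks it
 * as one that the reinitialize request is allowed to clear. */
static bool stat_is_resettable(uint64_t qword)
{
    return (qword & STAT_SUPPORTED) != 0 &&
           (qword & STAT_READ_THEN_INIT_SUPP) != 0;
}

/* Strip the flag byte and return the raw statistic value. */
static uint64_t stat_value(uint64_t qword)
{
    return qword & STAT_VALUE_MASK;
}
```

Under these assumptions, a statistic that is supported and valid but lacks the read then initialize bit keeps its value after a reset request, even though the command itself reports success.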
D:\002\openSeaChest\Make\VS.2019\x64\Debug>OpenSeaChest_Info -d PD0 --deviceStatistics
openSeaChest_Info - openSeaChest drive utilities - NVMe Enabled
Thanks for the information! |
The date and time timestamp statistic can report either the most recent value or the power on hours in milliseconds. We need a little more work beyond this commit, but this will workaround the crash/failure for now. [Seagate/openSeaChest#172] Signed-off-by: Tyler Erickson <[email protected]>
If you pull the feature branch on the opensea-operations submodule, I have made a workaround for now that should get past this spot in device statistics. You can do this by going into the subproject |
Hi @vonericsen, Thank you so much for your detailed explanation and for providing a temporary workaround. I really appreciate your efforts in addressing the issue, and it’s great to hear that further updates are being worked on. I’ll follow your suggestion to check out the develop branch and pull the latest changes. Thanks again for your help and for keeping the community updated! Looking forward to the upcoming fixes. Best regards, |
No Problem! One more thing as I am working on changes, can you share the verbose output? This will print out all the drive's raw data responses from reading the log. |
…nd time timestamp Fixing a bug in the set date and time timestamp command for ATA drives. It was checking the wrong byte when validating if the timestamp is supported. When printing the date and time timestamp it is possible for it to be representing the power on time in milliseconds, so this also adds support for that. [Seagate/openSeaChest#172] Signed-off-by: Tyler Erickson <[email protected]>
Fixing a conversion error when converting from milliseconds since the Unix Epoch (Jan 1, 1970) to struct tm. The year was initialized wrong, then when converting to tm_year it was not adjusting based on the unix epoch and leading to an invalid/out of range year during the conversion. [Seagate/openSeaChest#172] Signed-off-by: Tyler Erickson <[email protected]>
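As an illustration of the pitfall described in this commit message, here is a small sketch that converts milliseconds since the Unix Epoch to a `struct tm`. It uses the standard library's `gmtime` rather than the project's own conversion routine, and the helper names `epoch_ms_to_tm` and `tm_calendar_year` are hypothetical. The key detail is that `tm_year` counts years since 1900, so the epoch year 1970 is stored as 70.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Convert milliseconds since Jan 1, 1970 (UTC) into broken-down UTC time.
 * Returns false if out is NULL or the C library cannot convert the value.
 * Note: gmtime uses a shared static buffer and is not thread safe. */
static bool epoch_ms_to_tm(uint64_t ms_since_epoch, struct tm *out)
{
    if (out == NULL)
    {
        return false;
    }
    time_t seconds = (time_t)(ms_since_epoch / 1000u); /* drop the milliseconds */
    struct tm *utc = gmtime(&seconds);
    if (utc == NULL)
    {
        return false;
    }
    *out = *utc; /* copy out of the static buffer */
    return true;
}

/* tm_year is an offset from 1900, not from the Unix Epoch year. */
static int tm_calendar_year(const struct tm *t)
{
    return t->tm_year + 1900;
}
```

A hand-written conversion has to apply the same 1900 offset when filling `tm_year`; storing the full calendar year there, or an offset from 1970, gives an out-of-range year like the one this commit fixes.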
…nch changes Pulling in fixes that were made in the develop branch of various libraries as well. This will fix other bugs we've also run into due to bounds checking on the develop branch. [#172] Signed-off-by: Tyler Erickson <[email protected]>
Thanks! I've merged the fixes from the develop branch into here as well which fixed a few bugs with The idea behind the SATA Phy event counters log and its CRC counter is that it will track this as errors happen. Then it can be reset to zero, tests can be run, then checked again. |
More of a question, so if this is the wrong forum let me know. I have been looking to see if there is a way to clear/reset some SMART attributes, or how to go about it. I am, in particular, looking at 199 UDMA_CRC_Error_Count, as I have a good number of drives that have values there due to misbehaving backplanes or bad cables/HBAs in the past.
Yes I can track each variable to see if it increases but that gets harder to see/monitor with hundreds of drives. The ability to reset/clear that to zero would be very useful. Likewise other values like 188 Command Timeout.
I know that these can be cleared by the OEM on refurbished drives, as well as I've seen some instances where they can be cleared with certain firmware updates. But have not found any means so far to clear them for general/advanced users.
The vast majority of our rotating rust drives are Seagate, if that matters (ST4000's through ST20000's), in case it's an OEM-specific type of command.