Filesystem Corruption Is Not Disk Failure: Evidence-Based Triage and Capacity Recovery Using Block-Level Diagnostics and Remapping
Keywords:
filesystem integrity, fsck, ext4, bad blocks, block device health, badblocks, device-mapper, linear mapping, capacity recovery, operational triage, CAPEX
Abstract
Operational storage teams sometimes treat filesystem-check results (for example, fsck reports) as proof of physical disk failure, leading to premature retirement of storage devices and avoidable replacement costs. This article clarifies the critical distinction between filesystem integrity failures
References
Andy Chou et al., "An empirical study of operating systems errors," ACM Digital Library, 2001. [Online]. Available: https://dl.acm.org/doi/10.1145/502034.502042
Bianca Schroeder and Garth A. Gibson, "Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you?" in FAST '07: 5th USENIX Conference on File and Storage Technologies, San Jose, CA, 2007. [Online]. Available: https://www.cs.toronto.edu/~bianca/papers/fast07.pdf


