Hard disk DRDY error: is it a crash?

5

I am using an IBM ThinkPad (1.7 GHz, 512 MB RAM) with Linux Mint 9 installed. I have two partitions in addition to root.

One of the partitions became read-only yesterday, after which I rebooted the system. It is now extremely slow and reports a DRDY error: has my hard disk crashed? Error log during boot:

Differences between boot sector and its backup.
failed command : READ DMA
BMDMA : stat 0X25
ata 1.00 : status : { DRDY ERR }
ata 1.00 : status :{ UNC }
Buffer I/O error on logical device, logical block 65467

smartctl output for the partition:

mint mint # smartctl -a /dev/sda1
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     TOSHIBA MK4026GAX RoHS
Serial Number:    X5LY1623T
Firmware Version: PA107E
User Capacity:    40,007,761,920 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   6
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Thu Feb 17 06:48:25 2011 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)    Offline data collection activity
                    was suspended by an interrupting command from host.
                    Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:          ( 153) seconds.
Offline data collection
capabilities:              (0x1b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    No Conveyance Self-test supported.
                    No Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    No General Purpose Logging support.
Short self-test routine 
recommended polling time:      (   2) minutes.
Extended self-test routine
recommended polling time:      (  30) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   100   100   050    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0027   100   100   001    Pre-fail  Always       -       310
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       3968
  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       40
  7 Seek_Error_Rate         0x000b   100   100   050    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   050    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   082   082   000    Old_age   Always       -       7257
 10 Spin_Retry_Count        0x0033   179   100   030    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       3484
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       489
193 Load_Cycle_Count        0x0032   064   064   000    Old_age   Always       -       367150
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       36 (Lifetime Min/Max 14/57)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       33
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       82
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       1
199 UDMA_CRC_Error_Count    0x0032   200   253   000    Old_age   Always       -       0
220 Disk_Shift              0x0002   100   100   000    Old_age   Always       -       101
222 Loaded_Hours            0x0032   085   085   000    Old_age   Always       -       6146
223 Load_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
224 Load_Friction           0x0022   100   100   000    Old_age   Always       -       0
226 Load-in_Time            0x0026   100   100   000    Old_age   Always       -       227
240 Head_Flying_Hours       0x0001   100   100   001    Pre-fail  Offline      -       0

SMART Error Log Version: 1
ATA Error Count: 2371 (device log contains only the most recent five errors)
    CR = Command Register [HEX]
    FR = Features Register [HEX]
    SC = Sector Count Register [HEX]
    SN = Sector Number Register [HEX]
    CL = Cylinder Low Register [HEX]
    CH = Cylinder High Register [HEX]
    DH = Device/Head Register [HEX]
    DC = Device Command Register [HEX]
    ER = Error register [HEX]
    ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 2371 occurred at disk power-on lifetime: 7256 hours (302 days + 8 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 05 1a 1b 00 e0  Error: UNC 5 sectors at LBA = 0x00001b1a = 6938

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 05 1a 1b 00 e0 00      00:03:10.061  READ DMA
  f8 00 00 00 00 00 e0 00      00:03:10.061  READ NATIVE MAX ADDRESS
  ec 00 00 00 00 00 a0 02      00:03:10.053  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 02      00:03:10.053  SET FEATURES [Set transfer mode]
  f8 00 00 00 00 00 e0 00      00:03:10.053  READ NATIVE MAX ADDRESS

Error 2370 occurred at disk power-on lifetime: 7256 hours (302 days + 8 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 05 1a 1b 00 e0  Error: UNC 5 sectors at LBA = 0x00001b1a = 6938

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 05 1a 1b 00 e0 00      00:03:03.328  READ DMA
  f8 00 00 00 00 00 e0 00      00:03:03.327  READ NATIVE MAX ADDRESS
  ec 00 00 00 00 00 a0 02      00:03:03.320  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 02      00:03:03.319  SET FEATURES [Set transfer mode]
  f8 00 00 00 00 00 e0 00      00:03:03.319  READ NATIVE MAX ADDRESS

Error 2369 occurred at disk power-on lifetime: 7256 hours (302 days + 8 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 05 1a 1b 00 e0  Error: UNC 5 sectors at LBA = 0x00001b1a = 6938

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 05 1a 1b 00 e0 00      00:02:56.582  READ DMA
  f8 00 00 00 00 00 e0 00      00:02:56.582  READ NATIVE MAX ADDRESS
  ec 00 00 00 00 00 a0 02      00:02:56.574  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 02      00:02:56.574  SET FEATURES [Set transfer mode]
  f8 00 00 00 00 00 e0 00      00:02:56.574  READ NATIVE MAX ADDRESS

Error 2368 occurred at disk power-on lifetime: 7256 hours (302 days + 8 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 05 1a 1b 00 e0  Error: UNC 5 sectors at LBA = 0x00001b1a = 6938

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 05 1a 1b 00 e0 00      00:02:49.809  READ DMA
  f8 00 00 00 00 00 e0 00      00:02:49.809  READ NATIVE MAX ADDRESS
  ec 00 00 00 00 00 a0 02      00:02:49.801  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 02      00:02:49.801  SET FEATURES [Set transfer mode]
  f8 00 00 00 00 00 e0 00      00:02:49.801  READ NATIVE MAX ADDRESS

Error 2367 occurred at disk power-on lifetime: 7256 hours (302 days + 8 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 05 1a 1b 00 e0  Error: UNC 5 sectors at LBA = 0x00001b1a = 6938

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 05 1a 1b 00 e0 00      00:02:43.056  READ DMA
  f8 00 00 00 00 00 e0 00      00:02:43.056  READ NATIVE MAX ADDRESS
  ec 00 00 00 00 00 a0 02      00:02:43.048  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 02      00:02:43.048  SET FEATURES [Set transfer mode]
  f8 00 00 00 00 00 e0 00      00:02:43.047  READ NATIVE MAX ADDRESS

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


Device does not support Selective Self Tests/Logging

Should I get a new hard disk for my PC?

pranjal
In addition to the "ata 1.00: status" line, there should be a line reading "ata 1.00: error ..." that tells you the exact error.
Sleske

Answers:

2

The SMART error log contains useful information:

Error: UNC 5 sectors at LBA = 0x00001b1a = 6938

UNC means an uncorrectable error. The last command was a READ DMA, so it is a read error. It appears that sectors 6938 to 6942 (the 5 sectors starting at LBA 6938) are not readable.
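To show where the LBA in that log line comes from, here is a minimal Python sketch that rebuilds it from the register dump `40 51 05 1a 1b 00 e0`. It assumes 28-bit LBA addressing, where the low nibble of DH holds the top four address bits. The function name `decode_lba28` is only illustrative.

```python
def decode_lba28(sn: int, cl: int, ch: int, dh: int) -> int:
    """Combine the ATA task-file registers into a 28-bit LBA."""
    # DH carries LBA bits 24-27 in its low nibble; CH, CL and SN hold the rest.
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Values copied from the "ER ST SC SN CL CH DH" line of the error log.
sector_count, sn, cl, ch, dh = 0x05, 0x1A, 0x1B, 0x00, 0xE0

first_lba = decode_lba28(sn, cl, ch, dh)
last_lba = first_lba + sector_count - 1  # the request spans sector_count sectors

print(f"failed read covered LBA {first_lba} to {last_lba}")
```

The drive only reports that the request failed, so at least one sector in that range is unreadable, not necessarily all five.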

In addition, the SMART attributes show 40 sectors that were successfully reallocated, 82 sectors pending reallocation, and 1 uncorrectable sector (probably the one in the log):

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       40
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       82
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       1
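As a rough illustration, the same check can be scripted. The sketch below parses attribute rows in the column layout shown above and reports the sector-related counters with a non-zero raw value. The helper name `nonzero_sector_counts` and the choice of IDs 5, 197 and 198 are assumptions made for this example, not part of smartctl.

```python
SMART_LINES = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       40
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       82
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       1
"""

# Assumption: these three IDs are the ones that count damaged sectors.
SECTOR_HEALTH_IDS = {5, 197, 198}


def nonzero_sector_counts(text: str) -> dict:
    """Return {attribute name: raw value} for sector counters above zero."""
    found = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a 10-column attribute row.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id, name, raw = int(fields[0]), fields[1], fields[9]
        if attr_id in SECTOR_HEALTH_IDS and raw.isdigit() and int(raw) > 0:
            found[name] = int(raw)
    return found


warnings = nonzero_sector_counts(SMART_LINES)
for name, count in warnings.items():
    print(f"{name}: {count}")
```

Note that the normalized VALUE column still reads 100 for all three rows, which is why the overall self-assessment says PASSED even though the raw counts are already worrying.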

Everything indicates that the drive is failing, so back up your data immediately. If you cannot copy the data because of the errors, use ddrescue to create an image of the partition while skipping the bad blocks. This tutorial is very useful.
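A common way to run GNU ddrescue is in two passes: first copy everything that reads cleanly, then go back and retry the bad areas. The Python sketch below only builds and prints the two command lines and does not execute anything. The image and mapfile names are hypothetical, and the options (`-n`, `-d`, `-r3`) are taken from the GNU ddrescue manual, so check them against `ddrescue --help` on your system. The image must be written to a different, healthy disk.

```python
import shlex


def ddrescue_passes(source: str, image: str, mapfile: str) -> list:
    """Build argv lists for a two-pass GNU ddrescue run (nothing is executed)."""
    # Pass 1: -n skips the slow scraping phase, so good areas are saved quickly.
    first = ["ddrescue", "-n", source, image, mapfile]
    # Pass 2: -d uses direct disc access, -r3 retries the bad areas three times.
    second = ["ddrescue", "-d", "-r3", source, image, mapfile]
    return [first, second]


# Hypothetical paths: adjust the partition and put the outputs on another disk.
commands = ddrescue_passes("/dev/sda1", "sda1.img", "sda1.map")
for argv in commands:
    print(shlex.join(argv))
```

Reusing the same mapfile in both passes is what lets the second run resume and touch only the areas that failed the first time.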

Aleix Mercader
-2

You should treat DRDY ERR messages as a fatal hardware failure.

Texx
1
I think 'DRDY' is the part of the error message that is not so bad: ata.wiki.kernel.org/index.php/Libata_error_messages
Daniel Beck
3
DRDY is only the device-ready flag. ERR, and especially UNC (uncorrectable), are the problem.
Aleix Mercader
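To make the point about the flags concrete, here is a small Python sketch that decodes the ST (status) and ER (error) bytes from the SMART error log, `51` and `40`. The bit names follow the classic ATA register layout. Some of them were renamed or made obsolete in later revisions of the standard, so treat the tables as an approximation.

```python
# Classic ATA status register bits, highest bit first.
STATUS_BITS = {
    0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x10: "DSC",
    0x08: "DRQ", 0x04: "CORR", 0x02: "IDX", 0x01: "ERR",
}

# Classic ATA error register bits, highest bit first.
ERROR_BITS = {
    0x80: "ICRC", 0x40: "UNC", 0x20: "MC", 0x10: "IDNF",
    0x08: "MCR", 0x04: "ABRT", 0x02: "NM", 0x01: "AMNF",
}


def decode_flags(value: int, names: dict) -> list:
    """Return the names of all bits that are set in value."""
    return [name for mask, name in names.items() if value & mask]


status_flags = decode_flags(0x51, STATUS_BITS)  # ST column of the error log
error_flags = decode_flags(0x40, ERROR_BITS)    # ER column of the error log

print("status:", status_flags)
print("error: ", error_flags)
```

The status byte says the device is ready (DRDY) but the command ended with ERR, and the error byte names the cause as UNC.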