Removing a failed disk from an array

Gathering information about the disk

http://serverfault.com/questions/381177/megacli-get-the-dev-sd-device-name-for-a-logical-drive

We need the 'Target Id' from the output of:
megacli -ldinfo -Lall -aall

Virtual Drive: 5 (Target Id: 5)
Name                :r0-2-ssd
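When the controller carries many virtual drives, the same output can be reduced to one "Target Id, name" pair per drive. A minimal sketch, assuming the "Virtual Drive" / "Name" line layout shown above; list_virtual_drives is a hypothetical helper name, not part of megacli:

```shell
# Hypothetical helper: reads `megacli -ldinfo -Lall -aall` output on stdin
# and prints "<target id> <name>" for each virtual drive.
list_virtual_drives() {
    awk '
        # "Virtual Drive: 5 (Target Id: 5)" -> the 6th field is "5)"
        /^Virtual Drive:/ { tid = $6; sub(/\)$/, "", tid) }
        # "Name                :r0-2-ssd" -> strip the label and the colon
        /^Name[ \t]*:/    { name = $0; sub(/^Name[ \t]*:[ \t]*/, "", name); print tid, name }
    '
}

# Usage: megacli -ldinfo -Lall -aall -nolog | list_virtual_drives
```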

With lshw installed, compare the 'Target Id' with 'bus info' (in scsi@0:2.5.0 the third number, 5, is the target ID):

bus info: scsi@0:2.5.0
logical name: /dev/sdf
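The comparison can also be scripted. A sketch under two assumptions: lshw prints "bus info: scsi@host:channel.target.lun" before "logical name:" for each disk, as in the excerpt above, and the target number equals the controller's Target Id. dev_for_target is a hypothetical helper name:

```shell
# Hypothetical helper: reads `lshw -class disk` output on stdin and prints
# the logical name of the disk whose SCSI target number equals $1.
dev_for_target() {
    awk -v tid="$1" '
        /bus info: scsi@/ {
            # "scsi@0:2.5.0" -> split on ":" and "." -> host, channel, target, lun
            addr = $0
            sub(/.*scsi@/, "", addr)
            n = split(addr, part, /[:.]/)
            match_found = (n >= 3 && part[3] == tid)
        }
        # Print only the first logical name that follows a matching bus info
        /logical name:/ && match_found { print $NF; match_found = 0 }
    '
}

# Usage: lshw -class disk | dev_for_target 5
```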

Check which LVM logical volume is located on this disk:
lvs -o +seg_pe_ranges |grep /dev/sdf

ssd-kvm321-chi-slave-db       ssd   -wi-ao-- 400.00g      /dev/sdf:2560-104959 
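A plain grep for /dev/sdf would also match devices such as /dev/sdfa on hosts with many disks. A slightly stricter filter, sketched on the assumption that the LV name and VG name are the first two columns and the PE ranges come last, as in the line above; lvs_on_device is a hypothetical helper name:

```shell
# Hypothetical helper: reads `lvs -o +seg_pe_ranges` output on stdin and
# prints "<vg>/<lv>" for every segment that has extents on device $1.
lvs_on_device() {
    awk -v dev="$1" '
        {
            # PE ranges look like "/dev/sdf:2560-104959"; compare the part
            # before the colon so that /dev/sdf does not also match /dev/sdfa.
            for (i = 3; i <= NF; i++) {
                split($i, r, ":")
                if (r[1] == dev) { print $2 "/" $1; break }
            }
        }
    '
}

# Usage: lvs -o +seg_pe_ranges | lvs_on_device /dev/sdf
```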

Gather details on "Other Error Count: 1".
megacli AdpEventLog -GetEvents -f megacli.log -a0

From the file megacli.log:

===========
Device ID: 15
Enclosure Index: 32
Slot Number: 15
Error: 3

seqNum: 0x00000f9a
Time: Sun Sep  7 15:54:17 2014

Code: 0x00000071
Class: 0
Locale: 0x02
Event Description: Unexpected sense: PD 0f(e0x20/s15) Path 500056b36789abdc, CDB: 2a 00 17 6e 08 00 00 00 80 00, Sense: 6/29/00
Event Data:
===========

Slot Number: 15
Serial number: OCZ-6R12G0UG3MU5KHK2OCZ-VERTEX460
SCSI WWN: 5e83a970e3f8ae05
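In the standard SCSI sense code tables, the triple 6/29/00 from the event above decodes to UNIT ATTENTION, "power on, reset, or bus device reset occurred". To pull the affected drives out of a long megacli.log, something like the following may help. It is a sketch that assumes the "PD xx(e0xNN/sNN)" notation seen in the excerpt, with the enclosure in hex and the slot in decimal; unexpected_sense_drives is a hypothetical helper name:

```shell
# Hypothetical helper: reads the event log dump on stdin and prints one
# "[E:S]" line per drive that logged an "Unexpected sense" event.
unexpected_sense_drives() {
    # "PD 0f(e0x20/s15)" -> enclosure 0x20 (hex), slot 15 (assumed decimal)
    sed -n 's/^Event Description: Unexpected sense: PD [0-9a-fA-F]*(e\(0x[0-9a-fA-F]*\)\/s\([0-9]*\)).*/\1 \2/p' |
    while read -r encl slot; do
        printf '[%d:%s]\n' "$encl" "$slot"   # %d converts 0x20 to 32
    done | sort -u
}

# Usage: unexpected_sense_drives < megacli.log
```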

Find which virtual drive the failed physical disk belongs to

megacli -LdPdInfo -a0 -nolog

Virtual Drive: 5 (Target Id: 5)
Name                :r0-2-ssd
RAID Level          : Primary-0, Secondary-0, RAID Level Qualifier-0
Size                : 446.625 GB
Sector Size         : 512
Parity Size         : 0
State               : Optimal
Strip Size          : 64 KB
Number Of Drives    : 1
Span Depth          : 1
Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Disk's Default
Encryption Type     : None
Default Power Savings Policy: Controller Defined
Current Power Savings Policy: None
Can spin up in 1 minute: No
LD has drives that support T10 power conditions: No
LD's IO profile supports MAX power savings with cached writes: No
Bad Blocks Exist: No
Is VD Cached: No
Number of Spans: 1
Span: 0 - Number of PDs: 1

PD: 0 Information
Enclosure Device ID: 32
Slot Number: 15
Drive's position: DiskGroup: 5, Span: 0, Arm: 0
Enclosure position: 1
Device Id: 15
WWN: 5e83a970e3f8ae05
Sequence Number: 2
Media Error Count: 0
Other Error Count: 3
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA

Raw Size: 447.130 GB [0x37e436b0 Sectors]
Non Coerced Size: 446.630 GB [0x37d436b0 Sectors]
Coerced Size: 446.625 GB [0x37d40000 Sectors]
Sector Size:  0
Firmware state: Online, Spun Up
Device Firmware Level: 1.0 
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x500056b36789abdc
Connected Port Number: 0(path0) 
Inquiry Data: OCZ-6R12G0UG3MU5KHK2OCZ-VERTEX460                           1.0     
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Solid State Device
Drive:  Not Certified
Drive Temperature : N/A
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Drive has flagged a S.M.A.R.T alert : No
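With many virtual drives this listing gets long, so the lookup can be automated. A sketch, assuming each "Virtual Drive:" header precedes the "Enclosure Device ID:" and "Slot Number:" lines of its member disks, as in the output above; vd_for_slot is a hypothetical helper name:

```shell
# Hypothetical helper: reads `megacli -LdPdInfo -aN` output on stdin and
# prints the number of the virtual drive that contains the disk at [E:S].
vd_for_slot() {
    # $1 = enclosure device ID, $2 = slot number
    awk -v encl="$1" -v slot="$2" '
        /^Virtual Drive:/       { vd = $3 }   # remember the current VD number
        /^Enclosure Device ID:/ { e = $4 }    # enclosure of the current PD
        /^Slot Number:/         { if (e == encl && $3 == slot) print vd }
    '
}

# Usage: megacli -LdPdInfo -a0 -nolog | vd_for_slot 32 15
```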

Inspect the array

megacli -LDGetProp -Name -L5 -a0 -nolog

Adapter 0-VD 5(target id: 5): Name: r0-2-ssd

Exit Code: 0x00

Taking the disk out of service

Delete the array (virtual drive 5)

megacli -CfgLdDel -L5 -a0 -nolog

Adapter 0: Deleted Virtual Drive-5(target id-5)

Exit Code: 0x00

Look up the physical disk by its Enclosure Device ID and Slot Number, [E:S]

megacli -pdInfo -PhysDrv [32:15] -a0 -nolog

Enclosure Device ID: 32
Slot Number: 15
Enclosure position: 1
Device Id: 15
WWN: 5e83a970e3f8ae05
Sequence Number: 3
Media Error Count: 0
Other Error Count: 3
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA

Raw Size: 447.130 GB [0x37e436b0 Sectors]
Non Coerced Size: 446.630 GB [0x37d436b0 Sectors]
Coerced Size: 446.625 GB [0x37d40000 Sectors]
Sector Size:  0
Firmware state: Unconfigured(good), Spun Up
Device Firmware Level: 1.0 
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x500056b36789abdc
Connected Port Number: 0(path0) 
Inquiry Data: OCZ-6R12G0UG3MU5KHK2OCZ-VERTEX460                           1.0     
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Solid State Device
Drive:  Not Certified
Drive Temperature : N/A
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Drive has flagged a S.M.A.R.T alert : No

Exit Code: 0x00

http://tech.2499.pl/?p=96

1. Set the disk offline (skipped)

Because the disk is already in the state "Firmware state: Unconfigured(good), Spun Up", there is no need to set it offline.
It would be Online only if it still belonged to one of the logical drives. In the current state the command fails:

megacli -PDOffline -PhysDrv [32:15] -a0 -nolog

Adapter: 0: Failed to change PD state at EnclId-32 SlotId-15.

Exit Code: 0x01
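Whether step 1 is needed can be decided from the "Firmware state" line before running anything. A sketch, assuming the -pdInfo layout shown earlier; firmware_state and needs_offline are hypothetical helper names:

```shell
# Hypothetical helper: reads `megacli -pdInfo -PhysDrv [E:S] -aN` output on
# stdin and prints the value of the "Firmware state" line.
firmware_state() {
    sed -n 's/^Firmware state: *//p'
}

# Succeeds only when the drive is an Online array member, the one case in
# which -PDOffline is expected to be needed before removal.
needs_offline() {
    case "$(firmware_state)" in
        Online*) return 0 ;;
        *)       return 1 ;;
    esac
}

# Usage:
#   megacli -pdInfo -PhysDrv '[32:15]' -a0 -nolog | needs_offline \
#       && megacli -PDOffline -PhysDrv '[32:15]' -a0 -nolog
```

The same firmware_state helper can confirm the "Spun down" state after the prepare-for-removal step.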

2. Mark the disk as missing

megacli -PDMarkMissing -PhysDrv [32:15] -a0 -nolog

Adapter: 0: Failed to change PD state at EnclId-32 SlotId-15.

FW error description: 
  The specified device is in a state that doesn't support the requested command.  

Exit Code: 0x32

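This did not work, which appears to be normal on PERC H710P controllers (most likely because the disk is already Unconfigured(good) and no longer an array member). The page below reports the same for an H700: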
Mark the drive as missing (seems to not work on R510 H700 card, so just do the next step)
http://www.maths.cam.ac.uk/computing/docs/public/megacli_raid_lsi.html

3. Prepare the disk for removal

megacli -PdPrpRmv -PhysDrv [32:15] -a0 -nolog

Prepare for removal Success

Exit Code: 0x00

After this step the state changes to: Firmware state: Unconfigured(good), Spun down

4. Turn on the disk's locate LED

megacli -PdLocate -start -PhysDrv [32:15] -a0 -nolog

Adapter: 0: Device at EnclId-32 SlotId-15  -- PD Locate Start Command was successfully sent to Firmware 

Exit Code: 0x00

5. Pull the disk

6. Verify that the right disk was pulled

megacli -pdinfo -physdrv [32:15] -aall -nolog

Adapter 0: Device at Enclosure - 32, Slot - 15 is not found.

Exit Code: 0x00


All correct.
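Note that megacli returned Exit Code 0x00 here even though the device was not found, so a scripted check has to look at the message text and not the exit status. A sketch, assuming the "is not found" wording shown above; slot_is_empty is a hypothetical helper name:

```shell
# Hypothetical helper: reads `megacli -pdinfo -physdrv [E:S] -aN` output on
# stdin; succeeds when the controller reports the slot as empty.
slot_is_empty() {
    grep -q 'is not found'
}

# Usage:
#   megacli -pdinfo -physdrv '[32:15]' -a0 -nolog | slot_is_empty \
#       && echo "slot 15 is empty"
```

The usage line queries a single adapter (-a0), since with -aall another adapter could print its own "is not found" line for the same [E:S].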

Updated by Рамиль Абдулбяров more than 9 years ago · 4 revisions