The FreeBSD Diary

Providing practical examples since 1998

Restoring an INOPERABLE 3Ware unit 12 February 2012

I've been using a 3Ware 9550SX-8LP since 2006. Over the weekend, I encountered the first problem with it: the controller reported one unit as INOPERABLE. That turned out to be an overstatement, and the problem was easily fixed.

After a reboot to upgrade the kernel, Nagios alerted me to a problem. I checked via the command line and found this situation:

# tw_cli info

Ctl   Model        (V)Ports  Drives   Units   NotOpt  RRate   VRate  BBU
------------------------------------------------------------------------
c0    9550SX-8LP   8         8        3       1       4       1      OK       
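
The NotOpt column in this summary is the kind of value a monitoring check can key on. The sketch below is not the Nagios check that raised the alert (the article does not show it). It is a minimal, hypothetical shell function, check_notopt, that reads the text of tw_cli info on standard input and returns the conventional Nagios plugin exit codes (0 OK, 2 CRITICAL, 3 UNKNOWN). It assumes the column layout shown above, with NotOpt as the sixth field.

```shell
# Hypothetical helper, not part of tw_cli or Nagios.
# Reads the output of `tw_cli info` on stdin and reports any controller
# whose NotOpt column (assumed to be field 6) is greater than zero.
check_notopt() {
    awk '
        $1 ~ /^c[0-9]+$/ {
            seen = 1
            if ($6 + 0 > 0) { bad = bad " " $1 "=" $6 }
        }
        END {
            if (!seen)     { print "UNKNOWN: no controller rows found"; exit 3 }
            if (bad != "") { print "CRITICAL: non-optimal units:" bad; exit 2 }
            print "OK: all units optimal"
        }
    '
}

# Possible use on a host with tw_cli installed:
#   tw_cli info | check_notopt
```

Reading from standard input keeps the parsing separate from the tw_cli call, so the function can be tried against saved output before it is wired into monitoring.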

# tw_cli info c0

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-10   OK             -       -       64K     195.548   ON     ON     
u1    SPARE     OK             -       -       -       69.2404   -      ON     
u2    RAID-10   INOPERABLE     -       -       64K     195.548   OFF    ON     

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u2     69.25 GB    145226112     WD-WMAKE2379003     
p1     OK               u1     69.25 GB    145226112     WD-WMAKE2379069     
p2     OK               u0     69.25 GB    145226112     WD-WMAKE2379066     
p3     OK               u0     69.25 GB    145226112     WD-WMAKE2379012     
p4     OK               u0     69.25 GB    145226112     WD-WMAKE2379286     
p5     OK               u0     69.25 GB    145226112     WD-WMAKE2379019     
p6     OK               u0     69.25 GB    145226112     WD-WMAKE2394339     
p7     OK               u0     69.25 GB    145226112     WD-WMAKE2378696     

Name  OnlineState  BBUReady  Status    Volt     Temp     Hours  LastCapTest
---------------------------------------------------------------------------
bbu   On           Yes       OK        OK       OK       255    02-Sep-2010

Here, you can see that u2 has a problem. The port listing also shows that u2 contains a single drive, attached to port 0 (p0). That makes it one of the two spares that have existed on this controller since I set it up.

I will delete that unit and then add the drive back as a spare. See below.
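
Picking the troubled unit out of the unit table can also be done mechanically. The following is a small sketch under the assumption that the layout matches the output above (unit name in the first field, Status in the third). The function name not_ok_units is made up for illustration.

```shell
# Hypothetical helper: reads `tw_cli info c0` output on stdin and prints
# "unit status" for every unit row whose Status column is not OK.
not_ok_units() {
    awk '$1 ~ /^u[0-9]+$/ && $3 != "OK" { print $1, $3 }'
}

# Possible use on a host with tw_cli installed:
#   tw_cli info c0 | not_ok_units
```

Against the listing above this would report only u2. A unit that is rebuilding or verifying would be listed as well, since its status is also something other than OK.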

Fixing it

I found help via Google and used it as an example. I also posted to the FreeBSD Forums, but today, before I received a reply, I went ahead...

First, I removed the defective u2 unit:

# tw_cli maint deleteunit c0 u2
Deleting unit c0/u2 ...Done.


# tw_cli info

Ctl   Model        (V)Ports  Drives   Units   NotOpt  RRate   VRate  BBU
------------------------------------------------------------------------
c0    9550SX-8LP   8         8        2       0       4       1      OK       

# tw_cli info c0

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-10   OK             -       -       64K     195.548   ON     ON     
u1    SPARE     OK             -       -       -       69.2404   -      ON     

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               -      69.25 GB    145226112     WD-WMAKE2379003     
p1     OK               u1     69.25 GB    145226112     WD-WMAKE2379069     
p2     OK               u0     69.25 GB    145226112     WD-WMAKE2379066     
p3     OK               u0     69.25 GB    145226112     WD-WMAKE2379012     
p4     OK               u0     69.25 GB    145226112     WD-WMAKE2379286     
p5     OK               u0     69.25 GB    145226112     WD-WMAKE2379019     
p6     OK               u0     69.25 GB    145226112     WD-WMAKE2394339     
p7     OK               u0     69.25 GB    145226112     WD-WMAKE2378696     

Name  OnlineState  BBUReady  Status    Volt     Temp     Hours  LastCapTest
---------------------------------------------------------------------------
bbu   On           Yes       OK        OK       OK       255    02-Sep-2010  

This has removed the unit, and p0 now shows no unit in the port listing. Now I add the drive back as a spare. I knew it was p0 because the earlier output listed p0 as the only port belonging to u2.
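
Matching ports to units from the port table can be sketched the same way. This assumes the port rows look like the ones above, with the port name in the first field and the unit in the third. The function name ports_in_unit is hypothetical.

```shell
# Hypothetical helper: reads `tw_cli info c0` output on stdin and prints
# the ports whose Unit column equals the first argument.
# Pass "-" to list ports that currently belong to no unit.
ports_in_unit() {
    awk -v unit="$1" '$1 ~ /^p[0-9]+$/ && $3 == unit { print $1 }'
}

# Possible use on a host with tw_cli installed:
#   tw_cli info c0 | ports_in_unit u2    # before deleting the unit
#   tw_cli info c0 | ports_in_unit -     # after deleting it
```

Checking the unassigned ports after the delete is a cheap way to confirm that the port about to be passed to createunit is the one that was just freed.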

# tw_cli maint createunit c0 p0 rspare
Creating new unit on controller /c0 ... Done. The new unit is /c0/u2.
WARNING: This Spare unit may replace failed drive of same interface type only.

Now the status looks like this:

# tw_cli info c0

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-10   OK             -       -       64K     195.548   ON     ON     
u1    SPARE     OK             -       -       -       69.2404   -      ON     
u2    SPARE     OK             -       -       -       69.2404   -      OFF    

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u2     69.25 GB    145226112     WD-WMAKE2379003     
p1     OK               u1     69.25 GB    145226112     WD-WMAKE2379069     
p2     OK               u0     69.25 GB    145226112     WD-WMAKE2379066     
p3     OK               u0     69.25 GB    145226112     WD-WMAKE2379012     
p4     OK               u0     69.25 GB    145226112     WD-WMAKE2379286     
p5     OK               u0     69.25 GB    145226112     WD-WMAKE2379019     
p6     OK               u0     69.25 GB    145226112     WD-WMAKE2394339     
p7     OK               u0     69.25 GB    145226112     WD-WMAKE2378696     

Name  OnlineState  BBUReady  Status    Volt     Temp     Hours  LastCapTest
---------------------------------------------------------------------------
bbu   On           Yes       OK        OK       OK       255    02-Sep-2010  
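
The delete-and-recreate sequence above can be summarised as a dry-run helper. This is a sketch, not a tool from the 3Ware software: respare_plan is a hypothetical function that only prints the two tw_cli maint commands used above for a given controller, unit, and port, so they can be reviewed before anyone runs them. Deleting a unit discards whatever it held, so this only makes sense for a unit with no data on it, as with the spare here.

```shell
# Hypothetical dry-run helper: prints, but never executes, the commands
# used in this article to turn a broken single-drive unit back into a spare.
# Usage: respare_plan CONTROLLER UNIT PORT   (for example: c0 u2 p0)
respare_plan() {
    ctl=$1
    unit=$2
    port=$3
    # Loose sanity checks on the argument shapes only.
    case $ctl  in c[0-9]*) ;; *) echo "bad controller: $ctl" >&2; return 1 ;; esac
    case $unit in u[0-9]*) ;; *) echo "bad unit: $unit" >&2; return 1 ;; esac
    case $port in p[0-9]*) ;; *) echo "bad port: $port" >&2; return 1 ;; esac
    echo "tw_cli maint deleteunit $ctl $unit"
    echo "tw_cli maint createunit $ctl $port rspare"
}
```

For the situation in this article, respare_plan c0 u2 p0 prints the same two commands that were typed by hand above.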

The next step is to verify that new unit:

# tw_cli
//supernews> maint verify c0 u2
Sending start verify message to /c0/u2 ... Done.

//supernews>

Now that the verify has started, its progress appears in the %V/I/M column of the info output:

# tw_cli info c0

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-10   OK             -       -       64K     195.548   ON     ON     
u1    SPARE     OK             -       -       -       69.2404   -      ON     
u2    SPARE     VERIFYING      -       23%     -       69.2404   -      OFF    

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     VERIFYING        u2     69.25 GB    145226112     WD-WMAKE2379003     
p1     OK               u1     69.25 GB    145226112     WD-WMAKE2379069     
p2     OK               u0     69.25 GB    145226112     WD-WMAKE2379066     
p3     OK               u0     69.25 GB    145226112     WD-WMAKE2379012     
p4     OK               u0     69.25 GB    145226112     WD-WMAKE2379286     
p5     OK               u0     69.25 GB    145226112     WD-WMAKE2379019     
p6     OK               u0     69.25 GB    145226112     WD-WMAKE2394339     
p7     OK               u0     69.25 GB    145226112     WD-WMAKE2378696     

Name  OnlineState  BBUReady  Status    Volt     Temp     Hours  LastCapTest
---------------------------------------------------------------------------
bbu   On           Yes       OK        OK       OK       255    02-Sep-2010  
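
To follow the verify without reading the whole table each time, the status and %V/I/M fields for one unit can be pulled out. This is a minimal sketch assuming the layout shown above (Status in the third field, %V/I/M in the fifth). The function name verify_progress is made up for illustration.

```shell
# Hypothetical helper: reads `tw_cli info c0` output on stdin and prints
# the Status and %V/I/M columns for the unit named in the first argument.
verify_progress() {
    awk -v unit="$1" '$1 == unit { print $3, $5 }'
}

# Possible use on a host with tw_cli installed, polling once a minute:
#   while sleep 60; do tw_cli info c0 | verify_progress u2; done
```

While the verify runs, the second field shows a percentage. Once the unit is back to OK, the column reverts to a dash.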

This was all much easier than I thought it was going to be...

