Monday, September 26, 2005

Cluster Shared Disk Refused To Start

One of the shared disks on our clustered server refused to start, and we saw Event ID 1066 ("Cluster disk resource Disk : is corrupt. Running ChkDsk /F to repair problems."). When we tried to access the drive, we got access denied. If we can’t access the drive, how can we run chkdsk against it?

Here is where teamwork came into play. My colleague searched the Internet for a solution while I tried other methods on the server. He found a way to let the operating system, rather than the cluster service, take control of the shared disk.

This is how you do it.

1) Go to Device Manager, right-click, highlight View and select Show hidden devices.
2) Expand Non-Plug and Play Drivers.
3) Double-click on Cluster Disk Driver.
4) On the Cluster Disk Driver Properties dialog box, click on the Driver tab.
5) Change the Startup Type to Demand.
6) In the Services console, change the Startup Type for the Cluster Service to Manual.
7) Make sure the cluster resource group cannot fail over to the other node. You can take the resource group offline to prevent this.
8) Reboot the server.
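
The two startup type changes in steps 5 and 6 can also be expressed as sc config commands. The following is only a rough sketch in Python, not something we ran at the time. It assumes the Cluster Disk Driver is registered under the service name ClusDisk and the Cluster Service under ClusSvc, and that the values to restore afterwards are system and auto; none of these names or values come from the steps above, so check them on your own server first. By default the script only prints the commands.

```python
import subprocess
import sys

# Assumed service names (not taken from the post): verify them on your server.
CLUSTER_DISK_DRIVER = "ClusDisk"   # Cluster Disk Driver
CLUSTER_SERVICE = "ClusSvc"        # Cluster Service


def startup_commands(driver_start, service_start):
    """Build the sc commands that set the startup type of both components."""
    # sc expects "start=" and its value as separate tokens.
    return [
        ["sc", "config", CLUSTER_DISK_DRIVER, "start=", driver_start],
        ["sc", "config", CLUSTER_SERVICE, "start=", service_start],
    ]


def release_disk_commands():
    """Steps 5 and 6: driver set to Demand, Cluster Service set to Manual."""
    # sc uses the keyword "demand" for both Demand and Manual startup.
    return startup_commands("demand", "demand")


def restore_commands(driver_start="system", service_start="auto"):
    """Reverse the change. The defaults here are assumptions."""
    return startup_commands(driver_start, service_start)


def run(commands, dry_run=True):
    """Print each command, and execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Pass --apply to actually change the settings; otherwise just print.
    run(release_disk_commands(), dry_run="--apply" not in sys.argv)
```

Keeping the command list separate from the code that runs it makes it easy to review exactly what will change before touching a production node. You still need to take the resource group offline and reboot, as in steps 7 and 8.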

Once we restarted the server, we checked the ACL of the problematic drive and found that it was incorrect. We reset the ACL manually, then reversed the steps above to let the cluster service take control of the shared disk again. We rebooted the server and everything was back to normal.

By the way, we got the information from the article "Need to run chkdsk on a clustered system?".
