SQL 2000/2005 under Windows 2003 Cluster crash

Last Post 25 Jun 2010 11:51 AM by gunneyk. 1 Replies.
Shankar
New Member
25 Jun 2010 09:51 AM

One of my customers has three two-node MSCS (active-passive) clusters running Windows 2003 SP2 Standard Ed. Each node runs one instance of SQL 2000 (SP4) and one of SQL 2005 (SP3). They run DBCC CHECKDB on all databases and back up the logs once an hour. Periodically throughout the day, roughly every 30 minutes, the "I/O takes longer than 15 sec" message appears in the SQL log. The servers are HP ProLiant BL25p, each with 8 GB of memory; storage is EMC. The servers and storage checked out OK, and a storage check found no latency anywhere. Besides the I/O message, about once a month the SQL instance itself crashes (generally with no meaningful errors) and then restarts on its own. About once every 2-3 months, a SQL instance failure also causes the node to fail over.

This behavior is similar across all three clusters. The Windows event logs and SQL error logs show nothing tangible.

Any ideas?

Shankar

gunneyk
New Member
25 Jun 2010 11:51 AM
That's not much to go on. But the fact that they get the "I/O taking longer than 15 seconds" message on a regular basis tells me there is a fundamental problem with the I/O somewhere. Since all three clusters behave the same way, the common point is most likely the SAN or the SAN switch. For every message you see, there are probably hundreds of other I/Os taking many seconds but still under the 15-second threshold. Look at what is happening when these messages appear. What do the file and wait stats say for I/O?
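
As a rough illustration of what "looking at the file stats" can mean: on SQL 2005 the `sys.dm_io_virtual_file_stats` function returns cumulative per-file counters such as `num_of_reads`, `io_stall_read_ms`, `num_of_writes` and `io_stall_write_ms` (SQL 2000 has `fn_virtualfilestats`, which reports similar counters under different column names). Because the counters are cumulative, you need two samples and the difference between them to see latency for a given interval. The sketch below is only that, a sketch in Python: the `FileStats` class, the function names and the 20 ms threshold are assumptions made up for illustration, and getting the samples out of the server is left to you.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Key for one database file: (database_id, file_id).
FileKey = Tuple[int, int]


@dataclass(frozen=True)
class FileStats:
    """Cumulative I/O counters for one file at one sampling moment.

    Field names mirror columns of sys.dm_io_virtual_file_stats (SQL 2005);
    the class itself is a hypothetical container for this example.
    """
    num_of_reads: int
    io_stall_read_ms: int
    num_of_writes: int
    io_stall_write_ms: int


def avg_latency_ms(before: FileStats, after: FileStats) -> Tuple[float, float]:
    """Average (read_ms, write_ms) per I/O between two samples of the same file."""
    reads = after.num_of_reads - before.num_of_reads
    writes = after.num_of_writes - before.num_of_writes
    read_stall = after.io_stall_read_ms - before.io_stall_read_ms
    write_stall = after.io_stall_write_ms - before.io_stall_write_ms
    # No I/O in the interval means there is no latency to report.
    read_ms = read_stall / reads if reads > 0 else 0.0
    write_ms = write_stall / writes if writes > 0 else 0.0
    return read_ms, write_ms


def slow_files(
    before: Dict[FileKey, FileStats],
    after: Dict[FileKey, FileStats],
    threshold_ms: float = 20.0,  # assumed cutoff; pick what fits your storage
) -> List[Tuple[FileKey, float, float]]:
    """List files whose average read or write latency exceeds threshold_ms."""
    flagged = []
    for key in sorted(after):
        if key not in before:
            continue  # file appeared between samples; nothing to diff against
        read_ms, write_ms = avg_latency_ms(before[key], after[key])
        if read_ms > threshold_ms or write_ms > threshold_ms:
            flagged.append((key, read_ms, write_ms))
    return flagged
```

Taking one sample just before and one just after a 15-second message appears in the log would show which files were slow in that window, and whether the slow files share a LUN or path.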