Cluster Service Fails with Event ID 1069

Last Post 08 Apr 2008 04:46 PM by ntharrison. 3 Replies.
Author Messages
poverud
New Member

--
31 Aug 2006 01:00 AM
SQL Server 2005 (No SP)
Windows 2003 64bit
MSDTC
Cluster

---

I'm facing random restarts of my SQL servers and I know that the Cluster service has something to do with it, one way or another.

System Events:

- Prior, nothing relevant.
- 18:56:00 Cluster resource 'SQL Server' in Resource Group '******' failed.
- 18:56:01 The SQL Server Agent (MSSQLSERVER) service was successfully sent a stop control.


Application Events:
- 18:55:57 [sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
[sqsrvres] printODBCError: sqlstate = 08S01; native error = 40; message = [Microsoft][SQL Native Client]TCP Provider: The specified network name is no longer available.
- 18:55:57 [sqsrvres] printODBCError: sqlstate = 08S01; native error = 40; message = [Microsoft][SQL Native Client]Communication link failure
- 18:55:57 [sqsrvres] OnlineThread: QP is not online.
- 18:55:57 [sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
- 18:55:57 [sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure
- 18:55:57 [sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
- 18:55:57 [sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure
- 18:55:57 [sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
- 18:55:57 [sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure
- 18:55:57 [sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
- 18:55:57 [sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure
- 18:55:57 [sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
- 18:55:57 [sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure
- 18:56:08 Configuration option 'Agent XPs' changed from 1 to 0. Run the RECONFIGURE statement to install.
- 18:56:09 [sqsrvres] OnlineThread: asked to terminate while waiting for QP.
- 18:56:14 Service Broker manager has shut down.

This happens on both servers at the same time.
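
With this many near-identical entries, it can help to count the distinct ODBC errors rather than read them one by one. The following is a minimal sketch, assuming the Application log has been exported to plain text in the same shape as the lines quoted above. The function name `summarize_odbc_errors` and the `sample` list are illustrative only and are not part of any SQL Server or cluster tooling.

```python
import re
from collections import Counter

# Matches lines shaped like the [sqsrvres] printODBCError entries quoted above.
ODBC_ERROR = re.compile(
    r"sqlstate = (?P<sqlstate>\w+); "
    r"native error = (?P<native>-?\d+); "
    r"message = (?P<message>.*)"
)


def summarize_odbc_errors(lines):
    """Count each (sqlstate, native error, message) combination in the lines."""
    counts = Counter()
    for line in lines:
        match = ODBC_ERROR.search(line)
        if match:
            key = (match["sqlstate"], match["native"], match["message"].strip())
            counts[key] += 1
    return counts


# A few lines copied from the Application events listed in this post.
sample = [
    "- 18:55:57 [sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed",
    "[sqsrvres] printODBCError: sqlstate = 08S01; native error = 40; message = "
    "[Microsoft][SQL Native Client]TCP Provider: The specified network name is "
    "no longer available.",
    "- 18:55:57 [sqsrvres] printODBCError: sqlstate = 08S01; native error = 40; "
    "message = [Microsoft][SQL Native Client]Communication link failure",
    "- 18:55:57 [sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; "
    "message = [Microsoft][SQL Native Client]Communication link failure",
]

for (sqlstate, native, message), n in summarize_odbc_errors(sample).most_common():
    print(n, sqlstate, native, message)
```

Grouping this way makes it easier to see that every entry carries sqlstate 08S01 and that only the first pair reports native error 40.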

I have noticed that prior to this, the Windows Installer service is executed, although no updates are downloaded.
I have checked for DNS errors in case the failover would cause any name collisions.
I have not checked the cluster dependencies, because I cannot take any group offline.
MSDTC is installed in its own group, although dependencies are not verified.
The MSSQL instance allows remote connections.
The Cluster Service user has access privileges in the master DB.

---

The only users accessing the servers at this time are the SQL Server Agent and the Cluster Service.

---

I would be happy to provide more information.

Thank you in advance

with Regards
Filip Poverud
poverud
New Member

--
31 Aug 2006 04:48 AM
quote:

Originally posted by: rm
Is the SQL restart caused by a cluster failover? If so, you have to find out why it failed over.


Actually there is no failover. I expressed myself incorrectly.
The cluster service fails on both servers at the same time.

We are looking into the heartbeat, could be a network issue if it's connected through a switch or something.
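
One rough way to gather evidence for a network problem is to log timestamped connection attempts from one node to the other and compare any failures with the event log times. The following is a minimal sketch. It does not test the cluster heartbeat itself; it only checks that a TCP port on the peer is reachable. The host name and port in the usage note are placeholders, and `probe` and `watch` are hypothetical helper names.

```python
import socket
import time
from datetime import datetime


def probe(host, port, timeout=2.0):
    """Try one TCP connect and return (ok, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False, time.monotonic() - start


def watch(host, port, interval=5.0, count=None):
    """Probe repeatedly and print one timestamped line per attempt.

    With count=None the loop runs until interrupted.
    """
    attempts = 0
    while count is None or attempts < count:
        ok, elapsed = probe(host, port)
        status = "ok" if ok else "FAILED"
        stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        print(f"{stamp} {host}:{port} {status} {elapsed * 1000:.0f} ms")
        attempts += 1
        if count is None or attempts < count:
            time.sleep(interval)
```

For example, `watch("node2.example.local", 1433)` (placeholder host and port) could be left running on one node with its output redirected to a file. A run of FAILED lines at the same time as the Event ID 1069 entries would support the network theory.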

/F
ntharrison
New Member

--
08 Apr 2008 04:46 PM
I have the very same issue on a production system. The error occurs nearly every day, just after 6 am.

As such, I have assumed that the issue is environmental, possibly something to do with the SAN or the network at the data centre where the system is located. As I am the production DBA, I have no admin access to the SAN, the network switches and routers, or the backup software, so I have logged the job with the third-party vendor who manages these for us. This is quite frustrating, as it has been 2 weeks now and I have not had a response.

I would suggest that you look at the times when you have the issue and at any environmental tasks for the SAN or network that run at those times. I am pretty sure that they are going to come back and tell me that at that time each day they sync the SAN mirroring between sites, or something similar. This would explain why the clustered network and disk resources both go offline at the same time on all nodes of the cluster.
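
To check whether the failures really do cluster around one time of day, the failure timestamps can be counted per hour. The following is a minimal sketch. The timestamps in `example_times` are made up for illustration, so substitute the times from your own System and Application logs. The function name `failures_by_hour` and the timestamp format are assumptions.

```python
from collections import Counter
from datetime import datetime


def failures_by_hour(timestamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Return a Counter mapping hour of day (0-23) to the number of failures."""
    return Counter(datetime.strptime(ts, fmt).hour for ts in timestamps)


# Hypothetical failure times, for illustration only.
example_times = [
    "2008-04-01 06:02:11",
    "2008-04-02 06:04:37",
    "2008-04-03 06:01:58",
    "2008-04-05 18:56:00",
]

for hour, count in sorted(failures_by_hour(example_times).items()):
    print(f"{hour:02d}:00  {'#' * count}  ({count})")
```

If most failures fall in a single hour, that hour is the one to compare against the SAN, backup and network job schedules.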
jorge.rivera
New Member

--
10 Apr 2008 10:40 AM
Hi

Some time ago I got errors like that. As a first step, Microsoft recommended updating all drivers (for all devices, including the HBA), the BIOS, the antivirus software, and the multipath agents. The issue disappeared, and I have now had around 3 months without problems. Your issue may not be the same, but this might help you.
In my case: Windows 2003, SQL 2000 SP3a, 32-bit, EMC CX500 SAN. Check this patch: STOR Miniport 32-bit Driver (x86).

Cheers...