Debugging Connection Leaks

21 Jun Debugging Connection Leaks

We recently had an issue with a web service we built to be the gateway for data entry to a client application. The issue was inconsistent (the worst kind). The errors seemed to occur in relatively short but frequent waves.

A review of the error logs showed the following error:

Description:
Failure to Open Database Connection

Error Stack Trace:
at System.Data.ProviderBase.DbConnectionFactory.GetConnection(DbConnection owningConnection)
at System.Data.ProviderBase.DbConnectionClosed.OpenConnection(DbConnection outerConnection, DbConnectionFactory connectionFactory)
at System.Data.SqlClient.SqlConnection.Open()
at LeadManager.Data.DataAccessBase.openCNX()

This indicated a connection leak issue. There are a limited number of sql connections in the connection pool for any application. When the application attempt to obtain a connection and all the connections in the connection pool are in use the application will wait for a connection to be released back to the connection pool. However, it will only wait for the length of the connection timeout setting. If no connections become available it will timeout and throw an exception.

After a thorough review of our data access layer we determined that the connection leak issue might be caused by sqlDataReaders not being explicitly closed. There were potential instances in the data access layer where an exception in reading from the sqlDataReader might cause the processing jump out of the try block to the catch block essentially jumping over the explicit close of the sqlDataReader. To fix this potential issue we changed all instantiation of sqlDataReaders to ‘using’ blocks like this:


using (SqlDataReader dr = cmd.ExecuteReader())
{
if (dr != null && dr.HasRows)
{
dr.Read();
// process data reader here
}
}

After moving this attempted fix to production we saw instant improvement. However the improvement was short lived. Within 12 hours the issue was occurring again. We had not fixed the issue but something we had done had alleviated the problem short term. At this point I turned to the Database to try to review the connections and from where those connections were coming. Not being an sql Database pro I turned to google and found two very helpful items.

  1. The sysprocesses table
  2. DBCC functions

The following queries pointed me to exactly where my problem lied.

First I wanted to see the total number of connections to my database:

Select count(dbid) as ‘Total Connections’, DB_NAME(dbid) as ‘Database Name’
From sys.sysprocesses with (nolock)
where DB_NAME(dbid) = ‘MyDatabaseName’
Group By dbid

Next I wanted to see from where those connections were coming. In particular I wanted to see the oldest connections:

SELECT * FROM sys.sysprocesses WITH (nolock)
WHERE DB_NAME(dbid) = ‘MyDatabaseName’
Order By login_time asc

Third, and most importantly, I wanted to know what sql script or stored procedures where these oldest connections to the database running. Using the spid (sql process id) from the previous querys oldest connections I ran the following:

DBCC INPUTBUFFER(87)

This query gives the stored procedure or sql script being run by the connection using the specified spid.

I ran these queries in my QA database for testing purposes. The results of these queries showed that even in the QA database where there was little traffic I had over 100 open connections to my database. The oldest connection was almost an hour old. And the sql script being called by almost all of the open connections to the database was the same script. Searching my data access layers for the script I quickly found it and immediately saw that this particular method was missing it’s ‘finally’ block from where all methods in my data access layer close their database connections. This particular method turned out to be a method called by a scheduled job that runs every 60 seconds.

After adding this ‘finally’ block to close the connection and updating the web service in QA, My open connections went from over 100 to 3. After checking back 2 hours later there were still only 3 open connections.

My research for ways to detect and diagnose this connection leak issue pointed me toward many tools some free some very expensive. But a couple of well crafted sql statements were the best tools I could have used.

Gregg Duncan

Tags: