Connection.Open hanging/freezing

Discussion of open issues, suggestions and bugs regarding ADO.NET provider for Oracle
klaus linzner
Posts: 28
Joined: Thu 16 May 2013 09:18

Post by klaus linzner » Fri 06 Dec 2013 10:04

Hi everyone,
I'm sorry I have to start such a squishy post, but we're pretty much out of ideas...

About half a year ago we started noticing problems with OracleConnection.Open when network issues came up. Every now and then (once every two to three weeks) an OracleConnection.Open would freeze the thread and never return. We were not able to find any clue, pattern or hint as to what could have caused it or how we could reproduce it.

To at least be able to detect the problem and not freeze up any working process, we started calling it from within a new Task so that we can cancel it after x seconds, throw an exception and continue working. We had to take the performance hit, but it was important to us to at least know that there's a problem.

A couple of weeks ago we started testing our applications over crappy network connections by using SoftPerfect Connection Emulator. Although we still can't give any details on what triggers the freezing, we're now at least able to reproduce it each day in at least one test case. At one point, any subsequent calls to connection.Open on connections having the same connection string as the connection that froze first would hang as well.


Long story short:
* Random freezing on connection.Open with Devart 7.2.114 (which is used by our 2.5 release)
* Sometimes any following connection openings with the same connection string will hang as well.
* The problem seems much less frequent in Devart 7.9.333 (which is used by our 2.6 release)

And now my questions:
* Do you know what could trigger this behavior?
* Did you have this problem yourself too once in a while?
* Do you know if any changes were made between 7.2.114 and 7.9.333 that would explain the improvement?
Upgrading the Devart version from 7.2 to 7.9 in our 2.5 release would take several days, not to mention add-ins developed by other departments that would require retesting, so I'd like to be rather sure it would improve the situation before walking down that path...

BR && thanks a lot for any help,
Klaus

Pinturiccio
Devart Team
Posts: 2420
Joined: Wed 02 Nov 2011 09:44

Re: Connection.Open hanging/freezing

Post by Pinturiccio » Tue 10 Dec 2013 14:56

Try setting the "Validate Connection=true;" connection string parameter to validate connections that are obtained from the pool. For more information, please refer to
http://www.devart.com/dotconnect/oracle ... ction.html

Re: Connection.Open hanging/freezing

Post by klaus linzner » Tue 04 Feb 2014 09:23

Hello Pinturiccio,
We tried several Connection handling settings and it seems that Validate Connection doesn't affect this behavior. (Most of the time we're running without connection pooling anyway).

Any other ideas how we could deal with this problem?

dcoracle600pro
Posts: 51
Joined: Mon 09 Apr 2012 09:57

Re: Connection.Open hanging/freezing

Post by dcoracle600pro » Thu 06 Feb 2014 10:08

We are experiencing the exact same symptoms on version 7.2.104

We have a WCF application that comes to a complete halt following a short network issue. Have already tried "Validate Connection=true;" which did not help.

We are able to reproduce the issue by simulating a network issue as follows:

1) Start the WCF application and ensure there is at least some traffic
2) Alter Oracle Client TNS file so that all Oracle connection attempts time out (change host name to 1.1.1.1)
3) Wait a few seconds and then correct the TNS file

All WCF calls that utilize the database will now hang... Forever...
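
The "at least some traffic" part of the steps above could be generated by a small load loop such as the following. This is only a sketch: `OpenLoadLoop` and its parameters are hypothetical names, not part of any Devart API, and the actual open/close call is passed in as a delegate so the loop itself makes no assumptions about the provider.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class OpenLoadLoop
{
    // Calls openAndClose repeatedly on `workers` threads until `duration` has passed,
    // logging how long each call took. Returns the number of calls that returned
    // (successfully or with an exception) before the method gave up waiting.
    public static int Run(Action openAndClose, int workers, TimeSpan duration, Action<string> log)
    {
        int completed = 0;
        DateTime deadline = DateTime.UtcNow + duration;
        Thread[] threads = new Thread[workers];

        for (int i = 0; i < workers; i++)
        {
            threads[i] = new Thread(() =>
            {
                while (DateTime.UtcNow < deadline)
                {
                    Stopwatch one = Stopwatch.StartNew();
                    try
                    {
                        openAndClose();
                        log("ok after " + one.ElapsedMilliseconds + " ms");
                    }
                    catch (Exception ex)
                    {
                        log("failed after " + one.ElapsedMilliseconds + " ms: " + ex.Message);
                    }
                    Interlocked.Increment(ref completed);
                }
            });
            // Background threads: a hung call must not keep the test process alive.
            threads[i].IsBackground = true;
            threads[i].Start();
        }

        // Wait a little longer than the test duration; threads stuck in a hung
        // call are simply left behind.
        foreach (Thread t in threads)
        {
            t.Join(duration + TimeSpan.FromSeconds(5));
        }
        return Interlocked.CompareExchange(ref completed, 0, 0);
    }
}
```

A call might look like `OpenLoadLoop.Run(() => { using (var c = new OracleConnection(cs)) { c.Open(); } }, 8, TimeSpan.FromMinutes(5), Console.WriteLine);`, where `cs` stands for your own connection string. Log lines that stop appearing while the TNS file is already corrected again would indicate the hang.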

Klaus,

I would be interested to see your workaround code if you can share it.

Re: Connection.Open hanging/freezing

Post by klaus linzner » Thu 06 Feb 2014 12:38

dcoracle600pro wrote: Klaus, I would be interested to see your workaround code if you can share it.

Hi,
Glad we're not the only ones ;)
So far our workaround code is pretty much limited to shutting down gracefully in places where our wrapper around Devart (which supports Oracle as well as MsSql and manages connectivity problems) is used (pretty much everywhere except EntityFramework).

In the simple case we just call connection.Open inside a Task:

Code:

return Task.Factory.StartNew(() =>
{
    Stopwatch stopSync = Stopwatch.StartNew();
    connection.Open();
    stopSync.Stop();
    return stopSync.ElapsedMilliseconds;
}, cancelToken, TaskCreationOptions.LongRunning, TaskScheduler.Default);

We pass in a cancellation token; once the timeout elapses and the token is cancelled, we try to deal with the exceptions and eventually start a shutdown. The problem is that the Task scheduling sometimes induces a performance hit of several ms.
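
For readers who want to try the Task-based approach, the waiting side could be sketched roughly as follows. This is not the code from our wrapper: `OpenGuard` and `RunWithTimeout` are hypothetical names, and the sketch uses `Task.Wait(TimeSpan)` instead of a cancellation token to detect the timeout. The blocking call is passed as a delegate so the helper does not depend on Devart.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class OpenGuard
{
    // Runs a blocking call (for example connection.Open) on a dedicated thread and
    // waits at most `timeout` for it. Returns the elapsed milliseconds on success.
    // If the call itself throws, Wait surfaces that as an AggregateException.
    public static long RunWithTimeout(Action blockingCall, TimeSpan timeout)
    {
        Task<long> task = Task.Factory.StartNew(() =>
        {
            Stopwatch watch = Stopwatch.StartNew();
            blockingCall();
            watch.Stop();
            return watch.ElapsedMilliseconds;
        }, CancellationToken.None, TaskCreationOptions.LongRunning, TaskScheduler.Default);

        // Wait returns false on timeout. The worker thread keeps running in that case:
        // a call blocked in native code cannot be cancelled cooperatively, so this
        // only detects the hang, it does not end it.
        if (!task.Wait(timeout))
        {
            // Observe a late failure so it is not reported as an unobserved task exception.
            task.ContinueWith(t => { var ignored = t.Exception; },
                TaskContinuationOptions.OnlyOnFaulted);
            throw new TimeoutException("Call did not return within " + timeout);
        }
        return task.Result;
    }
}
```

Usage would be along the lines of `long ms = OpenGuard.RunWithTimeout(connection.Open, TimeSpan.FromSeconds(30));`. Note that every timed-out call leaves one blocked thread behind, which matches the limitation described in this post: the hang can be detected and reported, but not recovered from.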

In cases where this performance overhead is too high and we need to scrape every ms we can, we call Open() inline, but create a timer before.
Once this timer is fired we try to access the original thread and abort it, afterwards try to reset the abort on the original thread.
It depends on the OS, environment and user privileges if this method works and if we can resume on the original thread, but at least we notice the problem in our application and can deal with it.

Code:

Tuple<Thread, string> callbackTuple = new Tuple<Thread, string>(Thread.CurrentThread, connectionIdentifier);

using (Timer failoverLogging = new Timer(ConnectionOpenTimerCallback, callbackTuple, MaxOpenTimeInMs, MaxOpenTimeInMs))
{
    try
    {
        connection.Open();
        failoverLogging.Change(Timeout.Infinite, Timeout.Infinite);
        stopSync.Stop();

        if (cancelToken.IsCancellationRequested)
        {
            tcs.TrySetCanceled();
        }
        else
        {
            tcs.TrySetResult(stopSync.ElapsedMilliseconds);
        }
    }
    catch (ThreadAbortException threadAbortException)
    {
        log.Error("Aborted while connecting sync, try to recover...", threadAbortException);
        Thread.ResetAbort();
        threadAbortException.RethrowWithoutStackTraceLoss();
    }
}
...
private void ConnectionOpenTimerCallback(object state)
{
    try
    {
        Tuple<Thread, string> callbackTuple = state as Tuple<Thread, string>;
        if (callbackTuple == null)
        {
            log.ErrorFormat("Connection.Open - Timeout exceeded. {0}", state);
        }
        else
        {
            Thread waitingThread = callbackTuple.Item1;
            string connectionIdentifier = callbackTuple.Item2;

            log.DebugFormat(
                "Connection.Open - Timeout exceeded on connection {0} on Thread {1} - try aborting",
                connectionIdentifier, waitingThread.ManagedThreadId);
            waitingThread.Abort("You're hanging...");
        }
    }
    catch (Exception ex)
    {
        log.Fatal("ConnectionOpenTimerCallback failed", ex);
    }
}
With those methods we at least could detect the problem and start alerts, but we weren't able to recover (do magic and be able to connect again) from it;
I really hope that this problem gets solved somehow, so we can kick this threading voodoo out of our code again (thinking that my company needs to support this code for the next 15 years causes some kind of pain).

Anyway, up until now we haven't found any reliable way to trigger this behavior on demand.
Although it was possible to produce it at times with SoftPerfect Connection Emulator, pulling the plug, shutting down the network device, disconnecting the VPN,... it wasn't possible to find any method that produces it every time. I tried your tnsnames method too, but wasn't able to (re)produce it either...

If you, or Devart, have any other ideas on how to deal with freezing - or even how to reproduce it - you're more than welcome! Last week we ran into a similar problem, this time with a frozen command execution, but handling it as above at the command level is definitely out of scope; the performance hit would be too high.

BR, Klaus

Re: Connection.Open hanging/freezing

Post by dcoracle600pro » Thu 06 Feb 2014 23:44

Thanks Klaus, appreciate it.

Hopefully we hear back from Devart on this, as it is very disruptive for us as well.

Re: Connection.Open hanging/freezing

Post by Pinturiccio » Fri 07 Feb 2014 15:30

dcoracle600pro wrote: We are able to reproduce the issue by simulating a network issue as follows:

We have reproduced the freezing; however, it does not freeze forever for us. After restoring a correct TNS file, the application soon "unfreezes" and continues to work correctly. However, the more threads open an OracleConnection in the application, the longer the freeze lasts.

Are you sure that your application freezes truly forever?
Please tell us more information about your environment and application:
* Full version of your Oracle Client
* The capacity of Oracle Client
* Full version of Oracle server
* Is Oracle RAC used on your server
* The connection string you use (without credentials)
* The call stack of the frozen thread, where the freezing happens in our code
In order to help us reproduce the issue, please send us a small test project that reproduces the problem.

klaus linzner wrote: If you, or Devart, have any other ideas on how to deal with freezing - or even how to reproduce it - you're more than welcome!

dcoracle600pro has described a way to reproduce the connection freezing. Can your issue be reproduced in the same way?

Please also tell us the information that we asked dcoracle600pro for.

Re: Connection.Open hanging/freezing

Post by dcoracle600pro » Mon 10 Feb 2014 23:48

Hi Pinturiccio

The application does freeze for at least 24 hours (at which point we restart it) in multiple tests we have run.

We suspect the freezing only occurs once we have "corrupted" all the connections in the connection pool (which is accomplished by aggressively calling the web service).

* Full version of your Oracle Client
11.2.0.1.0
* The capacity of Oracle Client
Oracle Runtime Client
* Full version of Oracle server
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
* Is Oracle RAC used on your server
No. Also, this occurs in all our database environments
* The connection string you use (without credentials)

Code:

    <add name="MyConnectionString" connectionString="metadata=res://*/ELECBID.csdl|res://*/ELECBID.ssdl|res://*/ELECBID.msl;provider=Devart.Data.Oracle;provider connection string=&quot;User Id=CHANGED;Password=;Server=MYDP.WORLD;Direct=False;Persist Security Info=True&quot;"
      providerName="System.Data.EntityClient" />

* The call stack of the frozen thread, where the freezing happens in our code

I have attached a debugger, which suggests the call stack is as follows. I am not able to verify this 100%.

Code:

System.Threading.ThreadHelper.ThreadStart (object obj) 
   System.Threading.ExecutionContext.Run (ExecutionContext executionContext, ContextCallback callback, object state) 
    System.Threading.ExecutionContext.Run (ExecutionContext executionContext, ContextCallback callback, object state, bool preserveSyncCtx) 
     System.Threading.ExecutionContext.RunInternal (ExecutionContext executionContext, ContextCallback callback, object state, bool preserveSyncCtx) 
      Devart.Data.Oracle.bf.b (object A_0) 
       OciDynamicType.OCIServerAttach (HandleRef, HandleRef, byte[], int, uint) 
        (Unmanaged code)  
         Devart.Data.Oracle.bf.b (object A_0) 
          OciDynamicType.OCIHandleFree (HandleRef, int)

Regards
Sam

Re: Connection.Open hanging/freezing

Post by Pinturiccio » Tue 11 Feb 2014 12:56

dcoracle600pro wrote:
* The capacity of Oracle Client
Oracle Runtime Client

Please tell us the capacity (bitness) of Oracle Client: x86 or x64.

Re: Connection.Open hanging/freezing

Post by dcoracle600pro » Tue 11 Feb 2014 21:14

x64

Re: Connection.Open hanging/freezing

Post by Pinturiccio » Wed 12 Feb 2014 13:41

We recommend setting "Connection Timeout=0;" in the connection string. Please tell us if it helps.

Re: Connection.Open hanging/freezing

Post by dcoracle600pro » Fri 14 Feb 2014 03:02

Our initial testing looks like this may have resolved the issue!

However, I am a little concerned about turning off connection timeout. Can you comment on what possible side effects we may encounter? Just want to know what to expect before we put this in production.

Thank you very much for your effort so far.

Shalex
Site Admin
Posts: 9543
Joined: Thu 14 Aug 2008 12:44

Re: Connection.Open hanging/freezing

Post by Shalex » Tue 18 Feb 2014 11:38

You have encountered a rare problem in our current implementation of Connection Timeout. We are planning to fix the provider's behaviour so that using Connection Timeout with a non-zero value is safe in your scenario.

With the "Connection Timeout=0;" option in a connection string, the connection timeout is governed by the settings of the Oracle client (not by our provider).
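
As an illustration only, the option could be applied to an existing connection string in code as sketched below. `ConnectionStrings.WithoutProviderTimeout` is a hypothetical helper name; it relies on the generic .NET `DbConnectionStringBuilder` rather than on any provider-specific class, so it only rewrites the string and does not validate keywords. With the provider-side timeout disabled, any upper bound on connect time would have to come from the Oracle client, for example an Oracle Net parameter such as SQLNET.OUTBOUND_CONNECT_TIMEOUT in sqlnet.ora; whether that suits a given environment is an assumption to verify.

```csharp
using System;
using System.Data.Common;

public static class ConnectionStrings
{
    // Returns a copy of the given connection string with "Connection Timeout" set to 0.
    // An existing "Connection Timeout" entry is replaced, because keyword lookup in
    // DbConnectionStringBuilder is case-insensitive.
    public static string WithoutProviderTimeout(string baseConnectionString)
    {
        var builder = new DbConnectionStringBuilder
        {
            ConnectionString = baseConnectionString
        };
        builder["Connection Timeout"] = 0;
        return builder.ConnectionString;
    }
}
```

The result can then be passed to the connection as usual, for example `new OracleConnection(ConnectionStrings.WithoutProviderTimeout(cs))`, where `cs` stands for your own connection string.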

clint_silver
Posts: 9
Joined: Mon 24 Feb 2014 15:06

Re: Connection.Open hanging/freezing

Post by clint_silver » Mon 24 Feb 2014 16:32

Sorry for bringing up an old thread, but was this ever resolved? We are getting the exact same symptoms with intermittent connection failures, on connections both inside and outside the firewall. It has been happening for a long time, with no noticeable changes.

It affects multiple clients; it might be 2-3 times a day, it might be once a week. The app was working fine earlier that day and works fine on restart.

Error is always
Devart.Data.Oracle.OracleException (0x80004005): Server did not respond within the specified timeout interval

Versions:
Devart 8.1.55.0 and 7.8.287.0
Oracle server 11.2.0.2.0 (RAC)
Oracle client 11.2.0.1.0

I can see 2 types of behaviour when we get the issue. We use SCAN listeners with RAC.
1. I can see the SCAN listener log taking the connection entry but not passing it on to the local node listener.
2. I can see the SCAN listener log entry AND the local node entry, so I don't think it's related to RAC.

In both cases, the Devart command timeout kicks in after 15 seconds, apparently thinking it has a dead connection. Many other clients connected fine at the same time, with no load on client, server or firewall.

Re: Connection.Open hanging/freezing

Post by Pinturiccio » Wed 26 Feb 2014 15:56

The OracleConnection class has a 'Connection Timeout' property. Its default value is 15 seconds. If a connection is not established within 15 seconds, the "Server did not respond within the specified timeout interval" exception is thrown.

We recommend setting "Connection Timeout=0;" in the connection string. A value of 0 indicates no limit.
