Connection.Open hanging/freezing

Discussion of open issues, suggestions and bugs regarding ADO.NET provider for Oracle
clint_silver
Posts: 9
Joined: Mon 24 Feb 2014 15:06

Re: Connection.Open hanging/freezing

Post by clint_silver » Thu 27 Feb 2014 08:47

In that case I'd recommend setting it to something like 70 seconds. If the program has frozen and an organization has the sqlnet timeout at the default 60 seconds, then you should see a timeout after 60 seconds. However (I'm an old DBA), it may be that some companies have the sqlnet timeout set to some rather large figure, in which case the frozen Devart connection will hang for that period of time. 70 seconds is a rather distinct period that isn't too long but is enough to kill the program rather than leaving it frozen.

Pinturiccio
Devart Team
Posts: 2420
Joined: Wed 02 Nov 2011 09:44

Re: Connection.Open hanging/freezing

Post by Pinturiccio » Mon 03 Mar 2014 10:49

dotConnect for Oracle has its own connection timeout. The 'Connection Timeout' property of the OracleConnection class has a default value of 15 seconds. If a connection is not established within this period, dotConnect for Oracle throws an exception.
Pinturiccio wrote: A value of 0 indicates no limit.
This means that there is "no limit" only on the dotConnect side. In practice there is still a limit because of Oracle Client. The connection attempt is interrupted as soon as a timeout is reached in either dotConnect for Oracle or Oracle Client.
If you set "Connection Timeout=0;", only the dotConnect for Oracle timeout is disabled. dotConnect for Oracle will then wait for the connection until the Oracle Client timeout is reached.
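
As a rough illustration of how the two limits combine, here is a minimal Python sketch. The function name is hypothetical, and the Oracle Client value of 60 seconds is only an assumption taken from the sqlnet default mentioned earlier in the thread. A dotConnect value of 0 is treated as "no limit", as described above; `None` stands for "no client-side limit configured", which is also an assumption.

```python
def effective_connect_timeout(dotconnect_timeout_s, client_timeout_s):
    """Return the limit (in seconds) that fires first, or None if neither applies.

    dotconnect_timeout_s: 'Connection Timeout' value; 0 means no dotConnect limit.
    client_timeout_s: assumed Oracle Client limit; None means none configured.
    """
    limits = []
    if dotconnect_timeout_s != 0:
        limits.append(dotconnect_timeout_s)
    if client_timeout_s is not None:
        limits.append(client_timeout_s)
    # Whichever layer reaches its limit first interrupts the attempt.
    return min(limits) if limits else None


print(effective_connect_timeout(15, 60))  # 15
print(effective_connect_timeout(0, 60))   # 60
```

With "Connection Timeout=0;" the only remaining limit in this model is the client-side one, which matches the explanation above.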

clint_silver
Posts: 9
Joined: Mon 24 Feb 2014 15:06

Re: Connection.Open hanging/freezing

Post by clint_silver » Thu 06 Mar 2014 11:25

So we set up 2 programs connecting to the database through the firewall. All the programs do is
Hi database
Bye database.

This runs every second. I can see the connections coming in on the listener log, and the firewall sees connections coming in and being returned.

They ran for a few days with no issue, with output to a log. We got a timeout with 1 of them this morning.

This is the config of the one that timed out.

<DevArtConnection.Properties.Settings>
<setting name="ConnectionString" serializeAs="String">
<value>User Id=<username>;Password=<password>;Data Source=<instance>;Connection Timeout=90;Default Command Timeout=0;Pooling=false;Oci Session Pooling=false</value>
</setting>
</DevArtConnection.Properties.Settings>



06/03/2014 08:07:02 - Connecting... (Timeout = 90)
06/03/2014 08:07:02 - Connected suscesfully
06/03/2014 08:07:02 - Closing...
06/03/2014 08:07:02 - Clossed
06/03/2014 08:07:03 - Connecting... (Timeout = 90)
06/03/2014 08:07:24 - Exception, Devart.Data.Oracle.OracleException (0x80004005): ORA-12170: TNS:Connect timeout occurred
at Devart.Data.Oracle.ao.a(b5 A_0, o A_1)
at Devart.Data.Oracle.OracleInternalConnection..ctor(b5 connectionOptions, OracleInternalConnection proxyConnection)
at Devart.Data.Oracle.an.a(p A_0, Object A_1, DbConnectionBase A_2)
at Devart.Common.DbConnectionFactory.a(DbConnectionBase A_0, p A_1)
at Devart.Common.DbConnectionFactory.b(DbConnectionBase A_0)
at Devart.Common.DbConnectionClosed.Open(DbConnectionBase outerConnection)
at Devart.Common.DbConnectionBase.Open()
at Devart.Data.Oracle.OracleConnection.Open()
at DevArtConnection.Program.MakeConnection() in c:\Users\conn_test.exe
06/03/2014 08:07:24 - Closing...
06/03/2014 08:07:24 - Clossed
06/03/2014 08:07:25 - Connecting... (Timeout = 90)
06/03/2014 08:07:26 - Connected suscesfully



The second Devart program was running beside it on the same machine, in a different DOS window.

config

<userSettings>
<DevArtConnection.Properties.Settings>
<setting name="ConnectionString" serializeAs="String">
<value>User Id=<username>;Password=<password>;Data Source=<instance>;Pooling=false;Oci Session Pooling=false</value>
</setting>
</DevArtConnection.Properties.Settings>
</userSettings>


That one didn't time out when the other one did. The only difference is that the one that didn't time out uses the default timeout parameters.

In our main app, which gets this issue intermittently, the default timeout is used. I'd fully expect the default one to time out at some point as well.

Also note that the connection timed out after what looks like 21 seconds. This does not correlate with any timeout setting that we have at any layer.

06/03/2014 08:07:03 - Connecting... (Timeout = 90)
06/03/2014 08:07:24 - Exception, Devart.Data.Oracle.OracleException (0x80004005): ORA-12170: TNS:Connect timeout occurred
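
The 21-second figure can be checked directly from the two log lines above. This is a small Python sketch; it assumes the log uses day/month/year timestamps, which fits the 6 March 2014 post date.

```python
from datetime import datetime

LOG_TIME_FORMAT = "%d/%m/%Y %H:%M:%S"  # assumption: dd/mm/yyyy, as in the log


def elapsed_seconds(start_stamp, end_stamp):
    """Seconds between two timestamps taken from the connection log."""
    start = datetime.strptime(start_stamp, LOG_TIME_FORMAT)
    end = datetime.strptime(end_stamp, LOG_TIME_FORMAT)
    return int((end - start).total_seconds())


# "Connecting..." line versus the ORA-12170 exception line
print(elapsed_seconds("06/03/2014 08:07:03", "06/03/2014 08:07:24"))  # 21
```

So the exception arrived well before the configured 90 seconds and also sooner than a 60-second sqlnet default would suggest.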


On my SCAN listener log, which both connections are hitting:
I see 08:07:01 both connections come in
I see 08:07:02 both connections come in
I see 08:07:03 both connections come in
... only the default connection comes in until 08:07:26, when both connections start coming in again

But if I look at the local listener log that the SCAN hands off to:

I see 08:07:01 both connections come in
I see 08:07:02 both connections come in
I see 08:07:03 only the default connection has made it to here, so the SCAN listener got the connection but has not handed it off to the local listener
... default connection only until 08:07:26, when both connections are coming in again

What could intermittently stop an individual connection from being handed off to the local listener? We have Wireshark logs and are examining them.
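
The comparison of the two listener logs amounts to a set difference. Here is a minimal Python sketch using only the entries described above. The labels "t90" (the program with Connection Timeout=90) and "default" are made up for illustration, and real listener logs would need to be parsed into this shape first.

```python
# (time, client) pairs as described in the post, up to the failed attempt.
scan_listener = {
    ("08:07:01", "t90"), ("08:07:01", "default"),
    ("08:07:02", "t90"), ("08:07:02", "default"),
    ("08:07:03", "t90"), ("08:07:03", "default"),
}
local_listener = {
    ("08:07:01", "t90"), ("08:07:01", "default"),
    ("08:07:02", "t90"), ("08:07:02", "default"),
    ("08:07:03", "default"),
}


def missing_handoffs(scan, local):
    """Connections the SCAN listener saw that never reached the local listener."""
    return sorted(scan - local)


print(missing_handoffs(scan_listener, local_listener))  # [('08:07:03', 't90')]
```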

Pinturiccio
Devart Team
Posts: 2420
Joined: Wed 02 Nov 2011 09:44

Re: Connection.Open hanging/freezing

Post by Pinturiccio » Thu 13 Mar 2014 13:00

Establishing a connection can take a long time for various reasons. We don't know why opening a connection took more than 90 seconds for you in one case over a few days. Probably your server or network was overloaded, or a limit on simultaneously open connections was reached.

If opening a connection can take more than the default 15 seconds in your environment, we recommend using a greater value for the 'Connection Timeout' connection string parameter.

klaus linzner
Posts: 28
Joined: Thu 16 May 2013 09:18

Re: Connection.Open hanging/freezing

Post by klaus linzner » Thu 13 Mar 2014 13:56

Please don't get me wrong - but the problem could be devart.data.* as well.

Our application consists of several services. At one point or the other, each service talks to the database. For example - take the following setup:
Service "ServiceA" connects to the database "ORAA" with login "User1" and "User2"
Service "ServiceB" connects to the database "ORAA" with login "User2" and "User3"
Both run on the same machine.
After connection issues we sometimes run into the situation that ServiceA still works fine but "ServiceB" can't connect anymore with login "User2". "User2" is working well when connecting from "ServiceA". "ServiceB" also has no troubles connecting as "User3".
So far we've tried many of the connectionString settings: pooling on/off, validation on/off, connection timeout on/off. But at one point or another we run into problems with oracleConnection once a network problem has occurred (even after the network has been restored).

In my opinion it is not acceptable that a non-pooled connection never returns from a call to connection.open (the initial reason for this thread) while another thread in the same application happily closes/opens connections to the same database with the same user, using a connectionString altered just enough that the two are not the same.

Our current interaction with devart.data.oracleconnection is that we call open in a separate task, call close in a separate task and call dispose in a separate task (and of course the same with components that use oracleconnection, like OracleAlerter). This won't fix any problem by itself - it just helps us to know that we're lost.
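
The "call it in a separate task" approach can be sketched in a language-neutral way. The Python below is not Devart or .NET code; `fake_open_that_hangs` is a made-up stand-in for a blocking open call. It shows the limitation mentioned above: the caller learns that the call is stuck, but the stuck call itself is abandoned, not cancelled.

```python
import threading
import time


def call_with_deadline(func, timeout_s):
    """Run func() on a daemon thread and wait at most timeout_s seconds.

    Returns (finished, value). If finished is False the call is still
    blocked; the worker thread is abandoned, not cancelled.
    """
    outcome = {}

    def worker():
        try:
            outcome["value"] = func()
        except Exception as exc:  # hand the error back to the caller
            outcome["value"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)
    if thread.is_alive():
        return False, None
    return True, outcome["value"]


def fake_open_that_hangs():
    """Hypothetical stand-in for an open call that does not return in time."""
    time.sleep(1.0)
    return "connected"


if __name__ == "__main__":
    print(call_with_deadline(lambda: "connected", 0.5))    # (True, 'connected')
    print(call_with_deadline(fake_open_that_hangs, 0.2))   # (False, None)
```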
Besides this, we need to close our connections periodically (because they get slower, they may leak temp tablespace and they may leak memory).

Instead of calling a simple one-liner like "connection.open()", we're required to build an unnecessarily huge library (that introduces bugs as well) to deal with the above-mentioned issues - and we still can't work around every issue or deal with every exception, as even CorruptedStateExceptions can be thrown.

Please don't take it personally, but reading "increase connection timeout" as answer seems like a slap in the face. Why don't you (or your developers) just sit down for a day and try to reproduce (one of) the problems?
Not trying to show that your code works but try to break it.
Throw more load at the server than it can handle.
Interrupt the network.
Pull the plug.
Misconfigure the firewall.
Simulate a crappy network connection.

I'm sure you'll stumble upon some of our problems...

Pinturiccio
Devart Team
Posts: 2420
Joined: Wed 02 Nov 2011 09:44

Re: Connection.Open hanging/freezing

Post by Pinturiccio » Fri 14 Mar 2014 14:39

klaus linzner wrote: After connection issues we sometimes run into the situation that ServiceA still works fine but "ServiceB" can't connect anymore with login "User2". "User2" is working well when connecting from "ServiceA". "ServiceB" also has no troubles connecting as "User3".
So far we've tried many of the connectionString settings. Pooling on/off , Validation on/off, connection timeout on/off - but at one point or the other we're running into problems with oracleConnection once a network problem has occurred (but has been restored).
Please tell us what exception occurs when you can't connect with "User2" from ServiceB. If the application freezes, please provide us with a snippet of the code where the freeze happens.
klaus linzner wrote: Why don't you (or your developers) just sit down for a day and try to reproduce (one of) the problems?
Not trying to show that your code works but try to break it.
Throw more load at the server than it can handle.
Interrupt the network.
Pull the plug.
Misconfigure the firewall.
Simulate a crappy network connection.
Please send us a small test project with the corresponding DDL/DML scripts for reproducing the issue. Please also describe the steps for reproducing the issue consistently.

klaus linzner
Posts: 28
Joined: Thu 16 May 2013 09:18

Re: Connection.Open hanging/freezing

Post by klaus linzner » Wed 19 Mar 2014 23:32

Pinturiccio wrote: Please tell us what exception occurs when you can't connect with "User2" from ServiceB. If the application freezes, please provide us with a snippet of the code where the freeze happens.
No exception occurs when "User2" on ServiceB can't connect. The call to open is made, but it never (at least not within several hours) returns.
Pinturiccio wrote: Please send us a small test project with the corresponding DDL/DML scripts for reproducing the issue. Please also describe the steps for reproducing the issue consistently.
As said before - we're not able to reproduce the issue with a short snippet. Is it OK to create a test project that runs for several minutes across several threads while you simulate a lost/faulty/broken network connection?

Pinturiccio
Devart Team
Posts: 2420
Joined: Wed 02 Nov 2011 09:44

Re: Connection.Open hanging/freezing

Post by Pinturiccio » Thu 20 Mar 2014 16:52

klaus linzner wrote: Is it OK to create a test project that runs for several minutes across several threads while you simulate a lost/faulty/broken network connection?
It's currently the only way to reproduce the issue, so send us your project and describe the steps for reproducing the issue.

klaus linzner
Posts: 28
Joined: Thu 16 May 2013 09:18

Re: Connection.Open hanging/freezing

Post by klaus linzner » Fri 21 Mar 2014 07:51

Pinturiccio wrote:It's currently the only way to reproduce the issue, so send us your project and describe the steps for reproducing the issue.
Thanks. I'm gonna try the suggestion you gave in the other thread (http://forums.devart.com/viewtopic.php?f=1&t=29156) first and get back to you.

clint_silver
Posts: 9
Joined: Mon 24 Feb 2014 15:06

Re: Connection.Open hanging/freezing

Post by clint_silver » Mon 24 Mar 2014 10:07

klaus linzner wrote:
Pinturiccio wrote: Please tell us what exception occurs when you can't connect with "User2" from ServiceB. If the application freezes, please provide us with a snippet of the code where the freeze happens.
No exception occurs when "User2" on ServiceB can't connect. The call to open is made, but it never (at least not within several hours) returns.
Pinturiccio wrote: Please send us a small test project with the corresponding DDL/DML scripts for reproducing the issue. Please also describe the steps for reproducing the issue consistently.
As said before - we're not able to reproduce the issue with a short snippet. Is it OK to create a test project that runs for several minutes across several threads while you simulate a lost/faulty/broken network connection?
I'm able to simulate it happening with a connection program.

We have no load on the client, server or firewall. No dropped packets at the firewall. We took PCAPs on the client and the DB server and analyzed them in Wireshark: no packet errors at the time of the timeout. There are some infrequent malformed TNS packets in the PCAP from the client, but not at the time of the timeout, and we usually see these anyway. Time is synced across servers.

We have many different applications hitting the database; only Devart connections are experiencing this intermittent timeout. It can't be reproduced at will, except by letting the connection program run every second, opening a connection to the database and immediately closing it, and eventually we will see it.
So every second you see

1.hello
2.goodbye
3.hello
4.goodbye
....
....
....
<some time later>. timeout.
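
The hello/goodbye loop described above can be sketched as follows. This is a Python illustration, not the actual test program: `connect` is a hypothetical callable that returns an object with a `close()` method, and the log wording only loosely follows the log posted earlier in the thread.

```python
import time
from datetime import datetime


def _stamp():
    # Same day/month/year layout as the log posted earlier in the thread.
    return datetime.now().strftime("%d/%m/%Y %H:%M:%S")


def probe(connect, iterations, interval_s=1.0, log=print):
    """Open and immediately close a connection `iterations` times.

    Returns the number of attempts that raised an exception.
    """
    failures = 0
    for _ in range(iterations):
        log(f"{_stamp()} - Connecting...")
        try:
            conn = connect()      # "hello"
            log(f"{_stamp()} - Connected")
            conn.close()          # "goodbye"
            log(f"{_stamp()} - Closed")
        except Exception as exc:
            failures += 1
            log(f"{_stamp()} - Exception, {exc!r}")
        time.sleep(interval_s)
    return failures
```

Running two copies side by side with different timeout settings, as described above, then comes down to comparing the failure counts and timestamps of the two logs.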

Note, connections from outside firewall timeout more frequently than connections from inside but I don't know why this is.

Do you have somewhere I can send the 2 test programs? The only difference between them is the timeout settings in the config.

clint_silver
Posts: 9
Joined: Mon 24 Feb 2014 15:06

Re: Connection.Open hanging/freezing

Post by clint_silver » Mon 24 Mar 2014 17:42

OK, got my network guys to have another look at the PCAP and the malformed TNS packets, and using some Wireshark filters they have found a malformed TNS packet at the time of the timeout.

We also have other malformed packets for connections that didn't time out, but maybe there is some difference there.

My network guy can export the connection streams where we have an example of both. Do you have somewhere I can send these so you can inspect them?

Pinturiccio
Devart Team
Posts: 2420
Joined: Wed 02 Nov 2011 09:44

Re: Connection.Open hanging/freezing

Post by Pinturiccio » Thu 27 Mar 2014 14:47

clint_silver wrote: OK, got my network guys to have another look at the PCAP and the malformed TNS packets, and using some Wireshark filters they have found a malformed TNS packet at the time of the timeout.
We have analyzed your packet captures, but they do not help us investigate the issue.
In OCI connection mode, dotConnect for Oracle sends data to Oracle Client; Oracle Client then establishes the connection with the server and creates the TNS packets. dotConnect for Oracle does not influence TNS packet generation in OCI mode.
clint_silver wrote: Do you have somewhere I can send the 2 test programs? The only difference between them is the timeout settings in the config.
You can send your 2 test programs via this form.

clint_silver
Posts: 9
Joined: Mon 24 Feb 2014 15:06

Re: Connection.Open hanging/freezing

Post by clint_silver » Thu 27 Mar 2014 15:08

OK, thanks for the reply. I was doubtful the packets would help. I take your point that you only send the connection via TNS.

We have an Oracle SCAN listener with 5 IPs. As I had noted, we send to only 1 of the addresses, which is routed to a single node of the RAC where the service is configured. I have set up another connection program which points directly at the VIP address of the database node. This bypasses the SCAN.

When we get a good connection we see:

1. Connection leaves client
2. Goes through firewall
3. Hits SCAN listener
4. SCAN passes off connection to local listener on DB node

A bad connection, where we get a timeout, is:

1. Connection leaves client
2. Goes through firewall
3. Hits SCAN listener
4. Does not get passed off to local listener

This new connection program bypasses step 3. As we see the timeout once every couple of days, if we go for a period of time without seeing the issue we might get some confidence that it's the SCAN layer that's causing it. We still don't have an answer as to why, given that the program connects every second. Other than the random malformed packet we have no lead, and we don't know how that packet is being formed. I might open a call with Oracle.
