Request hang - OCIFREEHANDLE

Discussion of open issues, suggestions and bugs regarding ADO.NET provider for Oracle
Post Reply
peledkfir
Posts: 22
Joined: Sat 15 Jan 2011 11:06

Request hang - OCIFREEHANDLE

Post by peledkfir » Sat 15 Jan 2011 11:33

Details:
Windows server 2008 64 bit.
.net 4 (using WCF & EF) application host in IIS 7.
Oracle server 11.2g.
dotConnect ver 6.0.70.0.
using threadpool = off (in connection string. In old version it can give you connections that are already participating in transaction. didn't turned it on since then)

Scenario:
Client request times-out after 10 minute.
Although the request is saved in DB.

Research:
After taking dump file using windows debugging tools with adplus.exe.
We can see the following details:

Call stack of the request called CloseConnection of the context. This is the rest of the stack:

0:028> !clrstack
OS Thread Id: 0x20c8 (28)
Child SP IP Call Site
00000000098ecea8 0000000077be6d5a [NDirectMethodFrameStandalone: 00000000098ecea8] OciDynamicType.nativeOCIHandleFree(System.Runtime.InteropServices.HandleRef, Int32)
00000000098ece70 000007ff0022e492 DomainBoundILStubClass.IL_STUB_PInvoke(System.Runtime.InteropServices.HandleRef, Int32)
00000000098ecf40 000007ff005bcae9 OciDynamicType.OCIHandleFree(System.Runtime.InteropServices.HandleRef, Int32)
00000000098ecf80 000007ff00558110 Devart.Data.Oracle.a2.b()*** ERROR: Module load completed but symbols could not be loaded for Devart.Data.Oracle.dll

00000000098ed180 000007ff00557a83 Devart.Data.Oracle.OracleInternalConnection.a(Boolean)
00000000098ed1d0 000007ff0055775e Devart.Common.DbConnectionInternal.CloseInternalConnection()
00000000098ed250 000007ff005574ff Devart.Common.DbConnectionInternal.Close()
00000000098ed2a0 000007ff0017a637 Devart.Common.DbConnectionBase.Close()
00000000098ed2e0 000007fee30e6555 System.Data.EntityClient.EntityConnection.StoreCloseHelper()*** WARNING: Unable to verify checksum for System.Data.Entity.ni.dll

00000000098ed340 000007fee30e64d3 System.Data.EntityClient.EntityConnection.CloseHelper()

***********************
You can see the call for OciDynamicType.OCIHandleFree.

Looking on the unmanaged stack:

0:028> kb
RetAddr : Args to Child : Call Site
00000000`77bc595a : 00000000`00000000 00000000`0c8c79a0 00000000`098ec830 00000000`0c8c7970 : ntdll!ZwWaitForSingleObject+0xa
00000000`77bc5851 : 00000000`00000000 00000000`0c8bbd40 00000000`00000003 00000000`09132b00 : ntdll!RtlpWaitOnCriticalSection+0xea
*** ERROR: Symbol file could not be found. Defaulted to export symbols for oracore11.dll -
00000000`04f96470 : 00000000`09ab8080 00000000`09132b00 00000000`0c8bbd40 00000000`00000000 : ntdll!RtlEnterCriticalSection+0xf4
*** ERROR: Symbol file could not be found. Defaulted to export symbols for OraClient11.Dll -
00000000`04b77985 : 00000000`00000009 00000000`05bab2a0 00000000`0c8c79a0 00000000`04f9644f : oracore11!sltsmna+0x1c
00000000`04b72393 : 00000000`09ab8080 00000000`00000000 00000000`04e25cb0 00000000`04f9644f : OraClient11!kpufhndl0+0x5a3d
00000000`04b71f34 : 00000000`0c8bbd40 00000000`0c8bbe90 00000000`00000000 00000000`04a35677 : OraClient11!kpufhndl0+0x44b
00000000`04a35a7d : 00000000`041a143f 00000000`0c8bbd40 00000000`098ece10 00000000`098ecf60 : OraClient11!kpufhndl+0xe
*** ERROR: Symbol file could not be found. Defaulted to export symbols for oci.dll -
00000000`041a34c5 : 00000000`098ee140 000007fe`f9d417c7 00000000`0468f380 00000000`098ee140 : OraClient11!OCIHandleFree+0x15
000007fe`f9d417c7 : 00000000`098ecda0 00000001`4d90a938 00000000`00000000 00000000`00000000 : oci!OCIHandleFree+0x71

***********************

oci!OCIHandleFree is waiting for SingleObject (entering critical section).

Searching for that critical section shows that:

0:028> !critsec 00000000`09132b00

CritSec +9132b00 at 0000000009132b00
WaiterWoken No
LockCount 1
RecursionCount 1
OwningThread ef4
EntryCount 0
ContentionCount 2
*** Locked
0:028> !threads -special
ThreadCount: 28
UnstartedThread: 0
BackgroundThread: 16
PendingThread: 0
DeadThread: 12
Hosted Runtime: no

OSID Special thread type
9 3434 Gate
10 26ec DbgHelper
11 1358 GC
12 13b0 GC
13 ef4 Finalizer
14 1b30 ProfilingAPIAttach
15 3a04 Timer
16 1e38 ADUnloadHelper
17 3b98 ThreadpoolWorker
22 2ef8 ThreadpoolWorker
23 1340 ThreadpoolWorker
24 16fc ThreadpoolWorker
28 20c8 IOCompletion
30 a54 IOCompletion
31 2eb0 IOCompletion
32 3648 IOCompletion

***********************

That means that the user request is blocked by the finalizer thread.

Conclusion:
It is known that letting the finalizer dispose ODP.NET object can cause such reactions. That it does not handle very good. So it is good practice to dispose any ODP object explicitly.
But when using EF you don't have full control of the ADO objects (like commands). And if he doesn't dispose explicitly I have a problem.
It seems I got same behaviour using dotConnect.

Questions:
1. Is it a known bug?
2. Does direct mode can help solve this problem (If so, does direct mode can screw other things?) ?

Answer for 2:
Direct mode didn't helped. It got more timeouts then before. I quit trying that direction to solve the problem.

StanislavK
Devart Team
Posts: 1710
Joined: Thu 03 Dec 2009 10:48

Post by StanislavK » Tue 18 Jan 2011 15:44

We couldn't reproduce the problem in our environment. Could you please describe the situation in which it occurs in more details? E.g., how you are creating the ObjectContext instance, what actions you are performing with it, and what kind of queries you are executing. If it is possible, please send us a test project with which the problem can be reproduced.

peledkfir
Posts: 22
Joined: Sat 15 Jan 2011 11:06

Post by peledkfir » Tue 18 Jan 2011 21:45

The problem occurs during stress testing to the application.
It randomly happens between 4 hours to 12 hours from the start of the test.

I'm creating the auto generated context of entity framework (not with TT, the one from entity framework 3.5).
Basically the service, for each request, save changes in 2 different models and execute StoreProcedure (using OracleCommand, ADO) in transaction (using transaction scope).

I know that not explicitly disposing OracleCommand, OracleConnection etc objects reproduce the hang faster. Because when I tried to solve it, I searched in the code for places that didn't disposed those objects. When eliminating all of them the hang occurs only one time in 12 hours stress test instead of 5 times or so.

Now I can't find in my code any place that I'm not disposing those objects (including EF's auto generated contexts).

This is serious problem: Because I get this hang after the transaction is committed in the DB, but the requesting system of the service gets time out and roll-backing theirs. That cause a lack of synchronization in my application.

I'll try to build a test project to reproduce it. I'll get back to you if I'll succeed.

peledkfir
Posts: 22
Joined: Sat 15 Jan 2011 11:06

Post by peledkfir » Wed 02 Feb 2011 21:25

After applying the workaround I mentioned in other post:
http://www.devart.com/forums/viewtopic.php?t=20005
It seems the problem does not reproduces any more.

AndreyR
Devart Team
Posts: 2919
Joined: Mon 07 Jul 2008 13:16

Post by AndreyR » Thu 03 Feb 2011 12:52

Glad to hear that the issue is resolved.
Thank you for the notification.

Post Reply