AccessViolationException / SEHException while calling open/close

Postby klaus linzner » Thu 16 May 2013 10:55

Hello,
We have a rather large application that works with multiple parallel OracleConnections. Every few days we get an unrecoverable AccessViolationException while calling a connection's Open or Close method. Although we're not able to reproduce the problem reliably in the application itself, we were able to create a small sample application that at least forces the AccessViolationException (rarely an SEHException) while calling Close().
In general we're running several parallel tasks, each of which loops on connection open, close and dispose.
After about 16300 open/close cycles (which should take less than a minute) things start getting messy:
At first I get exceptions saying that the server did not respond within the specified timeout interval, then all hell breaks loose:
ORA-01041 (internal error), ORA-12536 (operation would block), ORA-12560 (TNS protocol adapter error) and ORA-12650 (encryption and data integrity) are thrown seemingly at random. It's usually not all of these error codes, but most of them.
I don't care that much about those error codes (at least not in this sample context). The problem is that several seconds after those errors start occurring, I get an unhandled exception:

With Devart 7.7.226 I get the following:
Unhandled Exception: System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at OciDynamicType.nativeOCIHandleFree(HandleRef , Int32 )
   at OciDynamicType.OCIHandleFree(HandleRef , Int32 )
   at Devart.Data.Oracle.ay.c()
   at Devart.Data.Oracle.ao.c()
   at Devart.Data.Oracle.OracleInternalConnection.a(Boolean A_0)
   at Devart.Common.DbConnectionInternal.CloseInternalConnection()
   at Devart.Common.DbConnectionInternal.Close()
   at Devart.Common.DbConnectionBase.Close()


With Devart 7.2.96 and 7.2.114 I sometimes got this one instead:
Exception Info: System.Runtime.InteropServices.SEHException
Stack:
   at OciDynamicType.nativeOCIHandleFree(System.Runtime.InteropServices.HandleRef, Int32)
   at OciDynamicType.nativeOCIHandleFree(System.Runtime.InteropServices.HandleRef, Int32)
   at OciDynamicType.OCIHandleFree(System.Runtime.InteropServices.HandleRef, Int32)
   at Devart.Data.Oracle.bf.b(System.Object)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
   at System.Threading.ThreadHelper.ThreadStart(System.Object)


I thought that maybe the installed Oracle Drivers could be the issue, but I ran the sample application with Oracle's ODP as well: It throws the ORA errors as well, but as soon as the server is ready again, it's connecting again. As of now I have installed just one ORA Client so the driver itself shouldn't be the issue. (if there's a way to specify a specific ORA_HOME for Devart to use I'll gladly try that too)

Can you please take a look at this issue and provide a bugfix and/or a workaround to resolve this? If you have any problem reproducing the error feel free to use my sample (see below) or ask.

Looking forward to hearing from you,
Klaus

Here's the sample I ran in a .NET 4.5 console application (x86 and x64)
using System;
using System.Threading;
using System.Threading.Tasks;
using Devart.Data.Oracle;

namespace Samples.MemoryTest
{
    class SimpleMemoryTest
    {
        private const string ConnectionStringTemplate = "Data Source=TNSNAME;User ID=YOURUSER;Password=YOURPASS;Unicode=True;Pooling=False";
        private const int NumberOfParallelExecutions = 90;

        private static string connectionString;
        private static int countGlobalOpen = 0;

        static void Main(string[] args)
        {
            Console.WriteLine("Will start {0} tasks that run parallel and that'll just open and close the connection", NumberOfParallelExecutions);
            ShowDevArtVersion();
            RegisterGlobalExceptions();

            //Make sure that pooling is disabled, whatever the connection string says;
            OracleConnectionStringBuilder csb = new OracleConnectionStringBuilder(ConnectionStringTemplate);
            csb.Pooling = false;
            connectionString = csb.ConnectionString;

            for (int i = 0; i < NumberOfParallelExecutions; i++)
            {
                int currentInstanceNumber = i;
                Task suite = new Task(() => StartTestSuite(currentInstanceNumber), TaskCreationOptions.PreferFairness);
                suite.Start();
            }

            Console.WriteLine("Tasks started - we'll see what happens");
            Console.ReadLine();
        }

        private static void StartTestSuite(int currentInstanceNumber)
        {
            Console.WriteLine("Task {0} started OpenAndClose", currentInstanceNumber);
            int countInstanceOpen = 0;
            while (true)
            {
                try
                {
                    using (OracleConnection connection = new OracleConnection(connectionString))
                    {
                        connection.Open();
                        #region Some logging
                        countInstanceOpen++;
                        if (countInstanceOpen % 100 == 0)
                        {
                            Console.WriteLine("Task {0} opened {1} times", currentInstanceNumber, countInstanceOpen);
                        }
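                        // Use the value returned by Interlocked.Increment so the check
                        // is not affected by concurrent increments from other tasks.
                        int globalOpen = Interlocked.Increment(ref countGlobalOpen);
                        if (globalOpen % 1000 == 0)
                        {
                            Console.WriteLine("All tasks opened {0} times", globalOpen);
                        }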
                        #endregion
                        connection.Close();
                    }
                }
                catch (Exception ex)
                {
                    Console.WriteLine("Ex after {0} task, {1} global connections: {2}", countInstanceOpen, countGlobalOpen, ex.Message);
                }
            }
        }

        private static void RegisterGlobalExceptions()
        {
            AppDomain.CurrentDomain.UnhandledException += CurrentDomainOnUnhandledException;
        }

        private static void CurrentDomainOnUnhandledException(object sender, UnhandledExceptionEventArgs unhandledExceptionEventArgs)
        {
            ShowDevArtVersion();
            if (unhandledExceptionEventArgs.IsTerminating)
            {
                Console.WriteLine("Can't recover from and can't handle UnhandledDomainException {0}", unhandledExceptionEventArgs.ExceptionObject);
            }
            else
            {
                Console.WriteLine("Unhandled Domain Exception : {0}", unhandledExceptionEventArgs.ExceptionObject);
            }
        }

        private static void ShowDevArtVersion()
        {
            Console.WriteLine("Using Devart Version {0}", typeof(OracleConnection).Assembly.GetName().Version);
            Console.WriteLine("(Check binding redirect in app.config if this is not the desired version)");
        }
    }
}

klaus linzner
 
Posts: 28
Joined: Thu 16 May 2013 09:18

Re: AccessViolationException / SEHException while calling open/close

Postby Pinturiccio » Mon 20 May 2013 14:57

We are trying, but we can't reproduce the issue with your code. It is executed without any errors.
As the ODP.NET provider throws the same errors, the reason is most likely in the client. Please provide the following information:
1. The exact version of your Oracle client;
2. The capacity of your Oracle client;
3. The Oracle database version;
4. Is the issue reproduced if you use Oracle client of another version?
5. Try using the Direct mode. Is the issue reproduced in this case? For more information, please refer to http://www.devart.com/dotconnect/oracle/docs/?directmode.html.

klaus linzner wrote:if there's a way to specify a specific ORA_HOME for Devart to use I'll gladly try that too

Yes, dotConnect for Oracle allows you to specify an Oracle Home for the connection. You can do this with the 'HOME' parameter in the connection string, or using the 'Home' property of the OracleConnection or OracleConnectionStringBuilder objects.

For more information, please refer to http://www.devart.com/dotconnect/oracle/docs/?Devart.Data.Oracle~Devart.Data.Oracle.OracleConnection~Home.html
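A minimal sketch of the two options described above. The home name OraClient11g_home2 is only an assumption for illustration (use the name of a home installed on your machine), the credentials are the placeholders from the original sample, and the class name is hypothetical:

```csharp
using System;
using Devart.Data.Oracle;

class OracleHomeExample
{
    static void Main()
    {
        // Option 1: 'Home' parameter directly in the connection string.
        // "OraClient11g_home2" is an assumed home name, not a required value.
        string viaParameter =
            "Data Source=TNSNAME;User ID=YOURUSER;Password=YOURPASS;Home=OraClient11g_home2";
        Console.WriteLine("Connection string with Home parameter: {0}", viaParameter);

        // Option 2: Home property of OracleConnectionStringBuilder.
        var csb = new OracleConnectionStringBuilder(
            "Data Source=TNSNAME;User ID=YOURUSER;Password=YOURPASS");
        csb.Home = "OraClient11g_home2";

        using (var connection = new OracleConnection(csb.ConnectionString))
        {
            connection.Open();
            Console.WriteLine("Connected via home '{0}', server version {1}",
                csb.Home, connection.ServerVersion);
        }
    }
}
```

Running the same test once per installed home is a simple way to check whether the behaviour depends on a particular client installation.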
Pinturiccio
Devart Team
 
Posts: 1862
Joined: Wed 02 Nov 2011 09:44

Re: AccessViolationException / SEHException while calling open/close

Postby klaus linzner » Wed 22 May 2013 09:12

Hello Pinturiccio,
Thanks for your reply.
Pinturiccio wrote:We are trying, but we can't reproduce the issue with your code. It is executed without any errors.
As the ODP.NET provider throws the same errors, the reason is most likely in the client.

As mentioned in the first post: The problem is not that an OracleException (which can be handled) is thrown. The problem is that shortly after the OracleExceptions are thrown, an AccessViolationException (which crashes the application) is thrown.
Did you get to the point where Oracle is stressed and you get (at least) one of the errors/exceptions I mentioned? If not, you did not get to the point where Oracle reaches its temporary limit and the problems start.
Anyway - besides the mentioned error codes I now get the following exception too: Devart.Data.Oracle.OracleException: Server did not respond within the specified timeout interval.

And again: The OracleExceptions I get are fine. The problem is that once they start occurring, it doesn't take long until an AccessViolation occurs. ODP throws basically the same error messages, but never gets to the point where AccessViolations occur.
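Some background on why the catch (Exception ex) block in the sample does not stop the crash: on .NET Framework 4 and later, the CLR by default does not deliver corrupted-state exceptions such as AccessViolationException to ordinary catch blocks. A method can opt in with the HandleProcessCorruptedStateExceptions attribute. The sketch below uses a hypothetical helper (DiagnosticClose is not part of dotConnect) and is meant for diagnostics only, since memory may already be corrupt at that point:

```csharp
using System;
using System.Runtime.ExceptionServices;
using Devart.Data.Oracle;

static class DiagnosticClose
{
    // Opt in to receiving corrupted-state exceptions in this method (.NET Framework 4+).
    // This is for logging only: the process is terminated afterwards instead of continuing.
    [HandleProcessCorruptedStateExceptions]
    public static void CloseWithDiagnostics(OracleConnection connection)
    {
        try
        {
            connection.Close();
        }
        catch (AccessViolationException ex)
        {
            Console.Error.WriteLine("Access violation in Close(): {0}", ex);
            // Terminate immediately and write the exception to the event log.
            Environment.FailFast("Access violation while closing OracleConnection", ex);
        }
    }
}
```

In the sample, connection.Close() would be replaced by DiagnosticClose.CloseWithDiagnostics(connection). This only covers exceptions raised on the calling thread. The SEHException trace from 7.2.x starts at ThreadHelper.ThreadStart, i.e. on a separate thread, so a wrapper like this would not see it.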

Pinturiccio wrote:1. The exact version of your Oracle client;

Installed 11.2.0.3 clients, ODAC 11.2.0.3 x64 client;

Pinturiccio wrote:2. The capacity of your Oracle client;

I'm sorry - I'm really not sure what you mean by the capacity of the Oracle client, could you please explain?

Pinturiccio wrote:3. The Oracle database version;

Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE 11.2.0.3.0 Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
x86_64/Linux 2.4.xx

Pinturiccio wrote:4. Is the issue reproduced if you use Oracle client of another version?

Used the following Oracle Homes:
64bit:
HOME: NAME OraClient11g_home2, Version 11.2.0.1
HOME: NAME ODACx64, Version 11.2.0.1
32bit:
HOME: NAME OraClient11g_home1_32bit, Version 11.2.0.1
Although all of them were installed as 11.2.0.3, I guess 11.2.0.1 is shown as the version because OCI is still listed as 11.2.0.1 in the 11.2.0.3 install.

Pinturiccio wrote:5. Try using the Direct mode. Is the issue reproduced in this case? For more information, please refer to http://www.devart.com/dotconnect/oracle/docs/?directmode.html.

So far we haven't been able to reproduce this issue in Direct mode, but my guess is that connecting in Direct mode is simply too slow for the server to ever struggle. The performance in Direct mode is basically unacceptable: in the best case (low load) open and close take three times as long as in non-Direct mode, and with more load close to 20 times as long. (Opening and closing 100 connections on 10 parallel instances (see sample) takes about 116 seconds per instance, compared to 5 seconds.) In Direct mode the time grows in direct proportion to the number of instances running, whereas non-Direct performance is relatively stable.
Direct mode brought up other errors, but let's not get distracted from the current issue.
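For anyone who wants to repeat the timing comparison, here is a rough sketch of a single-threaded measurement. The class name is hypothetical, and the connection string is taken from the command line so the same program can be run once with a Direct-mode string and once with an OCI-mode string. It is not the code that produced the figures above:

```csharp
using System;
using System.Diagnostics;
using Devart.Data.Oracle;

class OpenCloseTiming
{
    // Usage: OpenCloseTiming.exe "<connection string>" [iterations]
    static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("Usage: OpenCloseTiming.exe \"<connection string>\" [iterations]");
            return;
        }

        string connectionString = args[0];
        int iterations = args.Length > 1 ? int.Parse(args[1]) : 100;

        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            using (var connection = new OracleConnection(connectionString))
            {
                connection.Open();
            } // Dispose closes the connection.
        }
        stopwatch.Stop();

        Console.WriteLine("{0} open/close cycles took {1:F1} s ({2:F1} ms per cycle)",
            iterations,
            stopwatch.Elapsed.TotalSeconds,
            stopwatch.Elapsed.TotalMilliseconds / iterations);
    }
}
```

To approximate the 10 parallel instances mentioned above, start several copies of the program at the same time. Pooling should be disabled in the connection string (Pooling=False, as in the sample), otherwise the loop mostly measures the pool.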

Pinturiccio wrote:For more information, please refer to http://www.devart.com/dotconnect/oracle/docs/?Devart.Data.Oracle~Devart.Data.Oracle.OracleConnection~Home.html

Thanks a lot for the info - I was able to run the sample with different homes!

Regards, Klaus

Re: AccessViolationException / SEHException while calling open/close

Postby Pinturiccio » Fri 24 May 2013 08:48

klaus linzner wrote:Did you get to the point where Oracle is stressed and you get (at least) one of the Errors/Exceptions I mentioned?

Your application works fine and we don't get any exceptions after the application works for a long time.

klaus linzner wrote:If not you did not get to the point where Oracle reaches its temporary limit and problems start.

Please specify exactly which limit you mean.

klaus linzner wrote:I'm sorry - I'm really not sure what you mean on the capacity or the Oracle client, could you please explain?

We meant whether your Oracle client is x64 or x86.

klaus linzner wrote:Thanks a lot for the info - I was able to run the sample with different homes!

Did you reproduce the issue when using other homes?

Re: AccessViolationException / SEHException while calling open/close

Postby klaus linzner » Fri 24 May 2013 09:17

Pinturiccio wrote:We meant whether your Oracle client is x64 or x86.

Tried the sample as x64 as well as x86, issue can be reproduced both ways...

Pinturiccio wrote:Did you reproduce the issue when using other homes?

Yes, I tried several and the issue was always reproducible.

Pinturiccio wrote:Your application works fine and we don't get any exceptions after the application works for a long time.
...
Please specify what limit exactly do you mean?

It doesn't matter if it runs for a long time. Try raising the workload by increasing the number of parallel executions, or maybe lower the number of connections Oracle can accept.

Let me give you some background on this sample: We're getting "random" AccessViolations throughout our application while calling Connection.Open and Connection.Close. They're pretty much not reproducible and can occur after two days or two weeks. Judging from the number of new threads regarding memory problems, you know as well as I do that there are some problems. In order not to open another thread that gets closed as "not reproducible", I tried building a sample that reproduces the issue every time, so you don't need to look in random places but can instead look at a specific case.

Basically, thousands of connections are opened and closed in parallel. Within less than two minutes Oracle stops accepting new connections and/or throws exceptions, and seconds after this I get an application crash because memory is corrupt.
Depending on your environment you can increase server load by decreasing/increasing NumberOfParallelExecutions or running it from two workstations targeting the same server.

I really hope you can fix the memory issues...

Re: AccessViolationException / SEHException while calling open/close

Postby klaus linzner » Tue 28 May 2013 06:28

Hello again,
Are you working on this bug, were you able to reproduce it or do you need any other info?

Re: AccessViolationException / SEHException while calling open/close

Postby Pinturiccio » Tue 28 May 2013 12:47

We followed your instructions and increased NumberOfParallelExecutions, ran several instances of the application, and started getting errors "ORA-12518: TNS:listener could not hand off client connection", and then "Server did not respond within the specified timeout interval". Sometimes, the following error occurs: "Attempted to read or write protected memory. This is often an indication that other memory is corrupt." But it is not an unhandled one.

We will investigate the issue and notify you about the results as soon as possible.

Re: AccessViolationException / SEHException while calling open/close

Postby klaus linzner » Tue 28 May 2013 13:07

Great - thanks for the info!
I'm glad you're able to reproduce the issue (at least partially)

BR

Re: AccessViolationException / SEHException while calling open/close

Postby Pinturiccio » Wed 19 Jun 2013 12:55

We have fixed the bug that caused the AccessViolationException in long-running applications under heavy load. We will post here when the corresponding build of dotConnect for Oracle is available for download.

Re: AccessViolationException / SEHException while calling open/close

Postby klaus linzner » Wed 19 Jun 2013 14:26

Thank you - I'm really looking forward to it!

BR, Klaus

Re: AccessViolationException / SEHException while calling open/close

Postby Pinturiccio » Tue 25 Jun 2013 08:06

The new build of dotConnect for Oracle 7.7.267 is available for download now!
It can be downloaded from http://www.devart.com/dotconnect/oracle/download.html (trial version) or from Registered Users' Area (for users with valid subscription only).
For more information, please refer to http://forums.devart.com/viewtopic.php?t=27386

Re: AccessViolationException / SEHException while calling open/close

Postby klaus linzner » Thu 18 Jul 2013 11:47

Sorry it took me so long...
The fix seems good - I haven't been able to reproduce the bug. Thanks a lot!

