Is ADO.NET 2.0 reliable?

Damir Dobric Posts

Next talks:

 

    

Follow me on Twitter: #ddobric



 

 

Archives

 

I know that this question is a little bit provocative, but if you are interested in philosophy of ADO.NET and reliability at all, this post is right for you.
In last few weeks, I found some very strange behavior in one ASP.NET 2.0 application. Long time everything was working, but as we started to perform some load tests we found some very strange behavior in the application. After many of team members couldn’t solve the problem (described below), I’ve got the task to do more intensive investigation. Here is the result.

Imagine there is a database with two tables Table1 and Table2. In this example both table have same columns column_1 and column_2 of the integer type. Note that behavior described here is fully independent on the table schema. I used this schema for the sake of simplicity. Assume the tables have data populated as shown:

Table1:  

111 – 112
121 – 122
131 – 132
. . .
1N1 – 1N2

Table2:

211 – 212
221 – 222
231 – 232
. . .
2M1 – 2M2

With this data it is possible to extrapolate the table and the row from each value. For example the value 271 is the value from Table1, Row 7 and column1. Now let’s take a look at following example, which is consisted of three code-snippets (each in one method Main, Test1 and Test2). The first method Main() starts two threads. In the first thread is executing method Test1 and in the second Thread is executing method Test2. The Main() ends up when both started threads are finished.

  static void Main(string[] args)
  {
     new Thread(delegate()
     {
        Test1();
     }).Start();


     new Thread(delegate()
     {
        Test2();
     }).Start();
 
}


Following implements the method Test1. This method successively reads the data from the table 1. Additionally, it checks if the value of the first column read from the first row is greater than 199. If you take a look on the content of table Table1 this will never be the case. Let’s call this condition C1.

public static void Test1()
{
  for (int i = 0; i < 10000; i++)
  {
    Thread.Sleep(50);
    DbConnection con = m_Factory.CreateConnection();
    con.ConnectionString = cConnStr;

    DbCommand cmd = m_Factory.CreateCommand();

    cmd.Connection = con;
    cmd.CommandText = "select * from Table1";
    con.Open();

    using (DbDataReader reader =
           cmd.ExecuteReader(CommandBehavior.CloseConnection))
    {
      if (reader.Read())
      {
        Console.WriteLine(String.Format("Table1: Field1: {0},
        Field2: {1}"
, reader[0], reader[1]));
        if (reader.GetInt32(0) >= 200)
        Console.WriteLine("Error");
      }
    }
  }
}

Now let’s take a look on the method Test2, which is started in the second thread. Comparing to method Test1, this one is almost identical. The only differences are marked as red-bold. This method reads the data from the Table2 and checks if the value of the first row and first column is less than 200. However we know that this will also never be the case, because this value is always set on 211. Let’s call this condition C2.

public static void Test2()
{
  for (int i = 0; i < 10000; i++)
  {
    Thread.Sleep(50);
    DbConnection con = m_Factory.CreateConnection();
    con.ConnectionString = cConnStr;

    DbCommand cmd = m_Factory.CreateCommand();

    cmd.Connection = con;
    cmd.CommandText = "select * from Table2";
    con.Open();

    using (DbDataReader reader =

           cmd.ExecuteReader(CommandBehavior.CloseConnection))

    {
      if (reader.Read())
      {
        Console.WriteLine(String.Format("Table1: Field1: {0},
        Field2: {1}"
, reader[0], reader[1]));
        if (reader.GetInt32(0) < 200)
        Console.WriteLine("Error");
      }
    }
  }
}

Now, one philosophical question: “If one of two conditions C1 and C2 is satisfied, is the ADO.NET 2.0 reliable?” Theoretically, I would vote for answer NO. Out there, there might be some people who think different. However, practically, I think this should NEVER happen.

In the next step, I extend the example with some additional threads. First I slightly changed the method Main(). In this case method Main() starts two additional threads, which execute the new method Test3 (each of them). This method  creates the connection to database, creates the command, which just insert one row in the table called Table3. Additionally, method Test3 is started in two different threads, which obviously share the same command and connection. Be aware that this is not the best programming pattern, because you will easily run into multithreading problems, depending on implementation of the method Test3.

  static void Main(string[] args)
  {
     new Thread(delegate()
     {
        Test1();
     }).Start();


     new Thread(delegate()
     {
        Test2();
     }).Start();
  
      DbCommand cmd = m_Factory.CreateCommand();
      cmd.CommandText =
      "insert into Table3 (column_1, column_2) values (0, 1);";
      DbConnection con = m_Factory.CreateConnection();
      con.ConnectionString = cConnStr;
      cmd.Connection = con;
      con.Open();

      new Thread(delegate(object cmdInstance)
      {
        Test3(cmdInstance as DbCommand);
      }).Start(cmd);

      new Thread(delegate(object cmdInstance)
      {
         Test3(cmdInstance as DbCommand);
      }).Start(cmd);
 
}

Finally, the method Test3() executes the command and sometimes just closes the connection. This is of course the bug, because the another thread which also runs the same Method
shares this connection. However, practically, this can happen implicitly, when you have classes which wrap up dealing with connections. In such cases this problem would not be obvious as in this example. This code snippet is just “huge” simplification of the real code, which I had to analyze.

   public static void Test3(DbCommand cmd)
   {
     int n = 0;
     while (true)
     {
       
try
        {
         
cmd.ExecuteNonQuery();
         
if (++n == 10)
         
{
             n = 0;
            
cmd.Connection.Close();
          }

          Thread.Sleep(100);
        }
       
catch(Exception ex)
       
{
           cmd.Connection.Close();
          
cmd.Connection.Open();
          
Console.WriteLine(ex.Message);
       
}
      }
   
}

As you see, the method Test3 sometimes closes the connection. This will force another thread, which also executes the method Test3 to fail on the line cmd.ExecuteNonQuery(). In this case the exception will be thrown and the connection just re-opened. I know, it is not good design, but implicitly could happen.
You will probably not believe me, that when this happen, at least one of conditions C1 or C2 is satisfied. Personally, I would believe this too. Because of that I wrote example, so you can do it by yourself.

Note: If the Visual Studio hangs with 100%-CPU start the example again or start it without debugger. This is an issue of VS and there is nothing to do with this example. 

I would be very glad to see you comment about this dangerous behavior.

[Copyright 2007, DAENET GmbH - www.daenet.eu]

 


Posted Jul 18 2007, 12:29 PM by Damir Dobric
Filed under:

Comments

Miha Markic wrote re: Is ADO.NET 2.0 reliable?
on 08-20-2007 10:06

Man, your example if flawed in the beggining. You are abusing multithreading where classes used are *not* multithread safe. You are basically screwing a connection which might be reused from other threads. But what do you expect from such mutlthread bugs? Mutlithreading bugs are always hard to diagnose and might appear seemingly unrelated to the point where the error really happens.

To sum it up: your code is flawed (as you know) and expecting meaningfull errors from multithreading errors is very very optimistic.

Instead of prooving that ado.net doesn't work, you should educate the developer that created such a mess ;-)

Damir Dobric wrote re: Is ADO.NET 2.0 reliable?
on 08-20-2007 11:10

Thanks for your comment Micha. You are right, that nobody should create such mess. However there is one thing which is a little different than you write.  That is, the Threads inTest1 and Test2 are thread-safe. If they are safe, why do they not work as expected?

Developers who implemented Test1 and Test2 have nothing to do with developers who implemented buggy Test3. Test1 and Test2 were implemented in one assembly and Test3 in another one. The bad code in Test3 in reality does not look as shown in this post. This post is huge simplification of the problem.

Miha Markic wrote re: Is ADO.NET 2.0 reliable?
on 08-20-2007 11:28

Hi Damir,

I think you are wrong in your assumptions. Test1 and Test2 are not thread safe because of the way connection pooling works. When you create a new instance of SqlConnection and open it you are not getting a new *physicall* connection unless pooled ones are in use. It is quite possible that your test methods all hand around a common *physicall* connection or something like that. Also take note, that when you create a multithreading mess the results aren't ones you would expect - a problem causes by inproper multithreading might corrupt whoknows which part of your application.

And BTW, my name is Miha :-)

D.Adamec wrote re: Is ADO.NET 2.0 reliable?
on 08-20-2007 14:05

I think Damir wrote this sample as a small simplification that reproduces a problem hiding  in a code deep under some DAL classes that follows usual design-paterns and guides for accessing data.  And combination of  factory and ADO.NET classes in multithreaded environment  leads to the point where one query gets result from another query. The point is that the flawed code is here just to show us this behavior and it seems that the switching connection can jump out even when the code would be written better.

What this sample made me think of is: we all use a lot of built-in classes from many sources (core, WPF, WWF, WCF, factories, guidance packs, app blocks etc.), market push us to adopt more and more, quicker and quicker and the truth is: we should be very careful combining those bright new shiny things and putting them into our systems. For example, the book says: “Windows Workflow Foundation managing the multithreading for you so you can think of the business side of your application“.  Great tip  . And of course if we combine WWF with my DAL from Ent. App block and some other XYZ new technology and put it /of course/ in a web application… we should know what we are doing ;-) … and we want to be sure that the query one will get the results of query one.

The mixture of connections  from the Damir’s sample should somehow be marked at least.

Same Old Applications wrote rely and reliability
on 08-20-2007 16:35

Damir Dobrić post interesting article (&quot;Is ADO.NET reliable&quot;) on his blog about unexpected

Eduardo wrote re: Is ADO.NET 2.0 reliable?
on 10-19-2007 21:31

The bug must be in the Data Provider's connection pool. Probably the reader in C2 is reading data from Test3 connection.

Damir Dobric wrote re: Is ADO.NET 2.0 reliable?
on 10-20-2007 0:08

Thanks for your comment Eduardo. However some people think this would be a "feature by design" :)

developers.de wrote re: Is ADO.NET 2.0 reliable?
on 04-22-2011 16:14

Is ado net 2 0 reliable.. OMG! :)

developers.de is a .Net Community Blog powered by daenet GmbH.