SIGPIPE signal in PGDac while connection reset (Linux)

Discussion of open issues, suggestions and bugs regarding PgDAC (PostgreSQL Data Access Components) for Delphi, C++Builder, Lazarus (and FPC)
Post Reply
Kedarius
Posts: 2
Joined: Thu 20 Sep 2018 13:03

SIGPIPE signal in PGDac while connection reset (Linux)

Post by Kedarius » Thu 20 Sep 2018 14:58

TLDR;
I am using PGDac with Delphi 10.2 to create simple Linux daemon. The communication with the DB is really simple, I just need to send few SQL queries/inserts and process returned rows.
Everything works fine until I try to restart the database server (PostgreSQL 10). The very first command after the server restart causes SIGPIPE signal and the whole application is terminated even though I do wrap all my PGDac calls in try ... except block.
I think that the problem is very similar to viewtopic.php?f=7&t=25540 .
Details:
I do have some wrapper objects in my code, but basically I have two PGDac objects:

Code: Select all

   fPGConnection: TPgConnection;
   fquery:TPgQuery;
When I try to do SQL query, the code works cca like this:

Code: Select all

   
   try
   if fPGConnection.Connected then fPGConnection.ping; //check if the connection is still alive
    except
         on e:Exception do
          begin
            log_msg(e.Message);
            fPGConnection.Connected := false;  //I would think that this is not needed
          end;
   end;
   if not fPGConnection.Connected then Connect; //if not connected then call my function Connect that sets the connection params and opens the connection
   if fPGConnection.Connected then  //if connection is connected - try the SQL query
   begin
      fQuery.Connection := fPGConnection;
      fQuery.SQL.Text := sql;
       try
         fQuery.Open;
       except
         on e:Exception do
         begin
            if e.message=SNotRows then begin end  //ignore exception about query not returning rows
            else raise(e);
         end;
       end;
  end;
This works fine until I restart the server. After the server is restarted this happen:
1) the connection.connected property is still true
2) the ping raises the exception with message "terminating connection due to administrator command"
3) however the connection.connected property is still true.
4) If I try to disconnect, the code tries to write something into the now dead socket and the OS responds with SIGPIPE
5) SIGPIPE signal will effectively terminate the whole application even though I tried to handle it main thread

If I do not use ping, the problem is the same, only few lines later when I try to open the query.

I even tried to use fPGConnection.Options.LocalFailover := true; and handle the connection error. But unfortunately it is the same. PGDac calls my handler, I output something to the log without modifying the RetryMode, then PGDac tries to close the connection and resend the query. However again when disconnecting it tries to send some data to the now dead socket and the OS responds with SIGPIPE

I did manage to hotfix the problem with adding flag MSG_NOSIGNAL to send() function in CRVioTcp.pas (function TCRVioTcp.WriteNoWait, line 906):

Code: Select all

  Result := send(FSd, buffer[offset], count, MSG_NOSIGNAL);
With this hotfix OS does not generate SIGPIPE and the ping (or opening query) raises exception " Error on data writing to the connection:#015#012System error: 20.#015#012Socket Error Code: 32($20)". However the connected property is still true and I need to manually set it to false. And setting the connected property to false tries again to disconnect and so it tries to write the data to the still dead socket again and so it creates yet another exception (same message) but at least does set the connected to false. After this nested exception I am able to reconnect to the server and continue normally.

So my question is rather simple :-) How should I handle connection disconnects on Linux? The application is supposed to run for a long time (hopefully months in a single run) and should not die everytime the database server is restarted or unavailable....

azyk
Devart Team
Posts: 1119
Joined: Fri 11 Apr 2014 11:47
Location: Alpha Centauri A

Re: SIGPIPE signal in PGDac while connection reset (Linux)

Post by azyk » Sat 22 Sep 2018 09:33

Thank you for the information. We've reproduced the issue and are investigating it now. We will let you know the results.

azyk
Devart Team
Posts: 1119
Joined: Fri 11 Apr 2014 11:47
Location: Alpha Centauri A

Re: SIGPIPE signal in PGDac while connection reset (Linux)

Post by azyk » Fri 05 Oct 2018 08:22

We included the suggested fix in the next build of PgDAC. In the TCRVioTcp.WriteNoWait method, we replaced the line:

Code: Select all

Result := send(FSd, buffer[offset], count, 0);
with

Code: Select all

Result := send(FSd, buffer[offset], count, {$IFDEF LINUX}MSG_NOSIGNAL{$ELSE}0{$ENDIF});
To handle disconnections to PostgreSQL, you can use the TPgConnection.OnConnectionLost event handler: https://www.devart.com/pgdac/docs/devar ... onlost.htm . To enable it, set TPgConnection.Options.LocalFailover to True.

Note that the OnConnectionLost event does not occur immediately after the connection to PostgreSQL is lost. It occurs when PgDAC requests PostgreSQL, and by that moment, the connection will be already lost.

Here is a sample of Linux console application demonstrating how to handle disconnections to PostreSQL using PgDAC:

Code: Select all

program PgDACHandleDisconnectLinux;

{$APPTYPE CONSOLE}
{$R *.res}

uses
  SysUtils,
  Classes,
  MemData,
  DBAccess,
  PgAccess;

type
  TEvent = class
    class procedure onConnectionLost(Sender: TObject; Component: TComponent;
      ConnLostCause: TConnLostCause; var RetryMode: TRetryMode);
  end;

class procedure TEvent.onConnectionLost(Sender: TObject; Component: TComponent;
  ConnLostCause: TConnLostCause; var RetryMode: TRetryMode);
var
  Msg: string;
begin
  case ConnLostCause of
    clUnknown:         Msg := 'for reasons not known';
    clExecute:         Msg := 'during SQL execution';
    clOpen:            Msg := 'during query opening';
    clApply:           Msg := 'during DataSet.ApplyUpdates';
    clServiceQuery:    Msg := 'during service information request';
    clTransStart:      Msg := 'during transaction start';
    clConnectionApply: Msg := 'during Connection.ApplyUpdates';
    clConnect:         Msg := 'during connection establishing';
  end;
  Write(TimeToStr(Now)+' '+Component.Name+' - Connection lost '+Msg, ' ');
  RetryMode := rmReconnectExecute;
  Sleep(500);
end;

var
  PGCon: TPgConnection;
  PgQuery: TPgQuery;
  s: string;
  OldConnectionID: integer;

begin
  try
    PGCon := TPgConnection.Create(nil);
    PGCon.onConnectionLost := TEvent.onConnectionLost;
    PGCon.Options.LocalFailover := True;
    // Set up credentials here
    PGCon.Server := '192.168.0.1';
    PGCon.Database := 'postgres';
    PGCon.Username := 'postgres';
    PGCon.Password := 'postgres';
    PGCon.Port := 5432;
    PGCon.Options.UseUnicode := True;

    PgQuery := TPgQuery.Create(nil);
    PgQuery.Connection := PGCon;
    PgQuery.SQL.Text := 'select * from emp';

    Writeln('------------------------ MENU ------------------------');
    Writeln('QUIT           - type q               and press RETURN');
    Writeln('CONTINUE       -                          press RETURN');
    Writeln('CONNECT LOST   - restart PostgreSQL   and press RETURN');
    Writeln('EXCEPTION      - drop emp table       and press RETURN');
    Writeln('CONNECT LOST N - stop PostgreSQL      and press RETURN');
    Writeln('------------------------ MENU ------------------------');
    Writeln;
    Writeln;

    Repeat
      try
        Write('Connecting ......... ');
        if not PGCon.Connected then
        begin
          PGCon.Connect;
          Writeln('OK');
        end
        else
          Writeln('already connected');
        OldConnectionID := PGCon.ProcessID;

        Write('Open emp table ..... ');
        PgQuery.Open;
        Writeln('OK');
        PgQuery.Close;

        Writeln('Connection ID ...... ', IntToStr(PGCon.ProcessID));
        if OldConnectionID <> PGCon.ProcessID then
          Writeln('PgDAC handle connection disconnect on Linux successfully!');

      except
        on e: Exception do
        begin
          if e.Message = 'relation "emp" does not exist' then
            Writeln(e.Message)
          else
            raise;
        end;
      end;
      Writeln('Do something:');
      Readln(s);
    Until s = 'q';
  except
    on e: Exception do
      Writeln(e.ClassName, ': ', e.Message);
  end;
  Writeln('The end. Press RETURN to qxit.');
  Readln;

end.

Post Reply