What is fault tolerant auditing and why it is important
One of the characteristics a SQL Server auditing solution must have is to be reliable. It means that it is a solution you can trust, that doesn’t provide a false sense of security while in fact it is not auditing the parameters set, has information leak, data loss, or in some situations it’s not running at all
ApexSQL Audit is designed to be fault tolerant and handle the situations which can bring to auditing breakdowns
Network failures
ApexSQL Audit is a centralized auditing solution for SQL Servers, with centralized storage of audited data and the first issue it has to deal with is to provide fault tolerant auditing when it comes to a network outage. As a distributed instance can run independently, a loss of connection to the central instance doesn’t affect auditing – the events on a distributed instance are captured in SQL traces as usual, therefore continued auditing is provided. The only issue is transfer of the audited records to the central repository
When auditing distributed instances, SQL trace files created on a remote SQL Server instance are transferred to the central instance once they reach the size of 1 MB. If the network connection between distributed and central instance is lost, these files cannot be transferred. ApexSQL Audit stores them locally as long as the central instance is not available. Once the connection to the central repository is reestablished, the SQL trace files are transferred to the central instance where they are further processed and audited records are stored in the central repository database
So, even when the network is down, auditing of distributed instances is not affected. The only issue a network outage causes is a delay in transfer of audited records and its storing into a central repository
SQL Server instance is not running
Another potential problem is when a SQL Server instance, either on the central or distributed instance stops unexpectedly. ApexSQL Audit instances are Windows services; they are independent of SQL Server service and are designed to automatically reestablish connections to monitored SQL Server instances, if connections are lost
If the SQL Server instance on the central instance stops, ApexSQL Audit will continue to audit the events on a distributed instance seamlessly. The SQL traces will be transferred to the central instance and once the central SQL Server instance is running again, the processed records will be inserted into the central repository database
If the SQL Server instance on a distributed instance stops, there will be no new events on the SQL Server instance, SQL traces will not be created and auditing will be stopped as expected, but will continue as soon as the SQL Server instance is started again. New events will be added into the last SQL trace
In the next article, we’ll describe some additional scenarios and how these are handled by ApexSQL Audit
See also:
ApexSQL Audit fault tolerant auditing – part II
August 12, 2013