Friday, April 16, 2010

Backups, VDI and stack dumping

When using LiteSpeed version 4.8.3 with SQL Server 2005, we found that we received the errors below.
> Title:
> "State: nn\n.SqlDumpExceptionHandler: Process nn generated fatal exception c0000005 EXCEPTION_ACCESS_VIOLATION. SQL Server is terminating this process"
>
>
> Problem Description:
> Backing up transaction logs for multiple databases can result in the generation of a stack dump error with the below logs:
>
> - State: 0\015.Stored function 'xp_sqllitespeed_version' in the library 'xpSLS.dll' generated an access violation. SQL Server is terminating process 62.
> - State: 0\015.SqlDumpExceptionHandler: Process 52 generated fatal exception c0000005 EXCEPTION_ACCESS_VIOLATION. SQL Server is terminating this process.
> - Job 'DBA - Backup TranLogs - LiteSpeed - ReportServerTempDB'
> (0xBC7DB5449E405941A11F84897255D32E) - Status: Failed - Invoked on:
> 2009-03-16 20:00:04 - Message: The job failed. The Job was invoked by
> Schedule 691
> - State: 0\015.Stored function 'xp_sqllitespeed_version' in the library 'xpSLS.dll' generated an access violation. SQL Server is terminating process 60.
> - Job 'DBA - Backup TranLogs - LiteSpeed - UIPState'
> (0x5EBFFF7592D92F479BF9D5613CAEF565) - Status: Failed - Invoked on:
> 2009-03-16 20:00:00 - Message: The job failed. The Job was invoked by
> Schedule 694 (Every 30
> - State: 0\015.SqlDumpExceptionHandler: Process 60 generated fatal exception c0000005 EXCEPTION_ACCESS_VIOLATION. SQL Server is terminating this process.
> - Job 'DBA - Backup TranLogs - LiteSpeed - ClientValuationService'
> (0x62F11A3F9A73AC4081B1AD8F5C42EAC8) - Status: Failed - Invoked on:
> 2009-03-16 20:00:01 - Message: The job failed. The Job was invoked by
> Schedule
>
>
> Cause:
> Insufficient memory as a result of defect in product.
> See example stats below:
>
> Memory
> MemoryLoad = 68%
> Total Physical = 3839 MB
> Available Physical = 1198 MB
> Total Page File = 7778 MB
> Available Page File = 4455 MB
> Total Virtual = 3071 MB
> Available Virtual = 213 MB
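
To put the example stats in perspective, here is a minimal Python sketch that works out how much of each pool was still free. The figures are copied from the stats above; the function name and the dictionary layout are my own and are only there for illustration.

```python
# Figures copied from the example stats above, all in MB: (total, available).
stats = {
    "physical": (3839, 1198),
    "page_file": (7778, 4455),
    "virtual": (3071, 213),
}

def percent_free(total_mb, available_mb):
    """Return the share of a memory pool that is still available, in percent."""
    return 100.0 * available_mb / total_mb

for name, (total, available) in stats.items():
    print(f"{name:10s} {percent_free(total, available):5.1f}% free")
```

Physical memory and the page file still have plenty of headroom. The virtual address space is the pool that is nearly exhausted (about 7% free), which is consistent with the "insufficient memory" cause given above.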


After doing some research and looking at the stack dump, it looks like the failure happens when a heap deallocation goes sideways because of heap corruption.


ntdll!RtlpExecuteHandlerForException+0xd
ntdll!RtlDispatchException+0x1b4
ntdll!KiUserExceptionDispatcher+0x2d
ntdll!DbgBreakPoint
ntdll!RtlpDphReportCorruptedBlock+0x239
ntdll!RtlpDphNormalHeapFree+0x45
ntdll!RtlpDebugPageHeapFree+0x203
ntdll!RtlDebugFreeHeap+0x3b
ntdll!RtlFreeHeapSlowly+0x4e
ntdll!RtlFreeHeap+0x15e
xpSLS+0xc0350
0x7eea8f70
xpSLS+0x149b18
0xffffffff


Based on the initial exception message,

Exception happened when running extended stored procedure "xp_sqllitespeed_version" in the library "xpSLS.dll".
SQL Server is terminating process 164. Exception type: Win32 exception; Exception code: 0x80000003.


We know this has to do with LiteSpeed's xpSLS.dll. xpSLS.dll implements LiteSpeed's extended stored procedures and is loaded into the address space of the SQL Server engine, so the fault is tied to the implementation of those extended stored procedures.
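
The same conclusion can be read off the stack dump mechanically. The sketch below is only an illustration of that reasoning: it walks the frames from the dump above (innermost call first) and reports the first frame that does not belong to ntdll. The helper names are my own.

```python
import re

# Stack frames copied from the dump above, innermost call first.
frames = """\
ntdll!RtlpExecuteHandlerForException+0xd
ntdll!RtlDispatchException+0x1b4
ntdll!KiUserExceptionDispatcher+0x2d
ntdll!DbgBreakPoint
ntdll!RtlpDphReportCorruptedBlock+0x239
ntdll!RtlpDphNormalHeapFree+0x45
ntdll!RtlpDebugPageHeapFree+0x203
ntdll!RtlDebugFreeHeap+0x3b
ntdll!RtlFreeHeapSlowly+0x4e
ntdll!RtlFreeHeap+0x15e
xpSLS+0xc0350
0x7eea8f70
xpSLS+0x149b18
0xffffffff
""".splitlines()

def frame_module(frame):
    """Return the module name of a frame, or None for a raw address."""
    # Symbolized frames look like 'module!Function+0xoffset',
    # unsymbolized ones look like 'module+0xoffset'.
    match = re.match(r"^([A-Za-z_]\w*)[!+]", frame)
    return match.group(1) if match else None

def first_caller_outside(frames, system_module="ntdll"):
    """Return the first frame that belongs to a module other than system_module."""
    for frame in frames:
        module = frame_module(frame)
        if module is not None and module != system_module:
            return frame
    return None

print(first_caller_outside(frames))
```

The frame it reports, xpSLS+0xc0350, is the caller of RtlFreeHeap, which is why the free that tripped over the corrupted heap block is attributed to xpSLS.dll.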

We suspected memory corruption.

We looked at upgrading to 5.0.1, as we read on Quest's website that there were some bugs in older versions of LiteSpeed.
Here is a workaround if you see the same issues.

> Schedule backups to run in serial instead of in parallel, or upgrade to the latest version of LiteSpeed for SQL Server.
>
>
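
As a rough illustration of the "serial instead of parallel" workaround, the sketch below runs the log backups one after another instead of starting them all at the same time. It is only a sketch: `backup_one` is a hypothetical callable that you would supply to back up a single database, and the stand-in used in the demo just prints a line. The database names are the ones from the failed jobs above.

```python
def run_backups_serially(databases, backup_one):
    """Back up each database in turn, never two at once.

    backup_one is a caller-supplied (hypothetical) function that backs up
    a single database. Returns (database, succeeded) tuples in run order.
    """
    results = []
    for database in databases:
        try:
            backup_one(database)
            results.append((database, True))
        except Exception as exc:
            # Record the failure and carry on with the next database.
            print(f"backup of {database} failed: {exc}")
            results.append((database, False))
    return results

def fake_backup(database):
    # Stand-in for the real backup call, for demonstration only.
    print(f"backing up log for {database}")

run_backups_serially(
    ["ReportServerTempDB", "UIPState", "ClientValuationService"],
    fake_backup,
)
```

In practice the same effect can be had by combining the per-database jobs into one job with sequential steps, or by staggering the schedules so they no longer all fire at the same second.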
Here's some additional information about the memory consumption of LiteSpeed, in particular which memory area it competes for (uses) on SQL Server.
>
> VDI uses the area set by the SQL Server mem-to-leave setting. When SQL Server starts up, it allocates the mem-to-leave amount of memory. It then allocates its buffer pool and a number of other memory allocations needed for the database to run. Once it has finished allocating all of the memory required at startup, it *releases*/frees the memory it had just allocated for mem-to-leave. It does this so that it has a contiguous mem-to-leave area separate from the buffer pool and such. One can see this behavior by injecting into sqlservr.exe and detouring malloc and such. Then, as third-party apps execute extended stored procedures or connect and use VDI to perform backups, any memory that SQL Server allocates using normal malloc or AllocateVirtualMemory comes out of that mem-to-leave area. SQL Server native backups do not use VDI and do not allocate extra memory from that area themselves.
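
To give a feel for how small that area is, here is a back-of-the-envelope sketch. It uses the commonly cited rule of thumb for 32-bit SQL Server (the -g startup value, 256 MB by default, plus 512 KB of stack per worker thread). Those defaults are my assumption and do not come from the quoted text, so check them against your own instance.

```python
def mem_to_leave_mb(g_switch_mb=256, max_worker_threads=256, stack_kb=512):
    """Estimate the mem-to-leave size in MB on 32-bit SQL Server.

    All three defaults are assumptions (commonly cited values), not
    figures taken from the article.
    """
    return g_switch_mb + max_worker_threads * stack_kb / 1024

print(mem_to_leave_mb())  # 384.0
```

With only a few hundred MB to share between thread stacks, extended stored procedures and VDI buffers, several backups running in parallel can plausibly use up the region.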

To fix these errors, I would suggest installing 5.0.1 or later. Follow the steps further down to install the package so you no longer see this error:
State: 0\015.Stored function 'xp_sqllitespeed_version' in the library 'xpSLS.dll' generated an access violation. SQL Server is terminating process

Reinstalling LiteSpeed

Phase I: Steps to take before the upgrade.

a. Stop all your backup jobs.

b. Close your LiteSpeed console.

c. Log on with system administrator privileges to perform the installation.

d. Copy LiteSpeed.msi to the local C or D drive.

e. Run the installation and press Next at the license registration screen to complete it (it will pick up the existing license key).

f. If all is good, go to Phase IV; otherwise proceed with the steps below.



Phase II: You may get an error saying that xpSLS.dll or slssqlmaint.exe is in use.

a. Go to your SQLServer\binn\ directory and rename xpSLS.dll and slssqlmaint.exe to *.old.

b. Try the installation again. If it still fails with the same error, restart your SQL Server services. If the file is still locked after that, rebooting the server is the last option (not very likely, but I have seen it, so FYI only).

c. Then run through Phase I again.

d. If all is good, go to Phase IV; otherwise proceed with the steps below.
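
If you would rather script the rename in Phase II, a minimal sketch could look like the one below. It reads "*.old" as "append .old to the file name", which is my interpretation, and it expects you to pass in the path to your own binn directory. If a file is still locked, the rename raises an error, and restarting the SQL Server services is still the next step.

```python
from pathlib import Path

def rename_to_old(binn_dir, names=("xpSLS.dll", "slssqlmaint.exe")):
    """Rename each listed file in binn_dir to '<name>.old'; return the new paths."""
    renamed = []
    for name in names:
        source = Path(binn_dir) / name
        if not source.exists():
            continue  # nothing to rename for this file
        target = source.with_name(source.name + ".old")
        if target.exists():
            # Refuse to overwrite a leftover from an earlier attempt.
            raise FileExistsError(f"{target} already exists; move it away first")
        source.rename(target)
        renamed.append(target)
    return renamed
```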



Phase III: On a cluster, you may get an error such as "failed to create directory" or "access is denied". Perform the steps below:

1. Copy LiteSpeed.msi to the active node.

2. Log on with an account equivalent to system administrator.

3. Double-click it to begin the setup.

4. Get to the screen where it gives you the option to SELECT all instances for the same login and shows ACTION (INSTALL) for your instances.

5. Click INSTALL | change it to IGNORE for all.

6. Click Next to get to the license registration | select demo for now.



This should get you past the error. If you are still getting the same error, copy LiteSpeed.msi to the failing node and run the setup from there to create the directory, then follow steps 2 to 6 above. Once the installation is done, you should be able to run INSTANCE CONFIGURATION to populate the binary files; see the steps below:



a. Select Start | all programs | Quest Software | LiteSpeed | Instance Configuration.

b. Leave the INSTALL intact this time.



Phase IV: Confirm your LiteSpeed engine is in place.



a. Open a query analyzer | select master | run EXEC XP_SQLLITESPEED_VERSION. The product and engine values it returns should match. If they do not match, restart your SQL Server services.

b. Test your backup from a LiteSpeed console against a small database.
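
If you want to script the Phase IV check, the comparison itself is simple. The sketch below assumes you have already fetched the output of XP_SQLLITESPEED_VERSION into a dictionary. The key names `product_version` and `engine_version` and the sample values are hypothetical, so adjust them to whatever your LiteSpeed version actually returns.

```python
def versions_match(version_info,
                   product_key="product_version",
                   engine_key="engine_version"):
    """Return True when product and engine versions are both present and equal.

    The key names are hypothetical placeholders, not the real column names.
    """
    product = version_info.get(product_key)
    engine = version_info.get(engine_key)
    return product is not None and product == engine

# Made-up sample values, for illustration only.
print(versions_match({"product_version": "5.0.1", "engine_version": "5.0.1"}))
print(versions_match({"product_version": "5.0.1", "engine_version": "4.8.3"}))
```

A mismatch would mean the old engine is still loaded, which is the case where restarting the SQL Server services is suggested above.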
