28.02.2012, 19:11 | #1 |
Участник
|
fed: Two stories about inventory closing and SQL Locks
Источник: http://fedotenko.info/?p=228
============== Story #1: update_recordset considered harmfull. Recently, I was invited to a customer who had started to use Inventory Closing functionality about a month ago. They were using DAX2009 with recent rollup 7. Every time when they were running inventory closing, the system locked all (or at least – most of the) inventTrans records, safely preventing users from any regular work with logistics module. Before I came to the customer I already heard rumors that new functionality of non-financial transfer closing can cause long duration locks to inventory data. Now I had a chance to witness it with my own eyes. The sql statement which was a reason for show-stopper problem in IC was very easy to find. (I knew the place in the code before I went to a customer’s site). It was the following query from InventCostNonFinancialTransferHandler.updateInventTrans() method: update_recordset inventTrans setting NonFinancialTransferInventClosingRecId = _inventClosing.RecId, DateClosed = inventTrans.DateFinancial, ValueOpen = InventTransOpen::No where inventTrans.DateFinancial != dateNull() && inventTrans.DateFinancial Functions->Edit dimensions function. But I think we can ignore this possibility I can understand SQL Server concerns about the data to be modified by different process, but from application logic point of view it is highly improbable. When I identified the issue, I started to try different ways to fix it. First thing I tried to replace an update_recordset with equivalent set of statements w/o update_recordset: while select table {select forupdate table2 where table2.recId==table.recid; table2.update()} . Since outermost select does not use ‘pessimisticklock’ clause (and InventTrans has OCC enabled), it does not places any locks on table it read. Inner select use ‘forupdate’ clause, but because of OCC it is just a syntactic sugar, which does not generate any additional clauses or hints for the select statement sent to SQL Serever. This approach brought some progress: Although the query (now select, not update) was still taking 30-40 minutes to execute, it did not place these nasty U-locks over all records; IC was still slow, but it stopped to lock users at least. As a next stage, I tried to split one select onto two nested selects. For everyone who is into SQL Statement tuning it sounds like very counter-natural, but since we are unable to directly influence the query plan in X++ with hints, we must somehow ensure SQL Server selecting more or less decent execution plan. I tried the following approach: On outer level I placed a query which was simply producing a list of itemIds, which have non-closed transfers (any transfers, maybe even financial ones), in nested while select I used the copy of original select, only with adding ‘itemId==itemIdFromOuterSelect’ condition to every occurrence of InventTrans and inventSettlement in the query. This approach brings good results: Processing of non-financial transfers in my case took only 1-1.5 minutes. Maybe in your case, if you have many items with transfers, it will take about 30-40 minutes for execution of the whole logical block, but it yet much better than 30-40 minutes for one item dimension group. Well, maybe this approach is not the most perfect one. Maybe I should try to combine outermost select (to select item list) and inner update_recordset. Maybe this U-Locks won’t be so nasty after all. If the statement takes only 5-10 seconds to execute, then maybe benefits of fast execution will outweigh potential lock conflicts. But for my case, it was acceptable solution and the customer kept to use it. As far as I understand, the whole feature was first designed for DAX2012 (where related statement is always executed per-item) and then backported to DAX2009 without too much testing. Also, to my knowledge currently MS is working on a patch for this feature in DAX2009. Story #2: Wrong query execution plan can ruin the helper-based algorithm in Axapta On another project, the customer often had performance issues with inventory closing cancellation. In the beginning, it was a mere issue of total execution time. When they were running IC cancellation in interactive mode, it took around 3-4-5 hours to complete. I pointed them that to the fact, that if you execute IC cancellation in batch mode, it runs many times faster than in interactive. The trick is simple: When IC cancellation is started, it checks the mode of execution. If it is running in a batch mode, it checks which items has inventSettlement records in inventory closing being canceled, then it spawn a batch task of class InventCostClosingCancel_workInvent for every such an item. Next it iterates over all indirect cost journals created by inventory closing being canceled, and for every cost journal, it spawns a batch job of class inventCostClosingCancel_WorkProd. Then all these jobs start to execute (up to a batch server maximum number of thread at a time), thus brings significant (like 5-10 times) increase in performance. (It is worth to mention that all batch jobs are spawned with empty batch group, so you have to configure at least one of your batch servers to serve batch jobs from this batch group). So, after they started to run Inventory Closing cancellation in a batch mode, time of the operation dropped to 45-60 minutes. (Progress percent in batch screen to advance by 1-2 percent every minute.) But from time to time, a batch mode cancellation went exceptionally slow (like 0.1-0.15% in a minute). I decided that SQL server generates silly execution plans sometimes and prescribed to run update statistics on inventTrans/inventSettlement/prodTableJour before running IC cancelation. For a while this well known home remedy against spontaneous slowness of queries helped to resolve the issue. But eventually, since 8-10 months after GoLive, even after statistics update, inventory closing cancellation become very slow, so I had to investigate the issue a bit more. When I checked the list of connection, I found out that there is a lot of blocked processes from batch server, so it was a locking issue, not the issue of wrong execution plan. (Or at least so I decided after first look). Next I ran SQL Profiler, I also found out that the system creates several deadlock in a second. Finally I started to check the list of running queries and resources they are waiting for. I found out, that all processes was waiting on newly inserted key into inventSettlement.DateVoucherIdx or prodTableJour.dateVoucherIdx. I drew a conclusion that they are waiting on new key, because when I tried to select data from the tables using %%lockres%% pseudo-column, I was unable to find record being locked). I made assumption that a transaction insert record, then everyone wait on it and then transaction is getting rolled-back because of deadlock detection. (This was the only reasonable explanation why all connections are waiting on in-existent record.) The waiting statements were one of the following update_recordset statements from inventCostClosingCancel_workInvent.duplicateSettlements() and inventCostProdCancel.prodTableJour() respectively: update_recordset inventSettlement setting Cancelled = NoYes::Yes, QtySettled = -1 * inventSettlement.QtySettled, CostAmountSettled = -1 * inventSettlement.CostAmountSettled, CostAmountAdjustment = -1 * inventSettlement.CostAmountAdjustment where inventSettlement.Voucher == newVoucher && inventSettlement.TransDate == newTransDate && inventSettlement.Cancelled == NoYes::No && inventSettlement.ItemId == itemId; update_recordset prodTableJour setting Canceled = NoYes::Yes, AmountFinancial = -1 * prodTableJour.AmountFinancial, AmountPhysical = -1 * prodTableJour.AmountPhysical, Adjustment = -1 * prodTableJour.Adjustment where prodTableJour.Voucher == newVoucher && prodTableJour.TransDate == newTransDate && prodTableJour.Canceled == NoYes::No && prodTableJour.ProdId == prodTable.ProdId; After I found and identified the places in the X++ code, which was causing deadlocks, I was finally able to understand the issue. Meaning of IC cancellation is quite simple: It simply insert exact copy of closing records, but with reversed sign. Since insert_recordset statement in X++ does not support expressions in select list, the author of the code found the simple pattern to overcome this: First the system inserts copy of old data (inventSettlement/prodTableJour) ‘as is’ (only voucher id and date are changed), next, special update_recordset statement (which supports some basic expressions in setting clause), is updating newly inserted records, inverting a value of the fields. Since this insert and update are made in 1 transaction, no other user can read our half-cooked records with un-inverted sign. From application logic point of view, since, every helper thread is processing its own itemId or JournalId, no lock conflicts or deadlock can occur. But since an update statement in SQL (and update_recordset statement in X++) has a habit of locking EVERY record it read, locking logic suddenly becomes in-determenistic.If SQL Server use execution plan with index scan by ItemId/JournalId indexm several threads avoids locking conflicts, since every of them update its own personal set of keys. But from time, SQL Server having a bad day and it is selecting voucher+date index for both queries. In this case,1st helper thread inserts records for item A and locks the related key entries in Voucher+Date index; Then the 1st thread start to iterate over WHOLE index entries for given voucher and date (which are common for all helpers, since all inventSettlement records of the cancellation use the same voucher and date). Sooner or later, this scan was trying to read uncommitted inserts of the second helper thread, which caused 1st thread to be locked until 2nd thread commit transaction. Second thread, in turn, sooner or later starts to iterate all the index entries for the same voucher and date. Sooner or later it was hitting the locks on records inserted by 1st thread. SQL Server, after finding classical deadlock situation (2 processes are waiting on each other), terminates one of the two (or several) deadlocking connections causing all the threads by one to redo the same transaction. In both cases, I fixed the issue by building a very specific index to prevent any potential conflicting index reads in this pieces of code. In the first case (for inventSettlement) it was an index on TransDate+Voucher+ItemId. In the second case (for prodTableJour) it was an index on TransDate+Voucher+ProdId. Since then, SQL Server always use these indexes for related update_recordset statements, thus eliminating this silly deadlock issue. (It is silly, because it was not caused by some real life conflict for the same resources. It was just a result of mis-optimization of a query in SQL Server). If you do not believe me or just want to reproduce the 2nd issue on your own test installation, try to disable itemDateIdx index on inventSettlement and NumDateIdx index on prodTableJour table. Then try to run inventory closing cancellation in batch mode with SQL profiler running deadlock trace. Moreover, when I was preparing this article, I managed to run into the same issue on standard CONTOSO database on standard international version of DAX2009ru7. It seems that the issue can be quite frequent and even average installation can benefit from building these two additional indexes, I mentioned on the previous paragraph. Lessons learned:
Источник: http://fedotenko.info/?p=228
__________________
Расскажите о новых и интересных блогах по Microsoft Dynamics, напишите личное сообщение администратору. |
|
13.01.2014, 18:22 | #2 |
Administrator
|
Спасибо. Точно такая ситуация у меня последние пару месяцев. Вернее, точно такие: и с закрытием, и с отменой.
Вопрос про закрытие: А что, если закрытие нефинансовых переносов перенести () из InventCostClosing.createInventCostList() в InventCostItemDim.updateItem() (но выполнять разумеется только в том случае, если InventCostList.NumOfIteration == 0). Тогда, вроде бы, как раз и получится обновление по отдельности каждого айтема? Или есть какие-то ещё причины, по которым все нефинансовые переносы должны быть закрыты до того, как начнётся закрытие склада? Спасибо ещё раз.
__________________
Not registered yet? Register here! Have comments, questions, suggestions or anything else regarding our web site? Don't hesitate, send them to me |
|
13.01.2014, 18:28 | #3 |
Moderator
|
Цитата:
Сообщение от Maxim Gorbunov
Спасибо. Точно такая ситуация у меня последние пару месяцев. Вернее, точно такие: и с закрытием, и с отменой.
Вопрос про закрытие: А что, если закрытие нефинансовых переносов перенести () из InventCostClosing.createInventCostList() в InventCostItemDim.updateItem() (но выполнять разумеется только в том случае, если InventCostList.NumOfIteration == 0). Тогда, вроде бы, как раз и получится обновление по отдельности каждого айтема? Или есть какие-то ещё причины, по которым все нефинансовые переносы должны быть закрыты до того, как начнётся закрытие склада? Спасибо ещё раз. Проверить наличие этого фикса можно по присутствию таблицы inventCostTmpNonFinancialTransfer |
|
|
За это сообщение автора поблагодарили: Maxim Gorbunov (10), mazzy (5), gl00mie (5), Aquarius (1). |
14.01.2014, 11:53 | #4 |
Administrator
|
Спасибо за наводку на хотфикс, Денис!
Небольшое уточнение для тех, у кого ещё возникнет эта проблема: этот хотфикс выпущен уже после RU8. На данный момент роллапа, который бы его включал, пока нет. Вот ссылка на хотфикс: https://mbs2.microsoft.com/Knowledge...SKZOKTYYRUYNTT
__________________
Not registered yet? Register here! Have comments, questions, suggestions or anything else regarding our web site? Don't hesitate, send them to me |
|