Rather than dwell on that I thought I'd talk about exactly what the process was that led to us discovering the fixing the error.
The first report came from the users; a lot which they knew had two item numbers was only showing one. Other lots didn't seem to be affected. This was then traced back to a single batch, 73591, running this SQL gave us the transactions that affected the cache for that batch. To publish this data I'm obfuscating the data (see here);
SELECT LGG.TRANSACTION_ID,
HIDEDATA.Hide('LGG.LN', LGG.LOT_NUMBER) Lot_Number,
LGG.TRANSACTION_QUANTITY,
LGG.TRANSACTION_TYPE_NAME,
LGG.Item_Number,
HIDEDATA.Hide('LGG.ID', LGG.Item_Description) Item_Description,
LGG.ITEM_TYPE
FROM LOT_GENEALOGY_GETTRANSACTIONS LGG
WHERE 1 = 1
AND LGG.Batch_No = '73591';
The result of this is the following transactions;
LG Transactions For A Single Batch |
The next stage for debugging the cache was to (on a test system!) change the LOT_GENEALOGY_GETTRANSACTIONS view so that rather than looking at *everything* it only looks at the records for a single batch - this is simply done by adding the following where clause to the view SQL;
-- Restrict transactions to a single batch (for debugging)
AND GBH.Batch_No = '73591'
Now we're restricted the input records to just those affecting that batch we can just re-build the whole cache - it will take longer to do the delete than the insert. The script to do it is;
begin
-- Call the procedure
lot_genealogy.refreshwholecache;
end;
Once that has been completed the entire content of the cache is the single batch we're tracking. The SQL to show it is;
SELECT HIDEDATA.Hide('LGG.LN', LGW.Master_Lot_Number) Master_Lot_Number,
HIDEDATA.Hide('LGG.LN', LGW.Ingred_Lot_Number) Ingred_Lot_Number,
LGW.Ingred_Item_Number,
HIDEDATA.Hide('LGG.ID', LGW.Ingred_Item_Description) Ingred_Item_Description,
LGW.Ingred_Item_Type_Code,
LGW.Batch_Number,
HIDEDATA.Hide('LGG.LN', LGW.PRODUCT_LOT_NUMBER) PRODUCT_LOT_NUMBER,
LGW.PRODUCT_ITEM_NUMBER,
HIDEDATA.Hide('LGG.ID', LGW.Product_Item_Description) Product_Item_Description,
LGW.Product_Item_Type_Code
FROM LOT_GENEALOGY_WHEREUSED LGW
WHERE 1 = 1
The result of this is;
Lot Genealogy Cache: Result for a Single Batch |
If we look at the source code (available here - via Google Drive) you'll see that the final part of the WIP Issue is creating the records in LOT_GENEALOGY_BATCH_PRODUCT so the next stage to check is to see if these records are being created correctly. Here is the SQL;
SELECT LGBP.batch_number,
HIDEDATA.Hide('LGG.LN', LGBP.PRODUCT_LOT_NUMBER) PRODUCT_LOT_NUMBER,
LGBP.product_item_number,
HIDEDATA.Hide('LGG.ID', LGBP.Product_Item_Description) Product_Item_Description,
LGBP.product_item_type_code
FROM LOT_GENEALOGY_BATCH_PRODUCT LGBP
WHERE 1 = 1
AND LGBP.Batch_Number = '73591'
This gives us the result;
Lot Genealogy Batch Products |
If you look at the first one;
INSERT INTO NOETIX_SYS.Lot_Genealogy_WhereUsed
SELECT v_Transaction.Lot_Number, -- Master_Lot_Number
v_Transaction.Lot_Number, -- Ingred_Lot_Number
v_Transaction.Item$item, -- INGRED_ITEM$ITEM
v_Transaction.Item_Description, -- INGRED_ITEM_DESCRIPTION
v_Transaction.Item_Type_Code, -- INGRED_ITEM_TYPE_CODE
v_Transaction.Batch_Number, -- Batch_Number
v_Product.Product_Lot_Number, -- Product_Lot_Number
v_Product.PRODUCT_ITEM$ITEM, -- PRODUCT_ITEM$ITEM
v_Product.PRODUCT_ITEM_DESCRIPTION, -- PRODUCT_ITEM_DESCRIPTION
v_Product.PRODUCT_ITEM_TYPE_CODE -- PRODUCT_ITEM_TYPE_CODE
FROM DUAL
WHERE NOT EXISTS
(SELECT 1
FROM NOETIX_SYS.Lot_Genealogy_WhereUsed LGI
WHERE LGI.MASTER_LOT_NUMBER = v_Transaction.Lot_Number
AND LGI.Ingred_LOT_NUMBER = v_Transaction.Lot_Number
AND LGI.Batch_Number = v_Transaction.Batch_Number
AND (v_Product.Product_Lot_Number IS NULL AND
LGI.PRODUCT_LOT_NUMBER IS NULL OR
v_Product.Product_Lot_Number = LGI.PRODUCT_LOT_NUMBER));
You can see that this is actually just a select from DUAL so the fact that it's only working on the first pass (i.e. for the first product) means that the offending part of the code must be the NOT EXISTS.
Looking at the WHERE clause in the sub-select reveals that it's not using the Product_Item_Number field. Not using this field means that after it's inserted the first product it mistakes the second one for a duplicate and skips over it.
Altering the final AND statement to;
AND (v_Product.Product_Lot_Number IS NULL AND LGI.PRODUCT_LOT_NUMBER IS NULL OR (v_Product.Product_Lot_Number = LGI.PRODUCT_LOT_NUMBER AND v_Product.Product_Item_Number = LGI.PRODUCT_ITEM_NUMBER))
For BOTH pieces of SQL (the one updating all the existing records as well as this one which is creating a new master record) addresses the issue logically, if you rebuild the cache from scratch (using the script above) and re-run the SQL to get the content of the cache you will now see;
Lot Genealogy Cache - Complete |
If you do a source comparison between the new and old versions of the package you'll notice that this wasn't the only change I made - I've added a great deal more logging to the WIP Issue transaction so it's possible to see what is going on and what "SQL%ROWCOUNT = 0" actually means! I also added a warning to WIP Completion if no records are updated.
I hope this is of some interest ... Debugging an application, even one you've written yourself, is a skill that's very difficult to transfer. Hopefully there is enough information here for people to at least make a start for future issues (which will, by their very nature, be completely different I'm sure!).
No comments:
Post a Comment