Delay after Lookup has cached rows

Last Post 20 Dec 2012 12:45 PM by gunneyk. 2 Replies.
AddThis - Bookmarking and Sharing Button
Author Messages Informative
rerichards
New Member
New Member

--
19 Dec 2012 03:46 PM
We are using Visual Studio 2008, SQL 2008R2 (10.50.4000) Enterprise, Windows Server 2008 R2 SP1, 64 bit, 24 cores.

We have 128G RAM.

While loading our data warehouse we have SQL Server Max Memory set at 20480 (20G), leaving essentially 108G to the operating system and DTExec.

Perfmon (Available MBytes) and the log outputs show that during the load we still have upwards of 10G available memory.

Getting to my question...During the "Pre-Execute" phase of the data flow we have numerous Lookup components set with Full Cache. During the Pre-Execute phase these Lookups are being cached, as can be seen in the package Progress tab. One of these lookups caches 210 million rows, which takes between 8-10 minutes. After the 210 million rows are cached, there is a delay of about 1 hour before the Pre-Execute completes, this aforementioned Lookup being the final Lookup component in the data flow.

My question: Why the huge delay after the 210 million rows are cached and the Pre-Execute phase completes?
rerichards
New Member
New Member

--
20 Dec 2012 04:19 AM
Here is some additional info for clarification:

It takes 10 minutes to reach this point in the process, which provides this message in the Progress tab:
"[LKP Type Group [3529]] Information: component "LKP Type Group" (3529) has cached a total of 207354458 rows."

Then it takes 1 hour to perform the next process, , which provides these two messages in the Progress tab:
"[LKP Type Group [3529]] Information: The component "LKP Type Group" (3529) processed 207354458 rows in the cache. The processing time was 4322.94 seconds. The cache used 62828400774 bytes of memory.
Progress: Pre-Execute - 100 percent complete"
gunneyk
New Member
New Member

--
20 Dec 2012 12:45 PM
I am not familiar with the process enough to say why it is taking so long or even what it is doing during that time. However I have to wonder if it is worth caching so many rows in the first place. I woudl think a proper index on the lookup table would be much more effective than caching all those rows. From what I understand the caching is basically a hash based process and optimized for smaller sets. Sorry I couldn't be of more help than that.


Acceptable Use Policy
---