Two weeks ago, I got a case about a Lucene.NET performance issue during index building. The customer has an application hosted in a w3wp.exe process, which is used to build index data for a huge number of documents on their intranet.

Symptoms

To build the index for the entire intranet, the application always runs on an 8-core web server. About one hour after the application starts, index building slows down dramatically, and Task Manager shows a very large thread count in w3wp.exe. After six hours the application appears to hang, with indexing throughput down to about 100 documents per hour. For comparison, a laptop with only 2 cores can complete the same job in 6 hours.

Analysis

I captured a memory dump to analyze the situation. When the low performance occurred, I found a huge number of threads in w3wp.exe:

0:000> !threadpool
CPU utilization: 43%
Worker Thread: Total: 8686 Running: 8686 Idle: 0 MaxLimit: 32767 MinLimit: 8
Work Request in Queue: 2
AsyncTimerCallbackCompletion TimerInfo@0000006b07880a10
AsyncTimerCallbackCompletion TimerInfo@0000006b079de6c0
--------------------------------------
Number of Timers: 0
--------------------------------------
Completion Port Thread:Total: 2 Free: 2 MaxFree: 16 CurrentLimit: 2 MaxLimit: 1000 MinLimit: 8

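Why did the pool grow to 8,686 worker threads? When queued work items block instead of completing, the CLR thread pool keeps injecting additional worker threads (roughly one every half second) in an attempt to make progress. Below is a minimal sketch of that effect; it is completely unrelated to the customer's actual code and only illustrates the mechanism.

using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

class ThreadInjectionDemo
{
    static readonly object IndexLock = new object();

    static void Main()
    {
        // Queue far more work items than there are cores; every item blocks on one lock.
        for (int i = 0; i < 500; i++)
        {
            Task.Run(() =>
            {
                lock (IndexLock)
                {
                    Thread.Sleep(10000); // simulate a long, serialized write
                }
            });
        }

        // Watch the thread count climb far beyond the core count.
        for (int i = 0; i < 20; i++)
        {
            Thread.Sleep(1000);
            Console.WriteLine("Threads in process: " + Process.GetCurrentProcess().Threads.Count);
        }
    }
}
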
Most of the threads were waiting on a lock at that time:

0:8726> !syncblk
Index SyncBlock MonitorHeld Recursion Owning Thread Info SyncBlock Owner
230 0000006f53405798 8245 2 0000006fd14fb540 3e442959 0000006d881301c8 Lucene.Net.Index.DocumentsWriter

 What is the "lucky" thread doing at that time?

0:000> ~ 2959s
clr!ThreadQueue::EnqueueThread+0x47:
000007ff`ede67ffe 0f8529a6fcff    jne     clr!ThreadQueue::EnqueueThread+0x65 (000007ff`ede3262d) [br=1]
0:2959> !dumpstack
OS Thread Id: 0x3e44 (2959)
Current frame: clr!ThreadQueue::EnqueueThread+0x47
Child-SP         RetAddr          Caller, Callee
00000070033ad580 000007fff32e155a KERNELBASE!ResetEvent+0xa, calling ntdll!NtClearEvent
00000070033ad5b0 000007ffede6826e clr!SyncBlock::Wait+0xb2, calling clr!ThreadQueue::EnqueueThread
00000070033ad690 000007ffede9433d clr!AwareLock::OwnedByCurrentThread+0xe, calling clr!GetThread
00000070033ad6c0 000007ffede68198 clr!ObjHeader::Wait+0x24, calling clr!AwareLock::OwnedByCurrentThread
00000070033ad6f0 000007ffede68157 clr!ObjectNative::WaitTimeout+0xc7, calling clr!ObjHeader::Wait
00000070033ad790 000007ff8fd1c1b6 (MethodDesc 000007ff8fcd53a8 +0x56 Lucene.Net.Index.DocumentsWriter.WaitReady(Lucene.Net.Index.DocumentsWriterThreadState)), calling (MethodDesc 000007ffec6e4600 +0 System.Threading.Monitor.Wait(System.Object, Int32, Boolean))
00000070033ad828 000007ffede6810b clr!ObjectNative::WaitTimeout+0x7b, calling clr!LazyMachStateCaptureState
00000070033ad870 000007ff8fd1c1b6 (MethodDesc 000007ff8fcd53a8 +0x56 Lucene.Net.Index.DocumentsWriter.WaitReady(Lucene.Net.Index.DocumentsWriterThreadState)), calling (MethodDesc 000007ffec6e4600 +0 System.Threading.Monitor.Wait(System.Object, Int32, Boolean))
00000070033ad8e0 000007ff8fd1ab51 (MethodDesc 000007ff8fcd5338 +0x251 Lucene.Net.Index.DocumentsWriter.GetThreadState(Lucene.Net.Documents.Document, Lucene.Net.Index.Term)), calling (MethodDesc 000007ff8fcd53a8 +0 Lucene.Net.Index.DocumentsWriter.WaitReady(Lucene.Net.Index.DocumentsWriterThreadState))
00000070033ad900 000007ffedcd5b72 clr!CopyValueClassUnchecked+0x129, calling clr!ErectWriteBarrier_ASM
00000070033ad970 000007ff8fd1a19f (MethodDesc 000007ff8fcd5368 +0x4f Lucene.Net.Index.DocumentsWriter.UpdateDocument(Lucene.Net.Documents.Document, Lucene.Net.Analysis.Analyzer, Lucene.Net.Index.Term)), calling (MethodDesc 000007ff8fcd5338 +0 Lucene.Net.Index.DocumentsWriter.GetThreadState(Lucene.Net.Documents.Document, Lucene.Net.Index.Term))
00000070033ada00 000007ff8fd19e17 (MethodDesc 000007ff8f538d68 +0x67 Lucene.Net.Index.IndexWriter.AddDocument(Lucene.Net.Documents.Document, Lucene.Net.Analysis.Analyzer)), calling (MethodDesc 000007ff8fcd5368 +0 Lucene.Net.Index.DocumentsWriter.UpdateDocument(Lucene.Net.Documents.Document, Lucene.Net.Analysis.Analyzer, Lucene.Net.Index.Term))
00000070033ada80 000007ff8fd19756 (MethodDesc 000007ff8fcf6790 +0xa6 Sitecore.ContentSearch.LuceneProvider.LuceneUpdateContext.AddDocument(System.Object, Sitecore.ContentSearch.Linq.Common.IExecutionContext))
00000070033ada90 000007ff8f301bd7 (MethodDesc 000007ff8f21b040 +0x17 Sitecore.Common.Switcher`2[[System.Boolean, mscorlib],[System.__Canon, mscorlib]].Exit()), calling (MethodDesc 000007ff8f21b048 +0 Sitecore.Common.Switcher`2[[System.Boolean, mscorlib],[System.__Canon, mscorlib]].GetStack(Boolean))
00000070033adad0 000007ff8fc789e1 (MethodDesc 000007ff8f5344c0 +0x3d1 Sitecore.ContentSearch.SitecoreItemCrawler.DoAdd(Sitecore.ContentSearch.IProviderUpdateContext, Sitecore.ContentSearch.SitecoreIndexableItem)), calling CLRStub[VSD_DispatchStub]@7ff8efbc2a0
00000070033adba0 000007ff8fc78082 (MethodDesc 000007ff8f534140 +0x132 Sitecore.ContentSearch.HierarchicalDataCrawler`1[[System.__Canon, mscorlib]].CrawlItem(System.Tuple`3<System.__Canon,Sitecore.ContentSearch.IProviderUpdateContext,CrawlState`1<System.__Canon,System.__Canon>>))
00000070033adc80 000007ffecb3455e (MethodDesc 000007ffec7f1a48 +0x6e System.Threading.Tasks.Task.Execute())
00000070033adcf0 000007ffecaefd85 (MethodDesc 000007ffec6e1c28 +0x285 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean))
00000070033add00 000007ffecb3aeb3 (MethodDesc 000007ffec7fa718 +0x143 System.Threading.CancellationTokenRegistration.Dispose()), calling (MethodDesc 000007ffec807f08 +0 System.Threading.SparselyPopulatedArrayFragment`1[[System.__Canon, mscorlib]].SafeAtomicRemove(Int32, System.__Canon))
00000070033add10 000007ffede8e948 clr!ThreadLocalModule::GetGCStaticsBasePointer+0x28, calling clr!ThreadLocalModule::GetPrecomputedGCStaticsBasePointer
00000070033ade50 000007ffecaefae9 (MethodDesc 000007ffec6e1c18 +0x9 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)), calling (MethodDesc 000007ffec6e1c28 +0 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean))
00000070033ade60 000007ffedca4527 clr!LookupMap<MethodTable * __ptr64>::GetElement+0x67, calling clr!LookupMapBase::FindHotItemValuePtr
00000070033ade80 000007ffecb34335 (MethodDesc 000007ffec6ed5e8 +0x1c5 System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef)), calling (MethodDesc 000007ffec6e1c18 +0 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean))
00000070033ade90 000007ffedccdce1 clr!Module::LookupTypeDef+0x51, calling clr!LookupMap<MethodTable * __ptr64>::GetElement
00000070033adea8 000007ffede8e79c clr!JIT_GetGCThreadStaticBase_Helper+0x5c, calling clr!LazyMachStateCaptureState

The "lucky" thread was writing data to specified files at that time. The files are  _7d.fdt, _7d.fdx, _7d.tvx, _7d.tvf, _7d.tvd. During that time other threads are suspending to wait writing index data to these files.

Actually, the application builds the index data with Lucene.NET through the .NET Task Parallel Library:

00000070b266d910 000007ff8fd1a19f Lucene.Net.Index.DocumentsWriter.UpdateDocument(Lucene.Net.Documents.Document, Lucene.Net.Analysis.Analyzer, Lucene.Net.Index.Term)
00000070b266d9a0 000007ff8fd19e17 Lucene.Net.Index.IndexWriter.AddDocument(Lucene.Net.Documents.Document, Lucene.Net.Analysis.Analyzer)
// Some frames are not important.
00000070b266dc20 000007ffecb3455e System.Threading.Tasks.Task.Execute()
00000070b266dc90 000007ffecaefd85 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
00000070b266ddf0 000007ffecaefae9 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
00000070b266de20 000007ffecb34335 System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef)
00000070b266deb0 000007ffecb34155 System.Threading.Tasks.Task.ExecuteEntry(Boolean)

This causes the application to create many more threads on the 8-core server than on the 2-core laptop. But after the analysis step, all of the index data still has to be written to the same segment files under a single DocumentsWriter lock, and that serialized write is the bottleneck; adding threads only adds contention.
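
Reduced to a minimal sketch, the pattern looks roughly like this (hypothetical index path and field names, assuming the Lucene.Net 3.x API; this is not the customer's crawler code). Many TPL tasks funnel into one shared IndexWriter, so every AddDocument call contends for the same DocumentsWriter state:

using System;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;
using Version = Lucene.Net.Util.Version;

class ParallelIndexingSketch
{
    static void Main()
    {
        var directory = FSDirectory.Open(new DirectoryInfo(@"C:\temp\index")); // hypothetical path
        var analyzer = new StandardAnalyzer(Version.LUCENE_30);

        using (var writer = new IndexWriter(directory, analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED))
        {
            // One task per item, limited only by the thread pool: on an 8-core box the
            // scheduler runs more of these concurrently than on a 2-core laptop.
            var tasks = Enumerable.Range(0, 100000).Select(i => Task.Run(() =>
            {
                var doc = new Document();
                doc.Add(new Field("id", i.ToString(), Field.Store.YES, Field.Index.NOT_ANALYZED));
                doc.Add(new Field("body", "sample content " + i, Field.Store.NO, Field.Index.ANALYZED));
                writer.AddDocument(doc); // serialized internally by DocumentsWriter
            })).ToArray();

            Task.WaitAll(tasks);
            writer.Commit();
        }
    }
}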

Workaround

The workaround is to reduce the number of concurrent threads by limiting the CPU cores the application is allowed to run on. In the application pool's advanced settings there is a setting that specifies which cores the worker process may use, as shown below:

By default, this setting is disabled. You have to set Processor Affinity Enabled = True, and then set a correct Processor Affinity Mask value. The default value 4294967295 looks strange, but if you enter it into Calculator in programmer view you will understand it immediately: it is 0xFFFFFFFF, a mask with all 32 bits set.


Each bit represents a CPU core. If you want only two cores to run the application, the correct value is 3 (binary 11, which allows cores 1 and 2 to run the application). After applying the setting and re-running the application, the indexing job completed in 6 hours.
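
To make the bit math concrete, here is a small C# sketch (this is not the IIS configuration API; the ProcessorAffinity line only shows the same mask applied to an ordinary process). If you prefer to script the IIS change itself, the same switch lives on the application pool's cpu element in applicationHost.config (the smpAffinitized and smpProcessorAffinityMask attributes).

using System;
using System.Diagnostics;

class AffinityMaskDemo
{
    static void Main()
    {
        long allCores = 4294967295;      // the default mask: 32 bits, all set
        long firstTwoCores = 3;          // binary 11: core 1 and core 2 only

        Console.WriteLine(Convert.ToString(allCores, 2));       // 11111111111111111111111111111111
        Console.WriteLine(Convert.ToString(firstTwoCores, 2));  // 11

        // The same idea applied directly to a process: after this line the current
        // process is only scheduled on the first two logical CPUs.
        Process.GetCurrentProcess().ProcessorAffinity = (IntPtr)firstTwoCores;
    }
}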