A one-stop solution for a lot of statistical analysis of Lucene internals is the all-new Lucid Gaze for Lucene. Perhaps it has been around for a while for Solr, and I'm left to wonder... dude, where's the documentation? There aren't many places on the information superhighway where I could find info on how to use Lucid Gaze. A Google search for it would prove my point.
Once I did figure it out, it seemed like a good tool and easy to use (after the eureka moment, at least for me). Here's how I analyzed various things using Lucid Gaze by Lucid Imagination.
Pre-requisites:
- Lucene core jar from Lucid Imagination [Here]
Write the indexing/search logic. Open the Reader/Writer/Searcher as usual.
Create a RamUsageEstimator:
- RamUsageEstimator estimator = new RamUsageEstimator();
At the point where you want to analyze, call:
- estimator.estimateRamUsage(ir);
where ir is an IndexReader/IndexWriter/IndexSearcher.
__Snip__
Stats s;
s = LuceneCore.getIndexStats(); //For getting IndexStats
s = LuceneCore.getStoreStats(); //For getting StoreStats
s = LuceneCore.getSearchStats(); //For getting SearchStats
s = LuceneCore.getAnalysisStats(); //For getting AnalysisStats
__Snip Ends__
Once the above step is done, s holds a Stats object containing the Index/Store/Search/Analysis stats (depending on which function was called).
__Snip__
HashMap h = (HashMap) s.getCurrentCounters(); // Retrieve counters accumulated since last #resetStats()
__Snip Ends__
This HashMap is populated with the current stat counters. The keys are documented in the javadoc.
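Since the counters come back as a plain map, a small helper can turn them into sorted "label: key/value" lines, which makes two runs easier to compare by eye. This is only a sketch: the class name CounterDump and its format method are made up for illustration and are not part of Lucid Gaze, and the demo map stands in for whatever getCurrentCounters() returns.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not part of Lucid Gaze): formats a counter map as
// "label: key/value" lines, sorted by key.
final class CounterDump {

    static List<String> format(String label, Map<?, ?> counters) {
        // Copy into a TreeMap keyed by each key's string form to get a stable order.
        Map<String, Object> sorted = new TreeMap<>();
        for (Map.Entry<?, ?> e : counters.entrySet()) {
            sorted.put(String.valueOf(e.getKey()), e.getValue());
        }
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Object> e : sorted.entrySet()) {
            lines.add(label + ": " + e.getKey() + "/" + e.getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in data; in real use, pass the map obtained from the Stats object.
        Map<String, Long> demo = new HashMap<>();
        demo.put("iw_ram", 10487L);
        demo.put("ir_C", 0L);
        for (String line : format("indexStats", demo)) {
            System.out.println(line);
        }
    }
}
```

Accepting a `Map<?, ?>` keeps the helper usable with the raw HashMap from the snippet above without any further casts.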
The following is the kind of output expected on iterating through the entire HashMap.
__Code__
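import java.util.HashMap;
import org.apache.lucene.index.IndexReader;
// RamUsageEstimator, Stats and LuceneCore come from the Lucid Imagination
// Lucene core jar; see its javadoc for the package names.

IndexReader ir = IndexReader.open(indexName);
RamUsageEstimator estimator = new RamUsageEstimator();
estimator.estimateRamUsage(ir);
Stats s;
HashMap h;

// Index stats. h is a raw HashMap, so keys are iterated as Object.
s = LuceneCore.getIndexStats();
h = (HashMap) s.getCurrentCounters();
for (Object key : h.keySet())
    System.out.println("indexStats: " + key + "/" + h.get(key));

// Store stats. The sample output below repeats the index counters under
// this label, which suggests it was captured with getIndexStats() here.
s = LuceneCore.getStoreStats();
h = (HashMap) s.getCurrentCounters();
for (Object key : h.keySet())
    System.out.println("storeStats: " + key + "/" + h.get(key));

// Search stats
s = LuceneCore.getSearchStats();
h = (HashMap) s.getCurrentCounters();
for (Object key : h.keySet())
    System.out.println("searchStats: " + key + "/" + h.get(key));

// Analysis stats
s = LuceneCore.getAnalysisStats();
h = (HashMap) s.getCurrentCounters();
for (Object key : h.keySet())
    System.out.println("analysisStats: " + key + "/" + h.get(key));

ir.close();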
__Code Ends__
__Output__
indexStats: iw_adT/2000180
indexStats: ir_C/0
indexStats: ir_newC/1
indexStats: iw_C/1
indexStats: iw_segs/0
indexStats: iw_adC/15
indexStats: iw_buf/0
indexStats: iw_segC/1
indexStats: ir_ram/0
indexStats: iw_newC/1
indexStats: iw_ram/10487
storeStats: iw_adT/2000180
storeStats: ir_C/0
storeStats: ir_newC/1
storeStats: iw_C/1
storeStats: iw_segs/0
storeStats: iw_adC/15
storeStats: iw_buf/0
storeStats: iw_segC/1
storeStats: ir_ram/0
storeStats: iw_newC/1
storeStats: iw_ram/10487
analysisStats: toks/1
analysisStats: tss/30
__Output Ends__
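The comment in the earlier snippet says the counters accumulate since the last resetStats(), so one way to see what a single operation cost is to copy the counters before and after it and subtract. The sketch below is plain java.util code, not a Lucid Gaze API: CounterDelta and diff are hypothetical names, and it assumes the counter values are numeric (as they appear to be in the output above) and have already been copied into typed maps.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not part of Lucid Gaze): subtracts an earlier counter
// snapshot from a later one, key by key.
final class CounterDelta {

    static Map<String, Long> diff(Map<String, ? extends Number> before,
                                  Map<String, ? extends Number> after) {
        Map<String, Long> delta = new TreeMap<>();
        for (Map.Entry<String, ? extends Number> e : after.entrySet()) {
            // A counter missing from the earlier snapshot is treated as 0.
            Number old = before.get(e.getKey());
            long oldValue = (old == null) ? 0L : old.longValue();
            delta.put(e.getKey(), e.getValue().longValue() - oldValue);
        }
        return delta;
    }

    public static void main(String[] args) {
        // Stand-in snapshots; the "after" numbers are invented for the demo.
        Map<String, Long> before = new HashMap<>();
        before.put("iw_adC", 15L);
        before.put("iw_ram", 10487L);
        Map<String, Long> after = new HashMap<>();
        after.put("iw_adC", 20L);
        after.put("iw_ram", 10487L);
        System.out.println(diff(before, after));
    }
}
```

Copying each snapshot into a fresh map matters here, because a live counter map could otherwise change between the two reads.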
These counters give pretty much all the information needed to optimize searching, indexing, or any other process involving a Lucene index Reader/Writer/Searcher.
Thanks to the developers @ Lucid Imagination for coming up with this.
Thanks to Jayant and Nitish for the help. :)
Download Lucid Gaze for Lucene Here [@Lucid Imagination]