Tuesday, October 29, 2013

Third Bangalore Apache Solr/Lucene Meetup

We just had the third Bangalore Apache Solr/Lucene meetup this past weekend. It's good fun to see the community grow to 200+ members. Actually, as I type this, we're already just 3 short of the 250 mark.

As requested by a fair number of members, we had a "Solr 101" by Varun Thacker from Unbxd. Right before leaving for his talk at Lucene Revolution Europe 2013, he got a lot of attendees both introduced to and interested in Solr.




His talk was followed by a talk on a "DIY Bookmarks manager using Lucene" by Umar Shah from SimplyPhi, Bangalore. It was a nice demo, and I'm sure it motivated people to try out similar DIY projects using either Lucene or Solr.



Next came Ganesh M from Dell Sonicwall, who gave a quick talk on "Building Custom Analyzers in Lucene". His short take on a complex but interesting part of Lucene certainly got the advanced users interested.



In the last talk, I spoke about "MoreLikeThis in Solr/Lucene". I started off with some history of the current MLT implementation, how it really works, and the kinds of issues people are known to run into when using this feature. I also spoke about a MoreLikeThis QueryParser for Solr that I've been working on as part of my official work at LucidWorks. We plan to open source it as soon as I have time to put it out and document it a bit.





This may well be the last meetup I organize and attend in Bangalore this year. Good luck to Varun and Shalin with organizing this going forward.

Thanks to Microsoft Accelerator for giving us the venue to host the meetup yet again. It's one of the most centrally located and well equipped spaces in Bangalore for such meetups.

In case you'd like to join the meetup group and be a part of the active community, here's where to do that: http://www.meetup.com/Bangalore-Apache-Solr-Lucene-Group/ .

Collection Aliasing in SolrCloud

One of the many features that have come out for SolrCloud is collection aliasing. As the name suggests, it aliases a collection: in other words, it gives a collection another name, a pointer that can be changed behind the scenes.
Among other things, the most important use of aliasing a collection is the ability to change an index without having to modify the client applications. It decouples the client's view from the actual index. So let's see how we can use this feature in practice for some rather common tasks.

Collection aliasing command:
http://<hostname>:<port>/solr/admin/collections?
                                         action=CREATEALIAS&
                                         name=alias-name&
                                         collections=list-of-collections
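
As a rough illustration, here is a minimal Python sketch that builds the CREATEALIAS URL shown above. The function name, the localhost:8983 address and the alias/collection names are assumptions made up for this example; only the action, name and collections parameters come from the command itself.

```python
from urllib.parse import urlencode


def create_alias_url(base_url, alias, collections):
    """Build a Collections API CREATEALIAS URL (helper name is hypothetical)."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        # An alias may point at several collections, passed comma-separated.
        "collections": ",".join(collections),
    }
    # Keep the commas readable instead of percent-encoding them.
    return f"{base_url}/solr/admin/collections?{urlencode(params, safe=',')}"


# Assumed host, alias and collection names, for illustration only.
print(create_alias_url("http://localhost:8983", "myindex-read", ["myindex_v1"]))
```

Fetching the resulting URL with any HTTP client (a browser or curl will do) issues the command.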

Fig 1. myindex with a read and a write alias each

First, it gives users the ability to reindex their content into another collection and then swap it in for the old one. If you begin with a setup as in Fig. 1, you'd follow these steps:

  • Switch the write alias to a new index.
  • Start re-indexing using the index update client that you use. That way you never change the name/alias but behind the scenes, all the updates go to a new index.
  • Once the re-indexing is complete, change the read alias to use the new index too.
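
The three steps above can be sketched in Python roughly as follows. This is only an outline under some assumptions: the new collection has already been created, `send` is whatever function you use to issue an HTTP GET, and `reindex` stands in for your own index update client. All function, alias and collection names here are hypothetical.

```python
from urllib.parse import urlencode


def alias_url(base_url, alias, collection):
    """CREATEALIAS URL pointing `alias` at a single collection."""
    params = {"action": "CREATEALIAS", "name": alias, "collections": collection}
    return f"{base_url}/solr/admin/collections?{urlencode(params)}"


def swap_index(base_url, write_alias, read_alias, new_collection, send, reindex):
    """Reindex into `new_collection` while readers keep using the old index."""
    # Step 1: point the write alias at the new (already created) collection.
    send(alias_url(base_url, write_alias, new_collection))
    # Step 2: reindex through the unchanged write alias.
    reindex(write_alias)
    # Step 3: once reindexing is done, point the read alias at it too.
    send(alias_url(base_url, read_alias, new_collection))


if __name__ == "__main__":
    # Dry run: print the calls instead of sending them.
    swap_index(
        "http://localhost:8983", "myindex-write", "myindex-read", "myindex_v2",
        send=print,
        reindex=lambda alias: print(f"reindexing via {alias}"),
    )
```

Passing `send` and `reindex` in as arguments keeps the sketch independent of any particular HTTP library or indexing client.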

Updating an existing alias:
An existing alias can be updated with just a fresh CREATEALIAS call with the new alias specifications.

Second, the collection aliasing command lets users specify a single name for a set of collections. This comes in handy if the data has time windows, e.g. month-wise. Every month can be a collection by itself, and things like last-month and last-quarter can be aliases for the appropriate months. It can also be useful when data gets added incrementally, e.g. in the case of travel/geo search. A continent could be an alias consisting of collections holding data for certain countries. As data for other countries from the continent comes in, you may create new collections for those countries and add them to the existing continent alias.
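
To make the time-window case concrete, here is a small Python sketch that works out which monthly collections a last-quarter alias would cover and builds the matching CREATEALIAS URL. The logs_YYYY_MM naming scheme, the host, and the reading of "last quarter" as the previous three calendar months are all assumptions for illustration.

```python
from datetime import date
from urllib.parse import urlencode


def previous_months(today, count, prefix="logs"):
    """Names of the `count` monthly collections before the current month."""
    year, month = today.year, today.month
    names = []
    for _ in range(count):
        month -= 1
        if month == 0:  # step back across a year boundary
            year, month = year - 1, 12
        names.append(f"{prefix}_{year}_{month:02d}")
    return names


def window_alias_url(base_url, alias, today, count):
    """CREATEALIAS URL covering the previous `count` monthly collections."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        "collections": ",".join(previous_months(today, count)),
    }
    return f"{base_url}/solr/admin/collections?{urlencode(params, safe=',')}"


print(window_alias_url("http://localhost:8983", "last-quarter", date(2013, 10, 29), 3))
```

Since an alias is updated by simply calling CREATEALIAS again, re-running something like this at the start of each month would roll the window forward.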

There's practically no limit to the use cases aliasing can serve, but I hope the ones mentioned above give you a broad idea of what aliasing is about.

Related Readings:
JIRA: https://issues.apache.org/jira/browse/SOLR-4497

Apache Solr Guide