Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project.
MongoDB (from "humongous") is a scalable, high-performance, open source, document-oriented data store.
I was happy using MongoDB alongside my very own search engine, written by extending Lucene, until the trunks for Solr and Lucene were merged. The merge meant that Solr now used the same release of Lucene that I was using, unlike in the past, when there was some disconnect between the two. I realized that a lot of what I was trying to build was already available through Solr.
Solr is used by a lot of organizations (which can be found here), and I'm sure that at least a few of them also use Mongo, yet for some reason there was, and still is, no straightforward out-of-the-box import handler for data stored in MongoDB.
This made me search for a framework, module or plugin to do the same, but in vain.
All said and done, here's how I was finally able to index my MongoDB data into Solr.
I've used SolrJ to access my Solr instance and a Mongo connector to connect to Mongo. Having written my own sweet layer that has access to both parts of the app, I have been able to inject data as required.
--snip--
import java.net.MalformedURLException;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

//Build the base URL of the Solr instance and return a SolrJ client for it
public SolrServer getSolrServer(String solrHost, String solrPort) throws MalformedURLException {
    String urlString = "http://"+solrHost+":"+solrPort+"/solr";
    return new CommonsHttpSolrServer(urlString);
}
--/snip--
Fire the Mongo query, iterate over the results and add each one to the index:
--snip--
SolrServer server = getSolrServer(..); //Get a server instance
DBCursor curr = ..; //Fire query @ mongo, get the cursor
while (curr.hasNext()) { //iterate over the result set
    BasicDBObject record = (BasicDBObject) curr.next();
    //Do some magic, get a document bean (doc) from the record
    server.addBean(doc);
}
server.commit(); //make the newly added documents visible to searches
--/snip--
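The "magic" step above depends on your data, but here is a minimal sketch of one way to approach it: flatten the Mongo record into flat field names that a Solr schema can hold. Everything in it is my own assumption for illustration, not part of SolrJ or the Mongo driver: the class name MongoToSolrFields, the underscore-joined field names, and the choice to map Mongo's _id to a Solr field called id. It takes a plain Map as input; as far as I know BasicDBObject in the Java driver is itself a Map, so a record should be usable directly, but treat that as something to verify against your driver version.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: turns a nested Mongo-style record into flat field/value pairs.
public class MongoToSolrFields {

    // Flattens nested maps, e.g. {"address": {"city": "Pune"}} becomes {"address_city": "Pune"}.
    public static Map<String, Object> flatten(Map<String, ?> record) {
        Map<String, Object> fields = new LinkedHashMap<>();
        flattenInto("", record, fields);
        return fields;
    }

    private static void flattenInto(String prefix, Map<?, ?> source, Map<String, Object> target) {
        for (Map.Entry<?, ?> entry : source.entrySet()) {
            String name = prefix + entry.getKey();
            Object value = entry.getValue();
            if (value == null) {
                continue; // skip missing values rather than indexing nulls
            }
            if (value instanceof Map) {
                // Nested document: recurse, joining names with an underscore (assumed convention)
                flattenInto(name + "_", (Map<?, ?>) value, target);
            } else if ("_id".equals(name)) {
                // Top-level Mongo _id becomes a string under "id" (assumed unique key name)
                target.put("id", value.toString());
            } else if (value instanceof Collection) {
                // Lists are copied as-is; they suit a multi-valued field
                target.put(name, new ArrayList<Object>((Collection<?>) value));
            } else {
                target.put(name, value);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "Pune");
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("_id", 42);
        record.put("title", "Indexing Mongo data");
        record.put("address", address);
        // prints {id=42, title=Indexing Mongo data, address_city=Pune}
        System.out.println(flatten(record));
    }
}
```

From here, each entry can be copied into a SolrInputDocument with addField(name, value) and sent with server.add(..) instead of addBean, or used to populate your own bean. Either way, the resulting field names have to exist in your Solr schema.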
This will get you started on indexing Mongo data into a running Solr instance.
Also, remember to configure Solr correctly for this to run smoothly; in particular, the fields you send need to be defined in the Solr schema.