[Cross posted at http://oregonstate.edu/~reeset/blog/archives/593]
One of the main UI complaints about the 0.8.x branch of LibraryFind concerns what happens after the user searches. Because LibraryFind must collocate all the search results before displaying them, what happens (or doesn't happen) while the user waits has been an admitted weakness of the program. That will change in 0.9. In 0.9, users will see what's being queried and the number of results, and they will have the option to kill the search at any time and view the results that have already been found.

One additional benefit of this method relates to thread locking. In 0.8.x, a single mongrel instance is locked until a query completes, because the interface never refreshes; it simply waits for the query to finish. In 0.9, the UI uses micro-queries that hit the server for short periods of time, allowing a mongrel to answer a quick request and then move on.

A quick comparison: in 0.8.x, a single query typically takes approximately 6-10 seconds to complete (depending on the number of targets queried), and a single mongrel thread is locked for the entire duration of that query. In 0.9, the average query on the server takes 0.005 seconds. While the 0.9 interface makes more queries to the server, these are separate queries, which lets the server better manage how its pack of mongrels is used and eliminates the thread locking that would otherwise take place during a search.
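To make the micro-query idea concrete, here is a minimal Python sketch of a polling loop. It is not LibraryFind's code (LibraryFind is a Rails application); `SearchJob`, `poll`, the tick-based clock, and the sample targets are all hypothetical names and values chosen for illustration. The only number taken from the post is the 0.005-second average server query time.

```python
from dataclasses import dataclass

SECONDS_PER_POLL = 0.005  # average 0.9 server-side query time quoted in the post


@dataclass
class SearchJob:
    """Hypothetical background search: each target finishes after N ticks."""
    targets: dict  # target name -> (ticks until finished, number of hits)
    tick: int = 0
    cancelled: bool = False

    def status(self):
        """Cheap status check: the kind of short request a worker answers quickly."""
        done = {name: hits
                for name, (ticks, hits) in self.targets.items()
                if ticks <= self.tick}
        return {
            "done": done,
            "pending": sorted(set(self.targets) - set(done)),
            "total_hits": sum(done.values()),
        }


def poll(job, max_polls=100, stop_after_hits=None):
    """Poll until every target is done, or cancel early once enough hits exist."""
    polls, snapshot = 0, None
    while polls < max_polls:
        snapshot = job.status()
        polls += 1
        if not snapshot["pending"]:
            break  # all targets have answered
        if stop_after_hits is not None and snapshot["total_hits"] >= stop_after_hits:
            job.cancelled = True  # the user "kills" the search and keeps partial results
            break
        job.tick += 1  # simulated time passing between polls
    return polls, snapshot


job = SearchJob({"catalog": (1, 12), "oai": (3, 40), "z3950": (6, 5)})
polls, final = poll(job)
print(polls, final["total_hits"], f"{polls * SECONDS_PER_POLL:.3f}s of worker time")
```

Each poll occupies a worker only for the duration of one short status request, so even a few dozen polls add up to a fraction of a second of worker time, compared with the 6-10 seconds a single blocking request holds a mongrel in 0.8.x.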
While the UI may change slightly as I finalize the 0.9 interface, here’s a snapshot of what this looks like now:
If you have questions about this, or other LibraryFind development, just give me a holler.
This will be the last intermediate release between the 0.8.x branch and the 0.9 branch. This release folds a few of the functions being added to 0.9 into the 0.8.x branch, as well as changes being made to accommodate some enhanced harvesting, etc.
So, changes to the 0.8.5.8 branch:
- JSON support finalized. I'll document how to access it, but there is now a JSON controller that lets you search and retrieve content, job statuses, groups and collections via a JSON interface.
- Harvester enhancement. The big changes allow harvesting of either individual sets or the entire server (and make the harvester smart enough to know if a set is part of a larger set that's already been harvested). 0.9 will have additional harvesting changes -- but this is a good start.
- opensearch changes. I've removed all references to the rexml library in the opensearch interaction. This was done specifically to speed up processing. Of course, using libxml means that some code needs to be added to autosense namespaces -- but that will be included with 0.9.
- mysql gem dependency has been removed. All database code now flows through active record. This was done to allow support for other database engines.
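The post doesn't say how the harvester decides that a set is already covered. As a rough sketch, and assuming the provider follows the OAI-PMH convention of colon-separated `setSpec` hierarchies (for example `theses:masters` nested under `theses`), the check could look like the following in Python. `WHOLE_REPOSITORY`, `ancestors`, and `already_covered` are hypothetical names, not part of LibraryFind.

```python
WHOLE_REPOSITORY = "*"  # hypothetical marker meaning "the entire server was harvested"


def ancestors(set_spec):
    """Return the set itself plus every enclosing set, outermost first."""
    parts = set_spec.split(":")
    return [":".join(parts[:i]) for i in range(1, len(parts) + 1)]


def already_covered(set_spec, harvested):
    """True if set_spec, one of its parent sets, or the whole server was harvested."""
    if WHOLE_REPOSITORY in harvested:
        return True
    return any(spec in harvested for spec in ancestors(set_spec))


harvested = {"theses"}
print(already_covered("theses:masters", harvested))  # child of a harvested set
print(already_covered("maps", harvested))            # unrelated set
```

Note that the check only runs one way: harvesting `theses:masters` does not mean the parent set `theses` has been covered.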
For 0.9, what's coming:
- UI changes: It's much more dynamic
- opensearch enhancements
- solr support. This will allow users to select either ferret or solr as their indexer.
- json enhancement (add a few more elements to the api)
- harvester enhancements
- mod_rails (so no more mongrel)
I get asked quite often what it takes to install the software, create a collection, etc. I'm currently working on some short, 5-10 minute videos (that I'll likely post to YouTube) to give folks some video presentations -- which I hope will be useful for current and potential LibraryFind users.
[cross-posted at: http://oregonstate.edu/~reeset/blog/archives/589]
As I've been working on 0.9, I've been trying to migrate a few odds and ends into the current 0.8 branch so that I can move them into production faster on our end. To that end, I'll be posting an update to LF by the beginning of next week. This update will include:
- Update to the Harvester. This makes it a bit more fault tolerant and allows harvesting of root-level oai providers (not currently supported) in addition to sets (currently supported). This required significant changes to the part of the search component that deals with harvested materials.
- Auto-detection of namespaces (for oai and sru -- needed for libxml)
- Removed rexml dependencies for opensearch component
- Frozen gems for oai, sru and opensearch into vendor directories (and have added that to the environment.rb file)
- Prep code for the solr/ferret decision. For 0.9, I'll be adding support for using either ferret or solr as your backend indexer for harvesting, and some changes are being made now to make this easier. Ferret provides an integrated rails solution, while solr would provide a hosted index option.
- In addition to this, some changes to the libxml module have deprecated a call being used in the oai gem (and maybe the sru gem). I'll take a look at both this week and update appropriately [as well as keep backwards compatibility if possible].
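For readers wondering what namespace auto-detection involves: the parser has to learn which prefixes and URIs a response declares before it can address elements by qualified name. LibraryFind does this in Ruby with libxml; the sketch below only illustrates the general idea using Python's standard library. `detect_namespaces`, the `default` alias for the unprefixed namespace, and the sample record are assumptions made for the example.

```python
import io
import xml.etree.ElementTree as ET

# Small OAI-PMH-style response used only for illustration.
SAMPLE = b"""<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample record</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def detect_namespaces(xml_bytes):
    """Return {prefix: uri} for every namespace declared in the document."""
    namespaces = {}
    for _event, (prefix, uri) in ET.iterparse(io.BytesIO(xml_bytes), events=("start-ns",)):
        # The default namespace has an empty prefix; give it a usable alias.
        namespaces.setdefault(prefix or "default", uri)
    return namespaces


ns = detect_namespaces(SAMPLE)
root = ET.fromstring(SAMPLE)
records = root.findall(".//default:record", ns)
title = root.find(".//dc:title", ns).text
print(len(records), title)
```

Because the prefix map is read from the response itself, the same lookup code keeps working when a provider declares its namespaces under different prefixes.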
Something else: we are starting to work with mod_rails. This allows apache to manage the rails environment, eliminating the need to run rails through packs of mongrels (or another specialized serving mechanism). I'll write up something on our experiences for others who might be interested in this approach.
As mentioned on Friday, I've tagged and posted 0.8.5.3. You can get this update here: http://www.libraryfind.org/release-0.8.5.3.tgz. So what's in this update? A couple of things:
- Optimizations related to the facet generation
- Added source to allow for Google FullText and Google Preview links.
- Corrected a misspelling in the code that could cause results to get truncated or not shown at all.
- Corrected Z39.50 parser to fix a small error when colons appear in the search strings under certain conditions.
- Corrected query type decipher -- it was selecting the wrong search type under rare conditions.
Anyway, these are the high-level changes. All of these changes have already been integrated into the 0.9 testing branch, which is still on track to be tagged in Dec.