Key: CMP-739
Type: Improvement
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Shay Banon
Reporter: Sergio Bossa
Votes: 0
Watchers: 0
Compass

Split TerracottaFile when exceeding a given threshold in order to avoid OutOfMemoryErrors

Created: 14/Oct/08 05:14 AM   Updated: 26/Oct/08 09:47 AM
Component/s: Compass::Needle
Affects Version/s: 2.0.2
Fix Version/s: 2.1.0 GA


Description
I think the Terracotta integration should provide a way to automatically split a single, giant TerracottaFile instance into smaller ones before adding it to the TerracottaDirectory hash map.

That's because certain indexing operations over very large indexes often produce very large TerracottaFile instances. When such an instance is added to the clustered TerracottaDirectory hash map, it generates an in-memory Terracotta transaction that takes too much heap space and kills the client with an OutOfMemoryError.
Splitting oversized TerracottaFile instances into smaller ones would solve the problem: because the hash map put() method acts as a transaction boundary, each put() of a smaller instance would produce a correspondingly smaller in-memory Terracotta transaction.
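
To illustrate the idea, here is a minimal sketch in plain Java. It is not Compass or Terracotta code: the class `ChunkedPut`, the method `putInChunks`, the `name#index` key scheme and the `chunkSize` threshold are all hypothetical, and an ordinary `Map` stands in for the clustered hash map. The point is only that each put() carries at most `chunkSize` bytes.

```java
import java.util.Arrays;
import java.util.Map;

// Hypothetical helper for illustration; not part of Compass or Terracotta.
final class ChunkedPut {

    // Stores data under the keys "name#0", "name#1", ... so that every
    // put() call carries at most chunkSize bytes. Returns the chunk count.
    static int putInChunks(Map<String, byte[]> directory, String name,
                           byte[] data, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int chunks = 0;
        for (int offset = 0; offset < data.length; offset += chunkSize) {
            int end = Math.min(offset + chunkSize, data.length);
            // One bounded put() per chunk instead of one put() for the whole file.
            directory.put(name + "#" + chunks, Arrays.copyOfRange(data, offset, end));
            chunks++;
        }
        return chunks;
    }
}
```

If put() really is the transaction boundary in the clustered map, as assumed above, the largest transaction would then be bounded by the threshold rather than by the file size.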

What do you think?



Comments
Shay Banon added a comment - 15/Oct/08 11:01 AM
Makes perfect sense. Do you want to try and give it a go? I will try to get to it soon as well.

Shay Banon added a comment - 19/Oct/08 05:55 PM
Actually, the mechanism was already there with the flush rate; I just needed to add the file to the hash map once the index output was created.
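
One possible reading of that fix, sketched in plain Java: the file is registered in the directory map as soon as the output is created, and data then reaches it one bounded buffer at a time. This is not the actual Compass code. `FlushingOutput`, its methods and `bufferSize` are hypothetical names, and flushing on every full buffer is an assumption standing in for the real flush rate setting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names are illustrative, not the Compass API.
final class FlushingOutput {
    private final List<byte[]> buffers = new ArrayList<>(); // file contents
    private final byte[] current;                           // buffer being filled
    private int used = 0;

    FlushingOutput(Map<String, List<byte[]>> directory, String name, int bufferSize) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        this.current = new byte[bufferSize];
        // Register the still-empty file as soon as the output is created,
        // so later flushes only ever add one small buffer to it.
        directory.put(name, buffers);
    }

    void write(byte b) {
        current[used++] = b;
        if (used == current.length) {
            flush(); // assumed flush rate: one full buffer
        }
    }

    // Appends the bytes written so far as a new buffer; no-op when empty.
    void flush() {
        if (used == 0) {
            return;
        }
        buffers.add(Arrays.copyOf(current, used));
        used = 0;
    }
}
```

In this sketch no single step ever hands the whole file to the map at once, which is the property the issue asks for.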

Sergio Bossa added a comment - 20/Oct/08 03:38 AM
Thank you very much for the fix!

Just two questions:
1) When will 2.1.0 GA be released? In the meantime, are SVN snapshots supposed to be stable?
2) I see that the 2.1.0 release upgrades Lucene to 2.4.0: given that I just use the Terracotta integration needle, can I stay with an older Lucene version? Are there any incompatibilities?

Thanks again,

Sergio B.


Shay Banon added a comment - 20/Oct/08 03:52 AM
It would be great if you could test it! 2.1.0 GA will be released in a couple of weeks. Here is a link to tonight's nightly build, which includes the fixes: http://build.compass-project.org/browse/CMPTRK-NIGHTLY-292/artifact.

If you just use the Terracotta Directory implementation, you don't have to move to a newer version of Lucene.


Sergio Bossa added a comment - 20/Oct/08 04:04 AM
Sure, I'll test the nightly build and let you know.
Stay tuned!

Sergio Bossa added a comment - 26/Oct/08 05:48 AM
Hi Shay,
I've tested the Compass nightly build in my application, and it works pretty well: thank you again for your quick fix.
So when is the GA release supposed to go out?

Shay Banon added a comment - 26/Oct/08 09:47 AM
The GA will come out in a week. Great that it works.