Key: CMP-581
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Critical
Assignee: Shay Banon
Reporter: Michael Lossos
Votes: 0
Watchers: 0
Compass

Deadlock in TransIndex.commit after upgrading from 2.0.0M1 to 2.0.0M2.

Created: 17/Mar/08 11:17 AM   Updated: 18/Mar/08 12:27 PM
Component/s: Compass::Core
Affects Version/s: 2.0.0 M2
Fix Version/s: 2.0.0 M3

Environment:
Compass 2.0.0M2 (released) and also 2.0.0M2 nightly build #57. Lucene 2.3.
Spring Framework 2.0.7.0, transactions, etc.


Description
I'm trying to update our application from Compass 2.0.0M1 to 2.0.0M2. I'm holding all other things constant and only changing the Compass and Lucene jars. I'm recreating the search index for our data and I'm seeing a deadlock in Lucene's IndexWriter. It appears to be waiting on a signal from the merge thread. (This is called from Compass's TransIndex.commit().) When I look at the other Compass threads that are running (Compass Scheduled Executor and two Compass Executor threads), none of them are active; they're waiting on work, and none of them indicate they might be a merge thread. When I look at the sub index directory, there's a write.lock there, though it's not clear which thread is holding the lock. I'm deleting the entire index directory before running this scenario to ensure there's no existing lock.

Doing the exact same steps with 2.0.0M1 has no problems and recreates the search index.

I haven't had a chance to dig deeper into Compass / Lucene on this. Is it a known problem?
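
For readers trying to reproduce this kind of diagnosis, a minimal sketch of listing every live thread with its state and stack from inside the JVM follows. It is not the tooling used in this report; the class name `ThreadStateDump` and its `dump()` method are hypothetical, and only standard `java.lang.Thread` calls are used. A thread that is stuck waiting for a notification that never arrives would typically show up as WAITING or TIMED_WAITING with `Object.wait` near the top of its stack.

```java
import java.util.Map;

public class ThreadStateDump {

    /** Builds a plain-text report of every live thread: name, state, and stack frames. */
    static String dump() {
        StringBuilder report = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            report.append('"').append(thread.getName()).append('"')
                  .append(" state=").append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                report.append("    at ").append(frame).append('\n');
            }
        }
        return report.toString();
    }

    public static void main(String[] args) {
        // Print the report; in a hung application this would be called from a watchdog thread.
        System.out.print(dump());
    }
}
```

Comparing the waiting thread's stack against the idle executor threads is one way to confirm that no thread is left to perform the work being waited on.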



Comments
Shay Banon added a comment - 17/Mar/08 12:22 PM
It is not a known problem. I would suggest trying to run it against the latest Compass code (though I don't remember fixing something in that area). Is there a chance that you can create a simple test case?

Michael Lossos added a comment - 18/Mar/08 05:14 AM
Further digging indicates that this is a synchronization problem in Lucene's IndexWriter. I don't think that Compass's ExecutorMergeScheduler is responsible. You can close this bug out. Sorry for the misfiling.

Here's the Lucene bug I've filed:

https://issues.apache.org/jira/browse/LUCENE-1239


Michael Lossos added a comment - 18/Mar/08 07:54 AM
Nix my last comment. When I use Lucene's ConcurrentMergeScheduler, the deadlock goes away. Michael McCandless on the Lucene project says:

Michael McCandless - 18/Mar/08 05:20 AM
If you replace Compass's ExecutorMergeScheduler with Lucene's ConcurrentMergeScheduler, does the deadlock still happen?

One thing that makes me nervous about ExecutorMergeScheduler is this comment:

// Compass: No need to execute continous merges, we simply reschedule another merge, if there is any, using executor manager

and the corresponding change which is to schedule a new job instead of using the while loop to run new merges. If I understand that code correctly, the executorManager will re-call the run() method on MergeThread when there is a cascaded merge. But that won't do the right thing because it will run "startMerge" rather than the newly returned (cascaded) merge. That would then cause the deadlock because the cascaded merge is never issued.
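
The failure mode described above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions: it does not use Lucene or Compass classes, and `CascadedMergeSketch`, `MergeSource`, `runWithLoop`, and `runWithNaiveReschedule` are hypothetical names. Merges are modelled as strings, and each pass of the second loop stands in for the executor calling `run()` again on the same job.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class CascadedMergeSketch {

    /** Hypothetical stand-in for the writer: hands out pending (cascaded) merges in order. */
    static final class MergeSource {
        private final Deque<String> pending = new ArrayDeque<>();

        MergeSource(String... merges) {
            for (String merge : merges) {
                pending.add(merge);
            }
        }

        /** Returns the next pending merge, or null when none remain. */
        String getNextMerge() {
            return pending.poll();
        }
    }

    /** Loop style: the merge returned by the source is the one executed next. */
    static List<String> runWithLoop(MergeSource source, String startMerge) {
        List<String> executed = new ArrayList<>();
        String merge = startMerge;
        while (merge != null) {
            executed.add(merge);
            merge = source.getNextMerge();
        }
        return executed;
    }

    /**
     * Naive reschedule style: each pass models the executor re-invoking run().
     * The job always executes startMerge, and the cascaded merge it fetched is dropped.
     */
    static List<String> runWithNaiveReschedule(MergeSource source, String startMerge) {
        List<String> executed = new ArrayList<>();
        boolean rescheduled = true;
        while (rescheduled) {
            executed.add(startMerge);                // always the original merge
            String cascaded = source.getNextMerge(); // fetched, but never executed
            rescheduled = cascaded != null;          // the job is merely resubmitted
        }
        return executed;
    }

    public static void main(String[] args) {
        System.out.println(runWithLoop(new MergeSource("cascade-1", "cascade-2"), "initial"));
        System.out.println(runWithNaiveReschedule(new MergeSource("cascade-1", "cascade-2"), "initial"));
    }
}
```

In the loop variant the executed list is `[initial, cascade-1, cascade-2]`. In the naive reschedule variant it is `[initial, initial, initial]`: the cascaded merges are taken from the source but never run, so anything waiting for them to finish would wait indefinitely.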


Shay Banon added a comment - 18/Mar/08 09:33 AM
Spot on. I hope I have fixed it now. Is there a chance that you can test with the latest trunk?

Michael Lossos added a comment - 18/Mar/08 11:02 AM
I'll pull tonight's build when Bamboo has it ready and test it.
Thanks for the fix!

Shay Banon added a comment - 18/Mar/08 11:29 AM
I kicked off a nightly build; it should be ready soon. Let's get this nailed down.

Michael Lossos added a comment - 18/Mar/08 11:49 AM
It's almost 1am here in Hong Kong, but I'll try to have it checked before you have to go to the pub there in London.

Michael Lossos added a comment - 18/Mar/08 12:04 PM
Looks fixed! I grabbed build 59 and checked several scenarios that resulted in deadlock previously when using the ExecutorMergeScheduler. No deadlock.

Thanks for all the hard work Shay!


Shay Banon added a comment - 18/Mar/08 12:26 PM
Thanks for the effort, mate! I am glad that you found this one; it's a nasty one. I will close this issue.