Stefan Fussenegger added a comment - 09/Jul/08 07:16 AM
attached patched version
Mmm, I see the problem. I am wondering whether a global lock will actually become a bottleneck. Why do you think that subsegment locking is an overhead? The size method is a bit of a nasty one in ConcurrentHashMap, so the list method can actually do this:
public final String[] list() {
    Set<String> fileNames = fileMap.keySet();
    return fileNames.toArray(new String[0]);
}
This will not cause size to be called. It traverses the key set and adds the keys internally (within the concurrent hash map) to an array list, which is then turned into an array. Using toArray(new String[0]) should fix the problem, right?
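For anyone who wants to try the snippet above in isolation, here is a minimal self-contained sketch. The class name FileMapListing, the byte[] value type and the addFile helper are assumptions made for illustration only; just the fileMap field and the list() method come from this thread.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the directory class; not the real implementation.
class FileMapListing {
    // File name -> file content. The value type is an assumption.
    private final Map<String, byte[]> fileMap = new ConcurrentHashMap<String, byte[]>();

    // Hypothetical helper so the example has something to list.
    public void addFile(String name, byte[] content) {
        fileMap.put(name, content);
    }

    // The list() method as proposed in the comment above: the zero-length
    // array only tells toArray which element type to return.
    public final String[] list() {
        Set<String> fileNames = fileMap.keySet();
        return fileNames.toArray(new String[0]);
    }

    public static void main(String[] args) {
        FileMapListing dir = new FileMapListing();
        dir.addFile("segments.gen", new byte[0]);
        dir.addFile("_0.cfs", new byte[0]);
        String[] names = dir.list();
        Arrays.sort(names); // ConcurrentHashMap gives no ordering guarantee
        System.out.println(Arrays.toString(names));
    }
}
```

The listing is only a weakly consistent snapshot: files added or removed by other threads while the key set is being traversed may or may not show up in the result.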
From my experience, batch indexing 800,000 (rather small) documents with concurrent threads was about 15% faster (I don't have concrete numbers, this is just from memory) than using a ConcurrentHashMap. I can't give a definite reason, but it is probably because ConcurrentHashMap does not use read-write locking. Furthermore, given the rather small number of files in a directory (where small means fewer than 1000), locking the whole directory for a full iteration does not seem to be a high price to pay for faster (and concurrent) reads. However, I'll try to come up with some concrete numbers to support that.
btw: I'd really like to see this implementation become part of Terracotta Forge, as I am not using Compass itself, sorry about that.
Best regards

I fixed it to use new String[0].
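For comparison, the whole-directory read-write locking argued for above could look roughly like the sketch below. This is not the attached patch; the class name LockedFileMap, the byte[] value type and the get/put methods are assumptions chosen only to illustrate the idea of one java.util.concurrent ReentrantReadWriteLock guarding a plain HashMap.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of a directory guarded by a single read-write lock.
class LockedFileMap {
    // A plain HashMap is safe here because every access holds the lock.
    private final Map<String, byte[]> fileMap = new HashMap<String, byte[]>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Readers share the read lock, so lookups can run concurrently.
    public byte[] get(String name) {
        lock.readLock().lock();
        try {
            return fileMap.get(name);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Writers take the exclusive write lock.
    public void put(String name, byte[] content) {
        lock.writeLock().lock();
        try {
            fileMap.put(name, content);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // A full iteration holds the read lock for the whole listing, which
    // blocks writers but not other readers.
    public String[] list() {
        lock.readLock().lock();
        try {
            return fileMap.keySet().toArray(new String[0]);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

The trade-off is the one discussed in this thread: list() returns a consistent snapshot and reads do not block each other, but any write stalls every reader for its duration.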
I think that the benefit of the concurrent hash map is the fact that Terracotta can handle its fetching and discarding of "files" from the client better (when the whole directory does not fit into the JVM memory). But if it turns out to be better in terms of performance, let's open another issue, and we can discuss it there. I will ask the terracotta fellows as well.

Wow, that was quick!
"I will ask the terracotta fellows as well": shall we start a public discussion in their forums, or do you have other channels to use?