I recognise this is difficult to solve for the general case, but more framework support for distributed index building would be very useful, because with non-trivial index sizes the build runtime can be substantial.
Moving to a Multi-JVM environment allows us to throw more hardware (CPU/Memory) at the problem.
Currently the parallel GPS devices, such as the Hibernate one, use an internal thread executor service to process the index build in parallel (typically at a sub-index level).
I've managed to distribute this by overriding that executor and using Coherence to distribute the work, which works but is a bit of a hack.
What would be ideal is some sort of baseline JMS implementation that allows index tasks to be pushed onto a work queue, with hook points for the GPS devices to pull work off that queue: basically a special type of Executor.
AFAIR Hibernate Search has something along these lines.
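
To make the "special type of Executor" idea a bit more concrete, here is a rough sketch of the shape I have in mind. It is not Compass API: the class name WorkQueueExecutor and the methods runNext and pending are made up for illustration, and an in-memory BlockingQueue stands in for the JMS queue so that the example only needs the JDK. A real version would presumably send a serializable description of the sub-index task over JMS rather than a Runnable.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: an Executor that only enqueues index tasks.
// Workers (possibly in other JVMs, in a real JMS-backed version) pull them off.
class WorkQueueExecutor implements Executor {

    // Stand-in for a JMS work queue (assumption: in-memory, single JVM).
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();

    // Producer side: the GPS device submits a sub-index task here
    // instead of handing it to its internal thread pool.
    @Override
    public void execute(Runnable indexTask) {
        queue.add(indexTask);
    }

    // Consumer hook point: pull one task and run it on the calling thread.
    // Returns false if no task arrived within the timeout or the wait was interrupted.
    public boolean runNext(long timeoutMillis) {
        try {
            Runnable task = queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            if (task == null) {
                return false;
            }
            task.run();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    // Number of tasks still waiting on the queue.
    public int pending() {
        return queue.size();
    }
}
```

The point of the split is that the device only ever sees a plain Executor, while whatever sits behind the queue decides where the task actually runs.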