Dear Bullet community,
We work at Vector Fabrics, where we develop tools to perform semi-automatic parallelization of C/C++ software and offer parallelization consultancy services. Recently we have been looking for parallelization opportunities in the Bullet Physics engine and we wanted to share and check our initial findings with you.
We have been performing our tests on a modified version of the 1000 stacks benchmark of the AppBenchmarks test. We modified it so that there are several islands, each with a large number of objects. The version of Bullet we use is 2.81, SVN revision 2613. Our experiments were performed on an Intel Core i5 with four cores.
We have discovered that the loop that creates and solves islands in the function btSimulationIslandManager::buildAndProcessIslands can be parallelized with relatively little effort. We have parallelized this loop using OpenMP and we obtain nice speedups over the sequential version. Of course, there has to be enough work to exploit the parallelism, so we obtain speedups only when the simulation contains multiple islands.
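To illustrate the general pattern (this is neither our patch nor Bullet's code), here is a minimal sketch of solving one island per loop iteration with OpenMP. The names `Island`, `solveIsland` and `solveAllIslands` are hypothetical, and the "solver" is a toy that only touches its own island's data. The sketch assumes that islands share no mutable state, which is what makes the loop safe to run in parallel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one simulation island: a group of bodies whose
// constraints reference only each other, so islands share no mutable state.
struct Island {
    std::vector<double> velocities;  // toy per-body state
};

// Hypothetical per-island solver: a few damping iterations that read and
// write only this island's data. This is not Bullet's constraint solver.
void solveIsland(Island& island, int iterations) {
    assert(iterations >= 0);
    for (int it = 0; it < iterations; ++it)
        for (double& v : island.velocities)
            v *= 0.5;
}

// Solve all islands, one island per loop iteration. When compiled with
// OpenMP enabled (e.g. -fopenmp), iterations are distributed over threads;
// otherwise the pragma is ignored and the loop runs sequentially with the
// same result.
void solveAllIslands(std::vector<Island>& islands, int iterations) {
    const int n = static_cast<int>(islands.size());
    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < n; ++i)
        solveIsland(islands[i], iterations);
}
```

The `schedule(dynamic)` clause is a choice made for this sketch: islands can differ a lot in size, so handing them out to threads on demand may balance the load better than a static split. With a single island there is only one iteration, so no speedup can be expected.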
What we wanted to ask is whether parallelizing this loop has been considered before. We see from an old forum post that the idea of solving one island per thread is not new. Is there a reason why it is not in the source tree? Maybe having multiple islands is not a situation that occurs often in practice?
We are planning to release the patch and our experiments sometime in the near future.
Kristian Kolev and Alexey Rodriguez Yakushev
See the comment by RobW, which mentions that the solver calculates an island per thread.