Bullet performance problems - quite huge scene
Posted: Sat May 09, 2015 12:45 pm
by skides
After many days of fun with Bullet, performance problems began to appear.
One of the things I'm really trying to achieve is reducing the physics time spent on static objects.
My scenario:
- around 60-70 ms per stepSimulation call on a Q6600 @ 2.7 GHz
- most/all objects have 0 mass (!)
- heightfield 4096x4096
- around 12k objects (shapes generated by HACD [HACD demo output imported via btBulletWorldImporter] as btCollisionShape, which I believe contains a btCompoundShape)
- basic filtering between objects via masks (grass doesn't collide with other objects or other grass, around 40% total)
- removing the heightfield or setting its bitmask to 0 does not improve much
- most objects have different sizes (I keep one shape instance per specific scale; if both shape and scale match, the same shape is reused)
- setting the scale of all objects to 1.0 (more reuse of existing shapes) gives around a 30% boost, but it's not practical
- if I'm correct, I can't use a uniform or scaled shape to instance even more, because that works only with convex or triangle shapes, not compound shapes(?)
- the second parameter of stepSimulation is set to 5 because it was too common for the player/objects to fall under the heightfield or fly through objects (is there any other way to avoid that? especially the player falling under the terrain [mountainsides are worst])
- bullet 2.82
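A rough sketch of why the second stepSimulation parameter matters for cost: each frame is cut into fixed-size substeps, and the parameter caps how many of them may run. This is a simplified model rather than Bullet's own code; `SubstepClock` and `advance` are made-up names, and the 1/60 s fixed step is assumed to be left at Bullet's default.

```cpp
#include <algorithm>
#include <cassert>

// Simplified model of fixed-timestep substepping (not Bullet's actual code).
struct SubstepClock {
    double localTime = 0.0;  // time accumulated but not yet simulated

    // Returns how many fixed-size substeps a frame of length `dt` would run.
    int advance(double dt, int maxSubSteps, double fixedTimeStep = 1.0 / 60.0) {
        assert(dt >= 0.0 && maxSubSteps >= 1 && fixedTimeStep > 0.0);
        localTime += dt;
        int steps = 0;
        if (localTime >= fixedTimeStep) {
            steps = static_cast<int>(localTime / fixedTimeStep);
            localTime -= steps * fixedTimeStep;
        }
        // Steps beyond the cap are dropped, so simulated time falls behind.
        return std::min(steps, maxSubSteps);
    }
};
```

The "(5 calls)" next to internalSingleStepSimulation in the profile below is consistent with the cap of 5 being hit every frame, in which case a single substep costs roughly 63.5 / 5 ≈ 12.7 ms.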
flags for heightfield and objects:
Code: Select all
rigid_body->setCollisionFlags(rigid_body->getCollisionFlags() | btCollisionObject::CF_DISABLE_VISUALIZE_OBJECT|btCollisionObject::CF_STATIC_OBJECT);
CF_STATIC_OBJECT doesn't seem to make any difference.
I believe it is possible to make it work much better (especially when almost all objects are static [0 mass]).
Info about physics initialization:
Code: Select all
broadphase = new btDbvtBroadphase();
collisionConfiguration = new btDefaultCollisionConfiguration();
dispatcher = new btCollisionDispatcher(collisionConfiguration);
solver = new btSequentialImpulseConstraintSolver;
dynamicsWorld = new btDiscreteDynamicsWorld(dispatcher,broadphase,solver,collisionConfiguration);
debugDraw = new cEngine_physics_debugdraw;
Profiling info:
CProfileManager::dumpAll();
Code: Select all
Profiling: Root (total running time: 64.435 ms) ---
0 -- stepSimulation (99.99 %) :: 64.429 ms / frame (1 calls)
Unaccounted: (0.009 %) :: 0.006 ms
...----------------------------------
...Profiling: stepSimulation (total running time: 64.429 ms) ---
...0 -- synchronizeMotionStates (0.00 %) :: 0.002 ms / frame (5 calls)
...1 -- internalSingleStepSimulation (98.59 %) :: 63.519 ms / frame (5 calls)
...Unaccounted: (1.409 %) :: 0.908 ms
......----------------------------------
......Profiling: internalSingleStepSimulation (total running time: 63.519 ms) ---
......0 -- updateActivationState (0.00 %) :: 0.000 ms / frame (5 calls)
......1 -- updateActions (0.00 %) :: 0.000 ms / frame (5 calls)
......2 -- integrateTransforms (0.00 %) :: 0.003 ms / frame (5 calls)
......3 -- solveConstraints (0.08 %) :: 0.049 ms / frame (5 calls)
......4 -- calculateSimulationIslands (17.55 %) :: 11.145 ms / frame (5 calls)
......5 -- performDiscreteCollisionDetection (82.29 %) :: 52.267 ms / frame (5 calls)
......6 -- createPredictiveContacts (0.00 %) :: 0.003 ms / frame (5 calls)
......7 -- predictUnconstraintMotion (0.00 %) :: 0.001 ms / frame (5 calls)
......Unaccounted: (0.080 %) :: 0.051 ms
.........----------------------------------
.........Profiling: solveConstraints (total running time: 0.049 ms) ---
.........0 -- solveGroup (44.90 %) :: 0.022 ms / frame (5 calls)
.........1 -- processIslands (2.04 %) :: 0.001 ms / frame (5 calls)
.........2 -- islandUnionFindAndQuickSort (18.37 %) :: 0.009 ms / frame (5 calls)
.........Unaccounted: (34.694 %) :: 0.017 ms
............----------------------------------
............Profiling: solveGroup (total running time: 0.022 ms) ---
............0 -- solveGroupCacheFriendlyIterations (45.45 %) :: 0.010 ms / frame (5 calls)
............1 -- solveGroupCacheFriendlySetup (31.82 %) :: 0.007 ms / frame (5 calls)
............Unaccounted: (22.727 %) :: 0.005 ms
.........----------------------------------
.........Profiling: performDiscreteCollisionDetection (total running time: 52.267 ms) ---
.........0 -- dispatchAllCollisionPairs (30.85 %) :: 16.124 ms / frame (5 calls)
.........1 -- calculateOverlappingPairs (0.09 %) :: 0.046 ms / frame (5 calls)
.........2 -- updateAabbs (69.01 %) :: 36.072 ms / frame (5 calls)
.........Unaccounted: (0.048 %) :: 0.025 ms
.........----------------------------------
.........Profiling: createPredictiveContacts (total running time: 0.003 ms) ---
.........0 -- release predictive contact manifolds (66.67 %) :: 0.002 ms / frame (5 calls)
.........Unaccounted: (33.333 %) :: 0.001 ms
I would be grateful for any advice
Re: Bullet performance problems - quite huge scene
Posted: Sat May 09, 2015 4:24 pm
by gdlk
try this
Code: Select all
dynamicsWorld->setForceUpdateAllAabbs(false);
(actually I don't know what the consequences of that are, but it resolved the issue for me and I don't see any visible problem =P )
Re: Bullet performance problems - quite huge scene
Posted: Fri May 15, 2015 11:17 am
by skides
gdlk wrote:try this
Code: Select all
dynamicsWorld->setForceUpdateAllAabbs(false);
(actually I don't know what the consequences of that are, but it resolved the issue for me and I don't see any visible problem =P )
Thank you for the reply. It boosts the framerate by around 20 to 50%. Still, for an almost completely static scene the overhead is high. 30 ms is far from playable, and the player does not move smoothly.
Any other ideas or suggestions?
Re: Bullet performance problems - quite huge scene
Posted: Fri May 15, 2015 1:43 pm
by xexuxjy
May be a dumb question, but are you using a debug or release build of Bullet for your testing?
Re: Bullet performance problems - quite huge scene
Posted: Fri Jun 05, 2015 1:17 pm
by skides
xexuxjy wrote:May be a dumb question, but are you using a debug or release build of Bullet for your testing?
It was built as release (around 3.4 MB of static libs; debug has 22.8 MB).
The project is also a release build, with the -O3 flag.
Re: Bullet performance problems - quite huge scene
Posted: Fri Jun 05, 2015 3:03 pm
by drleviathan
When you set setForceUpdateAllAabbs(false) you say the time spent in stepSimulation() went down from about 64msec to 30msec. I notice in the profiling info that it used to be spending 36msec in updateAabbs() which means you cut it down to only about 2msec. This means that the next biggest chunks of time are being spent in dispatchAllCollisionPairs (about 16msec) and calculateSimulationIslands() (about 11ms).
I haven't experimented with lots of objects yet, so I don't have a good feel for what numbers should be expected, however after looking at the code in btCollisionWorld::performDiscreteCollisionDetection() my guess is that your broadphase has a large number of cached overlapping pairs. Since the collision group/mask feature is supposed to prevent such pairs from being created in the cache when their mutual collisions are disabled then I would further theorize that your collision groups/masks are not working like you expect or you have not disabled collision between groups that really shouldn't collide -- that is, many of your static objects are overlapping and would technically collide according to their group+mask configuration.
Note, I think (not 100% sure) you need to set group+mask correctly before you add the object to the world, and if you change the group+mask you basically have to remove and re-add it to the world in order to clear the cached overlapping pairs.
Re: Bullet performance problems - quite huge scene
Posted: Fri Jun 05, 2015 5:54 pm
by gdlk
Another test =P (although maybe you already did it): add the static objects as collision objects only, that is, replace
Code: Select all
dynamicsWorld.addRigidBody(rigidBody);
with
Code: Select all
dynamicsWorld.addCollisionObject(rigidBody);
when rigidBody is static
Re: Bullet performance problems - quite huge scene
Posted: Sun Jun 07, 2015 1:40 pm
by skides
drleviathan wrote:When you set setForceUpdateAllAabbs(false) you say the time spent in stepSimulation() went down from about 64msec to 30msec. I notice in the profiling info that it used to be spending 36msec in updateAabbs() which means you cut it down to only about 2msec. This means that the next biggest chunks of time are being spent in dispatchAllCollisionPairs (about 16msec) and calculateSimulationIslands() (about 11ms).
I haven't experimented with lots of objects yet, so I don't have a good feel for what numbers should be expected, however after looking at the code in btCollisionWorld::performDiscreteCollisionDetection() my guess is that your broadphase has a large number of cached overlapping pairs. Since the collision group/mask feature is supposed to prevent such pairs from being created in the cache when their mutual collisions are disabled then I would further theorize that your collision groups/masks are not working like you expect or you have not disabled collision between groups that really shouldn't collide -- that is, many of your static objects are overlapping and would technically collide according to their group+mask configuration.
Note, I think (not 100% sure) you need to set group+mask correctly before you add the object to the world, and if you change the group+mask you basically have to remove and re-add it to the world in order to clear the cached overlapping pairs.
Thank you for the response!
I took some time to fix the collision masks and the results are much better. Before the fixes, collisions were in fact being checked between static objects.
Results:
Code: Select all
----------------------------------
Profiling: Root (total running time: 8.962 ms) ---
0 -- stepSimulation (99.94 %) :: 8.957 ms / frame (1 calls)
Unaccounted: (0.056 %) :: 0.005 ms
...----------------------------------
...Profiling: stepSimulation (total running time: 8.957 ms) ---
...0 -- synchronizeMotionStates (0.01 %) :: 0.001 ms / frame (1 calls)
...1 -- internalSingleStepSimulation (85.24 %) :: 7.635 ms / frame (1 calls)
...Unaccounted: (14.748 %) :: 1.321 ms
......----------------------------------
......Profiling: internalSingleStepSimulation (total running time: 7.635 ms) ---
......0 -- updateActivationState (0.00 %) :: 0.000 ms / frame (1 calls)
......1 -- updateActions (0.00 %) :: 0.000 ms / frame (1 calls)
......2 -- integrateTransforms (0.01 %) :: 0.001 ms / frame (1 calls)
......3 -- solveConstraints (0.14 %) :: 0.011 ms / frame (1 calls)
......4 -- calculateSimulationIslands (57.09 %) :: 4.359 ms / frame (1 calls)
......5 -- performDiscreteCollisionDetection (42.51 %) :: 3.246 ms / frame (1 calls)
......6 -- createPredictiveContacts (0.03 %) :: 0.002 ms / frame (1 calls)
......7 -- predictUnconstraintMotion (0.00 %) :: 0.000 ms / frame (1 calls)
......Unaccounted: (0.210 %) :: 0.016 ms
.........----------------------------------
.........Profiling: solveConstraints (total running time: 0.011 ms) ---
.........0 -- solveGroup (54.55 %) :: 0.006 ms / frame (1 calls)
.........1 -- processIslands (9.09 %) :: 0.001 ms / frame (1 calls)
.........2 -- islandUnionFindAndQuickSort (0.00 %) :: 0.000 ms / frame (1 calls)
.........Unaccounted: (36.364 %) :: 0.004 ms
............----------------------------------
............Profiling: solveGroup (total running time: 0.006 ms) ---
............0 -- solveGroupCacheFriendlyIterations (33.33 %) :: 0.002 ms / frame (1 calls)
............1 -- solveGroupCacheFriendlySetup (33.33 %) :: 0.002 ms / frame (1 calls)
............Unaccounted: (33.333 %) :: 0.002 ms
.........----------------------------------
.........Profiling: performDiscreteCollisionDetection (total running time: 3.246 ms) ---
.........0 -- dispatchAllCollisionPairs (74.92 %) :: 2.432 ms / frame (1 calls)
.........1 -- calculateOverlappingPairs (0.22 %) :: 0.007 ms / frame (1 calls)
.........2 -- updateAabbs (24.61 %) :: 0.799 ms / frame (1 calls)
.........Unaccounted: (0.246 %) :: 0.008 ms
.........----------------------------------
.........Profiling: createPredictiveContacts (total running time: 0.002 ms) ---
.........0 -- release predictive contact manifolds (0.00 %) :: 0.000 ms / frame (1 calls)
.........Unaccounted: (100.000 %) :: 0.002 ms
13ms is much better, but I hope it can still be improved, as this is only a static scene.
To be clear, here is my mask setup:
Code: Select all
#define BIT(x) (1<<(x))
enum collisionGroups
{
    COL_NOTHING = 0,                   // collide with nothing
    COL_SYSTEM = (int)BIT(0),          // collide with raycast
    COL_OBJECTS = (int)BIT(1),         // collide with normal objects
    COL_TERRAIN = (int)BIT(2),         // collide with terrain
    COL_PLAYER = (int)BIT(3),          // collide with players
    COL_OBJECTS_GHOSTS = (int)BIT(4),  // collide with ghost objects (that only collide with terrain)
    COL_OBJECTS_STATIC = (int)BIT(5)   // static
};
enum collisionMasks
{
    COL_M_NOTHING = (int)(COL_NOTHING),  // collide with nothing
    COL_M_SYSTEM = (int)(COL_SYSTEM),    // collide with raycast
    COL_M_OBJECTS = (int)(COL_OBJECTS|COL_TERRAIN|COL_PLAYER|COL_SYSTEM|COL_OBJECTS_STATIC),  // normal objects
    COL_M_TERRAIN = (int)(COL_OBJECTS|COL_PLAYER|COL_OBJECTS_GHOSTS),                         // terrain
    COL_M_PLAYER = (int)(COL_TERRAIN|COL_OBJECTS|COL_PLAYER|COL_SYSTEM|COL_OBJECTS_STATIC),   // players
    COL_M_OBJECTS_GHOSTS = (int)(COL_TERRAIN|COL_SYSTEM),                                     // only with terrain
    COL_M_STATIC = (int)(COL_SYSTEM|COL_OBJECTS|COL_PLAYER)                                   // static
};
Static objects are created via: addRigidBody(RigidBody, COL_OBJECTS_STATIC, COL_M_STATIC);
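The mask table above can be checked on paper before touching the world. The sketch below assumes the usual two-sided rule of Bullet's default broadphase filter (each body's group must appear in the other body's mask); `Filter`, `pairAllowed` and `checkStaticFiltering` are illustrative names, not Bullet API.

```cpp
#include <cassert>

// Group bits copied from the post above.
enum Group {
    COL_SYSTEM = 1 << 0, COL_OBJECTS = 1 << 1, COL_TERRAIN = 1 << 2,
    COL_PLAYER = 1 << 3, COL_OBJECTS_GHOSTS = 1 << 4, COL_OBJECTS_STATIC = 1 << 5
};

struct Filter { int group; int mask; };

// Both sides must accept each other for a pair to be created.
inline bool pairAllowed(const Filter& a, const Filter& b) {
    return (a.group & b.mask) != 0 && (b.group & a.mask) != 0;
}

const Filter kStatic  = { COL_OBJECTS_STATIC, COL_SYSTEM | COL_OBJECTS | COL_PLAYER };
const Filter kTerrain = { COL_TERRAIN, COL_OBJECTS | COL_PLAYER | COL_OBJECTS_GHOSTS };
const Filter kPlayer  = { COL_PLAYER, COL_TERRAIN | COL_OBJECTS | COL_PLAYER |
                                      COL_SYSTEM | COL_OBJECTS_STATIC };

// Sanity check of the table; could be called once at startup in a debug build.
inline void checkStaticFiltering() {
    assert(!pairAllowed(kStatic, kStatic));   // statics ignore each other
    assert(!pairAllowed(kStatic, kTerrain));  // statics ignore the terrain
    assert(pairAllowed(kPlayer, kStatic));    // the player still hits statics
    assert(pairAllowed(kPlayer, kTerrain));   // the player still hits the terrain
}
```

With these values static/static and static/terrain pairs are rejected while the player still collides with both, which matches the intent described in the post.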
gdlk wrote:Another test =P (although maybe you already did it): add the static objects as collision objects only, that is, replace
Code: Select all
dynamicsWorld.addRigidBody(rigidBody);
with
Code: Select all
dynamicsWorld.addCollisionObject(rigidBody);
when rigidBody is static
Thank you for the response.
I have tried
Code: Select all
addCollisionObject(RigidBody, btBroadphaseProxy::StaticFilter, btBroadphaseProxy::DefaultFilter);
(the version without flags was a nightmare)
Here are the results:
Code: Select all
----------------------------------
Profiling: Root (total running time: 20.109 ms) ---
0 -- stepSimulation (99.97 %) :: 20.103 ms / frame (1 calls)
Unaccounted: (0.030 %) :: 0.006 ms
...----------------------------------
...Profiling: stepSimulation (total running time: 20.103 ms) ---
...0 -- synchronizeMotionStates (0.00 %) :: 0.000 ms / frame (1 calls)
...1 -- internalSingleStepSimulation (94.52 %) :: 19.001 ms / frame (1 calls)
...Unaccounted: (5.482 %) :: 1.102 ms
......----------------------------------
......Profiling: internalSingleStepSimulation (total running time: 19.001 ms) ---
......0 -- updateActivationState (0.00 %) :: 0.000 ms / frame (1 calls)
......1 -- updateActions (0.00 %) :: 0.000 ms / frame (1 calls)
......2 -- integrateTransforms (0.01 %) :: 0.001 ms / frame (1 calls)
......3 -- solveConstraints (0.06 %) :: 0.012 ms / frame (1 calls)
......4 -- calculateSimulationIslands (15.92 %) :: 3.025 ms / frame (1 calls)
......5 -- performDiscreteCollisionDetection (83.96 %) :: 15.953 ms / frame (1 calls)
......6 -- createPredictiveContacts (0.01 %) :: 0.001 ms / frame (1 calls)
......7 -- predictUnconstraintMotion (0.00 %) :: 0.000 ms / frame (1 calls)
......Unaccounted: (0.047 %) :: 0.009 ms
.........----------------------------------
.........Profiling: solveConstraints (total running time: 0.012 ms) ---
.........0 -- solveGroup (50.00 %) :: 0.006 ms / frame (1 calls)
.........1 -- processIslands (0.00 %) :: 0.000 ms / frame (1 calls)
.........2 -- islandUnionFindAndQuickSort (16.67 %) :: 0.002 ms / frame (1 calls)
.........Unaccounted: (33.333 %) :: 0.004 ms
............----------------------------------
............Profiling: solveGroup (total running time: 0.006 ms) ---
............0 -- solveGroupCacheFriendlyIterations (33.33 %) :: 0.002 ms / frame (1 calls)
............1 -- solveGroupCacheFriendlySetup (33.33 %) :: 0.002 ms / frame (1 calls)
............Unaccounted: (33.333 %) :: 0.002 ms
.........----------------------------------
.........Profiling: performDiscreteCollisionDetection (total running time: 15.953 ms) ---
.........0 -- dispatchAllCollisionPairs (0.03 %) :: 0.004 ms / frame (1 calls)
.........1 -- calculateOverlappingPairs (0.54 %) :: 0.086 ms / frame (1 calls)
.........2 -- updateAabbs (99.39 %) :: 15.856 ms / frame (1 calls)
.........Unaccounted: (0.044 %) :: 0.007 ms
.........----------------------------------
.........Profiling: createPredictiveContacts (total running time: 0.001 ms) ---
.........0 -- release predictive contact manifolds (100.00 %) :: 0.001 ms / frame (1 calls)
.........Unaccounted: (0.000 %) :: 0.000 ms
----------------------------------
Profiling: Root (total running time: 29.874 ms) ---
0 -- stepSimulation (99.96 %) :: 29.861 ms / frame (1 calls)
Unaccounted: (0.044 %) :: 0.013 ms
The comparison of the results is really interesting.
addRigidBody spends around 6 ms in dispatchAllCollisionPairs, while addCollisionObject spends almost nothing there.
Instead, addCollisionObject spends around 16 ms (!) in updateAabbs, which should be almost nothing, as with addRigidBody, because there is:
Code: Select all
dynamicsWorld->setForceUpdateAllAabbs(false);
It looks like addCollisionObject gets no benefit from it. If it did, the simulation time should drop to a very low level (as it should in a static scene!).
It's also worth noting that with addCollisionObject, adding/removing objects to/from the simulation seems to work faster.
Any more suggestions? Ideas?
Re: Bullet performance problems - quite huge scene
Posted: Mon Jun 08, 2015 6:38 am
by Basroil
Have you tried splitting up your height-field into multiple independent sections? Depending on your setup it could be causing more checks than it should.
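A minimal sketch of how such a split could be indexed along one axis, assuming "4096" means 4096 height samples per side (the thread does not say). `TileSpan` and `splitAxis` are hypothetical helpers; each tile would then presumably back its own heightfield shape and static body. Whether this actually reduces the cost is not something the thread has measured.

```cpp
#include <cassert>
#include <vector>

// One tile along one axis: first sample index and number of samples.
struct TileSpan { int firstSample; int sampleCount; };

// Split `samples` height samples into tiles of at most `cellsPerTile` cells.
// Neighbouring tiles share their border sample so no gap opens between them.
inline std::vector<TileSpan> splitAxis(int samples, int cellsPerTile) {
    assert(samples >= 2 && cellsPerTile >= 1);
    std::vector<TileSpan> tiles;
    const int cells = samples - 1;  // N samples span N-1 cells
    for (int first = 0; first < cells; first += cellsPerTile) {
        const int remaining = cells - first;
        const int tileCells = remaining < cellsPerTile ? remaining : cellsPerTile;
        tiles.push_back({first, tileCells + 1});
    }
    return tiles;
}
```

Applying the same split to both axes gives the tile grid; for example, `splitAxis(4096, 256)` yields 16 spans per axis, so 256 tiles in total.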
Re: Bullet performance problems - quite huge scene
Posted: Mon Jun 08, 2015 3:15 pm
by drleviathan
If the time spent in btCollisionWorld::updateAabbs() increased when you added static objects as "collision objects" rather than "rigid bodies" then something is messed up. After looking at the code of btDiscreteDynamicsWorld::addRigidBody() and btCollisionWorld::updateAabbs() I wouldn't expect that to happen (hint, you should look to see what they are doing). Did you perhaps accidentally lose your call to dynamicsWorld->setForceUpdateAllAabbs(false)?
Re: Bullet performance problems - quite huge scene
Posted: Mon Jun 08, 2015 9:55 pm
by skides
Basroil wrote:Have you tried splitting up your height-field into multiple independent sections? Depending on your setup it could be causing more checks than it should.
I will definitely try that!
drleviathan wrote:If the time spent in btCollisionWorld::updateAabbs() increased when you added static objects as "collision objects" rather than "rigid bodies" then something is messed up. After looking at the code of btDiscreteDynamicsWorld::addRigidBody() and btCollisionWorld::updateAabbs() I wouldn't expect that to happen (hint, you should look to see what they are doing). Did you perhaps accidentally lose your call to dynamicsWorld->setForceUpdateAllAabbs(false)?
I have double-checked: the posted results are correct. I have not lost the dynamicsWorld->setForceUpdateAllAabbs(false) call.
The results are repeatable by changing one line, so it shouldn't affect physics initialization (but I checked it anyway).
After examining a bit of the Bullet source code I discovered that it can still update AABBs even with dynamicsWorld->setForceUpdateAllAabbs(false) if the activation state is anything other than ISLAND_SLEEPING or DISABLE_SIMULATION. After manually setting those states the simulation time dropped by over 70%. On the other hand, I found that collisions with the player stopped working (only when using collision objects instead of rigid bodies).
Another thing: if I use masks for rigid bodies (player, dynamic objects) and collision objects (for statics), how do I set them correctly? I mean, collision objects use totally different masks, defined as:
Code: Select all
enum CollisionFilterGroups
{
    DefaultFilter = 1,
    StaticFilter = 2,
    KinematicFilter = 4,
    DebrisFilter = 8,
    SensorTrigger = 16,
    CharacterFilter = 32,
    AllFilter = -1 //all bits sets: DefaultFilter | StaticFilter | KinematicFilter | DebrisFilter | SensorTrigger
};
There is also rigidBody->setCollisionFlags:
Code: Select all
enum CollisionFlags
{
    CF_STATIC_OBJECT = 1,
    CF_KINEMATIC_OBJECT = 2,
    CF_NO_CONTACT_RESPONSE = 4,
    CF_CUSTOM_MATERIAL_CALLBACK = 8, //this allows per-triangle material (friction/restitution)
    CF_CHARACTER_OBJECT = 16,
    CF_DISABLE_VISUALIZE_OBJECT = 32, //disable debug drawing
    CF_DISABLE_SPU_COLLISION_PROCESSING = 64 //disable parallel/SPU processing
};
And standard masks based on
http://www.bulletphysics.org/mediawiki- ... _Filtering
Code: Select all
#define BIT(x) (1<<(x))
enum collisionGroups
{
    COL_NOTHING = 0,                   // collide with nothing
    COL_SYSTEM = (int)BIT(0),          // collide with raycast
    COL_OBJECTS = (int)BIT(1),         // collide with normal objects
    COL_TERRAIN = (int)BIT(2),         // collide with terrain
    COL_PLAYER = (int)BIT(3),          // collide with players
    COL_OBJECTS_GHOSTS = (int)BIT(4),  // collide with ghost objects (that only collide with terrain)
    COL_OBJECTS_STATIC = (int)BIT(5)   // static
};
enum collisionMasks
{
    COL_M_NOTHING = (int)(COL_NOTHING),  // collide with nothing
    COL_M_SYSTEM = (int)(COL_SYSTEM),    // collide with raycast
    COL_M_OBJECTS = (int)(COL_OBJECTS|COL_TERRAIN|COL_PLAYER|COL_SYSTEM|COL_OBJECTS_STATIC),  // normal objects
    COL_M_TERRAIN = (int)(COL_OBJECTS|COL_PLAYER|COL_OBJECTS_GHOSTS),                         // terrain
    COL_M_PLAYER = (int)(COL_TERRAIN|COL_OBJECTS|COL_PLAYER|COL_SYSTEM|COL_OBJECTS_STATIC),   // players
    COL_M_OBJECTS_GHOSTS = (int)(COL_TERRAIN|COL_SYSTEM),                                     // only with terrain
    COL_M_STATIC = (int)(COL_SYSTEM|COL_OBJECTS|COL_PLAYER)                                   // static
};
This seems a bit messy to me. There are a lot of options, but so far they are not explained well enough ;/
Using the same masks for collision objects results in worse performance than with rigid bodies.
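On mixing the two enumerations: as far as I can tell, group and mask are plain bit fields and Bullet's named filters are only defaults, so the custom bits can be passed to addCollisionObject just as they are to addRigidBody. What probably should not be done is mixing the two sets, because the same bit values mean different things in each. The sketch below uses only values quoted in this thread and assumes the usual two-sided group/mask test; `pairAllowed` and `playerHitsStatic` are illustrative names. It is not offered as a confirmed explanation of the lost player collisions.

```cpp
#include <cassert>

// Bullet's built-in filter bits, values as quoted in the post above.
enum BulletFilter { DefaultFilter = 1, StaticFilter = 2 };
// Custom group bits, values as quoted in the post above.
enum CustomGroup {
    COL_SYSTEM = 1, COL_OBJECTS = 2, COL_TERRAIN = 4,
    COL_PLAYER = 8, COL_OBJECTS_STATIC = 32
};

// The two schemes reuse the same bits for different meanings.
static_assert(static_cast<int>(StaticFilter) == static_cast<int>(COL_OBJECTS),
              "StaticFilter and COL_OBJECTS share bit 1");
static_assert(static_cast<int>(DefaultFilter) == static_cast<int>(COL_SYSTEM),
              "DefaultFilter and COL_SYSTEM share bit 0");

inline bool pairAllowed(int groupA, int maskA, int groupB, int maskB) {
    return (groupA & maskB) != 0 && (groupB & maskA) != 0;
}

// Player as set up in the thread: group COL_PLAYER, mask COL_M_PLAYER.
const int kPlayerGroup = COL_PLAYER;
const int kPlayerMask = COL_TERRAIN | COL_OBJECTS | COL_PLAYER |
                        COL_SYSTEM | COL_OBJECTS_STATIC;

// Would the player pair up with a static body using the given filter?
inline bool playerHitsStatic(int staticGroup, int staticMask) {
    assert(staticGroup != 0);  // a body with no group bit can never match
    return pairAllowed(kPlayerGroup, kPlayerMask, staticGroup, staticMask);
}
```

In this model `playerHitsStatic(StaticFilter, DefaultFilter)` is false, since the static mask (bit 0) does not contain the player's group (bit 3), while `playerHitsStatic(COL_OBJECTS_STATIC, COL_SYSTEM | COL_OBJECTS | COL_PLAYER)` is true.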
Re: Bullet performance problems - quite huge scene
Posted: Tue Jun 09, 2015 7:31 am
by drleviathan
Well I dunno. I looked over the code to see if I could see a reason why the simulation would be slower when adding static objects as btCollisionObjects rather than as btRigidBody's... I couldn't find it. I thought what was supposed to happen was that the btDbvtBroadphase container would maintain two distinct trees: one for static objects and one for dynamic and then would optimize things by never computing or caching overlapping pairs between Aabb's in the static tree, and also not considering objects in the static tree for simulation islands.
I tried to read the btDbvtBroadphase but found it rather obtuse. It will require more careful study.
The fact that you're seeing lots of time spent in updateAabbs() suggests that the static objects are considered "active" and are getting their Aabb's updated every frame.
I examined how the code logic pivots on the value of DISABLE_SIMULATION. As far as I can tell it primarily prevents setActivationState() from having any effect. So I guess the thing to try is to forceActivationState(DISABLE_SIMULATION) on all static objects after they are added to the world (and only on static objects). Of course, you also want to make sure to setCollisionFlags(CF_STATIC_OBJECT) on static objects.
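A small model of the behaviour discussed here, under the assumption that the 2.82 logic is as skides describes it: with forced updates off, an AABB is recomputed only for bodies whose activation state is neither ISLAND_SLEEPING nor DISABLE_SIMULATION. `Body`, `countAabbUpdates` and `forceStaticState` are made-up names; only the activation state constants mirror Bullet's. As far as I can tell, addRigidBody puts static bodies to sleep on its own while addCollisionObject leaves the state untouched, which would fit the measurements above, but that is worth verifying in the source.

```cpp
#include <cassert>
#include <vector>

// Activation state values as defined in Bullet's btCollisionObject.h.
enum ActivationState {
    ACTIVE_TAG = 1, ISLAND_SLEEPING = 2, WANTS_DEACTIVATION = 3,
    DISABLE_DEACTIVATION = 4, DISABLE_SIMULATION = 5
};

struct Body { bool isStatic; int activationState; };

// Sleeping or disabled bodies count as inactive.
inline bool isActive(const Body& b) {
    return b.activationState != ISLAND_SLEEPING &&
           b.activationState != DISABLE_SIMULATION;
}

// Counts how many AABBs one update pass would recompute in this model.
inline int countAabbUpdates(const std::vector<Body>& bodies, bool forceUpdateAll) {
    int updated = 0;
    for (const Body& b : bodies)
        if (forceUpdateAll || isActive(b)) ++updated;
    return updated;
}

// Hypothetical helper: force every static body into an inactive state.
inline void forceStaticState(std::vector<Body>& bodies, int state) {
    assert(state == ISLAND_SLEEPING || state == DISABLE_SIMULATION);
    for (Body& b : bodies)
        if (b.isStatic) b.activationState = state;
}
```

With 12000 static bodies left in ACTIVE_TAG plus one dynamic body, the model recomputes 12001 AABBs per pass even with forced updates off; after `forceStaticState(bodies, DISABLE_SIMULATION)` only the dynamic body remains.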
Edit: your collision groups/masks look correct to me.
Re: Bullet performance problems - quite huge scene
Posted: Tue Jun 09, 2015 12:26 pm
by xexuxjy
You could try a simple change and add an implementation of updateAabbs in btDiscreteDynamicsWorld (it's a virtual method) that only iterates over the m_nonStaticRigidBodies set.
I'm sure I've looked into why static object aabb's are recalculated before but I can't remember the result of that, possibly because they are sometimes moved... If you know they never will be then that might be a good optimization