performance questions/uncertainties

Myagi
Posts: 5
Joined: Fri Jul 22, 2005 7:57 pm

Post by Myagi »

After over a year of agonizing over Havok vs Bullet, I now have to make a decision. Well, I had some other work come in between, so (luckily) I wasn't forced to decide until now.

I badly want to use Bullet, but the reality is I need to make the choice based on what best serves the project's chances of seeing the light of day, as I have very limited resources. So, to get a better overview of the two, I wanted to compare performance, just to get a general idea of where they stand. I'm a little uncertain whether I'm interpreting the results correctly, though; maybe someone here could confirm or deny their sanity.

So my test case is the "Ragdoll On Stairs" demo in Havok and the "Ragdoll Demo" in Bullet, running on a single core. In Bullet I spawned a couple more, so I have 5 capsule-based ragdolls in each demo. I put them all in a pile on the open floor in both. Rendering is vsync-locked to 60 fps and confirmed with Fraps, with no textures or shadows, to keep the setups as similar as possible.

The results I'm getting are as follows (I've done multiple runs, rearranging the pile significantly, and always get the same numbers):

when idle/deactivated:
bullet: stepSim 0.8 ms (seems quite high for a completely idle scene?)
havok: 0.01something ms

when stuff is dragged a little so the island is alive:
bullet: stepSim 3.1 - 3.8 ms
havok: <0.9 ms


The differences are quite significant, which I guess might not be that strange, but the high idle time in Bullet makes me wonder if I'm doing something wrong or if it looks about right.
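
To put the active-island numbers in perspective, here is a small sketch of the arithmetic. It only uses the timings quoted above; the function names are made up for illustration, and because the Havok figure is reported as "<0.9 ms", the ratios it yields are lower bounds rather than exact values.

```cpp
#include <cstdio>

// Frame budget in milliseconds at a given refresh rate (60 fps -> ~16.67 ms).
double frameBudgetMs(double fps) { return 1000.0 / fps; }

// Percentage of the frame budget consumed by one simulation step.
double budgetSharePercent(double stepMs, double fps) {
    return 100.0 * stepMs / frameBudgetMs(fps);
}

// How many times longer 'slowMs' is than 'fastMs'.
double slowdown(double slowMs, double fastMs) { return slowMs / fastMs; }

// Prints the comparison for the active-island timings quoted in the post.
void printActiveComparison() {
    const double bulletLoMs = 3.1;  // Bullet stepSim, lower end of the range
    const double bulletHiMs = 3.8;  // Bullet stepSim, upper end of the range
    const double havokMaxMs = 0.9;  // Havok reported as "<0.9 ms" (upper bound)
    const double fps = 60.0;        // vsync-locked frame rate

    std::printf("Bullet is at least %.1fx to %.1fx slower than Havok\n",
                slowdown(bulletLoMs, havokMaxMs),
                slowdown(bulletHiMs, havokMaxMs));
    std::printf("Bullet uses %.1f%% to %.1f%% of a %.2f ms frame\n",
                budgetSharePercent(bulletLoMs, fps),
                budgetSharePercent(bulletHiMs, fps),
                frameBudgetMs(fps));
}
```

Calling printActiveComparison() from main() prints both lines. The same helpers can be reused for the idle numbers once an exact Havok idle figure is available.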
Erwin Coumans
Site Admin
Posts: 4221
Joined: Sun Jun 26, 2005 6:43 pm
Location: California, USA

Re: performance questions/uncertainties

Post by Erwin Coumans »

Are you using the unmodified ReleaseRagdollDemo.exe (not AllBulletDemos.exe), built as a Windows MSVC 9.0 release build of the latest Bullet 2.75 RC7 (or later)?

You can spawn ragdolls using the 'e' key.
The full simulation time of 5 ragdolls takes 1.1 ms on this 3 GHz Core 2 Duo Q6800 (single-threaded version of Bullet).

What machine are you testing it on? Can you report more detailed timings, using CProfileManager::dumpAll(); right after 'stepSimulation'?
Thanks,
Erwin
Myagi
Posts: 5
Joined: Fri Jul 22, 2005 7:57 pm

Re: performance questions/uncertainties

Post by Myagi »

I used an out-of-the-box 2.75 RC6 version; that's the latest one on the download page.

I was using AllBulletDemos and an older VS. I recompiled ReleaseRagdollDemo using VC9 and got the following results for the same test:

idle stepSim: 0.5 - 0.6 ms
active stepSim: 2.3 - 2.9 ms

(I always made sure that everything actually was resting/idle, as displayed with the debug color on the bodies, during idle tests)

edit: forgot to mention what machine I have, a P4 3GHz. Since I'm only comparing the relative performance between havok and bullet on the same machine I didn't think that was important.


dumpAll() for idle:

Profiling: Root (total running time: 16.683 ms) ---
0 -- rayTest (0.00 %) :: 0.000 ms / frame (0 calls)
1 -- debugDrawWorld (0.02 %) :: 0.003 ms / frame (1 calls)
2 -- stepSimulation (3.28 %) :: 0.548 ms / frame (1 calls)
Unaccounted: (96.697 %) :: 16.132 ms
...----------------------------------
...Profiling: stepSimulation (total running time: 0.548 ms) ---
...0 -- synchronizeMotionStates (0.36 %) :: 0.002 ms / frame (2 calls)
...1 -- internalSingleStepSimulation (97.26 %) :: 0.533 ms / frame (1 calls)
...Unaccounted: (2.372 %) :: 0.013 ms
......----------------------------------
......Profiling: internalSingleStepSimulation (total running time: 0.533 ms) ---
......0 -- updateActivationState (0.56 %) :: 0.003 ms / frame (1 calls)
......1 -- updateActions (0.19 %) :: 0.001 ms / frame (1 calls)
......2 -- integrateTransforms (0.38 %) :: 0.002 ms / frame (1 calls)
......3 -- solveConstraints (16.70 %) :: 0.089 ms / frame (1 calls)
......4 -- calculateSimulationIslands (7.13 %) :: 0.038 ms / frame (1 calls)
......5 -- performDiscreteCollisionDetection (64.17 %) :: 0.342 ms / frame (1 calls)
......6 -- predictUnconstraintMotion (8.44 %) :: 0.045 ms / frame (1 calls)
......Unaccounted: (2.439 %) :: 0.013 ms
.........----------------------------------
.........Profiling: solveConstraints (total running time: 0.089 ms) ---
.........0 -- processIslands (7.87 %) :: 0.007 ms / frame (1 calls)
.........1 -- islandUnionFindAndQuickSort (80.90 %) :: 0.072 ms / frame (1 calls)
.........Unaccounted: (11.236 %) :: 0.010 ms
............----------------------------------
............Profiling: processIslands (total running time: 0.007 ms) ---
............0 -- solveGroup (0.00 %) :: 0.000 ms / frame (0 calls)
............Unaccounted: (100.000 %) :: 0.007 ms
...............----------------------------------
...............Profiling: solveGroup (total running time: 0.000 ms) ---
...............0 -- solveGroupCacheFriendlyIterations (0.00 %) :: 0.000 ms / frame (0 calls)
...............1 -- solveGroupCacheFriendlySetup (0.00 %) :: 0.000 ms / frame (0 calls)
...............Unaccounted: (0.000 %) :: 0.000 ms
.........----------------------------------
.........Profiling: performDiscreteCollisionDetection (total running time: 0.342 ms) ---
.........0 -- dispatchAllCollisionPairs (96.20 %) :: 0.329 ms / frame (1 calls)
.........1 -- calculateOverlappingPairs (0.29 %) :: 0.001 ms / frame (1 calls)
.........2 -- updateAabbs (1.46 %) :: 0.005 ms / frame (1 calls)
.........Unaccounted: (2.047 %) :: 0.007 ms

dumpAll() for active:

Profiling: Root (total running time: 16.648 ms) ---
0 -- rayTest (0.00 %) :: 0.000 ms / frame (0 calls)
1 -- debugDrawWorld (0.01 %) :: 0.002 ms / frame (1 calls)
2 -- stepSimulation (16.38 %) :: 2.727 ms / frame (1 calls)
Unaccounted: (83.608 %) :: 13.919 ms
...----------------------------------
...Profiling: stepSimulation (total running time: 2.727 ms) ---
...0 -- synchronizeMotionStates (2.35 %) :: 0.064 ms / frame (2 calls)
...1 -- internalSingleStepSimulation (96.99 %) :: 2.645 ms / frame (1 calls)
...Unaccounted: (0.660 %) :: 0.018 ms
......----------------------------------
......Profiling: internalSingleStepSimulation (total running time: 2.645 ms) ---
......0 -- updateActivationState (0.15 %) :: 0.004 ms / frame (1 calls)
......1 -- updateActions (0.04 %) :: 0.001 ms / frame (1 calls)
......2 -- integrateTransforms (1.25 %) :: 0.033 ms / frame (1 calls)
......3 -- solveConstraints (27.26 %) :: 0.721 ms / frame (1 calls)
......4 -- calculateSimulationIslands (1.36 %) :: 0.036 ms / frame (1 calls)
......5 -- performDiscreteCollisionDetection (67.71 %) :: 1.791 ms / frame (1 calls)
......6 -- predictUnconstraintMotion (1.74 %) :: 0.046 ms / frame (1 calls)
......Unaccounted: (0.491 %) :: 0.013 ms
.........----------------------------------
.........Profiling: solveConstraints (total running time: 0.721 ms) ---
.........0 -- processIslands (90.29 %) :: 0.651 ms / frame (1 calls)
.........1 -- islandUnionFindAndQuickSort (8.18 %) :: 0.059 ms / frame (1 calls)
.........Unaccounted: (1.526 %) :: 0.011 ms
............----------------------------------
............Profiling: processIslands (total running time: 0.651 ms) ---
............0 -- solveGroup (86.79 %) :: 0.565 ms / frame (1 calls)
............Unaccounted: (13.210 %) :: 0.086 ms
...............----------------------------------
...............Profiling: solveGroup (total running time: 0.565 ms) ---
...............0 -- solveGroupCacheFriendlyIterations (56.99 %) :: 0.322 ms / frame (1 calls)
...............1 -- solveGroupCacheFriendlySetup (41.24 %) :: 0.233 ms / frame (1 calls)
...............Unaccounted: (1.770 %) :: 0.010 ms
.........----------------------------------
.........Profiling: performDiscreteCollisionDetection (total running time: 1.791 ms) ---
.........0 -- dispatchAllCollisionPairs (97.38 %) :: 1.744 ms / frame (1 calls)
.........1 -- calculateOverlappingPairs (0.11 %) :: 0.002 ms / frame (1 calls)
.........2 -- updateAabbs (2.12 %) :: 0.038 ms / frame (1 calls)
.........Unaccounted: (0.391 %) :: 0.007 ms
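
For comparing the two dumps entry by entry, a small parser can help. The sketch below is not part of Bullet; ProfileEntry and parseProfileLine are hypothetical names, and the format string is simply inferred from the lines shown above (index, name, percentage, ms per frame, call count). Header, separator and "Unaccounted" lines are rejected.

```cpp
#include <cstdio>
#include <string>

// One timed entry from a dumpAll() listing (hypothetical helper type).
struct ProfileEntry {
    std::string name;
    double percent = 0.0;     // share of the parent's running time
    double msPerFrame = 0.0;  // time spent per frame in milliseconds
    int calls = 0;            // number of calls per frame
};

// Parses a line such as
//   ".........0 -- dispatchAllCollisionPairs (97.38 %) :: 1.744 ms / frame (1 calls)"
// Returns false for lines that do not match this shape.
bool parseProfileLine(const std::string& line, ProfileEntry& out) {
    // Skip the leading dots (and spaces) that encode the nesting depth.
    const std::size_t start = line.find_first_not_of(". ");
    if (start == std::string::npos) return false;

    int index = 0;
    int calls = 0;
    double percent = 0.0;
    double ms = 0.0;
    char name[128];
    const int matched = std::sscanf(
        line.c_str() + start,
        "%d -- %127s (%lf %%) :: %lf ms / frame (%d calls)",
        &index, name, &percent, &ms, &calls);
    if (matched != 5) return false;

    out.name = name;
    out.percent = percent;
    out.msPerFrame = ms;
    out.calls = calls;
    return true;
}
```

Feeding it the dispatchAllCollisionPairs lines from both dumps gives 0.329 ms idle versus 1.744 ms active, so that one entry is roughly 5.3 times more expensive when the island is awake, yet it still accounts for most of the idle step.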
Erwin Coumans
Site Admin
Posts: 4221
Joined: Sun Jun 26, 2005 6:43 pm
Location: California, USA

Re: performance questions/uncertainties

Post by Erwin Coumans »

dispatchAllCollisionPairs (97.38 %) :: 1.744 ms / frame (1 calls)
Most of the time is probably spent in capsule-versus-capsule collision tests. Bullet uses a generic (and somewhat slow) GJK algorithm for this. Havok probably has a dedicated, optimized capsule-versus-capsule collision test. We could add an optimized version if necessary.
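
To illustrate what a dedicated capsule-versus-capsule test can look like, here is a minimal sketch. It is neither Bullet's nor Havok's code: Vec3, Capsule and the function names are made up for this example, and it only answers the overlap question (no contact point or normal). The idea is that a capsule is a line segment with a radius, so two capsules overlap when the closest distance between their segments is at most the sum of the radii. The segment-segment routine follows the standard clamped closest-point approach.

```cpp
#include <algorithm>

struct Vec3 { double x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 along(Vec3 p, Vec3 d, double t) {  // p + d * t
    return {p.x + d.x * t, p.y + d.y * t, p.z + d.z * t};
}
static double clamp01(double v) { return std::min(1.0, std::max(0.0, v)); }

// Squared distance between segment p1-q1 and segment p2-q2.
double segmentSegmentDistSq(Vec3 p1, Vec3 q1, Vec3 p2, Vec3 q2) {
    const double eps = 1e-12;
    const Vec3 d1 = sub(q1, p1), d2 = sub(q2, p2), r = sub(p1, p2);
    const double a = dot(d1, d1), e = dot(d2, d2), f = dot(d2, r);
    double s = 0.0, t = 0.0;  // parameters along segment 1 and segment 2

    if (a <= eps && e <= eps) {
        // Both segments degenerate to points; s = t = 0.
    } else if (a <= eps) {
        t = clamp01(f / e);   // first segment is a point
    } else {
        const double c = dot(d1, r);
        if (e <= eps) {
            s = clamp01(-c / a);  // second segment is a point
        } else {
            const double b = dot(d1, d2);
            const double denom = a * e - b * b;  // zero when parallel
            s = (denom > eps) ? clamp01((b * f - c * e) / denom) : 0.0;
            t = (b * s + f) / e;
            // If t left [0,1], clamp it and recompute s for the clamped t.
            if (t < 0.0)      { t = 0.0; s = clamp01(-c / a); }
            else if (t > 1.0) { t = 1.0; s = clamp01((b - c) / a); }
        }
    }
    const Vec3 diff = sub(along(p1, d1, s), along(p2, d2, t));
    return dot(diff, diff);
}

// A capsule is the set of points within 'radius' of the segment a-b.
struct Capsule { Vec3 a, b; double radius; };

bool capsulesOverlap(const Capsule& c1, const Capsule& c2) {
    const double reach = c1.radius + c2.radius;
    return segmentSegmentDistSq(c1.a, c1.b, c2.a, c2.b) <= reach * reach;
}
```

A closed-form test like this needs only a handful of dot products per pair, whereas a generic iterative convex algorithm typically does more work for the same pair, which is one plausible reason for the gap seen in dispatchAllCollisionPairs.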

You might want to try a recent 2.75 RC (SVN trunk) build, and enable some optimizations:

	m_dynamicsWorld = dynamicsWorld = new btDiscreteDynamicsWorld(m_dispatcher,m_overlappingPairCache,m_solver,m_collisionConfiguration);
	dynamicsWorld->getDispatchInfo().m_useConvexConservativeDistanceUtil = true;
	dynamicsWorld->getDispatchInfo().m_convexConservativeDistanceThreshold = 0.01;

Also, it might be worth trying out the multi-threaded narrowphase collision dispatcher instead of btCollisionDispatcher; it might be faster (see USE_PARALLEL_DISPATCHER/SpuGatheringCollisionDispatcher in Bullet/Demos/BulletMultiThreaded).

Hope this helps,
Erwin