Physics Simulation Forum

 

All times are UTC




Post new topic Reply to topic  [ 17 posts ]  Go to page 1, 2  Next
Author Message
 Post subject: Making a faster rayTest?
PostPosted: Tue Feb 14, 2017 2:54 pm 
Offline

Joined: Tue Feb 14, 2017 2:14 pm
Posts: 7
Besides physics, I'm also using Bullet for various collision detection tasks, like tracing bullets (duh).

The thing is, rayTrace is terribly slow, so slow in fact that I feel like I'm doing something wrong?!?
Look at this test scene: 40 AABBs and tracing a ray from left to right. I initially figured that btCollisionWorld::ClosestRayResultCallback will walk the underlying partitioned space in a front-to-back manner so that the ray stops efficiently at the first collision. But it looks like it does the same thing as btCollisionWorld::AllHitsRayResultCallback - it does precise collision detection with ALL the boxes that the ray goes through, then only keeps the closest one?

100 ray tests in this scene take up 14 milliseconds (the same with ClosestCallback as with AllHitsCallback)

Some code:
Code:
struct sMyFilterClosestRayResultCallback : public btCollisionWorld::ClosestRayResultCallback
{
   public:
      sMyFilterClosestRayResultCallback( const btVector3& _rayFromWorld, const btVector3& _rayToWorld, unsigned int _uiColFlags )
         : ClosestRayResultCallback( _rayFromWorld, _rayToWorld )
         , uiColFlags( _uiColFlags )
      {
      }

   public:
      unsigned int uiColFlags;

   public:
      virtual bool needsCollision( btBroadphaseProxy* proxy0 ) const
      {
         bool bBaseResult = ClosestRayResultCallback::needsCollision( proxy0 );
         if( bBaseResult ) // if collision test passed, further test with our own criteria
         {
            btCollisionObject* pColObj = (btCollisionObject*)proxy0->m_clientObject;
            sCollisionObject* pMyColObj = (sCollisionObject*)pColObj->getUserPointer();

            // game collision info ref; related to visibility / movement / bullet hits etc
            if( pMyColObj->GetFlags() & uiColFlags )
               return true;
            return false;
         }
         return false;
      }
};
sMyFilterClosestRayResultCallback singleResult( from, to, uiColFlags );
m_pWorld->rayTest( from, to, singleResult );


Is it something I'm doing?


Attachments:
Untitled.jpg
Untitled.jpg [ 36.75 KiB | Viewed 3295 times ]
Top
 Profile  
 
PostPosted: Tue Feb 14, 2017 8:15 pm 
Offline

Joined: Thu Apr 14, 2016 3:44 pm
Posts: 44
I think changing the btBroadphaseInterface used can improve performance in some scenarios but in all instances all collision objects that overlap the rays path are going to be processed, the underlying algorithm is generalized and doesn't account for only wanting the closest object.

The performance your getting seems especial poor but I think you might improve it by jamming all your walls into a single btBvhTriangleMeshShape or you can try writing your own rayTest with early out.


Top
 Profile  
 
PostPosted: Wed Feb 15, 2017 7:46 am 
Offline

Joined: Tue Feb 14, 2017 2:14 pm
Posts: 7
ktfh wrote:
...or you can try writing your own rayTest with early out.

That's one option, but grasping Bullet's inner-workings would take quite some time. Do you know of any good info of the underlying space partitioning structure? Walking an octree or a bsp in a front-to-back manner would be perfect for rayTracing.


Top
 Profile  
 
PostPosted: Wed Feb 15, 2017 2:53 pm 
Offline

Joined: Wed Mar 04, 2015 6:16 pm
Posts: 28
Pintea wrote:
But it looks like it does the same thing as btCollisionWorld::AllHitsRayResultCallback - it does precise collision detection with ALL the boxes that the ray goes through, then only keeps the closest one?

Are you sure? I use ClosestRayResultCallback and I get good performance.
Setting the kF_FilterBackfaces flag might help.


Top
 Profile  
 
PostPosted: Wed Feb 15, 2017 4:48 pm 
Offline

Joined: Thu Apr 14, 2016 3:44 pm
Posts: 44
Pintea wrote:
ktfh wrote:
...or you can try writing your own rayTest with early out.

That's one option, but grasping Bullet's inner-workings would take quite some time. Do you know of any good info of the underlying space partitioning structure? Walking an octree or a bsp in a front-to-back manner would be perfect for rayTracing.

I really think you should try doing more benchmarks before writing your own rayTest with early out and try comparing btBvhTriangleMeshShape performance vs your btBoxShapes.

Also bullet has a btDbvtBroadphase that can cull a lot of superfluous ray tests. Something simple like ray testing incrementally biggger distances could probably improve your test scenes performance. Like ray test afew meters ahead and quadruple the tested distance until you hit something? I think you've designed a worst case scenario of an entire scenes objects overlapping the rays path, no broadphase optimizations can have an effect. Brute forcing 100 rays against all 40 objects could probably even improve performance slightly.


Top
 Profile  
 
PostPosted: Wed Feb 15, 2017 5:10 pm 
Offline

Joined: Tue Feb 14, 2017 2:14 pm
Posts: 7
ktfh wrote:
I really think you should try doing more benchmarks before writing your own rayTest with early out and try comparing btBvhTriangleMeshShape performance vs your btBoxShapes.

Also bullet has a btDbvtBroadphase that can cull a lot of superfluous ray tests. Something simple like ray testing incrementally biggger distances could probably improve your test scenes performance. Like ray test afew meters ahead and quadruple the tested distance until you hit something? I think you've designed a worst case scenario of an entire scenes objects overlapping the rays path, no broadphase optimizations can have an effect. Brute forcing 100 rays against all 40 objects could probably even improve performance slightly.


Shouldn't ray vs box tests be way faster than ray vs random mesh?

The broadphase IS used inside the world->rayTest() function, it's just that instead of stopping at the first hit, it checks ALL the objects along the ray and does precise collisions with all of them.
It is implemented as a GatherAllCollisionsAlongRay, instead of StopAtFirstHitRay (as it would be in a standard ray tracer).

For example, ray tracing into a BSP tree would start with checking the primitives inside the current leaf and then walking the tree in a front-to-back manner. This way you can stop at the closest collision.
The same with ray tracing into an octree (as used for voxel rendering), where you go from the current cell in the direction of the ray by using the Bresenham3D algorithm and stop at the first collision.
Maybe bullet has a similar implementation somewhere, but I can't find it?


Top
 Profile  
 
PostPosted: Wed Feb 15, 2017 5:53 pm 
Offline

Joined: Thu Apr 14, 2016 3:44 pm
Posts: 44
Pintea wrote:
Shouldn't ray vs box tests be way faster than ray vs random mesh?

I haven't benchmarked but my suspicion is that ray vs whatever isn't necessarily that costly relative to the number of virtual function calls, indirection, branching, allocations that can happens inside a rayTest. Easy fixes would be to incrementally rayTest greater distances until a hit or use a btBvhTriangleMeshShape. Bullet has bounding volume hierarchies for accelerating spacial queries but no bsp or octree. Try benchmarking afew more scenarios before writing a whole new broadphase and rayTest.


Top
 Profile  
 
PostPosted: Thu Feb 16, 2017 2:43 pm 
Offline

Joined: Tue Feb 14, 2017 2:14 pm
Posts: 7
So, I took a closer look and I've been using bullet's bounding volume tree broadphase (btDbvt).

The rayTestInternal function doesn't do a directional traversal of the tree. I hacked it in by calculating the distance to the two children and adding to the stack the closest one first, like this:
Code:
btVector3 center1 = (node->childs[0]->volume.Mins() + node->childs[0]->volume.Maxs()) * 0.5f;
btVector3 center2 = (node->childs[1]->volume.Mins() + node->childs[1]->volume.Maxs()) * 0.5f;
float dist1 = (center1-rayFrom).length2();
float dist2 = (center2-rayFrom).length2();
if (dist1 < dist2)
{
   stack[depth++]=node->childs[1];
   stack[depth++]=node->childs[0];
}
else
{
   stack[depth++]=node->childs[0];
   stack[depth++]=node->childs[1];
}

This "sort-of" works for the directional traversal of the tree, but doesn't stop at the first collision, because of how Bullet works internally. It would require a lot of changes to do it.
So I added some code to the ClosestRayResultCallback::needsCollision() callback, where I do a segment versus aabb test myself to see if the collision is still worth doing. Really bad, but I still got a tenfold increase in performance.


Top
 Profile  
 
PostPosted: Thu Feb 16, 2017 6:46 pm 
Offline

Joined: Thu Apr 14, 2016 3:44 pm
Posts: 44
Pintea wrote:
I hacked it in by calculating the distance to the two children and adding to the stack the closest one first...

I do a segment versus aabb test myself to see if the collision is still worth doing. Really bad, but I still got a tenfold increase in performance.

Pretty clever solutions you came up with.

I got curious what the performance difference is like between the different shape setups and tried to replicate your test scene 40 boxs vs 100 rays and take an average run time.

2.405ms 100 rays vs one big btBvhTriangleMeshShape of 40 boxs
2.902ms 100 rays vs 40 btBvhTriangleMeshShape boxs
4.392ms 100 rays vs 40 btBoxShape

ray triangle intersection uses an analytical solution while ray vs convex shape uses btGjkConvexCast or btSubsimplexConvexCast and iterates toward an intersection point. Switching might save you a millisecond?


Top
 Profile  
 
PostPosted: Thu Feb 16, 2017 8:48 pm 
Offline

Joined: Tue Feb 14, 2017 2:14 pm
Posts: 7
Wow, that's quite some difference between boxShape and meshShape, would have never expected it. I'll look into what it's doing, I would have expected a specialized ray vs AABB code instead of sweeping (or iterating...) a convex shape?!
Did you time it in debug mode or release?


Top
 Profile  
 
PostPosted: Sun Feb 19, 2017 12:05 pm 
Offline

Joined: Wed Mar 04, 2015 6:16 pm
Posts: 28
You know if ClosestRayResultCallback does the exact same as AllHitsRayResultCallback, that sounds like a bug to me. Maybe you should log it as a bug on github?


Top
 Profile  
 
PostPosted: Sun Feb 19, 2017 12:36 pm 
Offline

Joined: Sat Jul 04, 2015 10:33 am
Posts: 349
Location: Bern, Switzerland
If you log a bug, can you build a simple example browser example (based on BasicExample or the raytest one) and maybe add some timings with btClock to it? By that you speed up the debugging as it can then be used as a unit test for debugging. Also it shows for everyone that the bug actually is there. You will see that it greatly speeds up the bug fixing process/response time to your bug.


Top
 Profile  
 
PostPosted: Sun Feb 19, 2017 11:56 pm 
Offline

Joined: Thu Apr 14, 2016 3:44 pm
Posts: 44
ed_welch wrote:
You know if ClosestRayResultCallback does the exact same as AllHitsRayResultCallback, that sounds like a bug to me. Maybe you should log it as a bug on github?

There is no bug, ClosestRayResultCallback and AllHitsRayResultCallback behave as intended. The rayTest implementation is very generalized to support as many queries as possible but is unfortunately slow compared to a rayTest designed to only return the first hit.


Top
 Profile  
 
PostPosted: Tue Feb 21, 2017 10:12 pm 
Offline

Joined: Sat Jul 04, 2015 10:33 am
Posts: 349
Location: Bern, Switzerland
Maybe somebody could contribute the faster version of ClosestRayResultCallback which does not calculate all other ray results?


Top
 Profile  
 
PostPosted: Wed Feb 22, 2017 1:42 pm 
Offline

Joined: Tue Feb 14, 2017 2:14 pm
Posts: 7
benelot wrote:
Maybe somebody could contribute the faster version of ClosestRayResultCallback which does not calculate all other ray results?

Someone should definitely do that :)

For the moment I'm just playing around with it, but when we'll have to be production ready with our game I'll try to do a proper version. Hopefully someone with deeper Bullet understanding will have already done it by then.

My requirements would be:
- parse the bounding volume hierarchy in a front-to-back order, starting from the ray's origin
- do specialized ray-vs-primitive (obb, sphere, cylinder etc.) tests where possible instead of the iterative shape-sweeping algorithm it's now using
- stop at first hit

PS:
Is it called shape-sweeping if it's iterative? Why iterative? There's an actual while (iterations < 30) in the code right now in btSubsimplexConvexCast::btSubsimplexConvexCast


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 17 posts ]  Go to page 1, 2  Next

All times are UTC


Who is online

Users browsing this forum: Bing [Bot], Google [Bot] and 21 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
Powered by phpBB® Forum Software © phpBB Group