Making a faster rayTest?

Pintea
Posts: 8
Joined: Tue Feb 14, 2017 2:14 pm

Making a faster rayTest?

Post by Pintea »

Besides physics, I'm also using Bullet for various collision detection tasks, like tracing bullets (duh).

The thing is, rayTrace is terribly slow, so slow in fact that I feel like I'm doing something wrong?!?
Look at this test scene: 40 AABBs and tracing a ray from left to right. I initially figured that btCollisionWorld::ClosestRayResultCallback will walk the underlying partitioned space in a front-to-back manner so that the ray stops efficiently at the first collision. But it looks like it does the same thing as btCollisionWorld::AllHitsRayResultCallback - it does precise collision detection with ALL the boxes that the ray goes through, then only keeps the closest one?

100 ray tests in this scene take up 14 milliseconds (the same with ClosestCallback as with AllHitsCallback)

Some code:

Code: Select all

struct sMyFilterClosestRayResultCallback : public btCollisionWorld::ClosestRayResultCallback
{
	public:
		sMyFilterClosestRayResultCallback( const btVector3& _rayFromWorld, const btVector3& _rayToWorld, unsigned int _uiColFlags )
			: ClosestRayResultCallback( _rayFromWorld, _rayToWorld )
			, uiColFlags( _uiColFlags )
		{
		}

	public:
		unsigned int uiColFlags;

	public:
		virtual bool needsCollision( btBroadphaseProxy* proxy0 ) const
		{
			bool bBaseResult = ClosestRayResultCallback::needsCollision( proxy0 );
			if( bBaseResult ) // if collision test passed, further test with our own criteria
			{
				btCollisionObject* pColObj = (btCollisionObject*)proxy0->m_clientObject;
				sCollisionObject* pMyColObj = (sCollisionObject*)pColObj->getUserPointer();

				// game collision info ref; related to visibility / movement / bullet hits etc
				if( pMyColObj->GetFlags() & uiColFlags )
					return true;
				return false;
			}
			return false;
		}
};
sMyFilterClosestRayResultCallback singleResult( from, to, uiColFlags );
m_pWorld->rayTest( from, to, singleResult );
Is it something I'm doing?
Attachments
Untitled.jpg
Untitled.jpg (36.75 KiB) Viewed 50474 times
ktfh
Posts: 44
Joined: Thu Apr 14, 2016 3:44 pm

Re: Making a faster rayTest?

Post by ktfh »

I think changing the btBroadphaseInterface used can improve performance in some scenarios but in all instances all collision objects that overlap the rays path are going to be processed, the underlying algorithm is generalized and doesn't account for only wanting the closest object.

The performance your getting seems especial poor but I think you might improve it by jamming all your walls into a single btBvhTriangleMeshShape or you can try writing your own rayTest with early out.
Pintea
Posts: 8
Joined: Tue Feb 14, 2017 2:14 pm

Re: Making a faster rayTest?

Post by Pintea »

ktfh wrote:...or you can try writing your own rayTest with early out.
That's one option, but grasping Bullet's inner-workings would take quite some time. Do you know of any good info of the underlying space partitioning structure? Walking an octree or a bsp in a front-to-back manner would be perfect for rayTracing.
ed_welch
Posts: 43
Joined: Wed Mar 04, 2015 6:16 pm

Re: Making a faster rayTest?

Post by ed_welch »

Pintea wrote:But it looks like it does the same thing as btCollisionWorld::AllHitsRayResultCallback - it does precise collision detection with ALL the boxes that the ray goes through, then only keeps the closest one?
Are you sure? I use ClosestRayResultCallback and I get good performance.
Setting the kF_FilterBackfaces flag might help.
ktfh
Posts: 44
Joined: Thu Apr 14, 2016 3:44 pm

Re: Making a faster rayTest?

Post by ktfh »

Pintea wrote:
ktfh wrote:...or you can try writing your own rayTest with early out.
That's one option, but grasping Bullet's inner-workings would take quite some time. Do you know of any good info of the underlying space partitioning structure? Walking an octree or a bsp in a front-to-back manner would be perfect for rayTracing.
I really think you should try doing more benchmarks before writing your own rayTest with early out and try comparing btBvhTriangleMeshShape performance vs your btBoxShapes.

Also bullet has a btDbvtBroadphase that can cull a lot of superfluous ray tests. Something simple like ray testing incrementally biggger distances could probably improve your test scenes performance. Like ray test afew meters ahead and quadruple the tested distance until you hit something? I think you've designed a worst case scenario of an entire scenes objects overlapping the rays path, no broadphase optimizations can have an effect. Brute forcing 100 rays against all 40 objects could probably even improve performance slightly.
Pintea
Posts: 8
Joined: Tue Feb 14, 2017 2:14 pm

Re: Making a faster rayTest?

Post by Pintea »

ktfh wrote:I really think you should try doing more benchmarks before writing your own rayTest with early out and try comparing btBvhTriangleMeshShape performance vs your btBoxShapes.

Also bullet has a btDbvtBroadphase that can cull a lot of superfluous ray tests. Something simple like ray testing incrementally biggger distances could probably improve your test scenes performance. Like ray test afew meters ahead and quadruple the tested distance until you hit something? I think you've designed a worst case scenario of an entire scenes objects overlapping the rays path, no broadphase optimizations can have an effect. Brute forcing 100 rays against all 40 objects could probably even improve performance slightly.
Shouldn't ray vs box tests be way faster than ray vs random mesh?

The broadphase IS used inside the world->rayTest() function, it's just that instead of stopping at the first hit, it checks ALL the objects along the ray and does precise collisions with all of them.
It is implemented as a GatherAllCollisionsAlongRay, instead of StopAtFirstHitRay (as it would be in a standard ray tracer).

For example, ray tracing into a BSP tree would start with checking the primitives inside the current leaf and then walking the tree in a front-to-back manner. This way you can stop at the closest collision.
The same with ray tracing into an octree (as used for voxel rendering), where you go from the current cell in the direction of the ray by using the Bresenham3D algorithm and stop at the first collision.
Maybe bullet has a similar implementation somewhere, but I can't find it?
ktfh
Posts: 44
Joined: Thu Apr 14, 2016 3:44 pm

Re: Making a faster rayTest?

Post by ktfh »

Pintea wrote: Shouldn't ray vs box tests be way faster than ray vs random mesh?
I haven't benchmarked but my suspicion is that ray vs whatever isn't necessarily that costly relative to the number of virtual function calls, indirection, branching, allocations that can happens inside a rayTest. Easy fixes would be to incrementally rayTest greater distances until a hit or use a btBvhTriangleMeshShape. Bullet has bounding volume hierarchies for accelerating spacial queries but no bsp or octree. Try benchmarking afew more scenarios before writing a whole new broadphase and rayTest.
Pintea
Posts: 8
Joined: Tue Feb 14, 2017 2:14 pm

Re: Making a faster rayTest?

Post by Pintea »

So, I took a closer look and I've been using bullet's bounding volume tree broadphase (btDbvt).

The rayTestInternal function doesn't do a directional traversal of the tree. I hacked it in by calculating the distance to the two children and adding to the stack the closest one first, like this:

Code: Select all

btVector3 center1 = (node->childs[0]->volume.Mins() + node->childs[0]->volume.Maxs()) * 0.5f;
btVector3 center2 = (node->childs[1]->volume.Mins() + node->childs[1]->volume.Maxs()) * 0.5f;
float dist1 = (center1-rayFrom).length2();
float dist2 = (center2-rayFrom).length2();
if (dist1 < dist2)
{
	stack[depth++]=node->childs[1];
	stack[depth++]=node->childs[0];
}
else
{
	stack[depth++]=node->childs[0];
	stack[depth++]=node->childs[1];
}
This "sort-of" works for the directional traversal of the tree, but doesn't stop at the first collision, because of how Bullet works internally. It would require a lot of changes to do it.
So I added some code to the ClosestRayResultCallback::needsCollision() callback, where I do a segment versus aabb test myself to see if the collision is still worth doing. Really bad, but I still got a tenfold increase in performance.
ktfh
Posts: 44
Joined: Thu Apr 14, 2016 3:44 pm

Re: Making a faster rayTest?

Post by ktfh »

Pintea wrote:I hacked it in by calculating the distance to the two children and adding to the stack the closest one first...

I do a segment versus aabb test myself to see if the collision is still worth doing. Really bad, but I still got a tenfold increase in performance.
Pretty clever solutions you came up with.

I got curious what the performance difference is like between the different shape setups and tried to replicate your test scene 40 boxs vs 100 rays and take an average run time.

2.405ms 100 rays vs one big btBvhTriangleMeshShape of 40 boxs
2.902ms 100 rays vs 40 btBvhTriangleMeshShape boxs
4.392ms 100 rays vs 40 btBoxShape

ray triangle intersection uses an analytical solution while ray vs convex shape uses btGjkConvexCast or btSubsimplexConvexCast and iterates toward an intersection point. Switching might save you a millisecond?
Pintea
Posts: 8
Joined: Tue Feb 14, 2017 2:14 pm

Re: Making a faster rayTest?

Post by Pintea »

Wow, that's quite some difference between boxShape and meshShape, would have never expected it. I'll look into what it's doing, I would have expected a specialized ray vs AABB code instead of sweeping (or iterating...) a convex shape?!
Did you time it in debug mode or release?
ed_welch
Posts: 43
Joined: Wed Mar 04, 2015 6:16 pm

Re: Making a faster rayTest?

Post by ed_welch »

You know if ClosestRayResultCallback does the exact same as AllHitsRayResultCallback, that sounds like a bug to me. Maybe you should log it as a bug on github?
benelot
Posts: 350
Joined: Sat Jul 04, 2015 10:33 am
Location: Bern, Switzerland
Contact:

Re: Making a faster rayTest?

Post by benelot »

If you log a bug, can you build a simple example browser example (based on BasicExample or the raytest one) and maybe add some timings with btClock to it? By that you speed up the debugging as it can then be used as a unit test for debugging. Also it shows for everyone that the bug actually is there. You will see that it greatly speeds up the bug fixing process/response time to your bug.
ktfh
Posts: 44
Joined: Thu Apr 14, 2016 3:44 pm

Re: Making a faster rayTest?

Post by ktfh »

ed_welch wrote:You know if ClosestRayResultCallback does the exact same as AllHitsRayResultCallback, that sounds like a bug to me. Maybe you should log it as a bug on github?
There is no bug, ClosestRayResultCallback and AllHitsRayResultCallback behave as intended. The rayTest implementation is very generalized to support as many queries as possible but is unfortunately slow compared to a rayTest designed to only return the first hit.
benelot
Posts: 350
Joined: Sat Jul 04, 2015 10:33 am
Location: Bern, Switzerland
Contact:

Re: Making a faster rayTest?

Post by benelot »

Maybe somebody could contribute the faster version of ClosestRayResultCallback which does not calculate all other ray results?
Pintea
Posts: 8
Joined: Tue Feb 14, 2017 2:14 pm

Re: Making a faster rayTest?

Post by Pintea »

benelot wrote:Maybe somebody could contribute the faster version of ClosestRayResultCallback which does not calculate all other ray results?
Someone should definitely do that :)

For the moment I'm just playing around with it, but when we'll have to be production ready with our game I'll try to do a proper version. Hopefully someone with deeper Bullet understanding will have already done it by then.

My requirements would be:
- parse the bounding volume hierarchy in a front-to-back order, starting from the ray's origin
- do specialized ray-vs-primitive (obb, sphere, cylinder etc.) tests where possible instead of the iterative shape-sweeping algorithm it's now using
- stop at first hit

PS:
Is it called shape-sweeping if it's iterative? Why iterative? There's an actual while (iterations < 30) in the code right now in btSubsimplexConvexCast::btSubsimplexConvexCast
Post Reply