The Input

What information is important in controlling a vehicle when we are driving? Firstly, we need the position of obstacles relative to us. Are they to our right, to our left or straight ahead? If there are buildings on both sides of the road but none in front, we will accelerate. But if a car is stopped in front of us, we will brake. Secondly, we need the distance from our position to the object. If an object is far away, we will continue driving until it is close, at which point we will slow down or stop.

That is exactly the information that we use for our neural network. To keep it simple, we have three relative directions: left, front and right. We also need the distance from the obstacle to the vehicle.

We will define the field of vision of our AI driver and make a list of all the objects it can see. For simplicity we use a circle in our example, but we could use a real frustum with 6 intersecting planes. For each object in this circle, we check whether it is in the left field of view, the right field of view or the center.

The input to the neural network will be an array: float Vision[3]. The distances to the closest obstacle to the left, center and right of the vehicle will be stored in Vision[0], Vision[1] and Vision[2] respectively. Figure 3 shows how the array works: the obstacle on the left is at 80% of the maximum distance, the obstacle on the right is at 40% of the maximum distance and there is no obstacle in the center.

To be able to do this, we need the position (x,y) of every object, the position (x,y) of the vehicle and the angle of the vehicle. We also need r (the radius of the circle) and dright, dleft, the vectors from the craft to the lines Lright, Lleft. Those lines are parallel to the direction of the craft, and both vectors are perpendicular to the lines. Even though it is a 3D world, all the mathematics is 2D because the vehicle is not flying and therefore never moves in the third dimension.
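All the equations consider only x and y, not z. First, we compute the equations of the lines Lright and Lleft, which will help us determine whether an obstacle is to the left, to the right or in the center of the vehicle's view. See Figure 4 for an illustration of all the variables. Writing Lright in the implicit form

$$a_R x + b_R y + c_R = 0$$

with

$$a_R = d_{right,x}, \qquad b_R = d_{right,y}$$

since dright is perpendicular to the line. Then we compute a point P on the line

$$P_x = V_x + d_{right,x}, \qquad P_y = V_y + d_{right,y}$$

with Vx, Vy the position of the vehicle. Then we can finally compute cR

$$c_R = -(a_R P_x + b_R P_y)$$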
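The construction of a boundary line can be sketched in C++ as follows. This is only a minimal illustration, assuming the line is stored in the implicit form a*x + b*y + c = 0 and that the offset vector (dright or dleft) is used directly as the line's normal, which is consistent with that vector being perpendicular to the line. The names Vec2, Line, makeBoundaryLine and signedSide are hypothetical and do not come from the original code.

```cpp
// Hypothetical 2D vector type (x, y only; z is ignored as in the text).
struct Vec2 { float x, y; };

// Implicit line a*x + b*y + c = 0.
struct Line { float a, b, c; };

// Builds the line that is perpendicular to the offset vector d and passes
// through the point vehicle + d. Assumption: (a, b) is taken equal to d.
Line makeBoundaryLine(Vec2 vehicle, Vec2 d) {
    Line line;
    line.a = d.x;
    line.b = d.y;
    // A point on the line: the vehicle position shifted by d.
    const Vec2 p{vehicle.x + d.x, vehicle.y + d.y};
    // Choose c so that the point p satisfies a*x + b*y + c = 0.
    line.c = -(line.a * p.x + line.b * p.y);
    return line;
}

// Positive when point o lies beyond the line, on the side d points to;
// negative when o is on the vehicle's side; zero on the line itself.
float signedSide(const Line& line, Vec2 o) {
    return line.a * o.x + line.b * o.y + line.c;
}
```

Under these assumptions, calling makeBoundaryLine once with dright and once with dleft gives the two lines, and the sign of signedSide tells on which side of a line an obstacle lies.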
We proceed the same way to compute the equation of the line Lleft using the vector dleft.

Second, we need to compute the center of the circle. Anything within the circle will be seen by the AI. The center of the circle C(x,y) is at the distance r from the vehicle's position V(x,y), in the direction the vehicle is facing:

$$C_x = V_x + r\cos\theta, \qquad C_y = V_y + r\sin\theta$$

with Vx, Vy the position of the vehicle, Cx, Cy the center of the circle and θ the angle of the vehicle, measured here from the x axis.

Then we check every object in the world to find out whether it is within the circle (if the objects are organized in a quadtree or octree, the process is much faster than with a linked list). If

$$(O_x - C_x)^2 + (O_y - C_y)^2 < r^2$$

then the object is within the circle, with Ox, Oy the position of the obstacle.

For every object within the circle, we must then check whether it is to the right, to the left or in the center of the vehicle's view. If

$$a_R O_x + b_R O_y + c_R > 0$$

then the object is in the right part of the circle; else if

$$a_L O_x + b_L O_y + c_L > 0$$

then the object is in the left part of the circle; else it is in the center. Here aR, bR, cR and aL, bL, cL are the coefficients of the lines Lright and Lleft, each with its normal pointing away from the vehicle.

We compute the distance from the object to the vehicle

$$d = \sqrt{(O_x - V_x)^2 + (O_y - V_y)^2}$$

Now we store the distance in the appropriate part of the array (Vision[0], Vision[1] or Vision[2]), but only if the distance already stored is larger than the distance we just computed. The Vision array must have been initialized to 2r beforehand, which is the largest distance that can occur inside the circle.

After checking every object, the Vision array holds the distance to the closest object to the left, center and right of the vehicle. If no object was found for one of those sections, it keeps the default value, which means "no object in view".

Because the neural network uses a sigmoid function, the input needs to be between 0.0 and 1.0. 0.0 should mean that an object is touching the vehicle, and 1.0 that there is no object as far as the driver can see. Since we set a maximum distance that the AI driver can see, we can easily rescale all the distances to floating-point values between 0.0 and 1.0 by dividing them by that maximum.
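The whole procedure (circle test, left/center/right classification, closest distance, rescaling) can be sketched as one C++ function. This is a hedged illustration rather than the original implementation. It assumes a coordinate system with y pointing up and the vehicle angle measured counter-clockwise from the x axis, so that "right" is the heading rotated 90 degrees clockwise. It also assumes dright and dleft have the same length, called halfWidth here. The names Vec2, computeVision and halfWidth are hypothetical. Comparing the signed lateral offset with halfWidth is equivalent to testing the sign of the implicit line equation when the offset vector is used as the line's normal.

```cpp
#include <array>
#include <cmath>
#include <vector>

struct Vec2 { float x, y; };

// Returns {left, center, right} distances rescaled to [0, 1].
// 1.0 means "no object in view" for that section.
std::array<float, 3> computeVision(Vec2 vehicle, float angle, float r,
                                   float halfWidth,
                                   const std::vector<Vec2>& obstacles) {
    const float maxDist = 2.0f * r;  // farthest point of the circle
    std::array<float, 3> vision{maxDist, maxDist, maxDist};

    const Vec2 forward{std::cos(angle), std::sin(angle)};
    const Vec2 right{forward.y, -forward.x};  // forward rotated clockwise
    // The circle's center lies at distance r ahead of the vehicle.
    const Vec2 center{vehicle.x + r * forward.x, vehicle.y + r * forward.y};

    for (const Vec2& o : obstacles) {
        // Skip obstacles outside the vision circle.
        const float cx = o.x - center.x, cy = o.y - center.y;
        if (cx * cx + cy * cy > r * r) continue;

        // Signed lateral offset: positive to the right of the vehicle.
        const float vx = o.x - vehicle.x, vy = o.y - vehicle.y;
        const float side = vx * right.x + vy * right.y;
        const int slot = side > halfWidth ? 2 : (side < -halfWidth ? 0 : 1);

        // Keep only the closest obstacle of each section.
        const float dist = std::sqrt(vx * vx + vy * vy);
        if (dist < vision[slot]) vision[slot] = dist;
    }

    // Rescale so the values fit the network's 0.0 to 1.0 input range.
    for (float& v : vision) v /= maxDist;
    return vision;
}
```

With one obstacle on the left at 80% of the maximum distance, one on the right at 40% and nothing in the center, this sketch returns approximately {0.8, 1.0, 0.4}, which is the situation described for Figure 3. In a y-down screen coordinate system the left and right slots would have to be swapped.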