How does an autonomous vehicle manage to navigate its way around obstacles and avoid pedestrians, other moving vehicles and cats dashing out to cross the road?
To teach an autonomous vehicle to do all these things, we need to start by gathering huge quantities of data. To do this, a data-gathering car is used.
These custom-made vehicles, such as the example above produced by Pilgrim Motorsports with guidance from the Academy of Robotics in the UK, carry sophisticated camera and computing equipment to gather the required autonomous car data. The car's job is to drive around a town capturing visual data in the form of video footage from up to 12 cameras with a combined 360-degree view around the car, as well as feedback from sensors and infrared detectors. All of this builds a comprehensive understanding of the road environment and the road's users, particularly in residential areas.
This data is taken back to a bank of supercomputers, which watch it over and over again to learn. This type of computer science is called machine learning, and here it uses evolutionary neural networks. A neural network is a computer system loosely modelled on the human brain and nervous system, and we run learning algorithms on these networks. In this way the algorithms not only learn but also evolve with each iteration.
Much like a child is taught at school what objects are, we take images of scenes similar to the roads where the car will drive. We mark out what each object is (annotation) and, using machine learning, feed the annotated data to an algorithm, which begins to compare images and learn the difference between a car, a pedestrian, a cyclist, the road, the sky, and so on. After some time of doing this, and of showing the computer more complex or harder-to-understand scenes, the algorithm eventually figures out the rest by applying what it has been taught to what it sees.
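To make the annotate-then-learn idea concrete, here is a minimal sketch in Python. The feature vectors, labels, and the nearest-centroid classifier are all illustrative assumptions; the real system trains deep neural networks on raw video frames, but the principle is the same: human-annotated examples in, a model that can label new examples out.

```python
# A minimal sketch of learning from annotated data: each training example is a
# (feature_vector, label) pair, where the label comes from human annotation.
# The [size, speed] features and nearest-centroid rule are stand-ins for a
# real neural network trained on pixels.

def train(annotated_examples):
    """Compute one average feature vector (centroid) per annotated label."""
    sums, counts = {}, {}
    for features, label in annotated_examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, v in enumerate(features):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc]
            for label, acc in sums.items()}

def classify(centroids, features):
    """Assign the label whose centroid is closest to the feature vector."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist(centroids[label], features))

# Hypothetical annotated data: [size_m, speed_m_s] for two object classes.
annotated = [([4.5, 13.0], "car"), ([4.2, 9.0], "car"),
             ([0.5, 1.4], "pedestrian"), ([0.6, 1.6], "pedestrian")]
centroids = train(annotated)
print(classify(centroids, [4.0, 11.0]))   # → car (large, fast object)
print(classify(centroids, [0.55, 1.5]))   # → pedestrian (small, slow object)
```

Showing the system harder scenes corresponds to adding more, and more varied, annotated examples to the training set.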
Now that the algorithm can tell what objects are, we attach multiple cameras looking in all directions. In real time, the algorithm is able to identify pretty much everything that is relevant in a scene. Using onboard supercomputers performing up to 7 trillion calculations per second, the camera data is interpreted to reveal something like the image below.
Understanding what is in the scene is just one small part of the puzzle. The next step is to predict what each person, car, bicycle or traffic light is going to do next. Just as you can predict that your phone will hit the ground if you drop it, the vehicle can identify pedestrians, cars, bicycles and so on, predict multiple realistic potential scenarios, and take action based on which scenarios are more likely to happen.
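The "multiple potential scenarios" idea can be sketched as follows. The candidate behaviours and their likelihood values here are illustrative assumptions, not learned probabilities from the real system; in practice the likelihoods would come from trained models.

```python
# A toy sketch of scenario prediction: for a detected pedestrian, enumerate a
# few candidate behaviours with assumed likelihoods, project each forward,
# and rank them by probability.

def predict_scenarios(position, velocity, horizon_s=3.0):
    """Return (name, likelihood, predicted_position) scenarios, most likely
    first. Likelihood values are illustrative, not learned."""
    x, y = position
    vx, vy = velocity
    scenarios = [
        ("continues on path", 0.70, (x + vx * horizon_s, y + vy * horizon_s)),
        ("stops",             0.20, (x, y)),
        ("steps into road",   0.10, (x + vx * horizon_s, y - 1.0)),
    ]
    return sorted(scenarios, key=lambda s: s[1], reverse=True)

# A pedestrian 10 m ahead, walking at 1.4 m/s parallel to the road.
scenarios = predict_scenarios(position=(10.0, 2.0), velocity=(1.4, 0.0))
print(scenarios[0][0])   # → continues on path
```

The vehicle would then plan for the most likely scenario while keeping enough margin to handle the less likely but more dangerous ones.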
We are using computing power not only to see everything in the scene but also to predict what everything is likely to do in the next three seconds. While three seconds may not sound like much time in which to calculate potential futures, from the car's frame of reference everything is happening very, very slowly: it sees the world at 1,000 frames per second. To it, all objects in the scene are moving as slowly as snails do to you and me.
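The arithmetic behind the "slow as snails" claim is simple: divide an object's speed by the frame rate to get how far it moves between consecutive frames.

```python
# At 1000 frames per second, even fast-moving objects barely move between
# two consecutive camera frames.

def displacement_per_frame(speed_m_s, fps=1000):
    """Metres an object moves between two consecutive frames."""
    return speed_m_s / fps

# A car at 30 mph is roughly 13.4 m/s:
print(displacement_per_frame(13.4))   # ≈ 0.0134 m, about 13 mm per frame
# A walking pedestrian at about 1.4 m/s:
print(displacement_per_frame(1.4))    # ≈ 0.0014 m, about 1.4 mm per frame
```

Frame to frame, the scene changes by millimetres, which is what gives the onboard computers time to reason about it.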
We fuse the findings from each camera creating a combined view of the world as seen by the car. This combined view gives us a more accurate account of what is happening in the world around it.
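A heavily simplified sketch of that fusion step, assuming each camera reports detections already converted into a shared vehicle-centred coordinate frame (the function name, merge radius, and averaging rule are all illustrative assumptions):

```python
# Simplified camera fusion: detections of the same class from overlapping
# views that land within a small radius of each other are treated as one
# object, and their position estimates are averaged.

def fuse_detections(per_camera_detections, merge_radius=0.5):
    """Combine (label, x, y) detection lists from all cameras into one
    deduplicated view of the scene."""
    fused = []
    for detections in per_camera_detections:
        for label, x, y in detections:
            for i, (flabel, fx, fy) in enumerate(fused):
                if (flabel == label
                        and (x - fx) ** 2 + (y - fy) ** 2 <= merge_radius ** 2):
                    # Same object seen by two cameras: average the estimates.
                    fused[i] = (label, (x + fx) / 2, (y + fy) / 2)
                    break
            else:
                fused.append((label, x, y))
    return fused

front = [("pedestrian", 5.0, 1.0), ("car", 20.0, -2.0)]
left = [("pedestrian", 5.2, 1.1)]   # same pedestrian, slightly offset estimate
print(fuse_detections([front, left]))   # 2 fused objects, not 3
```

Averaging overlapping estimates is one reason the combined view is more accurate than any single camera's view.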
There is a similar process for the vehicle to know where the road is, how to keep in lane, and where it needs to be driving.
The example below shows a vehicle driving through a residential street in the UK.
As the vehicle needs to give way, it highlights in red the areas where it cannot drive and in green the areas it considers free space on the road. An entire algorithm, with its own neural network, has been trained to understand just the road, taking into account details like texture, colour and obstructions.
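The red/green free-space idea can be sketched as a grid of cells ahead of the car, each marked drivable or blocked. This toy version uses a coarse grid and a hand-made obstacle set; the real system makes this decision per pixel with a trained neural network.

```python
# Toy free-space map: divide the road ahead into grid cells and mark each as
# drivable ("G", shown green) or blocked ("R", shown red).

def free_space_grid(width, depth, obstacles):
    """Return a depth x width grid of 'G'/'R' cells; obstacles is a set of
    (x, y) cell coordinates occupied by parked cars, kerbs, etc."""
    return [["R" if (x, y) in obstacles else "G" for x in range(width)]
            for y in range(depth)]

# Two obstacles: one just ahead-left, one further away on the right.
grid = free_space_grid(width=5, depth=3, obstacles={(1, 0), (4, 2)})
for row in grid:
    print("".join(row))
```

The planner would then steer the vehicle through the connected green cells, giving way where the red ones block its path.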
These are a few of the ways an autonomous car understands the world around it – but there is more. We also have subsystems for reading road markings, reading traffic signs, infrared sensing and more. All these subsystems, each running in its own neural network, are combined to create one super view of the world as the car sees it.
The end result is that some of our test vehicles, driven by neural networks, are already outperforming their human counterparts in many scenarios. For what we do, autonomous delivery in the last mile, we have no need to learn how to drive on every road in the UK; we only need to master specific postcodes for residential last-mile delivery, which is why we are already so close to deployment.
The first smartphones were giant bricks which could not do much more than make phone calls; as time went on, they got more advanced and could do more.
Self-driving vehicles are the result of years of computer science, and their arrival is the next step in the evolution of vehicles. First, we saw vehicles with cruise control, then cruise control with lane assist, then self-parking, and now we're moving on to self-driving. The first autonomous cars will do an excellent job of driving themselves on very specific routes. With time, the vehicles will begin to drive more complex roads and routes; eventually, they will connect to and share data with each other. It is a step-by-step process.
I predict that we will begin to see passenger-carrying self-driving cars on the roads in earnest by 2020, followed by a period of mass adoption between 2021 and 2025. The first self-driving cars you will see on the road are likely to be autonomous cars which deliver goods and don't carry people. This is a simple, low-risk start with a valid use. Our own autonomous delivery vehicle, Kar-go, is scheduled for trials later this year.
Written by William Sachiti, Academy of Robotics