An autonomous vehicle is a complete system that integrates environment sensing, route planning, decision-making, motion control, and driver assistance at various levels. It relies heavily on technologies such as high-performance computing, advanced sensors, data fusion, V2X communication, AI, and automatic control.
Self-driving vehicles use sensors to perceive their surroundings, much as humans use their eyes. The main sensor types are cameras, ultrasonic sensors, millimeter-wave radars, and LiDARs.
Because approaches to autonomous driving technology vary, so do the levels of autonomy achieved and the sensors used. LiDAR is a case in point: it has been regarded as a shortcut to L3+ autonomous driving, with mass-produced LiDAR expected to appear on L3 vehicles around 2022. At the same time, many major car manufacturers, especially traditional ones, pursue an approach that uses cameras as the primary sensing device to gain an advantage in the autonomous driving field. Cameras offer both scale and cost advantages, and scale effects in turn advance technology maturity, drive down cost, and create a positive industry cycle.
Camera
Cameras capture visual information, mimicking human vision. The captured images are then analyzed by computer algorithms to identify various environmental elements such as pedestrians, bicycles, vehicles, road paths, curbs, street signs, and traffic lights. Additionally, algorithms enable distance measurement and road tracking, allowing for forward collision warning (FCW) and lane departure warning (LDW).
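As a rough illustration of the kind of logic behind an FCW decision, the sketch below computes a time-to-collision from an estimated distance and closing speed and raises a warning when it falls below a threshold. The function names and the 2.5-second threshold are hypothetical choices for illustration, not values from any production system.

```python
def time_to_collision(distance_m: float, closing_speed_mps: float) -> float:
    """Seconds until impact if the closing speed stays constant.

    Returns infinity when the gap is opening (no collision course).
    """
    if closing_speed_mps <= 0.0:
        return float("inf")
    return distance_m / closing_speed_mps


def forward_collision_warning(distance_m: float, closing_speed_mps: float,
                              ttc_threshold_s: float = 2.5) -> bool:
    """Trigger a warning when the time-to-collision drops below the threshold."""
    return time_to_collision(distance_m, closing_speed_mps) < ttc_threshold_s


# Example: lead vehicle 30 m ahead, closing at 15 m/s -> TTC = 2.0 s -> warn.
print(forward_collision_warning(30.0, 15.0))  # True
```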
Cameras are extensively used in the automotive industry and are a well-established, cost-effective technology. Automotive cameras can be categorized as monocular, binocular (stereo), or multi-camera, and can be placed in various positions, including front view, rear view, side view, and surround view. At present, Mobileye is a global leader in the development of monocular Advanced Driver Assistance Systems (ADAS).
Advantages:
– Mature technology.
– Low cost.
– Wide range of information collection.
Disadvantages:
– Limited ability to perceive depth and three-dimensional structure.
– Greatly affected by environmental conditions, leading to reduced recognition rates in low visibility conditions such as dark nights, rain, snow, and fog.
Millimeter Wave Radar
Millimeter-wave radar measures the distance to objects by emitting and receiving radio signals, typically at frequencies between 10 and 300 GHz. Compared to centimeter-wave radars, millimeter-wave radars are smaller, lighter, and offer higher spatial resolution. They also penetrate fog, smoke, and dust better than optical sensors such as infrared, laser, and video systems, and they exhibit better anti-interference capability than other microwave radars.
With wavelengths between 1 and 10 mm, millimeter-wave radar combines the advantages of microwave and optical sensing, making it well-suited for automotive applications:
I. 24 GHz band: used for blind-spot monitoring and lane-change assistance.
II. 77 GHz band: this higher frequency enables better radar performance than 24 GHz. It is primarily used to measure the distance to, and the speed of, the vehicle ahead, forming the basis for automatic emergency braking and adaptive cruise control.
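To make these range and speed measurements concrete, here is a minimal sketch of the time-of-flight and Doppler relations a pulsed 77 GHz radar relies on. The example values are illustrative only; a real automotive radar (typically FMCW) involves considerably more signal processing.

```python
C = 299_792_458.0  # speed of light, m/s

def radar_range(round_trip_time_s: float) -> float:
    """Range from pulse round-trip time: the wave travels out and back."""
    return C * round_trip_time_s / 2.0

def radial_speed(doppler_shift_hz: float, carrier_hz: float = 77e9) -> float:
    """Relative (radial) speed from the two-way Doppler shift of a 77 GHz radar."""
    wavelength = C / carrier_hz
    return doppler_shift_hz * wavelength / 2.0

# Example: a 0.5 µs round trip is ~75 m; a 10 kHz Doppler shift is ~19.5 m/s closing speed.
print(radar_range(0.5e-6))     # ≈ 74.9 m
print(radial_speed(10_000.0))  # ≈ 19.5 m/s
```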
Advantages:
– Compact size, light weight, and high spatial resolution.
– Effective penetration through fog, smoke, and dust.
– Long transmission range.
Disadvantages:
– High component costs and strict production quality requirements.
– Narrow detection angle.
– Signal attenuation in high-humidity conditions such as rain, fog, and snow.
– Limited penetration through trees.
LiDAR
LiDAR is a radar system that uses laser beams to detect a target’s position, speed, and other relevant information. Its working principle involves emitting a laser beam towards a target, then comparing the received echo with the transmitted signal. After analysis and calculations, various target parameters such as distance, azimuth, altitude, speed, pose, and even shape can be obtained.
A LiDAR system consists of a laser transmitter, optical receiver, turntable, and information processing system. A mechanical LiDAR employs a motor to drive the overall 360-degree rotation of the optical-mechanical structure, making it the most classic and mature structure. Meanwhile, a solid-state LiDAR features a stationary transceiver module, with only the scanner moving mechanically to obtain point cloud data of the surrounding space. This enables real-time mapping of the vehicle’s surrounding three-dimensional space.
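As a rough sketch of how a scanning LiDAR's raw returns become a point cloud, the snippet below converts one return (range plus the beam's azimuth and elevation angles) into Cartesian coordinates. The coordinate and angle conventions are assumptions made for illustration.

```python
import math

def lidar_return_to_point(range_m: float, azimuth_deg: float, elevation_deg: float):
    """Convert one LiDAR return (range + beam angles) into an (x, y, z) point.

    Assumes x points forward, y left, z up, azimuth measured from the x-axis
    in the horizontal plane and elevation measured up from that plane.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    horizontal = range_m * math.cos(el)
    return (horizontal * math.cos(az),  # x
            horizontal * math.sin(az),  # y
            range_m * math.sin(el))     # z

# One full revolution of such conversions yields the 360-degree point cloud.
print(lidar_return_to_point(20.0, 45.0, 2.0))
```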
Advantages:
– High resolution.
– High precision.
– Strong ability to resist interference.
– 3D point cloud output.
Disadvantages:
– High production requirements and high cost.
– Unable to identify color patterns, text, and other signs.
– Requires high computing power.
Global Positioning System (GPS) + Inertial Measurement Unit (IMU)
When humans drive from one point to another, they need to know the map from the starting point to the destination as well as their current location. This knowledge helps them decide whether to turn right or go straight at the next intersection.
The same principle applies to unmanned systems. They rely on GPS + IMU to determine longitude, latitude, and driving direction (heading). The IMU also provides additional information such as yaw rate and acceleration. This data is crucial for localizing and controlling autonomous vehicles.
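A minimal sketch of why the two sources complement each other: IMU-derived speed, heading, and yaw rate propagate the pose at a high rate between GPS fixes, and each GPS fix pulls the estimate back to bound the accumulated drift. The blending gain and structure here are illustrative assumptions, not a production navigation filter (which would typically be a Kalman filter).

```python
import math

def propagate(x_m, y_m, heading_rad, speed_mps, yaw_rate_radps, dt_s):
    """Dead-reckon the pose forward one IMU step using speed and yaw rate."""
    heading_rad += yaw_rate_radps * dt_s
    x_m += speed_mps * math.cos(heading_rad) * dt_s
    y_m += speed_mps * math.sin(heading_rad) * dt_s
    return x_m, y_m, heading_rad

def gps_correct(x_m, y_m, gps_x_m, gps_y_m, gain=0.2):
    """Pull the dead-reckoned position toward the GPS fix to bound drift."""
    return x_m + gain * (gps_x_m - x_m), y_m + gain * (gps_y_m - y_m)

# 10 IMU steps at 100 Hz, then one GPS fix.
x, y, hdg = 0.0, 0.0, 0.0
for _ in range(10):
    x, y, hdg = propagate(x, y, hdg, speed_mps=15.0, yaw_rate_radps=0.05, dt_s=0.01)
x, y = gps_correct(x, y, gps_x_m=1.6, gps_y_m=0.05)
print(round(x, 3), round(y, 3), round(hdg, 4))
```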
Advantages:
– The integrated navigation system consolidates information from multiple navigation sources, providing the vehicle with complete navigation information such as attitude, velocity, and position.
– Navigation data is output at a high rate, and its accuracy exceeds that of any single navigation system.
– By integrating data from several navigation systems, their strengths complement one another without the systems interfering with each other.
Disadvantages:
– In a tightly (deeply) coupled integration, a malfunction in one subsystem can degrade the navigation performance of the others if it is not correctly identified and isolated in time.
– Even in a loosely (shallowly) coupled integration, abnormal subsystems must be promptly identified and isolated to preserve integrated navigation accuracy and to maintain the system's stability and reliability.
Autonomous driving is impossible without appropriate hardware. The sensor components are the most critical elements, serving as the “eyes and ears” of autonomous vehicles. Accurate and effective perception of the surrounding environment is essential for providing reliable information for automated driving control decisions.
Intelligent algorithms can then make sound path-planning decisions to control the vehicle's motion. In the early days of autonomous driving, several accidents were caused by sensors detecting obstacles inaccurately or failing to detect them at all. The industry now has a profound understanding of how significant sensors are to autonomous driving technology and systems. Provided cost can be controlled, a technology route with LiDAR at the core of perception is considered far more reliable and safer.
Autonomous Driving Control Units
These units must be capable of sensor fusion, location services, route planning, control strategy, and high-speed wireless communication. They are typically connected to several external devices including cameras, mmWave radars, LiDARs, IMUs, and more. They handle tasks such as image recognition and data processing.
Because of their heavy computational loads, autonomous driving domain control units are usually paired with high-performance compute chips from manufacturers such as NVIDIA, HUAWEI, Renesas, NXP, TI, Mobileye, and Xilinx.
Different companies select different solutions based on their customers' demands, but these solutions share some common characteristics. For example, in an autonomous driving system, the three most computationally demanding functions are, in order, the image recognition module, the multi-sensor data processing module, and the fusion decision module.
Domain control offers the advantage of modularizing the entire on-board electronic system by splitting functionality into distinct domains. This improves the functional safety and network security of each subsystem and eases the development and deployment of autonomous driving algorithms. Domain control also simplifies extending the functionality of each subsystem. As autonomous driving technologies advance, more Tier 1 and other suppliers are showing interest in this field.
To surpass human driving capabilities, autonomous vehicles must first achieve superior vision capabilities.
Creating dependable vision capabilities for self-driving cars has been a major technical challenge. However, by combining various sensors, developers have successfully built a detection system that can perceive a vehicle’s environment even better than human eyesight.
The keys to this system are diversity — different types of sensors — and redundancy — overlapping sensors to verify the accuracy of the car’s detections.
The three primary autonomous vehicle sensors are camera, radar, and lidar. Their collaboration provides the car with views of its surroundings and helps it detect the speed, distance, and three-dimensional shape of nearby objects.
In addition, sensors known as inertial measurement units assist in tracking a vehicle’s acceleration and location.
To comprehend how these sensors operate on a self-driving vehicle — and substitute and enhance human driving perception — let’s begin by examining the most commonly utilized sensor, which is the camera.
The Camera Never Deceives
From images to video, cameras are the most precise way to generate a visual depiction of the world, particularly for self-driving vehicles.
Autonomous cars depend on cameras positioned on every side — front, rear, left, and right — to merge a 360-degree view of their surroundings. Some have a broad field of view — up to 120 degrees — and a shorter range. Others concentrate on a narrower view to offer long-range visuals.
Some vehicles even incorporate fish-eye cameras, which feature super-wide lenses that offer a panoramic view, enabling the vehicle to park itself.
Although they provide precise visuals, cameras have their limitations. They can discern details of the surrounding environment, but the distances of those objects need to be calculated in order to determine their exact locations. It is also more challenging for camera-based sensors to detect objects in low visibility conditions, such as fog, rain, or nighttime.
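A rough sketch of why distance must be inferred rather than directly measured by a single camera: under a pinhole-camera model, an object of known real-world height can be ranged from its apparent height in pixels. The focal length and object height below are illustrative assumptions.

```python
def monocular_distance(focal_length_px: float, real_height_m: float,
                       pixel_height_px: float) -> float:
    """Pinhole-camera range estimate: distance = f * H / h.

    Works only if the object's true height is known (or assumed), which is
    why pure camera ranging is less direct than radar or LiDAR.
    """
    return focal_length_px * real_height_m / pixel_height_px

# A 1.5 m tall car appearing 60 px tall through a 1200 px focal length lens
# is roughly 30 m away.
print(monocular_distance(1200.0, 1.5, 60.0))  # 30.0
```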
On the Radar
Radar sensors can complement camera vision in instances of low visibility, such as nighttime driving, and enhance detection for self-driving vehicles.
Traditionally utilized to detect ships, aircraft, and weather formations, radar operates by transmitting radio waves in pulses. Once those waves strike an object, they return to the sensor, providing data on the speed and location of the object.
Similar to the vehicle’s cameras, radar sensors typically surround the car to detect objects from every angle. They can determine speed and distance; however, they cannot distinguish between different types of vehicles.
While the data provided by 360-degree radar and camera are adequate for lower levels of autonomy, they do not cover all situations without a human driver. This is where lidar comes into play.
Laser Focus
Camera and radar sensors are standard: most new cars today already utilize them for advanced driver assistance and parking assistance. They can also cover lower levels of autonomy when a human is supervising the system. However, for full driverless capability, lidar — a sensor that measures distances by emitting laser pulses — has proven to be extremely beneficial.
Lidar enables self-driving vehicles to have a three-dimensional view of their surroundings. It provides shape and depth to surrounding vehicles and pedestrians, as well as the road layout. Additionally, like radar, it functions effectively in low-light conditions.
By emitting invisible lasers at incredibly rapid speeds, lidar sensors are capable of creating a detailed three-dimensional image from the signals that bounce back instantaneously. These signals create “point clouds” that represent the vehicle’s surrounding environment, enhancing safety and diversifying sensor data.
Vehicles only require lidar in a few key locations in order to be effective. However, these sensors are more costly to implement — up to 10 times the cost of camera and radar — and have a more limited range.
Putting It All Together
Camera, radar, and lidar sensors provide abundant data about the car’s surroundings. However, similar to how the human brain processes visual data captured by the eyes, an autonomous vehicle must be capable of making sense of this continuous flow of information.
Self-driving vehicles accomplish this through a procedure known as sensor fusion. The sensor inputs are fed into a high-performance, centralized AI computer, such as the NVIDIA DRIVE AGX platform, which integrates the relevant portions of data for the vehicle to make driving decisions.
Therefore, rather than relying solely on one type of sensor data at specific moments, sensor fusion makes it feasible to combine various information from the sensor suite — such as shape, speed, and distance — to ensure reliability.
It also offers redundancy. When deciding to switch lanes, receiving data from both camera and radar sensors before moving into the next lane significantly enhances the safety of the maneuver, similar to how current blind-spot warnings serve as a backup for human drivers.
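As a toy illustration of this redundancy, the sketch below permits a lane change only when no sensor, camera or radar, reports a conflicting object in the target lane. The detection structure and thresholds are hypothetical, not any vendor's fusion logic.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    sensor: str          # "camera" or "radar"
    lane: str            # "left", "ego", or "right"
    distance_m: float
    closing_speed_mps: float

def lane_change_safe(target_lane: str, detections: list[Detection],
                     min_gap_m: float = 20.0) -> bool:
    """Permit the maneuver only if no sensor reports a conflicting object.

    Any single sensor reporting a conflict vetoes the lane change, so a
    missed detection by one modality is backed up by the other.
    """
    for det in detections:
        if det.lane == target_lane and (det.distance_m < min_gap_m
                                        or det.closing_speed_mps > 0.0):
            return False
    return True

detections = [
    Detection("camera", "left", 12.0, 3.0),   # camera sees a closing car
    Detection("radar",  "left", 12.5, 3.2),   # radar confirms it
]
print(lane_change_safe("left", detections))   # False: stay in lane
```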
The DRIVE AGX platform executes this process as the vehicle operates, ensuring that it possesses a complete, up-to-date picture of the surrounding environment. This means that unlike human drivers, autonomous vehicles do not have blind spots and are constantly aware of the evolving and changing world around them.
Taking steps to hasten the development of self-driving vehicles, NVIDIA was announced today as an Autonomous Grand Challenge winner at the Computer Vision and Pattern Recognition (CVPR) conference, which is taking place in Seattle this week.
Building on last year’s victory in 3D Occupancy Prediction, NVIDIA Research topped the leaderboard this year in the End-to-End Driving at Scale category with its Hydra-MDP model, outperforming over 400 entries from around the world.
This achievement underscores the importance of generative AI in constructing applications for physical AI deployments in autonomous vehicle (AV) development. The technology can also be applied to industrial settings, healthcare, robotics, and other areas.
The CVPR Innovation Award was given to the victorious entry, acknowledging NVIDIA’s method of enhancing “any end-to-end driving model using learned open-loop proxy metrics.”
Moreover, NVIDIA unveiled NVIDIA Omniverse Cloud Sensor RTX, a collection of microservices enabling precise sensor simulation to speed up the development of fully autonomous machines of all kinds.
How End-to-End Driving Functions:
The endeavor to create self-driving cars is not a short race, but more of an ongoing triathlon, with three distinct yet vital elements functioning concurrently: AI training, simulation, and autonomous driving. Each necessitates its individual accelerated computing platform, and together, purpose-built full-stack systems for these steps form a potent trio that enables continuous development cycles, perpetually improving in performance and safety.
To achieve this, a model is initially trained on an AI supercomputer like NVIDIA DGX. It is subsequently tested and verified in simulation — utilizing the NVIDIA Omniverse platform and running on an NVIDIA OVX system — before being integrated into the vehicle, where, finally, the NVIDIA DRIVE AGX platform processes sensor data through the model in real time.
Developing an autonomous system to navigate safely in the intricate physical world is exceedingly challenging. The system must comprehensively perceive and understand its surroundings, then make accurate, safe decisions in a fraction of a second. This necessitates human-like situational awareness to handle potentially hazardous or uncommon scenarios.
AV software development has conventionally been founded on a modular approach, with distinct components for object detection and tracking, trajectory prediction, and path planning and control.
End-to-end autonomous driving systems streamline this process by utilizing a unified model to take in sensor input and produce vehicle trajectories, helping avoid excessively intricate pipelines and providing a more comprehensive, data-driven approach to address real-world scenarios.
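To illustrate the interface of an end-to-end system, here is a toy model sketch (assuming PyTorch is available) in which a single network maps fused sensor features directly to future waypoints. It is only a schematic stand-in, not Hydra-MDP or any production model.

```python
import torch
import torch.nn as nn

class ToyEndToEndPlanner(nn.Module):
    """Toy stand-in for an end-to-end driving model: one network maps fused
    sensor features directly to a future trajectory, with no separate
    detection / prediction / planning modules."""

    def __init__(self, feature_dim: int = 256, horizon_steps: int = 10):
        super().__init__()
        self.horizon_steps = horizon_steps
        self.head = nn.Sequential(
            nn.Linear(feature_dim, 128),
            nn.ReLU(),
            nn.Linear(128, horizon_steps * 2),  # one (x, y) waypoint per step
        )

    def forward(self, sensor_features: torch.Tensor) -> torch.Tensor:
        # sensor_features: (batch, feature_dim) embedding of camera/LiDAR input
        out = self.head(sensor_features)
        return out.view(-1, self.horizon_steps, 2)  # (batch, steps, xy)

planner = ToyEndToEndPlanner()
trajectory = planner(torch.randn(1, 256))
print(trajectory.shape)  # torch.Size([1, 10, 2])
```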
Navigating the Grand Challenge:
This year’s CVPR challenge tasked participants with developing an end-to-end AV model, trained using the nuPlan dataset, to generate driving trajectory based on sensor data.
The models were entered for testing inside the open-source NAVSIM simulator and were assigned the task of navigating thousands of scenarios they had not previously encountered. Model performance was evaluated based on metrics for safety, passenger comfort, and deviation from the original recorded trajectory.
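As a toy illustration of how such metrics might be combined, the sketch below scores a trajectory from a collision flag, a comfort term, and the average displacement from the recorded trajectory. The weights and thresholds are arbitrary assumptions and do not reproduce the official NAVSIM metrics.

```python
def score_trajectory(collided: bool, max_abs_accel_mps2: float,
                     avg_displacement_m: float) -> float:
    """Toy score in [0, 1]: higher is better. Not the official NAVSIM metric."""
    if collided:
        return 0.0  # safety violations dominate everything else
    comfort = max(0.0, 1.0 - max_abs_accel_mps2 / 5.0)    # gentle below 5 m/s^2
    closeness = max(0.0, 1.0 - avg_displacement_m / 3.0)  # near the log within 3 m
    return 0.5 * comfort + 0.5 * closeness

print(score_trajectory(False, 1.5, 0.6))  # 0.75
```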
NVIDIA Research’s winning end-to-end model processes camera and lidar data, along with the vehicle’s trajectory history, to generate a safe, optimal vehicle path for the five seconds following the sensor input.
The workflow NVIDIA researchers utilized to win the competition can be reproduced in high-fidelity simulated environments with NVIDIA Omniverse. This means AV simulation developers can replicate the workflow in a precise environment before testing their AVs in real life. NVIDIA Omniverse Cloud Sensor RTX microservices will be accessible later this year. Register for early access.
Additionally, NVIDIA secured the second position for its submission to the CVPR Autonomous Grand Challenge for Driving with Language. NVIDIA’s method connects vision language models and autonomous driving systems, integrating the prowess of large language models to aid in decision-making and accomplishing generalizable, explainable driving behavior.
Because individual sensors vary widely in cost, automobile manufacturers are trying to determine how many sensors are actually necessary for fully autonomous driving.
These sensors (cameras, LiDAR, radar, ultrasonic, and thermal sensors) are used to gather data about the surrounding environment. No single sensor type is adequate, because each has its own limitations. That is the primary impetus behind sensor fusion, which combines multiple sensor types to achieve safe autonomous driving.
All vehicles at L2 or above rely on sensors to “survey” the surrounding environment and execute tasks such as lane centering, adaptive cruise control, emergency braking, and blind-spot warning. So far, original equipment manufacturers (OEMs) have embraced considerably different design and deployment approaches.
In May 2022, Mercedes-Benz launched in Germany its first vehicle capable of L3 autonomous driving. L3 autonomous driving is offered as an option on the S-Class and EQS, and is scheduled for launch in the United States in 2024.
According to the company, DRIVE PILOT builds on the driver-assistance package (radar and cameras) and adds new sensors, including LiDAR, an advanced stereo camera behind the windshield, and a multi-purpose camera in the rear window. Microphones (particularly for detecting emergency vehicles) and a moisture sensor in the front wheel housing have also been added. In total, some 30 sensors are installed to capture the data required for safe autonomous driving.
Tesla has taken a different direction. In 2021, it announced plans to deploy Tesla Vision, its camera-based autonomous driving technology, on Model 3 and Model Y, extending it to Model S and Model X in 2022. The company also decided to remove ultrasonic sensors.
One ongoing challenge in designing autonomous driving systems is the limitation of each individual sensor. The critical issue is not just the quantity, type, and positioning of sensors, but how AI/ML technology interacts with those sensors to analyze the data and make optimal driving decisions.
To address these limitations, sensor fusion, which integrates data from multiple sensors, may be essential for achieving peak performance and safety in autonomous driving.
“Artificial intelligence technology is extensively employed in autonomous driving,” said Thierry Kouthon, Security IP Technical Product Manager at Rambus. Even basic ADAS functions require vehicles to demonstrate a level of environmental awareness equal to or surpassing that of a human driver. Vehicles must recognize other vehicles, pedestrians, and roadside infrastructure, and accurately determine their positions. This demands AI deep learning technology for effective pattern recognition.
Visual pattern recognition, a sophisticated field in deep learning, is widely utilized in vehicles. Additionally, vehicles must continually calculate their optimal trajectory and speed, necessitating artificial intelligence to effectively manage route planning capabilities. LiDAR and radar can provide distance information critical for accurately mapping the vehicle’s surroundings.
Sensor fusion, which amalgamates data from different sensors to enhance understanding of the vehicle environment, remains an active area of research.
“Each sensor type has its limitations,” Kouthon explained. “Cameras excel at object recognition but provide limited distance information and require substantial computing resources for image processing. Conversely, LiDAR and radar offer excellent distance information but with lower clarity. Furthermore, LiDAR performs poorly in adverse weather conditions.”
Determining the required number of sensors for an autonomous driving system is not straightforward. Original equipment manufacturers are presently grappling with this issue. Other factors to consider include the varied demands for truck driving on open roads versus urban robot taxis.
“This is a complex calculation because each automotive OEM has its own architecture that prioritizes vehicle protection by providing improved spatial positioning, longer distances, high visibility, object recognition, classification, and differentiation between various objects,” noted Amit Kumar, Director of Product Management and Marketing at Cadence.
“This also hinges on the level of autonomy a carmaker chooses to offer. In short, to achieve partial autonomy, the minimum number of sensors required may range from 4 to 8 of different types. For full autonomy, 12 or more sensors are currently in use.”
Kumar noted that Tesla currently uses 20 sensors (8 cameras plus 12 ultrasonic sensors, for Level 3 or below) with no LiDAR or radar. The company strongly believes in computer vision, and its sensor suite is suited to L3 autonomy. Reports suggest that Tesla may incorporate radar to enhance autonomous driving.
Zoox has integrated four LiDAR sensors along with a combination of cameras and radar sensors. The fully autonomous vehicle is designed to operate on clearly defined, well-mapped routes. Commercial deployment is still pending, although limited use cases will arrive sooner than for privately owned passenger cars.
Nuro’s autonomous delivery truck prioritizes functionality over aesthetics. It utilizes a 360-degree camera system with four sensors, along with a 360-degree LiDAR sensor, four radar sensors, and ultrasonic sensors.
Implementing these systems is not a straightforward process.
“The number of required sensors is contingent on an organization’s acceptable risk level and the specific application,” explained Chris Clark, Senior Manager of Automotive Software and Security at Synopsys Automotive Group. When developing robot taxis, sensors are needed not only for road safety but also inside the vehicle to monitor passenger behavior and ensure passenger safety. In this scenario, the environment is densely populated and urban, which places different demands on the vehicle than highway driving, where distances are longer and there is more room to react.
On highways, the likelihood of pedestrians or other objects intruding onto the roadway is relatively low. There is no fixed rule specifying the number of sensors and cameras needed to cover all angles of an autonomous vehicle.
Nevertheless, the number of sensors will be determined by the specific use cases the vehicle will be dealing with. “For example, in the case of a robot taxi, it’s essential to utilize LiDAR and standard cameras, as well as ultrasound or radar, due to the high density of information that needs to be processed,” stated Clark. Additionally, a sensor for V2X needs to be included, ensuring that the data received by the vehicle aligns with what the vehicle perceives in its surrounding environment. In the context of highway truck transportation solutions, a variety of sensor types will be employed.
Ultrasound is less useful for highway applications, except perhaps for something akin to platooning, and even then it is not a forward-looking sensor; rather, it can operate both forward and rearward to keep track of the other vehicles in the platoon. LiDAR and radar, however, have grown increasingly crucial because of the distances and ranges trucks must consider when operating on highways.
Another aspect to consider is the level of analysis required. “There’s such a vast amount of data to process that it’s necessary to determine how much of it is truly significant,” Clark explained. This is where the types and functions of sensors become interesting. For instance, if LiDAR sensors can conduct local analysis in the early stages of the pipeline, less data needs to be sent back to the sensor-fusion stage for further analysis. That reduction in data, in turn, lowers the overall computational power and system design cost. Otherwise, the vehicle would require additional processing in the form of an integrated computing environment or a dedicated ECU focused on partitioning and analyzing the sensor grid.
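As a rough sketch of the kind of at-the-sensor reduction described above, voxel downsampling keeps one representative point per occupied grid cell, which can sharply cut the number of points forwarded to the fusion stage. The voxel size here is an arbitrary illustration.

```python
from collections import defaultdict

def voxel_downsample(points, voxel_size_m=0.2):
    """Keep the centroid of each occupied voxel instead of every raw return."""
    buckets = defaultdict(list)
    for x, y, z in points:
        key = (int(x // voxel_size_m), int(y // voxel_size_m), int(z // voxel_size_m))
        buckets[key].append((x, y, z))
    return [tuple(sum(c) / len(c) for c in zip(*pts)) for pts in buckets.values()]

raw = [(0.01 * i, 0.0, 0.0) for i in range(1000)]  # 1000 points along 10 m
print(len(voxel_downsample(raw)))  # 50 voxels -> 20x less data to the fusion stage
```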
Sensor fusion can be costly. In the early days, a LiDAR system comprising multiple units could cost as much as $80,000. The high cost was due to the mechanical components within the device. Today the cost is significantly lower, and some manufacturers anticipate that it may eventually drop to as low as $200 to $300 per unit. Emerging thermal sensor technology is expected to be priced at around a few thousand dollars.
Overall, OEMs will continue to face pressure to reduce the cost of deploying sensor systems. Opting for more cameras instead of LiDAR systems helps OEMs cut manufacturing expenses.
“The fundamental definition of safety in urban environments involves preventing all avoidable collisions,” noted David Fritz, Vice President of Hybrid and Virtual Systems at Siemens Digital Industries Software. The minimum number of sensors required varies by use case. Some believe that, in the future, extensive smart-city infrastructure will reduce the need for in-vehicle sensors in urban environments.
Car-to-car communication might also impact sensors.
“In that scenario the number of onboard sensors may decrease, but we are not there yet,” observed Fritz. “Moreover, there will always be situations where AVs must assume that, due to a power failure or other outage, external information is unavailable. Hence, some sensors will always need to be installed on the vehicle, not only in urban areas but also in rural areas.
Many of the designs we’ve been exploring necessitate eight external cameras and multiple internal cameras. By properly calibrating the two front cameras, we can achieve low latency, high-resolution stereoscopic vision, provide depth range of objects, and decrease the reliance on radar. We do this at the front, rear, and both sides of the vehicle to obtain a complete 360° view.”
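A minimal sketch of the stereo-depth relation behind a calibrated front camera pair: depth equals focal length times baseline divided by disparity. The numbers below are illustrative, not a specific camera setup.

```python
def stereo_depth(focal_length_px: float, baseline_m: float,
                 disparity_px: float) -> float:
    """Depth from a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0.0:
        return float("inf")  # zero disparity: the point is effectively at infinity
    return focal_length_px * baseline_m / disparity_px

# 1000 px focal length, 30 cm baseline, 20 px disparity -> 15 m away.
print(stereo_depth(1000.0, 0.3, 20.0))
```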
As all the cameras handle object detection and classification, essential information will be relayed to the central computing system to make control decisions.
Fritz noted that if infrastructure or vehicle-to-vehicle information is available, it will be combined with data from the onboard sensors to create a more comprehensive 3D view for better decision-making. Additional cameras are used internally for driver monitoring and for detecting occupancy conditions, such as objects left behind in the vehicle. A low-cost radar may also be included to handle harsh weather conditions such as heavy fog or rain, as an advanced addition to the sensor kit.
There has been limited use of LiDAR recently. In some cases, the performance of LiDAR can be impacted by echoes and reflections. Initially, autonomous driving prototypes heavily relied on GPU processing of LiDAR data, but more intelligent architectures now tend to favor high-resolution, high FPS cameras, and their distributed architecture can better optimize the data flow of the entire system.
Optimizing sensor fusion can be a complicated task. Identifying the combination that can provide the best performance requires functional testing as well as modeling and simulation solutions provided by companies like Ansys and Siemens, which are relied upon by OEMs.
Enhanced technologies such as V2X, 5G, advanced digital maps, and GPS in intelligent infrastructure will make autonomous driving feasible with fewer in-vehicle sensors. However, the advancement of these technologies will require the support of the entire automotive industry and the development of smart cities.
Frank Schirrmeister, Vice President of IP Solutions and Business Development at Arteris, noted that the various enhanced technologies serve different purposes. Combining digital twins of map information with path planning can create a safer experience under limited-visibility conditions and enhance sensor-based local decision-making in the car. V2V and V2X information can supplement locally available information in the car, increase redundancy, and provide more data points for safety decisions.
Furthermore, the potential for real-time collaboration between vehicles and roadside infrastructure in the Internet of Vehicles requires technologies such as ultra-reliable low-latency communication (URLLC).
Kouthon noted that, as a result of these requirements, various artificial intelligence technologies are being applied to traffic prediction, 5G resource allocation, congestion control, and other areas. AI can optimize and reduce the heavy load autonomous driving places on network infrastructure. OEMs are expected to build autonomous vehicles on software-defined vehicle architectures, in which ECUs are virtualized and updated over the air. Digital twin technology is crucial for testing software updates in cloud simulations that closely replicate the real vehicle.
For the final implementation, L3 level autonomous driving may require 30+ sensors or a dozen cameras, depending on the OEM architecture. However, there is currently no consensus on which approach is safer, or whether the autonomous driving sensor system will provide the same level of safety in urban environments as it does on highways.
With the cost of sensors expected to decrease in the coming years, there may be opportunities to add new sensor types to the mix to enhance safety in adverse weather conditions. However, OEMs may need considerable time to establish a standardized set of sensors deemed sufficient to ensure safety across all conditions and corner cases.