On August 18, 2021, Senators Blumenthal and Markey asked the US Federal Trade Commission (FTC) to investigate Tesla for misleading advertising about the capabilities of its “Autopilot” technology. The request follows high-profile accidents involving Tesla vehicles and prior requests for FTC action by others. The FTC ought to investigate by considering what evidence, if any, warrants the implicit claim by Tesla that its vehicles on Autopilot, without attentive human supervision, are as safe as or safer than the average human driver. It should not simply consider whether Tesla has any substantiating evidence for its claims about the capabilities of its technology.
If this claim is false, then Tesla’s branding, which implies full autonomy, and a Tesla video that asserts, “The person in the driver’s seat is only there for legal reasons. He is not doing anything. The car is driving itself,” are misleading. A product name can mislead even when it is not literally “false”: terms like “Autopilot” and “Full Self-Driving” imply fully autonomous Level 4 or 5 performance when, in fact, Tesla’s technology merits only a Level 2 designation on the Society of Automotive Engineers (SAE) scale. The FTC may block future use of a brand name or trade name when less restrictive remedies, such as affirmative disclosures, would be insufficient to eliminate the deception conveyed by the name. It did so in ABS Tech Sciences, Inc., enjoining use of the term “ABS” as part of a trademark or trade name because consumers would likely confuse the product with factory-installed anti-lock braking systems.
Statements in the video might be “true” in a narrow sense: at the moment the video was taken, the driver may, in fact, have been doing nothing. The video implies, however, that there is never any need for a human driver, which may not be true even within a limited operational design domain (we know it is not true in all situations because Tesla admitted as much in a letter to the California DMV). If the FTC treats the video as an ad, it will likely look to the net impression created by the video as a whole. That net impression conflicts directly with the fine print of Tesla’s owner’s manuals. Other videos, and statements by Tesla’s CEO, Elon Musk, mislead in similar ways.
The FTC, however, need not provide evidence warranting a belief in the claim that “Teslas on Autopilot are not safer than human drivers.” Rather, to show that an ad is misleading, the FTC need only show that Tesla did not possess substantiation evidence before advertising; for a safety claim, that substantiation must be reliable scientific evidence. In its seminal Pfizer decision, the FTC held that an advertiser must have a “reasonable basis” for making objective product claims, whether express or implied. Without substantiation evidence prior to publication, the FTC deems the advertising misleading. Under Section 5 of the Federal Trade Commission Act of 1914 (FTC Act), an advertising claim is deceptive if it is “likely to mislead consumers acting reasonably in the circumstances, and . . . is material.”
Some argue that requiring an advertiser to have prior substantiation evidence for a safety claim relieves the government of its affirmative burden to prove that the safety claim is false with its own evidence. In the words of one observer, “Speech not demonstrably false, and also not demonstrably true, is precisely that which feeds debate central to the evolution of a free society and is protected by the intended meaning of our First Amendment”.
What if Tesla asserts that a Level 2 Tesla on autopilot (without supervision) is in fact safer than the average human driver?
The FTC typically requires a high level of substantiation for health and safety claims, usually “competent and reliable scientific evidence,” typically defined as “tests, analyses, research, studies, or other evidence based upon the expertise of professionals in the relevant area, that has been conducted and evaluated in an objective manner by persons qualified to do so, using procedures generally accepted in the profession to yield accurate and reliable results.” Tesla may not have substantiation evidence of this sort because autonomous vehicle standards are in flux. If, however, Tesla did possess some evidence of this sort, then the FTC would need to provide evidence of its own which warrants a belief, to a reasonable certainty, that Tesla’s safety claim is false despite the prior substantiation.
The FTC might simply point out that Tesla’s technology is designated Level 2, and that full autonomy comes only with Levels 4 and 5 on the SAE scale, to imply that Tesla’s claim is false. But labels are not necessarily conclusive. As a thought experiment, let us follow a harder path to show that the apparent simplicity of dueling claims like “Tesla is safer” and “Tesla is not safer” than a human driver conceals vast complexity.
In any individual accident case, it is too simplistic to claim that a similarly situated human driver would not have made the mistake that caused the crash. Machine drivers will make different kinds of mistakes than human drivers. We expect that the identities of those killed in Tesla Autopilot crashes will differ from the identities of those who would have been killed had Autopilot not existed. The identities of those saved will also differ. Consider the following stop sign example.
In case 1, the human driver did not see a stop sign because he was distracted by texting. In case 2, the machine driver failed to perceive a stop sign because it had graffiti on it. If, for every one person killed by Autopilot’s failure to recognize a stop sign due to graffiti, ten persons were killed by a failure to notice a stop sign while texting, we might consider the machine driver safer, at least in this respect. This is true even if the particular accident under examination resulted from Autopilot’s failure to perceive a stop sign when the average human driver would not have been fooled by the graffiti.
Consider a simplified model in which all the mistakes made by the machine driver relate to failures of the perception function to identify objects and properly classify them, and assume all the mistakes made by the human driver follow from drinking, texting, and falling asleep at the wheel. A simple test evaluates the allegation of false advertising: do the machine driver’s mistakes cause fewer fatal accidents than the average human driver’s mistakes over comparable miles traveled?
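To make the test concrete, a minimal sketch of the per-mile comparison follows; every figure in it is hypothetical and serves only to illustrate the arithmetic, not to characterize any actual fleet.

```python
# Hypothetical illustration of the per-mile test described above.
# Every figure is invented for the sake of the example; none is Tesla data.

machine_fatalities = 12      # fatalities attributed to machine perception failures
machine_miles = 4.0e9        # comparable miles driven by the machine driver

human_fatalities = 360       # fatalities attributed to drinking, texting, sleeping
human_miles = 90.0e9         # comparable miles driven by human drivers

machine_rate = machine_fatalities / machine_miles   # fatalities per mile
human_rate = human_fatalities / human_miles

print(f"Machine driver: {machine_rate * 1e8:.2f} fatalities per 100 million miles")
print(f"Human driver:   {human_rate * 1e8:.2f} fatalities per 100 million miles")
print("Machine driver safer on this metric:", machine_rate < human_rate)
```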
Public policymakers often decide upon a course of action based on a cost/benefit analysis like this. There is a subsidiary concern: what if the evidence showed that the machine driver, while overall safer, had the unfortunate byproduct of placing highway and emergency workers at greater risk of injury and death, a particular concern of the Senators calling for an FTC inquiry? Should public policy then disfavor the machine driver? This starts to look a bit like the discrimination based on personal characteristics that MIT’s Moral Machine experiment warns about.
Suppose, for the moment, that investigations will show that many of the accidents involving Tesla vehicles resulted from a perception failure: the software simply did not “see” an object. In a general sense, the technology in the Tesla is to blame. So what? The legal system will develop liability rules to make one or more persons on the Tesla side of the ledger responsible for the payment of damages for perception failures, whether that is the Tesla owner, Tesla itself, or a component manufacturer.
The same exercise occurs when a human driver makes a perception mistake, because of drinking, texting, or falling asleep, that leads to an accident. In these cases, the human driver is to blame and the legal system makes the human driver responsible for payment of damages.
In the abstract, it is not a moral failing or false advertising for Tesla to deploy vehicles which cause accidents due to system failures like this, any more than it is a moral failing for an auto dealer to sell a car to a person who might text and drive. We expect accidents to happen. False advertising comes from deploying a machine driver that is less safe than a human driver while implying the opposite. We need a standard for comparison. (Tesla’s other moral failings relate to a design which easily allows use outside the operational design domain, thus creating an attractive nuisance.)
Measuring relative safety based on comparable miles traveled is easier to describe than to perform. Are the statistics collected by an independent party or by the manufacturer? If the latter, how are the numbers audited or verified? Are types of driving conditions noted, such as highway versus city and night versus day? How is the number of safety driver interventions treated? Should simulations count in the statistics for machine drivers? Are only fair-weather miles included? Should accidents involving drinking, texting, and sleeping be factored out because we want to measure performance against an average driver under normal conditions? Assessing the relative safety of machine drivers and human drivers at this high level of abstraction does not require development or use of a safety case methodology. It is simply bean counting. The fear is that there are not enough comparable beans to make this simple comparison statistically significant. The defense of a safety case supplements the bean counting, and the bean counting may form part of the evidence in a safety case.
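To see why there may not be enough beans, here is a rough sketch, with entirely hypothetical counts and mileage, of the uncertainty surrounding a fatality rate estimated from only a handful of events; the chi-square bounds implement the standard exact (Garwood) confidence interval for a Poisson count.

```python
# A rough sketch of the "not enough beans" problem: with only a handful of
# observed fatalities, the confidence interval around a fatality rate is too
# wide to separate the machine driver from the human baseline.
# The counts, mileage, and baseline below are hypothetical.

from scipy.stats import chi2

observed_fatalities = 5    # hypothetical fatalities in the machine-driver fleet
miles = 4.0e8              # hypothetical comparable miles in the same operating domain
alpha = 0.05

# Exact (Garwood) 95% confidence interval for a Poisson count.
lower_count = chi2.ppf(alpha / 2, 2 * observed_fatalities) / 2
upper_count = chi2.ppf(1 - alpha / 2, 2 * (observed_fatalities + 1)) / 2

per_100m = 1e8 / miles     # conversion factor: counts -> rate per 100 million miles
print(f"Point estimate: {observed_fatalities * per_100m:.2f} per 100M miles")
print(f"95% CI:         {lower_count * per_100m:.2f} to {upper_count * per_100m:.2f}")

# If a human baseline of roughly 1.3 per 100M miles falls inside this interval,
# the data support neither "safer" nor "less safe" to any reasonable certainty.
```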
Start with a claim: the machine driver is safer than a human driver. Perhaps limit the claim to an operational design domain, like fair weather or divided highways. What evidence warrants our belief that this claim is true?
Provide a metric to ground the comparison: the number of comparable miles traveled per fatality. The hard part comes with determining how many miles the machine driver travels per fatality. For most AV companies, there are no meaningful fatality statistics because road testing on public highways takes place with a safety driver as backup. When the automated system fails, an attentive safety driver intervenes to prevent an accident. To compensate for the lack of fatalities, counterfactual computer simulations are run to project the consequences of a failure to intervene.
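One way to picture that projection, in a deliberately oversimplified and hypothetical form, is to multiply the number of logged safety-driver interventions by a simulated probability that a missed intervention would have ended in a fatality:

```python
# A deliberately oversimplified, hypothetical projection of a fatality rate for a
# test fleet that never operates without a safety driver.

interventions = 2_400            # safety-driver takeovers logged during testing
test_miles = 6.0e7               # miles driven during the same testing period
p_fatal_if_missed = 0.001        # simulated probability that a missed takeover ends in a fatality

projected_fatalities = interventions * p_fatal_if_missed
projected_rate = projected_fatalities / test_miles * 1e8   # per 100 million miles
print(f"Projected: {projected_rate:.2f} fatalities per 100M miles (from counterfactual simulation)")
```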
Suppose we accept the computer simulations of fatalities as sound. We then need to normalize the actual fatality data for human drivers to make a valid comparison to the machine driver fatalities projected in simulations. To make the comparison ‘apples to apples,’ adjustments to the accident statistics for human drivers may be needed, as the sketch below suggests.
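One way such a normalization might look is sketched below; every adjustment factor is a hypothetical placeholder, the shares are treated as disjoint for simplicity, and each would have to be defended with real crash and mileage data.

```python
# A sketch of the normalization step: adjusting the human-driver baseline so the
# comparison is "apples to apples." Every number below is a hypothetical
# placeholder that would have to be defended with real crash and mileage data.

human_rate_all_conditions = 1.3   # fatalities per 100M miles, all conditions (illustrative)

# Hypothetical shares of human-driver fatalities outside the comparison we want
# (treated as disjoint for simplicity):
share_impairment = 0.30           # drinking, texting, falling asleep
share_bad_weather = 0.10          # outside a fair-weather domain
share_non_highway = 0.45          # outside a divided-highway domain

# Keep only sober, fair-weather, divided-highway fatalities ...
in_domain_rate = human_rate_all_conditions * (1 - share_impairment - share_bad_weather - share_non_highway)

# ... then re-scale by the (hypothetical) share of all miles driven in that domain.
share_of_miles_in_domain = 0.35
normalized_baseline = in_domain_rate / share_of_miles_in_domain

print(f"Normalized human baseline: {normalized_baseline:.2f} fatalities per 100M comparable miles")
```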
Tesla is a partial exception to this pattern because, in actual use, some Tesla owners operate their vehicles as if there were no safety driver backup. Some fatalities in accidents involving Tesla vehicles thus provide a small window into the relative performance of a machine driver with current Tesla technology and a human driver. But based on actual miles driven, and the small number of fatalities, it may not be possible to draw any statistically significant conclusions about relative safety, even in a limited operational design domain such as divided highway usage.
What ought not suffice as substantiation evidence, however, are armchair reflections on how merely eliminating drunk driving, texting, and sleeping makes the machine driver safer than a human driver.
So far, we have treated the AV and the human car/driver combination as black boxes with outputs in the form of fatalities per comparable miles traveled. We might, however, make a safety case for the machine driver by looking inside the black box. In a simplified example, consider three types of AV failures which might lead to a perception failure: a hardware failure; a maintenance failure; and a failure of object recognition and classification by the software.
A safety case might assess the potential for a chip failure based on the frequency of failures in other chips from the same factory which are used in mission-critical applications. This allows us to make a reasoned assumption in the safety case about the frequency of failure of the chips used in our AVs. A chip malfunction could result in a perception failure to properly identify and classify an object.
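A hedged illustration of how such an assumption might enter the safety case arithmetic follows; the FIT rate (failures per billion device-hours), the average speed, and the fraction of chip failures that actually blind the perception function are all hypothetical placeholders.

```python
# A hedged sketch of folding a chip-reliability assumption into the safety case.
# The FIT rate, average speed, and fault fraction are hypothetical placeholders.

fit_rate = 50.0                    # assumed failures per billion device-hours (FIT)
avg_speed_mph = 30.0               # assumed average speed, to convert miles to hours
hours_per_100m_miles = 1e8 / avg_speed_mph

expected_failures = fit_rate * hours_per_100m_miles / 1e9   # chip failures per 100M miles
p_blinds_perception = 0.2          # assumed fraction of chip failures that cause a perception failure

print(f"Expected chip-induced perception failures per 100M miles: "
      f"{expected_failures * p_blinds_perception:.3f}")
```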
As for maintenance, suppose that the AV employs a wiper system to clean the lenses of the cameras used to collect data about the environment. A dirty lens might result in a perception failure as well. Assumptions need to be made about keeping wiper fluid in the tank, including the addition of antifreeze. Perhaps the reliability of conventional windshield-wiper systems using wiper fluid is relevant to assessing hardware repurposed from conventional vehicles to clean these perception sensors. For operation in cold-weather climates, we make assumptions about how the components interact with road salt, and so on.
As for failures of image recognition software to identify and properly classify objects, we have no ready answer for how to make an assumption about the nature and frequency of failure, because the software architecture uses artificial neural networks whose behavior is difficult to predict and not well understood. We might make some judgments about the relative safety of two different software systems by testing them on the same dataset. Perhaps a system which uses both tracking and tiling methodology to identify objects results in more accurate object identification. Image size measured in pixels might matter, as might the amount of computing power and system memory. A combination of RGB images and lidar might exceed the performance of a system using RGB images alone, and so forth. But these are relative rankings, not absolute numbers from which we might infer failure frequencies.
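A minimal sketch of that kind of relative comparison, assuming two placeholder detectors evaluated on a shared labeled dataset, might look like the following; a real evaluation would use standard object-detection metrics (such as mean average precision) rather than the simple accuracy measure shown here.

```python
# A minimal sketch of a relative comparison between two perception systems
# evaluated on the same labeled dataset. Detectors, labels, and the simple
# accuracy measure are placeholders; real evaluations would use standard
# detection metrics (e.g., mean average precision).

from typing import Callable, List, Tuple

Image = object          # placeholder for an image type
Label = str             # placeholder for an object-class label

def accuracy(detector: Callable[[Image], Label],
             dataset: List[Tuple[Image, Label]]) -> float:
    """Fraction of images whose object class the detector identifies correctly."""
    correct = sum(1 for image, truth in dataset if detector(image) == truth)
    return correct / len(dataset)

def compare(detector_a: Callable[[Image], Label],
            detector_b: Callable[[Image], Label],
            dataset: List[Tuple[Image, Label]]) -> str:
    """Return only a relative ranking -- not an absolute failure frequency."""
    a, b = accuracy(detector_a, dataset), accuracy(detector_b, dataset)
    winner = "A" if a >= b else "B"
    return f"System A: {a:.3f}, System B: {b:.3f} -> {winner} ranks higher on this dataset"
```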
It seems doubtful that Tesla has in hand the kind of substantiation evidence that the FTC requires prior to making an advertising claim. On the other hand, if the FTC itself had to provide evidence to warrant belief in the claim that Tesla technology is less safe than a human driver, that showing would require a complex examination of what is in the “black box,” combined with some degree of statistical data showing the relative frequency of fatal accidents in at least some operational design domains. If the FTC wins a false advertising claim against Tesla solely because of the allocation of the burden of proof, we confront a disturbing possibility: the FTC cannot prove that Tesla vehicles are less safe, and Tesla cannot prove that its vehicles are safer, to any reasonable certainty.
If this is the unhappy state of the current science, the public faces the possibility of deployment of autonomous vehicles at scale when the relative safety of a machine driver versus a human driver is, for all practical purposes, unknown. That seems like a large gamble. An FTC investigation that considered the merits of both claims would shed light on whether the public is, in fact, making this bet. Perhaps the time has come for the National Highway Traffic Safety Administration to stop “dithering” and establish some standards and metrics to measure safety.
William H. Widen is a Professor at University of Miami School of Law, Coral Gables, Florida.
Suggested citation: William H. Widen, Machine Driver Vs. Human Driver in Possible FTC Action Against Tesla, JURIST – Academic Commentary, August 26, 2021, https://www.jurist.org/commentary/2021/08/william-widen-machine-driver-vs-human-driver-ftc-tesla/.
This article was prepared for publication by Sambhav Sharma, a JURIST Staff Editor. Please direct any questions or comments to him at commentary@jurist.org.