How Big is Big Tech? Nobody Really Knows.

Last month, The New York Times published the comical notion that Google makes $4.7 billion in annual revenue from news publishers. The statistic was based solely on a 10-year extrapolation from a single off-handed comment Marissa Mayer made in 2008. By the same methodology used in this study, I can now sprint 100 meters in about three seconds and drink around 200 Bud Lights in a single night.

As this “fact” was remarkably egregious, it got its day in the court of public opinion and was roundly mocked across the web. Unfortunately, when it comes to statistics haphazardly tossed out regarding the size of tech companies, scrutiny is more the exception than the rule. Across the zeitgeist, our notion of the scale of Big Tech is guided by extremely questionable metrics, adding a complex wrinkle to the antitrust discussion with no clear resolution in sight.

A litany of factors makes collecting accurate data on the scale of Big Tech difficult. For starters, while public companies are required to disclose all revenue, they are not always required to break it down in a way that shows their relative position in the market.

A microcosmic example of this phenomenon is illustrated by public coverage of Amazon’s ad business, itemized as a piece of “other” on Amazon’s SEC filings. For much of 2018, NYU Stern School of Business professor Scott Galloway and his company, L2, estimated that Amazon’s advertising revenue would exceed $10 billion. Morgan Stanley, J.P. Morgan, and eMarketer placed it at closer to $5 billion. Both numbers were regularly cited in the press with almost no additional examination. When all was said and done, Amazon reported $3.4 billion in “other” revenue in the fourth quarter of 2018 alone, suggesting that Galloway was ultimately right on the money.

However, it really gets dicey when we try to place revenues reported by public companies in the context of their respective markets. The market sizings are ostensibly based on extrapolations from the best available public data. But often, these numbers seem plucked from thin air with highly opaque methodologies. They’re generally released into the world by large analyst firms or management consultancies and, once published, receive close to zero peer review. If Accenture or Gartner says something, it becomes true.

Often, the numbers change with little explanation as to why. For two years, eMarketer has reported that Amazon is responsible for 50% of all online sales in the United States. As first reported by The Information, they recently amended this to say Amazon is actually responsible for closer to 37% of online retail. Amazon does not work with eMarketer directly to corroborate these numbers, so the main statistic in the market around Amazon may well be the work of a couple of junior analysts.

Speaking of eMarketer, the company pegs U.S. retail as a $5 trillion industry. However, this market sizing includes all sales at restaurants, gas stations, and car dealerships, industries included only under the broadest possible definition of retail. Removing these sectors, the National Retail Federation (NRF) projects that retail will total $3.8 trillion in 2019. Yet with little explanation, the NRF says in its own boilerplate that retail contributes $2.6 trillion in GDP.

Taken together, these statistics are ripe for manipulation depending on what message you are trying to promote.

If you’re Lina Khan, Matt Stoller, or any other antitrust zealot, you can cherry-pick the above numbers to argue that Amazon accounts for 10% of the entire retail market in the United States.

If you’re Amazon, which has built a brilliant PR machine around trying to seem smaller than it really is, you can use eMarketer’s data, which puts Amazon eCommerce revenue at $185 billion against $5 trillion of total retail sales. Suddenly, Amazon is less than 4% of total retail.
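
To see how far the denominator alone can move the headline, here is a minimal sketch that divides the one Amazon revenue figure quoted above by each of the three retail totals mentioned earlier. The dollar amounts come from the text; the labels, the function name, and the decision to treat the GDP-contribution figure as a denominator at all are my own assumptions for illustration, not anyone's published methodology.

```python
# Amazon eCommerce revenue as quoted from eMarketer in the text (USD).
AMAZON_ECOMMERCE_REVENUE = 185e9

# The three "size of retail" figures mentioned in the text (USD).
# Labels are illustrative. Note that the last figure is a GDP
# contribution, not a sales total, so using it as a denominator is
# exactly the kind of apples-to-oranges comparison being criticized.
RETAIL_MARKET_SIZES = {
    "eMarketer, broad definition": 5.0e12,
    "NRF 2019 projection": 3.8e12,
    "NRF boilerplate (GDP contribution)": 2.6e12,
}


def market_share(revenue: float, market_size: float) -> float:
    """Return revenue as a percentage of the chosen market size."""
    return 100 * revenue / market_size


# Same numerator, three denominators.
shares = {
    label: market_share(AMAZON_ECOMMERCE_REVENUE, size)
    for label, size in RETAIL_MARKET_SIZES.items()
}

for label, pct in shares.items():
    print(f"{label}: {pct:.1f}%")
```

With a single, fixed revenue number, the computed share nearly doubles between the first and last rows. Swap in a different estimate of Amazon's sales and the range widens further.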

It gets more confusing. One of the most commonly referenced statistics around Amazon’s dominance is that 55% of consumers begin their product searches on Amazon. Based on a survey of 2,000 shoppers conducted by Bloomreach, the findings were originally published by Bloomberg, Vox, CNBC, and Business Insider and have been referenced dozens of times since.

Here’s how the question that led to this statistic was posed to respondents:

When shopping online, where do you generally begin your search for products?

Amazon
Search engine
A retailer’s website

Now, let’s imagine the same question posed with a more extensive set of choices.

Amazon
Google
Instagram
Pinterest
Alibaba
Craigslist
Directly on a retailer’s website
I exclusively shop in physical stores

Simply by adding a few more options, the number of shoppers starting their search on Amazon is likely to be reduced significantly. This kind of subtle data manipulation isn’t limited to tech; it’s easy to pull off in politics as well. Imagine how conclusions on which issues are important to voters will differ based on the following two questions:

What is the most important issue to you in advance of the 2020 election?

Gun Rights
Healthcare
Immigration

What is the most important issue to you in advance of the 2020 election?

Healthcare
Immigration
Gun Rights
Education
Terrorism
Jobs/Economy
Other
Disenfranchisement of the elderly
The rent is too damn high
What is really going on in Area 51?

Of course, countless other principles from behavioral economics influence the results, including the order in which choices are displayed and whether certain questions are asked before or after a given question. With modest sophistication, the survey designer can easily nudge respondents toward the result he is looking for to make his point. As my GEN colleague, Timothy Kreider, said: “Truth, selectively edited and cynically deployed, makes the most effective lie.”

However, even if the sample size is significant and the survey is designed fairly, the data is not reliable if presented without context as to how it was gathered. Often, this gets lost in the game of hyperlink telephone that plays out on the internet. The first news outlet that picks up the story will often cite the methodology and link to the original survey. But then the second outlet simply publishes the stat and links back to the first news story. Pretty soon the statistic is accepted as gospel without any clear acknowledgment of how it was derived. With malice towards none but laziness by all in the press, dodgy statistics enter the mainstream.

Pretty soon, 65% of jobs for the next generation don’t yet exist and smelling farts cures cancer. What a time to be alive.

While Americans aren’t going to run out and start training for jobs they can’t fathom or become enthusiastic fart-sniffers, questionable data will be relied upon to decide the fate of our biggest corporations.

The challenge is hard enough already. Our elected officials are tasked with taking antitrust frameworks developed for utilities and railroads and somehow applying them to internet monopolies. To make matters stickier, large tech companies have effectively gotten so big that we’re unable to correctly ascertain how big they are. The danger is that with such easily manipulated “facts,” antitrust is being set up to be battled more on moral terms. To spin a Jim Barksdale classic, if we have data, let’s go with data. If all we have are ideologies and feelings, let’s go with mine.

In this context, Facebook would appear most vulnerable. I’d argue that among the big four tech companies, Facebook is the least entrenched in its competitive position and the most subject to market forces. Yet, early indications suggest it will draw the bulk of antitrust scrutiny because of charts like these:

In a traditional sense, this is wrong. We don’t break up companies simply because they are too big, their founders are too powerful, or they’ve committed a series of grave moral lapses. We break them up if their monopoly represents a threat to fair competition. Since a rough-riding American icon invented antitrust, we have a unique tendency to view it as a panacea for litigating ethical transgressions of big companies, even when that doesn’t cleanly apply.

This lens may help explain Facebook’s timing in hastily announcing Libra, rather than waiting until a few weeks had gone by without scandal to ask for access to our wallets. In the wake of a co-founder of the company writing that Facebook should be broken up, perhaps Mark Zuckerberg’s existential fear is antitrust. It would make sense to prove that you are creating something of undisputed value to try and hold the bureaucratic barbarians at the gate. For all of the compelling arguments I’ve seen for breaking up big tech, I’ve yet to see a really good case for breaking up a company that makes life better for its end consumers.

To illustrate the complexity of this debate, I propose the following thought experiment. Imagine in the months and years ahead that Amazon ratchets up its anti-competitive behavior by adding scores of additional AmazonBasics products, docking third-party sellers in its marketplace, and aggressively pushing its own products to consumers. Using the juiced profits from its monopolistic endeavors, Amazon launches direct-to-consumer health insurance through Prime. The initiative is a massive success, lowering the cost of healthcare for millions of underinsured Americans. Cui bono if that version of Amazon is broken up?

Hypothetical scenarios like this rear their ugly heads in the absence of good data in which to ground decision-making. And at the moment, we simply don’t have the data we need to keep emotional decision-making at bay. You can be sure that large tech companies are lining up the best lobbyists to tell Washington that, in the grand scheme of things, they’re still the harmless little disruptors working out of slightly bigger garages. Your move, FTC.