We did more than read. We wrote. We talked. We dissected, for meaning and history. Me, and a dozen other kids I’d just met. It was school, after all.
The Odyssey is great. A proper story. Easy to read, and easy to see why it stuck around.
The Iliad is… not. It’s hard to read. Everyone in it is kind of a jerk. The biggest jerks are the biggest stars. The entire story revolves around a woman - Helen - without giving her agency. Maybe she didn’t want to go home?
For all its difficulty, it’s the more important book. Studying it taught me a lot.
Founders could learn from it even today.
In a book that is hard to read, one section is by far the hardest, weirdest, and seemingly most pointless. We called it the Parade of Ships, but Wikipedia uses the less glamorous “Catalogue of Ships.” It is exactly what it sounds like: A description of a lot of ships. More than a thousand. You know. Because Helen’s face was so beautiful it launched a thousand ships.
This gives us the millihelen: Enough beauty to launch one ship.
First the Boeotians, led by Peneleos, Leitus, Arcesilaus, Prothoenor and Clonius; they came from Hyrie and stony Aulis, from Schoenus, Scolus and high-ridged Eteonus; from Thespeia and Graea, and spacious Mycalessus; from the villages of Harma, Eilesium and Erythrae; from Eleon, Hyle, Peteon, Ocalea and Medeon’s stronghold; from Copae, Eutresis, and dove-haunted Thisbe; from Coroneia and grassy Haliartus, Plataea and Glisas, and the great citadel of Thebes; from sacred Onchestus, Poseidon’s bright grove; from vine-rich Arne, Mideia, holy Nisa and coastal Anthedon. They captained fifty ships, each with a hundred and twenty young men.
That’s just the first paragraph! Every time I read this I delight in its nothingness. Now that I don’t have an essay due.
This litany, 2,500 years later, wakes our deepest fears about dusty old books. You’re probably feeling pretty good about skipping it. Yet it drove people to tell this story again and again. Being in it mattered. To your family. To your village. To everyone in Greece. Without the Catalogue of Ships, The Iliad might not have survived.
Retelling a great story would always draw a crowd. (Remember: Both of these books were told in oral form long before they were ever written down.) But giving every listener a chance to brag or shrink because of the behavior of one of their ancestors… jackpot!
Investors in the $10.1 million round for the company were led by ArcTern Ventures and joined by new backers Capricorn Investment Group, Incite Ventures. Previous financiers in the company included Wireframe Ventures, Congruent Ventures, Ulu Ventures, Energy Foundry, Hardware Club, 1/0 Capital, and Wells Fargo Strategic Capital […].
That’s a long list. Especially so for a company likely raising only its second round of funding (based on the amount).
Then it hit me:
These investors are listed for the exact same reason the ships are catalogued in The Iliad!
Investors are fighting for the modern equivalent (named, ironically, after a different, also unpleasant Greek story). Now it’s earned in investor announcements on sites like TechCrunch, not ship descriptions in stories told in the town square.
Still. Seeing the parallel was a delightful lift to the morning. I have a science degree but a liberal arts education. I love what the combination has done for my career. It’s nice to have it be a source of humor, too.
The parallel provides a lesson for founders:
The Catalogue of Ships describes a thousand vessels, and far more people. But most of them were never mentioned again in the story.
Don’t look for those involved in the investment. Look for who helped the company succeed. Who wrote the first check.
Well. Not all of it. Mostly the social sciences and medicine. And I don’t just mean the fact that they consider Freud canon.
It started with a trickle. A retracted paper here. A study that couldn’t be repeated, there.
Then someone decided to get systematic. It opened the floodgates. A 2016 survey found that more than 70% of scientists had tried and failed to replicate another scientist’s work, and more than half had failed to reproduce their own work.
Reproducibility is fundamental to the scientific method - it’s supposed to be a study of the natural world, which doesn’t change all that often - so what does its absence mean? Are we incompetent? Can we trust anything? Do we know anything?
The high failure rate of venture-backed startups is its own kind of replication crisis: “How could my company fail? I followed the growth-hacking, blitz-scaling advice from the founders who made it big!” I don’t mean to give blogs and podcasts the weight of peer-reviewed science. But our industry seems to trust them as if they deserve it.
What does it mean if a founder can’t get similar results when following the practices of another?
Science has begun to heal itself. It’s time for startups to go through their own reckoning. Their methods are failing most people. It’s time to learn why and how to get better.
What’s wrong with science?
The crisis in science has multiple, interconnected causes. A lot of them come down to taking techniques from simpler systems and applying them to the far more complex study of humans. The practices useful for studying minerals also worked great on metals, but with people? Not so much.
One of the most famous examples of these studies that fizzle under scrutiny is the marshmallow experiment, conducted at Stanford University in 1972 on the children of students enrolled there. It produced original, important conclusions on the ability of children to endure delayed gratification, and later studies showed that ability was highly correlated with success later in life. Suddenly we’ve got a new tool for predicting, at a very young age, how successful you’ll be.
Or… maybe not. Further studies showed the original work was actually just exposing the socioeconomic background of the kids. If your family is well off, you are comfortable with delayed gratification and, just coincidentally, are also likely to be well off when you’re older. If you’re from a poor family, delayed gratification is harder to accept and, huh, you’re also more likely to be poor than those kids of rich parents.
Once someone reran the study with a larger group of kids (900 instead of 90) and controlled for socioeconomic background… the effect largely disappeared. It’s not all that surprising that kids with no food insecurity are better at delaying gratification and also will be more successful in life. It certainly doesn’t grab the headlines like announcing that kids who can wait five minutes to eat a marshmallow will earn more money than those who can’t. No HBR article for that one.
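To make the confounding concrete, here is a minimal simulation sketch. Everything in it is made up for illustration: the function names, the 50/50 split, and the effect sizes are assumptions, not figures from either study. In this toy world, family background boosts both a kid’s waiting time and their later outcome, and patience itself does nothing. That is deliberately more extreme than the real re-run, where the effect only largely disappeared.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n=20000, seed=0):
    """Hypothetical kids: background drives both waiting and later outcome."""
    rng = random.Random(seed)
    kids = []
    for _ in range(n):
        well_off = rng.random() < 0.5
        boost = 2.0 if well_off else 0.0       # invented effect size
        wait = boost + rng.gauss(0, 1)         # time spent resisting the marshmallow
        outcome = boost + rng.gauss(0, 1)      # later "success"; wait is never used
        kids.append((well_off, wait, outcome))
    return kids


def correlations(kids):
    """Correlation across everyone, then separately within each background."""
    raw = pearson([w for _, w, _ in kids], [o for _, _, o in kids])
    within = {}
    for group in (True, False):
        sub = [(w, o) for g, w, o in kids if g == group]
        within[group] = pearson([w for w, _ in sub], [o for _, o in sub])
    return raw, within


raw, within = correlations(simulate())
print(f"all kids: {raw:.2f}")
print(f"well-off only: {within[True]:.2f}, not well-off only: {within[False]:.2f}")
```

Across all the kids, waiting and outcome look strongly linked. Split them by background and the link is gone, because it was never about the marshmallow.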
It’s been almost fifty years since this study was published. That’s five decades of science based on flawed work, five decades of science that has to be unwound and retried. The longer these mistakes last, the more expensive they are to fix. And like that HBR article above, many conclusions never get retracted.
One particular “technique” has helped trigger the crisis in science. Many a growth-hacking product manager has fallen into the same trap. They can only be rescued through discipline and rigor.
The how and why of P-hacking
Abusing data is a sure way to get bad results. Unlike startups, scientists rarely just make up their data. They make more subtle mistakes, like p-hacking. This probably sounds pretty cool, but it’s actually a common form of data misuse. Wikipedia describes it this way:
…performing many statistical tests on the data and only reporting those that come back with significant results.
It works like this:
A researcher comes up with an idea for a study. He collects a bunch of data, runs the experiment and… no dice. The idea didn’t pan out.
Hmm. “I have all this data. I can’t just throw it away.”
So he starts slicing the data looking for something that stands out. After a while, sure enough, he finds some correlation that is strong enough to stand up - usually its p-value is under 0.05, and thus it is considered statistically significant. He publishes this in a paper and looks like a genius. It gets big exposure in the press. Journalists love weird and surprising science. They can report on it without understanding it.
But no one can reproduce the work. The paper gets retracted. He gets uninvited from the big conferences. (Don’t worry. The newspapers never follow up and publish the retraction.)
What went wrong?
He left out one key piece: How he got the data.
Let’s say he thinks breastfed kids are healthier than bottle-fed kids. He sets up a study that tries to isolate just these variables, which means he wants his population to be reasonably homogeneous (similar quality of life, similar locations, etc). Put simply, the difference being researched should be the only material one in the population (unlike in the marshmallow experiment). He runs the study and, as before, finds nothing.
He could just toss the data. But, well, he’s already paid to collect it. He’s got all these graduate students who are working nearly for free. He might as well try something. So he puts a student or two on trying to find useful results.
They nearly always do, but… that success kills his work. All those controls to make it work for his original experiment fatally bias it for other studies.
Let’s say he discovers that the study participants who were bottle-fed tended to move around a lot more than people who were breastfed. He concludes, oh, wow, getting bottle-fed causes you to hate your parents and move away. (Yes, this is exactly the kind of headline that would get picked for a result like this.)
He has not proven that. All he has shown is that, in this particular - probably small, and certainly narrow - data set, that happens to be the case.
He should throw away all existing data. Start from scratch controlling for everything except this new variable under test. Only then can he look for correlations between how a baby was fed and mobility.
But he was too lazy or scared to do that. He found a match in that smaller, biased data set, and then published the results without admitting the problems in either his data or his methods. A few decades ago he would have gotten away with it: A big splashy result on publication, and then everyone just assuming this was true, with no attempt to reproduce and no real questioning of the result.
Today, no chance. Science has developed defenses against this kind of malpractice.
Researchers register with a central database that they are going to study the health of breastfed vs. bottle-fed babies. When they get results, they point to that registration and say, see, this is what led to my data collection.
If they then wanted to publish some other study, people would say, no, you didn’t pre-register this, which makes us suspect you’re p-hacking, so we’re going to do a deep dive on how you got your data. On second thought, we’re just going to reject your paper. Come back when the results hold on a clean dataset.
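If the slicing described above sounds too easy to work, a small simulation shows why it nearly always does. This is a sketch under stated assumptions: the data is pure noise, each “slice” is a random split of the participants, and the function names, sample sizes, and seed are all invented. It uses a simple large-sample z-test, not any particular researcher’s method.

```python
import random
from statistics import NormalDist, mean, variance


def familywise_error(n_tests, alpha=0.05):
    """Chance that at least one of n independent tests on noise comes up 'significant'."""
    return 1 - (1 - alpha) ** n_tests


def two_sided_p(a, b):
    """Large-sample z-test for a difference in means between two groups."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def count_false_positives(n_people=2000, n_slices=200, alpha=0.05, seed=1):
    """Slice pure noise many different ways and count the 'discoveries'."""
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_people)]  # no real effect anywhere
    hits = 0
    for _ in range(n_slices):
        # Each slice stands in for some attribute: moved a lot, left-handed, ...
        in_slice = [rng.random() < 0.5 for _ in range(n_people)]
        a = [y for y, s in zip(outcome, in_slice) if s]
        b = [y for y, s in zip(outcome, in_slice) if not s]
        if two_sided_p(a, b) < alpha:
            hits += 1
    return hits


print(f"20 looks at noise: {familywise_error(20):.0%} chance of a 'finding'")
print(f"'significant' slices out of 200: {count_false_positives()}")
```

With a 0.05 threshold, twenty looks at data with nothing in it give you roughly a 64% chance of at least one publishable result. Pre-registration works because it forces you to say which look you are going to take before you take it.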
From social science to startups
This might not initially seem to have anything to do with startups. Product managers and marketers aren’t commissioning studies - and they certainly aren’t controlling for variables!
Hmm. If you look at it a bit funny… Every data-backed marketing campaign and feature launch is an experiment.
Let’s build an analogous example.
A product manager builds a new feature, and because he’s growth hacking, he has lots of telemetry to tell him exactly how people are using it.
His theory is that people will use this new feature in some specific way. But he builds it, ships it, and observes, well, hmm, no, almost no one is using it. It’s a bust. I’m sure you’ve never worked on a project like this, but trust me, it happens.
Except… hey, there’s this small group that is using it, and widely. He looks into it more closely, and realizes they’re using it at 10x the rate people use the rest of the product. So he changes plans, and he rebuilds the feature around the specific thing those few people were doing with it.
Wait, what? No one uses that feature, either, and even worse, the people who originally used it aren’t any more, now that it’s focused on their actual usage!
What went wrong?
You got caught p-hacking
The data set from his failed feature is bad data. He got the most important result: This feature did not work well for his users. He wasn’t willing to let go of failed work. Just like the scientists, he went looking for some other way to reuse it. And instead of developing new hypotheses and running new experiments, he took his biased data and tried to find new correlations cheaply.
Unfortunately for him, he did.
But when he ships the new feature, he is faced with a harsh truth: Those few people who were using the feature in unexpected ways don’t look like the rest of his users. A new feature built for that purpose doesn’t help everyone else. And because he relied on data to make his decisions instead of talking to actual users, he learned too late that those unrepresentative users were doing something even weirder. His simplified feature actually removed that weirdness in the name of simplicity that everyone can use.
So now he’s two features in with nothing to show for it. So much for growth-hacking.
How do I fix it?
The solution is very similar to what science has done.
Connect your data to experiments. With discipline. You must get new, clean data for each new test. I know this is anathema to modern data-oriented product management. But it’s the only real way to trust your results.
That word discipline is key. You don’t need to build some international central registry. Whatever your mission statement says, you’re not really saving the world, and you’re not actually doing science. You’re just trying to build a product people love. What you need is rigorous internal practices, and to hold each other accountable so you can’t cheat at statistics.
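What might those internal practices look like? Here is one hypothetical sketch, not a recommendation of any real tool: a tiny in-house log where you write down the hypothesis and the metric before collecting data, and which refuses any analysis that wasn’t registered first. The class names, fields, and dates are all invented for illustration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Registration:
    hypothesis: str
    metric: str
    registered_on: date


@dataclass
class ExperimentLog:
    """Hypothetical internal registry: say what you will measure before you measure it."""
    registrations: dict = field(default_factory=dict)

    def register(self, name, hypothesis, metric, today):
        if name in self.registrations:
            raise ValueError(f"{name!r} is already registered; start a new experiment")
        self.registrations[name] = Registration(hypothesis, metric, today)

    def can_analyze(self, name, metric, data_collected_from):
        """Allow only the registered metric, on data gathered after registration."""
        reg = self.registrations.get(name)
        return (
            reg is not None
            and reg.metric == metric
            and data_collected_from >= reg.registered_on
        )


log = ExperimentLog()
log.register(
    "new-export-button",
    hypothesis="Users will export weekly reports",
    metric="weekly_exports_per_user",
    today=date(2024, 1, 10),
)

# The planned analysis on fresh data is fine.
print(log.can_analyze("new-export-button", "weekly_exports_per_user", date(2024, 1, 11)))
# Fishing in the same data for a different metric is not.
print(log.can_analyze("new-export-button", "power_user_sessions", date(2024, 1, 11)))
```

A shared spreadsheet would do the same job. The point is that the hypothesis gets written down before the data exists, and a new question gets a new entry and new data.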
Unfortunately, this requires you let go of one of Silicon Valley’s most cherished and wrong beliefs.
Experiments fail. This might be an important part of the process, but it’s not very valuable. Congratulations. Of all the possible ways you could fail, you’ve discovered one of them. Don’t let it go to your head.
Don’t work too hard to salvage that failure. You’re p-hacking, and just making it worse. Yes, obviously, you get personal lessons. You might be lucky enough to learn something that triggers your next experiment. But you have to go run that separately.
You can’t build on the detritus of failure.
So my data is now worthless?!
Of course not. I still rely on data for all kinds of problems. One of the great things about building a company today is how easily you can get information at scale.
But never let yourself forget that your data is heavily biased, especially by how it was collected. One of my favorite examples is from when YouTube dramatically reduced response time. Their average response times went up! Suddenly people with much worse connectivity found it worth using, making the average worse. The developers thought they were helping existing users, but the biggest impact was in creating new ones.
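The arithmetic behind that surprise is worth seeing once. The numbers below are invented, not YouTube’s: assume a thousand existing users whose response time gets cut in half, and five hundred new users on slow connections who only show up once the site becomes usable for them.

```python
def average_response_time(segments):
    """Weighted mean over (user_count, seconds_per_response) pairs."""
    total_users = sum(users for users, _ in segments)
    return sum(users * secs for users, secs in segments) / total_users


# All figures are made up for illustration.
before = [(1000, 2.0)]   # existing users on decent connections
after = [
    (1000, 1.0),         # the same users, now twice as fast
    (500, 20.0),         # new users on slow connections who used to give up
]

avg_before = average_response_time(before)
avg_after = average_response_time(after)
print(f"before: {avg_before:.2f}s, after: {avg_after:.2f}s")
```

Every existing user got faster, and the average still got worse. Nothing about the product degraded. The population being measured changed, which is exactly the kind of collection bias an average will never tell you about.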
You have to recognize your job isn’t to find some way to make the data valuable. Your job is to make high-quality decisions. Use data when you can. If you don’t have data, go get it.
But the job of the data is to inform you, not give you answers. Use it to hone your instinct, to improve your decision-making. When something doesn’t add up, go talk to the actual humans who are the source of the data. And even spend some time with people not represented in it.
If you’re working at a software startup, you’re not doing science (even if, like me, you have a science degree). But you should still take advantage of its discipline and practices.
Don’t stop at protecting yourself from p-hacking. One founder’s success might be hard to replicate for many reasons. Gain what lessons you can. But don’t blindly trust others’ stories of their work.
Because failure on your part won’t be paired with the retraction of a Nature paper; it’ll be an announcement of layoffs in TechCrunch.
Automation is not to blame for all the job destruction and wage stagnation. But you can still do great harm if you build it for the wrong reasons.
We’re told that automation is destroying jobs, that technology is replacing people, making them dumber, less capable. These are lies, with just enough truth to confuse us. You can have my robot washing machines when you pry them from my cold, wet hands.
I’m not some Pollyanna, thinking tech is only ever positive. Its potential for abuse and hurt is visible across the centuries, and especially so today. But I’m more optimistic about the upside than I am pessimistic about the down, and I’m uninterested in scaremongering screeds against it.
And yet. Technology and automation are not forces of nature. They’re made by people. By you. And the choices you make help to determine just how much good or bad they do. Even with the best of intentions, you might be doing great harm. And if you don’t have good intentions at all, or you don’t think ethics are part of your job, then you are probably downright dangerous.
I’m here to convince you that you have a role in deciding the future impact of the technology you build, and to provide you - especially you founders, tool builders, automators - some tactical advice on how to have the best impact, and avoid the dark timeline.
As I was building Puppet, explaining that I was developing automation for operations teams, execs and sales people would think they got it: “Oh, right, so you can fire SysAdmins!”
When prospective customers asked for this, I offered them a choice: You can keep the same service quality and cut costs, or you can keep the same cost, and increase service quality. For sysadmins, that meant shipping better software, more often.
Their response? “Wait, that’s an option?!” They only knew how to think about their jobs in terms of cost. I had to teach them to think about quality. This is what the whole DevOps movement is about, and the years of DevOps reports Puppet has published: Helping people understand what quality means, so they can stop focusing on cost.
And those few people who said they still wanted to reduce cost, not increase quality? I didn’t sell to them.
Not because they were wrong. There were real pressures on them to reduce costs, but I was only interested in helping people who wanted to make things better, not cheaper. My mission was completely at odds with their needs, so I was unwilling to build a product to help them fire their people.
This might have been stupid. There are good reasons why a CEO might naturally build what these people want. The hardest thing in the world to find for a new product is a motivated prospective customer who has spending authority, and here they are, asking for help. The signal is really clear:
You do a bunch of user interviews, they all tell the same story of needing to reduce cost, and in every case, budgets are shrinking and the major cost is labor. Great, I’ll build some automation, and it will increase productivity by X%, thus enabling a downsizing. The customer is happy, I get rich, and, ah, well, if you get fired you probably deserved it for not investing enough in your career. (I heard this last bit from a founder recently. Yay.)
This reasoning is common, but that does not make it right. (Or ethical.) And you’ll probably fail because of your bad decisions.
Let’s start with the fact that you have not done any user interviews. None.
The only users in this story are the ones you’re trying to fire. Executives aren’t users. Managers aren’t users. It seems like you should listen to them, because they have a lot of opinions, and they’re the ones writing checks, but nope.
This has a couple of consequences. First, you don’t understand the problem if you only talk to buyers, because they only see it at a distance. You have to talk to people on the ground who are doing the work. Be careful when talking to them, though, because you might start to empathize with them, which makes it harder to help fire them.
Even if you do manage to understand the problem, your product will still likely fail. As much as buyers center themselves in the story of adopting new technology, they’re largely irrelevant. Only the people at the front line really matter. I mean, it’s in the word: Users use the software. Someone, somewhere, has to say: Yes, I will use this thing you’ve built, every day, to do my job.
If you’ve only talked to buyers, you have built a buyer-centric product, rather than a user-centric one. Sure, maybe you got lucky and were able to build something pretty good while only talking to managers and disrespecting the workers so much that you think they’re worthless. But I doubt it. You’ll experience the classic enterprise problem of closing a deal but getting no adoption, and thus not getting that crucial renewal. Given that you usually don’t actually make money from a customer until the second or third year of the relationship… not so great.
Users aren’t stupid. Yes, I know we like to act like they are. But they aren’t. If your value promise is, “Adopt my software and 10% of your team is going to get fired,” people know. And they won’t use it, unless they really don’t have a choice. Some of that is selfish - no one wants to help team members get fired, and even if they’re safe today, they know they’re on the block for the next round of cuts. But it’s just as likely to be pragmatic. You’re so focused on downsizing the team that you never stopped to ask what they need. Why would someone adopt something that didn’t solve their problems?
What’s that you say? You ignored their problems because you were focused on the boss’s needs? This is why no one uses your software. Your disrespect resulted in a crappy product.
Call me a communist, but I think most people are skilled at their jobs. I am confident that I can find a learned skill in even the “low skill” labor. I absolutely know I can in most areas people are building software.
I was talking to a friend in a data science group in a software company recently, and he was noting how hard it was to sell their software. He said every prospective buyer had two experts in the basement who they could never seem to get past. So I asked him, are you trying to help those experts, or replace them?
He said, well, our software is so great, they aren’t really necessary any more.
There’s your problem. You’re promising to fire the only two people in the whole company who understand what you do. So I challenged him: What would your product, your company look like if you saw your job as making them do better work faster, rather than eliminating the need for them?
It’s a big shift. But it’s an important one. In his case, I think it’s necessary to reduce the friction in his sales process, and even more importantly, to keep those experts in house and making their employers smarter, rather than moving them on and losing years of experience and knowledge.
The stakes can get much bigger than downsizing. In his new book, Ruined By Design, Mike Monteiro has made it clear that designers and developers make ethical choices every day. Just because Uber’s and Instacart’s business models require that they mistreat and underpay workers doesn’t mean you need to help them. While I don’t think technology is at fault for most job losses, there absolutely are people out there who see the opportunity to make money by destroying industries.
This is not fundamentally different from the strip mining that happened to corporations in the 1980s, except back then they were making money by removing profit margin in companies and now they’re making money by removing “profit” margin in people’s lives. Jeff Bezos of Amazon has famously said your margin is his opportunity, and his warehouse workers’ experiences make clear that he thinks that’s as true of his employees as it is of his suppliers and competitors.
Just because they’re going to get rich ruining people’s lives doesn’t mean you have to help.
I think your job matters. I think software can and should have a hugely positive impact on the world; not that one project can by itself make the world better, but that every person could have their life improved by the right product or service.
But that will only happen if we truthfully, honestly try to help our users.
When, instead, we focus too much on margin, on disruption, on buyers, on business problems… we become the problem.