On Launching versus Landing

If Medium puts this content behind a paywall, you can also view it here (LinkedIn).

Carlos Arguelles
11 min read · Dec 28, 2024

If you put code in production and no one uses it, does it still have an impact?

Back at Google, during a 1:1 with one of my engineers, they mentioned that they had launched a feature into production a couple of months prior. I asked if the feature had been successful. This engineer gave me a confused look and started describing the technical complexity. That was great, but I wanted to know if it had made a difference to a real human being. Turns out that only two googlers were (sparingly) using it. It was a great feature; it just hadn't been properly landed.

There is a huge difference between Launching something and Landing something. As an engineer, the work you do to launch is “fun” and “shiny” — you design a feature and write code, which is what got you into software engineering to begin with. Every code review provides a tiny little hit of dopamine and is a concrete artifact to show others how intelligent you are. The work you do to land is boring and often non-technical. A lot of it has no immediate reward. You don’t control 100% of your destiny. It’s fuzzy and dissatisfying.

But without Landing, your Launch is useless. In the software industry, the old “If a tree falls in a forest and no one is around to hear it, does it make a sound?” translates to “If you put code in production and no one uses it, does it still have an impact?”

In fact, I would argue it has a negative impact. You have increased the complexity and surface area of your product and created additional operational load, for no value.

After that wake-up call, I started looking closer at all our launches. The majority lacked the most basic discipline around landing. People put code in production and happily moved on to the next intellectual challenge that would sound good on their performance review.

Determined to drive some cultural change, I put together some slides on “Launching v. Landing” and gave a presentation to the entire org, recorded it, and shared it broadly. A couple of hundred people watched it, which I thought was positive.

But there was no behavioral change whatsoever.

I later understood why. I was new to the team (and fairly new to Google), and I had not Earned Trust from the engineers yet. I was just a random dude who had shown up one day as a Senior Staff Engineer without any artifacts. I had written 100k+ lines of production code at Amazon, launched a dozen products used by tens of thousands of amazonians every day, and spoken at dozens of internal conferences, but at Google I was a Nobody.

I needed credibility to inspire behavioral change. And I needed technical artifacts to get credibility. I decided to embed myself in my teams as a part-time engineer. This would help me gain a deeper understanding of the nuances of the products I was leading, and earn trust from my engineers.

I started looking for code I could write that spanned multiple teams, and found a perfect little problem. It was not particularly complex, but it could potentially help a lot of customers, nobody was doing anything about it, and solving it required changing code in three different codebases and convincing multiple teams.

At the time, I was a Tech Lead for company-wide Infrastructure for Integration Testing at Google (I’ve written about this here and here). My product launched ephemeral, hermetic test environments for Google’s CI/CD system (we called them “SUT” for “System Under Test”). Launching these SUTs could take an extremely long time, but where that time was actually being spent was a black box to frustrated customers that wanted to improve it.

Here is an example of a test run that was part of CI/CD, some text redacted. In this case, launching the SUT took almost 28 minutes, and it became the long pole for the entire release process. But you couldn’t tell why it took those 28 minutes.

Here’s an example of the more granular metrics that I envisioned we could expose to our customers, zooming in on what is happening during those 28 minutes. When you stand up an SUT, you first need to build a bunch of components, then you often create a database (Spanner) to hold whatever data you want your system to return, then you need to data-seed that database, and finally you can start your server(s).
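To make those phases concrete, here is a minimal sketch (in TypeScript) of the kind of per-component, per-phase timing event the launcher could emit. The event shape, phase names, and helper function below are my own illustrative assumptions, not the actual internal schema.

```typescript
// Hypothetical event shape: one event per component, per phase of the SUT launch.
interface SutTimingEvent {
  component: string;                                    // e.g. "frontend-server"
  phase: "build" | "create-db" | "seed-data" | "start";
  startMs: number;                                      // when the phase began (epoch millis)
  durationMs: number;                                   // how long the phase took
}

// Wrap each phase of the launch so it emits exactly one timing event,
// whether the phase succeeds or fails.
async function timed<T>(
  component: string,
  phase: SutTimingEvent["phase"],
  emit: (e: SutTimingEvent) => void,
  work: () => Promise<T>,
): Promise<T> {
  const startMs = Date.now();
  try {
    return await work();
  } finally {
    emit({ component, phase, startMs, durationMs: Date.now() - startMs });
  }
}
```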

In a lot of cases, you could just visually spot the component that was the long pole. In the example below, it’s easy to see that one component takes 2x as long to build as the others, and another component takes 2x as long to start. Now you know what to go optimize.

Also, if I emitted these metrics, not only could I surface them per customer run in the UI, but I could also aggregate them at scale in our own operational dashboards, so my team could understand our boot latency percentiles for the millions of SUTs we were starting every day, identify regressions, and quantify improvements. Here’s an example of our aggregated metrics over a period of a couple of weeks.
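On the aggregation side, here is a minimal sketch of rolling raw boot durations up into percentiles. The function names are illustrative assumptions; the real dashboards aggregated millions of runs with far more machinery behind them.

```typescript
// Roll raw SUT boot durations (in milliseconds) up into latency percentiles.
// Nearest-rank percentile; a simplified stand-in for the real aggregation pipeline.
function percentile(sortedMs: number[], p: number): number {
  if (sortedMs.length === 0) return NaN;
  const rank = Math.ceil((p / 100) * sortedMs.length);
  return sortedMs[Math.min(sortedMs.length - 1, Math.max(0, rank - 1))];
}

function bootLatencySummary(bootDurationsMs: number[]) {
  const sorted = [...bootDurationsMs].sort((a, b) => a - b);
  return {
    p50: percentile(sorted, 50),
    p90: percentile(sorted, 90),
    p99: percentile(sorted, 99),
  };
}
```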

So I talked to the teams, got buy-in, did some of the work myself, and convinced others to do the rest. The last step, surfacing the metrics on the UI, turned into an intern project. And sure enough, once the code was pushed to prod and the internship was over, I noticed the team called the thing “done” and moved on to the next thing. It was the same behavioral pattern I had seen before.

Except that, of course, we were NOT “done.” It was in production but nobody was using it. Here was an opportunity for me to lead by example. I came up with a list of things-to-do to guide my process.

1. Was the feature discoverable?

A new feature must be easily discoverable.

I put on my “new customer” hat and navigated the UI. Once you clicked on the SUT, an “Operations History” tab would open up, and you were greeted with a Wall-Of-Text containing the massive command lines that had been executed. Underneath each, a tiny little tab called “Event Timings” was collapsed and easy to miss.

Very few people would want to scrutinize a 25-line command line, so I collapsed it so that the event timings tab was easier to see. I had the theory that more people would want to see event timings than they would the command line — and I actually proved that theory by emitting counters for how many times our customers expanded each one.

This trivial little change (about ten lines of code) increased customer adoption of event timings by 10x. You don’t have to do a lot of hard work to make something more discoverable; you just need to put on your customer hat and view things from their perspective. Most products have massive amounts of low-hanging fruit.

2. Was the feature usable?

I thought our UI was good for a first launch, but again there was a lot of low-hanging fruit. A few brave customers tried it and gave feedback like this:

I was excited to see this feedback. This individual was motivated enough to use my little feature and cared enough to take the time to write down feedback. So I pinged him and we chatted 1:1 for a bit. Always, always take the chance to attentively listen to your early customers. Their feedback and perspective are invaluable.

Based on my conversations with him and a few others, I made a bunch of little improvements. Each one of them was a trivial code change, but it improved the usability of the product significantly.

For example, the original bars weren’t labeled, so it was hard to see what was what; I added labels. It took me 5 minutes and it made my customers’ lives better.

Some of the bigger SUTs had hundreds of components, which made the Event Timings pane pages long, and customers were scrolling up and down trying to figure out the timing of the events for a particular component. So I grouped all the events for a specific component in the same row, which made the graph a lot more compact and made it very easy to compare timing for specific event types across components.
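Conceptually, the change was just a group-by. Here is a minimal sketch, with hypothetical types and names, of collapsing the per-event rows into one row per component:

```typescript
// Hypothetical event shape, matching the earlier sketch.
interface TimingEvent {
  component: string;
  phase: string;
  durationMs: number;
}

// Put all events for one component onto a single row, instead of one row per
// event, which made big SUTs with hundreds of components scroll for pages.
function groupByComponent(events: TimingEvent[]): Map<string, TimingEvent[]> {
  const rows = new Map<string, TimingEvent[]>();
  for (const e of events) {
    const row = rows.get(e.component) ?? [];
    row.push(e);
    rows.set(e.component, row);
  }
  return rows;
}
```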

Again: none of these things was particularly difficult to do from a coding perspective. Most changes were ten lines of code here or there. But they collectively improved the usability of the product by a lot. To me, as an engineer, it’s polish and attention to detail.

For example, there was a field “Duration” which just had a double, e.g. “101.12639.” Putting on my customer hat: what unit is this? Probably seconds? Why do I need so much precision in the seconds for something that generally takes minutes? So I changed it to two decimal places followed by a human-readable version, e.g. “101.13 seconds (1m41s)”
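Here is a minimal sketch of that formatting change, assuming the raw value is a duration in seconds:

```typescript
// Turn a raw duration in seconds, e.g. 101.12639, into "101.13 seconds (1m41s)".
function formatDuration(rawSeconds: number): string {
  const rounded = rawSeconds.toFixed(2);          // two decimal places
  const totalSeconds = Math.round(rawSeconds);
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  const readable = minutes > 0 ? `${minutes}m${seconds}s` : `${seconds}s`;
  return `${rounded} seconds (${readable})`;
}

// formatDuration(101.12639) -> "101.13 seconds (1m41s)"
```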

3. Was the feature useful?

Usable and useful are two different things. I monitored tickets, emails, and forums to see if anybody mentioned the feature. I had dashboards with metrics (more on this later), but I wanted to augment those with subjective anecdotes from actual customers to understand whether this thing had “landed” or not.

Getting emails like these was encouraging:

Or this one:

The takeaway here is to look at both hard data and anecdotes. Both should tell a consistent story.

4. Did we have good documentation?

This was another pet peeve of mine. A lot of our features were either undocumented or poorly documented.

I needed to lead by example. I created a page in our documentation to explain not only how to use the Event Timings pane, but also some best practices and real-world examples of how you could use it to find components to optimize.

I also like to be able to find the corresponding documentation while I’m using the product, so I added a little “For tips on how to understand this pane, go <here>” link in the Event Timings pane itself.

Lastly, our documentation had to be discoverable, so I made sure it was the first hit when you did common searches like “slow SUT” or “optimize SUT”, etc. After all, Google is a search company, so there were ways to tweak the internal search to bubble up specific URLs for specific terms.

5. Did I have usage metrics?

You should never launch a feature without having at least a scrappy way to understand who is using it.

There were all kinds of sophisticated metrics I envisioned, but I didn’t want perfection to be the enemy of good. For my initial landing, the UI emitted a counter that tracked how often the Event Timings pane was displayed. Since it was collapsed by default, I figured a customer deliberately clicking on the little triangle that expanded the pane was a good signal that they had noticed it and were at least curious enough to click on it. I added another counter to track whether a customer had hovered inside the expanded pane and interacted with the bars. Roughly, the “Display” metric gave me an upper bound and the “Hover” metric gave me a lower bound. It wasn’t perfect, but it was a decent initial signal.

If the feature took off and was used by thousands of googlers, I could invest in more sophisticated metrics, but until then, I needed to be pragmatic and scrappy. I saw between 150 and 200 unique people interacting with it every day, which was a decent landing for a little feature. This is a 3-week period:
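If you’re wondering what that scrappy instrumentation might look like, here is a minimal sketch. The counter API, metric names, and handler names are illustrative assumptions, not an actual internal monitoring library.

```typescript
// Two scrappy adoption counters for the Event Timings pane.
// The counter API and metric names are placeholders for illustration.
declare function incrementCounter(name: string, labels?: Record<string, string>): void;

// Upper bound: the customer noticed the collapsed pane and expanded it.
function onEventTimingsExpanded(userId: string): void {
  incrementCounter("event_timings/displayed", { user: userId });
}

// Lower bound: the customer actually interacted with the bars inside the pane.
function onEventTimingsHovered(userId: string): void {
  incrementCounter("event_timings/hovered", { user: userId });
}
```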

I also tracked metrics on how many people went to the documentation page I had created — it was getting about 100 reads per day.

6. Had I socialized this sufficiently?

I had been working on improving our customer-facing announcements (I wrote about this here: “Injecting Customer Obsession into a foreign culture”). To land my feature, I needed to socialize it, so I leveraged that.

The first paragraph of an announcement is critical. People are very busy, and if you don’t get their attention they’ll move on to their next email. You have to clearly articulate the exact customer pain point that your feature is addressing. Why should your customers keep reading?

My email announcement had a little tracker too, to see how many people actually opened and read the email. This told me that 1,400 googlers read my announcement that first week. Just because somebody read my email didn’t mean they wanted to use the feature, but it did tell me they cared enough to at least open the email, so it was another imperfect yet useful metric.

Lastly, my email had a special link to the documentation with another counter, so I could see how many people not only opened the email but actually clicked on the “Learn more” link — this was a higher-fidelity signal than the first one.
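If you want to replicate that trick, the mechanics are simple: route the “Learn more” link through a tiny redirect that bumps a counter before sending the reader on to the docs. This sketch uses a plain Node HTTP server with placeholder URLs just to show the idea; it’s not the internal tooling.

```typescript
// Count clicks on the announcement's "Learn more" link by routing it through
// a small redirect endpoint. URLs, port, and counter are placeholders.
import * as http from "http";

const DOCS_URL = "https://example.com/docs/event-timings"; // placeholder
let learnMoreClicks = 0;

http
  .createServer((req, res) => {
    if (req.url === "/announcement/learn-more") {
      learnMoreClicks += 1;                        // higher-fidelity signal than email opens
      res.writeHead(302, { Location: DOCS_URL });  // send the reader on to the docs
      res.end();
    } else {
      res.writeHead(404);
      res.end();
    }
  })
  .listen(8080);
```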

Conclusion

Always be mindful that launching is not the end; it’s just the beginning. To land your feature, make sure it is discoverable, usable, and useful; that you have documentation and metrics; that you socialize it properly; and that you value and learn from your early customers.
