by Marcelo Cortes, 10/25/2022
Engineering is typically the function that grows fastest at a scaling startup. It requires a lot of attention to make sure the pace of execution does not slow and cultural issues do not emerge as you scale.
We’ve learned a lot about pace of execution in the past five years at Faire. When we launched in 2017, we were a team of five engineers. From the beginning, we built a simple but solid foundation that allowed us to maintain both velocity and quality. When we found product-market fit later that year and started bringing on lots of new customers, instead of spending engineering resources on re-architecting our platform to scale, we were able to double down on product engineering to accelerate the growth. In this post, we discuss the guiding principles that allowed us to maintain our engineering velocity as we scaled.
Faire’s engineering team grew from five to over 100 engineers in three years. Throughout this growth, we were able to sustain our pace of engineering execution by adhering to four important elements:
You want to hire the best early team that you can, as they’re going to be the people helping you scale and maintain velocity. And good people follow good people, helping you grow your team down the road.
This sounds obvious, but it’s tempting to get people in seats fast because you have a truckload of priorities and you’re often the only one doing engineering recruiting in those early years. What makes this even harder is you often have to play the long game to get the best engineers signed on. Your job is to build a case for why your company is the opportunity for them.
We had a few amazing engineers in mind we wanted to hire early on. I spent over a year doing coffee meetings with some of them. I used these meetings to get advice, but more importantly I was always giving them updates on our progress, vision, fundraising, and product releases. That created FOMO which eventually got them so excited about what was happening at Faire that they signed up for the ride.
While recruiting, I looked for key competencies that I thought were vital for our engineering team to be successful as we scaled. These were:
In the early stages, you need to move extremely fast and you cannot afford to make mistakes. We wanted the best engineers who had previously built the components we needed, so they knew where mistakes could happen, what to avoid, what to focus on, and more. For example, we built a complex payments infrastructure in a couple of weeks. That included integrating with multiple payment processors in order to charge debit/credit cards, process partial refunds, handle async retries, void canceled transactions, and link bank accounts for ACH payouts. We had built similar infrastructure for the Cash App at Square, and that experience allowed us to move extremely quickly while avoiding pitfalls.
Faire’s mission is to empower entrepreneurs to chase their dreams. When hiring engineers, we looked for people who were amazing technically but also understood our business, were customer focused, were passionate about entrepreneurship—and understood how they needed to work. That is, they understood how to use technology to add value to customers and product, quickly and with quality. To test for this, I would ask questions like: “Give me examples of how you or your team impacted the business.” Their answers would show how well they understood their current company’s business and how engineering can impact customers and change a company’s top-line numbers.
I also learned a lot when I let them ask questions about Faire. I love when engineering candidates ask questions about how our business works, how we make money, what our market size is, etc. If they don't ask these kinds of questions, I ask them things like: “Do you understand how Faire works?” “Why is Faire good for retailers?” “How would you sell Faire to a brand?” After asking questions like these a few times, you’ll see patterns and be able to quickly identify engineers who are business-minded and customer-focused.
Another benefit of hiring customer-focused engineers is that it’s much easier to shut down projects, start new ones, and move people around, because everyone is focused on delivering value for the customer and not wedded to the products they helped build. During COVID, our customers saw enormous change, with in-person trade shows getting canceled and lockdowns impacting in-person foot traffic. We had to adapt quickly, which required us to stop certain initiatives and move our product and engineering teams to launch new ones, such as our own version of online trade shows.
When we first started, we couldn’t afford to build the most beautiful piece of engineering work. We had to be fast and agile. This is critical when you are pre-product-market fit. Our CEO Max and a few early employees would go to trade shows to present our product to customers, understand their needs, and learn what resonated with them. Max would call us with new ideas several times a day. It was paramount that our engineers were gritty and able to quickly make changes to the product. Over the three or four days of a trade show, our team deployed changes nonstop to the platform. We experimented with offerings like:
By trying different value propositions in a short time, our engineering team helped us figure out what was most valuable to our customers. That was how we found strong product-market fit within six months of starting the company.
Our trade show storefront back when we were called Indigo Fair.
The number one impediment to engineering velocity at scale is the lack of a solid, consistent foundation. A simple but solid foundation will allow your team to keep building on top of it instead of having to throw away or re-architect your base when hypergrowth starts.
To create a solid long-term foundation, you first need to get clear on what practices you believe are important for your engineering team to scale. For example, I remember speaking with senior engineers at other startups who were surprised we were writing tests and doing code reviews and that we had a code style guide from the very early days. But we couldn’t have operated well without these processes. When we started to grow fast and add lots of engineers, we were able to keep over 95% of the team focused on building features and adding value to our customers, increasing our growth.
Once you know what long-term foundations you want to build, you need to write them down. We were intentional about this from day one and documented them in our engineering handbook. Today, every engineer is onboarded using this handbook.
The four foundational elements we decided on were:
The most important thing is to build your data muscle early. We started doing this at 10 customers. At the time, the data wasn’t particularly useful; the more important thing was to start collecting it. At some point, you’ll need data to drive product decision-making. The longer you wait, the harder it is to embed that habit in your team.
Here’s what I recommend you start doing as early as possible:
When choosing a language and database, pick the ones you know best that will also scale long-term. If you choose a language you don’t know well because it seems easier or faster to get started, you won’t foresee pitfalls and you’ll have to learn as you go. This is expensive and time-consuming. We started with Java as our backend programming language and MySQL as our relational database. In the early days, we were building two to three features per week, and it took us a couple of weeks to build the framework we needed around MySQL. That was a big tradeoff at the time, but it paid dividends later on.
Many startups think they can move faster by not writing tests; it’s the opposite. Tests help you avoid bugs and prevent legacy code at scale. They aren’t just validating the code you are writing now. They should be used to enforce, validate, and document requirements. Good tests protect your code from future changes as your codebase grows and features are added or changed. They also catch problems early and help avoid production bugs, saving you time and money. Code without tests becomes legacy very fast. Within months after untested code is written, no one will remember the exact requirements, edge cases, constraints, etc. If you don’t have tests to enforce these things, new engineers will be afraid of changing the code in case they break something or change an expected behavior.
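To make the idea of tests as documentation concrete, here is a minimal sketch in Python. The refund rule, the `apply_partial_refund` function, and the `RefundError` type are all hypothetical and invented for illustration; they are not Faire's code. The point is only that each test name states a requirement a future engineer can read and rely on.

```python
from decimal import Decimal


class RefundError(ValueError):
    """Raised when a refund request violates a business rule (hypothetical)."""


def apply_partial_refund(charged: Decimal, already_refunded: Decimal, amount: Decimal) -> Decimal:
    """Return the new refunded total after refunding `amount` on a charge."""
    if amount <= 0:
        raise RefundError("refund amount must be positive")
    if already_refunded + amount > charged:
        raise RefundError("refunds cannot exceed the original charge")
    return already_refunded + amount


# Each test name states one requirement, so the suite doubles as documentation.
def test_partial_refunds_accumulate():
    total = apply_partial_refund(Decimal("100.00"), Decimal("30.00"), Decimal("20.00"))
    assert total == Decimal("50.00")


def test_refunds_cannot_exceed_original_charge():
    try:
        apply_partial_refund(Decimal("100.00"), Decimal("90.00"), Decimal("20.00"))
    except RefundError:
        return
    raise AssertionError("expected RefundError")


def test_refund_amount_must_be_positive():
    try:
        apply_partial_refund(Decimal("100.00"), Decimal("0.00"), Decimal("0.00"))
    except RefundError:
        return
    raise AssertionError("expected RefundError")
```

If someone later loosens the over-refund check, the second test fails and points directly at the requirement that was broken.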
There are two reasons why tests break when a developer is making code changes:
Every language has tools to measure and keep track of test coverage. I highly recommend introducing them early to track how much of your code is protected by tests. You don’t need to have 100% code coverage, but you should make sure that critical paths, important logic, edge cases, etc. are well tested. Here are tips for writing good tests.
We started doing code reviews when we hired our first engineer. Having another engineer review your code changes helps ensure quality, prevents mistakes, and shares good patterns. In other words, it’s a great learning tool for new and experienced engineers. Through code reviews, you are teaching your engineers patterns: what to avoid, why to do something, the features of languages you should and shouldn’t use.
Along with this, you should have a coding style guide. Coding guides help enforce consistency and quality on your engineering team. It doesn’t have to be complex. We use a tool that formats our code so our style guide is automatically enforced before a change can be merged. This leads to higher code quality, especially when teams are collaborating and other people are reviewing code.
We switched from Java to Kotlin in 2019 and we have a comprehensive style guide that includes recommendations and rules for programming in Kotlin. For anything not explicitly specified in our guide, we ask that engineers follow JetBrains’ coding conventions.
These are the code review best practices we share internally:
Tracking metrics is imperative to maintaining engineering velocity. Without clear metrics, Faire would be in the dark about how our team is performing and where we should focus our efforts. We would have to rely on intuition and assumptions to guide what we should be prioritizing.
Examples of metrics we started tracking early (at around 20 engineers) included:
This is a dashboard we created in the early days of Faire to track important engineering metrics. It was updated manually by collecting data from different sources. Today, we have more comprehensive dashboards that are fully automated.
Once our engineering team grew to 100+, our top-level metrics became more difficult to take action against. When metrics trended beyond concerning thresholds, we didn’t have a clear way to address them. Each team was busy with their own product roadmap, and it didn’t seem worthwhile to spin up new teams to address temporary needs. Additionally, many of the problems were large in scale and would have required a dedicated group of engineers.
We found that the best solution was to build dimensions so that we could view metrics by team. Once we had metrics cut by team, we could set top-down expectations and priorities. We were happy to see that individual teams did a great job of taking ownership of and improving their metrics and, consequently, the company’s top-level metrics.
Coming out of our virtual trade show, Faire Summer Market, we knew we needed significant investment in our database utilization. During the event, site usage pushed our database capacity to its limits and we realized we wouldn’t be able to handle similar events in the future.
In response, we created a metric of how long transactions were open every time our application interacted with the database. Each transaction was attributed to a specific team. We then had a visualization of the hottest areas of our application along with the teams responsible for those areas. We asked each team to set a goal during our planning process to reduce their database usage by 20% over a three-month period. The aggregate results were staggering. Six months later, before our next event—Faire Winter Market—incoming traffic was 1.6x higher, but we were nowhere close to maxing out our database capacity. Now, each team is responsible for monitoring their database utilization and ensuring it doesn’t trend in the wrong direction.
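A metric like this can be sketched as a timer wrapped around each transaction, with the elapsed time charged to the owning team. The version below is an illustrative Python sketch, not the implementation described above: `tracked_transaction`, `open_seconds_by_team`, and `hottest_teams` are hypothetical names, and the wrapped block stands in for real database work.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Total seconds that transactions were held open, keyed by owning team.
open_seconds_by_team = defaultdict(float)


@contextmanager
def tracked_transaction(team, clock=time.monotonic):
    """Time a block standing in for a database transaction and charge it to `team`."""
    start = clock()
    try:
        yield
    finally:
        # Record the time even if the block raises: the transaction was still open.
        open_seconds_by_team[team] += clock() - start


def hottest_teams(totals):
    """Return (team, seconds) pairs, largest total first."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Wrapping a unit of work as `with tracked_transaction("checkout"): ...` adds its open time to that team's total, and `hottest_teams(open_seconds_by_team)` gives the ranked view a dashboard could plot. The `clock` parameter exists only so the timer can be replaced in tests.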
We’re moving towards a model where each team maintains a set of key performance indicators (KPIs) that get published as a scorecard reflecting how successful the team is at maintaining its product areas and the parts of the tech stack it owns.
We’re starting with a top-level scorecard for the whole engineering team that tracks our highest-level KPIs (e.g., Apdex, database utilization, CI wait time, severe bug escapes, flaky tests). Each team maintains a scorecard with its assigned top-level KPIs as well as domain-specific KPIs. As teams grow and split into sub-teams, the scorecards follow the same path recursively. Engineering leaders managing multiple teams use these scorecards to gauge the relative success of their teams and to better understand where they should be focusing their own time.
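For readers unfamiliar with Apdex, the first KPI named above: under its standard definition, a response counts as satisfied if it completes within a target threshold T, tolerating if it takes between T and 4T, and frustrated otherwise. The score is (satisfied + tolerating / 2) / total. A minimal Python sketch follows; the threshold you pick is your own choice, and the value used in the usage note is only an example.

```python
def apdex(response_times, threshold):
    """Apdex score: (satisfied + tolerating / 2) / total samples."""
    if not response_times:
        raise ValueError("need at least one sample")
    # Satisfied: at or under the threshold T.
    satisfied = sum(1 for t in response_times if t <= threshold)
    # Tolerating: over T but at or under 4T. Anything slower is frustrated.
    tolerating = sum(1 for t in response_times if threshold < t <= 4 * threshold)
    return (satisfied + tolerating / 2) / len(response_times)
```

With a 0.5-second threshold, the samples `[0.1, 0.2, 0.6, 3.0]` contain two satisfied responses, one tolerating, and one frustrated, so `apdex(...)` returns `0.625`.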
Scorecard generation should be as automated and as simple as possible so that it becomes a regular practice. If your process requires a lot of manual effort, you’re likely going to have trouble committing to it on a regular cadence. Many of our metrics start in DataDog; we use their API to extract relevant metrics and push them into Redshift and then visualize them in Mode reports.
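One way to keep scorecards cheap to produce is to derive every status from data and roll sub-team scorecards up into their parent. The sketch below is a hypothetical Python model, not our actual tooling: the `Kpi` and `Scorecard` classes, the thresholds, and the assumption that every KPI is "lower is better" are all illustrative simplifications.

```python
from dataclasses import dataclass, field

SEVERITY = {"green": 0, "yellow": 1, "red": 2}


@dataclass
class Kpi:
    name: str
    value: float
    yellow_at: float  # value at which the KPI turns yellow
    red_at: float     # value at which the KPI turns red

    def status(self):
        # Assumes lower is better (e.g. CI wait time, flaky test count).
        if self.value >= self.red_at:
            return "red"
        if self.value >= self.yellow_at:
            return "yellow"
        return "green"


@dataclass
class Scorecard:
    team: str
    kpis: list = field(default_factory=list)
    sub_teams: list = field(default_factory=list)

    def status(self):
        """Worst status among this team's KPIs and all sub-team scorecards."""
        statuses = [k.status() for k in self.kpis]
        statuses += [s.status() for s in self.sub_teams]
        return max(statuses, key=SEVERITY.__getitem__, default="green")

    def render(self, indent=0):
        """Return the scorecard as indented text lines, sub-teams nested."""
        pad = " " * indent
        lines = [f"{pad}{self.team}: {self.status()}"]
        for k in self.kpis:
            lines.append(f"{pad}  {k.name} = {k.value} ({k.status()})")
        for s in self.sub_teams:
            lines.extend(s.render(indent + 2))
        return lines
```

Because a parent's status is the worst of its own KPIs and its children's, a leader managing several teams can read one line and then drill down only where something is yellow or red.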
As we’ve rolled this process out, we’ve identified criteria for what makes a great engineering KPI:
We plan to keep investing in this area as we grow. KPIs allow us to work and build with confidence, knowing that we’re focusing on the right problems to continue serving our customers.
When we were a company of 25 employees, we had a single engineering team. Eventually, we split into two teams in order to prioritize multiple areas simultaneously and ship faster. When you split into multiple teams, things can break because people lose context. To navigate this, we developed a pod structure to ensure that every team was able to operate independently but with all the context and resources they needed.
When you first create a pod structure, here are some rules of thumb:
(1) Standardized tooling/processes across the engineering team and balanced leadership between functions
(2) Standardized career frameworks and performance calibration. We give our managers guidance and tools to make sure this is happening. For example, I have a spreadsheet for every manager that I expect them to update on a monthly basis with a scorecard and brief summary of their direct reports’ performance.
Our engineering priorities change often. We need to be able to move engineers around and create, merge, split, or sunset pods. In order to keep track of who is on which team—taking into account where that person is located, their skill set, tenure at the company, and more—we built a tool called Census.
Census is a real-time visualization of our team’s structure. It automatically updates with data from our ATS and HR system. The visual aspect is crucial and makes it easier for leadership to make decisions around resource allocation and pod changes as priorities shift. Alongside Census, we also built an algorithm to evaluate the “horsepower” of a pod. If horsepower is showing up as yellow or red, that pod either needs more senior engineers, has a disproportionate number of new employees, or both.
Pods are colored either green, yellow, or red depending on their horsepower.
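The horsepower algorithm itself is not described here, so the following Python sketch is purely an assumption about how such a score could work. It uses the two signals mentioned above (share of senior engineers and share of new employees); the `Engineer` type, the thresholds, and the rule that one problem means yellow and both mean red are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Engineer:
    name: str
    is_senior: bool
    months_at_company: int


def pod_horsepower(pod, min_senior_ratio=0.3, max_new_ratio=0.5, new_hire_months=6):
    """Return 'green', 'yellow', or 'red' for a pod (a list of Engineer).

    All thresholds are illustrative assumptions, not real calibration.
    """
    if not pod:
        return "red"  # an empty pod cannot deliver anything
    senior_ratio = sum(e.is_senior for e in pod) / len(pod)
    new_ratio = sum(e.months_at_company < new_hire_months for e in pod) / len(pod)
    too_few_seniors = senior_ratio < min_senior_ratio
    too_many_new = new_ratio > max_new_ratio
    if too_few_seniors and too_many_new:
        return "red"
    if too_few_seniors or too_many_new:
        return "yellow"
    return "green"
```

A pod with two seniors out of four and one recent hire would come out green under these thresholds, while a pod made up entirely of recently hired non-senior engineers would come out red.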
One of the most common questions that founders have is how to balance speed with everything else: product quality, architecture debt, team culture. Too often, startups stall out and sacrifice their early momentum in order to correct technical debt. In building Faire, we set out to both establish a unified foundation and continue shipping fast. These four guiding principles are how we did it, and I hope they help others do the same.
Marcelo Cortes is a co-founder and the CTO of Faire, an online wholesale marketplace connecting mostly small brands to independent, local retailers.