This article is written by Mengying Li, who leads the growth data science team at Notion and was previously a data science manager at Facebook and a data scientist at Microsoft. She also advises early-stage companies on how to build their data science team, choose third-party data tools and develop early data science prototypes.
There are plenty of early founders who are keen to add data scientists to their teams. These days, even early-stage startups are generating massive amounts of data, and it’s appealing to bring someone aboard to help parse those individual data points and turn them into actionable directives that shape the product or core business functions.
But when you ask these eager founders how exactly data science can make their business better, most of the time you get back some vague, hand-wavy answers like, “So we can understand the user better, which will help us grow the business.”
As an early-stage startup advisor, I’ve had plenty of these conversations. I’ve found that the push to build out a data science function is often driven by FOMO (fear of missing out): founders are afraid of falling behind competitors who are more “data-driven.” But without thoroughly understanding the data science needs of your particular business, you risk hiring someone who is not well-equipped to tackle your company’s unique goals and challenges. That mismatch sets you back, rather than springing you forward.
At a startup where time and resources are scarce, diving into data science too quickly can distract from more pressing challenges facing the business on the path to scale.
The following guide is a primer for data science at startups — and whether you should invest in this key hire. We’ll cover whether your business is well-equipped to build out the function now, or if you’re better off outsourcing in the early days. We’ll also explore the different types of data scientists, and which are the best fit for your particular business needs. Let’s dive in.
ARE YOU READY TO HIRE YOUR FIRST DATA SCIENTIST?
To gauge whether it’s the right time to bring your first data scientist onto your team, there are a few key points for reflection: First, has your company reached a point where you have enough data to generate quality insights? And second, do you have the right tools and support to realize the ROI from a data scientist’s insights?
Do you have enough data to begin with?
Very rarely should any company include data science in their first few hires (unless they are building a data science product, such as an experimentation platform or an analytics consulting company). Why? In the early days, when you only have an MVP and a few users, you likely won’t have enough data to analyze. Most of the time, people can use interviews, direct customer feedback or even social media to gather the next iteration of feedback and product ideas.
But the key question is: when do you have enough data to make that first hire? It depends on what your company does. B2C companies relying on ads or subscription revenue might want to bring on a data scientist early on, while B2B companies with very few customers in the early days can afford to wait a bit. One rule of thumb is to start thinking about a hire once you have sustained 1,000 monthly users for at least six months. This establishes a stable user base with enough data for a data scientist to parse usage patterns and identify trends. Another good benchmark is headcount: when your company passes the 50-employee mark and more specific functions (like finance) are being built out, it’s likely time to consider a full-time data scientist.
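The two benchmarks above can be written down as a quick check. This is only a minimal sketch: the function names are hypothetical, the user counts are made up, and reading "at least six months" as "the six most recent months in a row" is an assumption rather than something the rule of thumb spells out.

```python
from typing import Sequence

MIN_MONTHLY_USERS = 1_000  # user threshold from the rule of thumb
MIN_STABLE_MONTHS = 6      # "at least six months"
MIN_EMPLOYEES = 50         # headcount benchmark


def has_stable_user_base(monthly_users: Sequence[int]) -> bool:
    """True if each of the six most recent months reached the user threshold."""
    recent = monthly_users[-MIN_STABLE_MONTHS:]
    return len(recent) == MIN_STABLE_MONTHS and all(
        count >= MIN_MONTHLY_USERS for count in recent
    )


def worth_considering_hire(monthly_users: Sequence[int], employees: int) -> bool:
    """Either benchmark is a prompt to start thinking, not a verdict."""
    return has_stable_user_base(monthly_users) or employees >= MIN_EMPLOYEES


# Hypothetical monthly user counts, oldest first.
history = [400, 850, 1_100, 1_250, 1_300, 1_500, 1_700, 2_100]
print(worth_considering_hire(history, employees=18))
```

Passing either check only means it is time to start the conversation; the questions in the rest of this section still apply.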
If a company's data has grown to the point that qualitative assessments or basic Excel analyses no longer inform business decisions and accurately monitor business health, it may be time to bring in a data scientist.
How much do you already know about your customers and your roadmap?
Even if you have enough data, it doesn't necessarily mean you need a data science team. Data scientists help your company discover unknowns and patterns and validate your hypotheses. For example, they can pinpoint the user segments with the highest conversion rates to help with acquisition strategy, or assess how good an onboarding feature is at retaining new users. If you’ve already done a bunch of customer discovery work and developed a strong hypothesis about the product roadmap moving forward, you likely don’t need to spend the money on a data scientist to cosign your decisions.
But if, for example, you can't answer simple questions about your customer base, such as where some subset of customers came from or how much time they spent within your product, this could be a signal that you would benefit from the hire.
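To make the "simple questions" concrete, here is a minimal sketch of what answering them looks like once session data is logged in a structured form. The session records, field layout and function names are all hypothetical; a real product would pull this from its own logs.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical session log: (user_id, acquisition_source, start, end).
SESSIONS = [
    ("u1", "organic_search", "2024-03-01T09:00", "2024-03-01T09:25"),
    ("u1", "organic_search", "2024-03-02T14:00", "2024-03-02T14:10"),
    ("u2", "paid_ad", "2024-03-01T11:00", "2024-03-01T11:05"),
    ("u3", "referral", "2024-03-03T16:30", "2024-03-03T17:15"),
    ("u4", "paid_ad", "2024-03-04T08:00", "2024-03-04T08:20"),
]


def users_by_source(sessions):
    """Count distinct users per acquisition source."""
    users = defaultdict(set)
    for user_id, source, _start, _end in sessions:
        users[source].add(user_id)
    return {source: len(ids) for source, ids in users.items()}


def minutes_per_user(sessions):
    """Total minutes each user spent in the product across all sessions."""
    totals = defaultdict(float)
    for user_id, _source, start, end in sessions:
        duration = datetime.fromisoformat(end) - datetime.fromisoformat(start)
        totals[user_id] += duration.total_seconds() / 60
    return dict(totals)


print(users_by_source(SESSIONS))
print(minutes_per_user(SESSIONS))
```

If producing even this kind of summary is hard with the data you have today, that is a sign the gap is real.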
If you do think you need some data science help, can you justify the ROI?
Data scientists can be very costly, with the median salary at about $160k per year. So it’s important to envision the return you're likely to get on this hire (a.k.a. how much you predict data science will help you grow your business). Sometimes, you might be better off having an external consultant or a contractor who can help guide your existing engineering and product orgs to look at the data by themselves. These external folks might also give you a taste of how it looks to properly work with data before you commit to building out a team.
For example, if your engineers or product managers have a solid data background and know how to write queries, you might consider starting with a contractor or a consultant. Or if you’re looking to get started with one specific project that requires data scientist expertise, but don’t anticipate any other immediate needs in the near future, that might be another indicator that you’re better off starting with external help.
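One rough way to frame the ROI question is to put the expected benefit and the cost side by side for each option. The sketch below uses the roughly $160k median salary mentioned above; every other figure is a hypothetical placeholder, and the formula is the plain (benefit - cost) / cost definition, ignoring overhead such as benefits, tooling and recruiting.

```python
MEDIAN_SALARY = 160_000  # approximate median annual salary cited above


def simple_roi(expected_annual_benefit: float, annual_cost: float) -> float:
    """Return ROI as a fraction of cost: (benefit - cost) / cost."""
    if annual_cost <= 0:
        raise ValueError("annual_cost must be positive")
    return (expected_annual_benefit - annual_cost) / annual_cost


# All figures below except MEDIAN_SALARY are hypothetical placeholders.
full_time_roi = simple_roi(expected_annual_benefit=200_000, annual_cost=MEDIAN_SALARY)
contractor_roi = simple_roi(expected_annual_benefit=60_000, annual_cost=40_000)

print(f"Full-time hire ROI: {full_time_roi:.0%}")
print(f"Contractor ROI:     {contractor_roi:.0%}")
```

The hard part is estimating the benefit, not the arithmetic. Writing the estimate down still forces the question of what you expect the hire to change.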
Do you have the foundational infrastructure needed for a data scientist to get started?
If there is no good quality data, whatever fancy analysis a data scientist might be doing won't matter. Garbage in, garbage out. That’s where the right set of tooling comes in.
Here’s a quick primer on a few essentials of the modern data science stack.
Data collection and storage tool to ingest your raw logs to a data warehouse (such as Snowflake).
Data pipeline tool to automate the processing of the raw logs and make them more accessible to end users, as most raw logs are in very complicated and non-intuitive formats.
Business Intelligence tool to visualize and report the data easily. This can be either in real-time or updated on a certain cadence. Real-time BI tools are more costly, but they can help you identify any issues more quickly.
Interactive query interface. A lot of storage tools have integrated interactive query interfaces already. But sometimes, to streamline the analysis process, we might want to have a more powerful querying interface that can do both SQL queries and visualization directly, such as Hex.
Of course, not every company needs every component of this stack (and there are plenty of other tools not included on this list, including observability, ELT, reverse ETL, and more). But at a minimum, you need solid logging for the key events in your product and a data transformation layer that gets the data into a warehouse, so that your data scientists can at least crunch some numbers.
You might be thinking, why not just download a gigantic Excel sheet to analyze? Without proper instrumentation and transformation, your data won’t format properly into an Excel sheet, and you’ll waste costly hours trying to parse insights from unstructured Excel files rather than just querying structured data using SQL, R or Python.
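As an illustration of why structured data matters, here is a minimal sketch of a daily active users query. It uses Python's built-in sqlite3 module as an in-memory stand-in for a real warehouse; the table layout and the events themselves are hypothetical.

```python
import sqlite3

# Hypothetical, already-structured event log: (user_id, event_name, event_date).
EVENTS = [
    ("u1", "signup", "2024-03-01"),
    ("u1", "create_page", "2024-03-01"),
    ("u2", "signup", "2024-03-01"),
    ("u2", "create_page", "2024-03-02"),
    ("u3", "signup", "2024-03-02"),
    ("u1", "create_page", "2024-03-02"),
]

conn = sqlite3.connect(":memory:")  # in-memory stand-in for a real warehouse
conn.execute("CREATE TABLE events (user_id TEXT, event_name TEXT, event_date TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", EVENTS)

# Daily active users: distinct users with at least one event that day.
daily_active = conn.execute(
    """
    SELECT event_date, COUNT(DISTINCT user_id) AS active_users
    FROM events
    GROUP BY event_date
    ORDER BY event_date
    """
).fetchall()

print(daily_active)
conn.close()
```

Once events land in a consistent schema, a question like this is a few lines of SQL. Against raw, unstructured logs, the same question can take hours of cleanup first.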
Do you have enough scope and support for a data science team to grow and feel valued?
Even if executive leadership thinks data is important, not everyone in the company will agree with you. It is important to make sure all the relevant functions know how to involve data as part of decision-making processes. For example, data science should be considered a key business function, on par with product management and engineering. When you are making product changes, you should instinctively consider whether there are any data insights that can help you make the decision. You should expect insights to be incorporated into product improvements, not just FYIs or nice-to-haves.
If expectations are misaligned and your team doesn't start to embrace data culture, data scientists might just end up like fancy window dressing, rather than driving business impact.
This is where you need to think ahead: given current business growth, where do you see bigger issues without a data team? For example, do you plan to start running growth experiments in the next year? That roadmap will help justify why you have to hire now so that you can be more prepared for the bigger challenges down the road.
Do you have the right expectations about your data science team?
As you consider bringing on your first data science hire, set realistic expectations for the first few months. For example, a new hire probably won't be able to tell you any fascinating insights until at least three months in because they might be busy figuring out your broken logging and setting up the foundational data infrastructure.
Don’t expect your earliest data scientists to immediately create fancy models because your product probably doesn't need a model to begin with. Instead, in the early days, they’ll spend a lot of time communicating their logging needs with engineers, battling with inefficient data tools and getting the right data for analysis.
FINDING YOUR FIRST DATA SCIENCE HIRE.
After you confirm that your company does need a data scientist and successfully get all the key stakeholders on board, it is time to start hunting for the right one.
The first data scientist is always critical because they will build the foundation of your data model, define the role of data science within your organization, conduct interviews to decide who will be joining the early team and shape the culture of your data science org. But the perfect puzzle piece hire won’t just fall in your lap.
Who are (great) data scientists?
Data scientists vary a lot across different companies, just like chefs have different titles (sous chef, saucier, patissier, just to name a few) and different restaurants have different types of cuisine.
For example, a data scientist who is working on the well-being survey at Meta might only work with thousands of samples per day and not do any machine learning, while a data scientist who is working on newsfeed ranking may need to parse through billions of rows of raw impression data, understand all the core machine learning concepts and sometimes run ranking models themselves. However, they could both share the same title: Data Scientist, Product Analytics.
Below are some common skills you might see often in data science job descriptions:
And while these technical skills are an important part of the day-to-day job, the key elements that distinguish a good data scientist from a mediocre one are often not technical skills, but their storytelling and thought leadership abilities. In order to tell a story that can potentially change the team’s thinking around the product, the data scientists you hire should be able to identify what questions data can help answer, what data they need, how to find it, when to build the necessary data foundations and then write a document with easy-to-understand visualizations and clear recommendations.
It’s also important to understand some common titles and other functions that are often confused with data science. Below are some common titles:
Look for hybrid skill sets, not specialists.
Of course, you want to hire a master chef who can cook any cuisine with any ingredients you provide, ideally at a lower cost. However, master chefs are usually expensive — and more importantly, you probably don't need a master chef after all, but rather someone who can cater to your own needs.
I am a strong advocate of hiring a hybrid-type data scientist as your first data science hire: either a traditional data scientist with a strong ETL background or a data engineer who knows some foundational data analytics techniques. The reason is that, at this stage, your kitchen's setup is likely basic, without any fancy appliances. So this first chef has to handle the full cooking cycle, from washing veggies to garnishing, even though it might be less efficient and accurate. Once the company starts to scale and reaches a certain size (likely over 200 employees, with a clearer org structure), it is likely time to hire data scientists for specific functions to pair with your individual orgs.
I have seen companies hiring their first data scientist in three different fashions:
Fresh graduates out of college
Senior data science leaders
Seasoned individual contributors
Each has its own pros and cons:
So the short answer is there is no perfect first hire.
Another question I often get is whether startup experience or domain experience matters in the job description. As someone who spent the first six years of my career in large organizations and has switched domains five times, I am admittedly biased, but my answer is: not necessarily.
Of course, startup experience is helpful. At the least, it means they have experienced the chaos and craziness of a startup before, so they will be mentally prepared for the next one. Also, the tool stacks used by startups are very similar, so the learning curve for tools might be smoother. But data scientist roles at startups can vary quite dramatically, which means their working experience could be wildly different from what you are looking for, even if both companies are startups.
By contrast, data scientists from large companies probably lack experience using the "modern data stack," as they may be used to tools built in-house. But I wouldn't say that all data scientists at large companies lack the mindset for working through chaos and vague problem spaces.
An underappreciated aspect of folks from big companies is the exposure to some of the strongest data scientists in the industry. For example, at Meta, we had a Data Science Career blog series for experienced data scientists to share their journeys with their peers. We also had regular insight-sharing meetings to learn from other data scientists about their work. I was fortunate to be exposed to inspiring analytical minds and unconsciously picked up effective data analysis skills and external domain knowledge.
The background matters, the experience matters, but the most important thing you should pay attention to goes beyond line items on a resume. Are they scrappy and resilient enough? Are they fast learners? Can they communicate and collaborate with others well? Are they willing to get their hands dirty, and do they have strong ownership?
In my opinion, your first data science hires ideally combine the traits of both a data scientist and a data engineer regardless of whether they come from a startup or a large company, as long as they can hit the ground running quickly and keep learning along the way.
Interview questions to find your ideal hire:
A few questions that I suggest you add to your interview loops include:
Give them a live coding challenge and see how they approach turning an ambiguous business problem into presentable solutions using numbers and charts. Can they walk you through their coding logic? If they get stuck, how do they unblock themselves? Can their data points and/or charts support their conclusion? The code doesn't have to be perfectly clean or limited to any particular language — the key is getting a sense for how they parse the information in front of them.
Ask them to solve a business problem you're currently facing and see if they can derive reasonable hypotheses and solve the problem in a structured way that convinces you. Do they have a framework to consider the problem holistically? Are they curious about the problem by asking thorough clarification questions? How do they communicate their thought process? Is their solution creative?
Pick one project in their previous work experience to dig deep on their cross-functional and ownership experience, including: Who initiated the project? What was your role? How did you align with stakeholders on the results? What conflicts arose? What was the most challenging aspect of the project and how did you work around those challenges?
HOW TO ATTRACT YOUR FIRST DATA SCIENCE HIRE.
Hiring good data scientists can be very competitive, so it is important to make yourself stand out to your top candidates. When data scientists are considering joining a startup as a first hire, like any early startup employee, they’re looking for some attractive combination of these four role attributes:
Interesting and challenging problems. Good data scientists believe that their work can make a real difference in the product, especially in an area they are passionate about. So whether there is enough room for them to embed their work into the product and make a real impact is essential. They certainly don't just want to be a trained SQL monkey and pull the data when asked. Instead, they want to use SQL as a tool to solve complex problems and contribute to business growth.
Career growth. The first data scientist usually wants to ensure their career will prosper within the company. Clear performance-based incentives and a good match between what they are looking for in the long term versus what you can offer are important. For example, if their goal is to become a data science leader in the company, will you hire a larger data science team in the future for them to grow into that role?
People and culture. As with all other roles, whether they like the people they are going to work with is one of the most critical factors in deciding whether to accept your offer. This is especially true for data scientists because data science is very cross-functional. For example, the first data scientist's concerns might be: Will the leadership advocate and gather resources for their team? How will the leadership react if the data tells a story different from their original intuition?
Company's future. Again, you are a startup and this comes with a certain level of risk. So the first data scientist must know if the company has a prosperous future and that they can see themselves as part of it.
Now that you know what a data scientist cares about, it is time to make your pitch more engaging.
Make the scope and challenges attractive. Think about some daily scenarios where you struggle to answer some data-related questions. Are these questions challenging and exciting? What kind of skills do you think might make the solution more effective? It might be helpful to divide your data problems into some concrete areas that map closely to your roadmap and imagine how a data scientist can make a difference.
Be honest about how you envision your future data science team. The candidate will likely ask you this question during the interview, so it is important to have the answer in mind. Don't oversell what you can offer. If the first data scientist joins your startup and finds out that you can't deliver what you promised, they won’t stick around. Ask candidly about what the candidate is looking for in the long term first, and ask yourself whether this is something you can offer.
Have a champion for building data culture within the company. Founding members of any function within a startup will need a lot of support when they first start. For example, give them the chance to speak in an important meeting and promote their analysis within the company. This champion, likely to be the manager of the first data science hire, is responsible for making this first hire feel welcomed and recognized.
Be ready to share your finance numbers honestly. Folks evaluating an opportunity at a startup will likely ask plenty of questions about the company’s financial situation — be prepared to share appropriate details about the funding plan, option dilution, CAC to LTV ratio and path to profitability.
To expand your pool of candidates, tap into your network and ask your investors for help suggesting great data scientists. I also suggest finding channels where you can surface your hiring news directly in front of a group of passionate data scientists, such as data Slack channels, data meetups and data conferences. Knowing someone who is part of these communities or directly sponsoring these communities can get your hiring news out there.
WRAPPING UP: HIRE THOUGHTFULLY, NOT HASTILY.
As with any startup, the needs are plentiful, and the resources are scarce. Without a data scientist on board, you could be missing out on the right insights to guide early business decisions. But hiring too hastily, without fully understanding the needs of the business, can quickly distract from the most pressing challenges facing a startup on a path to product-market fit.
As you weigh the decision for your own particular business, the key is to align your data science hiring strategy, including the type of data scientist you bring on board and the skills and experience they offer, with your current reality. With the right tools, thorny problems to tackle with robust data and cross-functional buy-in, your startup may be ready to incorporate data insights into the product.
And remember: top data science hires are incredibly in demand, so it will likely take some time to get a signed offer letter. Don’t just try to get anyone in the seat. Take the time to find the right person who can make a transformational impact on your startup for the long run.