Building a Successful AI Startup
ScienceTalks with Saam Motamedi
We at Snorkel AI have received many requests from data scientists and machine learning engineers who aspire to be founders, asking where to start and how to approach their entrepreneurial journey. We genuinely believe that data scientists and machine learning engineers will build the next generation of mega-enterprises. Over the summer, we recorded a special set of episodes of Snorkel Science Talks focusing on data science entrepreneurship.
In this episode of Science Talks, Snorkel AI’s VP of Marketing, Devang Sachdev, chats with Saam Motamedi. Saam is a partner at Greylock Partners. He focuses on seed and early-stage companies in several domains of enterprise software, with a particular focus on AI. Saam’s investments include Abnormal Security (email security), Cresta (contact center AI), and Snorkel AI (accelerating AI application development). Before Greylock, Saam co-founded Guru Labs, a startup that used machine learning to turn credit card transaction data into sales for offline merchants by predicting customer preferences, and was an early product manager at RelateIQ, where he drove the development of data products before and through the company’s acquisition by Salesforce. This talk focuses on how data scientists and machine learning engineers can begin their startup journey.
This episode is part of the #ScienceTalks video series hosted by the Snorkel AI team. You can watch the episode here or directly on YouTube.
Below are highlights from the conversation, lightly edited for clarity:
Can you please tell us a bit about your journey into co-founding Guru Labs?
Saam: I’ve been working in artificial intelligence for quite a long time. My first job was in product development at a company called RelateIQ, which started around 2012 with the aim of bringing intelligent software to the CRM workflow. You could say RelateIQ was the first company that went to other companies and said, “Give us access to your employees’ emails, and we will train a model to automate and enhance your workflow.” For example, if you were a sales representative, we could tell you the probability that a given deal would close, based on the sentiment and data in the email conversations you had with your buyer at the account. That was where I saw the power of applying AI to a specific use case. After a couple of years, the company was acquired by Salesforce, and at Salesforce we integrated AI across the product portfolio. It is now part of Einstein, Salesforce’s platform for building AI products.
After that, I left and started Guru Labs, where I took a similar approach, in this case focused on credit card data. The product took transaction data from customers and point-of-sale systems and automatically built buyer profiles using ML algorithms. That enabled merchants to run dynamic pricing campaigns showing different offers tailored to specific customer segments.
After that, I came to Greylock, where I’ve been working for the past five years. The thrust of my investing falls into two buckets. One is companies going after existing software markets, such as SaaS, security, and infrastructure, where ML and data are a core part of the product advantage and drive differentiated value. The second is infrastructure, with Snorkel AI front and center, where we are making it much easier for data scientists, machine learning engineers, and business users at large enterprises to build ML applications internally. Those two areas are my focus because I think they are the two biggest waves happening in software right now, and they hold the most significant opportunities for value creation.
Devang: That’s true. In some sense, it feels like we are already at an advanced stage for AI. IDC predicts that AI will be a half-trillion-dollar industry by 2024. On the other hand, it feels like we are just getting started. Consumer technology giants have benefited from putting AI to use in products we use in our everyday lives. But when it comes to delivering business value through AI, things seem relatively nascent.
How do you see the state of enterprise AI technology, and what are the opportunities for AI specifically if someone were to start a new company today?
Saam: Yeah, I agree with what you just said. Enterprise AI will be massive, and it’s going to disrupt every business workflow across every vertical. The first reason is the continued proliferation of data in the enterprise, both machine-generated and human-generated, and that data holds a tremendous amount of power. The second factor is the continued availability and decreasing cost of compute. The third factor is the advances happening on both the infrastructure side and the algorithmic side. All of these things coming together are going to make a significant impact on the industry.
When I think about opportunities, I think of them in terms of delivery mechanisms.
The first is existing platform and application companies building additional AI features. Here you are beginning to see increased adoption from platforms like Salesforce, Workday, and ServiceNow, which are embedding AI into their workflows to make them more efficient and effective. I would expect a lot of the product innovation from those companies to be around AI. They are particularly well suited to solving problems where the data needed for training already lives in their environment. Salesforce, for example, is well positioned whenever it has all the relevant data for a prediction task. So, as investors, we tend to avoid startups whose plan is to build AI features on top of existing systems of record.
The second is applied AI companies that emerge to provide a best-in-class, end-to-end solution for a specific problem. In this case, customers want to buy an AI solution for their specific problem. Abnormal Security is a good example. They are building a next-generation cloud email security platform that uses NLP and various AI techniques to build behavioral models of a company’s employees and of what normal email behavior looks like, and then identifies anomalies to prevent advanced cyber attacks. Another good example is Cresta, which is building a contact center AI platform.
In markets where the customer wants to buy an end-to-end solution, we look for sufficiently horizontal problems that are shared by many companies. We also want to confirm whether the customer wants to buy a vendor-based solution or develop a solution in-house.
The third is infrastructure platforms that emerge and grow by enabling enterprises to stand up AI projects on their in-house data for their unique use cases. Here the question is how we build tooling platforms that make it easier for these enterprises to stand up projects on top of their own data. Some of the highest-value use cases for ML/AI need to be served this way because the data is specific to the enterprise, and the enterprise may not want to share it with third-party vendors. There are also specifics to how the model needs to be developed, and the workflow around the model can be very bespoke. So, the customer needs to partner with a platform and tooling provider that can help build the use case and get it to production. When we look at companies like these, there are different dimensions we consider.
- Who’s the end-user? Is it a business user? Is it a data scientist? Is it a machine learning engineer?
- Where in the machine learning cycle do they focus?
We know that model management, deployment, and model monitoring are critical. But most companies’ data is not ready for machine learning algorithms, and labeling that data, transforming it, managing it, and unifying it can be a difficult task. We are very attracted to companies that start at the beginning of the lifecycle and recognize that that’s where the pain is. They can then use that insertion point to expand and serve more of their customers’ needs over time. That’s why we are excited about companies like Snorkel AI that start with the training data problem and then expand across the entire lifecycle from there.
For data scientists and machine learning engineers looking to solve enterprise problems, which areas of AI are the most exciting?
Devang: For example, there has been a lot of research done around vision. At the same time, we know that vision-related use cases within a traditional enterprise are minimal. Not every company is working on building an autonomous car. When you think about text, vision, or time-series data, are there any particular opportunities that you see as more valuable than others in any of these domains?
Saam: I think there are practical problems across all of these domains. For time-series data, look at the manufacturing and industrial verticals: massive factories have lots of sensors emitting tons of telemetry that is consumed as a time series, and they want to use it for things like anomaly detection and predictive maintenance. So, there is a lot of value there. For text, enterprises have many different types of documents, customer communications, and workflows that need to be processed. That holds a lot of value too.
A lot is also going on in robotics and autonomy that relates to computer vision. The large Fortune 500 companies that build security systems have a lot of camera telemetry. They want to do many exciting things with that data, whether it’s counting people or looking for abnormal behavior. That could be a perfect application for infrastructure providers to support because it’s unique to their environment. So, I don’t think there is a shortage of meaningful and valuable problems to solve across these different data types.
Devang: Talking about 2021, it has been a record-breaking year in terms of investments, especially for early-stage companies. The amount of investment dollars has already doubled compared to last year, and the number of companies receiving investment has increased by 60%. So, I think it’s quite an exciting time to be an entrepreneur, and it’s also a perfect time to be an investor.
As an investor, what characteristics do you look for in the team, the business, or the product you are investing in?
Saam: I agree with you that the funding environment is exceptionally robust and that’s very well merited. I can tell you that in the last five years I’ve been at Greylock, we have never seen the quality and quantity of entrepreneurs coming through the virtual offices that we see today. We’re going to see a lot of exciting companies get built over the coming decade.
We generally evaluate early-stage companies on three primary dimensions. The first is the market, the second is the product, and the third is the team. At Greylock, when we back companies, we are funding them with the ambition of building large, public, enduring businesses. Those are the types of companies we’ve supported in the past and what we’re looking for going forward as well. And it’s challenging to build a sizeable, enduring business if you don’t pick the right market.
There are three types of markets that we back. The first is replacement. It’s an existing software market where there is an incumbent and there is already purchasing behavior. You can look empirically and say there is this much spend happening, but some fundamental technical shift enables a new company to take on that incumbent. The second is emerging markets. This is when you see some new customer behavior begin to grow and proliferate, and you ride that market trend. I think a lot of what is happening in ML infrastructure falls into emerging markets. The third is new market creation. This often occurs on the consumer side. Here you’re betting that you can get a new behavior to emerge: you introduce some new medium of interaction, some new software, and hope that it unlocks a new behavior.
On the enterprise side, we want to make investments in buckets one and two.
So, we either want to believe that there is a vast existing market, in which case we need to understand how what you are doing is fundamentally different from what others are doing, or we need to believe that there’s an important emerging market that will be one of the most significant trends of the next ten years. This is the first box to check.
The second piece is the product. Here we want to understand your roadmap to a 1.0 product and why that 1.0 product is not only differentiated relative to competitors but also drives sufficient customer value. This is another place we see companies get into trouble. They build a very narrow wedge product, and they can’t find enough customers for it.
The last piece is the team. Here we look at several dimensions. First, we want to back people who have an immense need to win, those who wake up every single day with a burning desire for what they are doing. We want to find people from the domain who have a unique insight, like the Snorkel team, who have a Stanford background, built this project there, and worked with essential collaborators. We look for an ability to recruit the absolute best talent. Beyond that, we want a team that is going to build for a long time. We invest for the long term, so we want teams that will build for the long term. When we find a combination of those three things (market, product, and team), we get excited. But this happens only a handful of times a year.
What pitfalls should future entrepreneurs watch out for, and do you have any specific suggestions for faster growth?
Saam: The biggest problem I see with enterprise AI entrepreneurs is that they are more focused on the technology and the solution than on the customer problem. So, I think it’s essential to spend a lot of time on the problem before you write a line of code. It’s critical to understand the customer’s problem and the scope of the solution that needs to be delivered to solve it. When you start from the customer problem, I bet you’ll find that there is a lot to build that is not AI, and that AI is just one component of the solution. The second thing is building a product and architecting it to operate in real-world environments.
So, there are two points to focus on now.
- The first one is training data. Suppose you’re going to build a new company, and you’re going after an applied AI use case. How will you bootstrap that product before you have enough training data to get your ML models to an acceptable performance level? The best entrepreneurs use heuristics and more domain-specific approaches to deliver value to customers early. Then they start collecting data after they’ve proven that value, and over time they use the data to train their models for better performance. It’s a very different approach from trying to convince a customer to give you all their data before you’ve shown them any value.
- The second thing is how do you handle mistakes? If you are building a security tool that is AI-driven and the tool makes a mistake, then it is very impactful for the customer. You let an attacker in. So, the best companies we know have these fail-safes. For example, if I’m making a prediction, but I’m low on confidence, my interface should tell the end-user on the customer side to review the decision before acting on it. It’s better than just relying on the system’s accuracy.
So, build a product that understands those nuances.
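The two points above (bootstrapping with heuristics, and failing safe when confidence is low) can be sketched roughly as follows. This is a minimal Python illustration and not any company’s actual system: the email fields, the rule weights, the `REVIEW_THRESHOLD` value, and the function names are all hypothetical and chosen only to show the shape of the idea.

```python
from dataclasses import dataclass

# Hypothetical cutoff: predictions below this confidence go to a human.
REVIEW_THRESHOLD = 0.8


@dataclass
class Decision:
    label: str          # "suspicious" or "normal"
    confidence: float   # how sure the system is about the label, in [0, 1]
    needs_review: bool  # True when a person should check before acting


def heuristic_score(email: dict) -> float:
    """Toy rule-based suspicion score in [0, 1].

    Stands in for a trained model before enough training data exists.
    The rules and weights are made up for illustration.
    """
    score = 0.0
    if email.get("sender_domain") not in email.get("known_domains", ()):
        score += 0.5
    if email.get("asks_for_payment", False):
        score += 0.4
    return min(score, 1.0)


def decide(email: dict, threshold: float = REVIEW_THRESHOLD) -> Decision:
    """Label an email and flag low-confidence calls for human review."""
    p_suspicious = heuristic_score(email)
    label = "suspicious" if p_suspicious >= 0.5 else "normal"
    # Confidence in the chosen label, whichever side it falls on.
    confidence = max(p_suspicious, 1.0 - p_suspicious)
    return Decision(label, confidence, needs_review=confidence < threshold)


if __name__ == "__main__":
    email = {
        "sender_domain": "unknown.example",
        "known_domains": ("corp.example",),
        "asks_for_payment": False,
    }
    print(decide(email))
```

Once real data has been collected, `heuristic_score` could be swapped for a trained model’s predicted probability while `decide` and the review flag stay the same, so the interface keeps asking the end user to check uncertain decisions.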
Now, let’s talk about some pitfalls to watch out for that are specific to enterprise. The biggest mistake people make is building a product that doesn’t have a go-to-market motion that can support it. The other is not understanding the importance of repeatability from a product-market fit standpoint. Your goal is not to get five customers to like your product but to get thousands and thousands of customers who love it. As a founder, your role is to learn the most common use cases and the patterns behind a customer’s purchase. You also need to know which customers to focus on, which use cases to target, and how to make a deal successful on one of those use cases. Some companies learn that very early because they focus on it in every customer interaction, while others acquire many customers and still haven’t figured it out. So, figuring out repeatability early on is another critical piece.
Is there any recent book, movie, or talk show that you have enjoyed or that has inspired you?
Saam: One book that I recently re-read is Thinking in Bets by Annie Duke. Annie Duke is a former professional poker player, and she has written two books about making decisions in a probabilistic world. In my opinion, her main point is that we often perceive decisions and outcomes as deterministic because we only see the version that played out in our lives. In reality, whenever we decide, we’re buying some distribution of outcomes, and we should judge a new decision by that distribution. The most important thing is that when we review a decision, we shouldn’t just look at the outcome. We should try to work out why we made that decision, what the distribution was, and, most importantly, whether we bought the correct distribution in making that decision. It’s a helpful framework when I think about allocating capital. But no matter what you’re doing, thinking about decisions probabilistically and then evaluating them in order to improve will enhance the quality of your work. So, I highly recommend this book.
As for podcasts, I recommend that your readers check out an episode I recently posted on Greylock’s podcast series, Greymatter, about building practical AI products. In this session, we brought together a couple of leaders from applied AI companies and talked about some of the challenges we face in building and deploying AI services. I also interviewed people who have actually dealt with these problems and successfully found solutions. So, I recommend this podcast to readers who are interested in building applied AI companies. It’s an excellent show, and there is a lot to learn from it.