
Nathan Benaich — The AI Report, the future of large LMs, and Investing

Nathan Benaich is the general partner of Air Street Capital and the founder and co-author of the State of AI Report. Nathan studied biology through graduate school before transitioning to AI and investments, and is now known as one of the most influential analysts in the field. In October 2022, Nathan released the State of AI Report and we asked him to reflect on his predictions and what he sees emerging in the industry.

This is a transcript of the interview, which you can also watch on our YouTube channel 👇

Before we talk about the State of AI Report, I read that you have a background in biology and I'm really curious about your journey. How did you go from biology to AI to investments, to being one of the most influential analysts of the space?

I went into biology because I'm definitely a science and technology person. I like working on problems that are technical, but where, if you get it right or you solve something, it's relevant for a very, very large problem space. And when I started undergrad in 2006, there was tons of excitement around stem cell biology and regenerative medicine. I was pre-med and on the path to doing an MD-PhD at some point.

After doing a bunch of research in my undergrad, which started to kind of use bioinformatics as a way to understand large-scale genomics and other kinds of biological data sets, I did grad school in the UK and let go of the ambition to become a physician-scientist. I spent three years in my PhD from 2010 to 2013, again integrating experimental data sets on genomics and expression of various genes in cancer patients who had metastatic spread. I was studying what genes might be responsible for that, and also undertaking wet lab biological experiments and trying to integrate the two together.

Eventually, in 2013 when I submitted my PhD, I grew sufficiently interested in this process of working on technical problems with new kinds of tools like machine learning and bioinformatics, and how one might be able to take those kinds of technologies and form companies out of them. And so I felt that moving into the VC world would kind of get me closer to understanding how that process worked. And a couple of years later, I'm still here, and there's a lot of stuff to solve and innovate for in the venture product for founders that are working in machine learning.

So you kind of understood that there's a universal algorithm maybe that can be used across many domains and actually create value for you as an investor?

Yes, I mean, I think at a high level, what's most exciting about machine learning is that every product in the world that has been created is a product of intelligence of some form. And the mission of learning how to recreate forms of intelligence and make use of that to solve real-world problems is the most exciting thing to work on.

And then secondarily, machine learning is almost like a lever on tech progress everywhere that tech exists. And so it is also an investment theme that keeps on giving as a function of the substrates in which it's applied being ripe for that application. And so we see some domains come online in the last decade and others that are coming online this decade, and we'll see new ones in the future as well. It has persistence in that sense too.

And what's your track record? What are you guys doing with Air Street Capital and how have you guys been lucky or not so much with AI so far?

I started making investments in AI-based software companies in 2014 or so. And, you know, one of the early outcomes there was a business called Mapillary, which was doing crowdsourced computer vision: basically building an open version of Google Street View, which we eventually sold to Facebook to power their mapping business. We also had one of the early generative AI companies for music, called Jukedeck, which we sold to ByteDance and which was then integrated into TikTok.

Then we moved more into FinTech with a business called Thought Machine, which does data analytics and data infrastructure for cloud banking. It's worth like 3 billion now and powers a lot of JP Morgan consumer accounts.

We moved back into computer vision with Tractable, which does car insurance damage inspection and repair. They now power a lot of the claims that go through Geico in the US when individuals damage their cars.

And then, more recently with Air Street, I've been diving a lot into techbio as a sort of reframing of biotech, where the main difference is that founders come not from a pure biology background but from a software engineering background, and look very religiously at the drug discovery value chain. So I made a number of investments there, including Allcyte, a drug personalization company that we sold to Exscientia pre-IPO.

So you have quite a well-connected and large portfolio of investments in that space. Most people really know you as the initiator or author of the State of AI. How did you guys stumble into that? How did you guys get started?

Yeah, I mean, it actually originated through a number of meetings that I had with former entrepreneur Ian Hogarth, who is now a prolific angel investor, particularly in machine learning companies and other software companies.

We met five or so years ago, and we'd sort of both been keeping track and working tangentially or directly in machine learning companies. He was particularly interested in the emerging geopolitics of AI and wrote a really influential essay called AI Nationalism pretty soon thereafter.

I guess as a PhD person, I just like diving into research papers, which is why you [Jakub] and I hang out. I've also been investing in companies, and together we assembled a diverse set of interests that broadly represented the various players that are involved in AI.

We're both fans of Mary Meeker's annual Internet Trends report, which we would consume every year to get a temperature check on what's up on the internet. You know, AI is super exciting: lots of things happening, but also lots of noise. And there wasn't a canonical kind of open-source document that explains the most important concepts with an editorialized take.

Our goal was to speak to a diverse audience so this document would be credible landing on the desk of somebody at OpenAI or DeepMind, but also credible landing on the desk of a politician or a growth stage technology company or a corporate.

The idea started with hacking on a Google Doc, putting down a bunch of content that we thought was interesting, drawing on various resources, including Guide to AI, the monthly newsletter that I've continued to write for six years.

We just jumped into making non-super-polished slides, so it doesn't look like a pro marketing thing, because it's definitely not.

Is it still a labor of love, or does it have some value for your investment practice?

It's definitely a labor of love. But it also helps organize my thoughts, and I get to feel like I'm playing a part in the community I learned a lot from. You know, I stylize a lot of what I do as trying to be a core contributor to this overall project of AI that various people are working on in different ways. And it seems like with this last edition of the report, we've reached quite a broad audience and got recognition for it!

There's hardly anybody I speak to who hasn't actually read it. And indeed it lands on the desks of individual researchers, but also of decision makers, politicians, and CEOs of companies that know very little about AI.

So what's in it is not only a very good reflection of the space, but it's also quite influential. Can you tell us a bit about how it's grown from a hobby into a side project with a more serious note? How do you make it nowadays? What's the process for compiling the report?

It’s a daily exercise of reading stuff that surfaces through our inbox, Twitter, conversations with people in the community… Just keeping these major links and ideas that I package on a monthly basis in my newsletter.

And then, come the start of the summer, June, July, we start to whip up a sketch document on Google Docs where we put down some of the major things that happened across the key areas we want to focus on, which traditionally are research and industry. This year we also did a safety section to represent the growing interest in safety and alignment research.

And then we make a set of predictions every year, partially to keep the report spicy, partially to push ourselves to figure out where things might go. We also rate our performance on the previous year's predictions on an annual basis.

So, given the influence that you have with the report, how do you know? What's your sounding board for what is interesting and what is maybe just a personal whim?

So it's editorial; we don't promise to cover absolutely everything. We definitely start with more slides than what we publish, and usually cut it by 30 to 50% every year. We try to keep it to around a hundred slides overall; this year there was a little bit of extension over that.

We'll have a kind of reviewer consortium: a group of people who we think reflect the real builders and influencers, in a good sense, of the space, coming from industry, startups, or academia.

The goal here is to find glaring gaps, unintended misrepresentations or mistakes, or just to critique our interpretation of things, to make sure that we present something that is factually accurate and fair. We do still take a stance: if certain people have an issue with our interpretation but we still feel it is genuinely correct, then we keep it.

We do really like the data-driven insights, for lack of a better way of framing things, because there are some unsaid truths: things that people in the community have a hunch about from what they experience day to day, but for which there wasn't any canonical source of evidence that factually confirmed them. And that was some of the work that we did with you guys at Zeta Alpha on compute usage. It was extremely popular because everybody felt one way, but nobody had numbers to prove it until now.

So if you fast forward three to five years and extrapolate the progress in language models and the ability of AI models to digest very large amounts of unstructured data, what parts of the report do you think can be automated, and what parts are essentially your own human judgment?

A synopsis of the key findings, perhaps? Like the setup of the experiment, why it's interesting, the experiments themselves, and then the key findings: that could probably be automated. So in that regard, for a lot of the research slides where we try to explain to people the significance of certain work, I imagine that we could use LLMs and then maybe take a more editorial view as to the influence of it.

The centerpiece every year, I think, is your predictions, where you make things really measurable. So now that we're at the end of the year, we've seen things like the launch of ChatGPT, which I think has had a very seismic impact on the AI research community.

I was at NeurIPS and nobody was talking about anything else. But also in the broader public, right? It reached the general press in a few days. My children and my neighbors are playing with it! What are some trends that, now, two months after the publication of the report, you didn't see back in September, October?

Well, I guess we didn't per se predict a model like ChatGPT. I think what's kind of interesting is that there is a resurgence of reinforcement learning, which had been very exciting as a path towards AGI in 2015, 2016, 2017… It then seemingly went out of favor for a while, and now seems to be coming back with a bit of a vengeance as a way of aligning models with human preferences and with certain behaviour that we want.

I think prediction number five, about big companies investing billions of dollars into AGI companies, is literally happening right now. We haven't yet seen regulation of AGI labs in the way bio labs are regulated from a biosafety point of view.

But the real motivation for that is that in bio, if you do experiments with viruses or certain potentially dangerous materials that can modify DNA and things like this, you have to provide protocols and safety measures, work in a specific facility, and log every interaction in a way that you don't actually have to do if you do anything with your computer.

And, you know, we think it's not unlikely that certain politicians would take issue with that and seek to regulate certain AI systems.

But I think politicians would seek to regulate the labs themselves, not so much the product, because regulation is usually focused on liability around the product.

Right. Yeah. I think it would probably be around the specific kind of work, which is where it becomes a bit tricky. Like where do you draw the line? I don't know. Regulating an entire organization from which only some part is doing work that qualifies to be regulated is quite a blunt instrument.

I also definitely think that NVIDIA will make a strategic move with a large foundation-model-style company. You know, there are many of these players that are out fundraising at the moment, so I imagine this will shake out in the next couple of months too.

I still think there's going to be some news about AI semiconductor companies struggling a bit. We have a separate slide in the report that looks at the historical analogy in the US in the sixties, when the government bought essentially all the integrated circuits that were on the market. That's a way of demonstrating that governments can act as buyers of first resort for emerging technologies that might not yet see large-scale use in the commercial sector. And I think a similar mentality should actually apply to many of these AI startups that do semiconductors, because the market there is at this point still early.

You know, it's been like five-ish years, and it's still hard to point to a dedicated type of machine learning work that wasn't possible before and was enabled by these new chip architectures. At the same time, we still agree that from an industry-concentration and tech-sovereignty point of view, we do need more vendors. So I think governments do have a role to play, to step in and say, look, we'll fund these companies, we'll bridge this gap, and we'll buy their products: not just invest in their companies, but buy their products. We'll set up data centers and national supercomputing centers, which we also covered in the report as a slide.

Also, a lot of European countries just don't have very large computing clusters relevant for deep learning, versus the many American supercomputing systems in national labs and private companies.

We saw that you launched this new side project of the report to track day-by-day the compute capacity that each institution owns.

Yes. I think compute capacity is a very good litmus test for where this space is moving in the commercial sector, because, as we know, most of the experimentation occurs in the research context. So if we see a signal in the research context, I think it translates a bunch into real-world demand.

We spoke at NeurIPS with people from Cerebras, and they actually had some very good news, because Jasper AI partnered with them, which I think is a big deal for both companies. I think you're also very bullish on generative AI models. I'm curious how you see this space developing: OpenAI monopoly or many flowers to bloom?

I think in the large models space you can paint a credible path at the moment where OpenAI is heading towards a monopoly on model capabilities. Depending on how many tweaks and how much RLHF they had to do for their model, and on the intricacies of its design and dataset, the lead can actually be quite large versus what we have in open source and what other companies have. And there's probably some limit at which companies that want to make use of this technology no longer think it's useful to compete and build their own equally large models, so they'll just consume whatever OpenAI produces.

I often think that these large models are going to make it even better for prototyping and testing ideas and just rapidly iterating towards finding product-market fit.

And then once you get there and you start building some market share, that's when you start looking at your unit economics and your dependency on third-party providers.

That’s more or less what Jasper AI is doing right? Trying to hedge their bets with OpenAI…

I mean, it's fairly obvious, reading between the lines, what they're doing. Their CEO tweeted back when I raised that point, saying they're perfectly happy with OpenAI, but some of their customers have certain specific needs which might require different providers in the future, which I thought was just kind of a funny interaction.

But yeah, I think that demonstrates the path because at some point they'll realize that certain workloads can be solved with smaller models that are cheaper to run, so third-party vendors will improve their economics and just make themselves more compelling as a long-term vendor.

But you know, as models get better and better, the UI and UX wrapping of the model becomes even more important. Customers have little patience and want something that just works. You can see this even with translation: Microsoft, Apple… all the big guys have similar translation solutions, but if you instead provide good APIs that allow you to integrate translation into a workflow product, that'll end up making good amounts of money, just because the UX is a lot easier.

So in terms of you as an investor, you are more bullish about AI companies that are closer to the end user rather than the ones that are closer to the technology stack?

Yeah, exactly. And one of the ways I put this is: I think exposing a model's capabilities via an API doesn't allow the API owner to maximally capture the value it helps its customers create. Contrast the uses of a generative model, which is quite powerful in its output and can be applied in so many different ways: generating trading predictions, where if you make good trades you make a ton of money, versus generating copy for a blog post. It seems bizarre that an API provider would charge the same amount of money to those two profiles.

Let's shift our focus to Google. They have to be employing the largest share of AI research talent in the entire tech ecosystem, right? This is not one of your predictions, but it's puzzling me a lot. Why are they not productizing services around PaLM or LaMDA or what have you?

I think the answer is simply that it's just a large organization. Some parts of it are slow-moving, potentially sclerotic, where liability and risk are a higher priority than capturing a little bit of market share and attention versus an ads business. So, you know, the risk of exposing something like what Meta did with BlenderBot and Galactica, and the whole backlash that happened through mismanagement of those releases, is more painful to them than potentially making 10 million or 15 million or 50 million, and so on.

So you suffer from that at scale. And you suffer from many teams working on the same thing without a clear and concise product roadmap that operates like a startup's. Then you see the momentum of OpenAI, and you become more reactionary and focused on large-scale revenue opportunities.

I think they're the only ones who can protect us from OpenAI and Microsoft. So, coming back to next year, 2023: of the nine predictions, or of the things you didn't think of at the time, which one do you feel strongest about? Not the one with the highest likelihood of coming true, but the one that's going to have the biggest impact.

If there's genuinely a way to monetize fully open-source AI, like what Stability or Hugging Face are doing, and these companies can demonstrate that they can become Databricks-style companies, then I think we're in a really good place. I think that's what a lot of folks are looking for, or hoping for.

Yes, I’m rooting for exactly that. Well, our time is up, so thanks for joining us. I really enjoyed this chat and also working with you on the Zeta Alpha dataset. Thanks Nathan for joining us!
