#001 - Data Driven VC

Introducing Benchmark Fund XV, powered by GurleyGPT

Welcome to our very first issue!

Although we’re pretty oversubscribed (mostly soft circles at this point), we were able to squeeze you in 😂 Of course, if you know another investor who would be a value-add reader, please forward this email along!

This week we fell down the “Data Driven VC” rabbit hole. Sure, sure, lots of funds throw this term around… kind of like “thesis driven” or “high conviction.” In this case, we’re looking specifically at funds that use data at each venture stage (sourcing, picking, winning, and supporting).

Turns out, there are not that many... like maybe fewer than 25. Plus, the historic coverage has been sparse (great job Bartosz for effectively dominating this research to date). We’ve compiled a list of the ones we found. Let’s double click on three. First, the fund that started the category - Correlation. Second, the leading fund in the category - SignalFire. Third, my personal favorite and the most thoughtful - Connectic.

Perhaps one of the earliest VCs to lean into the data-driven narrative was Correlation, founded in 2006 by David Coats and Trevor Kienzle. Their unique specialty is data-driven co-investment alongside leading VCs. They realized early that capital is a commodity and once Founders have picked a lead, they really just want to close the round up and get back to work. Sort of reminds us of venture debt (which we should probably cover in the future).

Anyways, their process is simple, which makes it highly scalable. In fact, they just crossed 300 investments across just two funds ($365m raised), with a team of 15!

According to their website, “we’ve spent years creating one of the world’s most complete databases of venture capital financings, covering nearly all U.S. venture investments over the last 30 years. We track everything from financing details, investors, board members and management experience to industry segments, business stages and exits. The data comes from many sources, including commercially available sources, SEC and public sources, and data our team has scraped from the web.”

The self-described “most quantitative fund in the world” and “the first VC with a demo,” SignalFire actually didn’t start out in VC. In 2013, Chris Farmer started a data-driven recruiting company that transitioned into VC in 2015. Seven years later, SignalFire has raised nearly $1b over six funds, manages $2.9b in AUM, and has a team of 45 (a third of whom are engineers). Impressive to be sure, but they are best known for spending $10m a year on Beacon, their “Bloomberg terminal for the startup industry.”

According to SignalFire’s website, Beacon is tracking two million data sources and 500 billion data points! Now the 15-person eng team is starting to make sense. Beacon does a lot of things. First, there’s Beacon Talent, an AI-based recruiting platform that tracks the world’s top engineers, data scientists, product managers, designers, and business leaders. Second, there’s business intelligence, including market trends, competitor benchmarking, and pricing analysis. Third, there’s connectivity to 85+ advisors (who also happen to be LPs).

Googling around and listening to some interviews, I’ve found Beacon is ingesting: academic publications, patent registries, open-source contributions, regulatory filings, company webpages, sales data, App Store rankings, social networks, product crowdfunding sites (Indiegogo), tech communities (Product Hunt), angel group platforms (Gust, Proseeder), expert networks (GLG, Coleman Research Group, Guidepoint), highly gated professional networks (Voray), buyer review sites (Capterra, GetApp, G2Crowd, SoftwareAdvice, TrustRadius), technographics vendors (Datanyze, HG Data), family office coinvestor networks (Sharenett), crowdfunding sites (AngelList, FundersClub, OurCrowd, Republic), and even raw credit card data.

A snapshot of Beacon from a video on SignalFire’s website

One major difference between SignalFire and some VCs like Correlation is they aren’t trying to remove people from the investment process. Farmer puts it best, “We’re a hybrid system that still includes venture capitalists to make the final decisions and balance quantitative inputs (e.g. performance metrics) with qualitative ones (the vision or grit and determination of the founder/team). Great systems are not enough; you need top-quality investors and experts as well, and the combination of the two is optimal— augmented intelligence vs AI”.

Speaking of the investment process, let’s consider how Beacon helps SignalFire at each stage: Source, Pick, Win, and Support.

  1. Source - Beacon finds companies when and where other VCs aren’t looking, allowing SignalFire to pre-empt financings.

  2. Pick - Beacon can pre-screen company performance, reduce inherent human biases, eliminate wasteful meetings, zero in on targets quickly, and solicit fast feedback from across the advisor network.

  3. Win - Beacon demo. Just showing Founders what Beacon can do is likely sufficient to win rounds. According to SignalFire, 85% of Founders rank them as the most valuable investor on their cap table.

  4. Support - All the things we’ve discussed: hiring and market intelligence.

Ok, Connectic is a fun one. There’s a team of five mad scientists in Covington, Kentucky (population 40k, just south of Cincinnati) who are bringing “Moneyball to VC” à la sabermetrics. Since 2015, they’ve been training an AI bot, Wendal, who takes all company pitches and does the screening! They call it Foundernomics, and they’ve built the fund around two (profound in my opinion) insights.

First, proprietary deal flow is drying up. Between teams like SignalFire and all the business intelligence tooling available, assessing companies on their metrics is becoming commoditized. Instead, Connectic is building models and training Wendal to focus on the people building the company. This has led them to invest in women and minority Founders at 8x the industry rate! And if you’re wondering how this really scales, consider they are up to 100 portfolio companies with 33% minority, 31% female, and 8% LGBTQI+.

Second, they invest to get “on base” instead of swinging for the fences (just like in Moneyball). They believe that if you take care of the downside, the upside will take care of itself. This is a stark contrast to the majority of VCs, but the “math shakes out” because they’re investing out of a $25m fund at sub-$10m valuations, which, yes, you guessed it, means they are likely avoiding SV, Boston, and NYC companies.
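For the spreadsheet-inclined, here is a back-of-the-envelope sketch of why entry price matters so much for an “on base” strategy. Only the $10m entry valuation comes from the paragraph above. The $50m exit, the 50% dilution, the $40m comparison valuation, and the assumption that half the capital goes to zero are all made-up numbers for illustration, and the function names are ours. This is not Connectic’s actual model or track record.

```python
def gross_multiple(entry_valuation_m, exit_value_m, dilution=0.0):
    """Gross multiple on one check, before fees and carry.

    Buying in at `entry_valuation_m` (post-money, in $m) and exiting at
    `exit_value_m` returns exit / entry, scaled down by the fraction of
    the stake lost to dilution in later rounds.
    """
    if entry_valuation_m <= 0:
        raise ValueError("entry valuation must be positive")
    if not 0.0 <= dilution < 1.0:
        raise ValueError("dilution must be in [0, 1)")
    return exit_value_m / entry_valuation_m * (1.0 - dilution)


def portfolio_multiple(outcomes):
    """Capital-weighted gross multiple for a whole fund.

    `outcomes` is a list of (share_of_capital, multiple) pairs whose
    shares must sum to 1.
    """
    total_share = sum(share for share, _ in outcomes)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("capital shares must sum to 1")
    return sum(share * multiple for share, multiple in outcomes)


# Every input below is hypothetical except the $10m entry valuation.
EXIT_VALUE_M = 50.0  # a modest "single", not a unicorn
DILUTION = 0.5       # assumed share of the stake lost to later rounds

cheap_entry = gross_multiple(10.0, EXIT_VALUE_M, DILUTION)
pricey_entry = gross_multiple(40.0, EXIT_VALUE_M, DILUTION)

# Assume half the capital goes to zero and half lands that same modest exit.
on_base_fund = portfolio_multiple([(0.5, 0.0), (0.5, cheap_entry)])
pricey_fund = portfolio_multiple([(0.5, 0.0), (0.5, pricey_entry)])

print(f"$50m exit from a $10m entry: {cheap_entry:.2f}x")
print(f"$50m exit from a $40m entry: {pricey_entry:.2f}x")
print(f"fund of singles at $10m entries: {on_base_fund:.2f}x")
print(f"fund of singles at $40m entries: {pricey_fund:.2f}x")
```

Under these invented inputs, the same modest exit more than returns the check at a $10m entry and loses money at a $40m entry, which is the intuition behind buying cheap and aiming for singles. Swap in your own assumptions before drawing any conclusions.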

I had a lot of fun researching Connectic and I highly recommend spending some time clicking around on some of their dashboards, like this one.

Sooo, this all sounds pretty exciting and powerful, huh? Should all funds be data-driven? When vcGPT?

Well, first off, VC funds can’t just become data-driven. Sure, you can pull some tools off the shelf like Harmonic or Specter to boost sourcing. Maybe you even hire some MIT grads. But really, being data-driven starts Day One with the Founders of the fund integrating data into the DNA early and often. It looks like hiring data engineers, building (expensive) proprietary models, and spending years training them.

Speaking of models, the proliferation of cheaper, better, faster LLMs is accelerating. I’m not sure what the Moore’s Law equivalent for model development is (maybe it’s still Moore’s Law), but it’s clear to us that the data-driven funds of today have a massive competitive advantage over generalist funds. Not only do they have the data plumbing in place to actually leverage these models, they also have the skills Founders of new AI companies will need to navigate a new, AI-driven world.

Could we see a world where every fund is building and training a proprietary bot? Does Benchmark evolve into an R&D lab for GurleyBot? Does AndreessenBot eat venture? Let us know what you think and which funds we should know about, and we’ll add them to our tracker.