Jeffrey Ip
Cofounder @ Confident, creator of DeepEval. Ex-Googler (YouTube), Microsoft AI (Office365). Working overtime to enforce responsible AI.

How I raised Confident AI's $2.2M seed round in 5 days

March 20, 2025
·
8 min read

Today I’m proud to announce Confident AI’s oversubscribed $2.2M seed round, with participation from Y Combinator, Flex Capital, Oliver Jung, Vermilion Cliffs Ventures, Liquid 2 Ventures, January Capital, and Rebel Fund. My cofounder Kritin and I couldn’t be more grateful for our investors’ trust in our open-source approach to LLM evaluation.

But this article isn’t about money, nor LLM evaluation. It’s about my first experience fundraising as a first-time founder.

How It Started

As you may know, we’ve been building DeepEval for the past year and have since grown it into one of the most adopted, if not the most adopted, LLM evaluation frameworks in the world. DeepEval is used at enterprises such as BCG, AstraZeneca, Stellantis, and Mercedes-Benz (what are LLMs doing in cars?), to name a few, and what started off as a bootstrapped open-source package became the thing that got us into YC.

DeepEval's unwavering growth

However, building in this space is tough. There are really good open-source products out there that do similar things, and even more well-funded incumbents are pivoting to take what we pioneered.

These well-funded incumbents, which I’m not going to name, are mainly Series A–C companies that were traditionally focused on ML observability and experiment tracking and have decided that LLMOps is the direction they wish to head in. It is not uncommon to find them running >$50K USD of ads a month across Twitter, LinkedIn, and Google search, with sales and marketing teams of 10+ people, in an attempt to drown out smaller startups that might not have the capital to match their paid distribution.

Luckily, our YC interviewer at the time, Tom Blomfield (founder of Monzo), saw an opportunity with us and decided to take us in after seeing how we were still alive and outgrowing everyone else despite having bootstrapped for so long.

And so this was the beginning of how we applied to YC with Confident, got accepted, and closed our seed round in 5 days.

The YC Strategy

The PR team told me not to reveal too much here (they did not), so I'm going to keep this part short.

It’s no secret that when you’re a first-time founder outside of YC, the chances of raising venture capital are slim. But when you’re in YC, you’ll get 30+ inbound calls scheduled for demo day without breaking a sweat (i.e. if you’re a founder reading this, you should definitely apply, and in case you’re wondering, no, the PR team did not ask me to write this either). Another reason to leverage YC is that it saves you a lot of time taking low-intent investor calls that won’t come to fruition.

And so, by the beginning of March, for the week leading up to demo day (which was on March 12th), I already had 55 calls (to be exact) booked.

The goal was simple: close the round ASAP, don’t bite off more than what you can chew, and get back to work immediately.

Confident AI: The DeepEval LLM Evaluation Platform

The leading platform to evaluate and test LLM applications on the cloud, native to DeepEval.

Regression test and evaluate LLM apps.
Easily A/B test prompts and models.
Edit and manage datasets on the cloud.
LLM observability with online evals.
Publicly sharable testing reports.
Automated human feedback collection.

Got Red? Safeguard LLM Systems Today with Confident AI

The leading platform to red-team LLM applications for your organization, powered by DeepTeam.

Tailored frameworks (e.g. OWASP Top 10)
10+ LLM guardrails to guard malicious I/O
40+ plug-and-play vulnerabilities and 10+ attacks
Guardrails accuracy and latency reporting
Publicly sharable risk assessments.
On-demand custom guards available.

The Kickoff

My first investor call was on a Sunday night PST, March 2nd. That night I had three calls, which was not what I had originally planned, but they were really promising investors that I wanted to meet even before our first official investor call scheduled for 9am the next morning.

The first three calls on Sunday night went great: people really liked DeepEval and could definitely see potential in what we were building, especially after seeing such positive feedback from our community (Discord, GitHub, social media, etc., thanks!).

I closed $500K on the first day. Easy peasy, I thought to myself, we’ll be done by tomorrow and break all records in the batch. (Spoiler alert: this did not happen.)

The Struggle

Monday was, to put it lightly, not as good. I took 20 calls, most of which were back-to-back, and by the end of the day I was rapping my introduction like I only had one shot and could not miss a chance to blow.

We got 0 “yes”s on Monday, and some investors who had originally promised to get back to us the next day even said “no” that very night. Yikes, I guess it was an easy decision for them. For those familiar with PG's graphs, you'll notice we were in the trough of sorrow.

However, there was one fund that wanted a follow-up call with the entire team (i.e. me and Kritin), so technically the day didn’t end with us going home empty-handed. The next day (Tuesday) we booked a nice conference room at New Montgomery in downtown San Francisco, took the follow-up call, and managed to convince the partners on the call to write us a check. I credit this to the $100/hour conference room.

The Turning Point

If you read between the lines, Tuesday mostly went the same way as Monday. There were 0 “yes”s apart from that one interested fund, and at this point I had already burnt through half my calls. In other words, if I closed nothing over the remaining week, our investor pipeline would start drying up, fundraising would drag on indefinitely, and our customers and users would ultimately pay the price for the slowdown in our product development (not good).

I took all the calls on Wednesday as usual, a bit deflated, reciting the usual pitch with a few twists and turns, but right as I was about to end the day, we got our third “yes”.

This investor basically decided to write us a check before meeting us (I think). You see, when an investor looks at a startup, they look at your rate of growth. Our rate of growth is in fact still pretty good, just not as crazy as some companies in our batch that went from 0 to $500K ARR in a mere 8 weeks (at that growth rate, they would be IPO-ing by the end of the year).

So building DeepEval for a year to grow it into what it is today actually somewhat backfired with the investors who were less comfortable with open-source companies and would rather see explosive numbers from top-down sales. Ironically, this meant that what got us into YC in the first place became our biggest hurdle for this round.

But this investor lived and breathed open-source. Not only did they take the time to dig deep into what we were all about, they also greatly appreciated our dedication to serving as a thought leader in the LLM evaluation space, especially the content we've created to grow DeepEval.

This was the single most enjoyable call of the entire fundraising season. I felt our work was finally being properly acknowledged, that what we're working on can't be taken away by cheap ad spending, and that we fully own the value we've created for hundreds of thousands of users.

Oh and, I didn’t even have to show a single slide from our pitch deck.


Curtain Call

At this point, by Wednesday night, we had somehow already closed more than half the round, and the momentum I had lost on Monday started to come back.

The calls on Thursday were great: we immediately got a second follow-up call with another fund for Friday morning, and 30 minutes after it ended I got an email saying they would like to invest. At that point, we were 80% done, and I felt like I had become the king of fundraising (I’m not).

Every investor call felt like it was going to close, and in fact that’s what happened. Literally more than half of the investors we talked to on Friday said “yes”, and I had to put everyone on hold to pick who I wanted on our cap table. By 3pm on Friday, I had raised enough to close off our round, and even had to message one partner to ask whether it was possible to decrease their check size so we could give away less of our company. I also had to politely reject a lot of investors with whom I had great conversations, which only added fuel to my ego (which has since died down).

But there’s more: while it only took 5 days to close off our round and reach our target of $2M, we decided to go for one more fund that I felt would be very helpful for Confident AI (and we really liked them). In the end, they even treated us to a fine-dining restaurant, which made oversubscribing our round totally worth it.

So, if you dig deep into our paperwork, you’ll find that our last SAFE was actually signed on Monday the next week, and by demo day on March 12th, we already had all the money in the bank.

Our bank account by demo day

Acknowledgement

I’m super grateful to my cofounder, Kritin, all our amazing GPs and peers at YC, our current investors, and most importantly all our users that have made this an enjoyable experience.

What’s Next?

We’re going to double down on providing the best LLM evaluation for everyone. This means not just making DeepEval better as a product, but also putting resources into educating people moving into this space on best practices and how to practically evaluate LLMs, so the world can be a safer place for AI.

I want to take this opportunity to promote our latest open-source package, specifically built for red teaming (safety testing) LLM applications, which we named DeepTeam.

We named it DeepTeam because it was purpose-built within DeepEval’s ecosystem and designed to feel like you’re using DeepEval, but for red teaming (if you look at DeepTeam’s documentation, it should feel almost identical to DeepEval’s).

So why did we build it? Because while DeepEval is the LLM evaluation framework, we felt we could provide a much better experience for users who want to test for safety by breaking DeepEval’s red teaming features out into their own package, and so far, fortunately, this has proven true.

We’ve been working on it on the side for a few weeks now, and you can take a look at DeepTeam’s repo on GitHub: https://github.com/confident-ai/deepteam

So here’s my ask: If you’ve liked this story, don’t forget to give both DeepEval and DeepTeam a star ⭐ on GitHub, and as always, till next time.

* * * * *

Do you want to brainstorm how to evaluate your LLM (application)? Ask us anything in our Discord. I might give you an “aha!” moment, who knows?

