How I raised Confident AI's $2.2M seed round in 5 days

Announcing Confident AI's seed round, with participation from a bunch of great investors.

How I Built Deterministic LLM Evaluation Metrics for DeepEval

In this article, I'm sharing how I've built DeepEval's latest deterministic, LLM-powered, custom metric.

LLM Agent Evaluation: Assessing Tool Use, Task Completion, Agentic Reasoning, and More

In this article, I'll share the principles of LLM agent evaluation and show you how to do it using DeepEval.

Kritin Vongthongsri

LLM Guardrails for Data Leakage, Prompt Injection, and More

In this article, you'll learn everything you need to know about LLM guardrails and how to use them for LLM security.

OWASP Top 10 2025 for LLM Applications: What’s new? Risks, and Mitigation Techniques

In this article, we'll go through what the OWASP Top 10 is, as well as what's new in their latest 2025 guidelines.

Kritin Vongthongsri

The People's Choice of Top LLM Evaluation Tools in 2025

In this article, we'll bring you a hand-picked, carefully curated list of the top LLM evaluation tools on the market.

The Comprehensive LLM Safety Guide: Navigate AI regulations and Best Practices for LLM Safety

In this article, you'll learn about LLM regulations and how to maintain the safety of your LLM applications.

Kritin Vongthongsri

How to Jailbreak LLMs One Step at a Time: Top Techniques and Strategies

In this article, I'll show you how to jailbreak your LLM application to test it for vulnerabilities.

Kritin Vongthongsri

What is LLM Observability? - The Ultimate LLM Observability Guide

In this article, I'll share what you should definitely look for in your next LLM Observability solution.

Kritin Vongthongsri

LLM Chatbot Evaluation Explained: Top Metrics and Testing Techniques

In this article, I'll share how to evaluate LLM chatbots using the latest LLM conversational metrics.

Leveraging LLM-as-a-Judge for Automated and Scalable Evaluation

In this article, I'll demystify what LLM judges are and explain why they are the best option for LLM evaluation.

Evaluating LLM Systems: Essential Metrics, Benchmarks, and Best Practices

In this article, you'll learn how to evaluate LLM systems using LLM evaluation metrics and benchmark datasets.

Using LLMs for Synthetic Data Generation: The Definitive Guide

In this article, I'll show you everything you need to know about generating realistic synthetic datasets using LLMs.

Kritin Vongthongsri

How to Build an LLM Evaluation Framework, from Scratch

In this article, you're going to learn how to build the world's most robust and scalable LLM evaluation framework.

The Definitive LLM Security Guide: OWASP Top 10 2025, Safety Risks and How to Detect Them

In this article, I'll go through all the major pillars of LLM security you must know and how to mitigate the risks they cover.

Kritin Vongthongsri

Red Teaming LLMs: The Ultimate Step-by-Step LLM Red Teaming Guide

In this article, you'll learn about LLM red teaming and how it can be carried out using DeepTeam.

Kritin Vongthongsri

Latest articles

The Ultimate Guide to Fine-Tune LLaMA 3, With LLM Evaluations

RAG Evaluation: The Definitive Guide to Unit Testing RAG in CI/CD

LLM Evaluation Metrics: The Ultimate LLM Evaluation Guide

An Introduction to LLM Benchmarking

A Step-By-Step Guide to Evaluating an LLM Text Summarization Task

Why OpenAI Assistants is a Big Win for LLM Evaluation
