Confident AI
All Stories
The Definitive Guide to Synthetic Data Generation Using LLMs
In this article, I'll show you everything you need to know about generating realistic synthetic datasets using LLMs.
Kritin Vongthongsri
How to Build an LLM Evaluation Framework, from Scratch
In this article, you're going to learn how to build the world's most robust and scalable LLM evaluation framework.
Jeffrey Ip
LLM Benchmarks: MMLU, HellaSwag, BBH, and Beyond
In this article, I'm going to go through all the top LLM benchmarks currently used and why they matter.
Kritin Vongthongsri
LLM Testing in 2024: Top Methods and Strategies
In this article, we'll cover everything there is to know about LLM testing, including best practices and methods for testing LLMs.
Jeffrey Ip
The Ultimate Guide to Fine-Tune LLaMA 3, With LLM Evaluations
In this article, we'll walk through how to fine-tune and evaluate a LLaMA-2 model using Hugging Face and DeepEval.
Jeffrey Ip
RAG Evaluation: The Definitive Guide to Unit Testing RAG in CI/CD
In this tutorial, we'll walk through how to set up a full testing suite for RAG applications using DeepEval.
Jeffrey Ip
LLM Evaluation Metrics: Everything You Need for LLM Evaluation
In this article, I'll walk through everything you need to know about LLM evaluation metrics, with code samples.
Jeffrey Ip
The Definitive Guide to LLM Benchmarking
In this article, I'll show how benchmarking can help you choose the right LLM for your use case.
Jeffrey Ip
A Step-By-Step Guide to Evaluating an LLM Text Summarization Task
In this article, I'll teach you how to create your own text summarization metric.
Jeffrey Ip
Why OpenAI Assistants is a Big Win for LLM Evaluation
In this article, I'll share how JudgmentalGPT, our in-house evaluator, was built using OpenAI's Assistants.
Jeffrey Ip
Become a Prompt Artist: Understanding the Midjourney LLM
In this interactive tutorial, I'll show you how to become a Midjournalist and create the images you imagine.
Jeffrey Ip
How to Evaluate LLM Applications: The Complete Guide
In this article, we will demystify how to evaluate an LLM application or RAG pipeline the right way.
Jeffrey Ip
Why we replaced Pinecone with PGVector
Do you really need a dedicated vector database for your Generative AI application? Our experience says not always.
Jeffrey Ip
What is Retrieval Augmented Generation (RAG)?
In this article, we're going to dive deep into the RAG rabbit hole.
Jeffrey Ip
A Gentle Introduction to LLM Evaluation
In this article, we'll introduce the ways in which you can carry out automated LLM evaluation.
Jeffrey Ip
How to build a PDF QA chatbot using OpenAI and ChromaDB
In this article, you'll learn how to build a RAG-based chatbot on your PDFs using OpenAI and ChromaDB.
Jeffrey Ip
Building a customer support chatbot using GPT-3.5 and LlamaIndex
In this article, you'll learn how to create a customer support chatbot using GPT-3.5 and LlamaIndex.
Jeffrey Ip
Generating synthetic data with LLMs - Part 1
LLMs make synthetic data easy to leverage, but how exactly can we make this generated data relevant and useful?
Jeffrey Ip
Latest articles
May 12, 2024
The Definitive Guide to Synthetic Data Generation Using LLMs
May 11, 2024
How to Build an LLM Evaluation Framework, from Scratch
May 11, 2024
LLM Benchmarks: MMLU, HellaSwag, BBH, and Beyond
Apr 14, 2024
LLM Testing in 2024: Top Methods and Strategies
Apr 19, 2024
The Ultimate Guide to Fine-Tune LLaMA 3, With LLM Evaluations
Apr 14, 2024
RAG Evaluation: The Definitive Guide to Unit Testing RAG in CI/CD