Commit 83c7c9a8 authored by Embruch, Gerd

changed readme to be more descriptive

parent 46531b29
# Purpose
This repository lets you compare the performance of LLMs in RAG systems.
Imagine a professor-student situation:
1. The script fetches RAG files and chunks them.
1. The professor (LLM1) generates a set of questions for each chunk.
1. The student (LLM2) answers these questions.
1. The professor then evaluates the answers using continuous, comparable metrics such as hallucination and correctness.
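
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the names `chunk_text`, `run_exam`, and `Evaluation`, the fixed-size character chunking, and the three callables standing in for the LLM calls are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Evaluation:
    """One graded question/answer pair (hypothetical record type)."""
    chunk: str
    question: str
    answer: str
    score: float


def chunk_text(text: str, chunk_size: int) -> List[str]:
    """Split text into fixed-size character chunks (a deliberately naive strategy)."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def run_exam(
    text: str,
    chunk_size: int,
    professor_ask: Callable[[str], List[str]],
    student_answer: Callable[[str], str],
    professor_grade: Callable[[str, str, str], float],
) -> List[Evaluation]:
    """Run the professor-student loop over every chunk of the text."""
    results = []
    for chunk in chunk_text(text, chunk_size):
        # Step 2: the professor (LLM1) writes questions for this chunk.
        for question in professor_ask(chunk):
            # Step 3: the student (LLM2) answers the question.
            answer = student_answer(question)
            # Step 4: the professor grades the answer against the chunk.
            score = professor_grade(chunk, question, answer)
            results.append(Evaluation(chunk, question, answer, score))
    return results


# Stand-ins for real LLM calls, used only to show the data flow.
def fake_professor_ask(chunk: str) -> List[str]:
    return [f"What does this passage say: {chunk!r}?"]


def fake_student_answer(question: str) -> str:
    return "I do not know."


def fake_professor_grade(chunk: str, question: str, answer: str) -> float:
    return 0.0


exam_results = run_exam(
    "abcdef", 3, fake_professor_ask, fake_student_answer, fake_professor_grade
)
```

Passing the LLM calls in as plain callables keeps the loop independent of any particular provider, which would also make it easier to swap the student model later.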
The result is detailed information on the quality of each answer, plus an overview of the LLM's metrics as a whole (i.e. ndcg@2 and precision@2).
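
For orientation, the two summary metrics can be computed along these lines. This is a sketch under assumptions, not necessarily how the evaluation library used here calculates them: relevance is given as a ranked list of numeric labels, the gain is linear, and the ideal ranking is derived from the same list. The function names are chosen for this example.

```python
import math
from typing import Sequence


def precision_at_k(relevance: Sequence[float], k: int) -> float:
    """Fraction of the top-k ranked items that are relevant (label > 0)."""
    return sum(1 for rel in relevance[:k] if rel > 0) / k


def dcg_at_k(relevance: Sequence[float], k: int) -> float:
    """Discounted cumulative gain with linear gain and a log2 rank discount."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevance[:k]))


def ndcg_at_k(relevance: Sequence[float], k: int) -> float:
    """DCG normalised by the DCG of the best possible ordering of the same list."""
    ideal = dcg_at_k(sorted(relevance, reverse=True), k)
    return dcg_at_k(relevance, k) / ideal if ideal > 0 else 0.0


# Ranked relevance labels for three retrieved chunks: the first is irrelevant.
ranked = [0, 1, 1]
p2 = precision_at_k(ranked, 2)
n2 = ndcg_at_k(ranked, 2)
```

With `ranked = [0, 1, 1]`, only one of the top two chunks is relevant, so precision@2 is 0.5, and ndcg@2 is lower than 1 because a relevant chunk was ranked below an irrelevant one.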
# Prerequisites
- python3 installed
- [openAI API Key](https://auth.openai.com/)
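
One way to make the API key available to a script is sketched below. It assumes the key is stored in the environment variable `OPENAI_API_KEY` and falls back to a hidden prompt when that variable is unset. The helper name `get_openai_api_key` is hypothetical, and the repository's own handling may differ.

```python
import os
from getpass import getpass


def get_openai_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, prompting for it if missing."""
    key = os.environ.get(env_var)
    if not key:
        # getpass hides the typed input so the key does not appear on screen.
        key = getpass(f"Enter {env_var}: ")
        # Store it for the rest of this process only.
        os.environ[env_var] = key
    return key
```

If the key lives in a `.env` file, calling `load_dotenv()` from python-dotenv before this helper would populate the environment first.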
@@ -23,3 +35,8 @@ access webservice via browser, i.e.
# Sources
- [YT: RAG Time! Evaluate RAG with LLM Evals and Benchmarking](https://www.youtube.com/watch?v=LrMguHcbpO8)
- [Phoenix Docs](https://docs.arize.com/phoenix)
# Planned
- [ ] ability to choose student LLM
- [ ] option to use local LLM via ollama
\ No newline at end of file
@@ -9,8 +9,6 @@ from operator import length_hint
from dotenv import load_dotenv
# getpass enables secure password input
from getpass import getpass
# creating temporary files
import tempfile
# download files containing the domain specific information
from urllib.request import urlparse, urlretrieve
# pandas handles table data
......