changed readme to be more descriptive

Commit 83c7c9a8 authored Apr 25, 2024 by Embruch, Gerd
--- a/README.md
+++ b/README.md
+# Purposes
+This repository lets you compare the performance of LLMs in RAG systems.
+Imagine a professor-student situation:
+1. The script fetches the RAG files and splits them into chunks.
+1. The professor (LLM1) creates a set of questions per chunk.
+1. The student (LLM2) answers these questions.
+1. The professor then evaluates the answers using continuous, comparable metrics such as hallucination and correctness.
+The result is detailed information on the quality of each answer, plus an overview of metrics for the LLM as a whole (i.e. ndcg@2 & precision@2).
 # Prerequisites
 - python3 installed
 - [openAI API Key](https://auth.openai.com/)
@@ -23,3 +35,8 @@ access webservice via browser, i.e.
 # Sources
 - [YT: RAG Time! Evaluate RAG with LLM Evals and Benchmarking](https://www.youtube.com/watch?v=LrMguHcbpO8)
 - [Phoenix Docs](https://docs.arize.com/phoenix)
+# Planned
+- [ ] ability to choose student LLM
+- [ ] option to use local LLM via ollama
\ No newline at end of file
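Note on the metrics named in the new README text: the commit does not show how ndcg@2 and precision@2 are computed, so the following is only a minimal sketch of the textbook definitions, not the code this repository or Phoenix uses. The function names and the `relevance` list (one relevance label per retrieved chunk, in ranked order) are hypothetical.

```python
import math


def precision_at_k(relevance, k):
    """Fraction of the top-k retrieved chunks labelled relevant (label > 0)."""
    return sum(1 for rel in relevance[:k] if rel > 0) / k


def dcg_at_k(relevance, k):
    """Discounted cumulative gain: each label is discounted by log2(rank + 1)."""
    # enumerate() is 0-based, so rank 1 maps to log2(2), rank 2 to log2(3), ...
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevance[:k]))


def ndcg_at_k(relevance, k):
    """DCG normalised by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevance, reverse=True), k)
    return dcg_at_k(relevance, k) / ideal if ideal > 0 else 0.0


# Hypothetical labels for three retrieved chunks: the first is irrelevant,
# the second and third are relevant.
relevance = [0, 1, 1]
p2 = precision_at_k(relevance, 2)
n2 = ndcg_at_k(relevance, 2)
```

With these labels only one of the top two chunks is relevant, so precision@2 is 0.5, while ndcg@2 is lower still because the relevant chunk sits at rank 2 rather than rank 1.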
--- a/evaluateRAG.py
+++ b/evaluateRAG.py
@@ -9,8 +9,6 @@ from operator import length_hint
 from dotenv import load_dotenv
 # getpass enables secure password input
 from getpass import getpass
-# creating temporary files
-import tempfile
 # download files containing the domain specific information
 from urllib.request import urlparse, urlretrieve
 # pandas handles table data