Exam Question Generator

The Exam Question Generator is a program that uses the Retrieval-Augmented Generation (RAG) pattern to turn any set of documents (PDFs, images, or text files) into a knowledge base. The knowledge base is used either to answer questions about the texts or to generate exam questions with answers. The bot is instructed to act as a high school teacher, so students can use the application to generate exam questions to study from.

The program is based on the following technologies:

LangChain methods to extract the texts
FAISS Vector store
OpenAI ChatGPT 3.5 Turbo
User interface in Streamlit

The system usage is visualized in the following figure:

Figure 1: RAG system modules
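
The retrieve-then-generate flow can be illustrated with a small, library-free sketch. It is not the project's implementation: the bag-of-words `embed` function is a toy stand-in for the OpenAI embedding model, the sorted list stands in for the FAISS vector store, and all function names are hypothetical. The resulting prompt is what would be sent to the chat model.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query (the retrieval step)."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Combine the retrieved chunks and the request into one prompt (the augmentation step)."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Use only this context:\n{joined}\n\nRequest: {query}"


chunks = [
    "Vienna was besieged by Ottoman forces in 1529.",
    "The Congress of Vienna took place in 1814 and 1815.",
    "Tesseract is an OCR engine.",
]
top = retrieve("When was the Congress of Vienna?", chunks, k=1)
prompt = build_prompt("When was the Congress of Vienna?", top)
```

In the real application the similarity search runs over dense embedding vectors, but the shape of the flow (embed, rank, build prompt, generate) is the same.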

A document such as the history of Vienna can be loaded into the Exam Question Generator:

Figure 2: Source document

The chatbot uses a customized chat template that lets it act as a teacher who either answers a question or generates exam questions. A typical answer looks like the following:

Figure 3: A typical chat conversation
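
As a rough illustration of what such a chat template might look like, the sketch below fills a plain Python string with the retrieved context and the student's request. The wording of the template and the names `TEACHER_TEMPLATE` and `render_teacher_prompt` are assumptions made for this example; the template actually used in main.py may differ.

```python
# Hypothetical template text; the project's real template may be worded differently.
TEACHER_TEMPLATE = """You are a high school teacher.
Use only the context below. If the student asks a question, answer it.
If the student asks for exam questions, write each question followed by its answer.

Context:
{context}

Student request: {request}"""


def render_teacher_prompt(request: str, context_chunks: list[str]) -> str:
    """Fill the template with the retrieved text chunks and the student's request."""
    context = "\n\n".join(chunk.strip() for chunk in context_chunks)
    return TEACHER_TEMPLATE.format(context=context, request=request.strip())


prompt = render_teacher_prompt(
    "Generate two exam questions about the Congress of Vienna.",
    ["The Congress of Vienna took place in 1814 and 1815."],
)
```

Keeping the role instruction, the context, and the request in separate, labeled parts of the prompt makes it easier to adjust the teacher persona without touching the retrieval code.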

Setup

Create a Python environment with the following modules:

conda create --name examextractor python=3.11
conda activate examextractor
conda install -c conda-forge poppler
conda install -c conda-forge tesseract
pip install -r requirements.txt

The libraries Poppler and Tesseract are used to recognize words in images and PDFs.

Rename .env.example to .env and enter your OpenAI API key and the Tesseract folder where the language files are stored. If Tesseract was installed with conda on Windows, the folder can be found here: C:/Users/[USERNAME]/.conda/envs/examextractor/share/tessdata
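
To check that the .env file is filled in before starting the program, a small stdlib-only sketch like the one below can help. The variable names `OPENAI_API_KEY` and `TESSDATA_PREFIX` are assumptions (they are the names the OpenAI client and Tesseract conventionally read); use whatever names appear in .env.example. The helper functions are hypothetical and not part of the project, which may instead rely on a package such as python-dotenv.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse KEY=VALUE lines from a .env-style file, skipping blanks and comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_settings(
    values: dict[str, str],
    required: tuple[str, ...] = ("OPENAI_API_KEY", "TESSDATA_PREFIX"),  # assumed names
) -> list[str]:
    """Return required names found neither in the file nor in the process environment."""
    return [name for name in required if not values.get(name) and not os.environ.get(name)]
```

Calling `missing_settings(load_env_file())` from the project folder returns an empty list when both settings are present.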

How to run

Put your documents (PDF, text, images) into the ./resources folder. These documents will be the basis for the exam questions.

Start the program. The entry point main.py accepts two optional parameters:

-l: if set, the program loads the previously stored database instead of creating a new database from the documents in the ./resources folder
-q: used when debugging to set an initial query that triggers an answer from the chatbot
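
The two parameters could be parsed with `argparse` roughly as follows. This is a sketch under assumptions, not the code in main.py: only the flags `-l` and `-q` come from the description above, while the function name `parse_args` and the attribute names `load_db` and `query` are made up for illustration.

```python
import argparse


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the two command-line parameters described above."""
    parser = argparse.ArgumentParser(description="Exam Question Generator")
    parser.add_argument(
        "-l",
        dest="load_db",  # hypothetical attribute name
        action="store_true",
        help="load the previously stored database instead of rebuilding it",
    )
    parser.add_argument(
        "-q",
        dest="query",  # hypothetical attribute name
        default=None,
        help="initial query for debugging",
    )
    return parser.parse_args(argv)


args = parse_args(["-l", "-q", "Generate three exam questions"])
```

When the script is started through Streamlit, arguments meant for the script itself are placed after a `--` separator, as in streamlit run ./main.py -- -l.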

To run, type the following:

streamlit run ./main.py

The database will be recreated from the documents in ./resources.

To use an already stored database instead of reading new documents, run the following in the console:

streamlit run ./main.py -- -l

Models

By default, the FAISS vector store uses the embedding model EMBEDDINGSMODEL = "text-embedding-3-large".

The chatbot uses the model AIMODEL = "gpt-3.5-turbo".

Sources

Guides for creating the chatbot

Document sources in samples

History of Vienna, Wikipedia, https://en.wikipedia.org/wiki/History_of_Vienna, accessed on 2024-06-01
