SpiceJack Documentation

Installation

pip install spicejack

Usage

Creating a Processor object

Currently, SpiceJack only supports PDF files. Support for more file types is planned; create an issue to request one.

To use SpiceJack, first import the processor:

from spicejack.pdf import PDFprocessor

Then create a processor:

processor = PDFprocessor(
    filepath,
    filters,
    use_legitimate,
    model
)

filepath

Path of the PDF file.

filters

List of extra custom filters. See Custom filters.

use_legitimate

Whether to use the official OpenAI API.

model

Model to use for generation.

Running the processor

processor.run(
    thread,
    process,
    logging,
    autosave
)

thread

Whether to run the processor in a child thread.

process

Whether to run the processor in a child process.

logging

Whether to print the JSON responses from the LLM.

autosave

Whether to save the result to result.json every time a sentence is parsed.

processor.run also returns the result.

Finishing touches

Now you can save the result to a file.

processor.save(
    jsonpath
)

jsonpath

Path of the JSON file to save the result to.
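Once the result is saved, it can be read back with Python's standard json module. The helper below is a minimal sketch: load_result is a hypothetical name, not part of SpiceJack, and it makes no assumption about the structure of the saved data beyond it being valid JSON.

```python
import json
from pathlib import Path


def load_result(jsonpath):
    """Read a saved result file and return the parsed JSON value.

    Hypothetical helper for illustration; not part of SpiceJack.
    """
    with Path(jsonpath).open(encoding="utf-8") as f:
        return json.load(f)


# Example call, assuming processor.save("result.json") was run earlier:
# data = load_result("result.json")
```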

Custom filters

SpiceJack reads the PDF file, cleans the text with a few filters, and splits it into sentences. It then uses an LLM to convert the sentences into JSON questions and answers.

You can add your own filters by passing a list of functions in the filters argument:
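The preprocessing steps described above can be sketched in plain Python. This is an illustration only, not SpiceJack's actual code: apply_filters, split_sentences, and preprocess are hypothetical names, the sentence splitter is deliberately naive, and the LLM step is left out.

```python
import re


def apply_filters(chunks, filters):
    """Run each filter in order; a filter maps a list of strings to a list of strings."""
    for f in filters:
        chunks = f(chunks)
    return chunks


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def preprocess(chunks, filters):
    """Clean the extracted text chunks, then flatten them into sentences."""
    sentences = []
    for chunk in apply_filters(chunks, filters):
        sentences.extend(split_sentences(chunk))
    return sentences
```

Each resulting sentence would then be sent to the LLM to produce a question and answer.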

from spicejack.pdf import PDFprocessor

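# A filter takes a list of strings and returns a new list of strings.
# The parameter is named items so it does not shadow the built-in list type.
def filter1(items):
    return [
        i.replace(" percent", "%")
        for i in items
    ]

processor = PDFprocessor(
    filepath,
    filters=[filter1],
)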
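A couple of further filters, as a sketch. It assumes, based on the example above, that a filter is called with a single list of strings and returns a list of strings; whether SpiceJack allows a filter to drop items is also an assumption. The names collapse_whitespace and min_length are hypothetical. Because a filter receives only one argument, any setting it needs (here a length limit) is bound through a closure.

```python
import re


def collapse_whitespace(items):
    """Replace runs of whitespace with one space and trim each string."""
    return [re.sub(r"\s+", " ", item).strip() for item in items]


def min_length(limit):
    """Build a filter that drops strings shorter than `limit` characters."""
    def _filter(items):
        return [item for item in items if len(item) >= limit]
    return _filter


# A list in the shape expected by the filters argument (assumed).
filters = [collapse_whitespace, min_length(20)]
```

The list can then be passed as filters=filters when creating the processor.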
