feat: Update imports and parsers in README.md (#156)
StanGirard authored Dec 3, 2024
1 parent 34f38a9 commit 33e0303
Showing 4 changed files with 17 additions and 17 deletions.
29 changes: 14 additions & 15 deletions README.md
@@ -41,26 +41,25 @@ pip install megaparse


```python
from megaparse.core.megaparse import MegaParse
from megaparse import MegaParse
from langchain_openai import ChatOpenAI
from megaparse.core.parser.unstructured_parser import UnstructuredParser
from megaparse.parser.unstructured_parser import UnstructuredParser

model = ChatOpenAI(model="gpt-4o", api_key=os.getenv("OPENAI_API_KEY")) # or any langchain compatible Chat Models
parser = UnstructuredParser(model=model)
parser = UnstructuredParser()
megaparse = MegaParse(parser)
response = megaparse.load("./test.pdf")
print(response)
megaparse.save("./test.md") #saves the last processed doc in md format
megaparse.save("./test.md")
```

### Use MegaParse Vision

* Change the parser to MegaParseVision

```python
from megaparse.core.megaparse import MegaParse
from megaparse import MegaParse
from langchain_openai import ChatOpenAI
from megaparse.core.parser.megaparse_vision import MegaParseVision
from megaparse.parser.megaparse_vision import MegaParseVision

model = ChatOpenAI(model="gpt-4o", api_key=os.getenv("OPENAI_API_KEY")) # type: ignore
parser = MegaParseVision(model=model)
@@ -79,9 +78,9 @@ megaparse.save("./test.md")
2. Change the parser to LlamaParser

```python
from megaparse.core.megaparse import MegaParse
from megaparse import MegaParse
from langchain_openai import ChatOpenAI
from megaparse.core.parser.llama import LlamaParser
from megaparse.parser.llama_parser import LlamaParser

parser = LlamaParser(api_key = os.getenv("LLAMA_CLOUD_API_KEY"))
megaparse = MegaParse(parser)
@@ -100,12 +99,12 @@ See localhost:8000/docs for more info on the different endpoints !
## BenchMark

<!---BENCHMARK-->
| Parser | similarity_ratio |
|---|---|
| megaparse_vision | 0.87 |
| unstructured_with_check_table | 0.77 |
| unstructured | 0.59 |
| llama_parser | 0.33 |
| Parser | similarity_ratio |
| ----------------------------- | ---------------- |
| megaparse_vision | 0.87 |
| unstructured_with_check_table | 0.77 |
| unstructured | 0.59 |
| llama_parser | 0.33 |
<!---END_BENCHMARK-->

_Higher the better_
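
Note on the benchmark column: the diff does not show how `similarity_ratio` is computed. As a rough sketch only, assuming it is a character-level similarity between a parser's output and a reference Markdown file, Python's standard `difflib.SequenceMatcher` produces a score in the same 0 to 1 range. The function name and its parameters below are hypothetical and are not taken from the repository.

```python
from difflib import SequenceMatcher


def similarity_ratio(reference: str, candidate: str) -> float:
    """Return a similarity score in [0, 1] between two strings.

    Assumption: this only illustrates one plausible metric; the project's
    real benchmark code is not visible in this diff and may differ.
    """
    # ratio() is 2 * M / T, where M is the number of matched characters
    # and T is the combined length of both strings.
    return SequenceMatcher(None, reference, candidate).ratio()


if __name__ == "__main__":
    reference_md = "# Title\n\n| a | b |\n|---|---|\n| 1 | 2 |\n"
    parsed_md = "# Title\n\n| a | b |\n| 1 | 2 |\n"
    print(f"similarity_ratio = {similarity_ratio(reference_md, parsed_md):.2f}")
```

Under that assumption, a score of 1.0 means the parsed output matches the reference exactly, which is consistent with the "higher is better" reading of the table.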
1 change: 1 addition & 0 deletions libs/megaparse/pyproject.toml
@@ -28,6 +28,7 @@ dependencies = [
"langchain-openai>=0.1.21",
"langchain-core>=0.2.38",
"llama-parse>=0.4.0",
"pydantic-settings>=2.6.1",
]

[project.optional-dependencies]
2 changes: 1 addition & 1 deletion requirements-dev.lock
@@ -569,7 +569,7 @@ safetensors==0.4.5
# via transformers
scipy==1.14.1
# via layoutparser
setuptools==75.5.0 ; python_full_version >= '3.12'
setuptools==75.5.0
# via torch
six==1.16.0
# via asttokens
2 changes: 1 addition & 1 deletion requirements.lock
@@ -471,7 +471,7 @@ safetensors==0.4.5
# via transformers
scipy==1.14.1
# via layoutparser
setuptools==75.5.0 ; python_full_version >= '3.12'
setuptools==75.5.0
# via torch
six==1.16.0
# via langdetect

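Migration note: the README hunks in this commit move the documented imports from `megaparse.core.megaparse` and `megaparse.core.parser.*` to `megaparse` and `megaparse.parser.*`. For code that may have to run against either layout, one option is to try the new path first and fall back to the old one. The helper below is a hypothetical sketch and not part of MegaParse; `import_first` and `MEGAPARSE_CANDIDATES` are illustrative names, and only the module paths come from the diff.

```python
import importlib
from typing import Any, Sequence


def import_first(candidates: Sequence[tuple[str, str]]) -> Any:
    """Return the first importable attribute from (module, attribute) pairs.

    Hypothetical helper for illustration; it is not provided by MegaParse.
    """
    errors = []
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr_name)
        except (ImportError, AttributeError) as exc:
            # Record why this candidate failed and try the next one.
            errors.append(f"{module_name}.{attr_name}: {exc}")
    raise ImportError("no candidate could be imported: " + "; ".join(errors))


# New import path first, pre-commit path as the fallback (both from the README diff).
MEGAPARSE_CANDIDATES = [
    ("megaparse", "MegaParse"),
    ("megaparse.core.megaparse", "MegaParse"),
]
```

With the package installed, `MegaParse = import_first(MEGAPARSE_CANDIDATES)` would resolve the class under whichever layout is present. Pinning the package version and using the new paths directly is simpler when supporting both layouts is not required.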