updated readme
SyedHabibM committed Oct 11, 2023
1 parent 33967f0 commit 025107f
Showing 1 changed file with 11 additions and 11 deletions.
22 changes: 11 additions & 11 deletions README.md
@@ -5,7 +5,7 @@

## Requirements

-Python 3.10+
+Python 3.10+ asyncio

## Installation

@@ -28,10 +28,10 @@ path = "/home/test/ABC.pdf"
extract_pdf = ExtractPDF(file_path=path)

# By default, output as text
-extract_pdf.extract() # Output will be located `/home/test/ABC.pdf.txt`
+await extract_pdf.extract() # Output will be located `/home/test/ABC.pdf.txt`

# Output as json
-extract_pdf.extract(as_json=True) # Output will be located `/home/test/ABC.pdf.json`
+await extract_pdf.extract(as_json=True) # Output will be located `/home/test/ABC.pdf.json`
```

> You can change the output directory by passing the `output_dir` parameter
@@ -42,7 +42,7 @@ from df_extract.pdf import ExtractPDF
path = "/home/test/ABC.pdf"

extract_pdf = ExtractPDF(file_path=path, output_dir="/home/test/output")
-extract_pdf.extract()
+await extract_pdf.extract()
```
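
The `await` form introduced in this commit is only valid inside a coroutine (or an async-aware REPL), so a plain script needs an entry point such as `asyncio.run`. The sketch below shows that pattern. `StubExtract` is a hypothetical stand-in written for this example, not part of `df_extract`; it assumes only that `extract` is a coroutine, as the diff implies, and it returns the output path purely for illustration.

```python
import asyncio


class StubExtract:
    """Hypothetical stand-in for an extractor whose `extract` is a coroutine."""

    def __init__(self, file_path: str) -> None:
        self.file_path = file_path

    async def extract(self, as_json: bool = False) -> str:
        # Yield to the event loop once; a real extractor would do its work here.
        await asyncio.sleep(0)
        suffix = ".json" if as_json else ".txt"
        # Returning the output path is an assumption made for this sketch.
        return self.file_path + suffix


async def main() -> list[str]:
    extractor = StubExtract(file_path="/home/test/ABC.pdf")
    text_path = await extractor.extract()
    json_path = await extractor.extract(as_json=True)
    return [text_path, json_path]


if __name__ == "__main__":
    # asyncio.run creates an event loop, runs main() to completion, and closes the loop.
    print(asyncio.run(main()))
```

With the real library, the body of `main` would presumably hold the `ExtractPDF(...)` construction and the `await extract_pdf.extract()` calls shown in the README.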

#### Extract content from `PDF` with image data
@@ -57,7 +57,7 @@ path = "/home/test/ABC.pdf"

image_extract = ImageExtract(model_download_enabled=True)
extract_pdf = ExtractPDF(file_path=path, image_extract=image_extract)
-extract_pdf.extract()
+await extract_pdf.extract()
```

### 2. To extract content from `PPT` and `PPTx`
@@ -71,10 +71,10 @@ path = "/home/test/DEF.pptx"
extract_pptx = ExtractPPTx(file_path=path)

# By default, output as text
-extract_pptx.extract() # Output will be located `/home/test/DEF.pptx.txt`
+await extract_pptx.extract() # Output will be located `/home/test/DEF.pptx.txt`

# Output as json
-extract_pptx.extract(as_json=True) # Output will be located `/home/test/DEF.pptx.json`
+await extract_pptx.extract(as_json=True) # Output will be located `/home/test/DEF.pptx.json`
```

### 3. To extract content from `Doc` and `Docx`
@@ -88,10 +88,10 @@ path = "/home/test/GHI.docx"
extract_docx = ExtractDocx(file_path=path)

# By default, output as text
-extract_docx.extract() # Output will be located `/home/test/GHI.docx.txt`
+await extract_docx.extract() # Output will be located `/home/test/GHI.docx.txt`

# Output as json
-extract_docx.extract(as_json=True) # Output will be located `/home/test/GHI.docx.json`
+await extract_docx.extract(as_json=True) # Output will be located `/home/test/GHI.docx.json`
```

### 4. To extract content from `PNG`, `JPEG` and `JPG`
@@ -105,8 +105,8 @@ path = "/home/test/JKL.png"
extract_png = ExtractImage(file_path=path)

# By default, output as text
-extract_png.extract() # Output will be located `/home/test/JKL.png.txt`
+await extract_png.extract() # Output will be located `/home/test/JKL.png.txt`

# Output as json
-extract_png.extract(as_json=True) # Output will be located `/home/test/JKL.png.json`
+await extract_png.extract(as_json=True) # Output will be located `/home/test/JKL.png.json`
```
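
Because every `extract` call in this diff is now awaited, several files could in principle be processed in one event loop with `asyncio.gather`. The sketch below illustrates that idea under stated assumptions: `StubExtract` and `extract_all` are hypothetical names written for this example, not `df_extract` API, and whether the real extractors actually overlap their work depends on how much of it is non-blocking, which the source does not say.

```python
import asyncio


class StubExtract:
    """Hypothetical stand-in for any extractor exposing an async `extract`."""

    def __init__(self, file_path: str) -> None:
        self.file_path = file_path

    async def extract(self, as_json: bool = False) -> str:
        # Simulated non-blocking work; the real library's behavior is not shown here.
        await asyncio.sleep(0.01)
        # Returning the output path is an assumption made for this sketch.
        return self.file_path + (".json" if as_json else ".txt")


async def extract_all(paths: list[str]) -> list[str]:
    # Create one coroutine per file, then let gather schedule them together.
    jobs = [StubExtract(file_path=p).extract() for p in paths]
    # gather returns results in the same order as the awaitables passed in.
    return await asyncio.gather(*jobs)


if __name__ == "__main__":
    files = ["/home/test/ABC.pdf", "/home/test/GHI.docx", "/home/test/JKL.png"]
    print(asyncio.run(extract_all(files)))
```

If the underlying extraction is CPU-bound or uses blocking I/O, awaiting the calls one after another, as the README examples do, may perform about the same.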
