ISSUE-3: OCR specific Processor and new features/processing option #11

Merged 24 commits on Dec 15, 2020
Changes from 22 commits
Commits
d0d4a6d
proc_open and killing of processes taking more than X seconds
DiegoPino Nov 23, 2020
0010de2
Moves Processing from PreSave to PostSave
DiegoPino Nov 23, 2020
b332c59
Updates processor annotations (this may change, i feel its a bit comp…
DiegoPino Nov 23, 2020
618a56c
Updates Service to run on PostSave
DiegoPino Nov 23, 2020
e144f71
So many updates on our Queue Worker
DiegoPino Nov 23, 2020
5c6fa9f
First pass on OcrPostProcessor
DiegoPino Nov 23, 2020
391101b
OK. I got XPath working
DiegoPino Nov 23, 2020
224c440
Route Fix for D9
DiegoPino Nov 23, 2020
617b64a
Address page id question from giancarlo
DiegoPino Nov 23, 2020
f1073bd
Add search_api_solr to composer dependency
DiegoPino Nov 24, 2020
0442676
Correctly process page ratio and parse things out for miniOCR
DiegoPino Nov 24, 2020
d201c75
Checks if Checksum + search_api_id are already in Solr
DiegoPino Nov 24, 2020
2726210
Remove leading 0s from miniOCR dimensions
DiegoPino Nov 24, 2020
eaf0a47
Chained processors working
DiegoPino Dec 1, 2020
5c0d688
Fixed generic key store key, now all pages are actually different
DiegoPino Dec 1, 2020
f6247d7
Basically a LOT: For now 2 events subs, share the same code so i may …
DiegoPino Dec 4, 2020
47aa349
Created an abstract class and the queue worker. Simpler
DiegoPino Dec 4, 2020
10e8631
Update StrawberryRunnersPostProcessorPluginBase.php
DiegoPino Dec 4, 2020
45ca514
This is the largest change
DiegoPino Dec 4, 2020
04e5f11
remove deprecated D9 for temp storage
DiegoPino Dec 4, 2020
122e778
Address @giancarlobi review (comparison operation) and does some gene…
DiegoPino Dec 6, 2020
02abe54
Address Code review from @giancarlobi
DiegoPino Dec 7, 2020
4c54c94
Drupal 9 in the .info
DiegoPino Dec 7, 2020
91ccc23
Update hook for missing entity
DiegoPino Dec 15, 2020
1 change: 1 addition & 0 deletions .gitignore
@@ -6,3 +6,4 @@
.idea/modules.xml
.idea/misc.xml
.idea/codeStyles/codeStyleConfig.xml
src/.DS_Store
1 change: 1 addition & 0 deletions composer.json
@@ -20,6 +20,7 @@
],
"require": {
"ml/json-ld": "^1.0",
"drupal/search_api_solr": "~4.1",
"mtdowling/jmespath.php": "^2.4",
"strawberryfield/strawberryfield": "dev-1.0.0-RC1",
"react/event-loop": "^1.1",
79 changes: 79 additions & 0 deletions config/schema/strawberry_runners.schema.yml
@@ -31,6 +31,44 @@ strawberryfield_runners.strawberry_runners_postprocessor.*:
strawberryfield_runners.strawberry_runners_postprocessor.binary:
  type: config_object
  label: 'Strawberry Runners Post Processor Config Entity Binary specific config'
  mapping:
    source_type:
      type: string
      label: 'The type of Source Data this Processor works on'
    ado_type:
      type: string
      label: 'DO type(s) to limit this Processor to'
    jsonkey:
      type: sequence
      label: 'The JSON key(s) containing the desired Source File(s)'
      sequence:
        - type: string
    mime_type:
      type: string
      label: 'Mimetype(s) to limit this Processor to'
    path:
      type: string
      label: 'The path for the binary to execute'
    arguments:
      type: string
      label: 'Any additional argument your executable binary requires'
    output_type:
      type: string
      label: 'The expected and desired output of this processor'
    output_destination:
      type: sequence
      label: 'Where and how the output will be used'
      sequence:
        - type: string
    timeout:
      type: integer
      label: 'Timeout in seconds for this process'
    weight:
      type: integer
      label: 'Order of execution in the global chain'
strawberryfield_runners.strawberry_runners_postprocessor.ocr:
  type: config_object
  label: 'Strawberry Runners Post Processor Config Entity OCR specific config'
  mapping:
    source_type:
      type: string
@@ -49,6 +87,47 @@ strawberryfield_runners.strawberry_runners_postprocessor.binary:
    arguments:
      type: string
      label: 'Any additional argument your executable binary requires'
    tesseract_arguments:
      type: string
      label: 'Any additional argument your executable binary requires'
    path:
      type: string
      label: 'The path for the binary to execute'
    tesseract_path:
      type: string
      label: 'The path for the binary to execute'
    output_type:
      type: string
      label: 'The expected and desired output of this processor'
    output_destination:
      type: sequence
      label: 'Where and how the output will be used'
      sequence:
        - type: string
    timeout:
      type: integer
      label: 'Timeout in seconds for this process'
    weight:
      type: integer
      label: 'Order of execution in the global chain'
strawberryfield_runners.strawberry_runners_postprocessor.filesequence:
  type: config_object
  label: 'Strawberry Runners Post Processor Config Entity JSON sequence specific config'
  mapping:
    source_type:
      type: string
      label: 'The type of Source Data this Processor works on'
    ado_type:
      type: string
      label: 'DO type(s) to limit this Processor to'
    jsonkey:
      type: sequence
      label: 'The JSON key(s) containing the desired Source File(s)'
      sequence:
        - type: string
    mime_type:
      type: string
      label: 'Mimetype(s) to limit this Processor to'
    output_type:
      type: string
      label: 'The expected and desired output of this processor'
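For reviewers checking the schema against real data, here is a rough sketch of settings that would satisfy the `binary` mapping in this file's diff. Only the key names and their types (string, integer, sequence of strings) come from the schema. Every value is a made-up placeholder, and where these keys sit inside the exported config entity is not shown in this diff, so that is left out.

```yaml
# Hypothetical values throughout; only key names and types come from the
# strawberry_runners_postprocessor.binary schema mapping.
source_type: 'placeholder-source-type'   # string
ado_type: 'placeholder-ado-type'         # string
jsonkey:                                 # sequence of strings
  - 'placeholder-json-key'
mime_type: 'image/tiff'                  # string
path: '/path/to/example-binary'          # string
arguments: '--example-argument'          # string
output_type: 'placeholder-output-type'   # string
output_destination:                      # sequence of strings
  - 'placeholder-destination'
timeout: 300                             # integer, seconds
weight: 0                                # integer, position in the chain
```

Per the schema, the `ocr` mapping additionally declares `tesseract_path` and `tesseract_arguments` as strings alongside `path` and `arguments`.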
16 changes: 11 additions & 5 deletions src/Annotation/StrawberryRunnersPostProcessor.php
@@ -21,8 +21,7 @@
class StrawberryRunnersPostProcessor extends Plugin {

const PRESAVE = 'preSave';
const INDEX = 'search_api';

const POSTSAVE = 'postSave';

/**
* The plugin id.
@@ -55,13 +54,20 @@ class StrawberryRunnersPostProcessor extends Plugin {
*/
public $input_property;

/**
* The Object property that contains the additional data needed by the Processor ::run method
*
* @var string $input_arguments;
*
*/
public $input_arguments;

/**
* Processing stage: can be Entity PreSave or Index time search_api
* Processing stage: can be Entity PreSave or PostSave
*
* @var string $when;
*
*/
public $when = StrawberryRunnersPostProcessor::PRESAVE;
public $when = StrawberryRunnersPostProcessor::POSTSAVE;

}
}