Frontend support #65
base: master
Conversation
Did you try bundling this with browserify? I don't think it'll work - I tried the same thing. I haven't gotten plain browserify to resolve requires built from variables, only explicit string literals, so something like the stopwords loader I see in the commits won't work.
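For context, a minimal sketch of what browserify would need instead - one explicit string-literal require per language. The paths and file names here are assumptions, not necessarily the repo's actual layout:

```js
// Sketch: browserify can only resolve string-literal requires, so every
// stopwords file has to be listed explicitly (paths/names are assumed).
const stopwords = {
  en: require('./data/stopwords/stopwords-en.json'),
  de: require('./data/stopwords/stopwords-de.json'),
  fr: require('./data/stopwords/stopwords-fr.json'),
  // ...one entry per supported language
};

module.exports = function getStopwords(language) {
  return stopwords[language] || stopwords.en;
};
```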
@knod unfortunately I only tested it with webpack, where it works just fine because of webpack's context feature. I just created a repository with a working example: johipsum/unfluff-browser-test. I didn't know that browserify can't handle dynamic requires... even if adding a require statement for every single stopwords JSON sounds bad, it is probably the "best"/easiest way to support browserify - or does someone have a better idea?
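A rough sketch of how webpack's context feature handles the dynamic case - `require.context` bundles every file matching a pattern so the language can still be chosen at runtime. The directory and the `stopwords-<lang>.json` naming are assumptions:

```js
// Sketch: webpack's require.context bundles every matching file up front,
// so a dynamic lookup by language code still works after bundling.
const context = require.context('./data/stopwords', false, /^\.\/stopwords-[a-z]{2,3}\.json$/);

function getStopwords(language) {
  // e.g. language = 'en' resolves ./stopwords-en.json from the bundle
  return context(`./stopwords-${language}.json`);
}

module.exports = getStopwords;
```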
@johipsum: From what I understand, the only way to support browserify in this kind of situation is with additional modules. I don't think there's an ideal solution here, unfortunately, but it'd be great to hear any additional ideas, or even a confirmation that this is the case. I'd also like thoughts on whether converting the stopwords files to JSON so they can be required in a loop is much different from just adding them, as arrays, to one object. Do stopwords libraries usually offer their data as JSON, or are they usually .txt files that would need to be converted? Another possible option (I haven't really used makefiles much) might be to convert and combine the .txt files into a single JSON object file during the build.
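As a rough illustration of that last idea, a pre-build script along these lines could merge the per-language .txt files into one JSON object, so no dynamic require is needed at runtime. The paths and file naming are assumptions, not the repo's actual layout:

```js
// Sketch of a build step: merge all stopwords-<lang>.txt files into a
// single stopwords.json keyed by language code.
const fs = require('fs');
const path = require('path');

const srcDir = path.join(__dirname, 'data', 'stopwords');
const combined = {};

for (const file of fs.readdirSync(srcDir)) {
  const match = file.match(/^stopwords-([a-z]{2,3})\.txt$/);
  if (!match) continue;
  combined[match[1]] = fs.readFileSync(path.join(srcDir, file), 'utf8')
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter(Boolean);
}

fs.writeFileSync(
  path.join(srcDir, 'stopwords.json'),
  JSON.stringify(combined)
);
```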
I updated the stopwords-loader to support browserify: 10c9ac9
I also made a couple of pull requests with different options. One was very similar to your implementation. Great minds...
@johipsum care to share how you got your unfluff fork to run in Lambda? I keep getting timeouts when installing your fork via npm.
@mikhaildelport the default timeout of a Lambda function is 3 seconds; maybe your unfluff function needs more. Have you tried increasing the allowed execution time for your Lambda?
@johipsum yeah, I bumped it up to 5 seconds with no luck, and if it's that slow it's also mostly useless, sadly! Running locally it finishes in under a second. Here's the quick and dirty code I used to test it: https://gist.github.com/mikhaildelport/28060909bbe276d537b328e36142f23b

Edit: I bumped the timeout to 30 seconds just to see, and it finally completed in 5.5 seconds. Not sure why it's so slow. Is it fetching the HTML (in which case I'll move that to the client) or is it the unfluff processing itself?

Edit 2: Logs show it's the unfluff processing, sadly.
@mikhaildelport maybe you can try the lazy extractors... my Lambda looks more or less like yours, except that I use the lazy functions, and it's almost as fast as on my local machine.
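For reference, a minimal sketch of what that might look like in a handler, using unfluff's lazy API so only the fields that are actually needed get computed. The event shape (`event.html`, `event.lang`) and the chosen fields are illustrative, not taken from the gist:

```js
// Sketch: unfluff's lazy API defers each extraction until the field is read,
// which can shave time off a Lambda invocation that only needs a few fields.
const extractor = require('unfluff');

exports.handler = async (event) => {
  // event.html / event.lang are assumed inputs for this illustration
  const lazy = extractor.lazy(event.html, event.lang || 'en');
  return {
    title: lazy.title(),
    text: lazy.text(),
  };
};
```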
@johipsum I'll keep that in mind! I found a web parser API that works for what I want, so I'm going with it for now. Thanks for your help!
Solves #62.
Because I needed it quickly, I transformed the .txt files to JSON and introduced a stopwords require function. This can be bundled with webpack, browserify, etc., so you can use it in browsers or, like me, in an AWS Lambda function (bundled with webpack). Works for me 🙂
But let me know if you have a better idea or a cleaner way to do it.