-
-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Serval App MVP #192
Serval App MVP #192
Conversation
Also now capable of handling paratext projects and multiple files as well as no target file(s).
Codecov ReportAll modified and coverable lines are covered by tests ✅ 📢 Thoughts on this report? Let us know!. |
Where is the json file? Where should I run the command? i.e., go to this sub folder of the repo and run it exactly like this... |
The email sending may fail. Delete the build after trying to send. |
Delete after you do everything else. |
Is this never called? |
This would be nice to configure - especially for testing. |
This should be configurable - for pointing to ext-qa. |
There should be more information for distinguishing this. The initial and final emails should have an ID number that is shared (can use the build ID). Also, there should be a place for a "name of build" that the user can enter - in case they are making many translations and want to distinguish between them. |
As per the other comment, we should allow the user to name an arbitrary string for the engine/build name. |
Previously, johnml1135 (John Lambert) wrote…
That is, an arbitrary string that we strip of all special characters and normalize. |
Previously, johnml1135 (John Lambert) wrote…
Also, the source and target languages, along with the names of the Paratext files uploaded should be included in one or both emails. |
max steps 10!?!?!?!??! At very least have this be set by an environment variable - so we can make a deployment of this either for test or real life. |
Point to NLLB-200 language codes - provide a link. |
Are we sure we want to do this? There may be one person supporting multiple teams. How about "You already have {number} build job[s] running. Are you sure you want to add another one?" |
Also, we should be able to log into the email address sending these emails to periodically check who and how many translations are being made. |
Also call out NLLB-200 language codes. |
Here is where all the info about the build should be dumped - all the files uploaded, the name chosen, the source and target language, etc. |
Did you use black formatting? Where are the poetry, launch, updated gitignore, etc. files? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 7 of 7 files at r1, all commit messages.
Reviewable status: all files reviewed, 18 unresolved discussions (waiting on @Enkidu93)
samples/ServalApp/builds.db
line 0 at r1 (raw file):
This should not be uploaded to Github - but rather should be ignored.
samples/ServalApp/serval_app.py
line 33 at r1 (raw file):
def completed(build:Build, email_server:ServalAppEmailServer, data=None): print(f"\tCompleted {build}") session.delete(build)
Delete after you do everything else.
samples/ServalApp/serval_app.py
line 49 at r1 (raw file):
started(build, email_server) else: responses.get(build_update.state, update)(build, email_server, build_update.message)
For all the times that it is just normally Pending or Active, what happens? There is no responses.get
for normal updates. Will it throw an exception?
`
samples/ServalApp/serval_app.py
line 101 at r1 (raw file):
def submit(): engine = json.loads(client.translation_engines_create(TranslationEngineConfig(source_language=st.session_state['source_language'],target_language=st.session_state['target_language'],type='Nmt',name=f'serval_app_engine:{st.session_state["email"]}'))) source_files = [json.loads(client.data_files_create(st.session_state['source_files'][i], format="Paratext" if st.session_state['source_files'][i].name[-4:] == '.zip' else "Text")) for i in range(len(st.session_state['source_files']))]
not all Paratext zips have a zip extension. Either we add the other potential extensions (and say which ones we support), or we give the users a choice.
Why not 3.9.7? Why not >=3.11? |
Obviously, you can update the versions. But there should be dev dependencies.
|
Any reason for all the prints rather than actually using a logger? Then it would be more straightforward to log the exception. |
Previously, Enkidu93 (Eli C. Lowry) wrote…
That should be fine. It will just be for us to look at. As for filtering out bad characters - it may not actually matter, as long as there is appropriate escaping and any json or string representations are not messed up. |
Previously, Enkidu93 (Eli C. Lowry) wrote…
@ddaspit - what are the range of paratext zip extensions? |
Previously, Enkidu93 (Eli C. Lowry) wrote…
As this is not for many users - probably one per group. I will assume that the creds will only be shared between a known small group of users (and they can see everything on the Serval site itself if they share creds). I would just want security between credentials. |
Previously, Enkidu93 (Eli C. Lowry) wrote…
Conversely, you could provide a link to the Serval swagger doc that talks about language codes: |
Previously, Enkidu93 (Eli C. Lowry) wrote…
Whatever implementation you feel is best. A full record of all the information that they put in (including the ClientID, but not the ClientSecret) would in my mind be the best set of information. It should also include the engine and build ID in Serval. The failure or complete emails should just reference the build name that they gave us (which may not be unique) and the engine and build ID (which are). |
Previously, Enkidu93 (Eli C. Lowry) wrote…
If there is no empty DB, will SQLite crash, or will it automatically create one? If it crashes, can you add code to automatically create an empty file if it doesn't exist? |
Previously, Enkidu93 (Eli C. Lowry) wrote…
Let's cross the bridge when we come to it. I like your idea though about asking the user first. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 10 of 10 files at r3, all commit messages.
Reviewable status: all files reviewed, 13 unresolved discussions (waiting on @ddaspit and @Enkidu93)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 13 unresolved discussions (waiting on @ddaspit and @johnml1135)
samples/ServalApp/.flake8
line 5 at r3 (raw file):
Previously, johnml1135 (John Lambert) wrote…
per-file-ignores = **/*.pyi:E252,E301,E302,E305,E501,E701,E704,E741,F401,F811,F821
exclude =
.git,
pycache,
.venv
May I ask why this is necessary? Doesn't flake8 know to only examine .py files? I thought it did. Also, I don't intend to check in the pycache and I'm not running flake8 on startup, so it that needed?
samples/ServalApp/pyproject.toml
line 9 at r3 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Why not 3.9.7? Why not >=3.11?
There's a conflict among the dependencies. This was poetry's recommended solution.
samples/ServalApp/pyproject.toml
line 14 at r3 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Obviously, you can update the versions. But there should be dev dependencies.
[tool.poetry.group.dev.dependencies] pytest = "^7.4.2" black = "^23.3.0" flake8 = "^3.9.2" isort = "^5.9.3" ipykernel = "^6.7.0" jupyter = "^1.0.0" pyright = "^1.1.331"
OK, should I be using 'pyright'? Do you want me to write tests? (it'd be kind of difficult given that it's pretty much all UI) Why jupyter etc.? (Or is this just copy-pasted from machine.py?) I'll go ahead and add the ones I think make sense, and you can correct me.
samples/ServalApp/serval_app.py
line 100 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
That should be fine. It will just be for us to look at. As for filtering out bad characters - it may not actually matter, as long as there is appropriate escaping and any json or string representations are not messed up.
OK.
samples/ServalApp/serval_app.py
line 153 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Let's cross the bridge when we come to it. I like your idea though about asking the user first.
OK, sounds good 👍.
samples/ServalApp/serval_app.py
line 166 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Conversely, you could provide a link to the Serval swagger doc that talks about language codes:
f"{os.environ.get("SERVAL_HOST_URL")}/swagger/index.html#/Translation%20Engines/TranslationEngines_Create"
True. I wonder if that defeats some of the purpose though given swagger's complexity and the fact that we're trying to hide that. I could include it as a "For more information, see..." kind of thing?
samples/ServalApp/serval_app.py
line 153 at r2 (raw file):
Previously, johnml1135 (John Lambert) wrote…
As this is not for many users - probably one per group. I will assume that the creds will only be shared between a known small group of users (and they can see everything on the Serval site itself if they share creds). I would just want security between credentials.
Yes, unique per Email:ClientID combo.
OK, the only security complication there is that it would require storing the client ids in the db. That should be alright, correct?
samples/ServalApp/serval_app.py
line 95 at r3 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Any reason for all the prints rather than actually using a logger? Then it would be more straightforward to log the exception.
I should just use the streamlit logger if possible - good call.
samples/ServalApp/serval_email_module.py
line 39 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Whatever implementation you feel is best. A full record of all the information that they put in (including the ClientID, but not the ClientSecret) would in my mind be the best set of information. It should also include the engine and build ID in Serval. The failure or complete emails should just reference the build name that they gave us (which may not be unique) and the engine and build ID (which are).
OK, so you do or do not want to have a list of file names as well?
samples/ServalApp/builds.db
line at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
If there is no empty DB, will SQLite crash, or will it automatically create one? If it crashes, can you add code to automatically create an empty file if it doesn't exist?
Yes, I already took care of it. It will now recreate as needed. The complication (and the reason I was avoiding it) is because streamlit is awkward in that it reruns the entire file at times, so it just adds more complexity and simultaneous db sessions to check and possibly create each time, but it really shouldn't be an issue.
Previously, johnml1135 (John Lambert) wrote…
I believe there's a general setting (line 3 of this file) that takes care of this. Format on save works on my end. |
Previously, Enkidu93 (Eli C. Lowry) wrote…
I only saw the diff - now I see it. |
Previously, Enkidu93 (Eli C. Lowry) wrote…
If it is ok without it, then let it be. I was just comparing to Machine.py. |
Previously, Enkidu93 (Eli C. Lowry) wrote…
Good enough- as long as it is intentional. |
Previously, Enkidu93 (Eli C. Lowry) wrote…
Add the ones that make sense - that you need for your development. We don't need testing. |
Previously, Enkidu93 (Eli C. Lowry) wrote…
Yes, that sounds good. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 7 of 12 files reviewed, 13 unresolved discussions (waiting on @ddaspit and @johnml1135)
.vscode/settings.json
line 35 at r3 (raw file):
Previously, johnml1135 (John Lambert) wrote…
I only saw the diff - now I see it.
Done.
samples/ServalApp/.flake8
line 5 at r3 (raw file):
Previously, johnml1135 (John Lambert) wrote…
If it is ok without it, then let it be. I was just comparing to Machine.py.
Done.
samples/ServalApp/pyproject.toml
line 9 at r3 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Good enough- as long as it is intentional.
Done.
samples/ServalApp/pyproject.toml
line 14 at r3 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Add the ones that make sense - that you need for your development. We don't need testing.
Done.
samples/ServalApp/serval_app.py
line 37 at r1 (raw file):
Previously, Enkidu93 (Eli C. Lowry) wrote…
(Sorry, the line numbers changed after reformatting). Yes, I can rename it.
Done.
samples/ServalApp/serval_app.py
line 153 at r2 (raw file):
Previously, Enkidu93 (Eli C. Lowry) wrote…
OK, the only security complication there is that it would require storing the client ids in the db. That should be alright, correct?
I went ahead with this. Should be fine. I
samples/ServalApp/serval_app.py
line 95 at r3 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Yes, that sounds good.
Done.
samples/ServalApp/serval_email_module.py
line 39 at r1 (raw file):
Previously, Enkidu93 (Eli C. Lowry) wrote…
OK, so you do or do not want to have a list of file names as well?
I went ahead and included it all - including file names.
samples/ServalApp/serval_email_module.py
line 69 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Here is where all the info about the build should be dumped - all the files uploaded, the name chosen, the source and target language, etc.
Done.
Previously, johnml1135 (John Lambert) wrote…
I went ahead and added just .tar(.gz) until further notice. |
Previously, Enkidu93 (Eli C. Lowry) wrote…
That sounds good. If it's easy, it should be easy. If it's challenging, there should be the ability to find the right answer. |
Previously, Enkidu93 (Eli C. Lowry) wrote…
good enough. |
Previously, Enkidu93 (Eli C. Lowry) wrote…
good enough. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 5 files at r4, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ddaspit and @Enkidu93)
Previously, johnml1135 (John Lambert) wrote…
Done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r5, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @Enkidu93)
Fixes #132.
Once this is merged, I can deploy it.
I can also pretty up code down the line if you' prefer, but it's all pretty straightforward.
This change is