- (client/launcher) global pool (#53416)
- (client/launcher) add entity in resource list command (#53576)
- (client) add snw client commands to share and open models (#53629)
- (client/launcher) python 3 (#53329)
- (client/launcher) coding style (#53529)
v1.10.2 (2019-12-18)
- (client/launcher) add `-e/--entity-owner` option when training a model (#52926)
- (worker) fix deadlock CPU allocation (#53071)
v1.10.1 (2019-10-21)
- (client) fix error with option service list (#52908)
v1.10.0 (2019-10-08)
- (launcher/client) update to support evaluating local (multi-) reference files
- (launcher/client) filter the task list by status, service (#52807)
- (launcher/client) show the parent task in the task list (#52807)
- (launcher) add the possibility to create a folder in S3 storages
- (launcher) store the information relative to the release action when a model is released (#52793)
- (launcher) Flask server returns proper error message (with the code 500) after an uncaught error (#52742)
- (launcher) prevent users from using resource/upload in pn9-model-catalog and all shared storages (except for systran resource manager)
- (launcher) create cache to improve model search
v1.9.0 (2019-07-23)
- (launcher) update to use latest storage management
- (launcher) fix S3 storage with "assume_role" feature (#52401)
- (launcher) limit task log access to entity's users (#52079)
- (launcher) fix crash when updating with empty statistics (#52209)
v1.8.1 (2019-06-18)
- (launcher) allow users to see all tasks launched in their entity (#52201)
v1.8.0 (2019-06-14)
- (launcher) fix scoring task when working with launcher as storage
- (launcher) check ability when deleting task
- (launcher) fix crash when alloc_ressource and resource not found in task
- correct inappropriate Chinese funny name
- (client) remove debug (== True) display when launching an operation involving a local file
v1.7.3 (2019-05-14)
- (worker) fix worker not auto-restarting when "admin:worker:" redis key expired
v1.7.2 (2019-04-05)
- (client) fix task launch command
v1.7.0 (2019-04-02)
- add generic DB cache with redis
- fix display of task list if some tasks have corrupted information
- (worker) fix crash for resource calculation
- (worker) fix use of single quote in configuration file
- (launcher) enable /file and /task/file equivalent API for backward compatibility
- (launcher) normalize naming of tasks with parent task id (#51648)
- add score docker lang option
v1.6.0 (2018-02-11)
- add "task/stat" route for tracking tasks statistics
- (worker) fix passing envvar to specific image as defined in non default config.
- (launcher) fix `lr` for s3 for root path
v1.5.0 (2018-01-28)
- add "exec" command to launch any generic nmt-wizard docker task
- (client) avoid content corruption when `file` output string contains non utf-8 or non supported characters
- (client) improve task list to add model and image tag information
- (client) `-l DEBUG` now traces HTTP requests
- (worker) fix: ignore `envvar` field in config.docker
- (admin) global `default.json` now stored in REDIS db and updated by launcher
- (server) introduce Chinese funnyname and generated model contains name translation
v1.4.2 (2018-12-17)
- (worker) fix delete of running CPU tasks from `ls` when worker restart
v1.4.1 (2018-12-17)
- Add French and German funny names
- (worker) fix issue with CPU allocation making tasks fail when all CPU were allocated
- (worker) worker does not fail if configuration is invalid, pool is not disabled
- (worker) do not set cpu restriction on dedicated instances (like ec2)
- (worker) double check process termination after docker terminate
- (worker) patch log renew beat
- (launcher) `--as_release` is now implied for translations for image >= 1.80, except if `--notransasrelease` is specified
v1.4.0 (2018-12-14)
- (worker) refactor and improve ec2 service
v1.3.1 (2018-12-11)
- (client) backward compatibility for `lt`
v1.3.0 (2018-12-10)
- (worker) refactor CPU allocation - CPUs are now allocated like GPUs
- (launcher) automatic translation tasks are now by default launched with `--as_release`
v1.2.5 (2018-11-28)
- (worker) fix retry on ssh connection
- (worker) more robust environment checking before launching task
v1.2.4 (2018-11-26)
- (launcher) fix crash on `task/status` without parameters
v1.2.3 (2018-11-25)
- (worker) fix integrity checking at worker restart
v1.2.2 (2018-11-23)
- (launcher) `max_log_size` option in setting file for setting maximum limit to log file
v1.2.1 (2018-11-22)
- (launcher) fix incorrect reporting of CPU-only tasks
v1.2.0 (2018-11-22)
- server code is now requiring `redis-py>=3.0.0`
- (worker) For security, checks that config name matches json config filename
- (worker) Fix invalid log when redis not directly available at startup
- model/task name are 6 characters longer and now include postfix (:`trans`/`prepr`/`relea`/`vocab`)
- tasks are now sorted by launch date
- introduce extension function for routes
- fix too slow propagation of stopped tasks
- optimize task stopping
- (client) optimize initial handshaking to check connection and authentication
- (worker) fix major bug making number of cpus available not accurate and finally blocking tasks
v1.1.2 (2018-10-25)
- Fix error when worker restarts with pure cpu-tasks running
- (client) Add `--nochainprepr` to disable chained preprocessing and training
- (client) Fix `prepr` task type and display in ls
- (worker) Increasing logs in debug mode
v1.1.1 (2018-10-23)
- Fix possible error when `process_request` is called multiple times
v1.1.0 (2018-10-23)
- add `stop` admin command - requires `stop:config`
- finer-grain route permission checks
- split training tasks into prepr and train for more efficient GPU resource handling.
- Fix translation tasks not launching for iterations 2+
- Fix split of subsets for chained translation tasks and pure-CPU training
- Improve reactivity of worker that could let a resource idle for one hour
- Increase "dead-worker" detection to 20 minutes of unchanged beat
- Reduce running task checking interval to every 5min after first check
- Fix workeradmin tasks taking all bandwidth
- Improve management of CPU-tasks
- Fix parsing options
v1.0.0 (2018-10-04)
- workers are now dedicated to a single pool
- new tool `runworker` for activity monitoring/relaunching of workers
- add `workeradmin` redis message queue for interaction with workers, and corresponding new REST service in launcher
- each worker can manage and switch between multiple configurations
- possibility to pass private key directly in configurations
- default `settings.ini` in current directory
- introduce `--resource` as simpler alternative to `--option`
- chained translation tasks are now dispatched on multiple gpus
- (worker) retry 5 times failed status, and correctly terminate
v0.4.2 (2018-09-21)
- fix ID of chained translation tasks (#12)
v0.4.1 (2018-09-20)
- Fix broken 'trans'
v0.4.0 (2018-09-19)
- Possibly chain training and translation tasks
- (admin) add `stop by admin` feature
- when running chained training - update queued time to be close to terminate time of parent task
- more robust worker beat
- log and file storing use filesystem
v0.3.1 (2018-09-17)
- Fix client compatibility with Python 3
- Fix 'change task' not putting tasks on the new service active queue
v0.3.0 (2018-09-17)
- rename shortcut for `--priority` to `-P`
- launcher service does not rely on json configuration files, and dynamically retrieve configuration from `admin` section in database
- Add `--quiet` mode and `-S STATUS` for `lt` command
- Fix incorrect usage count with `ls`
- Review Funny names to remove offending candidates
- Fix incorrect count of available CPU
v0.2.2 (2018-08-24)
- normalize database structure for support of CPU servers.
- add possibility to change ssh port for connection to training services
- enable pure-cpu server and support 0-gpu tasks
- worker use `logging.conf` for formatted logs
- fix ghost tasks remaining after terminated if they were already in service queue
- fix download of binary file (#9)
- extend locking period for more robustness under load
v0.2.1 (2018-08-02)
- introduce TTL on stopped tasks in database
- fix python 3 compatibility issue (client)
v0.2.0 (2018-07-31)
- redefine more consistent routes
- change `task_id` as positional argument for simpler client commands
- fix `service/check` returning inconsistent json format
- local path parameters must be absolute path
- versioning information
- possibility to launch multi-gpu tasks
- launch error now reports error message in task log
- define `auto` registry - resolved by launcher checking at `default_for` in registry definition
- add friendly name to tasks, and differentiate task type
- add task chaining feature - it is possible to launch sequence of tasks
- `list_services` displays usage and capacity of the different services
- add priority field to prioritise tasks, and make queuing system more robust
- detect busy resources and put them in quarantine mode
- add '/status' service
- invalid docker image makes a run fail immediately
- check and retry if redis database is not available
- add default master storage (`"default_ms":true`)
- `docker.path` variable to set docker path on remote service
- for ssh service, if `options.server` is not set, set it to `auto`
- fix and improve task termination for ssh service
- incremental log during training
- automatically configure database at launch time
- `service/check` does not fail if resources not available
- misc bug fixes
v0.1.0 (2018-03-02)
Initial release.