Release AREkit-0.23.0-ChineseNY · nicolay-r/AREkit

What's new: Globalization and Internalization

Globalization for any language is the major aspect of 0.23.0, since we announce AREnets and sample-transfer.
We generalize several components to support languages other than the original one (Russian).
We introduce CompoundEntities which may include other entities.
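
To illustrate the idea of an entity that may include other entities, here is a minimal sketch. The class layout, field names, and the `add_child` / `iter_nested` helpers are assumptions made for illustration only and do not reproduce AREkit's actual `CompoundEntity` implementation.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Entity:
    """Hypothetical flat entity: a typed text span (not AREkit's actual class)."""
    value: str
    entity_type: str
    start: int
    end: int


@dataclass
class CompoundEntity(Entity):
    """Hypothetical entity that may contain other entities inside its span."""
    children: List[Entity] = field(default_factory=list)

    def add_child(self, child: Entity) -> None:
        # A nested entity must lie fully inside the parent span.
        if not (self.start <= child.start and child.end <= self.end):
            raise ValueError("child span must lie within the parent span")
        self.children.append(child)

    def iter_nested(self) -> Iterator[Entity]:
        # Depth-first walk over all nested entities.
        for child in self.children:
            yield child
            if isinstance(child, CompoundEntity):
                yield from child.iter_nested()


# Usage: an organization name that contains a city name.
org = CompoundEntity("Moscow State University", "ORGANIZATION", 0, 23)
org.add_child(Entity("Moscow", "CITY", 0, 6))
```

Keeping children inside the parent object lets a consumer choose between the outer entity and its nested parts without re-parsing the annotation.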

Major

Nested/Compound entities support! #398
Detaching networks contrib module #423 -> AREnets
Sample transfer is now available: https://github.com/nicolay-r/arekit-googletrans-sampler

Fixed bugs

Refactored BRAT parser, fixed bugs for other languages/collections.

Minor

#375
Internalization (#435)

Full Changelog

Implemented enhancements:

PipelineContext -- support parent contexts in case of the nested pipelines. #433
Idle mode -- provide such flag into main pipeline #432
MapPipelineItem -- provide ctx parameter in order to reach out parent Pipeline Context [Idle mode] #431
NetworkSerializer -- support the case of Vectorizers==Null [Without embedding, google-trans-sampler backlog] #430
ParsedRow -- depends on pandas, while it might be switched to dict type instead [AREnets backlog] #427
Remove unused code after AREnets movement #425
AREnets -- separated project for networks contrib part, which provides NN implementation based on Tensorflow #423
Entity -- Adopt DisplayValue property for CSV serialization #419
TsvWriter -- Remove Dataframe dependency #408
OpenNREJsonWriter -- df.sort is not in-place by default #407
NeuralNetworkModelIO -- simplify implementation #406
Brat -- support nested entities (CompoundEntity type) [simple implementation] #398
What's New -- 0.22.1 Release #323
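
The parent-context items above (#433, #431, #432) describe a nested pipeline reaching values, such as an idle-mode flag, that live in the outer pipeline's context. A minimal sketch of that lookup pattern follows. The `NestedContext` class, its `provide` / `update` methods, and the `idle_mode` key are hypothetical names chosen for illustration, not AREkit's API.

```python
from typing import Any, Dict, Optional


class NestedContext:
    """Hypothetical context whose lookups fall back to a parent context."""

    def __init__(self, data: Optional[Dict[str, Any]] = None,
                 parent: Optional["NestedContext"] = None):
        self._data = dict(data or {})
        self._parent = parent

    def provide(self, key: str) -> Any:
        # Look in the local context first, then walk up the parent chain.
        if key in self._data:
            return self._data[key]
        if self._parent is not None:
            return self._parent.provide(key)
        raise KeyError(key)

    def update(self, key: str, value: Any) -> None:
        # Writes stay local, so a nested pipeline cannot overwrite its parent.
        self._data[key] = value


# Usage: the outer pipeline sets a flag, the nested pipeline reads it.
outer = NestedContext({"idle_mode": True})
inner = NestedContext(parent=outer)
idle = inner.provide("idle_mode")
```

Resolving reads through the parent while keeping writes local is one possible design; it lets a flag set once in the main pipeline be visible to every nested item.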

Fixed bugs:

Brat -- incorrect parsing approach may sometimes result in a mismatched value (use t) #437
VocabRepositoryUtils -- numpy API treats # as a comment character by default when loading the vocabulary #428
LabelsScaler -- uint dict and dict might have different sizes #426

Closed issues:

read_ruattitudes_to_brat_in_memory -- no need to pass label scaler #436
PosTags -- make them optional parameter for neural networks #435
RuSentiFrames -- clarify tqdm caption when loading (ARElight backlog) #434
Sync with AREnets updates #429
BERT -- provide cropped sampler #422
googletrans -- move to the separated project #421
_provide_sentence_terms -- consider s_ind and t_ind as well since they may be combined and modified at the same time [nivts_project backlog] #420
Entity -- provide DisplayValue property (which is Value by default) #418
googletrans -- TranslatorPipelineItem for parsed texts #416
Instant downloading -- simplify data downloading #413
PandasBasedRowsStorage -- implement the nested type from the BaseRowsStorage #410
Readers/Writers -- make a part of the contrib #409
TextOpinion Annotation -- particular filtering rules for SentiNEREL and Russian texts. [pipeline items] #404
Evaluation -- enhancing error log analysis #400
Statistical Folding provided via file #399
Balancing as a side part of the Storage #380

Merged pull requests:

CVE-2007-4559 Patch #412 (TrellixVulnTeam)
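
CVE-2007-4559 concerns path traversal when extracting tar archives with Python's `tarfile` module. The merged patch itself is not reproduced here; the sketch below shows one common mitigation, which validates member paths before extraction. The helper names `is_within_directory` and `safe_extract` are assumptions for illustration.

```python
import os
import tarfile


def is_within_directory(directory: str, target: str) -> bool:
    # True when target resolves to a location inside directory.
    abs_directory = os.path.abspath(directory)
    abs_target = os.path.abspath(target)
    return os.path.commonpath([abs_directory, abs_target]) == abs_directory


def safe_extract(tar: tarfile.TarFile, path: str = ".") -> None:
    # Reject the whole archive if any member name would escape the destination.
    for member in tar.getmembers():
        member_path = os.path.join(path, member.name)
        if not is_within_directory(path, member_path):
            raise ValueError(f"blocked path traversal in tar member: {member.name}")
    tar.extractall(path)
```

This check covers member names only and does not inspect symbolic links. Recent Python versions also provide extraction filters, for example `tar.extractall(path, filter="data")`, which may be preferable where available.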

* This Changelog was automatically generated by github_changelog_generator
