AREkit-0.23.0-ChineseNY
What's new: Globalization and Internalization
Globalization for any language is the major aspect of 0.23.0, since we announce AREnets and sample-transfer.
We generalize some aspects in order to support languages other than the original one (Russian).
We introduce CompoundEntities, which may include other entities.
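To illustrate the idea of an entity that contains other entities, here is a minimal sketch. The class names, fields, and the `iter_nested` helper are hypothetical and chosen for illustration only; they are not the AREkit API.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Entity:
    """Hypothetical entity: a text value with a type label."""
    value: str
    e_type: str


@dataclass
class CompoundEntity(Entity):
    """Hypothetical entity that also keeps the entities nested inside it."""
    children: List[Entity] = field(default_factory=list)

    def iter_nested(self) -> Iterator[Entity]:
        # Depth-first walk over all nested entities, at any depth.
        for child in self.children:
            yield child
            if isinstance(child, CompoundEntity):
                yield from child.iter_nested()


city = Entity("Moscow", "CITY")
org = CompoundEntity("Moscow State University", "ORG", [city])
print([e.value for e in org.iter_nested()])
```

Keeping the nested entities on the outer one lets a consumer choose whether to treat the span as a single mention or to descend into its parts.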
Major
- Nested/Compound entities support! #398
- Detaching networks contrib module #423 -> AREnets
- Appearance of transfer: https://github.com/nicolay-r/arekit-googletrans-sampler
Fixed bugs
- Refactored BRAT parser, fixed bugs for other languages/collections.
Minor
Implemented enhancements:
- PipelineContext -- support parent contexts in case of the nested pipelines. #433
- Idle mode -- provide such flag into main pipeline #432
- MapPipelineItem -- provide ctx parameter in order to reach out parent Pipeline Context [Idle mode] #431
- NetworkSerializer -- support the case of Vectorizers==Null [Without embedding, google-trans-sampler backlog] #430
- ParsedRow -- depends on pandas, while it might be switched to dict type instead [AREnets backlog] #427
- Remove unused code after AREnets movement #425
- AREnets -- separated project for networks contrib part, which provides NN implementation based on Tensorflow #423
- Entity -- Adopt DisplayValue property for CSV serialization #419
- TsvWriter -- Remove Dataframe dependency #408
- OpenNREJsonWriter -- df.sort is not an inplace by default #407
- NeuralNetworkModelIO -- simplify implementation #406
- Brat -- support nested entities (CompoundEntity type) [simple implementation] #398
- What's New -- 0.22.1 Release #323
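The parent-context support listed for PipelineContext (#433) can be pictured as a lookup that falls back to the enclosing pipeline's context. The sketch below is an illustration under assumptions: the class body, the `provide` method, and the `idle_mode` and `lang` keys are hypothetical and may differ from what AREkit actually implements.

```python
from typing import Any, Dict, Optional


class PipelineContext:
    """Hypothetical context that falls back to a parent context on lookup."""

    def __init__(self, d: Optional[Dict[str, Any]] = None,
                 parent: Optional["PipelineContext"] = None):
        self._d = dict(d or {})
        self._parent = parent

    def provide(self, key: str) -> Any:
        # Local values win; otherwise ask the parent chain.
        if key in self._d:
            return self._d[key]
        if self._parent is not None:
            return self._parent.provide(key)
        raise KeyError(key)


# The main pipeline carries a flag; the nested pipeline overrides one value only.
main_ctx = PipelineContext({"idle_mode": True, "lang": "ru"})
nested_ctx = PipelineContext({"lang": "zh"}, parent=main_ctx)

print(nested_ctx.provide("idle_mode"), nested_ctx.provide("lang"))
```

With this shape, a flag set once on the main pipeline is visible to items of nested pipelines without being copied into each nested context.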
Fixed bugs:
- Brat -- incorrect parsing approach may sometimes result in a mismatched value (use t) #437
- VocabRepositoryUtils -- numpy API considers # by default in vocabulary on load #428
- LabelsScaler -- uint dict and dict might have different sizes #426
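As background for the Brat parsing fix, BRAT standoff files separate the annotation ID, the type with offsets, and the covered text by tab characters, while the covered text itself may contain spaces. The sketch below assumes that #437 concerns this separator, which the entry does not state explicitly. The function name is hypothetical and only contiguous spans are handled.

```python
def parse_text_bound(line: str):
    """Parse a BRAT text-bound annotation: ID <TAB> TYPE START END <TAB> TEXT."""
    # Split on tabs only, so spaces inside the covered text are preserved.
    ann_id, type_and_span, text = line.rstrip("\n").split("\t", 2)
    e_type, start, end = type_and_span.split(" ")
    return ann_id, e_type, int(start), int(end), text


line = "T1\tORG 0 23\tMoscow State University\n"

# Tab-aware parsing keeps the multi-word value intact.
print(parse_text_bound(line))

# Generic whitespace splitting breaks the value into separate tokens.
print(line.split())
```

Splitting on any whitespace is the kind of approach that works for single-word entities and then silently produces wrong values for multi-word ones.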
Closed issues:
- read_ruattitudes_to_brat_in_memory -- no need to pass label scaler #436
- PosTags -- make them optional parameter for neural networks #435
- RuSentiFrames -- clarify tqdm caption when loading (ARElight backlog) #434
- Sync with AREnets updates #429
- BERT -- provide cropped sampler #422
- googletrans -- move to the separated project #421
- _provide_sentence_terms -- consider s_ind and t_ind as well since they may be combined with and modified at the same time [nivts_project backlog] #420
- Entity -- provide DisplayValue property (which is Value by default) #418
- googletrans -- TranslatorPipelineItem for parsed texts #416
- Instant downloading -- simplify data downloading #413
- PandasBasedRowsStorage -- implement the nested type from the BaseRowsStorage #410
- Readers/Writers -- make a part of the contrib #409
- TextOpinion Annotation -- particular filtering rules for SentiNEREL and Russian texts. [pipeline items] #404
- Evaluation -- enhancing error log analysis #400
- Statistical Folding provided via file #399
- Balancing as a side part of the Storage #380
Merged pull requests:
- CVE-2007-4559 Patch #412 (TrellixVulnTeam)
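CVE-2007-4559 refers to path traversal in Python's `tarfile` module: extracting an archive can write outside the target directory when a member name contains `..` or an absolute path. The sketch below shows the general shape of a guard against this. It is an assumption about what such a patch typically looks like, not the exact code merged in #412, and both helper names are illustrative.

```python
import os
import tarfile


def is_within_directory(directory: str, target: str) -> bool:
    """Return True if target resolves to a path inside directory."""
    abs_directory = os.path.abspath(directory)
    abs_target = os.path.abspath(target)
    return os.path.commonpath([abs_directory, abs_target]) == abs_directory


def safe_extract(tar: tarfile.TarFile, path: str = ".") -> None:
    """Extract all members only if none of them escapes the target path."""
    for member in tar.getmembers():
        member_path = os.path.join(path, member.name)
        if not is_within_directory(path, member_path):
            raise ValueError(f"Attempted path traversal in tar file: {member.name}")
    tar.extractall(path)
```

This check covers member names only and does not inspect symbolic links; on recent Python versions the `filter="data"` argument of `extractall` offers a stricter built-in alternative.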
* This Changelog was automatically generated by github_changelog_generator