
AREkit-0.23.0-ChineseNY

@nicolay-r nicolay-r released this 21 Jan 11:15
· 6 commits to 0.23.0-rc since this release
a2f6fe8

What's new: Globalization and Internationalization


Globalization for any language is the major aspect of 0.23.0, as we announce AREnets and sample-transfer.
We generalize some aspects so that languages other than the original one (Russian) can be handled.
We introduce CompoundEntities, which may include other entities.
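
As a rough illustration of the nesting idea, a compound entity can be modelled as an entity that also carries child entities. This is only a sketch: the class names, fields, and the `iter_nested` helper below are hypothetical and are not taken from the AREkit API.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Entity:
    # Hypothetical minimal entity: surface text plus a type label.
    value: str
    e_type: str


@dataclass
class CompoundEntity(Entity):
    # Hypothetical: an entity that may contain other (nested) entities.
    children: List[Entity] = field(default_factory=list)

    def iter_nested(self) -> Iterator[Entity]:
        """Yield every nested entity depth-first, excluding self."""
        for child in self.children:
            yield child
            if isinstance(child, CompoundEntity):
                yield from child.iter_nested()


# An organisation name that contains a country name.
org = CompoundEntity("Bank of China", "ORG",
                     children=[Entity("China", "COUNTRY")])
nested_values = [e.value for e in org.iter_nested()]
```

Deriving the compound type from the plain one means code that only expects an `Entity` keeps working, while code that is aware of nesting can walk the children.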

Major

Fixed bugs

  • Refactored BRAT parser, fixed bugs for other languages/collections.

Minor

Full Changelog

Implemented enhancements:

  • PipelineContext -- support parent contexts in case of the nested pipelines. #433
  • Idle mode -- provide such a flag in the main pipeline #432
  • MapPipelineItem -- provide ctx parameter in order to reach the parent Pipeline Context [Idle mode] #431
  • NetworkSerializer -- support the case of Vectorizers==Null [Without embedding, google-trans-sampler backlog] #430
  • ParsedRow -- depends on pandas, while it might be switched to dict type instead [AREnets backlog] #427
  • Remove unused code after AREnets movement #425
  • AREnets -- separated project for networks contrib part, which provides NN implementation based on Tensorflow #423
  • Entity -- Adopt DisplayValue property for CSV serialization #419
  • TsvWriter -- Remove Dataframe dependency #408
  • OpenNREJsonWriter -- df.sort is not in-place by default #407
  • NeuralNetworkModelIO -- simplify implementation #406
  • Brat -- support nested entities (CompoundEntity type) [simple implementation] #398
  • What's New -- 0.22.1 Release #323
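
The parent-context idea from #433 and #431 can be sketched as a lookup that falls back to the enclosing pipeline's context. This is a minimal sketch under stated assumptions: the constructor signature and the `provide` method below are illustrative, and the actual AREkit `PipelineContext` may differ.

```python
from typing import Any, Dict, Optional


class PipelineContext:
    """Hypothetical context: a dict of parameters with an optional parent."""

    def __init__(self, d: Dict[str, Any],
                 parent: Optional["PipelineContext"] = None):
        self._d = dict(d)
        self._parent = parent

    def provide(self, key: str) -> Any:
        # Look up locally first, then walk up the chain of parents.
        if key in self._d:
            return self._d[key]
        if self._parent is not None:
            return self._parent.provide(key)
        raise KeyError(key)


# The main pipeline sets a flag; the nested pipeline overrides one value
# and inherits everything else.
main_ctx = PipelineContext({"idle_mode": True, "lang": "ru"})
nested_ctx = PipelineContext({"lang": "zh"}, parent=main_ctx)

inherited_flag = nested_ctx.provide("idle_mode")
overridden_lang = nested_ctx.provide("lang")
```

With this layout a nested pipeline item can read a flag such as an idle-mode switch from the main pipeline without the flag being copied into every child context.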

Fixed bugs:

  • Brat -- incorrect parsing approach may sometimes result in a mismatched value (use t) #437
  • VocabRepositoryUtils -- numpy API considers # by default in vocabulary on load #428
  • LabelsScaler -- uint dict and dict might have different sizes #426
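
The `#` issue from #428 can be reproduced with `numpy.loadtxt`, which treats `#` as a comment marker by default. The snippet below is a sketch that assumes a one-token-per-line vocabulary read via `loadtxt`; the file contents are made up, and the actual loading code in `VocabRepositoryUtils` may differ.

```python
import io

import numpy as np

# Hypothetical vocabulary file: one token per line, including a literal "#".
vocab_text = "hello\n#\nworld\n"

# Default behaviour: "#" starts a comment, so that token is silently dropped.
default_tokens = np.loadtxt(io.StringIO(vocab_text), dtype=str)

# Disabling comment handling keeps the "#" token in the vocabulary.
all_tokens = np.loadtxt(io.StringIO(vocab_text), dtype=str, comments=None)
```

Passing `comments=None` keeps the number of loaded tokens equal to the number of lines in the file, which matters when token positions are used as indices.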

Closed issues:

  • read_ruattitudes_to_brat_in_memory -- no need to pass label scaler #436
  • PosTags -- make them optional parameter for neural networks #435
  • RuSentiFrames -- clarify tqdm caption when loading (ARElight backlog) #434
  • Sync with AREnets updates #429
  • BERT -- provide cropped sampler #422
  • googletrans -- move to the separated project #421
  • _provide_sentence_terms -- consider s_ind and t_ind as well since they may be combined and modified at the same time [nivts_project backlog] #420
  • Entity -- provide DisplayValue property (which is Value by default) #418
  • googletrans -- TranslatorPipelineItem for parsed texts #416
  • Instant downloading -- simplify data downloading #413
  • PandasBasedRowsStorage -- implement the nested type from the BaseRowsStorage #410
  • Readers/Writers -- make a part of the contrib #409
  • TextOpinion Annotation -- particular filtering rules for SentiNEREL and Russian texts. [pipeline items] #404
  • Evaluation -- enhancing error log analysis #400
  • Statistical Folding provided via file #399
  • Balancing as a side part of the Storage #380

Merged pull requests:

* This Changelog was automatically generated by github_changelog_generator