All notable changes (beginning at version 0.2.0) to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Added outgoing links to homepage
- Upgraded SASTADEV to 0.2.2
- Added a home page
- Locks first row and first two columns in SAF xlsx output
- Resolved various bugs that surfaced in release 0.9.0
Drastic changes to the way analysis is performed, replacing functionality by SASTADEV counterparts:
- reading SAF files
- performing analysis
- changed the SAF writer
Upgraded SASTADEV dependency
- Adhere to FAIR software principles
- User documentation
- Split Angular frontend into core-, shared-, and feature- modules
- Extended anonymization codes and moved to a centrally located JSON file
- First version of user documentation
- Various frontend refactors
- Corpus and transcript divided in list and detail views to reduce data transfer
- Resolved a bug where corpora were being constantly retrieved, leading to very high data transfer
- Fixed multiple vulnerabilities in backend and frontend.
- Resolved a bug where transcript paths were incorrectly saved, leaving them unable to be downloaded
- Resolved a bug where a dictionary was changed during iteration, preventing analysis
- Visualize parse trees for utterances
- Resolved a bug where words would be misaligned when analyzing with existing annotations
- Implemented latest SASTADEV methods
- Clearer error messages for parses that cannot be analysed
- Bump corpus2alpino version to 0.3.10
- Authentication guard for protected routes now waits for the user check to complete, preventing redirects to login after a page refresh.
- Resolved a bug where corpora kept refreshing after leaving their page
- Resolved several vulnerabilities in both backend and frontend
- Implemented preprocessing steps for CHAT input files (anonymization, punctuation cleanup)
- Separated parsed and reparsed/corrected treebanks. Admins can view them separately; both files are included in corpus downloads
- Added comment row to annotation files, allowing free text. Comments are not interpreted by the analysis
- Allows annotation in unaligned column, signifying either utterance-level or unaligned annotations
- Disabled method selection for non-admins. The latest method is used by default. Admins can still select older versions
- Moved unaligned column to the first position, before word1..wordN in annotation files
- Resolved an issue where default methods were prevented from being set.
- Some forms were not correctly updated after supplying manual corrections, e.g. ASTA WW and N. This is fixed by implementing aligned results functionality.
- New SASTADEV method definitions.
- Extra corpus information and control:
  - Overview of number of targeted utterances and targeting flags.
  - Overview of all utterances.
  - Shortcut to upload additional transcripts to the corpus.
- Upload multiple files without zipping them.
- In SASTA Output Format, lock all cells except annotation cells. This is implemented to avoid errors in manual correction files.
- ASTA corrects Nouns and Verbs
- use exactresults
- Resolved an issue where corrected transcripts were incorrectly analyzed.
- Resolved an issue where CHAT annotations were added to the wrong utterances.
- Implement latest method definitions.
- Upgrade to Python 3.7.x
- Concurrent parsing: up to 8 transcripts can be parsed in parallel.
- All marked utterances are given utterance IDs and no longer use utterance numbers. Analysis now numbers utterances 1-N, where N is the number of marked utterances.
- Phase out `python-ucto`, and by proxy `ucto` (through changes in `corpus2alpino`). Severely reduces dependency complexity.
- Dropped Python 3.6.x support.
- Fixed multiple vulnerabilities in backend and frontend.
- Differentiate CHAT postcode markers by method. For STAP `[+ VU]` is marked, for all methods `[+ G]`
- Process CHAT postcodes `[+ VU]` and `[+ G]` as utterance analysis markers
- Updated VKL logo
- Trailing whitespace in SIF headers is ignored
- Run pre-queries before core-queries
- CHAT format input.
- Asynchronous parsing; the parse process continues in the background.
- Updates to query definitions and related SASTADEV functions.
- Automatic corrections on input files to improve parses of irregular language (implemented in SASTADEV).
- Logos for participating organisations.
- Give corpora a method category; a corpus can only be queried and annotated using methods within this category.
- Updated look.
- Remove 'Inform only' choice; the option is now always set to true.
- Asynchronous parsing ensures the application does not lock all users for the duration of a parse.
- Fix multiple vulnerabilities in both frontend and backend.