Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add aggregate operation for merging analysis glosses #3174

Draft
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

jasonleenaylor
Copy link
Contributor

@jasonleenaylor jasonleenaylor commented Jun 21, 2024

  • Only merges if a word has only two senses where one sense has one english gloss and the other has one tok pisin gloss and the english sense has either no semantic domains or where all the semantic domains in the english sense are also in the tok pisin sense

This change is Reviewable

- Only merges if a word has only two senses where one sense
  has one english gloss and the other has one tok pisin gloss and
  the english sense has either no semantic domains or where all the
  semantic domains in the english sense are also in the tok pisin
  sense
Copy link

codecov bot commented Jun 21, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 75.14%. Comparing base (e8a7106) to head (4ed1c02).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3174      +/-   ##
==========================================
+ Coverage   75.10%   75.14%   +0.03%     
==========================================
  Files         275      275              
  Lines       10423    10423              
  Branches     1230     1230              
==========================================
+ Hits         7828     7832       +4     
+ Misses       2235     2232       -3     
+ Partials      360      359       -1     
Flag Coverage Δ
backend 84.23% <ø> (+0.08%) ⬆️
frontend 66.96% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jasonleenaylor
Copy link
Contributor Author

database/util/twoAnalysisSenseCleanup.js line 66 at r2 (raw file):

    // Add fields to extract semantic domain GUIDs from both senses
    $addFields: {
      semDomGuidsA: "$senseA.SemanticDomains.guid",

I do prefer the 'to' and 'from' naming over 'a' and 'b' because it helps readability. The direction of the merge is clarified by those names (to me at least)

Code quote:

semDomGuidsA: "$senseA

Copy link
Collaborator

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 2 files reviewed, all discussions resolved


database/util/twoAnalysisSenseCleanup.js line 66 at r2 (raw file):

Previously, jasonleenaylor (Jason Naylor) wrote…

I do prefer the 'to' and 'from' naming over 'a' and 'b' because it helps readability. The direction of the merge is clarified by those names (to me at least)

I had made it closer to symmetric, but since something from the second sense is lost, it is alas still unsymmetric. Your "to" and "from" preference has been restored.

@jasonleenaylor
Copy link
Contributor Author

database/util/twoAnalysisSenseCleanup.js line 66 at r2 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

I had made it closer to symmetric, but since something from the second sense is lost, it is alas still unsymmetric. Your "to" and "from" preference has been restored.

Thanks, even if some of the merging is more symmetric the 'from' sense is still being removed, so I like keeping track of the 'to' and 'from' .

@jmgrady jmgrady requested a review from imnasnainaec June 25, 2024 14:51
Copy link
Collaborator

@jmgrady jmgrady left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 2 files at r3.
Reviewable status: 1 of 2 files reviewed, 5 unresolved discussions (waiting on @imnasnainaec and @jasonleenaylor)


database/util/twoAnalysisSenseCleanup.sh line 1 at r3 (raw file):

#!/bin/bash
#!/usr/bin/env bash

is preferred.

In addition, the file needs to have execute permission.

Code quote:

#!/bin/bash

database/util/twoAnalysisSenseCleanup.sh line 9 at r3 (raw file):

set -e

usage() {

Most examples have a space between the function name and the parentheses. I believe this is the preferred syntax:

usage () {
...

Code quote:

usage() {

database/util/twoAnalysisSenseCleanup.sh line 30 at r3 (raw file):

  Caveat:
    This script assumes that the backups have been created by the combine-backup

While it prudent to do a backup or export the project first, the script does not make any assumptions about backups one way or another. In fact, backups are not supported on the NUCs or the desktop environment.

Code quote:

  Caveat:
    This script assumes that the backups have been created by the combine-backup

database/util/twoAnalysisSenseCleanup.sh line 73 at r3 (raw file):

      if [[ "${PROJ}" =~ [^0-9a-f] ]]; then
        echo "The -p/--proj argument must be a hexidecimal id"
        exit 1

Should also verify that it is 24 characters long.

Code quote:

      if [[ "${PROJ}" =~ [^0-9a-f] ]]; then
        echo "The -p/--proj argument must be a hexidecimal id"
        exit 1

database/util/twoAnalysisSenseCleanup.sh line 91 at r3 (raw file):

if [[ -z "${LANGA}" ]]; then
  echo "The -A/--langA argument is require"

-> required
(and others)

Code quote:

require

database/util/twoAnalysisSenseCleanup.sh line 92 at r3 (raw file):

if [[ -z "${LANGA}" ]]; then
  echo "The -A/--langA argument is require"
  echo "Run this script with -h/--help for usage"

Rather than have the user run the script again with the -h option, call the usage function before exiting.

Code quote:

  echo "Run this script with -h/--help for usage"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants