Expose similarity matching item pairs in R library (aka crosswalk table) #4

Open
woodthom2 opened this issue Oct 29, 2024 · 1 comment
Labels: bug, good first issue, help wanted

Comments

@woodthom2
Contributor

woodthom2 commented Oct 29, 2024

Description

The web UI allows users to see the matching item pairs above a given threshold.

Can we make the R library also return the matching pairs above a threshold? This is called the crosswalk table.

Task: make a function called generate.crosswalk.table(match, threshold) or similar, which takes the output of match_instruments and returns the pairs that match above a threshold.

A crosswalk table contains the same information that currently comes back in the similarity matrix, just in a different format.
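To illustrate the "same information, different format" point, here is a minimal base R sketch that reshapes a small similarity matrix into long format. The matrix, its dimension names, and the scores are made up for illustration and are not taken from the harmony library or its output.

```r
# Toy similarity matrix; the names and scores are invented for illustration.
sim <- matrix(
  c(0.91, 0.12,
    0.30, 0.85),
  nrow = 2, byrow = TRUE,
  dimnames = list(survey1 = c("A1", "A2"), survey2 = c("B1", "B2"))
)

# as.table() followed by as.data.frame() turns every matrix cell into one row,
# with one column per dimension plus a column holding the cell value.
long <- as.data.frame(as.table(sim), responseName = "match_score",
                      stringsAsFactors = FALSE)
print(long)
```

Every cell of the matrix becomes one row, so a crosswalk table is this long form restricted to the rows whose score clears the threshold.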

It is a long-format data frame that shows each matching pair of questions above a certain threshold, along with their respective IDs, question texts, and match scores. Here's an example structure:

# Example structure of crosswalk table DataFrame:
# tibble [n × 6]
# $ pair_name      : chr  # Name of the survey pair
# $ question1_no   : chr  # ID of question from first survey
# $ question1_text : chr  # Text of question from first survey
# $ question2_no   : chr  # ID of question from second survey
# $ question2_text : chr  # Text of question from second survey
# $ match_score    : num  # Similarity score between the questions
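A possible starting point for the requested function, as a base R sketch that produces the six columns listed above. It is not the library's implementation. Because the exact structure returned by match_instruments is not described here, the sketch takes a plain similarity matrix and two question data frames as inputs. The function name `generate_crosswalk_table`, the `question_no` / `question_text` column names, the default threshold, and the default `pair_name` are all assumptions made for illustration.

```r
# Hypothetical helper, not part of the harmony R package API.
# Assumes `similarity` is a numeric matrix with one row per question in the
# first survey and one column per question in the second survey, and that
# `questions1` / `questions2` are data frames with `question_no` and
# `question_text` columns in the same order as the matrix rows / columns.
generate_crosswalk_table <- function(similarity, questions1, questions2,
                                     threshold = 0.7,
                                     pair_name = "survey1_survey2") {
  stopifnot(
    is.matrix(similarity),
    nrow(similarity) == nrow(questions1),
    ncol(similarity) == nrow(questions2)
  )

  # Row/column indices of every cell at or above the threshold
  # (using >= rather than > is a choice made for this sketch).
  hits <- which(similarity >= threshold, arr.ind = TRUE)

  crosswalk <- data.frame(
    pair_name      = rep(pair_name, nrow(hits)),
    question1_no   = as.character(questions1$question_no[hits[, "row"]]),
    question1_text = as.character(questions1$question_text[hits[, "row"]]),
    question2_no   = as.character(questions2$question_no[hits[, "col"]]),
    question2_text = as.character(questions2$question_text[hits[, "col"]]),
    match_score    = similarity[hits],
    stringsAsFactors = FALSE
  )

  # Strongest matches first.
  crosswalk <- crosswalk[order(-crosswalk$match_score), , drop = FALSE]
  rownames(crosswalk) <- NULL
  crosswalk
}

# Usage with made-up questions and scores.
sim <- matrix(c(0.91, 0.12,
                0.30, 0.85,
                0.05, 0.40), nrow = 3, byrow = TRUE)
q1 <- data.frame(question_no = c("1", "2", "3"),
                 question_text = c("Feeling nervous", "Trouble sleeping",
                                   "Poor appetite"))
q2 <- data.frame(question_no = c("1", "2"),
                 question_text = c("I felt anxious", "I slept badly"))

crosswalk <- generate_crosswalk_table(sim, q1, q2, threshold = 0.8)
print(crosswalk)
```

A real implementation would presumably extract the matrix and question metadata from the match_instruments result and might return a tibble rather than a base data frame. The Python crosswalk implementation referenced in this issue is the better guide for naming and behavior.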

See equivalent Python issue: harmonydata/harmony#62

woodthom2 added the bug and good first issue labels Oct 29, 2024
woodthom2 changed the title from "Expose similarity (API return item matches) in R library" to "Expose similarity (API return item matches) in R library (aka crosswalk table)" Oct 29, 2024
woodthom2 added the help wanted label Oct 29, 2024
woodthom2 changed the title from "Expose similarity (API return item matches) in R library (aka crosswalk table)" to "Expose similarity matching item pairs in R library (aka crosswalk table)" Oct 29, 2024
@woodthom2
Contributor Author

Crosswalks have been implemented in Python by @vkrithika25 - you could use this as a guide for R: harmonydata/harmony#65
