Keep population #847

Merged
merged 20 commits into from
Nov 21, 2023

Conversation

lizihao-anu
Contributor

A first draft of the translation of keep_population from SPSS to R. Help is needed to check that the logic is correct.

Jennit07 and others added 5 commits October 2, 2023 09:21
* Bump `{slfhelper}` version

The new version is needed to read the SLFs now. We use this in `get_existing_data_for_tests()`

* Remove unnecessary code from `get_anon_chi` (#759)

* remove unnecessary code from `get_anon_chi`

`get_anon_chi` was updated in slfhelper v0.10

* [check-spelling] Update metadata

Update for https://github.com/Public-Health-Scotland/source-linkage-files/actions/runs/5669528966/attempts/1
Accepted in #759 (comment)

Signed-off-by: check-spelling-bot <[email protected]>

---------

Signed-off-by: check-spelling-bot <[email protected]>
Co-authored-by: marjom02 <[email protected]>
Co-authored-by: Megan McNicol <[email protected]>

* Set the default reporter for `tar_outdated()` and friends

* Comment out dataset writing targets

These take a very long time to run, so were skipped at the last update. They need to be revisited.

* Make sure `year` is added as the first variable

* Correct some documentation (#769)

* Correct some documentation

This resolves a build warning.

* Style code

---------

Co-authored-by: Moohan <[email protected]>

* Make some changes suggested by lintr

Lots of layout changes, as well as lots of implicit to explicit integer / double changes.

* Document

* Fix documentation typo

* Investigate missing datazone from episode file (#773)

* Format postcode into `pc7` format

* Style code

* Style code

* Update documentation

* Update comment in R/process_extract_ae.R

* Implement catch-all for PC7 format

---------

Co-authored-by: Jennit07 <[email protected]>
Co-authored-by: James McMahon <[email protected]>
Co-authored-by: Moohan <[email protected]>

* Remove some obsolete code (#770)

* Remove some obsolete code

Renaming and removing some functions.

* Style code

---------

Co-authored-by: Moohan <[email protected]>
Co-authored-by: Zihao Li <[email protected]>

* Simplify `create_hscp_test_flags` (#772)

* Simplify `create_hscp_test_flags`

* Update documentation

* Style code

* simplify `create_hb_test_flags`

* implement hscp test flags into tests

* Simplify `create_demog_test_flags`

---------

Co-authored-by: James McMahon <[email protected]>
Co-authored-by: Moohan <[email protected]>

* Rewrite case when statements (#780)

* updated code from case_when to case_match as it's a bit easier to read

* Style code

* changed some more `case_when` to `case_match`

* Style code

* [check-spelling] Update metadata

Update for https://github.com/Public-Health-Scotland/source-linkage-files/actions/runs/5752014211/attempts/1
Accepted in #780 (comment)

Signed-off-by: check-spelling-bot <[email protected]>

* Add tests for `convert_sending_location_to_lca`

---------

Signed-off-by: check-spelling-bot <[email protected]>
Co-authored-by: marjom02 <[email protected]>
Co-authored-by: SwiftySalmon <[email protected]>
Co-authored-by: James McMahon <[email protected]>

* Update R-CMD-check.yaml (#781)

Co-authored-by: Jennit07 <[email protected]>

* added solve for hscp names (#789)

In the processed extract the variable is called `hscp`, and in the final SLF it's called `hscp2018`.

Fixed with a nested if statement.
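
As a rough illustration of the kind of nested check described above, here is a minimal sketch. The helper name `pick_hscp_var()` and the example data frames are hypothetical and are not taken from the package; the actual fix in #789 may look quite different.

```r
# Hypothetical helper (an assumption for illustration, not the package's code):
# return the name of the HSCP variable present in `data`.
pick_hscp_var <- function(data) {
  if ("hscp2018" %in% names(data)) {
    # Final SLF naming
    "hscp2018"
  } else {
    if ("hscp" %in% names(data)) {
      # Processed extract naming
      "hscp"
    } else {
      stop("No HSCP variable found in the data")
    }
  }
}

# Made-up example data
processed_extract <- data.frame(chi = "0101011234", hscp = "S37000001")
final_slf <- data.frame(chi = "0101011234", hscp2018 = "S37000001")

extract_var <- pick_hscp_var(processed_extract)
slf_var <- pick_hscp_var(final_slf)
```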

Co-authored-by: marjom02 <[email protected]>

* Fix locality (#802)

Tiny error and a simple fix.

Co-authored-by: Jennit07 <[email protected]>

* Add simple scripts for running targets as a workbench job (#767)

* Fix CHI duplicates of chi in individual file (#791)

* fix duplicated matches in chi in sc data.

* Update R/create_individual_file.R

* update on join_sc_client

* Create a test checking if individual files have duplicated chi

* add duplicated chi number to the tests in process_tests_individual_file

---------

Co-authored-by: lizihao-anu <[email protected]>
Co-authored-by: James McMahon <[email protected]>

* Update NSU code for new 22/23 cohort (#784)

Update `check_year_valid` for NSUs

* Amend `get_boxi_extract_path` function for archiving DN and CMH data  (#785)

* Update `get_boxi_extract_path` for DN/CMH data

* Remove extra function

* [check-spelling] Update metadata

Update for https://github.com/Public-Health-Scotland/source-linkage-files/actions/runs/5856792420/attempts/1
Accepted in #785 (comment)

Signed-off-by: check-spelling-bot <[email protected]>

---------

Signed-off-by: check-spelling-bot <[email protected]>
Co-authored-by: Jennit07 <[email protected]>
Co-authored-by: James McMahon <[email protected]>

* Fix increase in total preventable beddays (#779)

* further obsolete code change

* fix the preventable_beddays

Co-authored-by: James McMahon <[email protected]>

---------

Co-authored-by: James McMahon <[email protected]>
Co-authored-by: Jennit07 <[email protected]>

* fix warning on `:=` (#797)

* fix warning on `:=`

* Update R/aggregate_by_chi.R

Co-authored-by: James McMahon <[email protected]>

* Style code

---------

Co-authored-by: James McMahon <[email protected]>
Co-authored-by: lizihao-anu <[email protected]>

* Add 2324 targets/workbench job file

* Use `get_source_extract_path` in homelessness (#796)

This was already set up, just not used for some reason. Note that this will switch from using a `.rds` to `.parquet` (unless you do `get_source_extract_path(year, "Homelessness", ext = "rds")`).

Co-authored-by: Jennit07 <[email protected]>

* Correct tests for NSU

* Update script for extracting NSU from SMRA space

* Update year in 99_NSU extract script

* Update news for September 23 update (#811)

* Update News for March and June updates

* Update release date

* WIP - update news for Sep update

* Update NEWS.md

Fix some typos / grammar

---------

Co-authored-by: James McMahon <[email protected]>

* Apply styling

* Fix issue with `case_match` types (#810)

* Fix issue with `case_match` types

It seems that `case_match()` is stricter about types than `case_when()`. See the below code:

```r
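library(dplyr)

# Breaks: `height` is an integer, so `case_match()` rejects the character
# left-hand side ("172"), while `case_when()` lets the comparison through.
mutate(starwars,
  new_height = case_when(
    height == "172" ~ "170"
  ),
  new_height2 = case_match(
    height,
    "172" ~ "170"
  ),
  .after = "height"
)

# Works: the left-hand side of `case_match()` is now an integer (172L),
# matching the type of `height`.
mutate(starwars,
  new_height = case_when(
    height == "172" ~ "170"
  ),
  new_height2 = case_match(
    height,
    172L ~ "170"
  ),
  .after = "height"
)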
```

Since `sending_location` is an integer, the LHS of `case_match` must be numeric. It was slightly incorrect previously but `case_when` let us get away with it!

I also updated and added to the tests.
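
To make the point about `sending_location` concrete, here is a minimal sketch of recoding an integer vector with `case_match()`. The codes and output values are made up for illustration and are not the package's real lookup; the only assumption carried over from the text is that the input is an integer, so the left-hand sides are written as integer literals.

```r
library(dplyr)

# Hypothetical integer codes (assumed values, not the real sending locations)
sending_location <- c(100L, 110L, 999L)

# The left-hand sides are integers, matching the type of `sending_location`.
# Values with no match become NA because `.default` is left unset.
lca <- case_match(
  sending_location,
  100L ~ "01",
  110L ~ "02"
)
```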

* Style code

* Style code

---------

Co-authored-by: Moohan <[email protected]>
Co-authored-by: Jennit07 <[email protected]>
Co-authored-by: Jennit07 <[email protected]>

* Bug - Outpatients tests failing due to missing HSCP (#816)

* Update `produce_source_extract_tests`

* Update outpatients tests with hscp_var = FALSE

* Revert "Style code"

This reverts commit 8e73d4a.

* Style code

* simplify code

* Update documentation

* Rename `hscp_var` to `add_hscp_count`

* Update documentation

---------

Co-authored-by: Jennit07 <[email protected]>
Co-authored-by: James McMahon <[email protected]>
Co-authored-by: Moohan <[email protected]>

* fix read_sc_all_alarms_telecare with incorrect format in period (#814)

* fix read_sc_all_alarms_telecare with the incorrect format in period

---------

Co-authored-by: lizihao-anu <[email protected]>
Co-authored-by: James McMahon <[email protected]>

* Fix `convert_sending_location_to_lca` example

* Use `col_select` instead of `columns` in tests

* Add tests for `compute_mid_year_age` (#809)

* Add tests for `compute_mid_year_age`

* Remove redundant code

* Update documentation

---------

Co-authored-by: Jennit07 <[email protected]>
Co-authored-by: Jennit07 <[email protected]>

* Add a new function to set up keyring (#800)

* Add a new function to set up keyring

I've tested this by deleting my `.Renviron` and deleting my keyring `keyring::keyring_delete("createslf")` and it seems to work. Would be great to have someone with an existing set-up (Jen) test it, and to have someone who doesn't have it set up to test it.

The code looks complicated but I've just tried to catch every scenario, so the process should be smooth and clear (from the user's point of view).

I've also expanded the code relating to the username, which will now hopefully work in more cases.

* [check-spelling] Update metadata

Update for https://github.com/Public-Health-Scotland/source-linkage-files/actions/runs/5824423711/attempts/1
Accepted in #800 (comment)

Signed-off-by: check-spelling-bot <[email protected]>

* Update documentation

---------

Signed-off-by: check-spelling-bot <[email protected]>
Co-authored-by: Jennit07 <[email protected]>
Co-authored-by: Jennit07 <[email protected]>

* Add additional tests for `get_file_path` (#808)

* Add additional tests for `get_file_path`

* Style code

---------

Co-authored-by: Moohan <[email protected]>
Co-authored-by: Jennit07 <[email protected]>

* Rename `run_episode_file()` -> `create_episode_file()` (#803)

* Rename `run_episode_file()` -> `create_episode_file()`

This improves consistency! When speaking to Megan we noted that having the two 'main' functions with different names was needlessly confusing!

* Delete run_targets_tests.R

* Update documentation

---------

Co-authored-by: Jennit07 <[email protected]>
Co-authored-by: Jennit07 <[email protected]>
Co-authored-by: Megan McNicol <[email protected]>

* Remove incorrect references to rds (#798)

* Remove incorrect references to rds

Since we (mostly) don't use rds anymore these references are incorrect and potentially confusing.

I've updated lots of documentation to remove the reference to rds.

I've also updated many comments that mentioned rds (these were probably the most confusing).

* Update documentation

---------

Co-authored-by: Jennit07 <[email protected]>
Co-authored-by: Megan McNicol <[email protected]>

* Make targets and tarchetypes required packages (#799)

Co-authored-by: Megan McNicol <[email protected]>

* Update episode file functions to pass data through (#754)

* Update `read_file` to return an empty tibble if passed the dummy path

This is needed for some other bits, notably NSUs

* Update SPARRA and HHG paths to return dummy if the year is invalid

* Extract all data as a parameter

* Style code

* Update documentation

* Style code

* Update documentation

* rename `run` to `create_episode_file`

* Update documentation

---------

Co-authored-by: Moohan <[email protected]>
Co-authored-by: Jennifer Thom <[email protected]>
Co-authored-by: Jennit07 <[email protected]>

* Tests/it extract path (#807)

* Add additional tests for `check_it_reference()`

* Make the check on the IT reference stricter

* Update documentation

---------

Co-authored-by: Jennit07 <[email protected]>
Co-authored-by: Jennit07 <[email protected]>

* Update workflow to run against the development branch (#795)

* Make test-coverage.yaml run against development

* Make lint-changed-files.yaml run against development

---------

Co-authored-by: Jennit07 <[email protected]>

* Remove package wide imports of `readr` (#792)

* Update documentation

* Use `readr::` where possible

* Update documentation

---------

Co-authored-by: Jennit07 <[email protected]>
Co-authored-by: Megan McNicol <[email protected]>

* Handle OpenData extracts better (#794)

* Refactor the LA Code OpenData

This should now run as its own target and then be passed to the homelessness data.

I also added some tests.

* Also add some tests for the GP prac clusters OpenData

* Update documentation

---------

Co-authored-by: Moohan <[email protected]>
Co-authored-by: Jennit07 <[email protected]>

* Fix the pkgdown site/job (#804)

* Fix the pkgdown site/job

It generates this site: https://public-health-scotland.github.io/source-linkage-files/ although it hasn't been working for a while since any new function needs to be added to (or captured by) the `_pkgdown.yml` file.

This PR is a pretty minimal fix to get the site working again.

* Update documentation

* Update documentation

* Update `create_episode_file`

* Remove `run_episode_file`

* update documentation

---------

Co-authored-by: Jennit07 <[email protected]>
Co-authored-by: Jennit07 <[email protected]>
Co-authored-by: Jennifer Thom <[email protected]>

* Add new 'final' file path functions (#787)

* New function for SLF final file paths

* Implement final file path functions

* Style code

* Update documentation

* Update final file paths to use `...`

* fixing conflicts with `run episode file` getting renamed to `create episode file`

* Update documentation

* Update documentation

* Style code

---------

Co-authored-by: Jennit07 <[email protected]>
Co-authored-by: marjom02 <[email protected]>
Co-authored-by: SwiftySalmon <[email protected]>
Co-authored-by: Megan McNicol <[email protected]>

* Check scripts are in snake case (#793)

* Update `get_boxi_extract_path` for DN/CMH data

* Remove extra function

* Update documentation

* change `get_boxi_extract_path` to snake_case

* change `get_source_extract_path` to snake_case

* Update documentation

* Update targets with snake_case

* Fix typo

* Style code

---------

Co-authored-by: Jennit07 <[email protected]>
Co-authored-by: James McMahon <[email protected]>
Co-authored-by: Megan McNicol <[email protected]>
Co-authored-by: SwiftySalmon <[email protected]>

* transform the python script for sorting BI extracts to R (#833)

* transform the python script for sorting BI extracts to R

* Style code

* Delete 00-Sort_BI_Extracts.py

---------

Co-authored-by: lizihao-anu <[email protected]>

* Use `get_slf_episode_path` in right place

* fix pipe

* Fix typo in string

* Update documentation

* Rename to `convert_sc_sending_location_to_lca`

* Update documentation

* Style code

* Update documentation

---------

Signed-off-by: check-spelling-bot <[email protected]>
Co-authored-by: James McMahon <[email protected]>
Co-authored-by: Megan McNicol <[email protected]>
Co-authored-by: marjom02 <[email protected]>
Co-authored-by: Megan McNicol <[email protected]>
Co-authored-by: Moohan <[email protected]>
Co-authored-by: Jennit07 <[email protected]>
Co-authored-by: Zihao Li <[email protected]>
Co-authored-by: lizihao-anu <[email protected]>
* rename to `add_smrtype`

* Rename script to `add_smrtype`

* update documentation

* Remove TODO comment

* Style code

* Update documentation

---------

Co-authored-by: Jennit07 <[email protected]>
Co-authored-by: Megan McNicol <[email protected]>

SwiftySalmon and others added 4 commits October 20, 2023 13:13
A previous pull request changed all capitals to lowercase. However, boxi file names contain capitals, so the files were no longer being read in. This fixes that.

Co-authored-by: marjom02 <[email protected]>
Co-authored-by: Jennit07 <[email protected]>
Bumps [stefanzweifel/git-auto-commit-action](https://github.com/stefanzweifel/git-auto-commit-action) from 4 to 5.
- [Release notes](https://github.com/stefanzweifel/git-auto-commit-action/releases)
- [Changelog](https://github.com/stefanzweifel/git-auto-commit-action/blob/master/CHANGELOG.md)
- [Commits](stefanzweifel/git-auto-commit-action@v4...v5)

---
updated-dependencies:
- dependency-name: stefanzweifel/git-auto-commit-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

lizihao-anu requested a review from Jennit07 on October 24, 2023 15:27
Jennit07 changed the base branch from development to dec-update-23 on October 25, 2023 14:49
@lizihao-anu
Contributor Author

I did a test run, which showed that the code works fine.

lizihao-anu requested a review from Jennit07 on October 31, 2023 11:19

Collaborator

@SwiftySalmon SwiftySalmon left a comment

The logic all seems good to me, although I can't find the original SPSS script to compare it with!

@Jennit07 Jennit07 mentioned this pull request Nov 20, 2023
11 tasks

@check-spelling-bot Report

🔴 Please review

See the 📂 files view, the 📜action log, or 📝 job summary for details.

Unrecognized words (12)
beddays
discondition
dnas
github
hri
popluation
reftype
roxygenise
SPSS
starwars
ubuntu
yml
To accept these unrecognized words as correct, you could run the following commands

... in a clone of the [email protected]:Public-Health-Scotland/source-linkage-files.git repository
on the keep_population branch (ℹ️ how do I use this?):

curl -s -S -L 'https://raw.githubusercontent.com/check-spelling/check-spelling/main/apply.pl' |
perl - 'https://github.com/Public-Health-Scotland/source-linkage-files/actions/runs/6946980128/attempts/1'

OR

To have the bot accept them for you, reply quoting the following line:
@check-spelling-bot apply updates.

Available 📚 dictionaries could cover words (expected and unrecognized) not in the 📘 dictionary

This includes both expected items (232) from .github/actions/spelling/expect.txt and unrecognized words (12)

| Dictionary | Entries | Covers | Uniquely |
|---|---|---|---|
| cspell:k8s/dict/k8s.txt | 153 | 2 | 1 |
| cspell:swift/src/swift.txt | 53 | 2 | |
| cspell:aws/aws.txt | 218 | 1 | 1 |
| cspell:filetypes/filetypes.txt | 264 | 1 | 1 |
| cspell:software-terms/dict/softwareTerms.txt | 1288 | 1 | 1 |

Consider adding them (in .github/workflows/spelling.yml) for uses: check-spelling/check-spelling@main in its with:

      with:
        extra_dictionaries:
          cspell:k8s/dict/k8s.txt
          cspell:swift/src/swift.txt
          cspell:aws/aws.txt
          cspell:filetypes/filetypes.txt
          cspell:software-terms/dict/softwareTerms.txt

To stop checking additional dictionaries, add (in .github/workflows/spelling.yml) for uses: check-spelling/check-spelling@main in its with:

check_extra_dictionaries: ''
Errors (3)

See the 📂 files view, the 📜action log, or 📝 job summary for details.

| ❌ Errors | Count |
|---|---|
| ❌ check-file-path | 2 |
| ❌ ignored-expect-variant | 13 |
| ℹ️ no-newline-at-eof | 1 |

See ❌ Event descriptions for more information.

If the flagged items are 🤯 false positives

If items relate to a ...

  • binary file (or some other file you wouldn't want to check at all).

    Please add a file path to the excludes.txt file matching the containing file.

    File paths are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your files.

    ^ refers to the file's path from the root of the repository, so ^README\.md$ would exclude README.md (on whichever branch you're using).

  • well-formed pattern.

    If you can write a pattern that would match it,
    try adding it to the patterns.txt file.

    Patterns are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your lines.

    Note that patterns can't match multiline strings.
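
The exclude pattern mentioned above can be tried out locally before committing. A minimal sketch in R follows, assuming some made-up file paths. It uses base R's `grepl()` with `perl = TRUE`, which is Perl-compatible (PCRE) rather than Perl 5 itself, so treat it as an approximation for simple patterns like this one.

```r
# Hypothetical file paths, relative to the repository root
paths <- c("README.md", "docs/README.md", "README.md.bak")

# The example exclude pattern from the report: anchored at both ends,
# with the dot escaped so it only matches a literal "."
pattern <- "^README\\.md$"

# TRUE where the whole path is exactly "README.md"
is_excluded <- grepl(pattern, paths, perl = TRUE)
```

Only the first path should be matched here, since the anchors rule out both the nested copy and the `.bak` file.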

Collaborator

@Jennit07 Jennit07 left a comment

Tested and ready to merge. If the consultancy team identifies any bugs, we can fix them during the update.

Jennit07 merged commit 1059538 into dec-update-23 on Nov 21, 2023
11 of 12 checks passed
Jennit07 deleted the keep_population branch on November 21, 2023 16:48

3 participants