New field in 2020 data call #23

Open
hgerritsen opened this issue Feb 26, 2020 · 6 comments
@hgerritsen

Hello,
The 2020 data call asks for a new field: Anonymized vessel id.
From the 2019 WGSFD report I understand that this means that rows with 2 vessels or fewer should be disaggregated by vessel (if there are 2) and a vessel id should be provided.
I have put together some code that I think could do the trick and could be added to the workflow script (around line 512). I am not 100% sure that it is OK, and there may be a better way, but here is my attempt:

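## as.character() guards against VE_REF being stored as a factor (c() on factors returns integer codes in R < 4.1)
VE_lut <- data.frame(VE_REF=unique(c(as.character(table1$VE_REF),as.character(table2$VE_REF))))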
VE_lut$VE_ID <- paste0('IRL',sprintf("%03d", 1:nrow(VE_lut))) # use relevant country code!
## check that there are no more than 999 unique vessels ("%03d" only pads to 3 digits)!
stopifnot(nrow(VE_lut) <= 999)

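## join on VE_REF explicitly rather than relying on the guessed common column
table1 <- left_join(table1,VE_lut, by = "VE_REF")
table2 <- left_join(table2,VE_lut, by = "VE_REF")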


table1 <- table1 %>%
    group_by(RT, VE_COU,
             Year, Month,
             Csquare, LENGTHCAT,
             LE_GEAR, LE_MET) %>% 
    mutate(n_vessels = n_distinct(VE_REF, na.rm = TRUE))
table1$VE_ID[table1$n_vessels>2] <- NA #only provide vessel id for cells with 1 or 2 vessels


## Aggregation of VMS data:
table1Save <- table1 %>%
    group_by(RT, VE_COU,
             Year, Month,
             Csquare, LENGTHCAT,
             LE_GEAR, LE_MET, VE_ID) %>%
    summarise(sum_intv = sum(INTV, na.rm = TRUE),
              sum_kwHour = sum(kwHour, na.rm = TRUE),
              sum_le_kg_tot = sum(LE_KG_TOT, na.rm = TRUE),
              sum_le_euro_tot = sum(LE_EURO_TOT, na.rm = TRUE),
              mean_si_sp = mean(SI_SP, na.rm = TRUE),
              mean_ve_len = mean(VE_LEN, na.rm = TRUE),
              mean_ve_kf = mean(VE_KW, na.rm = TRUE),
              n_vessels = n_distinct(VE_REF, na.rm = TRUE)) %>%
    as.data.frame()

table2 <- table2 %>%
    group_by(RT, VE_COU,
             Year, Month,
             LE_RECT, LE_GEAR,
             LE_MET, LENGTHCAT,
             tripInTacsat) %>%
    mutate(n_vessels = n_distinct(VE_REF, na.rm = TRUE))
table2$VE_ID[table2$n_vessels>2] <- NA #only provide vessel id for cells with 1 or 2 vessels

## Aggregation of LogBook data:
table2Save <- table2 %>%
    group_by(RT, VE_COU,
             Year, Month,
             LE_RECT, LE_GEAR,
             LE_MET, VE_ID, LENGTHCAT,
             tripInTacsat) %>%
    summarise(sum_intv = sum(INTV, na.rm = TRUE),
              sum_kwDays = sum(kwDays, na.rm = TRUE),
              sum_le_kg_tot = sum(LE_KG_TOT, na.rm = TRUE),
              sum_le_euro_tot = sum( LE_EURO_TOT, na.rm = TRUE),
              n_vessels = n_distinct(VE_REF, na.rm = TRUE)) %>%
    as.data.frame()
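
For a quick sanity check of the masking rule, here is a minimal sketch on made-up data. The `toy` data frame, its values and the `XXX` country code are assumptions for illustration only, and the grouping is reduced to `Csquare` to keep it short; it is not the workflow itself.

```r
library(dplyr)

# Toy data (made up): cell "A" is fished by 2 vessels, cell "B" by 3
toy <- data.frame(
  Csquare = c("A", "A", "A", "B", "B", "B"),
  VE_REF  = c("v1", "v1", "v2", "v3", "v4", "v5"),
  INTV    = c(1, 2, 3, 4, 5, 6),
  stringsAsFactors = FALSE
)

# Anonymised lookup, same idea as VE_lut ("XXX" is a placeholder country code)
lut <- data.frame(VE_REF = unique(toy$VE_REF), stringsAsFactors = FALSE)
lut$VE_ID <- paste0("XXX", sprintf("%03d", seq_len(nrow(lut))))

# Count vessels per cell, then blank the id where more than 2 vessels are present
toy <- left_join(toy, lut, by = "VE_REF") %>%
  group_by(Csquare) %>%
  mutate(n_vessels = n_distinct(VE_REF)) %>%
  ungroup() %>%
  mutate(VE_ID = ifelse(n_vessels > 2, NA_character_, VE_ID))

# Aggregate with VE_ID as an extra grouping column
toy_save <- toy %>%
  group_by(Csquare, VE_ID) %>%
  summarise(sum_intv = sum(INTV),
            n_vessels = n_distinct(VE_REF)) %>%
  as.data.frame()

print(toy_save)
```

In this sketch cell "A" is split into one row per vessel id, while cell "B" stays as a single row with `VE_ID` set to `NA`. Because `VE_ID` is part of the final grouping, the recomputed `n_vessels` is 1 on each disaggregated row rather than the cell-level count of 2; whether that is what the data call expects may be worth checking.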
@colinpmillar
Contributor

Thanks Hans - will review and add in tonight - can I add you as author of the commit?

@colinpmillar colinpmillar self-assigned this Feb 26, 2020
@hgerritsen
Author

hgerritsen commented Feb 26, 2020 via email

@colinpmillar
Contributor

Actually you are in the WGSFD sharepoint group - so I can give you write access. That might be easier, as you have actual data to work with - I just have mocked up data.

If you are up for that - would you mind looking at adding it to the https://github.com/ices-eg/wg_WGSFD/tree/test-workflow branch?

@hgerritsen
Author

hgerritsen commented Feb 26, 2020 via email

@hgerritsen
Author

hgerritsen commented Mar 4, 2020 via email

@colinpmillar
Contributor

thanks so much!!
