
Update definitions of terms in file format #95

Open
peterdesmet opened this issue Jun 21, 2022 · 15 comments

@peterdesmet
Collaborator

To be provided by @peterdesmet

@peterdesmet peterdesmet self-assigned this Jun 21, 2022
@cmspinto cmspinto added 1.Documentation Improvements or additions to documentation 2.Format labels Jun 22, 2022
@peterdesmet
Collaborator Author

peterdesmet commented Jul 26, 2022

I have reviewed and updated all definitions on https://esas-docs.ices.dk/tables/, repeated below.

@cmspinto can you update these at http://datsu.ices.dk/web/selRep.aspx?Dataset=148, retaining links and code formatting?

  • The column "Additional information" can be removed, as it is never used.

File information

  • Record type. Use FI for Information records.
  • Organization owning the legal rights over the resource. Combine multiple values with ~.
  • Country associated with the DataRightsHolder. Combine multiple values with ~ in the same order as DataRightsHolder.
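
The "same order" rule for the two `~`-separated fields can be sketched as follows. This is only an illustration: the function name is hypothetical, and the example values are placeholders rather than real EDMO or country codes.

```python
def pair_rights_holders(data_rights_holder: str, country: str) -> list[tuple[str, str]]:
    """Split two ~-separated fields and pair their values by position."""
    holders = data_rights_holder.split("~")
    countries = country.split("~")
    # The definitions require Country to follow the order of DataRightsHolder,
    # so both fields must contain the same number of values.
    if len(holders) != len(countries):
        raise ValueError(
            f"{len(holders)} DataRightsHolder value(s) but {len(countries)} Country value(s)"
        )
    return list(zip(holders, countries))


# Placeholder values, for illustration only
pairs = pair_rights_holders("1111~2222", "BE~NL")
print(pairs)  # [('1111', 'BE'), ('2222', 'NL')]
```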

Campaign

  • Record type. Use EC for Campaign records.
  • Unique identifier for the campaign. Choose a stable identifier, as it is used in combination with DataRightsHolder to detect resubmissions of data.
  • Access to the data (Public or Restricted). See the ICES Data Policy for conditions.
  • Start date of the campaign, formatted as YYYY-MM-DD.
  • End date of the campaign, formatted as YYYY-MM-DD.
  • Additional details about the campaign, such as research purpose, area covered or route.
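
A minimal sketch of checking the `YYYY-MM-DD` format for the two campaign dates. The function name is hypothetical, and the check that the end date does not precede the start date is an assumption added here, not something stated in the definitions.

```python
import re
from datetime import date

DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def check_campaign_dates(start_date: str, end_date: str) -> tuple[date, date]:
    """Parse both dates, insisting on the exact YYYY-MM-DD layout."""
    parsed = []
    for label, value in (("start", start_date), ("end", end_date)):
        if not DATE_PATTERN.fullmatch(value):
            raise ValueError(f"{label} date {value!r} is not formatted as YYYY-MM-DD")
        # fromisoformat also rejects impossible dates such as 2022-02-30
        parsed.append(date.fromisoformat(value))
    start, end = parsed
    # Assumption (not in the definitions): a campaign cannot end before it starts
    if end < start:
        raise ValueError("end date is before start date")
    return start, end


print(check_campaign_dates("2022-06-21", "2022-07-26"))
```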

Sample

  • Record type. Use ES for Sample records.
  • Unique identifier of the campaign this sample belongs to.
  • Unique identifier for the sample.
  • Sampling date, formatted as YYYY-MM-DD.
  • Platform from which sampling took place.
  • Type of the platform.
  • Side of the platform from which sampling took place.
  • Height (in m) of the platform during sampling, e.g. average flying height of airplane.
  • Width (in m) of the sampling transect.
  • Sampling method.
  • Whether the sample should be considered primary (True) or ancillary (False) in the case of observers recording concurrent but independent data streams.
  • Species (groups) that were counted during sampling.
  • List of distance bin boundaries applied during sampling and to be used for distance analyses, formatted as x|x|...|x. For a typical ship-based sample using a 300 m wide transect this would be 0|50|100|200|300.
  • Extent to which binoculars were used during sampling.
  • Number of observers that contributed to the sample.
  • Additional details about the sample, such as the name of the ship in case it has no SHIPC code.

Position

  • Record type. Use EP for Position records.
  • Unique identifier of the sample this position belongs to.
  • Unique identifier for the position.
  • UTC time at the start of the trajectory/observation bin of this position, formatted as hh:mm:ss.
  • Latitude of the position in decimal degrees, using the WGS84 datum. Often the calculated midpoint of a trajectory.
  • Longitude of the position in decimal degrees, using the WGS84 datum. Often the calculated midpoint of a trajectory.
  • Distance (in km) travelled during the observation bin. Can only remain empty if Area is used.
  • Area (in km²) of sea surveyed during the observation bin. Can only remain empty if Distance is used.
  • Wind force and sea state according to the Beaufort scale.
  • Visibility (in km). Use a range code (A-D) for historical data.
  • Glare that could be affecting observation quality.
  • Angle of the sun in relation to the observer (0-359), with 0 being straight ahead.
  • Cloud cover (in oktas).
  • Precipitation.
  • Ice cover percentage within the transect.
  • General impression of the observation conditions.
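
The rule that Distance and Area cannot both be empty could be checked along these lines. This is a sketch with a hypothetical function name; it assumes empty fields arrive as empty strings, and the non-negative check is an added assumption.

```python
def check_effort(distance: str, area: str) -> list[str]:
    """Return a list of problems found in the Distance/Area pair of a position."""
    problems = []
    # Each field may only remain empty if the other one is used
    if not distance.strip() and not area.strip():
        problems.append("Distance and Area are both empty; at least one is required")
    for name, value in (("Distance", distance), ("Area", area)):
        if value.strip():
            try:
                # Assumption (not in the definitions): values cannot be negative
                if float(value) < 0:
                    problems.append(f"{name} must not be negative")
            except ValueError:
                problems.append(f"{name} is not a number: {value!r}")
    return problems


print(check_effort("1.2", ""))  # []
print(check_effort("", ""))     # one problem reported
```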

Observation

  • Record type. Use EO for Observation records.
  • Unique identifier of the position this observation belongs to.
  • Unique identifier for the observation.
  • Unique identifier for an aggregation of individuals. Observations in a Sample with the same GroupID are thus indicated as being observed as part of the same group.
  • Whether the observation was in (True) or out (False) of the transect.
  • Type of SpeciesCode.
  • Code of the observed species. See the species lookup table.
  • Number of animals counted or estimated (not corrected for distance).
  • Distance (in m) at which the animal(s) was observed. Values should be the midpoint of the distance bins defined in DistanceBins or use > to indicate distances outside the transect (e.g. 25, 75, 150, 250 and >300 would be the only valid values for DistanceBins = 0|50|100|200|300). For ship-based sampling of birds and marine mammals in contact with water using a standard 300 m transect, one can use the ObservationDistance vocabulary.
  • Life stage of the animal(s), based on plumage.
  • Primary moult of the animal(s). Use only for fulmar, auks, divers and seaduck.
  • Plumage type of the animal(s).
  • Sex of the animal(s).
  • Direction in which the animal(s) is travelling. Use degrees (10° increments) to indicate directions relative to the direction of the platform and (inter)cardinal directions (e.g. NE for heading Northeast) for absolute directions.
  • Observed prey (type) caught or carried by the animal(s).
  • Associations between the animal(s) and vessels/structures/floating matter. Combine multiple values with ~. Originally described in Camphuysen & Garthe (2004) Appendix 1.
  • Observed behaviour of the animal(s). Combine multiple values with ~. Originally described in Camphuysen & Garthe (2004) Appendix 2.
  • Additional details about the observation.
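
The relation between DistanceBins and the valid observation distances can be worked out as below. This is a sketch only: the function name is hypothetical, and it simply takes the midpoint of each bin plus a `>` value for the outer boundary, as in the example given in the definition.

```python
def valid_observation_distances(distance_bins: str) -> list[str]:
    """Derive the valid distance values from a DistanceBins string such as 0|50|100|200|300."""
    bounds = [float(x) for x in distance_bins.split("|")]
    # Midpoint of every pair of neighbouring boundaries
    midpoints = [(lower + upper) / 2 for lower, upper in zip(bounds, bounds[1:])]
    # ":g" drops the trailing ".0" so 25.0 is written as 25
    values = [f"{m:g}" for m in midpoints]
    # Anything beyond the last boundary is outside the transect
    values.append(f">{bounds[-1]:g}")
    return values


print(valid_observation_distances("0|50|100|200|300"))
# ['25', '75', '150', '250', '>300']
```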

@cmspinto
Collaborator

@peterdesmet: I guess you are talking about format descriptions?

Maybe this is something @Osanna123 could look into.

@peterdesmet
Collaborator Author

Indeed, I am talking about updating these descriptions (for all 5 tables):

[screenshot]

@peterdesmet peterdesmet assigned cmspinto and Osanna123 and unassigned Osanna123 Jul 27, 2022
@cmspinto
Collaborator

Hi @peterdesmet, these definitions are in our data screening utility (DATSU) and @Osanna123 can look into this.

She might be a bit busy at the moment: I talked with her yesterday and she had a few meetings this week (vacation time is ending and meetings are starting again), but I'm sure she will look into this as soon as possible.

@peterdesmet
Collaborator Author

👍 it's not very urgent.

@nicolasvanermen
Collaborator

@peterdesmet
Could the type column be in a different text format? This would improve readability, for example in the description of DistanceBins.
[image]

@nicolasvanermen
Collaborator

Having said this... Great job!

@peterdesmet
Collaborator Author

@nicolasvanermen what if I switch around the columns for vocabulary and data type? The vocab column would always be a link so it stands out easily.

@nicolasvanermen
Collaborator

Perfect!

peterdesmet added a commit that referenced this issue Aug 1, 2022
@peterdesmet
Collaborator Author

@nicolasvanermen updated at https://esas-docs.ices.dk/tables/

@peterdesmet peterdesmet added this to the Version 1.0 milestone Sep 5, 2022
@peterdesmet
Collaborator Author

@cmspinto this is a documentation issue, all action points are listed at #95 (comment)

@neil-ices-dk
Member

@Osanna123 to look at the description changes (field labels). The generic DATSU design restricts us in implementing all the suggested changes, RecordType specifically.

@Osanna123

Osanna123 commented Sep 20, 2022

Comments:
- "The column 'Additional information' can be removed, as it is never used." This column is part of the generic DATSU design and is important for some other datasets.
- All field descriptions in DATSU are specified per 'version', not per record. In this case, the 'version' is biodiversity/birds data. Practically, this means that all datasets (and records) within the same version share the same description for the same field name. So the description for record type or notes is generic and valid for all records. The same goes for the key fields.
- I think that the DataRightsHolder description including an EDMO code is important, especially at this stage, as many new users won't understand that we're requesting a code, not a name or an acronym. The same goes for other fields with controlled vocabs.
- DataRightsHolder and Country: there were different considerations during the project whether to allow multiples for these 2 fields. The latest is not to, right? The descriptions should then not include the tilde option.
- In the present DATSU design, hyperlinks cannot be embedded, only added as text.

@neil-ices-dk
Member

DataRightsHolder
suggest:
Organisation owning the legal rights over the resource expressed by EDMO code. Multiple values can be reported with use of tilde (~) as a separator.

> DataRightsHolder and Country - there were different considerations during the project whether to allow multiples for these 2 fields. The latest is not to, right? The descriptions should then not include the tilde option

@nicolasvanermen to confirm

@nicolasvanermen
Collaborator

Multiple values are indeed allowed, therefore I would suggest:
Organisation(s) owning the legal rights over the resource expressed by EDMO code. Multiple values can be reported with use of tilde (~) as a separator.
