RosaMGini edited this page Apr 14, 2021 · 9 revisions

MergeFilterAndCollapse

Context

MergeFilterAndCollapse is used to execute a specific family of steps in the data processing of a multi-database study. These steps belong to the ‘study variable’ process, labelled T2 in the conceptualisation by Gini et al. (eGEMS 2016). The step computes study variables for the units of observation of the study, based on multiple observations that occurred during routine healthcare. Deliverable 7.5 of ConcePTION analyses this step further and breaks T2 down into T2.1 (extraction from the raw data), T2.2 (creation of components by processing at record level and within records of the same unit of observation) and T2.3 (creation of composites by processing components). See also the representation at this link. MergeFilterAndCollapse provides an interface for many of the most common steps of T2.2.

Purpose

A dataset containing one observation per unit of observation is merged with one or more longitudinal datasets. The result is then filtered on some conditions (e.g. on the timeframe of the longitudinal observations) and, optionally, collapsed to obtain one record per unit of observation again. Before collapsing, additional record-level variables can optionally be added, and the resulting dataset can be saved as an intermediate result.
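The merge, filter and collapse sequence described above can be sketched in a few lines. This is a conceptual illustration in plain Python, not the R interface of MergeFilterAndCollapse: the function `merge_filter_collapse`, the column names (`person_id`, `study_entry`, `study_exit`, `event_date`, `code`) and the choice of summaries (`n_events`, `first_event`) are all assumptions made for the example.

```python
from datetime import date

# Hypothetical unit-level dataset: one record per key (here, person_id).
dataset_s = [
    {"person_id": 1, "study_entry": date(2020, 1, 1), "study_exit": date(2020, 12, 31)},
    {"person_id": 2, "study_entry": date(2020, 1, 1), "study_exit": date(2020, 6, 30)},
]

# Hypothetical longitudinal dataset: several records per key.
dataset_l = [
    {"person_id": 1, "event_date": date(2019, 11, 5), "code": "A"},
    {"person_id": 1, "event_date": date(2020, 3, 10), "code": "B"},
    {"person_id": 1, "event_date": date(2020, 7, 21), "code": "A"},
    {"person_id": 2, "event_date": date(2020, 9, 1), "code": "C"},
]


def merge_filter_collapse(units, records, key):
    """Illustrative only: merge on `key`, keep in-window records, summarise per unit."""
    # Merge: pair every unit-level record with its longitudinal records.
    by_key = {}
    for rec in records:
        by_key.setdefault(rec[key], []).append(rec)
    merged = [{**unit, **rec} for unit in units for rec in by_key.get(unit[key], [])]

    # Filter: keep records whose date falls inside the unit's timeframe.
    filtered = [
        row for row in merged
        if row["study_entry"] <= row["event_date"] <= row["study_exit"]
    ]

    # Collapse: back to one record per unit, with summary variables.
    collapsed = {}
    for row in filtered:
        out = collapsed.setdefault(
            row[key],
            {key: row[key], "n_events": 0, "first_event": row["event_date"]},
        )
        out["n_events"] += 1
        out["first_event"] = min(out["first_event"], row["event_date"])
    return filtered, list(collapsed.values())


# `filtered_rows` plays the role of the optional intermediate result.
filtered_rows, result = merge_filter_collapse(dataset_s, dataset_l, "person_id")
```

In this sketch, units with no record surviving the filter (person 2 here) simply do not appear in the collapsed output. That is a simplification of the example, not a statement about how the function itself treats such units.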

Structure of input data

  • listdatasetL: a list of one or more data.table() datasets, containing multiple records per -key-.
  • datasetS (optional): a data.table() dataset, containing one or more records per -key-.
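To make the shape of the two inputs concrete, the sketch below builds toy stand-ins for them and counts records per key. It is plain Python rather than data.table, and the helper `records_per_key`, the key name `person_id` and the other columns are hypothetical.

```python
from collections import Counter


def records_per_key(rows, key):
    """Count how many records each key value has in a table (a list of dicts)."""
    return Counter(row[key] for row in rows)


# Stand-in for listdatasetL: a list of longitudinal tables,
# each of which may hold several records per key.
list_dataset_l = [
    [
        {"person_id": 1, "code": "A"},
        {"person_id": 1, "code": "B"},
        {"person_id": 2, "code": "A"},
    ],
    [{"person_id": 2, "dose": 10}],
]

# Stand-in for the optional datasetS: here, one record per key.
dataset_s = [{"person_id": 1}, {"person_id": 2}]

counts_l = [records_per_key(table, "person_id") for table in list_dataset_l]
counts_s = records_per_key(dataset_s, "person_id")
```

A check of this kind can help confirm, before merging, which inputs are longitudinal and which hold one record per unit.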