aggregate-rows

Description

The aggregate-rows adaptor groups rows that share the same values in one or more chosen columns, then summarises each group with a chosen mathematical function to produce one output row per group.

Inputs

data Type: datatable Required: Yes The datatable to be aggregated.

group column names Type: list Required: Yes The names of the columns containing values to be grouped.

aggregations Type: dictionary Required: Yes A dictionary mapping column names to aggregation methods. Each key is the name of a column to aggregate, and each value must be one of: max, mean, median, min, mode, sum, unique-values (list of distinct values), or unique-number (number of distinct values).
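The method names above correspond to familiar summary functions. The following is a minimal Python sketch of what each one computes for a single column of values. It is not the adaptor's own implementation: the AGGREGATION_METHODS name, the tie-breaking behaviour of mode, and the ordering of unique-values are assumptions made for illustration.

```python
import statistics

# Assumed mapping from the documented method names to Python callables.
AGGREGATION_METHODS = {
    "max": max,
    "mean": statistics.mean,
    "median": statistics.median,
    "min": min,
    # Assumption: on ties, the first mode encountered wins (Python 3.8+ behaviour).
    "mode": statistics.mode,
    "sum": sum,
    # Assumption: distinct values are kept in first-seen order.
    "unique-values": lambda values: list(dict.fromkeys(values)),
    "unique-number": lambda values: len(set(values)),
}

# Illustrative values for one group of rows.
values = [120, 125, 110, 90, 120]

# Apply every method to the same list of values.
results = {name: func(values) for name, func in AGGREGATION_METHODS.items()}

for name, value in results.items():
    print(f"{name}: {value}")
```

How the adaptor itself orders distinct values or resolves ties in mode is not stated in this page, so treat those details as open.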

Outputs

data Type: datatable A datatable with aggregated data.

Examples

Example 1: Default behaviour.

Inputs:

data:

name         eye colour    systolic blood pressure
Patient A    blue          120
Patient B    brown         125
Patient C    brown         110
Patient D    blue          90

group column names: eye colour

aggregations:

KEY                        VALUE
systolic blood pressure    mean

Outputs:

data:

eye colour    systolic blood pressure
blue          105
brown         117.5

In this example, the aggregate-rows adaptor calculates the mean systolic blood pressure separately for blue-eyed and brown-eyed patients. The name column, which is neither a group column nor an aggregated column, does not appear in the output.
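Example 1 can be reproduced outside the adaptor with a short Python sketch. The aggregate_rows function below is a hypothetical stand-in written for illustration, not the adaptor's actual code. It assumes the datatable is a list of dictionaries and that the aggregation is passed as a Python function rather than a method name.

```python
from collections import defaultdict
from statistics import mean

# The input datatable from Example 1, one dictionary per row.
rows = [
    {"name": "Patient A", "eye colour": "blue", "systolic blood pressure": 120},
    {"name": "Patient B", "eye colour": "brown", "systolic blood pressure": 125},
    {"name": "Patient C", "eye colour": "brown", "systolic blood pressure": 110},
    {"name": "Patient D", "eye colour": "blue", "systolic blood pressure": 90},
]


def aggregate_rows(data, group_column_names, aggregations):
    """Hypothetical helper: group rows, then apply one function per aggregated column."""
    groups = defaultdict(list)
    for row in data:
        # Rows with identical values in all group columns share a key.
        key = tuple(row[column] for column in group_column_names)
        groups[key].append(row)

    result = []
    for key, members in groups.items():
        # Start each output row with the group column values.
        out = dict(zip(group_column_names, key))
        for column, func in aggregations.items():
            out[column] = func([member[column] for member in members])
        result.append(out)
    return result


aggregated = aggregate_rows(
    rows,
    group_column_names=["eye colour"],
    aggregations={"systolic blood pressure": mean},
)

for row in aggregated:
    print(row)
```

The sketch yields 105 for the blue group and 117.5 for the brown group, matching the output table above.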

Use Cases

  • Determine the mean values of defined attributes of a population (for example, mean height in men versus women in Country Y).

  • Determine the maximum number of reported cases of pathogen Y in each region in the north of Country W.

  • Pull out the earliest (min) or latest (max) dates on which different pathogens were identified.

  • Show the frequency distribution of values in a dataset to help visualise data in a histogram (count).
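As an illustration of the earliest and latest date use case, the sketch below applies min and max to dates grouped by pathogen. The records are invented purely for illustration, and the code is plain Python rather than the adaptor itself.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: (pathogen, date identified). Invented for illustration only.
records = [
    ("pathogen X", date(2021, 3, 14)),
    ("pathogen Y", date(2020, 11, 2)),
    ("pathogen X", date(2020, 7, 30)),
    ("pathogen Y", date(2022, 1, 9)),
]

# Group the identification dates by pathogen.
dates_by_pathogen = defaultdict(list)
for pathogen, identified_on in records:
    dates_by_pathogen[pathogen].append(identified_on)

# min gives the earliest date in each group, max the latest.
first_and_last = {
    pathogen: (min(dates), max(dates))
    for pathogen, dates in dates_by_pathogen.items()
}

for pathogen, (earliest, latest) in first_and_last.items():
    print(pathogen, earliest.isoformat(), latest.isoformat())
```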
