aggregate-rows
Description
The aggregate-rows adaptor groups rows that share the same value in the specified group columns and produces a new column by applying a chosen mathematical function to each group.
Inputs
data
Type: datatable
Required: Yes
The datatable to be aggregated.
group column names
Type: list
Required: Yes
The names of the columns containing values to be grouped.
aggregations
Type: dictionary
Required: Yes
A dictionary that maps column names to aggregation methods. The keys are the names of the columns to aggregate, and each value must be one of: max, mean, median, min, mode, sum, unique-values (list of distinct values), or unique-number (number of distinct values).
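As a rough illustration of what each aggregation method computes, the sketch below maps the documented method names onto Python standard-library functions and applies them to one list of values. The AGGREGATION_METHODS mapping and the sample values are hypothetical and are not taken from the adaptor's implementation; in particular, returning the distinct values in sorted order is an assumption.

```python
import statistics

# Hypothetical mapping from the documented method names to Python callables.
# This is an illustration only, not the adaptor's actual implementation.
AGGREGATION_METHODS = {
    "max": max,
    "mean": statistics.mean,
    "median": statistics.median,
    "min": min,
    "mode": statistics.mode,
    "sum": sum,
    # Assumption: distinct values are returned sorted; the real order may differ.
    "unique-values": lambda values: sorted(set(values)),
    "unique-number": lambda values: len(set(values)),
}

# Sample values for a single group (illustrative data).
values = [120, 125, 110, 90, 110]

# Apply every method to the same list of values.
results = {name: func(values) for name, func in AGGREGATION_METHODS.items()}

for name, result in results.items():
    print(f"{name}: {result}")
```

Each entry in the aggregations dictionary selects one of these methods for one column, for example {"systolic blood pressure": "mean"}.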
Outputs
data
Type: datatable
A datatable with aggregated data.
Examples
Example 1: Default behaviour.
Inputs:
data:
patient | eye colour | systolic blood pressure
Patient A | blue | 120
Patient B | brown | 125
Patient C | brown | 110
Patient D | blue | 90
group column names: eye colour
aggregations:
systolic blood pressure: mean
Outputs:
data:
eye colour | systolic blood pressure
blue | 105
brown | 117.5
-> In this example, we use the aggregate-rows adaptor to calculate the average (mean) systolic blood pressure for blue-eyed patients and for brown-eyed patients.
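The grouping in Example 1 can be reproduced with a minimal Python sketch. The aggregate_rows function, the list-of-dictionaries representation of a datatable, and the "patient" column name are assumptions made for illustration; they do not describe how the adaptor is implemented, and only a few of the documented methods are wired in.

```python
import statistics
from collections import defaultdict


def aggregate_rows(rows, group_columns, aggregations):
    """Group rows by group_columns and aggregate the requested columns.

    Hypothetical helper: rows is a list of dicts, one dict per table row.
    """
    # Subset of the documented methods, enough for this example.
    methods = {
        "max": max,
        "mean": statistics.mean,
        "median": statistics.median,
        "min": min,
        "sum": sum,
    }

    # Collect rows that share the same values in the group columns.
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[column] for column in group_columns)
        groups[key].append(row)

    # Build one output row per group, in order of first appearance.
    result = []
    for key, members in groups.items():
        out = dict(zip(group_columns, key))
        for column, method in aggregations.items():
            out[column] = methods[method]([m[column] for m in members])
        result.append(out)
    return result


data = [
    {"patient": "Patient A", "eye colour": "blue", "systolic blood pressure": 120},
    {"patient": "Patient B", "eye colour": "brown", "systolic blood pressure": 125},
    {"patient": "Patient C", "eye colour": "brown", "systolic blood pressure": 110},
    {"patient": "Patient D", "eye colour": "blue", "systolic blood pressure": 90},
]

output = aggregate_rows(
    data,
    group_columns=["eye colour"],
    aggregations={"systolic blood pressure": "mean"},
)
print(output)
```

The blue group averages 120 and 90 to give 105, and the brown group averages 125 and 110 to give 117.5, matching the outputs listed in the example.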
Use Cases
Determine the mean values of defined attributes of a population (for example, mean height in men versus women in Country Y).
Determine the maximum number of reported cases of pathogen Y in each region in the north of Country W.
Pull out the earliest (min) or latest (max) dates different pathogens were identified.
Show the frequency distribution of values in a dataset to help visualise data in a histogram (count).