# aggregate-rows

## Description

<mark style="color:purple;">`aggregate-rows`</mark> *adaptor groups rows that share the same values in the chosen columns and produces an aggregated column for each group using a chosen mathematical function.*

## Inputs

**`data`**\
Type: `datatable`\
Required: Yes\
The datatable to be aggregated.

**`group column names`**\
Type: `list`\
Required: Yes\
The names of the columns containing values to be grouped.

**`aggregations`**\
Type: `dictionary`\
Required: Yes\
A dictionary mapping column names to aggregation methods. Each key is a column name, and each value must be one of `max`, `mean`, `median`, `min`, `mode`, `sum`, `unique-values` (list of distinct values), or `unique-number` (number of distinct values).
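
As a rough illustration of what each method computes, the sketch below maps the method names to plain Python equivalents and applies them to one list of values. The mapping `AGGREGATION_SKETCH` is hypothetical and is not Data-flo's implementation; in particular, how the adaptor orders `unique-values`, breaks ties for `mode`, or treats blank cells is not stated on this page, so those details are assumptions.

```python
from statistics import mean, median, mode

# Hypothetical mapping from the documented method names to Python equivalents.
# Ordering of unique values and tie-breaking for mode are assumptions.
AGGREGATION_SKETCH = {
    "max": max,
    "mean": mean,
    "median": median,
    "min": min,
    "mode": mode,
    "sum": sum,
    "unique-values": lambda values: sorted(set(values)),
    "unique-number": lambda values: len(set(values)),
}

# The values of one column within a single group.
values = [120, 90, 120, 110]

# Apply every method to the same values to compare what each one returns.
results = {name: fn(values) for name, fn in AGGREGATION_SKETCH.items()}

for name, value in results.items():
    print(f"{name}: {value}")
```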

## Outputs

**`data`**\
Type: `datatable`\
A datatable with aggregated data.

## Examples

### Example 1: Default behaviour.

#### Inputs:

`data`:

<table><thead><tr><th width="143">name</th><th width="116">eye colour</th><th width="213">systolic blood pressure</th></tr></thead><tbody><tr><td>Patient A</td><td>blue</td><td>120</td></tr><tr><td>Patient B</td><td>brown</td><td>125</td></tr><tr><td>Patient C</td><td>brown</td><td>110</td></tr><tr><td>Patient D</td><td>blue</td><td>90</td></tr></tbody></table>

`group column names`: eye colour

`aggregations`:

<table><thead><tr><th width="300">KEY</th><th width="100">VALUE</th></tr></thead><tbody><tr><td>systolic blood pressure</td><td>mean</td></tr></tbody></table>

#### Outputs:

`data`:

<table><thead><tr><th width="133">eye colour</th><th width="212">systolic blood pressure</th></tr></thead><tbody><tr><td>blue</td><td>105</td></tr><tr><td>brown</td><td>117.5</td></tr></tbody></table>

-> In this example, we use the <mark style="color:purple;">`aggregate-rows`</mark> adaptor to calculate the average (mean) systolic blood pressure for blue-eyed patients and for brown-eyed patients.
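
The grouping in Example 1 can be reproduced outside Data-flo with a short Python sketch. The function `aggregate_rows_sketch` and its parameter names are hypothetical and only mirror the adaptor's inputs; they are not part of Data-flo. The sketch assumes groups appear in the order they are first seen in the input.

```python
from collections import defaultdict
from statistics import mean

# The input datatable from Example 1, one dictionary per row.
rows = [
    {"name": "Patient A", "eye colour": "blue", "systolic blood pressure": 120},
    {"name": "Patient B", "eye colour": "brown", "systolic blood pressure": 125},
    {"name": "Patient C", "eye colour": "brown", "systolic blood pressure": 110},
    {"name": "Patient D", "eye colour": "blue", "systolic blood pressure": 90},
]


def aggregate_rows_sketch(rows, group_column_names, aggregations):
    """Group rows by the given columns and aggregate the requested columns."""
    groups = defaultdict(list)
    for row in rows:
        # Rows with identical values in every group column share one key.
        key = tuple(row[column] for column in group_column_names)
        groups[key].append(row)

    output = []
    for key, members in groups.items():
        out_row = dict(zip(group_column_names, key))
        for column, function in aggregations.items():
            out_row[column] = function([member[column] for member in members])
        output.append(out_row)
    return output


result = aggregate_rows_sketch(
    rows,
    group_column_names=["eye colour"],
    aggregations={"systolic blood pressure": mean},
)

for row in result:
    print(row)
```

Columns that are neither grouped nor aggregated, such as `name`, are dropped from the result, which matches the output table in the example.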

## Use Cases

* Determine the mean values of defined attributes of a population (for example, mean height in men versus women in Country Y).
* Determine the maximum (max) number of reported cases of pathogen Y in each region in the north of Country W.
* Pull out the earliest (min) or latest (max) dates on which different pathogens were identified.
* Show the frequency distribution of values in a dataset to help visualise data in a histogram (count).

