# remove-duplicate-rows

## Description

<mark style="color:purple;">`remove-duplicate-rows`</mark> *adaptor removes duplicate rows from a datatable*.

When duplicates are found, the first instance of each row is kept and the remaining duplicate rows are placed in a separate datatable.

## Inputs

**`data`**\
Type: `datatable`\
Required: Yes\
The datatable from which duplicate rows will be removed.

**`column names`**\
Type: `list`\
Required: No\
A list of columns to compare for duplicate values. If unspecified, entire rows will be compared.

**`case sensitive`**\
Type: `boolean`\
Required: No\
When set to `True`, lowercase and uppercase letters are treated as different. When set to `False`, lowercase and uppercase letters are treated as equivalent. If unspecified, defaults to `False`.
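
How the two optional inputs interact can be sketched in Python. This is an illustration only, not Data-flo's implementation: the helper `comparison_key` and the dict-per-row representation are hypothetical, and case-insensitive matching is assumed to behave like comparing case-folded strings.

```python
def comparison_key(row, column_names=None, case_sensitive=False):
    """Build the tuple of values that decides whether two rows are duplicates.

    Hypothetical helper: rows are plain dicts mapping column name to value.
    """
    # With no column names given, every column of the row takes part.
    columns = column_names if column_names else list(row.keys())
    values = [row[column] for column in columns]
    if not case_sensitive:
        # Assumption: case-insensitive matching is approximated with casefold().
        values = [v.casefold() if isinstance(v, str) else v for v in values]
    return tuple(values)


row_3 = {"id": 3, "code": "US", "name": "United States"}
row_6 = {"id": 6, "code": "us", "name": "United States"}
columns = ["code", "name"]

# Whole rows differ because the ids differ.
print(comparison_key(row_3) == comparison_key(row_6))  # False
# Restricted to code and name, the rows match when case is ignored.
print(comparison_key(row_3, columns) == comparison_key(row_6, columns))  # True
# With case sensitivity on, "US" and "us" no longer match.
print(comparison_key(row_3, columns, True) == comparison_key(row_6, columns, True))  # False
```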

## Outputs

**`data`**\
Type: `datatable`\
A datatable containing only unique rows, including the first instance of each duplicated row.

**`duplicates`**\
Type: `datatable`\
A datatable containing the duplicate rows that were removed from `data`.
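
The split between the two outputs can be sketched as follows. This is a minimal Python approximation under stated assumptions, not the adaptor's actual code: the function name `remove_duplicate_rows`, its parameters, and the list-of-dicts representation are hypothetical stand-ins for the adaptor and its inputs.

```python
def remove_duplicate_rows(rows, column_names=None, case_sensitive=False):
    """Split rows into (data, duplicates), keeping the first instance of each row."""
    seen = set()
    data, duplicates = [], []
    for row in rows:
        # Compare the listed columns, or the entire row when none are given.
        columns = column_names or list(row)
        key = tuple(
            value if case_sensitive or not isinstance(value, str) else value.casefold()
            for value in (row[column] for column in columns)
        )
        if key in seen:
            duplicates.append(row)  # a later repeat of an earlier row
        else:
            seen.add(key)
            data.append(row)  # first instance stays in `data`
    return data, duplicates


rows = [
    {"id": 1, "code": "GB", "name": "United Kingdom"},
    {"id": 2, "code": "TR", "name": "Turkey"},
    {"id": 3, "code": "US", "name": "United States"},
    {"id": 4, "code": "IND", "name": "India"},
    {"id": 5, "code": "IND", "name": "India"},
    {"id": 6, "code": "us", "name": "United States"},
]

data, duplicates = remove_duplicate_rows(rows, column_names=["code", "name"])
print([row["code"] for row in data])        # ['GB', 'TR', 'US', 'IND']
print([row["code"] for row in duplicates])  # ['IND', 'us']
```

Every input row ends up in exactly one of the two outputs, so the row counts of `data` and `duplicates` add up to the row count of the input.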

## Examples

### Example 1: Default behaviour.

#### Inputs:

`data:`

| id | code | name           |
| -- | ---- | -------------- |
| 1  | GB   | United Kingdom |
| 2  | TR   | Turkey         |
| 3  | US   | United States  |
| 4  | IND  | India          |
| 5  | IND  | India          |
| 6  | us   | United States  |

`column names:` *null (empty)*

`case sensitive:` *null (empty)*

#### Outputs:

`data`:

| id | code | name           |
| -- | ---- | -------------- |
| 1  | GB   | United Kingdom |
| 2  | TR   | Turkey         |
| 3  | US   | United States  |
| 4  | IND  | India          |
| 5  | IND  | India          |
| 6  | us   | United States  |

`duplicates:` Empty Table

-> By default, entire rows are compared. Since every value in the `id` column is unique, no two rows are identical and no rows were removed.

### Example 2: Specify column names to compare.

#### Inputs:

`data:`

| id | code | name           |
| -- | ---- | -------------- |
| 1  | GB   | United Kingdom |
| 2  | TR   | Turkey         |
| 3  | US   | United States  |
| 4  | IND  | India          |
| 5  | IND  | India          |
| 6  | us   | United States  |

`column names:`

1. code
2. name

`case sensitive:` *null (empty)*

#### Outputs:

`data`:

| id | code | name           |
| -- | ---- | -------------- |
| 1  | GB   | United Kingdom |
| 2  | TR   | Turkey         |
| 3  | US   | United States  |
| 4  | IND  | India          |

`duplicates:`

| id | code | name          |
| -- | ---- | ------------- |
| 5  | IND  | India         |
| 6  | us   | United States |

-> Removed the duplicate rows *5* and *6*. Row *6* matches row *3* because the comparison is case-insensitive by default.

### Example 3: Compare specified columns with case sensitivity.

#### Inputs:

`data:`

| id | code | name           |
| -- | ---- | -------------- |
| 1  | GB   | United Kingdom |
| 2  | TR   | Turkey         |
| 3  | US   | United States  |
| 4  | IND  | India          |
| 5  | IND  | India          |
| 6  | us   | United States  |

`column names:`

1. code
2. name

`case sensitive:` True

#### Outputs:

`data`:

| id | code | name           |
| -- | ---- | -------------- |
| 1  | GB   | United Kingdom |
| 2  | TR   | Turkey         |
| 3  | US   | United States  |
| 4  | IND  | India          |
| 6  | us   | United States  |

`duplicates:`

| id | code | name  |
| -- | ---- | ----- |
| 5  | IND  | India |

-> Removed only the duplicate row *5*. Row *6* is kept because its `code` value is in lowercase and does not match row *3* when the comparison is case sensitive.

## Use case

* Removing duplicated rows after using the [<mark style="color:purple;">`append-datatables`</mark>](/data-flo/reference-guide/append-datatables.md) adaptor on two partially-overlapping datasets.
* Determining which rows are duplicated, and how many times they are duplicated.
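
For the second use case, the `duplicates` output can be tallied to see how often each row was repeated. The following is a minimal Python sketch, assuming the table has been exported as a list of dicts and that `code` and `name` were the compared columns. The sample rows are made up for illustration. Because `data` keeps the first instance of each row, the total number of occurrences is taken to be the count in `duplicates` plus one.

```python
from collections import Counter

# Hypothetical contents of a `duplicates` output (illustrative values only).
duplicates = [
    {"code": "IND", "name": "India"},
    {"code": "us", "name": "United States"},
    {"code": "ind", "name": "India"},
]

# Count repeats per compared key, ignoring case as in the default comparison.
extra_copies = Counter(
    (row["code"].casefold(), row["name"].casefold()) for row in duplicates
)

for key, extras in extra_copies.items():
    # The first instance stayed in `data`, so add one for the total.
    print(key, "appears", extras + 1, "times in the input")
```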

