
remove-duplicate-rows

Description

The remove-duplicate-rows adaptor removes duplicate rows from a datatable.

The first instance of each duplicated row is kept in the output; every later instance is placed in a separate datatable.

Inputs

data
Type: datatable
Required: Yes
The datatable to check for duplicate rows.

column names
Type: list
Required: No
A list of columns to compare for duplicate values. If unspecified, entire rows are compared.

case sensitive
Type: boolean
Required: No
When set to True, lowercase and uppercase letters are treated as different. When set to False, lowercase and uppercase letters are treated as equivalent. If unspecified, defaults to False.

Outputs

data
Type: datatable
A datatable containing only unique rows, including the first instance of each duplicated row.

duplicates
Type: datatable
A datatable containing the duplicate rows removed from data. It is empty if no duplicates are found.
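The behaviour described above can be sketched in plain Python. This is an illustration only, not Data-flo's actual implementation: the function name `remove_duplicate_rows`, the representation of a datatable as a list of dictionaries, and the use of `str.casefold()` for case-insensitive matching are all assumptions made for the sketch. The sample rows are the ones used in the examples on this page.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, str]


def remove_duplicate_rows(
    data: List[Row],
    column_names: Optional[List[str]] = None,
    case_sensitive: bool = False,
) -> Tuple[List[Row], List[Row]]:
    """Split rows into (unique rows, duplicate rows), keeping first instances."""
    seen = set()
    unique: List[Row] = []
    duplicates: List[Row] = []
    for row in data:
        # Compare only the listed columns, or the entire row if none are given.
        columns = column_names if column_names else list(row.keys())
        values = [str(row[column]) for column in columns]
        if not case_sensitive:
            # Assumption: case-insensitive matching is modelled with casefold().
            values = [value.casefold() for value in values]
        key = tuple(values)
        if key in seen:
            duplicates.append(row)  # a later instance of an already-seen row
        else:
            seen.add(key)
            unique.append(row)  # the first instance is kept
    return unique, duplicates


rows = [
    {"id": "1", "code": "GB", "name": "United Kingdom"},
    {"id": "2", "code": "TR", "name": "Turkey"},
    {"id": "3", "code": "US", "name": "United States"},
    {"id": "4", "code": "IND", "name": "India"},
    {"id": "5", "code": "IND", "name": "India"},
    {"id": "6", "code": "us", "name": "United States"},
]

unique, duplicates = remove_duplicate_rows(rows, column_names=["code", "name"])
print([row["id"] for row in unique])      # ['1', '2', '3', '4']
print([row["id"] for row in duplicates])  # ['5', '6']
```

Tracking already-seen keys in a set keeps the first instance of each row and preserves the original row order in both outputs.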

Examples

Example 1: Default behaviour.

Inputs:

data:

id  code  name
1   GB    United Kingdom
2   TR    Turkey
3   US    United States
4   IND   India
5   IND   India
6   us    United States

column names: null (empty)

case sensitive: null (empty)

Outputs:

data:

id  code  name
1   GB    United Kingdom
2   TR    Turkey
3   US    United States
4   IND   India
5   IND   India
6   us    United States

duplicates: Empty Table

-> By default, entire rows are compared. Because every value in the id column is unique, no two rows are identical, so no rows were removed.

Example 2: Specify column names to compare.

Inputs:

data:

id  code  name
1   GB    United Kingdom
2   TR    Turkey
3   US    United States
4   IND   India
5   IND   India
6   us    United States

column names:

  1. code

  2. name

case sensitive: null (empty)

Outputs:

data:

id  code  name
1   GB    United Kingdom
2   TR    Turkey
3   US    United States
4   IND   India

duplicates:

id  code  name
5   IND   India
6   us    United States

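-> Rows 5 and 6 were removed as duplicates. Only the code and name columns are compared, so row 5 matches row 4, and because the comparison is case-insensitive by default, row 6 ("us") matches row 3 ("US").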

Example 3: Compare specified columns with case sensitivity.

Inputs:

data:

id  code  name
1   GB    United Kingdom
2   TR    Turkey
3   US    United States
4   IND   India
5   IND   India
6   us    United States

column names:

  1. code

  2. name

case sensitive: True

Outputs:

data:

id  code  name
1   GB    United Kingdom
2   TR    Turkey
3   US    United States
4   IND   India
6   us    United States

duplicates:

id  code  name
5   IND   India

-> Only row 5 was removed as a duplicate. The code value in row 6 is lowercase ("us"), so with case-sensitive comparison it does not match row 3 ("US").
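The difference between Example 2 and Example 3 comes down to how the values being compared are normalised. The sketch below illustrates this for rows 3 and 6. The helper `comparison_key` is hypothetical and is not part of Data-flo; using `str.casefold()` to model case-insensitive matching is likewise an assumption.

```python
def comparison_key(row, column_names, case_sensitive):
    """Build the tuple of values used to decide whether two rows match."""
    values = [row[name] for name in column_names]
    if not case_sensitive:
        # Fold case so that "US" and "us" produce the same key.
        values = [value.casefold() for value in values]
    return tuple(values)


row_3 = {"id": "3", "code": "US", "name": "United States"}
row_6 = {"id": "6", "code": "us", "name": "United States"}
columns = ["code", "name"]

# case sensitive = False (the default): the rows match, so row 6 is a duplicate.
same_when_insensitive = (
    comparison_key(row_3, columns, False) == comparison_key(row_6, columns, False)
)
# case sensitive = True: "US" and "us" differ, so row 6 is kept.
same_when_sensitive = (
    comparison_key(row_3, columns, True) == comparison_key(row_6, columns, True)
)

print(same_when_insensitive)  # True
print(same_when_sensitive)    # False
```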

Use case

  • Determining which rows are duplicated, and how many times they are duplicated.
  • Removing duplicated rows after using the append-datatables adaptor on two partially overlapping datasets.
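As a rough illustration of the first use case, the sketch below counts how many times each combination of compared values occurs, outside of Data-flo. It assumes a datatable represented as a list of dictionaries and case-insensitive comparison on the code and name columns; the variable names are hypothetical.

```python
from collections import Counter

rows = [
    {"id": "1", "code": "GB", "name": "United Kingdom"},
    {"id": "2", "code": "TR", "name": "Turkey"},
    {"id": "3", "code": "US", "name": "United States"},
    {"id": "4", "code": "IND", "name": "India"},
    {"id": "5", "code": "IND", "name": "India"},
    {"id": "6", "code": "us", "name": "United States"},
]
columns = ["code", "name"]

# Count occurrences of each case-folded (code, name) combination.
counts = Counter(
    tuple(row[column].casefold() for column in columns) for row in rows
)

# Keep only the combinations that appear more than once.
repeated = {key: count for key, count in counts.items() if count > 1}
print(repeated)
# {('us', 'united states'): 2, ('ind', 'india'): 2}
```

Each count includes the first instance, so a count of 2 corresponds to one row in the duplicates output.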