join-datatables

Description

join-datatables adaptor combines two datatables based on the values of a common column between both datatables.

Inputs

main data Type: datatable Required: Yes The main (left) datatable. When inner join is False, all rows from this datatable will be included.

main column Type: text Required: Yes The column in main data containing values shared by other column in other data.

other data Type: datatable Required: Yes The other (right) datatable, to be joined to main datatable.

other column Type: text Required: No The column in other data containing values shared by main column in main data. If more than one row match main column value, only the first matching row will be joined. If unspecified, the name of main column will be used.

join type Type: text Required: No Specifies the joining method. If unspecified, defaults to Left Join.

case sensitive Type: boolean Required: No When set to False, lowercase and uppercase letters are treated as equivalent when matching values, e.g. Peru = peru. If unspecified, defaults to False.

match diacritics Type: boolean Required: No When set to False, letters with and without diacritics are treated as equivalent when matching values, e.g. Perú = Peru. If unspecified, defaults to False.

columns Type: list Required: No Specifies which columns of other data to include in the output datatable. If unassigned, all columns in other data will be included.

prefix Type: text Required: No An optional text to be prepended to other data column names in the output datatable. If unspecified, column names remain unchanged.

Outputs

data Type: datatable A datatable containing joined rows and columns.

unmatched Type: datatable A datatable containing main data rows which do not have a match in other data.

Examples

Example 1: Default behaviour.

Inputs:

main data:

speciescountry

Bovine

de

Gibbon

fr

Human

gb

Mouse

GB

main column: country

other data:

codename

de

Germany

fr

France

gb

United Kingdom

us

United states

other column: code

join type: null (empty)

case sensitive: null (empty)

match diatrics: null (empty)

columns: null (empty)

prefix: null (empty)

Outputs:

data:

speciescountrycodename

Bovine

de

de

Germany

Gibbon

fr

fr

France

Human

gb

gb

United Kingdom

Mouse

GB

GB

United Kingdom

-> Left Joined the two datatables based on the common columns country and code.

Example 2: Join datatables with specific join type

Inputs:

main data:

speciescountry

Bovine

de

Gibbon

fr

Human

gb

Mouse

GB

Orangutan

Gorilla

main column: country

other data:

codename

de

Germany

fr

France

gb

United Kingdom

us

United states

other column: code

join type: Inner Join

case sensitive: null (empty)

match diatrics: null (empty)

columns: null (empty)

prefix: null (empty)

Outputs:

data:

speciescountrycodename

Bovine

de

de

Germany

Gibbon

fr

fr

France

Human

gb

gb

United Kingdom

Mouse

GB

GB

United Kingdom

-> Inner Joined the two datatables based on the common columns country and code.Since the species Gorilla and Orangutan have no corresponding country value they are not included in the output.

Example 3: Join datatable with specific prefix and case sensitive set to true.

Inputs:

main data:

speciescountry

Bovine

de

Gibbon

fr

Human

gb

Mouse

GB

main column: country

other data:

codename

de

Germany

fr

France

gb

United Kingdom

us

United states

other column: code

join type: Left Join

case sensitive: True

match diatrics: null (empty)

columns: null (empty)

prefix: country

Outputs:

data:

speciescountrycountry.codecountry.name

Bovine

de

de

Germany

Gibbon

fr

fr

France

Human

gb

gb

United Kingdom

Mouse

GB

-> Left Joined the two datatables based on the common columns country and code, with prefix country applied to the other datatable columns.

Use Cases

  • Join outputs of a bioinformatics pipeline with the sample metadata to create a comprehensive file for use in Microreact.

  • Join laboratory and epidemiological data by a common ID.

Last updated