# About Data-flo

**Data-flo** (pronounced data-flow) is an open-source web application for data integration. It provides an easy-to-use visual interface for designing reusable Workflows (data pipelines) that import, merge, clean, and manipulate data in many different ways. Once a Workflow has been created, it can be run at any time by anyone with access, enabling push-button data extraction and transformation.

<figure><img src="https://lh7-us.googleusercontent.com/dUWpHeLPF9StYSWBZWz3xdKJqc-jCC5U87xJpbawhrdlYKtkusoj71GGnzUW_GVN0hZ9n2QKOn4PDkqt3u-lmz9kSf08hGkQ6ADs4c3PJxtuJmWLJshG9xSnhE4fafSRziZV_ugtRS2gj0tI5XFG3H0t0Q=s2048" alt=""><figcaption><p>A diagram of a Data-flo workflow indicating the different data sources and output destinations. </p></figcaption></figure>

## Why use Data-flo?

Data-flo saves you time by **removing the bulk of the manual, repetitive tasks** that require multiple, sequential, or tedious steps, **enabling you to focus on analysis and interpretation**.

Armed with Data-flo, users can:

* Rapidly prepare data for visualization and reporting
* Easily share processed data between teams
* Consistently reproduce and validate data transformation procedures for updated or new datasets
* Seamlessly integrate data from multiple databases and sources
* Automatically update a [Microreact](https://microreact.org/) project with fresh data

While Data-flo is used across many sectors (data science, academia, public health institutions, etc.), it includes a number of features tailored to bioinformatics-related datasets, such as Newick and other tree files.

## Citation

If you use Data-flo in a publication, please cite:

> Centre for Genomic Pathogen Surveillance. 2019. Data-flo. <https://data-flo.io>. \[Date accessed].
