Commit d154be9e authored by Dominik Brilhaus's avatar Dominik Brilhaus

add cwl docs

parent 6fd57ad2
1 merge request!16Cwl
Pipeline #9408 passed
# CWL
**Data analysis** in this ARC is packaged and made reusable via [Common Workflow Language (CWL)](https://www.commonwl.org).
For details, visit the [DataPLANT knowledgebase](https://nfdi4plants.github.io/nfdi4plants.knowledgebase/cwl/).
Briefly, every data analysis step (`runs`) is described by a `run.cwl` document. The `run.cwl` points to (i.e. executes) one or more `workflows` (stored as `workflow.cwl`). The input parameters required by the `workflow.cwl` are documented in the accompanying `run.yml`. A `workflow.cwl` can be a single command-line tool or a more complex workflow pipeline that references and combines other `*.cwl` documents.
```bash
...
├── runs
│   ├── fastqc
│   │   ├── run.cwl
│   │   └── run.yml
│   ...
├── studies
│   ├── ...
└── workflows
    ├── fastqc
    │   ├── collectFilesInDir.cwl
    │   ├── fastqc.cwl
    │   └── workflow.cwl
    ...
```
```mermaid
%%{ init: {"flowchart": { "wrappingWidth": "10000" }}}%%
flowchart TD
workflowcwl --o runcwl
subgraph r["runs/fastqc/"]
runcwl@{ shape: doc, label: "run.cwl" }
runyml@{ shape: doc, label: "run.yml" }
end
i[input: DB_097_CAMMD_CAGATC_L001_R1_001.fastq.gz] --o runyml
r --> o[output: DB_097_CAMMD_CAGATC_L001_R1_001_fastqc.html]
subgraph "workflows/fastqc"
workflowcwl@{ shape: doc, label: "workflow.cwl" }
end
```
## Setup and dependencies
Again, for details check the docs linked above.
Executing CWL documents requires a CWL runner, e.g. [cwltool](https://github.com/common-workflow-language/cwltool).
Software and package dependencies are ideally covered by Docker or Conda and described in the `hints` / `requirements` sections of the CWL documents (e.g. `DockerRequirement` and / or `SoftwareRequirement`).
Individual workflows may have additional dependencies (e.g. a local installation of R or F# or packages therein) if they are not yet packaged in a fully reusable way.
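Before executing a run, it can help to check which of the tools mentioned in this README are installed. The following is a minimal sketch: the helper name `check_tools` is hypothetical, and the tool list (`cwltool`, `docker`, and `dot` from Graphviz for the plotting command further below) is an assumption that should be adapted to the workflows in use.

```bash
#!/usr/bin/env bash
# Report which of the given tools are available on PATH.
# check_tools is a hypothetical helper, not part of cwltool.

check_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  # Non-zero exit status if at least one tool was not found
  return "$missing"
}

# Assumed tool list; adjust to the workflows you intend to run
check_tools cwltool docker dot || echo "Install the missing tools before executing runs."
```

The check only confirms that an executable with that name exists; it does not verify versions or that the Docker daemon is running.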
## Default cwltool commands
Here's a list of frequently used `cwltool` commands to validate or execute runs and workflows.
### Validate document
```bash
cwltool --validate run.cwl
```
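To validate all runs at once, the command above can be wrapped in a loop. This is a sketch under assumptions: the function name `validate_runs` and the `CWL_RUNNER` override are hypothetical, and the default directory `runs` follows the folder layout shown earlier.

```bash
#!/usr/bin/env bash
# Validate every run.cwl found below a directory (default: ./runs).
# validate_runs and CWL_RUNNER are hypothetical names used for illustration.

validate_runs() {
  local root="${1:-runs}"
  local runner="${CWL_RUNNER:-cwltool}"
  local failed=0 doc
  # NUL-separated output keeps paths with spaces intact
  while IFS= read -r -d '' doc; do
    if "$runner" --validate "$doc" >/dev/null 2>&1; then
      echo "ok:      $doc"
    else
      echo "invalid: $doc"
      failed=1
    fi
  done < <(find "$root" -type f -name 'run.cwl' -print0)
  # Non-zero exit status if any document failed validation
  return "$failed"
}
```

Called as `validate_runs runs` from the ARC root, it prints one line per `run.cwl`. The validation messages themselves are discarded here, so re-run `cwltool --validate` on a failing document to see the details.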
### Execute workflow in `./runs/*`
```bash
cwltool run.cwl run.yml
```
### Capture log and run in background
```bash
cwltool run.cwl run.yml > $(date +"%Y-%m-%d_%H-%M")-run.log 2>&1 &
```
### Capture log, run in parallel and in background
```bash
cwltool --parallel run.cwl run.yml > $(date +"%Y-%m-%d_%H-%M")-run.log 2>&1 &
```
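When a run is started in the background, it is easy to lose track of its log file and exit status. The sketch below keeps both; `start_logged`, `RUN_LOG` and `RUN_PID` are hypothetical names, and the log name pattern is the one used in the commands above.

```bash
#!/usr/bin/env bash
# Start a command in the background with a timestamped log file and
# remember its log path and process ID for later inspection.
# start_logged, RUN_LOG and RUN_PID are hypothetical names.

start_logged() {
  RUN_LOG="$(date +"%Y-%m-%d_%H-%M")-run.log"
  # stdout and stderr both go to the log; the command runs in the background
  "$@" > "$RUN_LOG" 2>&1 &
  RUN_PID=$!
}

# Example usage (commented out so that sourcing this file starts nothing):
# start_logged cwltool --parallel run.cwl run.yml
# tail -f "$RUN_LOG"                       # follow progress, Ctrl-C to stop following
# wait "$RUN_PID"; echo "exit status: $?"  # block until done and report the result
```

`wait` only works from the same shell session that started the job. Two runs started within the same minute and directory would share a log file name, so start them from their own `runs/*` folders.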
### Print workflow to file
```bash
cwltool --print-dot run.cwl | dot -Tsvg > run.svg
```
# CWL
## Organisation
### Runs
Every run is described with a `run.cwl` workflow that points to (i.e. steps through) one or multiple workflows or tools.
## Default cwltool commands
### Print workflow to file
```bash
cwltool --print-dot run.cwl | dot -Tsvg > run.svg
```
```bash
bash plot-cwls.sh "../../" "run.cwl" "runs-wfls.txt" "runs"
```
\ No newline at end of file
# Runs
See [.cwl/README.md](../.cwl/README.md) for more info.
# Workflows
See [.cwl/README.md](../.cwl/README.md) for more info.