From 789d48deb0caa9f63c369fc97ba4eb9cba05a110 Mon Sep 17 00:00:00 2001
From: alisandra <alisandra.denton@hhu.de>
Date: Wed, 31 Aug 2022 15:29:02 +0200
Subject: [PATCH] adding draft take home instructions

---
 .gitignore                            |   3 +
 README.md                             |   4 +
 TakeHome.md                           |  92 +++++++++++++++++
 workflows/singularity/Singularity.def | 143 ++++++++++++++++++++++++++
 workflows/singularity/notes.md        |  38 +++++++
 5 files changed, 280 insertions(+)
 create mode 100644 TakeHome.md
 create mode 100644 workflows/singularity/Singularity.def
 create mode 100644 workflows/singularity/notes.md
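
Note on the `docker run` step in TakeHome.md: the instructions ask readers to swap in their own data directory and, optionally, their own container name. A minimal sketch of that substitution follows, assuming a hypothetical helper named `rnaseq_run_cmd` that is not part of this patch. It only prints the command so it can be checked before running, and the defaults (`rnalive`, `rnaseqme:latest`) are simply the names used in TakeHome.md.

```shell
#!/bin/sh
# Hypothetical helper (not part of this patch): print, without executing,
# the `docker run` command from TakeHome.md for a given data directory.
# Arguments: 1 = absolute path to your data directory (required)
#            2 = container name (default: rnalive)
#            3 = image tag (default: rnaseqme:latest)
rnaseq_run_cmd() {
    data_dir="$1"
    name="${2:-rnalive}"
    image="${3:-rnaseqme:latest}"

    # Insist on an absolute source path, mirroring the "$(pwd)"/... form
    # used in TakeHome.md.
    case "$data_dir" in
        /*) ;;
        *) echo "error: data directory must be an absolute path" >&2
           return 1 ;;
    esac

    printf 'docker run -it --name %s --mount type=bind,source="%s",target=/home/zim-gast/rnaseq-workshop %s\n' \
        "$name" "$data_dir" "$image"
}

# Example: a separately named container for a second project.
rnaseq_run_cmd /data/projectA projecta
```

The printed line can be pasted into a terminal once it looks right. Keeping the mount target fixed at `/home/zim-gast/rnaseq-workshop` follows the instructions, which only change the source side of the mount.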

diff --git a/.gitignore b/.gitignore
index 916af75..0a3830b 100644
--- a/.gitignore
+++ b/.gitignore
@@ -389,3 +389,6 @@ runs/isoseq/polished/
 *.aux
 RNAseqWorkshop.out
 RNAseqWorkshop.toc
+
+# singularity
+*.sif
diff --git a/README.md b/README.md
index 35a2866..900fa64 100644
--- a/README.md
+++ b/README.md
@@ -45,3 +45,7 @@ If you want to build the docker images from scratch (rather than downloading fro
 - [workflows/maindocker](workflows/maindocker)
 - [workflows/rstudiodocker](workflows/rstudiodocker)
 - [workflows/userdocker](workflows/userdocker)
+
+### Take home
+
+Find three options for using the images from the course on your own computer [here](TakeHome.md).
diff --git a/TakeHome.md b/TakeHome.md
new file mode 100644
index 0000000..d4ba188
--- /dev/null
+++ b/TakeHome.md
@@ -0,0 +1,92 @@
+# Take Home
+
+You can access the images used in the course via 
+Sciebo: https://uni-duesseldorf.sciebo.de/s/53pA9W9TbOKTgGQ
+
+The password was provided via e-mail.
+
+All image files referred to below can be found on Sciebo.
+
+We provide three options in order of recommendation, 
+but take what works best for you.
+
+### Where's RStudio?
+The RStudio Docker image is not mentioned below, as
+R & RStudio are probably easier for you to install directly.
+See:
+
+https://www.r-project.org/
+
+https://www.rstudio.com/products/rstudio/download/
+
+### Disclaimer
+> These instructions have not been tested on a large
+> variety of machines. The good news is that most issues will
+> probably turn out to be generic Docker and/or Singularity issues that
+> are very google-able. So the first thing to do is always to search for
+> the error message; this will be the fastest help.
+> If it is not _enough_ help, you are welcome to contact us
+> either via e-mail or, better yet, by adding an issue to this
+> repository :-)
+
+## Docker - with the user matching inside and outside the container
+i.e. a better solution for permissions, which didn't work during
+the course for reasons that are hard to summarize here and probably
+not relevant on your machine. 
+
+
+1. Install Docker as appropriate for your machine: https://www.docker.com/
+2. Download the pre-built course image `rnaseq_docker.tar.gz` from Sciebo
+3. Load into Docker: `docker image load -i rnaseq_docker.tar.gz`
+4. Build a slight modification to the image where the users match
+  - navigate to the directory `workflows/userdocker`, e.g. by using `cd` on Linux
+  - run `docker build --build-arg USERID=$(id -u) -t rnaseqme --rm .`; this should take only a few seconds and ~3GB of hard drive space.
+5. Run for the first time!
+  - for instance from the parent directory of where you have `rnaseq-workshop` downloaded
+    via `docker run -it --name rnalive --mount type=bind,source="$(pwd)"/rnaseq-workshop,target=/home/zim-gast/rnaseq-workshop rnaseqme:latest`.
+  - for your own data / files 
+    - you will want to make sure they are all to be found (directly or better yet nested) within one directory
+    - change the `"$(pwd)"/rnaseq-workshop` part of the command above to point to the directory with _your_ files
+  - if something goes wrong and you have to e.g. try mounting again, you can remove the container with 
+    `docker container rm rnalive` so that the `docker run` command can be repeated. 
+  - if you want to have _multiple_ containers, e.g. to point to separate ARCs for
+    separate projects, you can also replace `rnalive` in the commands above with a descriptive name for _your_ project.
+6. Resume with `docker start -i rnalive`
+
+> Note: if you did not configure a docker group during install, you may need
+> to prefix all commands above with `sudo`.
+
+## Singularity
+A container option targeted more at convenience and less at security.
+(It was not available on the host machines during the course.)
+
+1. Install Singularity as appropriate for your machine (including via VM for Windows or Mac): https://docs.sylabs.io/guides/3.0/user-guide/installation.html
+2. Download the pre-built course image `rnaseq-workshop.sif` from Sciebo
+3. Run!
+  - `singularity run rnaseq-workshop.sif`; all files within your home directory should be automatically accessible
+  - do you need some other directory of files? You can add it with `--bind`: `singularity run --bind <your_directory>:/mnt/ rnaseq-workshop.sif` will
+    make the files available under `/mnt/` in the container. For instance, if I were working on the HHU HPC I might want to run
+    `singularity run --bind /gpfs/project/alden101/projectA:/mnt/ rnaseq-workshop.sif` to mount my folder 'projectA' from my large storage folder on the HPC.
+4. Resuming is the exact same as running. 
+
+
+## Docker - with permissive permissions
+This is what we used during the course, and no, it is still not good
+practice. But we're including it for completeness, after all, it's familiar.
+And realistically, if you're on your own machine, where no one else (or
+only trusted colleagues) has an account, and considering that we're hardly using
+Docker to run a public-facing web server, the risks could be worse.
+
+> Still, do it this way at your own risk. It doesn't take a malicious actor
+> to accidentally delete important files. Have backups (actually always have backups,
+> anyways, and completely regardless of Docker).  
+
+1. Install Docker as appropriate for your machine: https://www.docker.com/
+2. Download the pre-built course image `rnaseq_docker.tar.gz` from Sciebo
+3. Load into Docker: `docker image load -i rnaseq_docker.tar.gz`
+4. Set permissions on your data directory (here `rnaseq-workshop`) to be 
+   and stay permissive: `setfacl TODO  # can just use chmod 777 if you want to test right now`
+5. Run `docker run -it --name rnalive --mount type=bind,source="$(pwd)"/rnaseq-workshop,target=/home/zim-gast/rnaseq-workshop rnaseq:latest`
+6. All info on adjusting directories, retrying, resuming, and maybe needing `sudo` is the same as for the other Docker option above.
+
+
diff --git a/workflows/singularity/Singularity.def b/workflows/singularity/Singularity.def
new file mode 100644
index 0000000..50e9ab0
--- /dev/null
+++ b/workflows/singularity/Singularity.def
@@ -0,0 +1,143 @@
+Bootstrap: docker
+From: ubuntu:latest
+Stage: spython-base
+
+%files
+python_installs.sh ./
+./first.sh /opt/
+%post
+#FROM nvidia/cuda:11.2.0-cudnn8-runtime-ubuntu20.04
+
+# Override user name at build; if build-arg is not passed, will create a user named `default`
+export DOCKER_USER=zim-gast
+
+
+# Create a group and user
+adduser $DOCKER_USER --no-create-home
+mkdir /opt/$DOCKER_USER
+# mv because $DOCKER_USER did not exist yet at file copy
+mv /opt/first.sh /opt/$DOCKER_USER/
+
+#RUN useradd --create-home --shell /bin/bash zim-gast 
+apt-get update -y
+apt install python3-dev \
+python3-pip \
+git \
+libhdf5-dev \
+curl \
+wget \
+nano vim emacs -y
+apt-get autoremove -y
+
+export DEBIAN_FRONTEND=noninteractive
+export TZ=Europe/Berlin
+
+apt install tzdata libncurses5-dev zlib1g-dev libbz2-dev liblzma-dev cmake jellyfish python-tk libcurl4-openssl-dev libgit2-dev libssl-dev -y
+
+mkdir /opt/$DOCKER_USER/repos && \
+cd /opt/$DOCKER_USER/repos && \
+git clone https://github.com/alisandra/RNAseq_workshop_helpers.git && \
+mkdir /opt/$DOCKER_USER/bin && \
+find RNAseq_workshop_helpers . -maxdepth 2 -type f -executable|xargs -I% cp % /opt/$DOCKER_USER/bin/
+
+
+cd /opt/$DOCKER_USER/bin
+wget http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/faToTwoBit && chmod +x faToTwoBit && \
+wget http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/blat/blat && chmod +x blat
+
+
+# --- classic bioinf --- #
+cd /opt/$DOCKER_USER/
+apt install hisat2 \
+bowtie2 \
+augustus \
+gffread \
+fastqc \
+salmon \
+samtools \
+minimap2 \
+mash \
+cd-hit tar bzip2 \
+libhdf5-dev m4 -y
+# last ones are for kallisto
+
+# --- used to be conda, now binaries... --- #
+
+# for virtualenv intro
+pip install HTSeq virtualenv
+wget https://anaconda.org/bioconda/isoseq3/3.7.0/download/linux-64/isoseq3-3.7.0-h9ee0642_0.tar.bz2 && \
+tar xvf isoseq3-3.7.0-h9ee0642_0.tar.bz2 && \
+wget https://anaconda.org/bioconda/lima/2.6.0/download/linux-64/lima-2.6.0-h9ee0642_0.tar.bz2 && \
+tar xvf lima-2.6.0-h9ee0642_0.tar.bz2 && \
+wget https://anaconda.org/bioconda/pbccs/6.4.0/download/linux-64/pbccs-6.4.0-h9ee0642_0.tar.bz2 && \
+tar xvf pbccs-6.4.0-h9ee0642_0.tar.bz2 && \
+wget https://anaconda.org/bioconda/bax2bam/0.0.11/download/linux-64/bax2bam-0.0.11-0.tar.bz2 && \
+tar xvf bax2bam-0.0.11-0.tar.bz2
+
+# kallisto
+cd /opt/$DOCKER_USER/repos && \
+curl -O -L http://ftpmirror.gnu.org/autoconf/autoconf-2.69.tar.gz && \
+tar -xzf autoconf-2.69.tar.gz && cd /opt/$DOCKER_USER/repos/autoconf-2.69 && \
+./configure && make && make install && cd /opt/$DOCKER_USER/repos && \
+git clone https://github.com/pachterlab/kallisto.git && \
+mkdir kallisto/build  && \
+cd /opt/$DOCKER_USER/repos/kallisto/build && \
+cmake -DCMAKE_INSTALL_PREFIX=/opt/$DOCKER_USER/ -DUSE_HDF5=ON .. && make && make install
+# python
+./python_installs.sh && rm python_installs.sh
+
+
+# jars
+mkdir /opt/$DOCKER_USER/sw && \
+cd /opt/$DOCKER_USER/sw && \
+wget http://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/Trimmomatic-0.39.zip && \
+apt install unzip -y && \
+unzip Trimmomatic-0.39.zip && \
+rm Trimmomatic-0.39.zip
+
+# cleanup
+cd /opt/$DOCKER_USER/
+rm *.bz2 && rm -r info
+
+
+# shared folder
+# rnaseq-workshop folder
+wget https://github.com/git-lfs/git-lfs/releases/download/v3.2.0/git-lfs-linux-amd64-v3.2.0.tar.gz && \
+mv git-lfs-linux-amd64-v3.2.0.tar.gz sw/ && \
+cd /opt/$DOCKER_USER/sw/ && \
+tar xvf git-lfs-linux-amd64-v3.2.0.tar.gz && \
+cd /opt/$DOCKER_USER/sw/git-lfs-3.2.0/ && \
+./install.sh && \
+rm ../git-lfs-linux-amd64-v3.2.0.tar.gz
+cd /opt/$DOCKER_USER/
+
+#RUN git clone https://git.nfdi4plants.org/brilator/rnaseq-workshop.git
+
+mkdir /opt/$DOCKER_USER/rnaseq-workshop
+
+apt install gmap -y
+rm -rf /var/lib/apt/lists/*
+
+# EXPOSE 8889
+
+chown $DOCKER_USER:$DOCKER_USER /opt/$DOCKER_USER/first.sh
+
+cd /opt/$DOCKER_USER/repos/alisandra/cDNA_Cupcake && \
+pip install .
+
+su -  $DOCKER_USER # USER $DOCKER_USER
+
+git lfs install
+echo "alias gmap='/usr/bin/gmap'" >> .bashrc
+
+
+%environment
+DOCKER_USER=zim-gast
+export TZ=Europe/Berlin
+export PATH=/opt/$DOCKER_USER/.local/bin:${PATH}
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/x86_64-linux-gnu/hdf5/serial/lib
+export PATH="/opt/$DOCKER_USER/bin:${PATH}"
+%runscript
+exec /bin/bash "$@"
+%startscript
+exec /bin/bash "$@"
diff --git a/workflows/singularity/notes.md b/workflows/singularity/notes.md
new file mode 100644
index 0000000..402b30e
--- /dev/null
+++ b/workflows/singularity/notes.md
@@ -0,0 +1,38 @@
+# Singularity from Docker
+
+While there are probably better ways, a 1:1 generation of the
+Singularity image from Docker would not have entirely made sense.
+So instead a singularity build file was autogenerated from the Dockerfile,
+changes were applied, and a singularity image was subsequently built.
+
+## autogen
+The file Singularity.def was initially created automatically from the 'maindocker/Dockerfile' via
+spython, according to info found here: https://stackoverflow.com/questions/60314664/how-to-build-singularity-container-from-dockerfile
+in the answer from Serge. 
+
+Briefly, in a virtual environment:
+
+```bash
+pip install spython
+cd </path/to/maindocker>
+spython recipe Dockerfile &> ../singularity/Singularity.def
+```
+## tailor for singularity (and to actually build successfully)
+Substantial modifications were necessary, e.g.
+making variables TZ, DOCKER\_USER, and DEBIAN\_FRONTEND available during
+and after build as necessary (changing to be in the right `%` block, and
+adding `export`).
+
+Changing '/home' to '/opt', as singularity will mount the 'home' from the
+host and having container content there can lead to trouble. 
+
+Changing the entry command to simply be `/bin/bash`.
+
+Removing the recipe's `cd` changes so that the default working directory is $HOME.
+
+Instead of an exact listing, please simply find the file Singularity.def,
+included here with modifications. Run a diff to the automatically
+generated one, if useful. 
+
+## build
+`sudo singularity build rnaseq-workshop.sif Singularity.def`
-- 
GitLab