% 00_section_arc.tex, authored by Dominik Brilhaus
\section{Course material}
This workshop is organized around an ARC (Annotated Research Context).
\subsection{Annotated Research Context}
The ARC stores all required input data, backup data, scripts, and presentations, as well as this reader, in a single research data package.
\subsubsection{The basic ARC structure summarized}
\begin{itemize}
\item \il{studies} \verb| -> | external data and sample metadata
\item \il{assays} \verb| -> | measurement data (e.g. ``raw'' sequence data) and metadata
\item \il{workflows} \verb| -> | computational analyses (i.e. ``scripts'')
\item \il{runs} \verb| -> | outputs of workflows (i.e. ``results'')
\end{itemize}
\subsubsection{Material added to the ARC for this course}
\begin{itemize}
\item \il{_reader} \verb| -> | The basis to this reader (written in \LaTeX)
\item \il{_slides} \verb| -> | Slides presented during the workshop
\item \il{_handouts} \verb| -> | Cheat sheets and additional materials
\item \il{runs/_backups} \verb| -> | Backups for runs, in case a workflow unexpectedly fails or takes too long.
\begin{itemize}
\item If a workflow should have produced \il{blat_results} but failed, copy \il{runs/_backup/blat_results} to \il{runs/blat_results}.
\end{itemize}
\end{itemize}
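The backup step described above could look as follows on the command line. This is only a minimal sketch: the helper name \il{restore_backup} is made up for illustration, and the backup folder is assumed to be \il{runs/_backup} as in the example above (pass a different folder as the second argument if yours is named differently).
\lstset{language=bash, style=bashstyle}
\begin{lstlisting}
# hypothetical helper, to be run from the root of the ARC
restore_backup() {
    name="$1"                        # e.g. blat_results
    backup_dir="${2:-runs/_backup}"  # assumption: location of the backups
    mkdir -p "runs/$name"
    # copy the contents of the backup into the expected output folder
    cp -r "$backup_dir/$name/." "runs/$name/"
}
\end{lstlisting}
For example, \il{restore_backup blat_results} would fill \il{runs/blat_results} from the backup.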
\subsubsection{Disclaimer}
For this workshop we do not take full advantage of all ARC features, partly because some of them are still under development.
Moreover, we could easily spend two more days exploring those features, which is out of scope for a three-day workshop focusing on RNA-Seq data analysis. In the future, the ARC will also circumvent some of the annoyances that we run into for training purposes during this course.
If you want to learn more, check out the \href{https://nfdi4plants.org}{DataPLANT website}.
\subsection{During the workshop}
For the in-person workshop, we have already stored the ARC in the ``Raumlaufwerke''
folder under our room number (25.41.00.41). Please copy the folder ``rnaseq-workshop''
into your own home folder, so that everyone has their own copy.
We have written all code and commands so that they can be executed from the root of the ARC.
As a check, please open a terminal (applications \verb| -> | terminal) and run:
\lstset{language=bash, style=bashstyle}
\begin{lstlisting}
cd $HOME/rnaseq-workshop
\end{lstlisting}
This is also a good first thing to try throughout the workshop if you get
any sort of `file not found' error: make sure you are in the right place.
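If you want a slightly more thorough check than \il{cd}, the following sketch tests whether the current directory looks like the root of the ARC. It assumes that your copy contains the four top-level folders described in the basic ARC structure; the function name \il{check_arc_root} is hypothetical.
\begin{lstlisting}
# hypothetical helper: are we in the root of the ARC?
check_arc_root() {
    for d in studies assays workflows runs; do
        if [ ! -d "$d" ]; then
            echo "missing folder: $d (are you in the ARC root?)"
            return 1
        fi
    done
    echo "OK: $(pwd) looks like the ARC root"
}
\end{lstlisting}
After defining the function, simply type \il{check_arc_root} whenever you are unsure where you are.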
%% TODO: Software dependencies.
\subsection{Docker}
We rely heavily on Docker for this workshop. It allows us to provide an `image'
with all the software you will need for the workshop installed and ready to use.
There are two major reasons we chose Docker: 1) convenience for development, as it
works the same on our normal work machines and on the ZIM teaching machines we
use for the workshop, and more importantly 2) portability: you can take the
Docker image file home, work once through the standard instructions for installing
Docker at \url{https://docs.docker.com/} (or ask your admin to), and then
readily reproduce the work from the course.
As a brief disclaimer, however, none of this is intended to demonstrate
good practice with Docker, particularly not for other purposes.
The Docker image ``rnaseq\_docker.tar'' is available on Sciebo at: %% TODO!
\url{https://uni-duesseldorf.sciebo.de/s/53pA9W9TbOKTgGQ}.
The password is written on the board, or you have received it by e-mail.
The image can also be found in the ``Raumlaufwerke'' folder under our room (25.41.00.41).
\subsubsection{load image}
To load the image, run:
\begin{lstlisting}
docker image load -i </path/to/>rnaseq_docker.tar
\end{lstlisting}
Here, \il{</path/to/>} is replaced by, e.g., ``Downloads/''
or the path to the ``Raumlaufwerke'' folder, as appropriate. For example:
\begin{lstlisting}
docker image load -i Downloads/rnaseq_docker.tar
\end{lstlisting}
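To check that loading worked, you can list the images known to Docker. This sketch assumes that the loaded image is named \il{rnaseq}, i.e. the name \il{rnaseq:latest} that the \il{docker run} command in this reader refers to.
\begin{lstlisting}
# list loaded images named rnaseq;
# if no image is listed below the header, loading did not work
docker image ls rnaseq
\end{lstlisting}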
\subsubsection{run image as container}
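To start a writable `container' from the image, in which you can work
interactively, run the following \emph{from your home directory} (\il{cd $HOME}).
The \il{--mount} option shares the \il{rnaseq-workshop} folder between your computer and the container: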
\begin{lstlisting}
docker run -it --name rnalive -p 8889:8889 --mount \
type=bind,source="$(pwd)"/rnaseq-workshop,target=/home/ZIM-gast/rnaseq-workshop \
rnaseq:latest
\end{lstlisting}
\subsubsection{restart container}
To get back into an existing container, e.g. after you have exited it, run:
\begin{lstlisting}
docker start -i rnalive
\end{lstlisting}
\subsubsection{exiting}
You can exit a container with Ctrl+D or by typing \il{exit}.
You won't need to do this much in the workshop, however.
\subsubsection{Then what?}
You should now have a terminal that looks a lot like the one before,
except that you are now inside the `container', where all the necessary
software is installed.
\begin{lstlisting}
# the workshop directory is shared
cd $HOME/rnaseq-workshop
# you should see the same files that you can see,
# e.g., by opening the folder in a file manager
ls
\end{lstlisting}
\subsubsection{What to do when}
At home:
\begin{itemize}
\item{load: once per machine or update to the image}
\item{run: once after each load, on first use}
\item{start: whenever you want to use it}
\end{itemize}
For the workshop, however, the computers are wiped clean each night.
So you will have to run:
\begin{itemize}
\item{load: each morning}
\item{run: after running load each morning}
\item{start: as needed, should you exit the container}
\end{itemize}
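The choice between run and start can also be left to the shell. The following is a minimal sketch, not part of the course material: the function name \il{enter_container} is hypothetical, the container and image names are taken from the commands in this reader, and \il{$HOME} is used in place of \il{$(pwd)} so that it does not matter which directory you call it from.
\begin{lstlisting}
# hypothetical helper: start the container if it exists, otherwise create it
enter_container() {
    # list the names of all containers (running or stopped), look for rnalive
    if docker ps -a --format '{{.Names}}' | grep -qx rnalive; then
        docker start -i rnalive
    else
        docker run -it --name rnalive -p 8889:8889 --mount \
            type=bind,source="$HOME"/rnaseq-workshop,target=/home/ZIM-gast/rnaseq-workshop \
            rnaseq:latest
    fi
}
\end{lstlisting}
The image still has to be loaded first; the function only covers the run and start steps.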
\textbf{Important:} keep any data you wish to save within the folder
\il{rnaseq-workshop}, and make sure the contents of this folder are copied
to either the ``Raumlaufwerke'' folder or to e.g. your USB flash drive or
your Sciebo account before you log out of the computer!
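One possible way to save your work is to pack the whole folder into a single dated archive. This is only a sketch under assumptions: the function name \il{backup_arc} is hypothetical, the destination folder is whatever location you choose, and the function is meant to be run in a normal terminal outside the container.
\begin{lstlisting}
# hypothetical helper: pack $HOME/rnaseq-workshop into a dated archive
backup_arc() {
    dest="$1"  # destination folder of your choice, e.g. a mounted USB drive
    archive="$dest/rnaseq-workshop_$(date +%Y-%m-%d).tar.gz"
    # -C "$HOME" makes the paths in the archive start at rnaseq-workshop/
    tar -czf "$archive" -C "$HOME" rnaseq-workshop
    echo "saved $archive"
}
\end{lstlisting}
Call it as \il{backup_arc </path/to/destination>}, with the path replaced as appropriate.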
\fbox{\begin{minipage}{45em}
If you have any trouble whatsoever with Docker, please ask!
Properly understanding Docker is beyond the scope of this workshop,
so don't worry if the above ``doesn't make sense'',
but you will absolutely need it to be running as expected for
the other parts to work, so just ask :-)
\end{minipage}}
\subsection{After the workshop}
For at least the next six months, the ARC for this workshop is shared publicly under a CC-BY 4.0 license at \url{https://git.nfdi4plants.org/brilator/rnaseq-workshop.git}.
Feel free to download the whole ARC as a zip / tar archive and unpack it, or to \il{git clone} or fork the ARC. Just as with Docker, explaining the full usage of \il{git} is beyond the scope of this course.
To ensure that all code works as used in this reader, make sure to store the ARC as \il{rnaseq-workshop} in your \il{$HOME} directory.
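If you choose \il{git clone}, the target folder can be given directly as a second argument, so that the ARC ends up in the expected place. This sketch assumes that \il{git} is installed and that the repository is still publicly reachable at the URL above.
\begin{lstlisting}
# clone the ARC into $HOME/rnaseq-workshop
git clone https://git.nfdi4plants.org/brilator/rnaseq-workshop.git "$HOME/rnaseq-workshop"
\end{lstlisting}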
Similarly, the built Docker image will remain available on Sciebo. The built image is not included
in the ARC, as we did not have the time to check whether the licenses of the included software permit redistribution.