Scientific Computing resources and High Performance Computing
Getting Started
Getting an account
- Description of the different account types and how to obtain them
Tutorials
- https://scitas-doc.epfl.ch/courses/training-courses/
Connecting to the cluster
- Using X11 forwarding: this does not give you a remote desktop, but it lets you launch graphical applications (e.g., rstudio) on the cluster and forward their windows to your local workstation.
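A minimal connection sketch (this assumes an X server runs on your local machine, e.g. XQuartz on macOS):
# connect with X11 forwarding enabled
ssh -X <username>@jed.epfl.ch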
Running jobs
- Submitting jobs, checking their status, etc.
- Slurm QOS and partitions (set job resources, priorities, etc.); see the sketch below
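A minimal batch-script sketch; the partition and QOS names are placeholders, check the SCITAS documentation for the ones available to you:
#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH --partition=standard   # placeholder: the partition for your workload
#SBATCH --qos=serial           # placeholder: the QOS you are entitled to
#SBATCH --mem=16G
#SBATCH --time=02:00:00
#SBATCH --output=%x_%j.log

srun my_command
Submit with sbatch job.sh, check the queue with squeue -u $USER, and cancel with scancel <jobid>.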
File System
- Introduction to the default file systems:
- Adding $WORK and $SCRATCH to your .bashrc
# Set up some useful environment variables
export WORK='/work/upzenk'
export SCRATCH='/scratch/<username>'
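Reload your shell configuration afterwards so the variables take effect:
# apply the changes to the current shell
source ~/.bashrc
echo $WORK $SCRATCH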
- Additional location: our group's own server. To mount the group server on the cluster, see the instructions. In our case:
Notes:
- $HOME and $WORK are snapshotted daily. To recover data you can use /.snapshots; see Recovering data.
- $SCRATCH: a staging area for large input and output data. Data in scratch should be backed up to our own group server.
- To transfer data between our own group server and the cluster, it is much faster to mount our share first and then rsync (see the sketch below).
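A transfer sketch; the mount point and directory names are placeholders for wherever the share is mounted:
# copy data from the mounted group share into scratch (paths are illustrative)
rsync -avh --progress /path/to/mounted/share/project_data/ $SCRATCH/project_data/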
Using the cluster
Set up software
- Add the directories where we installed cellranger, cellranger-atac, cellranger-arc, rstudio, etc. to $PATH:
cd $WORK
source source_softwares_setup
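To check that the tools now resolve on your $PATH:
# verify the setup
which cellranger rstudio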
Python
Running a Jupyter notebook
Step 1: Activate one of the venvs on /work/upzenk: Python virtual environments with common packages installed (e.g., scanpy).
cd $WORK
source softwares/python_venvs/venv-single-cell/bin/activate
Step 2: Run the Jupyter notebook (a sketch follows below). The notebook session can use all the packages in the activated virtual environment.
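One way to run the notebook on a compute node and reach it from your workstation; the port number and the tunnelling step are assumptions, adjust them to your setup:
# request an interactive node, then start Jupyter without a local browser
Sinteract -m 16G
jupyter notebook --no-browser --ip=$(hostname -i) --port=8888
# on your local machine, tunnel the port and open the printed URL:
# ssh -L 8888:<node-ip>:8888 <username>@jed.epfl.ch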
Install new packages
- pip install into a virtual environment
- conda install: activate the miniconda environment using
source miniconda3/bin/activate
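A sketch of both routes; the package and environment names are illustrative:
# pip: with a venv active (Step 1 above), packages install into that venv
pip install scanpy
# conda: after activating miniconda, create and use an independent environment
conda create -n my-env python=3.11
conda activate my-env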
Notes
- Creating a new conda environment: it is independent of the venvs on $WORK/softwares/python_venvs (i.e., any additional packages we installed there are not synced).
- A Jupyter session is billed for as long as Jupyter stays open, whether you are actually computing or out at lunch with your colleagues. Close Jupyter when you are done.
R
Using RStudio
# log in with X11 forwarding
ssh -X <username>@jed.epfl.ch
# prepare rstudio
module load gcc r
cd $WORK
source source_softwares_setup
# log in to a compute node; specify the memory, cores, and time you want
Sinteract -m 16G
# open Rstudio
rstudio
# check library path
.libPaths()
Install R packages
module load r
# start R, then check the current library paths
R
.libPaths()
# we install packages in '/work/upzenk/softwares/r/r_packages'
.libPaths(c(.libPaths(), "/path/to/the/dir"))
# install the packages into that directory
install.packages('dummy', lib = '/path/to/the/dir')
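To avoid re-adding the path in every session, you can point R_LIBS_USER at the group library, e.g. in your .bashrc (a sketch; where you set it is up to you):
# make the group R library visible in every R session
export R_LIBS_USER='/work/upzenk/softwares/r/r_packages'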
Tips for bioinformatics tools
Snakemake
- Set up the environment: first load the gcc and snakemake modules, then load the tools your pipeline needs (e.g., bwa, samtools); see the sketch below.
- After setting up the environment, you can run the minimal example from https://snakemake.readthedocs.io/en/stable/tutorial/setup.html (you do not need to follow the installation part).
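A setup sketch; the tool modules here are examples, load whichever your pipeline needs:
# load snakemake and the pipeline tools
module load gcc snakemake
module load bwa samtools
# dry-run first, then run the workflow in the current directory
snakemake -n
snakemake --cores 1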
snakePipes
- A tool tailored for epigenetics data analysis, built on snakemake and Python (some R scripts are used as well). See the documentation.
- createIndices
- Follow the guidelines for indexing hybrid genomes (reference + spike-in). To submit the job to the cluster, you can wrap the command in a .sh file (see the sketch after the tips below):
- tip 1: when submitting this job, allocate enough memory (~200G)
- tip 2: check the logs when you hit errors
- tip 3: if a particular aligner fails, delete its output folder and redo that step
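A submission-script sketch; the createIndices arguments are left as placeholders since they depend on your genomes (see the snakePipes docs), and the time limit is an assumption:
#!/bin/bash
#SBATCH --job-name=createIndices
#SBATCH --mem=200G               # tip 1: enough memory for indexing
#SBATCH --time=24:00:00          # assumption: adjust to your cluster limits
#SBATCH --output=createIndices_%j.log

# fill in the createIndices options per the snakePipes documentation
createIndices <options for your hybrid genome>
Submit it with sbatch createIndices.sh and watch the log file (tip 2) if anything fails.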