HPC Pipes
Provider: Faculty of Health and Medical Sciences
Activity no.: 3949-24-00-00
Enrollment deadline: 10/10/2024
Date and time
04.11.2024, at: 08:45 - 05.11.2024, at: 16:30
Regular seats
40
Course fee
2,040.00 kr.
Lecturers
Anders Krogh
ECTS credits
1.50
Contact person
HeaDS Administration E-mail address: heads-admin@sund.ku.dk
Enrolment Handling/Course Organiser
PhD administration E-mail address: phdkursus@sund.ku.dk
Aim and content
This is a generic course, which means that it is reserved for PhD students at the Graduate School of Health and Medical Sciences at UCPH.
Anyone can apply for the course, but if you are not a PhD student at the Graduate School, you will be placed on the waiting list until the enrollment deadline. After the deadline, any available seats will be allocated to applicants on the waiting list.
The course is free of charge for PhD students at Danish universities (except Copenhagen Business School), and for PhD students at NorDoc member faculties. All other participants must pay the course fee.
Learning objectives
A student who has met the objectives of the course will be able to:
1. Explain the purpose and structure of a bioinformatics pipeline
2. Develop and manage reproducible data analysis pipelines using Snakemake or Nextflow
3. Control the software environment of a workflow/pipeline using workspace management tools like Conda and Docker
4. Manage data and computing using best practices (RDM) and appropriate compute provisioning (HPC)
Content
The course HPC-Pipes introduces best practices for setting up, running, and sharing reproducible bioinformatics pipelines and workflows. Rather than instruct on the whys and wherefores of using particular tools for a bioinformatics analysis, we will cover the general process of building a robust pipeline (regardless of data type) using workflow languages, environment/package managers, optimized HPC resources, and FAIRly managed data and tools. On course completion, participants will be able to use this knowledge to design their own custom pipelines with tools appropriate for their individual analysis needs.
The course will provide guidance on how to automate data analysis using common workflow languages such as Snakemake or Nextflow. We will then look at how to ensure that pipelines are reproducible and explore the available options. Participants will learn how to share their data analysis and software with the research community. We will also cover different strategies for managing the research data produced. This includes addressing the challenges posed by large volumes of data and exploring computational approaches that aid in data organization, documentation, processing, analysis, storage, sharing, and preservation. These discussions will cover the reasons behind the increasing popularity of Docker and other containers, along with demonstrations of how to use package and environment managers like Conda to control the software environment within a workflow. Finally, participants will learn how to manage and optimize their pipeline projects on HPC platforms, using compute resources efficiently.
Exercises will be run on the UCloud HPC platform, and participants will be expected to build on existing familiarity with bioinformatics tools and the scripting languages bash and R/Python.
Participants
The course is intended for PhD students, postdocs, and junior faculty at SUND who are interested in learning how to construct and manage bioinformatics pipelines and projects on high-performance computing resources.
Requirements
The workshop is for PhD students at SUND who want to acquire skills in effectively managing data and analyses in bioinformatics. Knowledge of R/Python and bash is required, as well as a basic understanding of an omics analysis pipeline.
We strongly recommend taking this course after completing the course HPC-Launch, a single-day course that covers theoretical concepts for HPC and RDM in health data science.
Relevance to graduate programs
The course is relevant to PhD students from the following graduate programs at the Graduate School of Health and Medical Sciences, UCPH:
- All graduate programs
Language
English
Form
Lectures with active discussion sessions, interactive demos using the UCloud platform, and group work and exercises navigating UCloud and practicing with workflow languages and tools for RDM-compliant project set-up.
Course director
Anders Krogh
Professor, Head of Center for Health Data Science, Head of Health Data Science Sandbox
Center for Health Data Science
anders.krogh@sund.ku.dk
Teachers
The workshop is provided by project members of the Health Data Science Sandbox, a national training and research infrastructure project.
The Sandbox team is building training resources and guides for learning bioinformatics, predictive modeling in precision medicine, high performance computing and data carpentry.
These resources are accessible to all Danish university employees (PhD students and up) via academic supercomputing infrastructure.
Jennifer Bartell
PhD, Senior consultant and Sandbox project manager
Center for Health Data Science, KU
bartell@sund.ku.dk
Alba Refoyo Martinez
PhD, Data Scientist, Sandbox Team
Center for Health Data Science, KU
alba.martinez@sund.ku.dk
Adrija Kalvisa
PhD, Special Research Consultant
ReNEW Genomics Platform, KU
adrija.kalvisa@sund.ku.dk
Stefano Pupe
PhD, Senior Consultant
Center for Health Data Science, KU
stefano.pupe@sund.ku.dk
Dates
4 - 5 November 2024
Course location
Faculty of Health and Medical Sciences, Panum,
Blegdamsvej 3B, 2200 København.
4-Nov Panum 13.1.41/61
5-Nov Holst Auditorium
Registration
Please register by 10 October 2024
Expected frequency
This course will be repeated in Spring 2025.
Seats for PhD students from other Danish universities will be allocated on a first-come, first-served basis and according to the applicable rules.
Applications from other participants will be considered after the last day of enrollment.
Note: All applicants are asked to submit invoice details in case of no-show, late cancellation or obligation to pay the course fee (typically non-PhD students). If you are a PhD student, your participation in the course must be in agreement with your principal supervisor.