Data Science Projects (generic course)
Provider: Faculty of Science

Activity no.: 5543-23-07-21 
Enrollment deadline: 25/01/2023
Place: Department of Mathematical Sciences
Universitetsparken 5, 2100 København Ø
Date and time: 06.02.2023, at: 08:00 - 21.04.2023, at: 16:00
Regular seats: 20
Activity Prices:
  - Participant from SCIENCE: 0.00 kr.
  - Participant Others: 4,800.00 kr.
ECTS credits: 4.00
Contact person: Nina Weisse    E-mail address: weisse@math.ku.dk
Enrolment Handling/Course Organiser: Bo Markussen    E-mail address: bomar@math.ku.dk
Written language: English
Teaching language: English
Semester/Block: Block 3
Scheme group note: Plenum sessions some Tuesday mornings.
Exam form: A report on an agreed topic followed by an oral presentation
Exam form: Oral examination
Exam details: The students hand in a report. The report is presented at the final course day. Both report and oral presentation must be approved. The students are allowed to work and present in 2-person groups (3-person groups may be allowed in exceptional cases).
Grading scale: Passed / Not passed
Exam re-examination: If the student didn't pass based on the written report and the oral defense, then the student has the possibility of resubmitting the report based on the feedback from the examination. The resubmitted report then has to be sufficiently elaborate by itself in order to pass the reevaluation.
Course workload
Course workload category: Hours
Course Preparation: 6.00
Lectures: 8.00
Theory exercises: 4.00
Project work: 80.00
Exam: 6.00

Sum: 104.00


Content
Data Science covers both Machine Learning and Statistics. This generic course provides a platform to develop and work on projects with the student’s own data using either Machine Learning methods, Statistical data analysis, or possibly a combination, with supervisory support from the course teachers. Data sources can range from designed experiments, observational data, and surveys in text or digital formats to pictures, scans, videos, or graphs, all related to some scientific investigation, typically from the PhD student’s own work.

Depending on the primary scope of each data analysis problem, either an expert in Machine Learning or an expert in Statistics will supervise the project. Typically, Machine Learning projects will use Python, and Statistics projects will use R. However, other software platforms are possible depending on the student’s preferences.

Typical analyses within the scope of Machine Learning could be automated quantification of objects of interest in the data (for example, image analysis), combining different types of data to address a common research question (like combining text and measurements), or building predictive models.

A typical analysis within the scope of Statistics is the modelling of experimental data in order to establish an associative or causal relation between an outcome of interest and some explanatory variables, e.g. the application of different treatments. Subsequently, the aim is to quantify relations that cannot be explained by biological variation, but must be attributed to real effects.

The course report will be a manuscript written like a draft of a research paper – that may ideally be completed and submitted to a journal following the course.

No later than one week prior to the course, the participants should submit a synopsis with a short draft description of their data and the desired outcome. This will allow us to consider plenum lectures on some specific analysis methods and to plan the project supervisions.

Formal requirements
We invite PhD students from all SCIENCE departments. For Machine Learning projects, some experience with programming is required. For Statistics projects, our course “Statistical Methods for SCIENCE” or a similar course is required. Please email and ask the course organizer in case of doubts about prerequisites. The number of participants is limited to 20. If the course is overbooked, priority is given to students who previously followed another Data Science Lab course (Introduction to Python or R, Statistical Methods for SCIENCE, and Machine Learning for SCIENCE).

PhD students from outside UCPH SCIENCE are permitted for a fee, if seats are available.

Learning outcome
After course completion, the students are expected to be able to:

Knowledge:
- Describe the analysis methods used by others for similar problems.
- Describe relevant, alternative approaches for solving the problem.

Skills:
- Develop/adapt/extend a computer-based software method for quantification and/or analysis of their own data.

Competences:
- Formulate scientific questions from their PhD project in terms of research hypotheses.
- Interpret the results of their computer-based analysis in relation to their PhD project.

Literature
This depends on the individual project.

For potential background literature, see the course pages for Introduction to Python, Introduction to R, Statistical Methods for SCIENCE (SMS), and Machine Learning for SCIENCE (MLS), all listed on the Data Science Lab homepage.

Teaching and learning methods
The first few course days include traditional lectures. Following this, the majority of the work will be organized during individual supervision meetings with the course lecturers. The projects must result in research article style reports and be presented before the class at the concluding examination seminar. The projects may be done using software packages of the participant’s own choice. The lecturers have particular experience with the software frameworks Python, Matlab, R, and SAS.

Remarks
For details for this and other Data Science Lab courses, see: http://datalab.science.ku.dk/english/course/
