| Milestone | Date | Time | Duration | Place | Form |
|---|---|---|---|---|---|
| Weekly Meetings | every Friday | 10:00 | 3 hours | Campus North, Building 449, Room 243 (SCC-IAI) | discussions |
| Functional Specification (Pflichtenheft) | 27.11.2015 | 10:00 | 3 weeks | Campus North, Building 449, Room 243 (SCC-IAI) | written |
| Design (Entwurf) | 15.01.2016 | 10:00 | 4 weeks | Campus North, Building 449, Room 243 (SCC-IAI) | written |
| Implementation | 12.02.2016 | 10:00 | 4 weeks | Campus North, Building 449, Room 243 (SCC-IAI) | written |
| Test / Quality Control (Qualitätssicherung) | 04.03.2016 | 10:00 | 3 weeks | Campus North, Building 449, Room 243 (SCC-IAI) | written |
| Preliminary Presentation | 15.03.2016 | | | | internal test |
Dynamic scheduler for scientific simulations @ SimLab EA Teilchen
Simulation Laboratories (SimLabs) are established at the Steinbuch Centre for Computing as an advanced interface between users and operators of high-performance computing. In particular, the SimLab for Elementary and Astroparticle Physics provides high-level support for research groups working in this field, making complex theoretical and experimental problems solvable on the world's best supercomputing resources. Many simulation codes solving problems in particle physics and astrophysics are easily parallelisable: these simulations can be performed as a fixed or dynamic set of independent, logically identical tasks that can be run separately (one by one or in parallel).
The goal of this work is to develop an adaptive scheduler with a statistics-collection, bookkeeping and visualisation system for multi-task scientific simulation codes.
The parallel tasks will need to manage additional tasks that arise dynamically during the simulation.
Information about running and completed tasks must be collected and used to estimate the resource requirements and execution times of possible future tasks.
The schedulers of different parallel tasks must be interconnected to optimise the use of computational power, i.e. to reach higher load balance and scalability.
Prototype codes solving problems from astrophysics and nuclear physics, as well as executor interfaces, are provided.
Working and meeting rooms will be provided at Campus North and South of SCC/KIT.
The project will be supervised by Elizaveta Dorofeeva and Gevorg Poghosyan.
Internal structure of the parallel scheduler
The scheduler consists of separate modules: Scheduling, Data Mining, Database, Visualisation and Communication.
The modules of the scheduler are supposed to do the following jobs.
Scheduling module:
- contains a variety of scheduling algorithms and strategies;
- makes a decision on how to place a task in the schedule (queue it or run it immediately, and how many resources the task needs);
- does bookkeeping;
Data Mining module:
- is called from Scheduling Module;
- analyses the statistics to define resource requirements for a task;
- keeps statistics for completed tasks;
Database module:
- is used to keep statistics, bookkeeping data and the local schedule;
- can either be accessed in parallel or be replicated among the processes, with synchronisation during the computation or accumulation at the end of it;
Visualisation module:
- is used to visually analyse the collected data;
Communication module:
- has to organise multi-level communication among multiple parallel schedulers (one computation – several processes – multiple tasks). You can choose how to distribute tasks among multiple processes:
- submit new multiprocessor jobs using shell-script generation or a cluster submission system API (e.g. the Moab API);
- use the MPI 3.0 mechanism to change the size of your parallel world (the number of processes you are using for the computation) according to the current requirements;
- fix the size of the parallel world and distribute the tasks among the reserved processes according to the best-suited scheduling algorithm;
- use a combination of the previous methods, or your own.
By “task” we mean a pair: a data type and the data of that type.
During the work you will:
- study different scheduling algorithms and strategies;
- learn C++ Template Meta-Programming to develop adaptive modules;
- deal with a database system (of your choice) to work with statistics and bookkeeping;
- work with the API of a visualisation system (of your choice, e.g. Gnuplot or ParaView);
- learn more about parallel programming models;
- work with MPI/OpenMP/CUDA (of your choice);
- learn about distributed and parallel database systems;
- do the final tests of your software on real high-performance systems.
Prerequisites:
- English (at least basic)
- UNIX-based systems (basic knowledge)
- C++ (basic knowledge)
- MPI/OpenMP/CUDA (optional)
- eagerness to learn new things