DDIT Department

The purpose of this service is to provide IT support (hardware and software) for both the scientific (bioinformatics) and logistical (IT) tasks that require it within the institute.

Team

  • Raphaël Helaers, bioinformatician, is the head of the service. He manages software projects, the HPC cluster and related services. You can contact him for the services marked (RH) below.

  • Jacques Vanclève, IT Manager, handles IT problems that arise at the Institute as well as projects requiring IT support. You can contact him for the services marked (JV) below.

  • Mickaël Nguyen, developer, is mainly involved in the IT development of Sirona.

Proposed services

The following services are currently offered, with examples already in place and/or under development:

  • Develop and maintain custom software to manage or automate scientific or logistical data (RH).

    • DDUV Order: software that allows all members of the institute to place orders. Through the same interface, orders can be validated by the PIs, encoded by the CLCs in the UCLouvain system, and traced from delivery to storage in the lab; it also automates part of invoice management for accounting. All information about DDUV Order can be found on its intranet page.

    • Sirona: an extensive solution that will manage all the data from each laboratory of the institute – storage, samples, products, patients, experiments, results, projects, etc. All information about Sirona can be found on its intranet page.

    • Phosphofinder: software used by the MASSPROT platform which automates the analysis of peptides via a graphical interface. All information about Phosphofinder can be found on its intranet page.

    • Any other software development proposal to perform specific tasks or solve a problem can be submitted to the DDIT service.

  • Provide the computing capacity needed for bioinformatics analyses (RH). This involves maintaining and regularly updating our HPC cluster, but also providing access to researchers wishing to use it themselves and supervising them.

    • Currently around thirty researchers from the institute have a user account on the cluster and can launch their analyses there themselves, using a resource reservation system (number of processors and desired memory) and intelligent queuing.
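As an illustration of what such a resource reservation looks like in practice, below is a hypothetical batch-script sketch. It assumes a Slurm-style scheduler; the actual scheduler, option names and limits on the cluster may differ, and `my_analysis_tool` is a placeholder for the user's own program – consult the cluster's intranet wiki for the real submission procedure.

```shell
#!/bin/bash
# Hypothetical job script -- scheduler syntax and tool name are assumptions,
# not the cluster's documented interface.
#SBATCH --job-name=my_analysis
#SBATCH --cpus-per-task=8      # number of processors to reserve
#SBATCH --mem=32G              # desired memory
#SBATCH --time=12:00:00        # wall-time limit

# Run the analysis with the reserved resources
my_analysis_tool --threads 8 --input /path/to/dataset
```

In a Slurm-style system such a script would be submitted with `sbatch my_analysis.sh`, after which the queuing system schedules the job as soon as the requested processors and memory become available.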

  • Set up and maintain bioinformatics services on the HPC cluster for groups that make regular/intensive use of them (RH).

    • NGS analysis pipelines: they process raw data from WES, WGS, panels or RNAseq and produce usable data for biological analyses. The PGEN platform uses them daily.

    • 10X Genomics analysis pipeline: it processes raw data from Single Cell Sequencing and produces data that can then be viewed with 10X Genomics graphical tools on a “simple” workstation.

    • Alphafold: an algorithm recently released by DeepMind that uses artificial intelligence to predict the structure of any protein far more accurately than existing alternatives. To do this, it uses graphics processors (GPUs) installed on a dedicated part of the cluster's infrastructure. You can submit your proteins via the dedicated intranet page.

    • RStudio Server: a centralized version of RStudio (usable remotely from any workstation) that benefits from the significant resources of the HPC cluster. Laurent Gatto's CBIO group uses it daily, and the interface can be accessed here.

    • Any other proposals for existing software to be installed on the cluster can be submitted to the DDIT service.

  • Make computer servers available in the form of virtual machines. (RH & JV) In practice, part of the cluster is isolated from the rest of the infrastructure, with the resources necessary for its operation (processors, memory, storage space) reserved for it. This makes it possible, for example, to generate a server on demand to host an internet/intranet site, or to run software requiring a “powerful” computer. It avoids buying, maintaining and backing up dedicated computers, while optimizing the use of resources (which can easily be increased or decreased on demand).

  • Manage and provision a mass storage solution. (RH & JV) Several hundred terabytes of space are made available in the form of directories with controlled access (allowing, for example, all members of the same group to access them). The data is replicated several times and "snapshots" are kept several times a week (so you don't have to keep a separate backup of your data). It is therefore much safer, and faster to access, than the shared hard disk / NAS solutions that some groups currently use.

  • Provide support for the purchase and installation of computer equipment. (JV) Support will be provided for choosing the right hardware configuration for users' needs, for both Windows and macOS machines, and for installing the hardware on delivery of the order.

  • Provide support for the configuration of various software. (JV) This covers the software available via UCL, but also specific analysis and administration tools within the Institute.

  • Provide network support. (JV) Enables network access for equipment requiring a wired or Wi-Fi network connection within the Institute.

  • Provide first-line support to resolve IT issues. (JV) In collaboration with the UCL IT department (SIBX), direct support will be provided to users for all problems related to computer hardware and software.

  • Assist in the maintenance and management of research-group-specific databases. (JV) Various databases related to the groups' specific research fields are hosted and managed within the Institute.

  • Provide individual research data backup solutions. (JV) Help users save their data in a sustainable way by offering appropriate tools according to their needs.

  • Provide support for the use of videoconferencing systems. (JV) Several videoconferencing solutions, fixed or mobile, are available within the institute.

Pricing

Since most of these services are useful to the community, their cost is borne entirely by the Institute. Projects with a scope limited to a single group may be invoiced according to the scope of the task (to be assessed by the DDIT).

HPC Cluster

The de Duve Institute has a high-performance computing cluster. You can find all useful information, including documentation in the form of a WIKI, on the cluster's intranet page. The cluster is made up of several groups of machines with dedicated profiles.

  • 1 entry node (ddus): the so-called “ddus” machine is dedicated to users. It has 64 GB of RAM and 8 cores (16 threads), 12 TB of local storage in RAID 1, as well as a 1 TB SSD dedicated to MySQL. It allows users to test their jobs locally before submitting them. The local storage makes it easier for researchers to transfer data, such as datasets, that needs to be processed or stored directly on the cluster. The software installed on ddus is deployed identically on all the machines in the cluster.

  • 10 nodes with high computing capacity (dd01-dd10): these machines are used for all the services offered and represent the bulk of the cluster's computing power. Their configuration was designed to maximize the number of cores, the amount of RAM installed, and both HDD and SSHD disk space. Shared storage is made available in a single volume of ~600 TB via BeeGFS on ZFS (RAID 5). Each of the 10 machines has 2 AMD Epyc 7281 CPUs (16 cores @ 2.1 GHz) and 128 GB of RAM. Together they offer a total of 320 cores (640 threads), 1.2 TB of RAM and 1.1 PB of raw storage.

  • 2 nodes with high memory capacity (ddc1-ddc2): these machines have more modest processors but a large amount of RAM, allowing certain very memory-intensive analyses to be carried out. Each of the 2 machines has 64-core AMD CPUs and 256 GB of RAM. They offer a total of 128 cores, 512 GB of RAM and 8 TB of storage.

  • 2 nodes dedicated to GPUs (dd-gpu01 – dd-gpu02): these machines are equipped with GPUs (graphics processors), allowing certain applications that take advantage of this technology (e.g. Alphafold) to run at high speed. The dd-gpu01 machine has an RTX 3080 Ti GPU (10,240 CUDA cores, 12 GB of memory), 2 AMD CPUs (12 cores each) and 220 GB of RAM. The dd-gpu02 machine has 4 GTX 1070 Ti GPUs (2,432 CUDA cores and 8 GB of memory each), 2 AMD CPUs (12 cores each) and 220 GB of RAM.

  • 3 backup nodes: these machines are dedicated to the replication of cluster data, in order to overcome any hardware problems that may arise and to guarantee the integrity of stored data.