Research Cluster (Plato)

Plato is a Linux-based high-performance computing (HPC) cluster designed to support your research projects. It can also be used for training highly qualified personnel in advanced research computing (ARC) techniques and applications. Plato has 120 compute nodes in total, with 7.4 TB of RAM and 2,000 CPU cores, delivering a theoretical 64 double-precision TFLOPS. It is a general-purpose (GP) cluster consisting of 94 general-purpose compute nodes, 2 GPU nodes, 2 large-memory nodes and 22 contributed nodes. The cluster uses SLURM for job scheduling and resource management, Bright Computing software for cluster management, and runs Red Hat Enterprise Linux (RHEL) 7.3 as its operating system.

HPC cluster Plato is provided by University ICT Academic and Research Technologies (ART). Although it is not part of the Compute Canada ARC infrastructure, Plato is very similar in its design principles to Compute Canada resources. Using CVMFS, Plato gives the University's researchers access to the Compute Canada scientific software stack.

Access to HPC cluster Plato is managed through University ICT Authentication and Access Management and is based on certain eligibility criteria. Computational resources on Plato, such as CPU cores, RAM, shared storage space and GPUs, are allocated on a fair-share basis using accounting groups.
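
For example, once your account is active you can use standard SLURM commands from the login node to inspect the cluster and your group's fair-share standing. This is only a minimal sketch; the accounting group name is a placeholder you must replace with your own:

sinfo                          # list partitions and node states
squeue -u $USER                # show your queued and running jobs
sshare -A <accounting_group>   # show the fair-share standing of your accounting group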

Check out the sections below for additional information:

Specifications


  • 2 Dell PowerEdge R430 head nodes:
    • 2 x eight-core Intel Xeon processors
    • 31 GB RAM
    • 10 Gb Ethernet to USASK network
    • 10 Gb Ethernet to Cluster network
    • 18 TB NAS shared storage (/home)
    • 450 TB DATAStore (/datastore)
    • High availability
    • No user access

  • 2 Dell PowerEdge R430 login/interactive nodes (plato.usask.ca):
    • 2 x eight-core Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz
    • 31 GB RAM
    • 10 Gb Ethernet to USASK network
    • 10 Gb Ethernet to Cluster network
    • 18 TB NAS shared storage (/home)
    • 450 TB DATAStore (/datastore)
    • Software stack via CVMFS2 (/cvmfs/soft.computecanada.ca) 
    • High availability
    • Eligible users access

  • 30 HPE ProLiant DL160 G8 compute nodes (plato101-124, plato201-202, plato301-304):
    • 2 x eight-core Intel(R) Xeon(R) CPU E5-2650L 0 @ 1.80GHz
    • 31 GB RAM
    • 2 x 1 Gb Ethernet to Cluster network
    • 18 TB NAS shared storage (/home)
    • 347 GB local storage drive (/local)
    • Software stack via CVMFS2 (/cvmfs/soft.computecanada.ca)

  • 64 Dell PowerEdge C6220 II high density compute nodes (plato225-248, plato309-348):
    • 2 x eight-core Intel(R) Xeon(R) CPU E5-2640 v2 @ 2.00GHz 
    • 31 GB RAM
    • 2 x 1 Gb Ethernet to Cluster network
    • 18 TB NAS shared storage (/home)
    • 347 GB local storage drive (/local)
    • Software stack via CVMFS2 (/cvmfs/soft.computecanada.ca)

  • 2 Dell PowerEdge C4130 GPU nodes (platogpu103-104):
    • 2 x eight-core Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz
    • 31 GB RAM
    • 2 x 1 Gb Ethernet to Cluster network
    • 2 x NVIDIA K40 GPU
    • 18 TB NAS shared storage (/home)
    • 805 GB local storage drive (/local)
    • Software stack via CVMFS2 (/cvmfs/soft.computecanada.ca)

  • 1 Dell PowerEdge R920 big memory node (platobmem502):
    • 4 x twelve-core Intel(R) Xeon(R) CPU E7-4850 v2 @ 2.30GHz
    • 2 TB RAM
    • 2 x 10 Gb Ethernet to Cluster network
    • 18 TB NAS shared storage (/home)
    • 1.5 TB local storage drive (/local)
    • Software stack via CVMFS2 (/cvmfs/soft.computecanada.ca)

  • 1 Dell PowerEdge R910 big memory node (platobem501):
    • 4 x eight-core Intel(R) Xeon(R) CPU E7-8837 @ 2.67GHz
    • 840 GB RAM
    • 2 x 10 Gb Ethernet to Cluster network
    • 18 TB NAS shared storage (/home)
    • 163 GB local storage drive (/local)
    • Software stack via CVMFS2 (/cvmfs/soft.computecanada.ca)

  • 20 HPE ProLiant SL210t G8 CONTRIBUTED high density compute nodes (plato205-224):
    • 2 x eight-core Intel(R) Xeon(R) CPU E5-2640 v2 @ 2.00GHz 
    • 31 GB RAM
    • 2 x 1 Gb Ethernet to Cluster network
    • 18 TB NAS shared storage (/home)
    • 347 GB local storage drive (/local)
    • Software stack via CVMFS2 (/cvmfs/soft.computecanada.ca)
    • No public access

  • 2 Dell PowerEdge C4130 CONTRIBUTED GPU nodes (platogpu101-102):
    • 2 x sixteen-core Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz
    • 250 GB RAM
    • 2 x 1 Gb Ethernet to Cluster network
    • 2 x 50 Gb InfiniBand to MPI network 
    • 2 x NVIDIA K80 GPU
    • 18 TB NAS shared storage (/home)
    • 768 GB local storage drive (/local)
    • Software stack via CVMFS2 (/cvmfs/soft.computecanada.ca)
    • No public access

  • Total: 120 nodes, 7.4 TB RAM, 2,000 CPU cores, 64 double-precision theoretical TFLOPS
  • 3 Dell N2048 1 Gb Ethernet Compute fabric switches
  • 3 Dell N3048 1 Gb Ethernet Storage fabric switches
  • 2 Dell S4048-ON 10/40 Gb Ethernet Core fabric switches
  • RHEL 7.3 Operating System
  • CVMFS access to the Compute Canada scientific software stack
  • Local custom scientific software stack
  • Bright Computing Cluster management software 
  • 18 TB high I/O Dell FS8600 NAS shared storage
  • Local hard drives on compute nodes are used for local scratch
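
As a sketch of how the local scratch drives might be used inside a job, input data can be staged from /home to /local and results copied back before the job ends (the file and program names below are hypothetical):

mkdir -p /local/$USER/$SLURM_JOB_ID                # per-job scratch directory on the node's local drive
cp ~/project/input.dat /local/$USER/$SLURM_JOB_ID/ # stage input data from /home
cd /local/$USER/$SLURM_JOB_ID
./my_program input.dat                             # run against the fast local drive
cp results.out ~/project/                          # copy results back to /home
rm -rf /local/$USER/$SLURM_JOB_ID                  # clean up local scratch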

Available Software

Most of the software you will need to use the Plato research cluster can be found here:

/share/apps

Environment variables for a given application are configured via the "module" software.

To see your currently configured modules, use

module list

To see the list of available modules, use

module avail

To load a module for one login shell/session:

module add <modulename>
or
module load <modulename>

To load a module for every future login shell/session:

module initadd <modulename>

For more information, see

man module
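
For example, a typical session might look like the following sketch (the module name and version are placeholders; run "module avail" to see what is actually installed):

module avail                          # list available modules
module load openmpi/<version>_gnu     # load an MPI environment for this session
module list                           # confirm which modules are loaded
module unload openmpi/<version>_gnu   # remove it when finished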

Access Policies

  1. ICT provides Plato for research use requiring High Performance Computing by University of Saskatchewan research groups working on U of S research projects.
  2. Groups eligible for Plato access are headed by a U of S faculty member, referred to as the Principal Investigator (PI), and include the PI and their associated students, staff and collaborators.
  3. Access to Plato is granted by ICT only on request of the PI, who manages membership of his or her group by requesting access for a list of students, staff and collaborators. The PI is responsible for the actions of the group's members on Plato, ensuring appropriate (e.g. related to U of S research projects and per university policies) work is done on Plato.
  4. There is currently no direct charge for the service to researchers.
  5. Each group is assigned an equal potential share of the system, regardless of number of members/accounts in a group. Scheduling occurs at the group level.
    1. Members of the same group are expected to coordinate their usage with each other. 
    2. Maximising a group’s actual usage in the “equal potential share” of the system requires regular submission of jobs over the long term (months and years). 
    3. The share of the system is enforced by the scheduling software; users shall use the job batching system unless other arrangements have been made.
  6. Priority access for a faculty group or guaranteed allocations to Plato resources can be requested in service of faculty grant applications (and similar). For each request, ICT will determine the details of the request and make a recommendation to the HPC Advisory Committee.
  7. Non-traditional uses (e.g. custom images, web services for cluster nodes, etc.) are possible and feasibility is determined on a case-by-case basis by ICT.
  8. Limits on job parameters (e.g. maximum walltime or processor equivalents, limits to number of jobs submitted at a time, etc.) may be imposed to maintain the integrity of the system, optimize the use of the hardware and/or improve fairness of scheduling.
  9. Plato cannot by itself provide all needed HPC computational cycles for the entire university. Plato should be considered a stepping stone to shared resource usage (e.g. WestGrid/Compute Canada).
  10. For ICT to grant access to Plato for a PI and/or the members of their group, the PI must agree to:
    1. Provide a one paragraph abstract describing the research project:
      1. ICT intends to publish the abstracts on a webpage of ongoing research using Plato. ICT will ask the PI for permission (and approval of publication date, where needed) to publish the abstract.
    2. Report annually on research outcomes achieved using Plato:
      1. Provide (and allow ICT to publish on webpage or similar) a list of papers, theses and conference presentations published using results generated on Plato.
  11. Conditions of use:
    1. Runaway user jobs may be terminated with little or no warning to preserve the integrity of the system.
    2. Plato's uptime and network availability are on a best effort basis. Plato is not on emergency power or UPS.
    3. Users will arrange their own long term data storage. Some disk space is available for intermediate calculations and input data, but data on Plato in user directories is not backed up. ICT provides several data storage options. Users may be asked to remove data from Plato that is not immediately required as inputs for computation. Quotas may be enforced.
    4. There is no guarantee on availability of computational cycles to a PI Group.
    5. Development tools, parallel libraries, math libraries and compilers are provided on Plato; additional compiler/library purchase/installation requests will be handled on a case-by-case basis. ICT cannot commit in advance to the purchase or installation of all scientific libraries or software.
    6. Installation of scientific/research software, customization of Plato nodes and technical support are available on a best-effort basis from ICT.
    7. ICT/Plato administrators may make changes to the priorities of jobs, or request delay/termination of running jobs, to facilitate overall throughput.

Compilers

GNU Compilers

/usr/bin/gcc
/usr/bin/g77
/usr/bin/gfortran
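
For example, a serial C program can be compiled and tested directly with the GNU toolchain (the source file name below is hypothetical):

gcc -O2 -o hello hello.c   # compile with optimisation
./hello                    # quick test run on the login node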

Intel Compilers

In order to configure your session to use the Intel compilers for "mpif90" and "mpicc", you should ensure that you do not have another MPI configuration loaded:

module unload rocks-openmpi

Then use the following commands to load the correct configuration:

module load intel/<version>
module load openmpi/<version>_intel

To make the correct libraries available to programs running in the batch system, use the "initadd" command:

module initadd intel/<version>
module initadd openmpi/<version>_intel

Note that the Intel compilers include the optimised Math Kernel Library (MKL) with BLAS, LAPACK, ScaLAPACK, LINPACK and FFT libraries. Intel's link line advisor utility can help you determine the library linking procedure.
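
As a sketch, once the Intel and OpenMPI modules are loaded, MPI programs can be compiled with the "mpicc" and "mpif90" wrapper compilers (the source file names below are hypothetical):

module load intel/<version>
module load openmpi/<version>_intel
mpicc -O2 -o mpi_hello mpi_hello.c        # C source, Intel-backed wrapper
mpif90 -O2 -o mpi_solver mpi_solver.f90   # Fortran source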

Modules (Set up environment for MPI)

The "module" command controls which parallel processing environment is loaded at each shell invocation for the default installed versions of mpi on the cluster. Unless your software requires something else, use the OpenMPI environment.

module initadd openmpi/<version>_gnu
or 
module initadd openmpi/<version>_intel 
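
Putting this together, a minimal SLURM batch script for an MPI job might look like the following sketch (the job name, resource requests and program name are assumptions; adjust them for your own work):

#!/bin/bash
#SBATCH --job-name=mpi_test        # hypothetical job name
#SBATCH --ntasks=16                # number of MPI ranks
#SBATCH --mem-per-cpu=1G           # memory per CPU core
#SBATCH --time=01:00:00            # walltime limit

module load openmpi/<version>_gnu  # or openmpi/<version>_intel
srun ./mpi_hello                   # launch the MPI program under SLURM

Submit the script with "sbatch scriptname.sh" and monitor it with "squeue -u $USER".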