Latest news

Resolved: Disruption of the Exchange e-mail service

May 26, 2017

The disruption of the Exchange service was resolved on May 26, 2017, at 10:45.

(resolved) Fault in the fiber-optic link of the administrative network between Halbmondstraße and RRZE

May 22, 2017

Due to a fault in the fiber-optic link of the administrative network (ZUV) between the Halbmondstr. access point and the RRZE, the administrative workstations in the Schloss, in Halbmondstr., and in Krankenhaus- and Turnstr. currently cannot reach any of the central services or the Internet.

Downtime of HPC clusters LiMa, TinyFAT & parts of Woody on Mon, May 15 (FINISHED)

12. Mai 2017

Due to urgent work on the power grid, the HPC clusters LiMa, TinyFAT & parts of Woody (w10xx = :sb and w12xx = :sl16) as well as Memoryhog have to be shut down on Monday, May 15th starting at 7 o’clock in the morning. As usual, jobs that would collide with the downtime will be postponed.


Tinyblue Cluster

[Photograph of the RRZE Tinyblue cluster]

This cluster was shut down at the end of September 2016. This page is therefore outdated and only serves as historical documentation.

The RRZE's Tinyblue cluster (IBM) is a high-performance compute resource with a high-speed interconnect. It is intended for distributed-memory (MPI) or hybrid parallel programs with medium to high communication requirements.

  • 84 compute nodes, each with two quad-core Xeon 5550 "Nehalem" chips running at 2.66 GHz (8 physical cores + SMT per node), 8 MB shared cache per chip, 12 GB of RAM (DDR3-1333), and 200 GB of local scratch disk

  • InfiniBand interconnect fabric with 40 GBit/s bandwidth per link and direction

  • Overall peak performance of ca. 7 TFlop/s (?.?? TFlop/s LINPACK)

Tinyblue is designed for running parallel programs that use significantly more than one node. Jobs using less than one node are not supported by the RRZE and may be killed without notice.

This page covers the following topics:

  • Access, User Environment, and File Systems
  • Batch Processing
  • Further Information

Access, User Environment, and File Systems

Access to the machine

Users can connect to

woody.rrze.uni-erlangen.de

and will be randomly routed to one of the frontends for Woody, as there are no separate frontends for Tinyblue. See the documentation for the Woodcrest cluster for information about these frontends. Although the Tinyblue compute nodes actually run Ubuntu LTS, the environment is compatible: there is no difference in compiling for Woody or Tinyblue, i.e. programs compiled for Woody will run on Tinyblue as well.
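
For example, a connection from a Linux machine could look like this (the account name hpcuser is only a placeholder, not an actual RRZE account):

  ssh hpcuser@woody.rrze.uni-erlangen.de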

For submitting jobs, you will have to use the command qsub.tinyblue instead of the normal qsub.
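
A submission could, for instance, look like this (jobscript.sh and the requested node count and walltime are example values; the resource request follows the rules described under Batch Processing below):

  qsub.tinyblue -l nodes=4:ppn=16,walltime=01:00:00 jobscript.sh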

In general, the documentation for Woody applies; this page only lists the differences from Woody.

File Systems

Node-local storage $TMPDIR

Each node has 200 GB of local hard drive capacity for temporary files (instead of the 130 GB Woody has), available under /tmp/ (also accessible via /scratch/).
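
A minimal sketch of how the node-local scratch space might be used inside a job script (all file and program names are placeholders):

  # stage input to the fast node-local disk, work there, copy results back
  cp $HOME/input.dat $TMPDIR/
  cd $TMPDIR
  ./my_program input.dat > result.out
  cp result.out $HOME/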

Batch Processing

The batch system works just like on Woody; the few notable differences are:

  • The command for job submission is qsub.tinyblue instead of just qsub.
  • The compute nodes do not have 4 cores like Woody, but 8 physical cores plus 8 SMT cores. This means that the operating system will see 16 cores. You thus always have to request ppn=16 on every qsub.
  • With the Nehalem, Intel has reintroduced the concept of Hyper-Threading, although they now call it simultaneous multithreading (SMT), and this time it actually is useful for some applications. You should test whether your application runs better or worse with SMT. To run a job without using SMT, you still have to request all 16 cores of a node (see the previous point!) and then restrict your program to only the 8 "real" ones. The "real" cores on Tinyblue are the ones numbered 0-7: cores 0-3 are on the first physical socket, 4-7 on the second; 8-15 are the corresponding virtual cores created by SMT. If you use mpirun, you can simply use the parameters -npernode 8 -pin "0 1 2 3 4 5 6 7" to restrict your program to the right cores (see the example job script after this list).
  • Effective June 2010, jobs requesting more than 32 nodes will wait in the route queue until the big queue is enabled. The big queue will usually be activated only once or twice per week to avoid draining TinyBlue for short-running huge jobs.
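
To put the points above together, a job script for Tinyblue might look roughly like the following sketch; the module name and the program name are placeholders and not part of the original documentation:

  #!/bin/bash -l
  #PBS -l nodes=4:ppn=16
  #PBS -l walltime=01:00:00
  #PBS -N tinyblue-example

  # placeholder: load whatever MPI environment your application needs
  module load openmpi

  cd $PBS_O_WORKDIR

  # start 8 MPI processes per node and pin them to the physical cores 0-7,
  # i.e. run without SMT as described above
  mpirun -npernode 8 -pin "0 1 2 3 4 5 6 7" ./my_mpi_program

The script would then be submitted with qsub.tinyblue instead of qsub.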

Further Information

Intel Xeon 5550 "Nehalem" Processor

The Xeon 5550 processor implements Intel's Nehalem microarchitecture and is a quad-core chip running at 2.66 GHz. The most significant improvements compared to the Core 2 based chips (as used, e.g., in our Woodcrest cluster) have been made to the memory interface; in addition, the chips can dynamically overclock themselves as long as they stay within their thermal envelope.

The memory controllers are no longer in the chipset but integrated into the CPU, a concept familiar from the Opteron CPUs of Intel's competitor AMD. Intel has, however, decided to go the whole hog: each CPU has no fewer than three independent memory channels, which leads to a vastly improved memory bandwidth compared to Core 2 based CPUs like the Woodcrest. Please note that this improvement really only applies to the memory interface; applications that run mostly from the cache do not run better on Nehalem than on Woodcrest.

The physical CPU sockets are coupled by Intel's QuickPath Interconnect (QPI). As the memory is now attached directly to the CPUs, accesses to the memory of the other socket have to go through QPI and the other processor, so they are slower and more expensive. In other words, the Nehalems are ccNUMA machines.
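
Because of this ccNUMA design, it can be worth checking where a process and its memory actually end up. One simple way to inspect the NUMA topology on a node is the numactl tool (assuming it is installed on the compute nodes, which the original page does not state):

  numactl --hardware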

InfiniBand Interconnect Fabric

The InfiniBand network on Tinyblue is a quad data rate (QDR) network, i.e. the links run at 40 GBit/s in each direction. It is fully non-blocking, i.e. the backbone can handle the maximum amount of traffic coming in through the client ports without any congestion. However, because InfiniBand uses static routing (once a route between two nodes is established, it does not change even if the load on the backbone links changes), it is possible to generate traffic patterns that cause congestion on individual links. This is unlikely to happen with normal user jobs.

Last modified: October 7, 2016
