LBM: Optimized Implementations of the Lattice Boltzmann Method in 3D

Lattice Boltzmann methods (LBM) are popular for the numerical simulation of incompressible flows. This project aims to investigate and optimize simple lattice Boltzmann kernels for different architectures, including both commodity "off-the-shelf" architectures and tailored HPC systems such as vector computers. We cover modern 64-bit processors ranging from IA32-compatible (Intel Xeon/Nocona, AMD Opteron), superscalar RISC (IBM Power4), and IA64 (Intel Itanium 2) to classical vector (NEC SX6) and novel vector (Cray X1) architectures.

[Figure: Writing to a cell using the LIJK layout]

In the course of this project, we supervised the Bachelor Theses of Stefan Donath and Johannes Habich and published several papers.
The Bachelor Thesis of Stefan Donath, as well as the first report (see section Papers below), examines the influence of different memory layouts on the performance of simple lattice Boltzmann kernels. Reordering the data in the array made it possible to outperform standard cache-optimizing techniques such as spatial blocking.
Stefan Donath presented his results at the SIAM Conference on Computational Science & Engineering 2005 in Orlando, Florida.
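
The effect of a memory layout can be made concrete by looking at the linear index it assigns to each distribution value. The following is a minimal sketch, not the code used in the thesis. It assumes a D3Q19 model and Fortran-style layout names in which the leftmost index varies fastest; the function names `index_lijk`, `index_ijkl` and `stride` are hypothetical.

```python
# Linear-index helpers for two memory layouts of a distribution array.
# Assumption: Fortran-style naming, i.e. the leftmost index varies fastest.

Q = 19  # number of discrete velocities (D3Q19), assumed for illustration


def index_lijk(l, i, j, k, nx, ny, nz):
    """Direction index l is fastest: all Q values of one cell are contiguous."""
    return l + Q * (i + nx * (j + ny * k))


def index_ijkl(l, i, j, k, nx, ny, nz):
    """Cell index i is fastest: each direction forms its own contiguous 3D block."""
    return i + nx * (j + ny * (k + nz * l))


def stride(index_fn, axis, nx=8, ny=8, nz=8):
    """Distance in elements between two entries that differ by one along `axis`."""
    base = dict(l=0, i=0, j=0, k=0)
    step = dict(base)
    step[axis] += 1
    return index_fn(nx=nx, ny=ny, nz=nz, **step) - index_fn(nx=nx, ny=ny, nz=nz, **base)
```

Under these assumptions, `stride(index_lijk, "l")` is 1 while `stride(index_ijkl, "l")` equals the number of cells in the grid, which is why the two layouts stress the cache so differently when a kernel touches all directions of one cell or one direction of many cells.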

The parallelization and scaling behavior of LBM was examined in a second part of this project. Extensive experiments with both OpenMP and MPI on different contemporary Terascale architectures were carried out and published at, e.g., the Supercomputing Conference 2004 and the Parallel CFD Conference 2005.

Optimization and Application of 3D LBM for complex structures

[Figure: Indirect addressing handling bounce back implicitly]

In a further stage of the project, we investigated optimization possibilities for an LBM code for complex structures. In cooperation with the Lattice Boltzmann Development Consortium, a data representation that stores only fluid cells and omits obstacle data was examined. The results, using memory traversal by space-filling curves, were presented at the ASIM Conference 2005.
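
One way to picture a fluid-only representation with indirect addressing is an adjacency list that records, for every fluid cell and direction, where the propagated value goes. The sketch below is an illustration under our own assumptions, not the consortium's data structure: it uses a D3Q19 velocity set, treats the domain boundary as a wall, and the name `build_adjacency` is hypothetical. Wherever the neighbour is not a fluid cell, the entry points back to the same cell in the opposite direction, so bounce back needs no separate treatment.

```python
from itertools import product

# D3Q19 velocity set: all vectors in {-1,0,1}^3 with at most two non-zero components.
DIRECTIONS = [c for c in product((-1, 0, 1), repeat=3) if sum(map(abs, c)) <= 2]
OPPOSITE = [DIRECTIONS.index((-cx, -cy, -cz)) for cx, cy, cz in DIRECTIONS]


def build_adjacency(is_fluid):
    """Return (cells, adjacency) for a nested list of booleans is_fluid[x][y][z].

    adjacency[n][q] is a pair (target cell, target direction) telling where the
    value of direction q in fluid cell n is written during propagation.
    Assumption: everything outside the domain behaves like an obstacle.
    """
    nx, ny, nz = len(is_fluid), len(is_fluid[0]), len(is_fluid[0][0])
    cells = [
        (x, y, z)
        for x, y, z in product(range(nx), range(ny), range(nz))
        if is_fluid[x][y][z]
    ]
    number = {cell: n for n, cell in enumerate(cells)}  # coordinates -> list index
    adjacency = []
    for n, (x, y, z) in enumerate(cells):
        row = []
        for q, (cx, cy, cz) in enumerate(DIRECTIONS):
            neighbour = (x + cx, y + cy, z + cz)
            if neighbour in number:
                row.append((number[neighbour], q))  # ordinary propagation
            else:
                row.append((n, OPPOSITE[q]))  # implicit bounce back
        adjacency.append(row)
    return cells, adjacency
```

Only the fluid cells appear in `cells`, so memory consumption scales with the fluid volume rather than with the bounding box, at the price of one index lookup per access.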

This part of the project was partially funded by the Bavarian Graduate School for Computational Engineering, which is part of the Elitenetzwerk Bayern.

[Figure: Blocking factor parameter study on the Intel Itanium2 platform]

To account for the increasing complexity of continuous surfaces, the Bachelor Thesis of Johannes Habich implemented more advanced and accurate second-order boundary conditions. The performance impact of the additional calculations and of different fluid-to-obstacle ratios was investigated. This led to the implementation of a compressed list storage format, which was thoroughly tested for performance with different spatial blocking factors. A shared-memory parallelization was added to exploit today's increased medium-grained parallelism.
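
To illustrate what a spatial blocking factor means for a list-based format, the following sketch enumerates the cells of a grid one cubic block at a time; the resulting order could be used when filling the cell list. This is an assumption for illustration only, since the text does not describe how the thesis orders its list, and `blocked_order` is a hypothetical name.

```python
from itertools import product


def blocked_order(nx, ny, nz, block):
    """Yield all (x, y, z) of an nx*ny*nz grid, one cubic block of edge `block` at a time."""
    block_origins = product(range(0, nx, block), range(0, ny, block), range(0, nz, block))
    for bx, by, bz in block_origins:
        # Blocks at the upper domain edge are clipped to the grid size.
        for cell in product(
            range(bx, min(bx + block, nx)),
            range(by, min(by + block, ny)),
            range(bz, min(bz + block, nz)),
        ):
            yield cell
```

With `block` equal to the grid size this degenerates to a plain lexicographic sweep, so a parameter study over `block` covers the unblocked case as well.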

Optimized GPU (Graphics Processing Unit) Implementations of the Lattice Boltzmann Method in 3D

Special-purpose accelerators have been an emerging topic in recent years. To evaluate the effort of implementing numerical kernels and the expected benefit, the Master Thesis of Johannes Habich implemented several benchmarks to gain hands-on knowledge of the initial implementation effort and of optimization techniques on the currently available nVIDIA GeForce G80 GPU. The massive thread-level parallelism leads to a new way of parallel programming, which is supported by the nVIDIA CUDA framework.

[Figure: Buffering propagations before coalesced write back]

The well-known STREAM benchmark was implemented and demonstrated the potential of the memory subsystem. The implementation of a lattice Boltzmann fluid flow solver gave deep insight into the pitfalls of the hardware and led to sophisticated optimization techniques that are generally applicable.

[Figure: Threads treat one cell along the x-axis at constant y,z index]
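
The thread mapping named in the figure caption can be emulated on the host to see why it suits the memory system. The sketch below is plain Python, not CUDA, and rests on our own assumptions: the x index is the fastest-running array index, `thread_x` stands in for the thread index within a block, and `block_y`, `block_z` stand in for the block indices. All function names are hypothetical.

```python
def cell_index(thread_x, block_y, block_z, nx, ny):
    """Linear index of the cell handled by thread `thread_x` of block (block_y, block_z)."""
    return thread_x + nx * (block_y + ny * block_z)


def addresses_of_block(block_y, block_z, nx, ny):
    """Array indices touched by the nx threads of one block, in thread order."""
    return [cell_index(t, block_y, block_z, nx, ny) for t in range(nx)]


def is_contiguous(indices):
    """True if consecutive threads touch consecutive array elements."""
    return all(b - a == 1 for a, b in zip(indices, indices[1:]))
```

With this mapping, neighbouring threads read neighbouring elements of a row, which is one of the preconditions for the hardware to combine their accesses into few memory transactions.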

In cooperation with the Department of Computer Science 10 (Systemsimulation), a new kernel was derived that is better suited for deployment in an MPI-parallelized heterogeneous framework called waLBerla (widely applicable Lattice Boltzmann from Erlangen). An in-depth analysis of the computation and communication patterns led to a very efficient and fast solver, which is now being developed towards different kinds of applications, e.g. particulate flows. A major concern compared with stand-alone solver development is that different communication networks lead to inevitable performance drawbacks. Optimizing these communication stages is the most important part of the performance optimization.

Acknowledgements

This project is partially funded by KONWIHR (Competence Network for Technical, Scientific High Performance Computing in Bavaria).

Through cooperation with the Department of Computer Science 10 (Systemsimulation) and the Chair of Fluid Dynamics, we ensure that the project stays as close as possible to engineering demands. Furthermore, we are working together with Peter Lammers at HLRS and Jörg Bernsdorf of the German Research School for Simulation Sciences GmbH.

This project is partially funded by SKALB (Lattice-Boltzmann-Methoden für skalierbare Multi-Physik-Anwendungen).

Infos & Talks

Papers

  • Gerhard Wellein, Thomas Zeiser, Stefan Donath, Georg Hager
    On the Single Processor Performance of Simple Lattice Boltzmann Kernels

    Computers & Fluids, 35:8-9 (2006) 910-919

  • Thomas Pohl, Nils Thürey, Frank Deserno, Ulrich Rüde, Peter Lammers, Gerhard Wellein, Thomas Zeiser
    Performance Evaluation of Parallel Large-Scale Lattice Boltzmann Applications on Three Supercomputing Architectures
    accepted for Supercomputing Conference, 2004.

  • Peter Lammers, Gerhard Wellein, Thomas Zeiser, Georg Hager, Michael Breuer
    Have the vectors the continuing ability to parry the attack of the killer micros?
    accepted for Proceedings of the 2nd Teraflop-Workshop at HLRS, March 2005.

  • Gerhard Wellein, Thomas Zeiser, Peter Lammers, Uwe Küster
    Towards Optimal Performance for Lattice Boltzmann Applications on Terascale Computers
    accepted for Parallel CFD Conference, 2005.

  • Stefan Donath, Thomas Zeiser, Georg Hager, Johannes Habich, Gerhard Wellein
    Optimizing Performance of the Lattice Boltzmann Method for Complex Structures on Cache-based Architectures
    In Proceedings "Frontiers in Simulation: Simulationstechnique - 18th Symposium in Erlangen, September 2005 (ASIM)" (Editors: F. Hülsemann, M. Kowarschik, U. Rüde), SCS Publishing House, Erlangen, 2005, Pages 728-735.

  • Johannes Habich, Thomas Zeiser, Georg Hager, Gerhard Wellein
    Speeding up a Lattice Boltzmann Kernel on nVIDIA GPUs.
    In Proceedings of the "First International Conference on Parallel, Distributed and Grid Computing for Engineering, April 2009, Pecs, Hungary, PARENG09-S01" (Editors: B.H.V. Topping and P. Ivanyi), Civil-Comp Press, Stirling, 2009.

Technical Reports

  • Thomas Zeiser, Gerhard Wellein, Georg Hager, Stefan Donath, Frank Deserno, Peter Lammers, Monika Wierse
    Optimized Lattice Boltzmann Kernels as Testbeds for Processor Performance

Master Theses

  • Performance Evaluation of Numeric Compute Kernels on nVIDIA GPUs
    Johannes Habich

    supervised by:
    Prof. Ulrich Rüde, Dr. Gerhard Wellein, Dr. Thomas Zeiser, Dr. Georg Hager, Stefan Donath.
    July 2008.

Bachelor Theses

  • On Optimized Implementations of the Lattice Boltzmann Method on Contemporary High Performance Architectures
    Stefan Donath

    supervised by:
    Prof. Ulrich Rüde, Dr. Gerhard Wellein, Thomas Zeiser, Georg Hager, Frank Deserno.
    August 2004.

  • Improving computational efficiency of Lattice Boltzmann methods on complex geometries
    Johannes Habich

    supervised by:
    Prof. Ulrich Rüde, Dr. Gerhard Wellein, Thomas Zeiser, Georg Hager.
    February 2006.

Talks

  • Gerhard Wellein
    Optimization Approaches and Performance Characteristics of Lattice Boltzmann Kernels
    invited talk, International Conference for Mesoscopic Methods in Engineering and Science, Braunschweig, July 28, 2004.

  • Stefan Donath
    On Optimized Implementations of the Lattice Boltzmann Method on Contemporary High Performance Architectures
    SIAM CSE05 Conference, Orlando, February 2005.

  • Gerhard Wellein
    Architecture and Performance of Terascale Computers
    International Conference on Parallel Computational Fluid Dynamics, Maryland, May 24-27, 2005.

  • Stefan Donath
    Optimizing Performance of the Lattice Boltzmann Method for Complex Structures
    ASIM Conference, Erlangen, September 2005.

  • Gerhard Wellein
    Efficient implementations of simple lattice Boltzmann kernels
    Short Course
    International Conference for Mesoscopic Methods in Engineering and Science (ICMMES) 2006, Hampton/Norfolk, July 24, 2006.

Last modified: 16 September 2014