Blue Gene

Parallel Architectures

Published in Pranabananda Chakraborty, Computer Organisation and Architecture, 2020

On March 25, 2005, IBM’s Blue Gene/L prototype, a customized version of IBM’s PowerPC architecture, became the fastest supercomputer in a single installation, using its 131,072 processors to run at 280.6 TFLOPS (1 TFLOPS = 10^12 FLOPS). On October 28, 2005, the machine reached 280.6 TFLOPS, but the system is expected to achieve at least 360 TFLOPS, and a future update is targeted at a peak performance in the petaflop range (1 PFLOPS = 10^15 FLOPS). In November 2005, IBM Blue Gene/L took the number one position on the TOP500 list of the most powerful supercomputers. The IBM Blue Gene series is at present the fastest range of supercomputer systems in the world.

Aggregation of clans to speed-up solving linear systems on parallel architectures

Published in International Journal of Parallel, Emergent and Distributed Systems, 2022

Dmitry A. Zaitsev, Tatiana R. Shmeleva, Piotr Luszczek

Hypertorus communicating structures are widely used as the communication facilities of supercomputers and clusters, since a multi-dimensional torus (hypertorus) has ideal properties with respect to the minimal distance between two chosen nodes. For instance, the communication system of the IBM Blue Gene supercomputer was implemented as a three-dimensional torus and its on-chip communication as a five-dimensional torus [37]; the most recent supercomputer, Fugaku [38], ranked number 1 on the TOP500 list, uses the Tofu Interconnect D, whose topology can be characterised as a six-dimensional torus. Models of hypertorus communication structures in the form of Petri nets and process algebra have also been studied [34]. An example of a two-dimensional torus is shown in Figure 8. A node model specifies a packet switching device with four ports, one on each side of a square; the nodes are connected by merging their contact places. The torus topology requires connecting the opposing borders of the resulting lattice. In the d-dimensional case, a node is specified by a hypercube. We use the von Neumann neighbourhood, in which neighbouring nodes are connected by facets, considered as (d-1)-dimensional hypercubes, though a generalised neighbourhood [39] can be applied as well. Here, we consider a four-dimensional torus of size 3 and present the source decomposition graph in Figure 9 and its aggregation into seven clans in Figure 10, as generated by METIS [5]. Note that in this case the decomposition graph (Figure 9) is not very informative, and for large real-life models with thousands of Petri net vertices a graphical representation does not help much. That is why, in the next section, we use textual forms to specify the obtained decomposition briefly.
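The torus topology described above can be illustrated with a minimal Python sketch. It is not the authors' Petri net model: it only enumerates the nodes of a d-dimensional torus, lists the von Neumann neighbours of a node with wrap-around at the borders, and computes the minimal hop distance between two nodes. The function names `torus_nodes`, `torus_neighbours` and `torus_distance` are hypothetical and chosen for illustration only.

```python
from itertools import product


def torus_nodes(dims, size):
    """All nodes of a torus with `dims` dimensions and `size` nodes per dimension."""
    return list(product(range(size), repeat=dims))


def torus_neighbours(node, size):
    """Von Neumann neighbours of `node`: one step along a single axis, with wrap-around."""
    neighbours = []
    for axis in range(len(node)):
        for step in (-1, 1):
            coords = list(node)
            # The modulo connects the opposing borders of the lattice.
            coords[axis] = (coords[axis] + step) % size
            candidate = tuple(coords)
            # For very small sizes both steps can reach the same node; keep it once.
            if candidate != node and candidate not in neighbours:
                neighbours.append(candidate)
    return neighbours


def torus_distance(a, b, size):
    """Minimal number of hops between nodes `a` and `b` in the torus."""
    return sum(min(abs(x - y), size - abs(x - y)) for x, y in zip(a, b))


if __name__ == "__main__":
    # Four-dimensional torus of size 3, as considered in the text.
    nodes = torus_nodes(4, 3)
    origin = (0, 0, 0, 0)
    print(len(nodes))                                        # 81 nodes
    print(len(torus_neighbours(origin, 3)))                  # 8 neighbours
    print(max(torus_distance(origin, n, 3) for n in nodes))  # diameter 4
```

For the four-dimensional torus of size 3, the sketch gives 3^4 = 81 nodes, each with 2 × 4 = 8 neighbours, and a maximal distance of 4 hops, which shows how the wrap-around connections keep distances short.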
