New UV300 big-memory systems for TGAC

We (CiS) have recently completed the procurement of two new UV300 systems for The Genome Analysis Centre (TGAC). Each system comprises 256 CPU cores (16x E7-8867V3 16-core Haswell), 12TB RAM, 16x 2TB Intel NVMe FLASH cards and a Fibre Channel-connected 100TB InfiniteStorage IS5100 disk array.

That’s a combined capacity of 512 cores, 24TB RAM, 64TB of NVMe FLASH and 200TB scratch disk. Right now, this is the largest system of its kind in the world, and we think it is a winning configuration.

These are now the 4th and 5th UV systems that we’ve recommended and procured for TGAC, since the first one in late 2010, and the need/desire for this type of system continues unabated. We’re excited to have these in our data centre, and we are looking forward to seeing the impact they can make on the various genomic workflows within the organisation.

Although our previous UVs were bought primarily for their large memory footprint, they also run mixed workloads that take advantage of the high core count, and are all configured at 8GB RAM per core. The new UV300 is different: it is memory-heavy and core-light, with only 256 cores to service 12TB of RAM, which works out at a whopping 48GB per core! We expect this to be a much better-performing ratio for large-memory jobs, especially DNA assembly (De Bruijn graph).
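The memory-per-core figures can be checked with a few lines of Python. This is only an illustrative sketch: the core counts and RAM sizes are the ones quoted in this post, the dictionary and function names are made up for the example, and 1TB is assumed to mean 1024GB.

```python
# Core counts and RAM sizes as quoted in this post.
# Assumption: 1 TB of RAM is treated as 1024 GB.
GB_PER_TB = 1024

# name -> (CPU cores, RAM in TB); the names are labels for this example only.
systems = {
    "UV100": (768, 6),
    "UV2000": (2560, 20),
    "UV300": (256, 12),
}


def gb_per_core(cores, ram_tb):
    """Return the amount of RAM, in GB, available per CPU core."""
    return ram_tb * GB_PER_TB / cores


ratios = {name: gb_per_core(cores, ram_tb) for name, (cores, ram_tb) in systems.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.0f} GB per core")
```

The older systems come out at 8GB per core and the UV300 at 48GB per core, a six-fold increase in memory available to each core.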

The 16x 2TB NVMe FLASH cards in each system are bonded together as a single RAID0 array, with an XVM/XFS layer that provides a single filesystem with parallel IO at more than 1.6GB/s per card, allowing equality of access from all parts of the system. Right now, this is the largest implementation of its kind, but as soon as others catch on to the idea I am sure it will be overtaken. We’ve bought these cards for a number of reasons, and will publicise more about that in due course.
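As a rough back-of-envelope sketch, the totals for the striped array follow from the per-card figures. The calculation below assumes ideal linear scaling across the RAID0 stripe, so the bandwidth is a theoretical figure derived from the quoted 1.6GB/s per card, not a measured result; the constant and function names are illustrative only.

```python
# Per-card figures as quoted in this post.
CARDS_PER_SYSTEM = 16
CARD_CAPACITY_TB = 2
CARD_BANDWIDTH_GB_S = 1.6  # quoted as a minimum ("1.6 GB/s or more") per card
NUM_SYSTEMS = 2


def raid0_totals(cards, capacity_tb, bandwidth_gb_s):
    """Return (capacity in TB, ideal bandwidth in GB/s) for a RAID0 stripe.

    Assumption: capacity and bandwidth both scale linearly with the number
    of cards, which ignores filesystem and interconnect overheads.
    """
    return cards * capacity_tb, cards * bandwidth_gb_s


capacity_tb, bandwidth_gb_s = raid0_totals(
    CARDS_PER_SYSTEM, CARD_CAPACITY_TB, CARD_BANDWIDTH_GB_S
)

print(f"Per system: {capacity_tb} TB, ideal aggregate {bandwidth_gb_s:.1f} GB/s")
print(f"Both systems: {capacity_tb * NUM_SYSTEMS} TB of NVMe FLASH")
```

Under these assumptions each system has 32TB of FLASH (64TB across both) and an ideal aggregate of roughly 25.6GB/s; real throughput will depend on the workload and the rest of the IO path.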

Please see the TGAC press release here: http://www.tgac.ac.uk/news/280/15/TGAC-installs-largest-SGI-UV-300-supercomputer-for-life-sciences-worldwide-to-solve-the-most-demanding-data-intensive-problems/

Why choose UV?

Back in 2010, we knew that TGAC wanted to perform de-novo assembly of bread-wheat DNA, which is notoriously large and complex (5.5 times bigger than human DNA), but thus far this had not been achieved anywhere on standard hardware. We knew it would need a lot of memory in a single space, but no-one knew exactly how much, since earlier attempts at other institutions had crashed on systems with as much as 400GB of memory. We guesstimated that we might need a bare minimum of 2TB RAM, and could easily swallow up to 6TB as the workflow was developed, so we had to find a machine that was capable of this. Bearing in mind that our customer (TGAC) would be using existing DNA assembly tools, and that we would be stretching their use significantly beyond their originally intended application, we needed a platform that would appear like any other simple server, presenting a single system image and running standard Linux.

After a procurement competition, we reviewed the offerings and it was apparent that the only technology proven to work at this scale was the UV platform, and it had the backing of SGI’s long-standing reputation as a partner in HPC. So, on TGAC’s behalf, we bought our first UV100 with 768 cores and 6TB RAM back in 2010. Since then we’ve also procured a second UV100, followed by a UV2000 (2560c/20TB), and now we are adding the two UV300s (2x 256c/12TB) to the estate.

Our approach to implementing UV for TGAC has been vindicated repeatedly, with the sterling efforts of researchers, bioinformaticians and talented programmers in TGAC producing the first assembled drafts of the (bread) wheat genome sequence, as a key part of the International Wheat Genome Sequencing Consortium. Without SGI UV large shared memory this would not have been possible, and we are proud to have been able to make a small contribution to their success.

Paul Fretter