Radio Free HPC Podcast: Nvidia


Monday, June 22, 2020

New Top on the TOP500 – 415 PF!

Breaking News Edition

We have a new #1 on the TOP500 list of most powerful supercomputers! Big gets bigger by a factor of 2.8x as Fujitsu’s “Supercomputer Fugaku” tops the list at 415 PFlops. There are also an additional three new entries in the top ten. We break down the top of the list in this fascinating episode of RadioFreeHPC.

Listen to us now! It will help you to amaze your friends and dismay your enemies with your newfound knowledge of the list. We have it here and first! Or at least not much later than others!

Join us!

* Download the MP3
* Sign up for the insideHPC Newsletter
* Follow us on Twitter
* Subscribe on Spotify
* Subscribe on Google Play
* Subscribe on iTunes
* RSS Feed
* eMail us

Monday, June 1, 2020

A is for Ampere, Nvidia A100's Public Debut

NOTE: The publication of this episode was delayed due to the untimely passing of our partner and pal Rich Brueckner. So what we’re announcing as "breaking news" isn’t so fresh today, but our takes on what NVIDIA’s new A100 processor brings to the table are still valid.

Breaking News! This special edition of RadioFreeHPC takes a deep dive into NVIDIA’s spanking new A100 GPU – which is an impressive achievement in processor-dom. The new chip is built with a 7nm process, weighs in at a hefty 54 billion transistors, and is capped at 400 watts. It sports 6,912 FP32 CUDA cores, 3,456 FP64 CUDA cores and 432 Tensor cores.
This 8th-generation GPU, using what the company calls its Ampere technology, is a replacement for both their V100 GPU and Turing T4 processors, giving the company a single platform for both AI training and inferencing.
We talk about the specs of the A100, breaking down its game both in terms of typical HPC FP64 processing and FP32 (and lower precision) computing for AI workloads. On the HPC side, the new GPU seems to offer an across-the-board 25% speedup, which is substantial. But the A100 really shines when it comes to Tensor core performance, where the company reports an average speedup of 10x for 32-bit Tensor core math vs. V100 FP32.
New features of the A100 include Sparsity (a mechanism that doubles sparse matrix performance), a much speedier NVLink (2x), and a hardware feature that allows the A100 to be partitioned into as many as 7 GPU instances to support individual workloads.
All in all, this is an amazing new processor: a large and hot, but very fast, behemoth of a chip that is heavily tilted towards new AI and Tensor workloads, with a passing but welcome nod to 64-bit HPC apps.
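
For the curious, here is a rough sketch of the kind of pruning the Sparsity feature relies on. It assumes the "2 out of 4" pattern NVIDIA describes for Ampere, where every group of four weights keeps only its two largest-magnitude values; the notes above don't spell that pattern out, and the function name `prune_2_of_4` and the sample weights are made up for illustration. This is plain Python on a list, not NVIDIA's tooling.

```python
def prune_2_of_4(weights):
    """Zero the two smallest-magnitude values in each group of four.

    Illustrative only: assumes the 2:4 structured-sparsity pattern.
    """
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


# Hypothetical weights, two groups of four.
row = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.1]
sparse_row = prune_2_of_4(row)
print(sparse_row)
```

Half of the entries end up as zeros in a predictable layout, which is what lets hardware skip them instead of multiplying by zero.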

Friday, May 8, 2020

ColdQuanta Serves Up Some Bose-Einstein Condensate

The show starts with Dan, Jessi and Shahin in attendance. Henry is traveling from his old home base in Minnesota to his new command bunker in lovely Las Cruces, NM. Last we heard he was in Kansas City and making good time. We’re not sure how long we’re going to have to do without him, as Comcast seems to be slow-playing him on his internet installation timeline.

Why Freeze the Whole Room If you Just want a Frozen Atom?

Our big topic today is the quantum computing company ColdQuanta. It’s headed by an old pal of ours, Bo Ewald, and has just come out of stealth mode into the glaring spotlight of RadioFreeHPC. They have a unique approach to quantum computing: trapping atoms themselves to create Bose-Einstein Condensate. This is a fifth state of matter, which matters quite a bit. When you cool a low-density gas of bosons to near absolute zero, you start to get macroscopic access to microscopic quantum mechanical effects, which is a pretty big deal. Once the quantum mechanical effects start, you can control them, change them, and get computations out of them. The secret sauce for ColdQuanta is served cold, all the way down into the micro-kelvins, and the cooling is kept very local, which makes it easier to get your condensate.

The company is focused on measurement and sensing but also mentions straight computation, which is where most of the other quantum competitors are focused. They were the first company to put their quantum computer in space and the first to create Bose-Einstein Condensate while in orbit at the International Space Station.

Catch of the Week

Jessi: Want to chill out and help NASA at the same time? Jessi has found a way with NeMO-Net, a game where users cruise through an animated ocean floor and classify coral structures. Your answers are then fed into NASA’s Pleiades supercomputer, which uses the data as fodder to improve its own identification prowess. It’s a great way to while away the hours during these COVID-19 shutdowns, right?

Shahin: Shahin has two catches. The first is a celebration of IBM’s quant-iversary, marking the fourth anniversary of them having a quantum computer on the web – many happy returns to Big Blue. They’re also sponsoring a contest; see the web link for details.

In his second catch, Shahin shamelessly promotes his recent talk at the HPC AI Advisory Council virtual Stanford conference. He did a great job of covering just about every buzzword topic in the industry in only 30 minutes. Well done.

Dan: Dano likes fast things and seeing fast things get even faster. This is what attracted him to the story about ISV Risk Fuel and Microsoft’s Azure posting an article boasting a 20 million x speedup of derivative processing. A 20 million times speedup of anything is pretty significant and they achieve this with a combination of 8 NVIDIA V100 GPUs (w/32GB memory each), InfiniBand and Risk Fuel’s amazing software. What’s great about this is that with this speed the model has complete fidelity with traditional calculations. In other words, you can speed all you like without any downside when it comes to accuracy – amazing stuff.

Sunday, August 11, 2019

Who will benefit from Intel dropping Omni-Path?

Spoofing the Spoofers

Henry has a brilliant idea to weaponize his password generator against phishing attacks.

Intel Drops Omni-Path

Henry and Shahin take a close look at the history of High Performance Interconnects, recent news, and how the market is changing profoundly. The departure of Intel from this segment is good news for some, and it remains to be seen what strategy Intel will adopt for the HPC market.

Catch of the Week

Henry:

Henry brings up one of his favorite topics (going all the way back to our very first episode): the dreaded Silent Data Corruption, this time as part of the testing that the 737 MAX is undergoing. As he's wont to do, Shahin puts this in the context of our collective transition from the Industrial Age to the Information Age. He thinks the series of issues with the plane proves just how difficult it is for manufacturers to go more and more digital.

Another rewrite for 737 Max software as cosmic bit-flipping tests glitch out systems – report

Testing focused on flipping five bits, said to control some of the most crucial parameters: positioning of flight controls and activation state of flight control systems, such as the infamous MCAS anti-stall system.
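
To see why a single flipped bit is worth losing sleep over, here is a toy sketch in Python. It has nothing to do with how the actual 737 MAX tests were run; it only shows what one bit flip does to an ordinary 64-bit floating-point number. The helper name `flip_bit` and the sample value are made up for illustration.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return value with one bit of its IEEE 754 double encoding flipped."""
    if not 0 <= bit < 64:
        raise ValueError("bit must be in the range 0..63")
    # Reinterpret the double as a 64-bit unsigned integer, flip, convert back.
    (as_int,) = struct.unpack("<Q", struct.pack("<d", value))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", as_int ^ (1 << bit)))
    return flipped


setting = 1.0                      # hypothetical stored parameter
low_flip = flip_bit(setting, 0)    # lowest mantissa bit: barely changes
exp_flip = flip_bit(setting, 52)   # lowest exponent bit: value is halved
sign_flip = flip_bit(setting, 63)  # sign bit: value changes sign

print(low_flip, exp_flip, sign_flip)
```

Depending on which bit gets hit, the result ranges from a rounding-error nudge to a value with the wrong sign, and nothing in the number itself tells you it happened. That is the "silent" part.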

Shahin:

Shahin thinks the mention of building an AI supercomputer by Microsoft is intriguing. They already offer Cray capability in Azure and inquiring minds want to know more.

Microsoft to invest $1 billion in OpenAI, will jointly develop new supercomputer technologies

Microsoft and OpenAI also plan to work together on new AI supercomputing technologies to solve the world’s hardest problems. “The companies will focus on building a computational platform in Azure of unprecedented scale, which will train and run increasingly advanced AI models, include hardware technologies that build on Microsoft’s supercomputing technology, and adhere to the two companies’ shared principles on ethics and trust..."

Listen in to hear the full conversation.

Download the MP3 * Subscribe on iTunes * RSS Feed

Sign up for our insideHPC Newsletter

Monday, June 17, 2019

TOP500 Jun2019, Facebook Coin

The new TOP500 list of most powerful supercomputers is out and we do our usual quick analysis. Not much changed in the TOP10 but a lot is changing further down the list. Here is a quick take:

* There are 65 new entries in 2019.
* US science is receiving support via DOE sites and academic sites like TACC.
* 26 countries are represented. China continues to widen its lead, now with 219 entries, followed by the US with 116, Japan with 29, France with 19, the UK with 18, Germany with 14, Ireland and the Netherlands with 13 each, and Singapore with 10.
* Vendors substantially reflect the country standings. Lenovo has 175 entries, Inspur 71, and Sugon 63, all in China. Cray has 42 and HPE has 40 (which will combine when their deal closes), followed by Dell at 17 and IBM at 16. Bull has 21 entries.
* There are a lot of "accidental supercomputers" on the list. These are systems that probably are not doing much science or AI work, but they could, and the vendors counted them, and it seems to be within the rules to list them. It's controversial but not a new practice.
* There are several systems listed as "Internet" companies. Hard to tell what that means, but it points to the existence of very large clusters in the cloud for whatever purpose. Last year, there was one system listed as Amazon EC2, which remains on the list. This time, there is also one at Facebook. Usually the big social/cloud players don't care to participate, though they obviously could summon the resources to run the benchmarks.
* Just over half of the systems use Ethernet as a fabric. A quarter use InfiniBand, nearly 50 use Intel's OmniPath, and the rest, 55, use custom interconnects like the ones Cray provides. The team talks about Cray+HPE entering the interconnect business for real; if they do, they will be formidable.
* The majority of entries, 367, do not have any accelerators. 125 use Nvidia GPUs.
* The overwhelming majority of the systems, 478 of them, are based on Intel CPUs. 13 are IBM, and there is 1 system based on Arm provided by Cavium, now part of Marvell.
* So when it comes to chips, it's an Intel game with a respectable showing by Nvidia when GPUs are used. Alternatives are bound to appear as the tens and tens of AI chips in the works become available and as Arm, AMD, and IBM build on their positions. The recently announced system at Oak Ridge will be all AMD, and that will point to an alternative as well.
* Notably, Intel is listed as the vendor for 2 entries and Nvidia is listed for 4. While Intel has stayed largely away from looking like a system vendor, Nvidia is going for it with its usual alacrity. That, and the pending acquisition of Mellanox by Nvidia, should serve as a warning to all system vendors who might feel stuck between treating Nvidia as an important supplier and as an up-and-coming competitor.
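
If you like your counts as percentages, the numbers in the quick take above convert easily. Here is a small Python sketch that uses only the counts quoted in this post; the dictionary and function names are made up for illustration, and 500 is simply the size of the list.

```python
TOTAL_SYSTEMS = 500  # the TOP500 list has 500 entries

# Counts as quoted in the quick take above.
country_counts = {"China": 219, "US": 116, "Japan": 29, "France": 19, "UK": 18, "Germany": 14}
accelerator_counts = {"no accelerator": 367, "Nvidia GPUs": 125}
cpu_counts = {"Intel": 478, "IBM": 13, "Arm (Cavium)": 1}


def shares(counts, total=TOTAL_SYSTEMS):
    """Convert entry counts to percentages of the full list, one decimal place."""
    return {name: round(100.0 * n / total, 1) for name, n in counts.items()}


country_share = shares(country_counts)
accelerator_share = shares(accelerator_counts)
cpu_share = shares(cpu_counts)

for label, table in [("Country", country_share),
                     ("Accelerator", accelerator_share),
                     ("CPU", cpu_share)]:
    for name, pct in table.items():
        print(f"{label:12s} {name:16s} {pct:5.1f}%")
```

Run it and the headline numbers pop out: China holds a bit under 44% of the entries, roughly a quarter of systems carry Nvidia GPUs, and Intel CPUs sit in more than 95% of the list.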

CryptoSuper500

Shahin mentions the 2nd edition of the CryptoSuper500 list (really 50 for now), a list developed by his colleague Dr. Stephen Perrenod, which was launched last November and is being released at the same time as the TOP500. The TOP500 has spawned variations that look at different workloads and attributes, for example, the Green500, Graph500, and IO500 lists. CryptoSuper500 was inspired by those lists. The material for the inaugural edition of the CryptoSuper500 list is here.
Cryptocurrency mining operations are often pooled and are very much supercomputing class, typically using accelerator technologies such as custom ASICs, FPGAs, or GPUs. Bitcoin is the most notable of such currencies. Scroll down for the top-10 list and see the slides for the full list and the methodology.

Catch of the Week

Henry:

Henry talks about the check-out lanes at Target all being down for unknown reasons, though he hesitates to call that a cybersecurity breach. It turned out he was right: the company blamed an "internal technology issue".

Target down (then back up) as cash registers fail and leave long lines

Target's payment systems appeared to be missing the mark the day before Father's Day, as terminals went AWOL for a couple of hours in a number of the company's US retail outlets. The outage caused long lines but prompted an encouraging show of sympathy for Target employees from people on Twitter. And there were some jokes too, of course.

Shahin:

Facebook is expected to release a new cryptocurrency that is already impacting the crypto market.

Here’s what we know so far about the secretive Facebook coin

Facebook is likely to release information about its secretive cryptocurrency project, codenamed Libra, as soon as June 18, TechCrunch reports.

As is traditional with new cryptocurrencies, the social networking giant is expected to release a so-called “white paper” outlining how the currency works and the company’s plans for it.

Dan:

Dan reminds us all of the inimitable Erich Anton Paul von Däniken and his ancient astronauts hypotheses!

Listen in to hear the full conversation.


Thursday, April 11, 2019

Enterprises going HPC, Chips go Open Source, China goes for the top spot

We continue to want to make these introductions pretty brief here but not this time, apparently! Here's this week's synopsis.

Nvidia GTC 2019 announcements

We discussed the recent GTC conference. Dan has been attending since well before it became the big and important conference that it is today. We get a quick update on what was covered: the long keynote, automotive and robotics, the Mellanox acquisition, and how a growing fraction of enterprise applications will be AI. In agreement with the message from GTC, Shahin reiterates his long-held belief that the future of enterprise applications will be HPC and once again asserts that AI as we know it today is a subset of HPC. Not everyone agrees. Henry brings up varying precisions in AI, and a discussion ensues about what HPC is. There seems to be agreement that regardless of what label you put on it, it is the same (HPC) industry and community that is driving this new trend. That led to a discussion of selling into the enterprise and the need for new models, vocabulary and such. Speaking of varying precision, there is also Nvidia's new automatic mixed precision capability for TensorFlow, and there is a bit of discussion on that.

China plans multibillion dollar investment in supercomputing

On the heels of the Aurora announcement, there was news in the South China Morning Post that the top spot in supercomputing is something the country is investing in. No surprise, but interesting to see, and consistent with the general view that supercomputing drives competitive strength.

Catch of the Week

Henry:

Facebook Stored Hundreds of Millions of User Passwords in Plain Text for Years
Hundreds of millions of Facebook users had their account passwords stored in plain text and searchable by thousands of Facebook employees — in some cases going back to 2012, KrebsOnSecurity has learned. Facebook says an ongoing investigation has so far found no indication that employees have abused access to this data.

Shahin:

MIPS R6 Architecture Now Available for Open Use
MIPS 32-bit and 64-bit architecture – the most recent version, release 6 – will become available Thursday (March 28) for anyone to download at MIPS Open web page. Under the MIPS Open program, participants have full access to the MIPS R6 architecture free of charge – with no licensing or royalty fees.

Dan:

Vengeful sacked IT bod destroyed ex-employer's AWS cloud accounts. Now he'll spend rest of 2019 in the clink
An irate sacked techie who rampaged through his former employer's AWS accounts with a purloined login, nuking 23 servers and triggering a wave of redundancies, has been jailed.

Dead LAN's hand: IT staff 'locked out' of data center's core switch after the only bloke who could log into it dies
'We can replace it but we have no idea what the config is on the device'

Listen in to hear the full conversation.
