Showing posts with label AI. Show all posts

Monday, June 1, 2020

A is for Ampere, Nvidia A100's Public Debut

NOTE:  The publication of this episode was delayed due to the untimely passing of our partner and pal Rich Brueckner. So what we’re announcing as "breaking news" isn’t so fresh today, but our takes on what NVIDIA’s new A100 processor brings to the table are still valid. 


Breaking News! This special edition of RadioFreeHPC takes a deep dive into NVIDIA’s spanking new A100 GPU – which is an impressive achievement in processor-dom. The new chip is built with a 7nm process, weighs in at a hefty 54 billion transistors, and is capped at 400 Watts. It sports 6,912 FP32 CUDA cores, 3,456 FP64 CUDA cores and 432 Tensor cores.
This 8th generation GPU, using what the company calls its Ampere technology, is a replacement for both their V100 GPU and Turing T4 processors, giving the company a single platform for both AI training and inferencing.
We talk about the specs of the A100, breaking down its game both in terms of typical HPC FP64 processing and FP32 (and lower precision) computing for AI workloads. On the HPC side, the new GPU seems to offer an across-the-board 25% speedup, which is substantial. But the A100 really shines when it comes to tensor core performance, which the company reports at an average speedup of 10x for 32-bit Tensor Core (TF32) math vs. V100 FP32.
New features of the A100 include Sparsity (a mechanism that doubles sparse matrix performance), a much speedier NVLink (2x), and a hardware feature that allows the A100 to be partitioned into as many as 7 GPU instances to support individual workloads.
All in all, this is an amazing new processor: a behemoth of a chip, large and hot but so fast, that is heavily tilted towards new AI and Tensor workloads with a passing but welcome nod to 64-bit HPC apps.

Join us!

* Download the MP3
* Sign up for the insideHPC Newsletter
* Follow us on Twitter
* Subscribe on Spotify
* Subscribe on Google Play
* Subscribe on iTunes
* RSS Feed
* eMail us

Wednesday, April 15, 2020

AI in Science. When is it real?

We fire up the show with introductions and a little snippiness on Dan’s part. Henry reports that the weather in Minnesota is nearly human.

AI in Science

Jumping into our main topic, Shahin introduces an article from HPCwire interviewing Argonne’s Associate Laboratory Director Rick Stevens about how the DOE will be using AI in science. This is one of the biggest potential changes in our industry and well worth the investigation. But figuring out where AI fits into the traditional world of research and simulation is a difficult problem. Henry points out that nearly every grant proposal needs to include “AI” in order to get serious consideration.

We discuss Dan's Great HPC Road Trip of national labs in 2018 and how nearly every lab is looking at using AI to inform their simulations and cut down on the brute force computing they’re doing now. Dan’s national lab interviews are here: Idaho National Lab, NCAR, NREL, Los Alamos, Sandia, NERSC, Lawrence Livermore

There’s also a slight tangent where Dan talks about driving hundreds of miles out of his way to mess with Henry’s Las Cruces lot and future home. This resulted in an epic short film, “The Haunting of Henry House,” which is stuck in bureaucratic approval cycles according to Henry.

RFHPC Hall of Fame?

We also discuss the possibility of founding a Radio Free HPC Hall of Fame, but discard the idea when we realize that no one would want to be in it.

COVID-19

As the conversation continues, Dan brings up an article that discusses how COVID-19 might affect processor foundry revenues and demand. We are, as a group, underwhelmed by the analysis. Henry notes that he has seen a significant increase in the price of laptops while shopping for a graduation gift for his nephew: around 20% since February.

Reasons Why No One Should Ever Be Online. Ever.

Hackers have stolen and ransomed AMD’s GPU test files, a dastardly act, but not surprising to see. They’re looking for $100 million to give the files back, while AMD has downplayed their importance and value.

Catch of the Week

Henry:  Another empty net week for our pal Henry
Shahin: How is the internet coping with all of the extra traffic caused by Covid19 isolation?
Jessi:   For the first time in recorded history, Jessi’s net is empty….sad.

SuperCatch

Dan:  has a SuperCatch! He does a promo of the inaugural episode of a new RadioFreeHPC segment. Suffice to say that RadioFreeHPC Studios has a brand new production of “Charles Babbage, His Life & Times,” a gripping radio drama that will engage your emotions from A-B.

Listen in to hear the full conversation


Tuesday, April 7, 2020

Supercomputers Battle Corona

The conversation today begins with discussing how long it will take until Henry moves into his rammed earth Las Cruces bunker. For those of you keeping track at home, the correct answer was 41 days at the time of taping, which, by the time the podcast is out, will probably be "a couple of weeks ago".

We quickly move on to discussing the Corona Virus, which started sharing the headlines with its handiwork COVID-19. What else is anyone talking about these days, right? We discuss how the supercomputing community has joined the fight and the impact on the battle against the virus.
We do our best to keep the conversation light, knowing that everyone out there is suffering from the virus – it’s the one thing we all have in common these days. We hope you enjoy the episode.

Catch of the Week:

Henry:  Hackers target medical field during Covid19 crisis, one of the crappiest things we’ve heard in a long time.
Shahin:  Tells us about a great paper title, “Software Defined Microarchitecture An Arguably Terrible Idea, but Certainly not the Worst Idea” as found on InsideHPC.
Jessi:  Discusses how Globus is offering free access for anyone working on the Covid19 Virus. Great job, Globus, way to pitch in.
Dan:  Shares his latest addiction, the ancient Asian game of Go. Here is an intro to the game and some puzzles to work on, get crackin’.

Listen in to hear the full conversation


Monday, March 9, 2020

Let's Learn Deeply about Extreme Weather

This week we have Dan, Jessi, and Shahin on the call. Henry is off in Las Cruces overseeing construction of what can only be called a bunker. Why? Its main feature is 21-inch rammed earth walls, guaranteed to withstand withering heat waves, cold snaps, and probably any high caliber round. We speculate on the exact configuration of the home, wondering if Henry is running wild with the rammed earth and concrete theme, with concrete chairs and tables, plus rammed earth interior walls.

Applying Deep Learning to Extreme Weather

Dan deftly moves us on to the main topic of this show, how researchers are using supercomputers to apply deep learning to extreme weather. A research team from Rice University utilized three supercomputers (TACC’s Stampede2 and Wrangler, and Pittsburgh Supercomputing Center’s Bridges system) to see if heat waves and cold spells could be predicted by analysis of atmospheric circulation and prior surface temperature. The results of these tests indicated that this deep learning approach is more accurate at predicting extreme weather.

In the call, we discuss the computational difficulty of weather forecasting and the use case that the Rice researchers are testing. This promising research can pay great dividends by giving early warning of hazardous weather, saving crops and perhaps saving lives in the process. As promised in the podcast, here’s a link to the paper. We also have a short discussion of what motivates Dan to read a particular paper and what turns him off. Jessi’s main standard for a paper is that it has to remain legible and understandable when printed in black and white. So if you want to attract Jessi’s attention for your paper, make sure your charts don’t rely on color.

Things You Think You Know, But Maybe Don't.

The question this week is why Cray computers were horseshoe shaped. One of the reasons was wire length: this shape puts the components close together to reduce the length of the wires needed to connect them. It also gave enough room for a person to get their hands inside to weave the wires. So the key was minimal, uniform, and accessible wire length. There are also a couple of other explanations. One is that it gave room for the liquid cooling pipes necessary to cool the box; another is that the system forms a capital “C” shape, which stands for, of course, Cray.

Catch of the Week



Henry: is away this week. (We know some of you don't read this all and come straight here!)

Jessi:  Tells us that the US might want to take a close look at Estonia as a model to overcome cybersecurity. The country has put together a civilian cybersecurity force and instituted mandatory cyber classes in schools. This is a response to massive cyberattacks launched against Estonia in 2007 that took down much of their digital infrastructure for weeks.

Shahin:  Discusses how Justine Haupt came up with a way to keep her cell phone from distracting her – she built a rotary dial interface for it. Along with helping save her from using the most time-wasting features on her phone, it will also confound an entire generation of folks who have never seen a rotary phone dialer. Justine also is working in robotics and has a page of her inventions and thoughts.

Dan:  Brings up a story about a man convicted of murder mainly on the basis of DNA evidence, although that evidence was shaky, mainly saying that they couldn’t exclude him. His case was reopened by the Innocence Project, who reached out to a company called Cybergenetics for further analysis. Cybergenetics ran samples through their 170,000 line AI algorithm and found that there was zero chance that the convicted man’s DNA was present in the sample. So the man will be released, which is great. The problem is that the Cybergenetics code is a black box and the company, citing competitive advantage, will not release the code. How should we deal with situations like this in the future?

Listen in to hear the full conversation


Tuesday, January 28, 2020

ZFS, AI for System Design, Power in GCE

Surprise! It's Snowing in Minnesota

The show starts on a combative note with Henry refusing to discuss how much snow is arrayed around his house. Dan shares his dream of running a snowblower and Henry offers up his house but doesn’t offer airfare, which, presumably, would be a deal breaker for the ever-cheap Dan Olds.

ZFS

With no big news in the industry this week, it’s a grab-bag show covering various topics. Shahin is up first with his discussion of Linus Torvalds dissing the ZFS file system. Henry weighs in on the evolution of ZFS and how his opinion of ZFS has changed over the last decade or so. Both Shahin and Henry feel ZFS is unique and highly useful and that maybe Linus isn’t up on current ZFS capabilities. Dan brings up the licensing issue with ZFS, in the context of Oracle typically acting like a rabid dog in defense of their intellectual property. In further conversation, Shahin makes the brilliant point that “Data is Data” to the confusion and delight of the others.

AI to Help Design Systems

Dan brings up the topic of machine learning being used for computer architecture design. Shahin is a bit skeptical and has several questions. Henry chips in with some comments about how this will probably aid app-specific hardware design. Dan then relates this article to another story about how MIT is using machine learning to predict how code will perform on a processor. Shahin says he's dubious about many of today’s proposed use cases for AI. After some coaching from Dan, Shahin is moved to a neutral position, maybe.
As a tangent, we discuss benchmarking and speculating with SPECint and SPECfp to figure out competitive performance.

More Power to GCE

Shahin then brings up a story about Google bringing IBM’s Power systems into their cloud, which leads to a brief discussion of why they’re doing it and what types of applications will be supported.

Why Nobody Should Ever be Online. Ever.

Henry Newman’s Reasons Why No One Should Ever Be Online. Ever:  in this week’s installment, Henry discusses how an online organization was hosting 56 million records of US citizens, including names, addresses, etc., in the open. Ouch.

Catch of the Week



Jessi:  her net is empty and there’s nothing on the hook. It’s her first week back in school, so we can cut her some slack this time. We do make the announcement that Jessi is now part of the RadioFreeHPC team as a co-host, which is pretty cool. We also discuss that one requirement for the position is that we get to monitor her transcripts, starting in high school. We’ll analyze major trends and developments in a comprehensive spreadsheet that will be posted online at some point in the distant future. Dan demurs when asked to show his transcripts.

Shahin:  Discusses LEO Labs, a company that tracks items in space and evaluates the probability of collisions. The company analyzes as many as 800,000 potential collision scenarios per day – wow – that’s a lot of number crunching. Shahin explains how they do this and the results.

Henry:  Not only has nothing in the boat, he didn’t even get a nibble this week.

Dan:  Eulogizes the late, great, Mira supercomputer. After eight long years, Mira will be laid to rest later this year. Mira is one of the last IBM Blue Gene/Q systems and once held the third spot on the TOP500 list. It was the go-to system for ‘one in a billion’ simulations, drug discovery, and particle physics to name a few. It was a great system and it will be missed. Job well done, Mira, job well done.

Listen in to hear the full conversation


Wednesday, January 15, 2020

2020 Predictions, Get it?!

Shiny Crystal Ball

It’s our first episode of 2020, yay! The first that was recorded in 2020 anyway.  It's a predictable 20/20 joke (more of a meh comment really) but the topic today is... PREDICTIONS. More specifically, it's our predictions of what’s going to happen in the next year. We may not always be correct, but we think maybe we’re always certain. We look at compute, interconnects, security, and general innovations:

Compute

Dan says that we’re going to have more of it. Henry predicts that we’ll see a RISC-V based supercomputer on the TOP500 list by the end of 2020 – gutsy call on that.  This is a double down on a bet that Dan and Henry have, so he’s reinforcing his position. Dan also sees 2020 as the “Year of the FPGA” when we start to see more and more HPC boxes fueled by FPGA, which is something Shahin mostly agrees with while Henry disputes it. We also touch on liquid cooling and process size as part of this topic.

Interconnects

Dan thinks that a 400 Gb/s InfiniBand interconnect will be announced by the end of this year – a bold prediction. On a communications note, Henry says that 20% of the US user base will have access to 5G phone coverage by the end of the year. Shahin asserts that only 3% of the market will actually buy it, but Dan and Henry say not so fast – it’ll be closer to 10%. Shahin is looking for a 5G connection for servers. Not as an interconnect, but more as a WAN or a cluster that spans an entire county. On another note, Shahin believes that HPE will formally get into the interconnect business, selling the Slingshot interconnect.

Security Trends

Dan says we need more of it but doesn’t see anything that’s going to move the needle back towards the users. Jessi thinks that security education has improved things security-wise and that will continue in 2020. Henry and Dan disagree. Jessi is adamant.

Innovation/Trends

Dan pegs in-memory computing as a field that will blossom over the coming year(s). Shahin agrees that in-memory is very interesting and ripe for innovation as well. But he also sees a lot of developments in the AI processor space. Henry talks about a new application workflow that will go something like this:  Object > MemMap > Compute on the MemMap file/data > back to Object, with no POSIX in the way. Shahin also sees more quantum supremacy in the news in the coming year.
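For the curious, here is a rough sketch of what Henry's Object > MemMap > Compute > Object flow might look like in Python. Everything in it is an assumption made for illustration: the dictionary standing in for an object store, the `process_object` helper, and the doubling "compute" step are all hypothetical, and an anonymous memory map is used so that no file path is involved.

```python
import mmap


def process_object(store, key):
    """Object -> memory map -> compute in place -> back to object.

    `store` is a hypothetical stand-in for an object store (key -> bytes).
    """
    payload = store[key]                      # 1. fetch the object
    # 2. anonymous memory map (fileno -1), so no file system path is touched
    with mmap.mmap(-1, len(payload)) as view:
        view[:] = payload                     # load the object into the map
        # 3. compute directly on the mapped bytes (toy step: double each byte)
        for i in range(len(view)):
            view[i] = (view[i] * 2) % 256
        store[key] = view[:]                  # 4. write the result back as an object
    return store[key]


if __name__ == "__main__":
    demo_store = {"dataset": bytes([1, 2, 3, 4])}
    print(list(process_object(demo_store, "dataset")))
```

A real system would presumably map the object's storage directly rather than copying bytes in and out; the copy here just keeps the sketch self-contained.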

Letter(s) to the Editor!

We discuss our first letter to the editor, from a listener who wasn’t a fan of the episode where we answered Jessi’s question about why tape is still used. His term for that feature? “Poor.” This prompted Shahin to quip, “I’m surprised we don’t get more of these…..”  Please keep those comments (good, indifferent, or critical) coming, our email is podcast@radiofreehpc.com.

Why Nobody Should Ever be Online. Ever.

This week, Henry doesn’t have a “Reason Why No One Should Ever Be Online. Ever.” He was offline all week, and thus doesn’t have anything to scare us with.

Catch of the Week



Henry:  has no catch, his net came up empty.

Shahin:  was practicing Catch & Release this week, so his creel is fishless.

Jessi:  discusses her new phone. She lost her old one in a Czech toilet (nasty, yikes). This is her first phone upgrade since junior high school – probably 6-7 years – and she’s agog at how the phones have advanced. She can now take pictures and use apps. Yay Jessi!

Dan:   Encourages listeners to have a good year and to let us know what you think via email (podcast@radiofreehpc.com) and twitter (@radiofreehpc). He also highlights the new RadioFreeHPC logo along the way.

Listen in to hear the full conversation


Thursday, October 24, 2019

Fragile: Python, sudo, AI

How Many Episodes, Did You Say?!

The show is geographically skewed today with all three hosts on the west coast. Henry is gracing Seattle with his presence, resulting in plunging coffee inventories and skyrocketing sushi prices.

The first item of discussion is a problem in the scientific software world. A bug in a set of Python scripts caused identical routines to produce different results on different operating systems. For example, the results on macOS Mavericks and Windows 10 were significantly different from the results of the same application run on Ubuntu 16 and macOS Mojave. As the guys discuss, it’s not a Python thing but a problem with the order in which files got read, which depends on the operating system. This impacts the sort order and thus the end results. This reminds Dan and Shahin of, as Dan regards it, the crime that is IEEE Floating Point. The gang speculates on other causes of these types of problems and the fixes that should be employed.

Chemists bitten by Python scripts: How different OSes produced different results during test number-crunching

Chemistry boffins at the University of Hawaii have found, rather disturbingly, that different computer operating systems running a particular set of Python scripts used for their research can produce different results when running the same code.
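To make the file-ordering problem concrete, here is a minimal sketch. It is not the chemists' actual code; the helper names and file names are made up for illustration. Python's `os.listdir` returns entries in arbitrary order, so any script that depends on the order should sort the names explicitly.

```python
import os
import tempfile
from pathlib import Path


def list_inputs_unordered(directory):
    """Return file names in whatever order the OS hands them back.

    os.listdir makes no ordering promise, so this can differ between
    operating systems and file systems.
    """
    return os.listdir(directory)


def list_inputs_sorted(directory):
    """Return file names in a fixed order that is the same on every platform."""
    return sorted(os.listdir(directory))


def demo():
    """Create a few hypothetical output files and list them deterministically."""
    with tempfile.TemporaryDirectory() as tmp:
        for name in ["run_10.out", "run_2.out", "run_1.out"]:
            Path(tmp, name).write_text("data")
        return list_inputs_sorted(tmp)


if __name__ == "__main__":
    print(demo())
```

Note that plain `sorted` orders names character by character, so `run_10.out` lands before `run_2.out`. If numeric order matters, the script needs a sort key that parses the number, which is yet another way to get bitten.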

Why No One Should Be Online - Ever (WNOSBOE?)

In Henry’s signature feature “Why No One Should Be Online. Ever” he discusses how a stalker in Japan was trying to pin down the location of a female pop star. He closely examined the reflections in her eyes in selfies she posted online, then used Google Street View to find out where she lives. Very scary stuff. Listen to the show for more details. It leads to a brief conversation of whether Henry Newman is stalk-worthy and an extended discussion of how to avoid this type of thing.

Stalking suspect allegedly studied pop idol's pupil images online to find her location

The man allegedly studied reflections of the woman's pupils in photos on social media and using Google Street View to find where she lived and what train stations she used.

Why AI is Dooming Us All (WAIIDUA?)

Dan introduces a new occasional feature, “Why AI is Dooming Us All.” According to Dan, AI is very brittle and can be fooled easily. He cites a case where just a few pieces of tape can make a stop sign look like a “Speed 45” sign to an AI. Dan makes a lot of broad general anti-AI statements in his typical fashion. For some reason, we find that when you attack AI, AI finds a way to respond, and the brutal AI response is included in this episode. Take a listen to the episode to hear how the AI rips Dan a new one and promises to ruin his life.

Artificial intelligence isn’t very intelligent and won’t be any time soon

For all of the recent advances in artificial intelligence, machines still struggle with common sense

Catch of the Week

Henry:  there is a great documentary about the history of computing in Minnesota, going in depth on the companies and technologies that originated in “The Star of the North” (Minnesota’s state motto. Their other state motto is, I think, “Minnesota:  Gateway to the Dakotas”).

Shahin:  Gives us an update on Facebook’s plans for their shiny new Libra cryptocurrency, which is facing a bit of a bumpy ride. Several high-profile Libra partners have bailed out while Facebook stays the course. Interesting stuff.

Dan:  Discusses a bug in the Linux sudo command. On some misconfigured systems, the bug lets local or remote users gain root access through sudo, effectively making them superusers. He also manages to insult Phil Collins and his horrible Su-Su-Sudio song in the process. The guys discuss asking Linus Torvalds this question and Dan brings up how a person he knows once sold Linus a Christmas tree, which brings up a short discussion of what kind of tree Linus would purchase.

Just Another Episode

Finally, we're not so taken by round numbers these days, but we touch on the fact that this is our 250th RadioFreeHPC episode and offer great prizes to whoever has listened to all of our episodes. We also thank our listeners – like you, maybe we could do it without you, but it wouldn’t be very much fun, right?

Stay tuned -- and by "tuned", we mean "optimized" -- for a more proper commemoration in another episode.

Listen in to hear the full conversation


Monday, September 30, 2019

FinTech and HPC-AI

@RadioFreeHPC Has Entered The Building

First things first, you can call us @RadioFreeHPC now, thanks to our new Twitter account. We decided maybe this social media thing is not a fad after all. We are also pleased to inform you that our Twitter account is almost as heavily followed as the podcast itself. Thank you! We should be up to about 6 or 7 followers by the time you read this. Good thing we allocated 64 full bits to track the number.

FinTech and HPC-AI

Shahin gives an update on the HPC-AI on Wall Street conference. We discuss the well-received Cryptocurrency panel that he moderated, the challenges of using AI in financial services, the emerging field of computational storage, and advanced HPC-class modeling that helps venture capital investors decide whether to invest in a startup. Check out his blog on the panel and top-10 crypto topics of the day here:

Top-10 Crypto/Blockchain Topics

  • Why? What’s the big deal?
  • Blockchain or Crypto?
  • ICOs
  • Political Support
  • Libra
  • Apps
  • Security
  • Other Coins
  • Digital Assets
  • Smart Contracts

Henry Newman's Why No One Should be Online, Ever.

Once again, Henry actually has good news, and once again, it's the kind of good news that highlights the bad news.

Man Who Hired Deadly Swatting Gets 15 Months

An Ohio teen who recruited a convicted serial “swatter” to fake a distress call that ended in the police shooting an innocent Kansas man in 2017 has been sentenced to 15 months in prison. “Swatting” is a dangerous hoax that involves making false claims to emergency responders about phony hostage situations or bomb threats, with the intention of prompting a heavily-armed police response to the location of the claimed incident.

Catch of the Week

Shahin talks about France and Germany planning to block the Libra cryptocurrency. Henry and Dan think this is a good time to say "we told you so"! Nobody's surprised, though Shahin thinks this is the beginning of this, not the end.

Germany's Scholz: We cannot accept parallel currencies such as Facebook's Libra

German Finance Minister Olaf Scholz said on Tuesday policymakers could not accept the emergence of parallel currencies such as Facebook’s planned Libra, adding that Berlin would reject any such plans. Facebook’s planned Libra is the most well-known of the stablecoins, a certain form of cryptocurrency backed by assets such as traditional money deposits, short-term government securities or gold.
Henry doesn't know whether to laugh or cry as he describes some of the "ignoble" prize winners and wonders how they ever got funded.

Magnetic cockroaches, dirty money, wombat poo and posties' balls: It's the Ig Nobels 2019

This year's theme was 'habits' and they were baaaaad. The Annals of Improbable Research held its annual award-giving ceremony – the Ig Nobel Prize – on Thursday night at Harvard's Sanders Theatre, and the entries were as worthy as ever.
Dan talks about the call-center scammer whose plea deal backfires:

Call-center scammer loses $9m appeal in stunning moment of poetic justice

But I only expected to pay $250,000, wails scumbag to wall of blank faces. A call-center scammer has lost his appeal to overturn a $9m fine – after a court pointed out the crook had specifically waived the right to appeal when he pleaded guilty.

Listen in to hear the full conversation.


Saturday, August 24, 2019

Coral is Cray for All

Cray Pulls an Exascale Hat Trick

Guess who's having a great year? Think Aurora, Frontier, and El Capitan. Cray has put some nice numbers on the accounts receivable ledger, and these are not ordinary numbers. The Exascale era is being defined substantially by the DOE Coral program and the commercial markets are watching as their computing needs start looking like those of the national labs. In that context, Cray's clean sweep makes its leadership in this area very important.
All of this is happening as Cray gears up to become what we hope will be an important part of HPE. The last time Cray sold anything like this to anyone was Cray BSD going to Sun, and that ended up being a multibillion dollar juggernaut. Exascale is a bigger deal, especially as supercomputing goes mainstream because of AI and data science. Exciting times. And kudos to HPE for snapping up Cray at the right time.

The impact of AI on Science

Speaking of AI, a series of town halls is being held around the nation by Argonne National Labs "aimed at collecting community input on the opportunities and challenges facing the scientific community in the era of convergence of High Performance Computing (HPC) and artificial intelligence (AI) technologies and the expected integration of large-scale simulation, advanced data analysis, data driven predictive modeling, theory, and high-throughput experiments. The term we are using to represent the next generation of methods and scientific opportunity is 'AI for Science'."
Co-chairing the town halls are Rick Stevens of Argonne, Kathy Yelick from Berkeley Labs, and Oak Ridge Labs' Jeff Nichols. Dan references a very good interview with Rick Stevens.

Henry Newman's Feel-Good Security Corner

Henry delights us all once again by describing how your camera can be an "Enter Here" sign for malware:

Canon DSLR Camera Infected with Ransomware Over the Air

Vulnerabilities in the image transfer protocol used in digital cameras enabled a security researcher to infect with ransomware a Canon EOS 80D DSLR over a rogue WiFi connection.
A host of six flaws discovered in the implementation of the Picture Transfer Protocol (PTP) in Canon cameras, some of them offering exploit options for a variety of attacks.

Catch of the Week

It was a pretty full episode, so we skip the Catch of the Week segment this week.

Listen in to hear the full conversation.


Thursday, July 18, 2019

HPC Market Eyes $44B in 5 Years

HPC Market Eyes $44B

New report from Hyperion Research has the HPC+AI market growing to $44B, with a B, in 5 years. The industry is hitting on all cylinders, benefiting from
  • The ExaScale race,
  • AI coming to the enterprise only to find that it needs, or really is, HPC, depending on your point of view, and
  • its usual, sometimes slow but always steady, growth
The big news continues to be AI fundamentally bringing HPC closer to the mainstream of enterprise computing whether it is on-prem, in a co-location facility, or in a public cloud.

All of this is starting big changes in the industry. We see this in mergers and acquisitions (basically new companies), new technologies, new architectures, and new business models. An example of the latter is the loosening of chip licensing, with open source models starting to get attention. Unlike open source software, however, silicon needs a fab, and the necessary electronic design automation software applications don't have equivalent open source alternatives.

Catch of the Week

Henry:

Following a supply chain security breach, Henry predicts that standards bodies like NIST and ISO will become even more active in this area with guidelines for hardware, software, and processes.

Shahin:

Shahin talks about Apple's design chief, Jony Ive, leaving the company and shares some jokes on social media that fall flat for Dan and Henry, who probably claim it has nothing to do with them being such PC aficionados.

Jony Ive, Designer Who Made Apple Look Like Apple, Is Leaving to Start a Firm

Jony Ive, Apple’s chief design officer and one of the most influential executives in the history of the Silicon Valley giant, is leaving the company. Mr. Ive will depart this year to start his own design company, Apple said on Thursday. Through his new firm, LoveFrom, Mr. Ive will continue to work on a wide range of Apple products, the company said.

Dan:

Dan concludes the show without a "catch" this week!

Listen in to hear the full conversation.


Wednesday, July 3, 2019

Why did HPE buy Cray?

Why did HPE buy Cray?

The RFHPC team tackles the HPE-Cray acquisition as it reviews the companies' recent moves and strengths and market conditions in the context of:
  • the 5-tier data center application architecture: Embedded, Mobile, Desktop, On-premises, Off-premises
  • the emergence of AI as a must-do enterprise app, and
  • increasing commonality between supercomputers and enterprise servers.

Catch of the Week


Henry:

Another week another breach!

Massive Quest Diagnostics data breach impacts 12 million patients

A massive data breach has struck Quest Diagnostics and the information of up to 11.9 million patients has potentially been compromised. On Monday, the US clinical laboratory said that American Medical Collection Agency (AMCA), a billing collections provider that works with Quest, informed the company that an unauthorized user had managed to obtain access to AMCA systems.

Dan:

Dan points out that the new Apple Mac Pro can be configured to cost tens of thousands of dollars. Given that he and Henry are PC people, the nuances of the Apple value are obviously lost on them, goes the counterargument.

Apple’s top spec Mac Pro will likely cost at least $35,000

That’s before you count the GPUs or a Pro Display XDR screen.
Apple announced today that its new Mac Pro starts at an already pricey $6,000, but the company neglected to mention how much the top-of-the-line model will cost. So we shopped around for equivalent parts to the top-end spec that Apple’s promising. As it turns out: $33,720.88 is likely the bare minimum — and that’s before factoring in the four GPUs, which could easily jack that price up to around $45,000.
Listen in to hear the full conversation.

Download the MP3 * Subscribe on iTunes * RSS Feed

Sign up for our insideHPC Newsletter

Monday, June 17, 2019

TOP500 Jun2019, Facebook Coin

The new TOP500 list of most powerful supercomputers is out and we do our usual quick analysis. Not much changed in the TOP10 but a lot is changing further down the list. Here is a quick take:
  • There are 65 new entries in 2019.
  • US science is receiving support via DOE sites and academic sites like TACC.
  • 26 countries are represented. China continues to widen its lead, now with 219 entries, followed by the US with 116, Japan with 29, France with 19, the UK with 18, Germany with 14, Ireland and the Netherlands with 13 each, and Singapore with 10.
  • Vendors substantially reflect the country standings. Lenovo has 175 entries, Inspur 71, and Sugon 63, all in China. Cray has 42 and HPE has 40 (which will combine when their deal closes), followed by Dell at 17 and IBM at 16. Bull has 21 entries.
  • There are a lot of "accidental supercomputers" on the list. These are systems that probably are not doing much science or AI work, but they could, the vendors counted them, and it seems to be within the rules to list them. It's controversial but not a new practice.
  • There are several systems listed as "Internet" companies. Hard to tell what that means but it points to the existence of very large clusters in the cloud for whatever purpose. Last year, there was one system listed as Amazon EC2, which remains on the list. This time, there is also one at Facebook. Usually the big social/cloud players don't care to participate, though they obviously could summon the resources to run the benchmarks.
  • Just over half of the systems use Ethernet as a fabric. A quarter use InfiniBand, nearly 50 use Intel's OmniPath, and the rest, 55, use custom interconnects like the ones Cray provides. The team talks about Cray+HPE entering the interconnect business for real; if they do, they will be formidable.
  • The majority of entries, 367, do not have any accelerators. 125 use Nvidia GPUs.
  • The overwhelming majority of the systems, 478 of them, are based on Intel CPUs. 13 are IBM, and there is 1 system based on Arm provided by Cavium, now part of Marvell.
  • So when it comes to chips, it's an Intel game, with a respectable showing by Nvidia when GPUs are used. Alternatives are bound to appear as the tens and tens of AI chips in the works become available and as Arm, AMD, and IBM keep building. The recently announced system at Oak Ridge will be all AMD, and that will point to an alternative as well.
  • Notably, Intel is listed as the vendor for 2 entries and Nvidia is listed for 4. While Intel has stayed largely away from looking like a system vendor, Nvidia is going for it with its usual alacrity. That, and the pending acquisition of Mellanox by Nvidia should serve as a warning to all system vendors who might feel stuck between treating Nvidia as an important supplier and an up and coming competitor.
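
To put the counts above in proportion, here is a minimal Python sketch that turns a few of the quoted numbers into shares of the 500-entry list. The dictionary keys and the `share` helper are names made up for this illustration, and the "implied" remainder assumes every system is counted exactly once as either unaccelerated, Nvidia-accelerated, or something else.

```python
# Counts quoted in the list above; the list has 500 entries by definition.
TOTAL = 500

counts = {
    "systems in China": 219,
    "systems in the US": 116,
    "Intel CPU systems": 478,
    "systems without accelerators": 367,
    "systems with Nvidia GPUs": 125,
}


def share(count: int, total: int = TOTAL) -> float:
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100.0 * count / total, 1)


shares = {name: share(n) for name, n in counts.items()}

# Assumption: each system falls into exactly one accelerator bucket, so the
# remainder is the number of systems using some other kind of accelerator.
other_accelerated = (
    TOTAL
    - counts["systems without accelerators"]
    - counts["systems with Nvidia GPUs"]
)

for name, pct in shares.items():
    print(f"{name}: {pct}%")
print(f"systems with other accelerators (implied): {other_accelerated}")
```

Run as a script, this prints each share plus the implied number of systems with non-Nvidia accelerators.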

CryptoSuper500

Shahin mentions the 2nd edition of the CryptoSuper500 list (really 50 for now), a list developed by his colleague Dr. Stephen Perrenod, which was launched last November, and is being released at the same time as the TOP500. The TOP500 has spawned variations that look at different workloads and attributes, for example, the Green500, Graph500, and IO500 lists. CryptoSuper500 was inspired by those lists. The material for the inaugural edition of the CryptoSuper500 list is here.
Cryptocurrency mining operations are often pooled and are very much supercomputing class, typically using accelerator technologies such as custom ASICs, FPGAs, or GPUs. Bitcoin is the most notable of such currencies. Scroll down for the top-10 list and see the slides for the full list and the methodology.

Catch of the Week


Henry:

Henry talks about check-out lanes at Target all being down for unknown reasons, though he hesitates to call that a cybersecurity breach. It turned out he was right: the company blamed an "internal technology issue".

Target down (then back up) as cash registers fail and leave long lines

Target's payment systems appeared to be missing the mark the day before Father's Day, as terminals went AWOL for a couple of hours in a number of the company's US retail outlets. The outage caused long lines but prompted an encouraging show of sympathy for Target employees from people on Twitter. And there were some jokes too, of course.

Shahin:

Facebook is expected to release a new cryptocurrency that is already impacting the crypto market.

Here’s what we know so far about the secretive Facebook coin

Facebook is likely to release information about its secretive cryptocurrency project, codenamed Libra, as soon as June 18, TechCrunch reports.
As is traditional with new cryptocurrencies, the social networking giant is expected to release a so-called “white paper” outlining how the currency works and the company’s plans for it.

Dan:

Dan reminds us all of the inimitable Erich Anton Paul von Däniken and his ancient astronauts hypotheses!

Listen in to hear the full conversation.

Download the MP3 * Subscribe on iTunes * RSS Feed

Sign up for our insideHPC Newsletter

Sunday, June 9, 2019

Forty+ different AI chips

What are we going to do with 40+ different AI chips?

This week, the team looks at AI chips again, this time motivated by an article in EE Times about one such chip, from Graphcore, that touts it as "the most complex processor" ever at some 20 billion transistors. The VC-backed company out of Bristol, UK is also valued on paper at $1.7b, gaining it the coveted "unicorn" status, apparently the "only western semi-conductor unicorn".

This being one of 40+ such AI chips (and that may be conservative), the odds of success are long and the task formidable. But even if only 2 or 3 such chips become successful, that's already a significant disruption to the market.

The Graphcore chip is 16nm, 1.6GHz, and comes in a PCIe card at 300W. You can stack 8 of these in a 4U chassis, so 2.4 kW just for those.

After a mini-rant about respected publications succumbing to clickbait, the team talks about how cooling will be an issue and calls again for more clarity in performance metrics, since the chip is rated at 125 TFlops but we don't know at what precision. Shahin reminds the team of his suggestion to clarify things by including precision in the metric, like DFlops for double precision, and then S for single, H for half, and Q for quarter precision.
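
As a rough sketch of how such a precision-tagged metric could be written down, the Python snippet below builds a label from a rate and a precision name. The letters D, S, H, and Q come from the suggestion above; the bit widths, the placement of the letter after the "T" prefix, and the function name `label_rate` are assumptions made for this illustration. The article does not state the chip's precision, so "half" in the usage line is only a placeholder.

```python
# Precision letters follow the suggestion in the show notes.
PRECISION_LETTER = {"double": "D", "single": "S", "half": "H", "quarter": "Q"}

# Assumed bit widths for each precision name (not stated in the show notes).
ASSUMED_BITS = {"double": 64, "single": 32, "half": 16, "quarter": 8}


def label_rate(tera_ops: float, precision: str) -> str:
    """Format a rate in teraflops with the precision baked into the unit."""
    letter = PRECISION_LETTER[precision]
    bits = ASSUMED_BITS[precision]
    return f"{tera_ops:g} T{letter}Flops ({bits}-bit)"


# Placeholder precision: the 125 TFlops figure was quoted without one.
print(label_rate(125, "half"))
```

With a label like this, two chips quoted at the same headline number but different precisions would no longer look equivalent.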

Henry talks about how hard it is to build and test complex software like this. Shahin counters that the modern software stack is so tall that the chip need only be concerned with a couple of layers, that codes are new and open to being recompiled, that software is increasingly open source, that cloud providers and large customers have the wherewithal to do the job, and that traditional HPC customers are willing to do the work if the performance enhancements are there.

No "Catch of the Week" this time since Henry had a hard stop. We're used to it!

Listen in to hear the full conversation.

Download the MP3 * Subscribe on iTunes * RSS Feed

Sign up for our insideHPC Newsletter

Thursday, April 11, 2019

Enterprises going HPC, Chips go Open Source, China goes for the top spot

We continue to want to make these introductions pretty brief here but not this time, apparently! Here's this week's synopsis.

Nvidia GTC 2019 announcements

We discussed the recent GTC conference. Dan has been attending since well before it became the big and important conference that it is today. We get a quick update on what was covered: the long keynote, automotive and robotics, the Mellanox acquisition, how a growing fraction of enterprise applications will be AI. In agreement with the message from GTC, Shahin reiterates his long-held belief that the future of enterprise applications will be HPC and once again asserts that AI as we know it today is a subset of HPC. Not everyone agrees. Henry brings up varying precisions in AI and a discussion ensues about what is HPC. There seems to be agreement that regardless of what label you put on it, it is the same (HPC) industry and community that is driving this new trend. And that led to a discussion of selling into the enterprise and the need for new models and vocabulary and such. Speaking of varying precision, there is also Nvidia's new automatic mixed precision capability for TensorFlow and there is a bit of discussion on that.
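
For a feel of why precision choices matter, here is a small Python sketch of a well-known pitfall: a running sum kept in half precision stops growing once the values it adds fall below its rounding step. It uses only the standard `struct` module to round to IEEE 754 half precision. It is not TensorFlow and not Nvidia's automatic mixed precision feature, and the helper names are hypothetical.

```python
import struct


def to_half(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def sum_ones(n: int, half_accumulator: bool) -> float:
    """Add 1.0 n times, optionally rounding the running sum to half precision."""
    total = 0.0
    for _ in range(n):
        total = total + 1.0
        if half_accumulator:
            # Simulate storing the accumulator in a 16-bit float.
            total = to_half(total)
    return total


# Half precision has an 11-bit significand, so above 2048 it can only
# represent every other integer and the sum stalls.
print(sum_ones(4096, half_accumulator=True))   # 2048.0
print(sum_ones(4096, half_accumulator=False))  # 4096.0
```

Keeping the accumulator in a wider format while the inputs stay narrow is the general idea behind mixing precisions, though real frameworks handle many more details than this toy shows.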

China plans multibillion dollar investment in supercomputing

On the heels of the Aurora announcement, there was news in the South China Morning Post that the top spot in supercomputing is something the country is investing in. No surprise, but interesting to see, and consistent with the general view that supercomputing drives competitive strength.

Catch of the Week

Henry:

Facebook Stored Hundreds of Millions of User Passwords in Plain Text for Years

Hundreds of millions of Facebook users had their account passwords stored in plain text and searchable by thousands of Facebook employees — in some cases going back to 2012, KrebsOnSecurity has learned. Facebook says an ongoing investigation has so far found no indication that employees have abused access to this data.
Shahin:

MIPS R6 Architecture Now Available for Open Use

MIPS 32-bit and 64-bit architecture – the most recent version, release 6 – will become available Thursday (March 28) for anyone to download at MIPS Open web page. Under the MIPS Open program, participants have full access to the MIPS R6 architecture free of charge – with no licensing or royalty fees.

Dan:

Vengeful sacked IT bod destroyed ex-employer's AWS cloud accounts. Now he'll spent rest of 2019 in the clink

An irate sacked techie who rampaged through his former employer's AWS accounts with a purloined login, nuking 23 servers and triggering a wave of redundancies, has been jailed.  

Dead LAN's hand: IT staff 'locked out' of data center's core switch after the only bloke who could log into it dies

'We can replace it but we have no idea what the config is on the device'

Listen in to hear the full conversation.

Download the MP3 * Subscribe on iTunes * RSS Feed

Sign up for our insideHPC Newsletter

Sunday, March 24, 2019

Multicore Scaling Slow Down, and Fooling AI

The team has an animated discussion about multicore scaling, how easy it seems to be to mislead AI systems, and some good sized catches of the week. A common thread is "data" as is often the case these days.

We continue with making these introductions pretty brief here. This time, we include not only the links but also the first paragraph of the linked page as a block quote so you have a bit more information about what is discussed.

Specialized Chips Won’t Save Us From Impending ‘Accelerator Wall’

As CPU performance improvements have slowed down, we’ve seen the semiconductor industry move towards accelerator cards to provide dramatically better results. Nvidia has been a major beneficiary of this shift, but it’s part of the same trend driving research into neural network accelerators, FPGAs, and products like Google’s TPU. These accelerators have delivered tremendous performance boosts in recent years, raising hopes that they present a path forward, even as Moore’s law scaling runs out. A new paper suggests this may be less true than many would like.

Nice 'AI solution' you've bought yourself there. Not deploying it direct to users, right? Here's why maybe you shouldn't

Top tip: Ask your vendor what it plans to do about adversarial examples.
RSA It’s trivial to trick neural networks into making completely incorrect decisions, just by feeding them dodgy input data, and there are no foolproof ways to avoid this, a Googler warned today.

Catch of the Week


MyEquifax.com Bypasses Credit Freeze PIN

Most people who have frozen their credit files with Equifax have been issued a numeric Personal Identification Number (PIN) which is supposed to be required before a freeze can be lifted or thawed. Unfortunately, if you don’t already have an account at the credit bureau’s new myEquifax portal, it may be simple for identity thieves to lift an existing credit freeze at Equifax and bypass the PIN armed with little more than your, name, Social Security number and birthday.

Announcing the Open Sourcing of Windows Calculator


Today, we’re excited to announce that we are open sourcing Windows Calculator on GitHub under the MIT License. This includes the source code, build system, unit tests, and product roadmap. Our goal is to build an even better user experience in partnership with the community. We are encouraging your fresh perspectives and increased participation to help define the future of Calculator.

Huawei Sues The US, Prodding It to Prove Suspicions

THE WORLD'S LARGEST telecommunications-equipment company, China's Huawei, is suing the US government. But the suit isn't just about US law. It's part of Huawei's larger campaign to defend its role as a global provider of telecom gear amid fears that its technology is or could be used by the Chinese government for spying. In essence, Huawei is challenging the US government to prove its suspicions.

Listen in to hear the full conversation.

Download the MP3 * Subscribe on iTunes * RSS Feed

Sign up for our insideHPC Newsletter