Wednesday, May 26, 2021

Wafer-Scale Engine Opens a Can of Whoop-A%$ on Joule, the Supercomputer


As you know, Wasabi Roll's charter includes exploring the scientific and technological breakthroughs shaping the future.  Hence, submitted for your approval: the Wafer-Scale Engine (WSE), which blew the doors off a modern supercomputer, Joule.

Smaller is Better?

The adage in semiconductor physics is that smaller is better.  However, is it?  

That's the question posed by Cerebras Systems, the creator of the world's biggest computer chip.

The Cerebras Systems Wafer-Scale Engine is massive no matter how you slice it.  The chip measures roughly 8.5 inches to a side and contains 1.2 trillion transistors.  The next-biggest chip, NVIDIA's A100 GPU, measures just over an inch to a side and has only 54 billion transistors.
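A quick back-of-the-envelope calculation using the figures above shows what that size difference means. The die areas here are approximations from the rough dimensions in this post, not exact spec-sheet numbers:

```python
# Rough density comparison of the two chips (sizes approximate).
wse = {"side_in": 8.5, "transistors": 1.2e12}    # Wafer-Scale Engine
a100 = {"side_in": 1.0, "transistors": 54e9}     # NVIDIA A100 GPU

wse_density = wse["transistors"] / wse["side_in"] ** 2     # transistors / sq. inch
a100_density = a100["transistors"] / a100["side_in"] ** 2

ratio = wse["transistors"] / a100["transistors"]  # WSE has ~22x the transistors
```

Interestingly, by these rough numbers the smaller A100 actually packs more transistors per square inch; the WSE wins on sheer total area, not density.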

The former is new, largely untested, and, to date, one of a kind.  The latter is well-loved, mass-produced, and has taken over the AI and supercomputing world in the last decade.

Chips beyond AI


When Cerebras Systems first came out last year, the company said its chip could significantly speed up the training of deep learning models.  Since then, the WSE has made its way into a handful of supercomputing labs, where the company's customers are putting it through its paces.

200X!!!!

One of those labs, the National Energy Technology Laboratory (NETL), is looking to see what can be done beyond AI. In a recent study, researchers pitted the chip, which is housed in an all-in-one system about the size of a dorm-room mini-fridge called the CS-1, against a supercomputer in a fluid dynamics simulation.

Simulating fluid movement is a common supercomputer application, useful for solving complex problems such as weather forecasting and aircraft-wing design.

The trial was described in a paper written by a team led by Cerebras Systems' Michael James and the National Energy Technology Laboratory (NETL)'s Dirk VanEssendelft, Ph.D. They presented the paper at the SC20 supercomputing conference in November 2020.

The team said the CS-1 completed a simulation of combustion in a power plant approximately 200X faster than the Joule 2.0 supercomputer managed on a similar task.

The CS-1 was faster than real time.  As Cerebras Systems put it, the CS-1 can tell you what's going to happen in the future faster than the laws of physics produce the same result.

Researchers said the performance of the CS-1 could not be matched by any number of CPUs and GPUs. CEO and co-founder Andrew Feldman told VentureBeat, an online technology publication, that this would hold true no matter how big the supercomputer is.  Past a certain point, scaling a supercomputer like Joule doesn't produce any better results on this kind of problem.  That's why Joule's simulations peaked at 16,384 cores, a fraction of the 86,400 cores it had available.
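Why does adding cores stop helping? A toy strong-scaling model makes the intuition concrete: as you add cores, each one's share of the compute shrinks, but the cost of communicating between them grows, so total speedup eventually peaks. All numbers below are made up for illustration; they are not NETL's measurements:

```python
# Toy strong-scaling model: compute shrinks with core count,
# communication grows with it, so speedup peaks and then falls.

def step_time(cores, work=1e6, comm_per_core=0.5):
    compute = work / cores                       # perfectly parallel portion
    communicate = comm_per_core * cores ** 0.5   # inter-core exchange overhead
    return compute + communicate

base = step_time(1)
speedups = {p: base / step_time(p) for p in (1, 64, 1024, 16384, 86400)}
best = max(speedups, key=speedups.get)   # peaks well below the max core count
```

In this toy model the best speedup lands at 16,384 cores, and throwing the full 86,400 cores at the problem actually makes each time step slower, mirroring what the Joule runs showed.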

The comparison between the two machines drives the point home. 

Joule:

·       Is the 81st-fastest supercomputer in the world

·       Packs dozens of server racks

·       Consumes up to 450 kilowatts of power

·       Costs tens of millions of dollars to build

The CS-1, by comparison:

·       Fits into a third of a server rack

·       Consumes 20 kilowatts of power

·       Costs only a few million dollars

Admittedly, the applications are niche today, useful mainly for the mammoth calculations of chaotic systems, but it is still a stunning result.

Cut Down the Commute?

So, how does this work?  It's all about design: cut down the commute.  Computer chips begin life on a large piece of silicon called a wafer.

Multiple chips are normally etched onto the same wafer, and the wafer is then cut into individual chips. While the WSE is also etched on a silicon wafer, the wafer remains intact as a single operating unit. This wafer-scale chip contains nearly 400,000 processing cores.

Each core is linked to its own dedicated memory and to its four adjacent cores. Putting that many cores on a single chip, and giving each its own memory, is why the WSE is bigger. That's also why, in this case, it's better.
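The four-neighbor layout described above is a 2D mesh, and it's easy to sketch. The grid dimensions here are invented to land near 400,000 cores; the real on-wafer layout is Cerebras-proprietary:

```python
# Toy sketch of a 2D mesh of cores where each core talks only to its
# four immediate neighbors (north/south/east/west).
ROWS, COLS = 633, 633   # 633 * 633 = 400,689, roughly the WSE's core count

def neighbors(r, c):
    """Grid coordinates a core at (r, c) can reach in one hop."""
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(nr, nc) for nr, nc in candidates
            if 0 <= nr < ROWS and 0 <= nc < COLS]

# Interior cores have 4 links; edge cores have 3; corner cores have 2.
```

The payoff of this topology is locality: a core never reaches across the whole chip, only to the neighbor next door, which is exactly the short commute the next section describes.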

Most large-scale computing tasks depend on massively parallel processing, so researchers distribute the work across hundreds or thousands of chips. The chips need to work in concert, so they're in constant communication, shuttling information back and forth.  A similar process takes place within each chip, as information moves between the processor cores that do the calculations and the memory that holds the shared results.

It's kind of like an old-fashioned company doing all its business on paper. The company uses couriers to send and collect documents from other branches and archives across the city. The couriers know the best routes, but each journey takes a minimum amount of time determined by the distance between the branches and the archives, plus the other couriers on the road. In short, distance and traffic slow things down.

Now imagine that same company building a brand-new, gleaming skyscraper.  Every branch moves into the new building, and every worker gets a small filing cabinet in their office to store documents. Now any document they need can be stored and retrieved in the time it takes to walk across the office, or down the hall to a neighbor's office. The cost of information exchange has all but vanished; it's all in the same house.

Cerebras Systems' mega-chip is a bit like this skyscraper. The way it shuttles information, with the help of its specially tailored compiler software, is far more efficient than a traditional supercomputer that needs a ton of conventional chips networked together.

 

Simulation of the world as it unfolds

It's worth noting that the chip can only handle problems small enough to fit on the wafer, but such problems may have quite practical applications because of the machine's ability to run high-fidelity simulations in real time, the authors note.

For example, the machine should theoretically be able to accurately simulate the flow of air around a helicopter trying to land on a flight deck, and semi-automate the process, something that is not possible with traditional chips. Another opportunity they note would be to use the simulation as input to train a neural network that also resides on the chip.

In an intriguing related example, a Caltech machine learning technique recently proved to be 1,000 times faster at solving the same kind of partial differential equations at play here to simulate fluid dynamics. The authors also note that improvements to this chip, and others like it, should they arrive, will push back the limits of what can be achieved. Cerebras has already teased the release of its next-generation chip, which will have 2.6 trillion transistors, 850,000 cores, and more than double the memory.
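The partial differential equations in question can be made concrete with a toy example. Real CFD codes are vastly more complex, but a 1D heat equation stepped with an explicit finite-difference stencil shows the key property: each grid point updates from only its immediate neighbors, which is exactly the locality that maps so well onto a mesh of cores:

```python
# Toy explicit finite-difference solver for the 1D heat equation
# u_t = alpha * u_xx. Each point needs only its two neighbors per step.

def step(u, alpha=0.1, dx=1.0, dt=1.0):
    """Advance the temperature field u by one time step."""
    r = alpha * dt / dx ** 2          # stability requires r <= 0.5
    new = u[:]                        # endpoints held fixed (boundary condition)
    for i in range(1, len(u) - 1):
        new[i] = u[i] + r * (u[i - 1] - 2 * u[i] + u[i + 1])
    return new

# A hot spike in the middle of a cold bar diffuses outward over time.
u = [0.0] * 21
u[10] = 100.0
for _ in range(50):
    u = step(u)
```

Because every update is local, a machine with per-core memory and fast neighbor links can assign grid points to cores and march the whole field forward with almost no long-distance traffic.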

Cerebras is the first to seriously pursue wafer-scale computing, and the company believes it has solved the problem in a way that is both useful and economical. Other new architectures are also being pursued in the lab. Memristor-based neuromorphic chips, for example, mimic the brain by combining processing and memory in individual transistor-like components. And, of course, quantum computers are in a separate league, though they tackle similar problems. It seems just as likely that computing will splinter into a bizarre quilt of radical chips, all stitched together to make the most of each of them depending on the situation.

UPDATE

Cerebras’ New Monster AI Chip Adds 1.4 Trillion Transistors

A shift to a 7-nanometer process boosts the second-generation chip's transistor count to a mind-boggling 2.6 trillion.

Almost from the moment Cerebras Systems announced a computer based on the largest single computer chip ever built, the Silicon Valley startup declared its intentions to build an even heftier processor. Today, the company announced that its next-gen chip, the Wafer Scale Engine 2 (WSE 2), will be available in the 3rd quarter of this year. WSE 2 is just as big physically as its predecessor, but it has enormously increased amounts of, well, everything. The goal is to keep ahead of the ever-increasing size of neural networks used in machine learning.


Share and enjoy…

 _________________________________________

We would like to thank our sponsors, for without them - our fine content wouldn't be deliverable!


Source(s):

  • https://www.youtube.com/watch?v=NQGyd2kuctA
So “Once more unto the breach, dear friends, once more;”
____________________________________________________________
About Rick Ricker
An IT professional with over 23 years' experience in Information Security, wireless broadband, network and infrastructure design, development, and support.
Currently a Computer Science Instructor at SDSU, Shout-out to Cohort 1 - Good Luck class!

 


Thanks for your input, your ideas, critiques, suggestions are always welcome...

- Wasabi Roll Staff