Register: Sun's new orbit - COMA?

By kmself in News
Wed Oct 11, 2000 at 05:44:30 PM EST
Tags: Technology (all tags)

Andrew Orlowski of The Register has scooped the details on Sun's next-generation server technology, Serengeti. The design uses COMA, Cache-Only Memory Architecture, instead of the more prevalent NUMA (non-uniform memory architecture) to allow scaling beyond 64 CPUs.

Computer power can grow in three ways: faster CPUs, more CPUs in a box, or more boxes tied together. SMP, or symmetric multiprocessor, systems have been at the forefront of high-end server technology for the past two decades. The bottleneck of the technology is sharing memory among multiple processors via various caching and shared-bus systems. NUMA, the option of choice, scales to about 64 processors but, as yet, no further.

The article discusses some of the particulars of Sun's COMA implementation, apparently shared between both hardware and software. COMA has been proven academically, but not yet used in commercial products. There are issues both in system and software design which may make or break its success.

I'd be interested in hearing from the EE wonks as to what the up- and down-sides of this design are, and (of course) any thoughts on whether Linux has any potential to move into higher-end SMP space with either NUMA or COMA designs.

For some background reading, the following EE Times article, "IBM, Sun eye NUMA architectures to make servers sizzle", may be of interest. Also recommended is Greg Pfister's In Search of Clusters, a highly readable book on high-performance computing.


My computer has ? processors:
o 0-1 68%
o 2 20%
o 4 1%
o 8 1%
o 16 2%
o 32 1%
o 64 1%
o 128+ 5%

Votes: 98

Register: Sun's new orbit - COMA? | 12 comments
More 2s than 1s? (2.00 / 3) (#1)
by psicE on Wed Oct 11, 2000 at 03:28:49 PM EST

I would have thought that there'd be mostly 1s, some 2s, the occasional 4, but we got this...

Well, the poll didn't specify.... (4.00 / 2) (#2)
by Anonymous 242 on Wed Oct 11, 2000 at 03:36:11 PM EST

Notice the poll didn't specify Central Processing Units, just processors. My computer has many, many processors.....

Re: Well, the poll didn't specify.... (none / 0) (#10)
by psicE on Wed Oct 11, 2000 at 06:25:40 PM EST

Heh heh. Actually, it's turning out the way I thought now, except for the jerks who got SETI@home and now say they have a 5000+ node network making 128+ processors :)

Is it cheap? (4.00 / 1) (#9)
by Maniac on Wed Oct 11, 2000 at 05:50:36 PM EST

I've been looking at how to replace a large (28 CPU), expensive ($500k) server with something less expensive. I hate to pass that kind of $$ to any vendor and don't care if it's Sun, IBM, sgi, or any of the others. We are currently looking at clusters of PCs for some of the following reasons:
  • dual CPU machines are extremely cheap. Say $1,300 for dual 700 MHz PIII, 256M memory.
  • 4-32 CPU machines run anywhere from $10,000 to $500,000 - not very linear
  • we believe we can divide our application into pieces and live within the bandwidth of dual switched 100baseT ethernet
  • our application already is in pieces w/ a "message queue" (now shared memory)
  • as our application grows, we can add more nodes
  • we can buy better/faster computers as part of the growth strategy
  • it appears to scale acceptably from 1 node to 48
By the way, a 40 node cluster of dual CPU machines (10G memory) costs less than $100,000, including the racks, network links, power distribution, etc.
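To put the figures above side by side, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted in this comment; the variable names are made up for illustration, the cluster total is treated as an upper bound (the comment says "less than $100,000"), and the bandwidth figure is the nominal line rate of two 100 Mbit/s links, not measured throughput.

```python
# Figures quoted in the comment above. Names are illustrative only.
BIG_SERVER_COST = 500_000    # dollars, the 28-CPU server
BIG_SERVER_CPUS = 28
CLUSTER_COST_MAX = 100_000   # dollars, upper bound incl. racks, network, power
CLUSTER_NODES = 40
CPUS_PER_NODE = 2
NODE_PRICE = 1_300           # dollars, one dual PIII node
NODE_MEMORY_MB = 256
LINKS_PER_NODE = 2           # dual switched 100baseT
LINK_MBIT_PER_S = 100        # nominal line rate per link


def cost_per_cpu(total_cost, cpus):
    """Dollars per CPU for a given purchase."""
    return total_cost / cpus


cluster_cpus = CLUSTER_NODES * CPUS_PER_NODE
server_per_cpu = cost_per_cpu(BIG_SERVER_COST, BIG_SERVER_CPUS)
cluster_per_cpu_max = cost_per_cpu(CLUSTER_COST_MAX, cluster_cpus)

# What the bare nodes cost, and what is left of the ceiling for everything else.
bare_nodes_cost = CLUSTER_NODES * NODE_PRICE
infrastructure_budget_max = CLUSTER_COST_MAX - bare_nodes_cost

# Total memory across the cluster, in gigabytes (1 GB = 1024 MB here).
cluster_memory_gb = CLUSTER_NODES * NODE_MEMORY_MB / 1024

# Nominal per-node network ceiling in megabytes per second (8 bits per byte).
node_bandwidth_mbyte_per_s = LINKS_PER_NODE * LINK_MBIT_PER_S / 8

if __name__ == "__main__":
    print(f"big server:  ${server_per_cpu:,.0f} per CPU")
    print(f"cluster:     at most ${cluster_per_cpu_max:,.0f} per CPU")
    print(f"bare nodes:  ${bare_nodes_cost:,} of the ${CLUSTER_COST_MAX:,} ceiling")
    print(f"memory:      {cluster_memory_gb:.0f}G across {CLUSTER_NODES} nodes")
    print(f"network:     {node_bandwidth_mbyte_per_s:.0f} MB/s nominal per node")
```

The per-CPU gap says nothing about whether a given workload actually fits inside that per-node network ceiling, which is the open question for any application being split across nodes.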

These kinds of announcements by Sun and others are interesting, but I'd rather hear about investment made to leverage inexpensive parts and get the best of both worlds - fast & cheap. With a cost target of say $500 per node, what is the best I can do for an interconnect?

it depends on what you want to do (none / 0) (#11)
by Anonymous 242 on Thu Oct 12, 2000 at 09:50:35 AM EST

These kind of announcements by Sun and others are interesting, but I'd rather hear about investment made to leverage inexpensive parts and get the best of both worlds - fast & cheap.

There are two problems that prevent this from happening.

  1. The majority of uses for machines this powerful also require machines to be insanely reliable. As an example, Sun Enterprise servers can have virtually any part, including CPUs, hot swapped without taking the system offline. This type of reliability costs money.
  2. Because there is such a small, relatively speaking, market for machines in this class, there is both a higher profit margin and a higher amount of overhead. Enterprise class systems have yet to meet an economy of scale because typically each supercomputer needs to be highly customized for the task at hand.

Personally, I would rather keep paying manufacturers a premium for premium hardware than see the commoditization of enterprise class hardware. While I'm quite happy that I can buy PC hardware dirt cheap in comparison to five years ago, I'm less than happy about the quality of most of that hardware.

And of course, with the POPC (Pile of PC) solutions such as Beowulf class clusters, people whose problems don't need the ultrareliability of enterprise class machines no longer have to pay the premium for enterprise class gear.

