D-EJ915 writes:
> usually those tests are just a single core if the cpu has multiple cores
That's generally true for the integer tests, but most of the fp tests are run in parallel, i.e. with all
cores/CPUs enabled and autoparallel compiler options turned on.
TeeTylerToe writes:
> ummm, didn't the 3.16GHz beat the itanium2 in both? ...
Both what? Please don't say you're judging based on the final averages! The key point is that one
should compare those tests closest to one's application, and in that regard there are many codes
where the XEON is much slower than IA64. And the answer to your next question might surprise you.
Comparing based on final SPEC averages is dumb and tells one nothing (e.g. the final SPEC averages
for O2 hide an order of magnitude difference between the lowest and highest results). Look at the codes
where IA64 does well (lots of Fluid Dynamics stuff); when you take into account the difference in the
number of cores and clock speeds involved, the IA64's results are very impressive.
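To show why a final average can hide so much, here's a minimal sketch. SPEC's overall figures are geometric means of the per-test ratios; the two sets of ratios below are made-up numbers for two hypothetical systems, not real SPEC results, and the function names are just mine for illustration.

```python
import math

def geometric_mean(ratios):
    """Geometric mean of a list of positive per-test ratios."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

def spread(ratios):
    """Ratio between the highest and lowest per-test result."""
    return max(ratios) / min(ratios)

# Hypothetical per-test ratios (NOT real SPEC data).
system_a = [10.0, 10.0, 10.0, 10.0]   # uniform across all tests
system_b = [2.0, 5.0, 20.0, 50.0]     # poor on some codes, very good on others

for name, ratios in (("A", system_a), ("B", system_b)):
    print(f"system {name}: mean={geometric_mean(ratios):.1f} spread={spread(ratios):.0f}x")
```

Both systems end up with the same final average even though the second one varies by 25X between its worst and best test, which is why one should look at the individual tests closest to one's own application.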
> ... not as well as I predicted. also, how many cores, and sockets.
The XEON system was using 2 chips with 4 cores each (8 cores total), whereas the IA64 was only using
1 chip with 2 cores. The IA64 does much better than people think, i.e. a single dual-core 1.66GHz
Itanium2 can be 2X faster than two quad-core XEONs. This doesn't apply to all codes (certainly not), but
for those where the two XEONs are faster, one has to remember it takes two XEONs to achieve that difference.
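As a rough back-of-envelope sketch of that comparison, one can normalise by core count and clock speed. The assumptions here are mine: the XEONs run at 3.16GHz (the figure quoted earlier), the 2X figure is the best case for a particular code, and the helper name is purely illustrative. It's a crude measure that ignores memory bandwidth, cache sizes and so on.

```python
def per_core_per_ghz(relative_speed, cores, ghz):
    """Relative speed divided by total core-GHz: a crude efficiency figure."""
    return relative_speed / (cores * ghz)

# Best case from the text: the Itanium2 system is 2X faster than the XEON system.
# Assumption: XEON clock is 3.16GHz; XEON speed is taken as the 1.0 baseline.
itanium = per_core_per_ghz(2.0, cores=2, ghz=1.66)   # 1 chip, 2 cores
xeon = per_core_per_ghz(1.0, cores=8, ghz=3.16)      # 2 chips, 4 cores each

ratio = itanium / xeon
print(f"Itanium2 work per core per GHz vs XEON: {ratio:.1f}x")
```

For such a code the Itanium2 comes out at roughly 15X the work per core per GHz, under the assumptions above.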
I've amended the table above, and added URLs for the result pages. Quite a few new results are out now,
like the Dell T7400, so I'll wade through the spec pages and update the tables soon. It's not linked to
from my site index, but the file has always been available here. I use it as a personal reference when
researching performance issues and, if I can, update it 3 or 4 times a year.
Note that the Power6 result is interesting - it's only using 1 core on the chip. Strange that IBM didn't
submit results using both, though maybe they have now, I've not checked yet. Oh, I'm only referring to
the fp results here; for int, the XEON does much better, at least for the tests in the SPEC suite anyway
(large scale apps using lots of CPUs might behave differently).
Anyway, all this anti-IA64 sentiment is silly. Some of what went into IA64 came from ex-SGI people and
others familiar with SGI's ideas for H1/H2. I was very against IA64 early on, partly anti-Intel bias, partly
the loss of SGI's next-gen CPUs, etc., but after talking to John Mashey about it (one of the original MIPS designers, then at SGI)
I was satisfied the result was going to be a good design, which it was/is. Alas, the late release caused
other problems (never used in O2K). If Itanium fails, it will be on cost grounds and similar factors, not
performance. Speed-wise, it's a very efficient design, i.e. in terms of work done per clock cycle per core. It would
have been nice to have H2, etc., but it was never going to happen for cost reasons (faster than IA64
as originally planned, but not fast enough to justify the higher cost).
It was a long time ago now, so I'm sure John wouldn't mind if I quoted his 1998 email to me:
IA64 is *very* different in almost every conceivable way from an IA32 in architecture, emphasis, and
performance; it is very much what we might have designed had we been able to do a new ISA that
no longer had to be upwards compatible with anything we had (MIPS has run out of some opcode
slots, and will always have 32 each of integer and FP registers). It is publicly known that IA64 has
128 each of integer and FP registers, and if you care about FP you will like that. There are numerous
other features where people have learned, and I found many features that probably first appeared in
MIPS chips, but in some cases cleaner. I studied the manuals looking for showstoppers, and was
heartily relieved not to find any such, and I did find features that I'd been specing for H2.
Anyway, I can't say anything that isn't public, but I would say:
a) This is a good architecture and a good chip, and if you liked MIPS chips, you will like IA64, even if
you despise IA32s.
b) In various strange ways, this architecture and implementation almost seem designed as better for
SGI than for anybody else, even HP.
c) As it happens, the threads of ideas that led to the R8000 and SGI compiler technologies came
from people who'd worked on related technologies at other companies, and worked with people
with similar ideas, who went to Intel & HP, and strangely enough, some of what's seen in the chips
is *very* familiar.
So, anyway, your concerns are well-taken; I had some of the identical ones; my boss (Forest Baskett,
our CTO, and one of those who worked on the Stanford MIPS) & I have both looked carefully at
this thing, and both feel that this will be a good chip for our customers ... it is *not* an X86, even if it is
able to run that code. By now, people understand how to do 64-bit instruction sets, so that's not
that hard any more; the real issue is in a myriad of other details, which is why I spent hours studying
the manuals, and I sure felt a lot better after I'd read them.
Ian.