The collected works of Kira

I talked with Pentium about this on IRC, and came to the conclusion that it's not a library path issue, but an issue with sqlite3_os_switch itself. That was an internal, undocumented function that was removed in the transition between 3.4.2 and 3.5.0 of SQLite. The call to sqlite3_os_switch was removed from Firefox sometime before 3.0, but it strongly appears that the .22pre version in nekoware still has the call.

Basically, I don't see how Firefox2 could possibly be working with sqlite above 3.4.2, since it relies on functions that are, quite simply, gone; Pentium's problem should affect anyone using Nekoware Firefox and the beta sqlite.
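One way to check this kind of thing directly is to ask the dynamic loader whether a given library still exports the symbol. This is only a minimal Python sketch using ctypes; the `library_exports` helper is made up for illustration, and the lookup through `find_library("sqlite3")` assumes a system-wide libsqlite3 that the loader can locate, which may not hold on every system.

```python
import ctypes
import ctypes.util


def library_exports(path, symbol):
    """Return True if the shared library at `path` exports `symbol`.

    Passing None as `path` inspects the running process itself (POSIX only).
    """
    lib = ctypes.CDLL(path)
    try:
        getattr(lib, symbol)  # ctypes raises AttributeError for a missing symbol
    except AttributeError:
        return False
    return True


if __name__ == "__main__":
    # Assumption: a system-wide libsqlite3 that find_library can locate.
    path = ctypes.util.find_library("sqlite3")
    if path is None:
        print("libsqlite3 not found")
    else:
        try:
            present = library_exports(path, "sqlite3_os_switch")
            print("sqlite3_os_switch exported:", present)
        except OSError as exc:
            print("could not load", path, "-", exc)
```

If the symbol is absent from the installed library but the application binary still references it, the runtime linker has nothing to resolve the call against, which matches the failure described above.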
pentium wrote:
Also, please don't feed the Kira. She bites.


Rude.

Also, only on special occasions.

_________________
Altix 450 / 8x9030, 14x9040, 28x9140 (100 cores) / 200GB DDR2
It seems to me like you're pissed that you can't afford things, not that they don't exist.

I'll tell you a secret. Nobody in the 90's was going to sell you a credible RISC/UNIX workstation for under a thousand dollars either.

Edited to add: What alternate universe do you live in where Java doesn't exist outside of x86?

OlaHughson wrote:
bgalakazam wrote:
1. I don't like the OS choice. Currently the market is pretty much Windows, OS X (wintel still) and GNU/Linux.


Y U NO BUY AMIGAONE X1000 ?
Seriously, why could You possibly forget about the best OS ever made?


For whatever reason, netbook-class performance in a multi-thousand-dollar price is a tough sell for a lot of people...

bgalakazam wrote:
on topic, not to mention that next gaming consoles (Sony and Microsoft) will be running on casualware x86 as well... looks like everybody is bending over


By all means, share with us from your font of infinite wisdom which non-x86 core they should have used. The IBM P6lite core used last generation is looking pretty long in the tooth... or do Atom class microarchitectures really appeal to you on some level?

It's pretty obvious to those of us who actually pay attention that IBM isn't particularly interested in that market anymore. The only core they have that's even remotely in the proper class for a modern game console is the PPC 476, which is inconveniently totally lacking in SIMD support and is pretty weak on floating-point, both of which are somewhat important for that workload.

Who else do you propose licensing a game console core from?

For what it's worth, I actually think the core design they're using for those systems (8-core low-clock Jaguar) is pretty questionable, but "wahh, it's x86" isn't the reason.

foetz wrote:
well he actually was aiming for the fact that there is no serious alternative at the moment indeed which is one of the main points of this thread


When, after the 80's, was there ever a serious alternative for the low end? The only time I can think of is the period in the very early 90's when NEC (and others) were pushing low-end MIPS running NT against the Pentium.

Other than that, it's almost always been RISC/UNIX being relegated to a high-end computing niche. Just like right now.

hamei wrote:
Kira wrote:
Other than that, it's almost always been RISC/UNIX being relegated to a high-end computing niche. Just like right now.

You stuck that "almost" in there but Silicon Graphics itself is a counter-example. The Indigo in particular was intended to bring graphics to the masses. That was Clark's whole schtick. He got driven out of the company because of it - the Venture Capitalists wanted the big bucks now, while he was focused on the future.

Silicon Graphics was not intended to be high-end. It was intended to be what Apple became, only a little classier :D


If we're including machines that cost 8000UKP as something other than high-end niche, then RISC/UNIX is mainstream right now. You can get Power7+ systems for ~$5000.

_________________
Altix 450 / 8x9030, 14x9040, 28x9140 (100 cores) / 200GB DDR2
The base 710 system was, before the 7+ refresh, slightly above 5k. I guess IBM bumped up list again.

Still, with discounts factored in, it's likely to end up well under 6k.

SAQ wrote:
hamei wrote:
sgi_mark wrote:
I wonder how many VMS customers will take HP up on their suggestion that they port to NSK or HP-UX ?

More to the point, how many customers will surround HP headquarters with burning torches, demanding Meg's head on a pike ?

You guys have got to quit letting these bastards pull this crap. The world will be intolerable if you keep allowing this kind of behavior.


This seems to have been HP's standard acquisitions policy for a while now (at least since Apollo). Spend a lot of money buying a company, discontinue their products at the earliest possible time, then expect the customers to move to HP's stuff. You'd think by now they'd figure out that it doesn't work that way.

Guess this means they'll never patch the y31086 problem :(


This is funny, because it's ignoring some pretty important parts of HP history. In the late 90's, HP had a well-regarded minicomputer system, with a large installed base, a deep roadmap, and a solid plan to migrate to IPF. I'm referring, of course, to the HP 3000 line and its operating system, MPE/iX. MPE was beautiful, with a high-performance, easy-to-maintain database (IMAGE) and a large number of ISV's providing software for the platform.

Then HP bought Compaq.

As soon as HP had its paws on NSK and VMS, the MPE roadmap went out the window. The IPF port was cancelled, and the half-baked VMS-on-IPF port (seriously, benchmark it side-by-side with HP-UX/aCC sometime) went ahead as HP's sole minicomputer option. MPE support and sales ended a couple of years ago, and now there's a thriving community of businesses providing support for those "homesteading" on MPE.

Meanwhile, HP's VMS development has been nothing but a chain of fuckups for the last several years. Support for new processors has lagged behind HP-UX and even Windows, performance is still behind other IPF systems, and only some Integrity systems can even run it. (No Superdomes for you, VMS users!) HP says that they know of 2500 unique customers on VMS, which, if accurate, implies some deep, deep fuckups on HP's end. I've never seen solid figures on VMS installed base when HP bought Compaq, but I'd be surprised if it was under 10000; the other major remaining minicomputer platform, IBM i, is claimed to have 100,000 unique customers.

Basically, the message here is not "oh noes, HP murders its acquisitions in favor of inhouse stuffs!!!!" but rather, "everything HP touches turns to shit."

foetz wrote:
hamei wrote:
Okay le, let's be realistic. As a storage server, it's ridiculous. 2U, 19" wide, 27" deep (plus all the crap sticking out the back) to hold two disks.

he didn't say his storage was internal

Quote:
Compile box, it *might* be a little faster than a Fuel but not enough to notice. I've run both.

it's exactly twice as fast (assuming 2 cpus) as soon as the gnu make or any other jobserver kicks in.


Not quite. That's two processors on a shared bus - that means that you have just as much memory bandwidth and I/O bandwidth as a single-CPU unit, so linear scaling with CPU count is unlikely.
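To put a rough shape on that, here is a toy model in Python. It assumes some fraction of the single-CPU run time is spent waiting on the shared bus and that this fraction does not shrink when a second CPU is added; the function name and the fractions are hypothetical, not measurements of any real machine.

```python
def shared_bus_speedup(cpus, memory_bound_fraction):
    """Estimate speedup when only the compute-bound share of the work scales.

    memory_bound_fraction is the share of single-CPU run time spent waiting
    on the shared bus; that share is assumed not to shrink with more CPUs.
    """
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    if not 0.0 <= memory_bound_fraction <= 1.0:
        raise ValueError("memory_bound_fraction must be between 0 and 1")
    compute = 1.0 - memory_bound_fraction
    return 1.0 / (memory_bound_fraction + compute / cpus)


if __name__ == "__main__":
    # Hypothetical bus-bound fractions, purely for illustration.
    for fraction in (0.0, 0.1, 0.3, 0.5):
        speedup = shared_bus_speedup(2, fraction)
        print(f"{fraction:.0%} bus-bound: {speedup:.2f}x on 2 CPUs")
```

Only the case with no bus-bound time at all gives exactly 2x; anything that leans on memory or I/O lands somewhere below that.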

_________________
Altix 450 / 8x9030, 14x9040, 28x9140 (100 cores) / 200GB DDR2
I have two P5 machines, a 520 and a 570, as well as four JS12 blades (4GHz Power6) and two Intel blades in a BladeCenter S.
IBM announced Power8 yesterday, ahead of a more official launch this coming Monday. Notably, in a marked contrast from previous Power generations, the core is fully licensable (like ARM) and the platform has the support of a broad ecosystem - most notably Mellanox, Canonical, Google, and Nvidia. Tyan will be providing less expensive whitebox Power8 systems later on, but the machines available Monday are IBM, and will carry an IBM pricetag (which is significantly less than the list prices of entry level systems from other high-end RISC vendors, but still higher than you'd likely pay for a 1-2 socket Supermicro whitebox.) The cheapest machine will have a list price of about US$8k.

The Power8 core is an extremely ambitious design; it can decode and issue six non-branch and two branch instructions per cycle, from as many as eight simultaneous threads. The processor as currently implemented has twelve cores and tops out around 4GHz. While industry-standard benchmarks likely won't show up until next week, I expect Power8 to be significantly faster than Intel's Ivy Bridge-EX at database and multithreaded workloads, and comparable or slightly faster on single-threaded compute loads. Single-threaded performance has become harder to measure lately, as certain subtests on the SPEC suite appear to no longer reflect CPU performance accurately (such as libquantum), but I would be surprised if Intel had any meaningful single-threaded advantage over P8. While Power8 is likely quite a bit faster than Ivy-EX, it's also thirstier; Ivy-EX high-end parts have a TDP of 155W, while the Power8 datasheet says that the P8 SCM has a TDP of 190W.

Current Oracle and Fujitsu SPARCs are likely in an even worse position against P8, as they are already slower than Ivy-EX at some loads, as measured by SPECint[1][2][4]. The good news for those platforms is that both companies have strong roadmaps with major evolutions in 2015; Fujitsu, for instance, has roadmapped a 24-core, 96-thread SPARC64 at 4.5GHz[3], which should be a good match for Power8 on multithreaded loads.

Interestingly, P8 is being marketed strongly toward Linux rather than legacy AIX or iSeries workloads. During the announcement presentation yesterday, Linux was mentioned dozens of times, while I only caught one mention of AIX and iSeries. Of the five new machines, two are Linux-only, and it's a safe bet that third-party Power8 whiteboxes will not be capable of running AIX or iSeries. While AIX and iSeries will no doubt be supported and developed into the far future, it's probable that those platforms are picking up relatively few new customers, and the focus on Linux is an attempt to reach a wider audience while the proprietary UNIX market shrinks.

In other vendor developments, we have bits from a number of companies. Nvidia and IBM are going to be working to ship GPU/P8 integrated systems in Q4 of this year; Nvidia's new NVLink technology (available in the next-generation GPU codenamed "Pascal") is designed for high-speed integration with Power processors. Google is investigating Power8 for use in its own datacenters, and is contributing to software and firmware for the platform. A Mainland Chinese company - Suzhou PowerCore - has licensed the Power8 core and is planning to build locally-produced Power8 compatible processors for the PRC server market.

Overall, Power8 is an immensely ambitious processor surrounded by a new development model and broad industry support, and is likely to be the fastest general-purpose processor in the world. While questions remain about whether it can turn around the overall decline in the high-end RISC market, I think it has a very good shot - it's a new business model, with new partners, looking for new workloads in new price ranges.

Notes
[1]: 8 socket Xeon E7-8890 @ 2.8GHz, 120 cores, 240 threads, vs 8 socket SPARC T5 @ 3.6GHz, 128 cores, 1024 threads
[2]: 4 socket Xeon E7-4890 @ 2.8GHz, 60 cores, 120 threads, vs 4 socket Fujitsu SPARC64 X+ @ 3.7GHz, 64 cores, 128 threads
[3]: http://www.fujitsu.com/global/services/computing/server/sparc/key-reports/roadmap/
[4]: Note: The exact performance of T5 relative to Xeon remains ambiguous. T5 tends to perform quite well at Java benchmarks, but the SPEC numbers are worse than Ivy-EX; SAP SD-2, while being difficult to interpret due to being somewhat multidimensional, shows Ivy-EX running a higher number of users (albeit at a slightly higher response time) than T5. TPC-H is similar - a quad-socket Ivy-EX has a higher throughput than T5-4, but at a higher load time (but T5 is equipped with a far larger storage:database ratio.) I expect P8 to significantly outperform T5 across the board, but as always, judge by the benchmark closest to your workload.
smj wrote: I'm sure P8 will see great adoption. Having largely ignored POWER since 1999, what's the virtualization scene like? (Not that it was such a strong topic then, but I use it a lot now and am wondering about those whitebox servers...)


PowerVM is pretty good stuff, although it's probably overkill for a lot of loads (it's a big serious enterprise class virtualization system.) There's also PowerKVM, which is exactly what it says on the tin, and seems to be what IBM is pushing for most Linux workloads.
Evidently, this strange-looking machine is what a Google Power8 board looks like.

And this little cutie is the Tyan variety.
robespierre wrote: The most exciting new architectural feature is transactional memory.


It's good to have, certainly. I was happy when Haswell and z got it last year.

That being said, my favorite features (more microarchitectural than architectural) are that this is an 8-issue processor (the first non-VLIW processor that can sustain 8-issue execution, afaik!) and that it has full 8-way SMT (which is also, I believe, a first.) It's just an incredibly ambitious core.

In other news, we have benchmarks! P8 comes out looking quite good, especially in contrast to SPARC - given that SPARC's published list prices tend to be fairly steep compared to Intel and IBM for entry systems.

SPECint2006_rate
This is a multithreaded throughput integer benchmark. Results are listed as base/result, where result is usually considered the more important number. The IBM SPEC numbers are from the Power Systems Performance Report, and will likely appear on spec.org in the coming days.

  • Power8, 24-core, 2-socket, 3.5GHz: 1280/1750
  • Xeon E7-2890 v2, 30-core, 2-socket, 2.8GHz: 1170/1200
  • Fujitsu SPARC64 X+, 64-core, 4-socket (Fujitsu has not published a 2-socket X+ result), 3.7GHz: 1780/2090
  • Oracle T5, 16-core, 1-socket (Oracle has not published a 2-socket T5 result), 3.6GHz: 441/489

SPECfp2006_rate

This is a multithreaded throughput floating-point benchmark.

  • Power8, 24-core, 2-socket, 3.5GHz: 1180/1370
  • Xeon E7-2890 v2, 30-core, 2-socket, 2.8GHz: 836/857
  • Fujitsu SPARC64 X+, 64-core, 4-socket, 3.7GHz (see disclaimer above): 1650/1830
  • Oracle T5, 16-core, 1-socket, 3.6GHz (see disclaimer above): 350/369
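Since the socket and core counts differ between these entries, it can help to normalize the rate results per core. The Python sketch below does only that, using the numbers listed above; the dictionary layout and the `per_core` helper are made up for illustration, and per-core throughput is just one lens (it says nothing about price or power).

```python
# (cores, base, result) copied from the integer and floating-point lists above.
INT_RATE = {
    "Power8 (2-socket)": (24, 1280, 1750),
    "Xeon E7-2890 v2 (2-socket)": (30, 1170, 1200),
    "Fujitsu SPARC64 X+ (4-socket)": (64, 1780, 2090),
    "Oracle T5 (1-socket)": (16, 441, 489),
}
FP_RATE = {
    "Power8 (2-socket)": (24, 1180, 1370),
    "Xeon E7-2890 v2 (2-socket)": (30, 836, 857),
    "Fujitsu SPARC64 X+ (4-socket)": (64, 1650, 1830),
    "Oracle T5 (1-socket)": (16, 350, 369),
}


def per_core(results):
    """Map each system to (base per core, result per core)."""
    return {
        name: (base / cores, result / cores)
        for name, (cores, base, result) in results.items()
    }


if __name__ == "__main__":
    for label, table in (("int", INT_RATE), ("fp", FP_RATE)):
        print(f"rate per core ({label}):")
        for name, (base, result) in per_core(table).items():
            print(f"  {name}: {base:.1f} / {result:.1f}")
```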

SAP SD-2

This is a benchmark for SAP's sales and distribution application, and is generally a pretty good reflection of large database workloads. P8 does especially well here; I suspect this is due to its large cache and wide SMT.


Oracle and Fujitsu have not published recent SPARC SAP SD-2 benchmarks at comparable socket counts. These are the closest we can do:


I do not view those as particularly impressive numbers at this point, compared to either P8 or recent Xeons.

Conclusions

P8 delivers. Xeon has become vastly more credible in the last five years. SPARC is still lagging, although the situation isn't as bad as it was two or three years ago.
ClassicHasClass wrote: I really hope Tyan makes a POWER8 box so I can upgrade to that instead of waging war against another IBM box the next time this happens.


They will - I just wouldn't expect it to run AIX.

Given my experiences with AIX, I'd view that as a net positive - but opinions differ on that point. :P

(iSeries, on the other hand, is a legitimate loss.)
Cory5412 wrote:
Trippynet wrote: They wanted to create a unified and touch-based interface that they would force onto Windows users to make them get used to it.


It'll be interesting to see what Microsoft does with subsequent releases of Windows. 8.1 Update 1 brought some more "desktop-like" functionality to the New Interface ("Metro," though Microsoft isn't allowed to call it that) and either Windows 9, Windows 8.2 or Windows 8.1 Update 2 (whatever it gets called) is already slated to bring back the start menu, and they're making it worse (from my perspective, at least) by introducing windowed New Interface applications.

That and the start menu as you knew it in Windows 7 is literally never coming back. What's going to be in Windows 9 is a small rectangle that shows up at the bottom of the screen and shows Start screen tiles. You're not getting the Control Panel link back (though you can add control panel, run, et al as links on the Start Screen) and I'm going to lose my giant 1920x1200 launcher that shows every program my computer has in a single go.


What exactly do you mean by that? Microsoft has already demonstrated a start menu (not just a miniature start screen) with a striking resemblance to the legacy start menu, including the control panel being there.
Cory5412 wrote: I don't really know what else is out there in terms of "open" platforms that aren't Intel. Is there like, an ATX board that will accommodate an Itanium processor hanging around, or something?


Essentially any modern CPU family (with the possible exception of SPARC, which is rather insular...) has ATX boards if you look hard enough. Intel does the "Crater Lake" SDV boards for Itanium, Broadcom has a series of ATX boards for their high-performance MIPS processors, Applied Micro has an eval board for X-Gene, etc. That being said, they are all likely to be more expensive than a generic off-the-shelf x86 board; yay for economies of scale.

ClassicHasClass, you might take a look at the Freescale stuff. Definitely a different kind of PPC from IBM's, but not necessarily in a bad way - the T4240, despite being strangely architected, is a solid performer. With that you're looking at a 64-bit 12-core (or 24-core, depending on how you measure it) CPU around 1.8GHz, with VMX, multiple on-chip 10gig MACs, PCIe, and DDR3 - in a power footprint of something like thirty watts. i think there are some boards for the T4240 on Avnet. (By the way, this is the CPU that IBM's prototype microservers are using.)

LSI also does "high-performance" PPC, although not nearly as high-end as Freescale does - they use the PPC476 core, which is a superb core if you're looking for low-power and don't need 64-bit support or VMX. i suspect the 476 stuff would be a little underpowered for you though.
The Tyan board is now seeding to OpenPower members, which implies it's getting close to ready for larger release.

Additionally, the firmware for the P8 machines has been released as open source.

Both of these things seem to me like steps in the right direction in making Power less of a monolithic IBM-dominated platform.
TeamBlackFox wrote: Not getting a PowerPC Mac. As I sold the last of my Apple kit to ClassicHasClass I'm sworn off Apple indefinitely. I need something newer than that, that runs cooler, and is of better quality.

I always have my SparcStation 20, but hell if I'm running that for my main workstation - Mima is faster than that slug.

Classic, I do see your point. I'll have to look at SPARC or ARM gear then, and if I can't find anything there, I guess fuck it, I won't have a modern workstation and I'll stick to SGIs.

Keep in mind folks I do work for Dell, and I have to help maintain about 40k servers in a data center outside of DC. I replace more CPUs and motherboards next to hard disk drives than any other parts. These are Xeon and Opteron servers so they're supposed to be the best of the best, but I'll tell you that the build quality isn't any better - it takes a 6-man crew, including myself, to barely keep the 99% uptime SLA we have with our customer and it's all because Dell/Intel/AMD send us so much bum hardware it's not even funny. Out of an order of 10 motherboards, we have roughly 1-2 in that batch that will not work. CPUs tend to not be DOA, but I swear we burn through CPUs as our third or fourth most common part to replace.


Modern Oracle SPARC kit is expensive and underpowered (generally worse than midrange to high-end Xeons.) The situation with the pre-Niagara stuff was worse - the Ultra 45 (last of the SPARC workstations) was substantially worse on performance than the contemporary Opteron-based Sun Ultra 40.

Nvidia provides some degree of GPU drivers now for Linux/PPC if you ask nicely. It's likely for little-endian only, which means Ubuntu and SLES (for whatever reason, RHEL 7 is still big-endian.) Power7 machines are decent on power consumption and can be had (entry-level) for about US$2k for second-hand blades - assuming you have a chassis. Rack machines are, and will continue being, more expensive - often more so than the chassis and blades combined. i don't understand why, but there it is.

For a desktop, x86 is likely still the best option available - it has a large existing base of software and driver support, and performance is excellent.
TeamBlackFox wrote: x86 is obsolete


Based on what? Relative to what?
Oh boy.

TeamBlackFox wrote: It's been obsolete since the 386 - its dedication to backwards compatibility has held innovation in the software side of things back, and it is still designed around the same architecture which points to days where memory was expensive, and having everything done on the CPU was ideal.


Like what features?

Look around today, it's the only CISC architecture still actively developed, unless you somehow count Itanium, and that isn't nicknamed Itanic for anything other than its massive failure in the marketplace.


The various mainframe architectures will be surprised to hear this - as will the Renesas RX, the 8051, etc. Additionally, Itanium was (if anything) hyper-RISC - rigidly fixed-length, rigidly load-store, and in-order, with a lot of internal machine state exposed. The fact that you think it was CISC speaks volumes about your familiarity with the platform. And by the way, Itanium was outselling SPARC for quite a while. "Massive failure" indeed.

ARM, MIPS and other RISC architectures have long overtaken the mobile market, and even today, if you compare the same class of Intel CPU to ARM and MIPS equivalents, the x86 produces about the same amount of output within a margin of 5%, but it consumes 30-40% more power and costs significantly more.


ARM, MIPS, et al have absorbed their share of CISC-isms and are no longer RISC in the original sense of the word; for instance, both ARM and MIPS have variable-length extensions (Thumb/Thumb2 and MIPS16 respectively.) x86 being less efficient is essentially a myth at this point - see this for an example of in-depth analysis with a proper test bench.

Plus, x86-native architectures all seem to blow don't they?


And yet a 2.8GHz, 15-core Xeon outperforms[1][2] a 16-core 3.7GHz SPARC64 - quite likely at substantially lower power, given that SPARC64 chips have historically been hanging out up around 200W. Funny, that.

Now let's talk about obsolescence. Oracle SPARC has no SIMD past VIS (64-bit SIMD - MMX-class.) It has an encoding that limits it to 32 GPRs without extremely nasty tricks like those introduced by Fujitsu in HPC-ACE. It has register windows, which significantly increase the complexity of modern register rename in out-of-order designs. And you're calling x86 obsolete?

The moral of the story here is that ISA is basically irrelevant unless it exposes interesting microarchitectural functionality - like explicit support for multithreading, speculation, or instruction parallelism. ISA is like what style of columns you put on the porch of your house - it may influence the way the house is built, but ultimately it's not what makes the house solid or not.

[1] http://spec.org/cpu2006/results/res2014 ... 29190.html vs http://spec.org/cpu2006/results/res2014 ... 28687.html
[2] http://www.tpc.org/tpch/results/tpch_re ... =114041601 vs http://www.tpc.org/tpch/results/tpch_re ... =114020603
TeamBlackFox wrote: You went over my head there to be honest - I stay out of low level talk because I honestly was never formally educated in it.

You're correct on most of your points - I see that I should have qualified my statements to narrow my scope:

x86_64 is the only CISC architecture in use and actively developed within the realm of the consumer market that I can actually reach with my relatively modest income. 68k has been halted since the mid 1990s, VAX is no longer developed etc.

I'm not even going for SPARC specifically either, but I do want to make the point that I've had a total of 8 x86 based computers, and all but two failed within a year. One of them that survived was a Pentium 4, and while it survived, it struggled with everything I threw at it more or less. The other was a MacBook Pro Retina, which kept kernel panicking and because BSD didn't run stable on it, I sold it off for $1000, less than 2 years into ownership. All other machines I owned under the x86 ISA failed within 6 months of continued, 24/7 usage in one way or another, including one in a rack fire that also consumed two XServe G4s I had in the cage and *nearly* consumed some hardware I had rented.

After this desktop I had melted itself and warped the motherboard, I'm fucking done. Never had a SINGLE failure from a RISC based device of any kind. That's why I started this thread, to see if a POWER server was viable. Then I was looking at the late model Sun Ultras and the Sun Blades, but now I'm just considering screwing it all and staying within the scope of retro RISC and 68k computers.


Assuming the issue is related to the ISA of the processors is an interesting conclusion.

Do you still maintain that x86 is obsolete, by the way? And have you found any evidence to support it yet?
ClassicHasClass wrote: I despise the x86 ISA as well, but Intel has had a lot of money and time to invest in making it run well despite its warts, and in fairness to Intel they've tried to kill it at least three times (iAPX432, i860/960, Itanium) and the market wouldn't let them. So I can't really blame them anymore though I used to.


Offtopic historical note: 860 and 960 were completely distinct designs. 960 was a combination of a Berkeley RISC with some concepts from the iAPX 432 - notably, in the beginning, tagged memory; on the other hand, 860 was an odd proto-VLIW chip with no relation whatsoever to the 960 and an emphasis on HPC and graphics.
ClassicHasClass wrote: Like POWER, I have a soft spot for PA-RISC because my first job was on a K250 and later I did contract work with a C3750. It's a very clean RISC and HP at least crammed incredible amounts of L2 in it, something Apple could have learned from with their criminally undercached designs. I came to hate HP-sUX and yet become fluent in it. Now I have two HP-sUX machines (a 9000/350 in the huge tank-like minirack and the C8000), plus a "425t" that I need to figure out if it even still works and what the hell is in it.


PA's cache hierarchy is awesome - however, it's because of L1, not L2. As far as i know, no PA-8000-series chip has a true L2 cache. The 8800/8900 have external (off-chip) DDR DRAM L2 caches with on-chip tags, but performance is dismal; they're neither immensely high-bandwidth nor low-latency (over 40 cycles LTU - and higher for back-to-back access!).

That being said: PA-8500 and up have a superb large on-chip L1, with fairly low load-to-use latency. CPU people used to call PA-8xxx series chips "SRAM blocks that happen to have cores attached", which is basically true, and it's continued in the PA-derived Itanium family, which all have huge gobs of SRAM on-die (something like 75% of the Itanium 9300's transistor count is L3 cache!)

Nothing like cheap, low-latency loads to make a workload fly. :D
ClassicHasClass wrote: I guess it boils down to how you view it, but I'd still call that an L2 cache even if it's not as good as it could be. HP certainly did in all the spec sheets and it serves the same role.

EPIC/VLIW certainly needs huge cache for those instruction word sizes. :D


Yes, IPF is sensitive to cache - although not inherently because it's EPIC, but because it has an insane 41-bit op size plus a 5-bit template field for every three ops. This is the downside of having a massive number of GPR's, plus predication. :( For what it's worth, though, Itanium has some density-friendly things too - for instance, not all instructions in a bundle have to issue concurrently, which reduces the need for NOP padding. In general, Itanium's cache is still a work of art (single cycle load-to-use from L1!) and performs magnificently.

Tilera's ISA is VLIW and puts either 2 or 3 ops (depending on the exact op type - some can only be issued as part of a 2-op bundle) into a 64-bit instruction word. As a result, its density is actually comparable to or superior to commercial RISCs.
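The encoding arithmetic behind those two paragraphs is easy to sketch in Python. The Itanium and Tilera figures come straight from the text above; the 32-bit fixed-length RISC row is an assumption added purely as a comparison point, and the whole thing ignores NOP padding and the fact that ops on different ISAs don't do equal amounts of work.

```python
from fractions import Fraction


def bits_per_op(word_bits, ops_per_word):
    """Encoding cost of one op, as an exact fraction of bits."""
    return Fraction(word_bits, ops_per_word)


# Three 41-bit ops plus one 5-bit template field per Itanium bundle.
ITANIUM_BUNDLE_BITS = 3 * 41 + 5

ENCODINGS = {
    "Itanium bundle, 3 ops": bits_per_op(ITANIUM_BUNDLE_BITS, 3),
    "Tilera word, 3 ops": bits_per_op(64, 3),
    "Tilera word, 2 ops": bits_per_op(64, 2),
    # Assumption for comparison only: a classic fixed 32-bit RISC encoding.
    "32-bit fixed-length RISC": bits_per_op(32, 1),
}

if __name__ == "__main__":
    print("Itanium bundle size:", ITANIUM_BUNDLE_BITS, "bits")
    for name, bits in ENCODINGS.items():
        print(f"{name}: {float(bits):.2f} bits per op")
```

With full bundles, Tilera lands between about 21 and 32 bits per op, while Itanium sits a bit under 43, which is where the extra cache pressure comes from.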

Going to stop derailing this hilarious thread rsn...
You could always get a zx2000 or zx6000 - they're pretty fast (by Old RISC Crap standards) and have good OS support.

Be advised, zx6000's are pretty loud.
wenp wrote: OpenPOWER looks interesting, but if you don't have AIX, what does it have over a new Xeon? I'm genuinely curious about the details.


viewtopic.php?p=7368605#p7368605 sums it up with benchmarks.

The core points:
  • Power8 is really fast
  • Having an 8-issue pipeline (vs 4-issue on current x86 cores) probably helps
  • Being able to issue from 8 threads on a single cycle (vs 2 threads on the vast majority of anything else, including Intel and SPARC) probably does too
  • i also suspect the 96MB L3 cache doesn't hurt

It's also a licensable core, which to the best of my knowledge current Xeon big-core IP is not. If you have the means (and plenty of companies do), you can slap it on a die with your own peripherals and accelerators (think encryption, on-chip networking MAC, compression, workload-specific fixed-function blocks...)
wenp wrote:
Kira wrote: Power8 is really fast

Just on speed? I was sort of expecting that hardware support for virtualization would be a factor. Is PowerVM a big advantage, or can you approach similar performance with Xeon?


There's no longer any magical virtualization advantage to Power that i'm aware of. There might be some, but Xeon virtualization performance is already exceptionally low overhead; it's had support for virtualization in hardware for some years now. However, extremely aggressive multithreading helps virtualization workloads significantly. Faster processors in general directly result in faster virtualization performance, and Power8 is definitely fast.

Additionally, PowerVM is going extinct for Linux-only workloads on PPC in favor of KVM.
The AIX/IPF stuff actually got further than commonly believed - it did ship in an RTM form, although only on an RFQ basis and with limited support ("Early Adopters' Release.") This was effectively equivalent to shipping AIX for PPC - it wasn't a beta (although there had been three betas prior to it.)

There were two major prongs to Monterey later on - UnixWare/ptx, for x86, and AIX/IPF, for Itanium. UnixWare/ptx was a project that added enhancements to UnixWare with Sequent's SCI (think Numalink) clustering technology, but it died before shipping; Sequent's scalable clustering tech later ended up in IBM Itanium systems running Linux. AIX/IPF shipped a small number of units; it included some fancy stuff, like the new unified driver model (UDI) that was supposed to take off for all UNIXes and reduce the need for significant driver rewrites when porting across UNIX iterations.

Prior to Monterey, there were some other fancy joint Itanium projects - SCO and HP worked on a system called Summit which was supposed to be one UNIX for all common hardware, including Alpha, MIPS, and the forthcoming HP WideWord, which ended up shipping as Itanium. Sequent and Compaq also collaborated for a while on an Itanium project called Bravo, which would have been built on a Tru64 base with Dynix/ptx components. Sequent ended up joining the Monterey alliance, while Compaq proceeded for a while with a straight port of Tru64, which was eventually cancelled.

There were also some other Itanium ports cancelled before release. Sun had a fairly complete Solaris port running in the lab, and continued to ponder the idea through at least 2004. Novell had an operating system called Modesto, which was a hypervisor-based OS that could run Netware 6 instances in isolated VMs, as well as isolating system services and applications; think IBM VM as the closest comparison point. Additionally, Fujitsu-Siemens planned to port its BS2000 mainframe operating system to Itanium, but it seems to have been cancelled at an early stage of development.
Oracle announced their next-generation SPARC processor - M7 - a few weeks ago, and it's fascinating! Here's the rundown.

  • It has 32 cores, running at a clock speed higher than 3.6GHz. The cores are a minor tweak to the existing S3 core; i don't really expect them to be super-competitive on core performance alone, as the core is still dual-issue and therefore can only issue from two threads in a given cycle.
  • It has a new cache hierarchy, involving shared L2 among blocks of cores. The 256KB L2I is shared among groups of four cores, while the 256KB L2D is shared among pairs. This can mean an effective 50% capacity increase and additional flexibility over the existing 128KB unified L2 per core.
  • It has a 64MB partitioned L3, up from 48MB in M6. Based on the quoted transistor count of approximately ten billion transistors, i believe this is SRAM rather than eDRAM.
  • It has 8-channel DDR4, for 160GB/s peak memory bandwidth. This is good.
  • It includes metadata (which can be used for data integrity) in unused bits of virtual addresses.
  • Finally, the cool part - it includes accelerators for SQL as core blocks on the processor itself, with a coherent address space with the SPARC cores themselves.
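
As a rough check on the 50% figure in the L2 bullet above, here is a back-of-envelope sketch. It uses only the sizes quoted in the list; treating a group of four cores as the unit of comparison is an assumption made for the arithmetic, and the function names are purely illustrative.

```python
# Back-of-envelope L2 capacity per group of four cores, in KB.
# Sizes are the ones quoted in the list above; comparing per
# four-core group is an assumption made for illustration.

CORES_PER_GROUP = 4

def old_l2_kb(cores=CORES_PER_GROUP, unified_kb_per_core=128):
    """Existing design: one private 128KB unified L2 per core."""
    return cores * unified_kb_per_core

def new_l2_kb(cores=CORES_PER_GROUP, l2i_kb=256, l2d_kb=256):
    """M7 as described: one L2I per four cores, one L2D per pair of cores."""
    l2i_total = (cores // 4) * l2i_kb
    l2d_total = (cores // 2) * l2d_kb
    return l2i_total + l2d_total

old = old_l2_kb()
new = new_l2_kb()
increase_pct = 100 * (new - old) / old
print(f"old: {old}KB, new: {new}KB, increase: {increase_pct:.0f}%")
```

Four cores go from 512KB of private unified L2 to 768KB of shared L2I plus L2D, which is where the 50% comes from. How much of that a given core can actually use depends on what its neighbours are doing.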

It looks like a really good chip - if it can be priced competitively. We're seeing what looks like a smaller emphasis on trying to keep up on core performance, and a larger focus on workload specialization for databases; i look forward to seeing benchmarks once it ships.
Depends on the application. For core compute performance, i anticipate Broadwell-EP and possibly Haswell-EX will attain comparable or higher SPEC numbers at a vastly lower price. On the other hand, the wildcard here is the database acceleration engines. Oracle only provided us one comparison point in the presentation: one of thirty-two database query pipelines is approximately an order of magnitude faster than a T5 core running a single thread that decompresses the bit-packed internal format of the Oracle database.

This is a flawed comparison in a number of ways. First off, the S3 core used in the T5 (and, in a slightly modified form, the M7) is inherently weak at single-thread performance. The whole idea is to get high utilization of a relatively weak (in peak instructions per cycle) core. Running a single thread as a comparison point doesn't tell us much. Second, we don't know the characteristics of the actual workload - the slide is ambiguous on whether it's comparing the decompression alone, or the WHERE-clause predicate evaluation, which the query engines are also capable of doing. There's also no word on latency of setup and transfer to a query engine.
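
To illustrate why the single-thread baseline matters, here is a toy model. Only the roughly 10x figure comes from the slide; the eight threads per core and the scaling-efficiency values are assumptions for illustration, not measurements, and the model ignores setup and transfer latency entirely.

```python
def pipeline_vs_full_core(single_thread_speedup, threads_per_core, scaling_efficiency):
    """Speedup of one query pipeline over a fully loaded core.

    Toy model: a core running all of its threads delivers
    threads_per_core * scaling_efficiency times its single-thread throughput.
    """
    core_throughput = threads_per_core * scaling_efficiency
    return single_thread_speedup / core_throughput

# 10.0 is the approximate figure from the slide; 8 threads per core and
# the efficiency values are assumed for illustration only.
for eff in (0.25, 0.5, 1.0):
    ratio = pipeline_vs_full_core(10.0, 8, eff)
    print(f"efficiency {eff:.2f}: pipeline is {ratio:.2f}x a fully loaded core")
```

Under this model the headline 10x shrinks to somewhere between 1.25x and 5x depending on how well the core scales with threads, which is exactly the information the slide doesn't give.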

Overall i would say that there's a lot of cool stuff here but it's hard to anticipate how much it will actually affect a normal transaction-processing workload running Oracle. Oracle will no doubt publish SAP SD-2 and TPC-H results for M7 when it's released; we'll get a clearer picture then of what the new features do for Oracle databases. i would not be at all surprised if M7 is significantly faster than contemporary Xeon on a per-socket basis for Oracle workloads - but it also needs to be priced competitively.
TeamBlackFox wrote: So in a nutshell it has specialised instructions and a core for database usage, and if these can be utilised, it will do better at the limited market of running OracleDB software? I know that's a gross oversimplification but I'm not an electrical engineer so I can't say I really am well acquainted with the industry terminology.


That's roughly accurate, yes.
wenp wrote:
Alver wrote: I honestly love UX, and still use it at home for a desktop

As I noted, a big part of Nemeth's affection was the reliability of the hardware and how well the OS supported it. By a wide margin, I hear more praise of the reliability of HP hardware than of any other Unix boxes, but that's mostly from the PA-RISC days. Are things still as good with Itanium boxes?


In my experience, yes - especially with high-end machines.
XVR-1000 is based on Sun's MAJC VLIW processor. i have a vague suspicion that it was put in graphics boards after it failed to be the game-changing CPU that was originally expected.

Performance was generally fairly underwhelming compared to other graphics options, outside of specific cases, although anti-aliasing performance was very high and the 1000 and 4000 definitely had gobs of RAM.
Oh, boy.

You may be thinking “600MHz CPU?” - but that CPU is totally badass because it is actually a MIPS CPU with a 5 stage core pipeline. Compared to a 20+ stage pipeline in Intel CPUs, it is roughly comparable to a 2.4GHz Intel CPU with just the pipeline alone.


Well, jeez. All these embedded cores with three-stage pipelines must be absolutely amazing, then. Also, the last sentence is meaningless; I guarantee that a modern Intel core, even at clock speeds well below 2.4GHz, utterly spanks a 600MHz R16k. Pipeline length is not some sort of magical proxy for clock-normalized performance. There isn't one. That's even dumber than when people try to say all cores of equal sustained-issue width have the same clock-normalized perf...

Short pipelines are great and wonderful until you have to hit a less conservative clock. We saw the limits of clock-conservative brainiac designs with IPF, too.
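
The 2.4GHz figure in the quote appears to come from multiplying 600MHz by the ratio of pipeline depths (20 / 5 = 4). A minimal sketch of why that doesn't work, using the standard runtime relation (time = instructions / (IPC x clock)); the instruction count and IPC values below are made up purely for illustration and are not measurements of any real core.

```python
def depth_scaled_clock_mhz(clock_mhz, own_depth, other_depth):
    """The quoted article's apparent arithmetic: scale clock by the depth ratio."""
    return clock_mhz * other_depth / own_depth

def runtime_s(instructions, ipc, clock_hz):
    """Standard relation: runtime depends on sustained IPC and clock.
    Pipeline depth does not appear as a term."""
    return instructions / (ipc * clock_hz)

print(depth_scaled_clock_mhz(600, 5, 20))  # 2400.0

N = 1_000_000_000  # hypothetical instruction count
# Two hypothetical cores with the same 5-stage depth and the same clock,
# differing only in sustained IPC (values invented for illustration).
slow = runtime_s(N, ipc=0.5, clock_hz=600e6)
fast = runtime_s(N, ipc=2.0, clock_hz=600e6)
print(f"{slow:.2f}s vs {fast:.2f}s")
```

Same depth, same clock, 4x difference in runtime. Depth influences what clock a design can reach and how much a mispredict costs, but it tells you nothing by itself about sustained IPC.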

Add in the fact that the MIPS CPU architecture doesn’t have any baggage in its high-performance pipeline, and the end result is that it performs far faster than that!


You mean baggage like SPRs for multiply-divide, delay slots, and a fixed-length encoding whose merits were already debatable by 1995?

Let me just say this: this SGI Fuel runs Blender much FASTER than my new Alienware laptop with its Intel Core i7 CPU!


Amazing, then, that an 8-processor 600MHz R16k system gets schooled by a single-processor, multiple-generations-old Intel quad-core in Blender tests run by reputable members of this community - http://www.sgidepot.co.uk/perfcomp_RENDER1_blender.html . I call complete bullshit.