Nekonomicon - R5000 O2

sybrfreq
Old Salt
Who joined Aug. 21, 2007, 10:12 p.m.
and authored 1647 notes

Wrote the following at June 16, 2008, 9:21 p.m...

Code:

  mark@machop:~$ hinv -vm
  CPU: MIPS R5000 Processor Chip Revision: 2.1
  FPU: MIPS R5000 Floating Point Coprocessor Revision: 1.0
  1 180 MHZ IP32 Processor
  Main memory size: 128 Mbytes
  Secondary unified instruction/data cache size: 512 Kbytes on Processor 0
  Instruction cache size: 32 Kbytes
  Data cache size: 32 Kbytes
  FLASH PROM version 4.16
  Integral SCSI controller 0: Version ADAPTEC 7880
  Disk drive: unit 1 on SCSI controller 0 (unit 1)
  Disk drive: unit 2 on SCSI controller 0 (unit 2)
  CDROM: unit 4 on SCSI controller 0
  Integral SCSI controller 1: Version ADAPTEC 7880
  On-board serial ports: tty1
  On-board serial ports: tty2
  On-board EPP/ECP parallel port
  CRM graphics installed
  Integral Ethernet: ec0, version 1
  Iris Audio Processor: version A3 revision 0
  PCI Adapter ID (vendor 36868, device 32888) pci slot 1
  PCI Adapter ID (vendor 36868, device 32888) pci slot 2
  Video: MVP unit 0 version 1.4
  with no AV Card or Camera.
  Vice: TRE

Code:

  mark@machop:~$ /usr/gfx/gfxinfo
  Graphics board 0 is "CRM" graphics.
  Managed (":0.0") 1280x1024
  32 bitplanes
  board revision 2, CRM revision C, GBE revision B
  Monitor 0 type: Unknown
  Channel 0:
  Origin = (0,0)
  Video Output: 1280 pixels, 1024 lines, 59.94Hz (1280x1024_60)

edit: added picture. it might be a pretty slow computer but it is a fabulous bookshelf embellishment:

_________________

(Aldebaran)

(Chaos)

(Machop)
:hp xw9300: (Aggrocrag) :hp dv8000: (Attack)

eMGee
Addict
Who joined Nov. 15, 2008, 2:50 p.m.
and authored 794 notes

Wrote the following at Jan. 3, 2009, 2:25 p.m...

Nice system! I remember, back when I was 15 years old and relatively new to CG, I dreamed about owning one of those. I became so curious that one day, in an adventurous mood, I picked up the telephone and called SGI to ask how much one of their lowest spec'ed O2 systems would cost. I remember the price was extremely high, especially for a 15-year-old kid, being something like ~ƒ15,000 (Dutch guilders; the equivalent of $8,500 or $9,000, depending on the value of the US$ at the time)...

Is it true that the R5K O2s somehow perform better than some R10Ks and R12Ks? That's what I heard from some people, something I've picked up here on the forum as well in some threads. By the way, you should try to get an A/V module (if you haven't acquired one already in the meantime). That's undoubtedly one of the most interesting, and unique, features of the O2 among SGI visual workstations...

_________________

kramlq
Addict
Who joined Sept. 20, 2005, 5:10 p.m.
and authored 841 notes

Wrote the following at Jan. 3, 2009, 4:25 p.m...

eMGee wrote:

Is it true that the R5K O2s somehow perform better than some R10Ks and R12Ks? That's what I heard from some people, something I've picked up here on the forum as well in some threads. By the way, you should try to get an A/V module (if you haven't acquired one already in the meantime). That's undoubtedly one of the most interesting, and unique, features of the O2 among SGI visual workstations...

The 300MHz R5k CPU is beaten by a 195MHz R10k in most tasks. The R10k has a lower clock rate but the internal architecture is more advanced, so it typically has the edge.

On the other hand, the 350MHz RM7000 (the last official SGI CPU available for an R5k style O2) seems to have had architectural improvements. It is faster than many CPUs in the R10k/R12k series. I found it to be roughly equivalent to my 300MHz R12k O2 for some tasks. It really depends on what test is being run - the RM7k might be better in some integer tasks, but R12k is usually better in any floating point tasks. Note that the RM7k has a three level cache hierarchy, whereas the R12k 300 only has two levels, so working data set size can also make a difference.

The R12k 400MHz is the fastest official CPU, and it also has a 2MB L2 cache, so that is faster than any official CPU in the R5k series.

My experiences are only from running applications/compilers etc. on the machine. I never ran benchmarks. Ian Mapleson has, but as you will see, results are often task dependent. And just to be clear, this only applies to the above CPUs in an O2 - the same CPU in an Octane will be much better, as Octane has a better architecture.

sybrfreq
Old Salt
Who joined Aug. 21, 2007, 10:12 p.m.
and authored 1647 notes

Wrote the following at Jan. 3, 2009, 5:30 p.m...

eMGee wrote:

Nice system! I remember, back when I was 15 years old and relatively new to CG, I dreamed about owning one of those. I became so curious that one day, in an adventurous mood, I picked up the telephone and called SGI to ask how much one of their lowest spec'ed O2 systems would cost. I remember the price was extremely high, especially for a 15-year-old kid, being something like ~ƒ15,000 (Dutch guilders; the equivalent of $8,500 or $9,000, depending on the value of the US$ at the time)...

Is it true that the R5K O2s somehow perform better than some R10Ks and R12Ks? That's what I heard from some people, something I've picked up here on the forum as well in some threads. By the way, you should try to get an A/V module (if you haven't acquired one already in the meantime). That's undoubtedly one of the most interesting, and unique, features of the O2 among SGI visual workstations...

Haha- I wish! This is among the slowest O2s out there- all it has going for it is that they are bulletproof, lightweight and easy to carry around. Looks good too!


mapesdhs
Old Salt
Who joined Nov. 10, 2003, 5:17 p.m.
and authored 1685 notes

Wrote the following at Jan. 19, 2009, 7:19 a.m...

kramlq writes:
> On the other hand, the 350MHz RM7000 (the last official SGI CPU available for an R5k style O2) seems to have had architectural improvements.
> It is faster than many CPUs in the R10k/R12k series. I found it to be roughly equivalent to my 300MHz R12k O2 for some tasks. It really depends
> on what test is being run - the RM7k might be better in some integer tasks, but R12k is usually better in any floating point tasks. ...

That's very true. For the Alias test I ran (complex scene), the R7K/350 was beaten by the R10K/250 in O2, whereas for the Maya test (simple scene)
the R7K/350 beats the R12K/270, though probably not as good as R12K/300. For the C-ray test, even an R10K/195 beat the R7K/350. By contrast,
for code compilation the R7K/350 beats an R12K/300. I've not done the relevant GIMP tests yet, but they will probably be around the R12K/270
to R12K/300 range.

> The R12k 400MHz is the fastest official CPU, and it also has a 2MB L2 cache, so that is faster than any official CPU in the R5k series.

Indeed, and it also beats the R7K/600 in many cases, but again it varies. For code compilation, the R7K/600 is faster than R12K/400 when
using GCC, but the R12K is faster when using MIPS Pro. Can't comment much beyond this, not enough data yet.

If anyone can lend me an R7K/600 module then I'll run all the tests. Still a fair few holes in my results tables atm.

> My experiences are only from running applications/compilers etc on the machine. ...

You're lucky, code compilation is one area where the R7K does well in O2.

> ... I never ran benchmarks. Ian Mapleson has, ...

That's putting it mildly.

Way too many late nights... zzz...

> ... And just to be clear, this only applies to the above CPUs in an O2 - the same CPU in an Octane will be much better, as Octane has a
> better architecture.

Absolutely, any R12K in Octane will leave O2 in the dust. For code compilation, an R10K/250 Octane beats an R7K/600 O2 with MIPS Pro,
while for GCC an Octane R12K/360 beats an O2 R7K/600. There's a huge cost difference for these configs, Octane is much cheaper.
However, as has often been said, O2 is low-power, low-noise, compact, etc.

Full details here:

http://www.sgidepot.co.uk/perfcomp.html

Ian.

_________________
SGI Systems/Parts/Spares/Upgrades For Sale: http://www.sgidepot.co.uk/sgidepot/

jan-jaap
Old Salt
Who joined June 17, 2004, 11:35 a.m.
and authored 2688 notes

Wrote the following at Jan. 19, 2009, 7:30 a.m...

mapesdhs wrote:

Indeed, and it also beats the R7K/600 in many cases, but again it varies. For code compilation, the R7K/600 is faster than R12K/400 when using GCC, but the R12K is faster when using MIPS Pro. Can't comment much beyond this, not enough data yet.

Easy. No released GCC version knows how to schedule instructions for R10k class CPUs. This will change with GCC 4.4, btw. So, GCC compiled code is optimized for R4x00 class CPUs and the R5k/R7k are architecturally closer to an R4k than an R10k or R12k is.

When you use MIPSpro, I assume you build optimized binaries for each CPU?

_________________
Now this is a deep dark secret, so everybody keep it quiet

It turns out that when reset, the WD33C93 defaults to a SCSI ID of 0, and it was simpler to leave it that way... -- Dave Olson, in comp.sys.sgi
Currently in commercial service:

(2x)

In the museum: almost every MIPS/IRIX system.

mapesdhs
Old Salt
Who joined Nov. 10, 2003, 5:17 p.m.
and authored 1685 notes

Wrote the following at Jan. 19, 2009, 8:06 a.m...

jan-jaap wrote:

When you use MIPSpro, I assume you build optimized binaries for each CPU?

Nope, I just run the Makefile as-is, since that's a more realistic test of what an ordinary user would experience. Besides, I'm not interested
in getting an optimised binary of the compiled program, just how long the compilation takes.

Ian.

mattst88
Enthusiast
Who joined July 13, 2005, 9:54 a.m.
and authored 219 notes

Wrote the following at Jan. 20, 2009, 9:46 a.m...

mapesdhs wrote:

jan-jaap wrote:

When you use MIPSpro, I assume you build optimized binaries for each CPU?

Nope, I just run the Makefile as-is, since that's a more realistic test of what an ordinary user would experience. Besides, I'm not interested
in getting an optimised binary of the compiled program, just how long the compilation takes.

Ian.

I think he meant when you built programs to benchmark, did you compile them with processor-specific optimizations?

_________________
My computers including Alphas, PA-RISCs, and my :O2+:

directedition
Addict
Who joined Feb. 3, 2003, 6:20 p.m.
and authored 515 notes

Wrote the following at Jan. 20, 2009, 9:57 a.m...

mattst88 wrote:

I think he meant when you built programs to benchmark, did you compile them with processor-specific optimizations?

I think there's a disconnect here. The benchmark being measured is compile time. There's no real point in spending time adding processor-specific optimizations when you're only measuring time spent compiling a given application. Hence, "I just run the Makefile as-is".
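
For anyone wanting to reproduce this kind of measurement, here is a minimal sketch of timing a build from the outside. It is written in present-day Python and is not the method used for the results discussed in this thread; the helper name `time_command` is made up for illustration.

```python
import subprocess
import sys
import time


def time_command(cmd, cwd=None):
    """Run cmd once and return (elapsed_seconds, exit_code).

    Only wall-clock time of the command itself is measured; the program
    it produces is never executed, which matches a compile-time benchmark.
    """
    start = time.perf_counter()
    result = subprocess.run(cmd, cwd=cwd,
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return time.perf_counter() - start, result.returncode


# Hypothetical usage, run from a source tree with its Makefile left as-is:
#   elapsed, code = time_command(["make"], cwd="/path/to/source")
#   print(f"make took {elapsed:.1f} s (exit code {code})")
```

A fair comparison would start from a clean tree each time and repeat the run a few times, since disk caching can skew a single measurement.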

_________________
- Jim
:Indigo:

<- signed by The Screensavers

(230L) (230L) :540:

<- touchscreen :PI:

mapesdhs
Old Salt
Who joined Nov. 10, 2003, 5:17 p.m.
and authored 1685 notes

Wrote the following at Jan. 20, 2009, 10:49 a.m...

directedition wrote:

I think there's a disconnect here. The benchmark being measured is compile time. There's no real point in spending time adding
processor-specific optimizations when you're only measuring time spent compiling a given application. Hence, "I just run the Makefile as-is".

Correct!

The compiled program is not used for anything. It's how long it takes to compile the program that I'm interested in. For O2, with the
current generation of GCC/SGI compilers, the results show GCC runs better with an R7K/600 compared to R12K/400, whereas for
MIPS Pro the R12K is better. In both cases though, an Octane or Fuel will obviously be even faster. Here's the raw results page
without frames.

Ian.

jan-jaap
Old Salt
Who joined June 17, 2004, 11:35 a.m.
and authored 2688 notes

Wrote the following at Jan. 20, 2009, 1:29 p.m...

mapesdhs wrote:

For O2, with the current generation of GCC/SGI compilers, the results show GCC runs better with an R7K/600 compared to R12K/400, whereas for MIPS Pro the R12K is better

OK, so it's about the execution speed of the compiler itself.

GCC itself is compiled with GCC. GCC schedules instructions for an R4x00. Therefore the GCC binaries themselves don't take advantage of improvements in R10k class processors.

MIPSpro can schedule code for various CPUs, or at least use a model that doesn't work too badly on the various MIPS CPUs out there. I don't know what flags SGI used when they built MIPSpro, but I doubt that they optimized the binaries exclusively for R4x00.

R10000 and up support speculative, out-of-order execution. This is a completely different animal as far as a compiler is concerned. R5000 and R7000 don't do any of this and are architecturally closer to R4x00 class CPUs even if they support the MIPS4 instruction set.


mapesdhs
Old Salt
Who joined Nov. 10, 2003, 5:17 p.m.
and authored 1685 notes

Wrote the following at Jan. 20, 2009, 1:56 p.m...

jan-jaap writes:
> ... GCC schedules instructions for an R4x00. ...

GCC also knows about R5K/R7K, which is why it runs better on such systems.

> ... I don't know what flags SGI used when they built MIPSpro, but I doubt that they optimized the binaries exclusively for R4x00.

You might be surprised. The compiler binaries installed on my Fuel are, as far as I can be bothered to check offhand, all MIPS3.
In fact, /usr/bin/driver is MIPS2.
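
A binary's ISA level can also be read straight from its ELF header, since MIPS ELF files record it in the top bits of the `e_flags` field. The sketch below is only an illustration in present-day Python, not the check that was actually done here; the function name `mips_isa_level` is invented, and only the MIPS1 to MIPS4 values are mapped.

```python
import struct

EM_MIPS = 8                   # e_machine value for MIPS
EF_MIPS_ARCH = 0xF0000000     # mask for the ISA-level field in e_flags
ISA_LEVELS = {
    0x00000000: "MIPS1",
    0x10000000: "MIPS2",
    0x20000000: "MIPS3",
    0x30000000: "MIPS4",
}


def mips_isa_level(header: bytes) -> str:
    """Return the MIPS ISA level recorded in an ELF header (first 64 bytes)."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64bit = header[4] == 2                 # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = ">" if header[5] == 2 else "<"   # EI_DATA: 2 = big-endian
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    if machine != EM_MIPS:
        raise ValueError("not a MIPS binary")
    flags_offset = 48 if is_64bit else 36     # e_flags sits later in ELF64
    (flags,) = struct.unpack_from(endian + "I", header, flags_offset)
    return ISA_LEVELS.get(flags & EF_MIPS_ARCH, "unknown")


# Usage on a file copied off the machine:
#   with open("/usr/bin/driver", "rb") as f:
#       print(mips_isa_level(f.read(64)))
```

Note that this only reports which instruction set the binary was built for. It says nothing about which CPU the code was scheduled for.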

Ian.

jan-jaap
Old Salt
Who joined June 17, 2004, 11:35 a.m.
and authored 2688 notes

Wrote the following at Jan. 20, 2009, 3:11 p.m...

mapesdhs wrote:

jan-jaap writes:
> ... GCC schedules instructions for an R4x00. ...

GCC also knows about R5K/R7K, which is why it runs better on such systems.

GCC can target those CPUs, but the GCC binaries themselves are built without target specific flags (just '-O2 -g') and therefore default to R4x00 (that's simply the default target for GCC). If they run (relatively) faster on a certain CPU, it is because that CPU is architecturally closer to an R4k than some other CPU.

If you wanted to make GCC itself run as fast as possible on these CPUs, you would have to rebuild GCC with different CFLAGS in the Makefiles. There are other ways to squeeze more speed out of GCC itself, such as profile based feedback bootstrap builds (profiledbootstrap). I do regular bootstrap builds and regression testing of GCC -- I'll see if I can do an R10k profile-optimized bootstrap of a GCC 4.4 snapshot. See if it does much for the speed of GCC itself.

mapesdhs wrote:

> ... I don't know what flags SGI used when they built MIPSpro, but I doubt that they optimized the binaries exclusively for R4x00.

You might be surprised. The compiler binaries installed on my Fuel are, as far as I can be bothered to check offhand, all MIPS3.
In fact, /usr/bin/driver is MIPS2.

Ian.

ABI has everything to do with the available instruction set, but nothing to do with scheduling of the instruction mix.
Scheduling instructions means trying to keep all execution units of the CPU busy at all times, so it's very target CPU dependent.
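
To make the scheduling point concrete, here is a toy model in Python of a single-issue, in-order pipeline. It is a sketch under invented assumptions: the latencies (a 3-cycle load, 1-cycle adds) and register names are illustrative only and are not real R4k or R10k figures. The same three instructions finish sooner when the independent one is placed between a load and its use.

```python
def in_order_cycles(program, latency):
    """Cycles to finish `program` on a toy single-issue, in-order pipeline.

    program: list of (dest, sources) register-name tuples, in issue order.
    latency: dict mapping dest register -> cycles until its result is ready.
    """
    ready = {}        # register -> cycle at which its value becomes available
    next_issue = 0    # earliest cycle the next instruction may issue
    finish = 0
    for dest, sources in program:
        # An in-order CPU stalls here until every source operand is ready.
        issue = max([next_issue] + [ready.get(s, 0) for s in sources])
        ready[dest] = issue + latency[dest]
        finish = max(finish, ready[dest])
        next_issue = issue + 1
    return finish


# Illustrative latencies only: a 3-cycle load and 1-cycle adds.
latency = {"r1": 3, "r2": 1, "r3": 1}
load = ("r1", [])         # r1 = load from memory
use = ("r2", ["r1"])      # r2 = r1 + r1, depends on the load
other = ("r3", ["r4"])    # r3 = r4 + r4, independent of the load

naive = in_order_cycles([load, use, other], latency)      # use stalls on the load
scheduled = in_order_cycles([load, other, use], latency)  # stall slot is filled
print(naive, scheduled)   # prints "5 4"
```

The best ordering depends entirely on the latencies and issue rules of the target CPU, which is why a schedule tuned for one core can be a poor fit for another.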

I can tell MIPSpro (with '-r10000 -mips3') to schedule instructions such that it favors R10k, but use no instructions exclusively available on MIPS4 (so the resulting binary runs on R4k, but favors R10k). Yes, for that last drop of performance you would choose MIPS4. But, there are not many new instructions in MIPS4 over MIPS3 (some prefetching insns, and of course MADD which is irrelevant unless you're running floating point intensive code). Usually the performance benefits of using MIPS4 are not worth losing compatibility with R4000 class systems, and thus SGI built almost everything MIPS3.

Oh, to make it even more confusing (?): R10k is an out-of-order CPU, so it can rearrange the instruction stream on the fly to a certain extent and would therefore be more tolerant of code optimized for a CPU other than itself. R8000 is an in-order CPU which is why it was so lousy at running contemporary (R4k optimized) code.


mapesdhs
Old Salt
Who joined Nov. 10, 2003, 5:17 p.m.
and authored 1685 notes

Wrote the following at Jan. 20, 2009, 4:14 p.m...

jan-jaap writes:
> GCC can target those CPUs, but the GCC binaries themselves are built without target specific flags (just '-O2 -g') and therefore default to R4x00 (that's
> simply the default target for GCC). If they run (relatively) faster on a certain CPU, it is because that CPU is architecturally closer to an R4k than some other CPU.

This sounds a bit weird to me, i.e. the idea that a particular MIPS3 binary that does lots of int processing should run better on an R5K than an R10K, given that
every other int test I know of is faster on R10K. Check the SPECint95 results for R10K/195 O2 vs. R5K/180SC O2 for example, every test is 2X
better (or more) on the R10K (the gcc test gives 4.57 on the R5K vs. 9.02 on the R10K). I know this is all with respect to a SUN reference system,
but even so...
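
As a quick check of the gcc figures quoted above, the arithmetic can be written out in a few lines of Python. Only the two SPECint95 scores and the two clock speeds come from this post; the per-MHz normalisation is an extra step added here for illustration.

```python
r5k_gcc, r5k_mhz = 4.57, 180     # R5K/180SC O2, SPECint95 gcc score
r10k_gcc, r10k_mhz = 9.02, 195   # R10K/195 O2, SPECint95 gcc score

ratio = r10k_gcc / r5k_gcc
per_mhz_ratio = (r10k_gcc / r10k_mhz) / (r5k_gcc / r5k_mhz)

print(f"R10K vs R5K on the gcc test: {ratio:.2f}x")   # 1.97x
print(f"Same comparison per MHz: {per_mhz_ratio:.2f}x")  # 1.82x
```

So the gcc test comes out just under 2X overall, and about 1.8X once the small clock difference is taken out.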

> -- I'll see if I can do an R10k profile-optimized bootstrap of a GCC 4.4 snapshot. See if it does much for the speed of GCC itself.

Remember the version I used for testing was V3.4.6.

> I can tell MIPSpro (with '-r10000 -mips3') to schedule instructions such that it favors R10k, but use no instructions exclusively available on MIPS4 (so the
> resulting binary runs on R4k, but favors R10k). ...

So perhaps Neko GCC has been compiled MIPS3 but with code that favours R5K? That would explain it.

> ... R10k is an out-of-order CPU, so it can rearrange the instruction stream on the fly to a certain extent and would therefore be more tolerant to code
> optimized for a CPU other than itself. ...

Doesn't seem to help it much for running GCC though.

> ... R8000 is an in-order CPU which is why it was so lousy at running contemporary (R4k optimized) code.

Was it you I was talking to about sorting a decent ATLAS build or something, perhaps making an R8K-optimised Blender build?

Ian.

directedition
Addict
Who joined Feb. 3, 2003, 6:20 p.m.
and authored 515 notes

Wrote the following at Jan. 20, 2009, 4:33 p.m...

mapesdhs wrote:

Was it you I was talking to about sorting a decent ATLAS build or something, perhaps making an R8K-optimised Blender build?

Might I ask for what possible purpose you would go to the trouble of making an R8k optimized Blender? Aside from its floating point performance, you're stuck with Extreme graphics in an Indigo 2, not exactly ideal for contemporary software.

