The collected works of mapesdhs - Page 2

D-EJ915 writes:
> usually those tests are just a single core if the cpu has multiple cores

That's generally true for the integer tests, but most of the fp tests are done in parallel, ie. with all cores/CPUs
enabled and the autoparallel compiler options turned on.


TeeTylerToe writes:
> ummm, didn't the 3.16GHz beat the itanium2 in both? ...

Both what? Please don't say you're judging based on the final averages! :D The key point is that one
should compare those tests closest to one's application, and in that regard there are many codes
where the XEON is much slower than IA64. And the answer to your next question might surprise you.
Comparing based on final SPEC averages is dumb and tells one nothing (eg. the final SPEC averages
for O2 hide an order of magnitude difference between lowest and highest). Look at the codes where
IA64 does well (lots of Fluid Dynamics stuff); when you compare the difference in the number of
cores and clock speeds involved, the IA64's results are very impressive.
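
To illustrate the point about averages: SPEC's overall figure is a geometric mean of the per-test ratios, so two systems can post much the same final number while differing hugely on individual codes. A minimal Python sketch of that effect follows. The ratios are made up for illustration (they are NOT real SPEC results), and the function name is just a placeholder.

```python
from math import prod


def geometric_mean(ratios):
    """Geometric mean of a list of per-test ratios."""
    return prod(ratios) ** (1.0 / len(ratios))


# Hypothetical per-test ratios for two imaginary systems (NOT real SPEC data).
system_a = [10.0, 12.0, 11.0, 9.0]   # fairly even across the tests
system_b = [2.0, 60.0, 30.0, 3.3]    # wildly uneven across the tests

for name, ratios in (("A", system_a), ("B", system_b)):
    spread = max(ratios) / min(ratios)
    print(f"system {name}: mean {geometric_mean(ratios):.2f}, "
          f"best/worst spread {spread:.1f}x")
```

Both imaginary systems end up with essentially the same mean, yet one of them has a 30X gap between its best and worst test, which is why the individual results closest to one's own application are the ones worth comparing.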


> ... not as well as I predicted. also, how many cores, and sockets.

The XEON was using 2 chips, 4 cores/chip, 8 cores total, whereas the IA64 was only using 1 chip with 2
cores. The IA64 does much better than people think, ie. a single dual-core 1.66GHz Itanium2 can be 2X
faster than two quad-core XEONs. This doesn't apply to all codes (certainly not), but for those
where the two XEONs are faster one has to remember it takes two XEONs to achieve such a difference.
I've amended the table above, and added URLs for the page results. Quite a few new results out now,
like the Dell T7400, so I'll wade through the spec pages and update the tables soon. It's not linked to
from my site index, but the file has always been available here. I use it as a personal reference when
researching performance issues and, if I can, update it 3 or 4 times a year.

Note that the Power6 result is interesting - it's only using 1 core on the chip. Strange that IBM didn't
submit results using both, though maybe they have now, I've not checked yet. Oh, I'm only referring to
the fp results here; for int, the XEON does much better, at least for the tests in the SPEC suite anyway
(large scale apps using lots of CPUs might behave differently).

Anyway, all this anti-IA64 sentiment is silly. Some of what went into IA64 came from ex-SGI people and
others familiar with SGI's ideas for H1/H2. I was very against IA64 early on, partly anti-Intel bias, partly
the loss of SGI's next-gen CPUs, etc., but after talking to John Mashey about it (the STREAM guy)
I was satisfied the result was going to be a good design, which it was/is; alas, the late release caused
other problems (never used in O2K). If Itanium fails, it will be on cost grounds and like factors, not
performance. Speed-wise, it's a very efficient design, ie. work done per clock cycle per core. It would
have been nice to have H2, etc., but it was never going to happen for cost reasons (faster than IA64
as originally planned, but not fast enough to justify the higher cost).

It was a long time ago now, so I'm sure John wouldn't mind if I quoted his 1998 email to me:

Code:

IA64 is *very* different in almost every conceivable way from an IA32 in architecture, emphasis, and
performance; it is very much what we might have designed had we been able to do a new ISA that
no longer had to be upwards compatible with anything we had (MIPS has run out of some opcode
slots, and will always have 32 each of integer and FP registers). It is publicly known that IA64 has
128 each of integer and FP registers, and if you care about FP you will like that. There are numerous
other features where people have learned, and I found many features that probably first appeared in
MIPS chips, but in some cases cleaner. I studied the manuals looking for showstoppers, and was
heartily relieved not to find any such, and I did find features that I'd been specing for H2.

Anyway, I can't say anything that isn't public, but I would say:

a) This is a good architecture and a good chip, and if you liked MIPS chips, you will like IA64, even if
you despise IA32s.

b) In various strange ways, this architecture and implementation almost seem designed as better for
SGI than for anybody else,  even HP.

c) As it happens, the threads of ideas that led to the R8000 and SGI compiler technologies came
from people who'd worked on related technologies at other companies, and worked with people
with similar ideas, who went to Intel & HP, and strangely enough, some of what's seen in the chips
is *very* familiar.

So, anyway, your concerns are well-taken; I had some of the identical ones; my boss (Forest Baskett,
our CTO, and one of those who worked on the Stanford MIPS) & I have both looked carefully at
this thing, and both feel that this will be a good chip for our customers ... it is *not* an X86, even if it is
able to run that code. By now, people understand how to do 64-bit instruction sets, so that's not
that hard any more; the real issue is in a myriad of other details, which is why I spent hours studying
the manuals, and I sure felt a lot better after I'd read them.


Ian.
R-ten-K wrote: SPEC is not multithreaded, throwing more cores does not change the SPEC results.


That's not true; the run rules state autoparallel optimisations are allowed, and most of the fp tests do use them.
Whether they use them efficiently is another matter; that's down to how well the compilers have been written. See
sections 4.2.2 and 4.2.3 of the run rules. The Itanium2 did not use the autoparallel option, but the XEON tests
did. Maybe this tells us more about the quality of the compilers than the hardware, but then that's the whole
focus of how Itanium2 is used anyway (I'd have liked to see the fp suite on the Itanium2 with autoparallel turned
on, to see what it did for some of the tests compared to the XEON).

In many cases when the option is not used, other cores/chips are still enabled - how these might help the final
result is hard to say, hopefully not much (background system services, etc. could use the other cores).

As for the other systems, the Power6 and Core2Extreme did not use the option, while the Opteron did for just the
fp test.

Obviously an autoparallel option will be nothing like as efficient at using multiple cores as hand-designed threaded
code, but nevertheless that's how the tests are run on many of the systems.

Btw, the above would explain why the XEON gets such high numbers for the cactusADM and libquantum
tests (likewise the Core2Quad for libquantum), whereas the Core2Extreme is much slower, or would you
really expect one thread on a 3.16GHz XEON to be ten times faster than a single thread on a 2.93GHz
Core2Extreme? If the use of the autoparallel option is not the cause, then what is? I know XEONs have
different internal settings and optimisations, but a ten-fold speed difference for a single thread? To me,
it looks much more likely the compiler on the XEON is doing exactly what it's been asked to do, namely using
all 8 cores as best it can with automatic optimisation.

If I'm wrong and this is all cobblers, I'll be more than happy to amend my posts of course.

Ian.
pip wrote:
I can't speak to the Indigo2, but my Octane MXE runs 1680x1050 great. It does take a lot of vfc tweaking though, and an understanding of what the values are doing.


Has anyone posted any vfo's for this? I'd like to add them to my Depot Resources page.

Ian.
jan-jaap wrote:
> Hmm. But is it worth it? If you crank up the resolution, more memory is needed for the framebuffer, at the cost of buffers
> for OpenGL features. ...

Not everyone wants to do 3D. :D

Handy for just being able to have more windows onscreen properly sized without overlap. I know I couldn't go back to
1280x1024, but loads of flat screens are 1680x1050 these days, so it would certainly be handy.


> I'm curious what 'glxinfo' has to say about your 1680x1050 mode.

Indeed!

Ian.

PS. Why do people still keep saying V6/V10 only have 8MB for texture? That's just not true. Depending on the screen
mode, it can be as high as 20MB.
R14K/550 Octane2 V12 (PCI cage, gbit, etc.): pug (from the book, "Magician", by Raymond E. Feist)
R16K/700 Fuel V12: winters (in honour of Major Dick Winters, 101st Airborne, 506th)
R10K/195 Indigo2 video system: milamber (also from, "Magician")
R12K/400 O2: demo (name given from old SGI loaner setup; intend to change at some point)
R5K/180 Indy firewall machine: gateway (yawn)

Ian.
indyman007 wrote:
How do you rename a machine?


http://www.sgidepot.co.uk/admin/

It has everything you need to know. More information available elsewhere (search!). You can download the
text in printable form.

Ian.
A long time until I was finally able to look into this properly, but what an historic day to get working on it (I hope!)... 8)

I wasn't able to find anything in the UK like the NoiseMagic ThemoControl NMT-3, so instead I've decided to replace the
noisy fans in the Netgear switch with these:

http://www.quietpc.com/gb-en-gbp/produc ... /mini-kaze

and also fit at least 2 of the fans with this (I have a feeling perhaps only one fan will be needed most of the time):

http://www.quietpc.com/gb-en-gbp/produc ... s/fanmate2

I reckon this should work very well. Just to compare btw, the fans fitted by default in the Netgear switch are model
EFB0412VHA, specs for which are: 12V, 8000rpm, 9.46cfm and a whopping 37.5 dBA.

Atm I'm waiting to hear if the seller can do a bulk price - might get a few more for dealing with future project needs.

Thanks for all the help!!

Ian.
Just curious, how does one interpret the output from mplayer to find out the frame rate achieved? For the
speeder_teaser.mp4 movie, I get:

Code:

MPlayer 1.0rc1- MIPSpro Compilers: Version 7.4.4m (C) 2000-2006 MPlayer Team
CPU: SGI MIPS

Playing speeder_teaser.mp4.
Cache fill:  5.18% (434176 bytes)
ISO: File Type Major Brand: ISO/IEC 14496-1 (MPEG-4 system) v2
Quicktime/MOV file format detected.
Warning! pts=1894000  length=1894023
Warning! pts=4165680  length=4168243
VIDEO:  [avc1]  1280x1040  24bpp  10.000 fps    0.0 kbps ( 0.0 kbyte/s)
Opening video filter: [format fmt=RGB24]
==========================================================================
Trying to force video codec driver family ffmpeg...
Opening video decoder: [ffmpeg] FFmpeg's libavcodec codec family
Selected video codec: [ffh264] vfm: ffmpeg (FFmpeg H.264)
==========================================================================
Audio: no sound
Starting playback...
VDec: vo config request - 1280 x 1040 (preferred colorspace: Planar YV12)
Could not find matching colorspace - retrying with -vf scale...
Opening video filter: [scale]
VDec: using Planar YV12 as output csp (no 0)
Movie-Aspect is 1.23:1 - prescaling to correct movie aspect.
SwScaler: using unscaled yuv420p -> rgb24 special converter
VO: [null] 1280x1040 => 1280x1040 RGB 24-bit
[h264 @ 1069dcb0]AVC: Consumed only 454 bytes instead of 459
V: 189.3 1894/1894 70% 19%  0.0% 0 0 0%

BENCHMARKs: VC: 132.943s VO:  36.990s A:   0.000s Sys:   1.198s =  171.131s
BENCHMARK%: VC: 77.6848% VO: 21.6152% A:  0.0000% Sys:  0.7000% = 100.0000%

Exiting... (End of file)


Does the output imply that ideally the movie should play at 10fps?

The command I used to play the file was:

Code:

mplayer -vo gl2 -vf format=RGB24 -osdlevel 0 -benchmark -nosound -vo null speeder_teaser.mp4


Ian.
(07/Mar/2015) FREE! (collection only) 16x Sagitta 12-bay dual-channel U160 SCSI JBOD units.
Email, phone or PM for details, or see my forum post.
+44 (0)131 476 0796
I hope you can get it working again! Would be interesting to see how it compares to Toby's 8x500, though I don't think the
larger L2 will make much of a difference since the Blender test scene is pretty simple. I've been running a complex Alias test
which benefits from a larger L2 though, eg. Origin300/600MHz beats a dual-400 Octane2 (1/3rd of the speed difference is
due to the faster O3K RAM, 2/3rds of the increase is due to the 2X larger L2). By contrast, a different simple Maya test
scales more with basic clock speed, eg. Octane, Origin and Fuel are all pretty much the same with a 600MHz CPU.

The more complex the task, the better it will run on an O3K system.

One thing that does surprise me however, and a bit of a caveat for your O350: I see from the hinv that the L2 runs at
333MHz. With a 900MHz system, it runs at 450MHz. This might make a difference for some codes. Anyone have a
quad-1GHz Tezro to compare? Strange that SGI couldn't just make the L2 run at 500MHz.

Ian.

PS. Assuming you're successful in your repairs, any chance of some C-Ray results, etc.? 8)
I'm searching for relevant hw/funding for a charitable PC build. Please PM/email/call if you'd like to contribute or have any suitable components! Donations of items I can sell to provide funds are also welcome.
+44 (0)131 476 0796
So how does one interpret mplayer's benchmark results output? Couldn't find a ref in the man page.

Ian.
Is there somewhere it says how many frames are in the movie? Just wondering how people are calculating
their fps numbers from the output.
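
One plausible way to get a number from the output quoted earlier: the final status line (V: 189.3 1894/1894 ...) appears to give the video position in seconds followed by frames played/decoded, and the BENCHMARKs line ends with the total elapsed time. Dividing frames by total time gives the achieved rate. A rough Python sketch, assuming that reading of the status line is correct (the helper names are made up for this example):

```python
import re

# Lines copied from the mplayer output posted earlier in this thread.
STATUS_LINE = "V: 189.3 1894/1894 70% 19%  0.0% 0 0 0%"
BENCH_LINE = ("BENCHMARKs: VC: 132.943s VO:  36.990s A:   0.000s "
              "Sys:   1.198s =  171.131s")


def frames_decoded(status_line):
    """Return the decoded-frame count from a 'played/decoded' pair."""
    match = re.search(r"(\d+)/(\d+)", status_line)
    if match is None:
        raise ValueError("no played/decoded pair found")
    return int(match.group(2))


def total_seconds(bench_line):
    """Return the total time that follows the '=' on the BENCHMARKs line."""
    match = re.search(r"=\s*([\d.]+)s", bench_line)
    if match is None:
        raise ValueError("no total time found")
    return float(match.group(1))


def benchmark_fps(status_line, bench_line):
    """Achieved frame rate: frames decoded divided by total elapsed time."""
    return frames_decoded(status_line) / total_seconds(bench_line)


print(f"{benchmark_fps(STATUS_LINE, BENCH_LINE):.2f} fps")
```

For the run above that works out to roughly 11 fps, a little over the 10.000 fps reported on the VIDEO: line. That is consistent with 1894 frames covering about 189 seconds of video at the file's nominal rate.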

Ian.
bri3d wrote: The L2 speed discrepancy is curious... I wonder why they did that, maybe the L2 barely doesn't run at 500Mhz or something so they had to use divisor 3 instead of 2 (that's totally stupid if it's true though)...


Indeed. The L2 can run at 450 ok, so not much of a stretch to think of it running at 500. For some codes, the lower L2 speed might
be enough to offset some of the benefit of having a larger L2 and/or higher clock, eg. compared to an 800 with 8MB. Hard to say.
Does anyone have a Fuel, Tezro or Origin/Onyx3K with an 800 CPU? What is the speed of the L2?

For reference, a dual-700 Tezro uses 350MHz L2, while I already know a quad-1GHz Tezro also uses 333MHz L2. Rather ironic that
a Fuel/900 has a 35% faster L2 link.
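
The figures above are consistent with the L2 clock being the core clock divided by a small integer. A short Python sketch of that arithmetic follows; the divisor values are an assumption inferred from the reported speeds, not something taken from SGI documentation.

```python
def l2_clock_mhz(core_mhz, divisor):
    """L2 clock, assuming it is simply the core clock over an integer divisor."""
    return core_mhz / divisor


# (core clock in MHz, ASSUMED divisor) for the systems mentioned above.
systems = {
    "Fuel 900": (900, 2),
    "Tezro dual-700": (700, 2),
    "Tezro quad-1GHz": (1000, 3),
}

for name, (core, divisor) in systems.items():
    print(f"{name}: {l2_clock_mhz(core, divisor):.0f} MHz L2")

fuel = l2_clock_mhz(900, 2)
tezro = l2_clock_mhz(1000, 3)
print(f"Fuel 900 L2 is {(fuel / tezro - 1) * 100:.0f}% faster")

# Hypothetical: what a 1GHz part would get with divisor 2.
print(f"1GHz with divisor 2: {l2_clock_mhz(1000, 2):.0f} MHz")
```

Under that assumption the 1GHz parts would need a 500MHz L2 to keep divisor 2, which is the speed being speculated about here.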

Ian.
Ah well.

Anyone know what kinds of tasks/codes might be affected by the L2 speed? My site doesn't have any fp intensive tests
as yet, though I'm working on it, eg. Autodock, Alias, etc. Even then, fp tasks vary greatly as my SPEC95 pages show.
Perhaps hard to tease out the effects when, despite being slower, the L2 is 2X larger.

Ian.
I'm sure Toby can confirm, but yes, I would have thought you'd definitely need to have the O3K-type NumaLink3
cable to link the two O350s together.

Ian.
bri3d wrote:
Rumor has it it may have even been done internally at some point.


Someone at SGI did design a new low-end which ran IRIX using a newer Broadcom dual-core MIPS CPU, but
I don't know what the clock was. I infer from what I was told that the unit was designed around PCIe, and had
an NVIDIA gfx card. It never went anywhere though, management killed it. :\

Joe once spoke of using the 1.5GHz Sandcraft CPU, but I can't remember now why he couldn't go anywhere
with it - probably needs the PROM source just like he does for the R9K/1GHz.

Ian.
Hmm, are you perhaps connecting the cable to the older XIO port designed for O2K gfx options?

I cannot see how you'll ever get it to work if you continue to use an O2K cable. Find an O3K cable.

Ian.
edefault writes:
> Where is this PROM located physically?

Mbd somewhere. Dallas chip or elsewhere I guess.


> Any chance to reverse-engineer it?

Unlikely. If there was, I'm sure Joe would already have done it.

Last year I did ask a couple of SGI people about releasing the PROM source; the response was, never gonna
happen. Individual SGI employees might think it's a neat idea (they did) but it's not up to them.

Ian.
Picture links (some are out of date): side, front, rear, inside, PCI slots, disks, CPU

03/Feb/2015: upgraded the /ian user account SSD to a 240GB OCZ Vertex2E, ie. doubled the space as the 120GB was
almost full. Note I've tested many other models of SSD with Fuel/Tezro, including a Samsung 850 Pro, but alas for some
reason almost all SATA3 SSDs end up negotiating only a 1.5Gbit link with the SAS controller, whereas SATA2 SSDs set up
a 3Gbit link just fine. Could be that only certain firmware versions will allow for a 3Gbit link with a SATA3 device. I thought
I'd resolved this before by testing fw versions P20 vs. P21, but other tests showed this isn't the case, so it looks like I'm
going to have to at some point test the whole range of fw updates for these LSI SAS cards, see which ones work ok with
SATA3 devices and which don't. Of course though, even at 1.5Gbit, an SSD is far quicker than any rust spinner for small-size
random I/O, and beats almost all of them for sequential I/O as well.
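
For a sense of what the negotiated link speed costs: SATA uses 8b/10b encoding, so only 8 of every 10 bits on the wire carry data. A small Python sketch of the resulting payload ceilings (the function name is just for this example, and MB here means 10^6 bytes, which may not be exactly how diskperf counts):

```python
def sata_payload_mb_per_s(line_rate_gbit):
    """Upper bound on payload throughput for a SATA link with 8b/10b encoding."""
    bits_per_second = line_rate_gbit * 1e9
    payload_bits = bits_per_second * 8 / 10   # 8 data bits per 10 line bits
    return payload_bits / 8 / 1e6             # bits -> bytes -> MB


for rate in (1.5, 3.0):
    print(f"{rate} Gbit link: at most {sata_payload_mb_per_s(rate):.0f} MB/s")
```

So a drive that only negotiates 1.5Gbit cannot go past about 150 MB/s sequential, whereas the 3Gbit link leaves room for the 250+ MB/s figures seen with the SATA2 SSDs in this system.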

One other change: replaced the external SCSI backup unit with a 4-bay Startech 5.25" device which holds 4x 2.5" devices.
This is filled with three SSDs and a 2.5" 750GB SATA, all used to hold clone mirror backups of the Fuel's internal drives.
Connects via a standard SAS cable to the LSI 3442's external SAS port.

08/Oct/2012 Edit: Replaced the system disk with a 60GB SSD (OCZ Vertex2E) in an ARS-2160H 68pin SCSI/SATA bridge
box, but /usr and /var are stored on a separate 120GB OCZ Vertex2E connected to the SAS/SATA card, thus allowing
max speed for all accesses to those areas. This gives the following diskperf when run within /var/tmp:

Code:

# req_size  fwd_wt  fwd_rd  bwd_wt  bwd_rd  rnd_wt  rnd_rd
#  (bytes)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)
#---------------------------------------------------------
4096   39.75   33.27   39.94   33.07   39.70   22.56
8192   69.13   57.90   69.70   53.91   69.68   30.46
16384  109.03   92.98  107.86   79.81  110.71   54.90
32768  153.29  131.74  153.74  103.54  153.02   90.66
65536  192.25  170.21  188.90  151.19  193.54  137.95
131072  220.07  209.69  217.79  195.31  218.80  182.74
262144  238.19  233.02  234.89  228.38  238.00  220.60
524288  247.30  251.77  248.45  248.34  246.96  246.07
1048576  252.74  263.24  253.12  261.81  252.85  261.06
2097152  256.56  269.65  255.59  269.41  255.87  269.02
4194304  257.14  274.21  262.54  273.46  257.71  273.34


I'll update the pictures later.

Meanwhile, I'm not using the DDS4 DAT or external 146GB SCSI disk anymore (got a new backup system, more on
that later), and the external CDRW isn't connected now either. And I've replaced the 3041X-R SAS card with a
3442X-R, so the system now has a 4-port external link as well (this relates to the new backup setup).

28/Feb/2011 Edit: Upgraded the SAS to a 600GB 15K (same model generation).

03/Jan/2011 Edit: Replaced the 300GB 10K data disk with a Seagate 450GB 15K SAS (15K.7). Moved my main /ian
user account (approx. 67GB of data atm) onto a 120GB SSD (OCZ Vertex2E 3.5").

31/Aug/2009 Edit: Replaced Maxtor system disk with 300GB 15K Fujitsu, replaced LSI 22320 SCSI card with LSI
SAS3041X-R SAS/SATA, added 1TB SATA disk.

Code:

Location: /hw/module/001c01/node
IP34 Board: barcode NED586     part 030-1707-004 rev -C
Location: /hw/module/001c01/node/cpubus/0
IP34PIMM Board: barcode MWX852     part 030-2023-001 rev -B
Location: /hw/module/001c01/Ibrick/xtalk/13
ASTODY Board: barcode NJY535     part 030-1726-005 rev -A
Location: /hw/module/001c01/Ibrick/xtalk/14
IP34 Board: barcode NED586     part 030-1707-004 rev -C
Location: /hw/module/001c01/Ibrick/xtalk/15
IP34 Board: barcode NED586     part 030-1707-004 rev -C
1 900 MHZ IP35 Processor
CPU: MIPS R16000 Processor Chip Revision: 3.0
FPU: MIPS R16010 Floating Point Chip Revision: 3.0
CPU 0 at Module 001c01/Slot 0/Slice A: 900 Mhz MIPS R16000 Processor Chip (enabled)
Processor revision: 3.0. Scache: Size 8 MB Speed 450 Mhz  Tap 0xb
Main memory size: 4096 Mbytes
Instruction cache size: 32 Kbytes
Data cache size: 32 Kbytes
Secondary unified instruction/data cache size: 8 Mbytes
Memory at Module 001c01/Slot 0: 4096 MB (enabled)
Bank 0 contains 1024 MB (Premium) DIMMS (enabled)
Bank 1 contains 1024 MB (Premium) DIMMS (enabled)
Bank 2 contains 1024 MB (Premium) DIMMS (enabled)
Bank 3 contains 1024 MB (Premium) DIMMS (enabled)
Integral SCSI controller 2: Version QL12160, low voltage differential
Integral SCSI controller 3: Version QL12160, single ended
Scanner: unit 7 on SCSI controller 3
Integral SCSI controller 4: Version SAS/SATA LS1068
Disk drive: unit 0 on SCSI controller 4 (unit 0)
Disk drive: unit 1 on SCSI controller 4 (unit 1)
Disk drive: unit 2 on SCSI controller 4 (unit 2)
Disk drive: unit 3 on SCSI controller 4 (unit 3)
Integral SCSI controller 0: Version QL12160, low voltage differential
Disk drive: unit 1 on SCSI controller 0 (unit 1)
Integral SCSI controller 1: Version QL12160, single ended
CDROM: unit 3 on SCSI controller 1
CDROM: unit 4 on SCSI controller 1
IOC3/IOC4 serial port: tty1
IOC3/IOC4 serial port: tty2
IOC3 parallel port: plp1
Graphics board: V12
Gigabit Ethernet: eg0, module 001c01, pci_bus 1, pci_slot 3, firmware version 12.4.10
Integral Fast Ethernet: ef0, version 1, module 001c01, pci 4
Iris Audio Processor: version MAD revision 1, number 1
PCI Adapter ID (vendor 0x1077, device 0x1216) PCI slot 1
PCI Adapter ID (vendor 0x1000, device 0x0054) PCI slot 2
PCI Adapter ID (vendor 0x1077, device 0x1216) PCI slot 1
PCI Adapter ID (vendor 0x1412, device 0x1724) PCI slot 2
PCI Adapter ID (vendor 0x10a9, device 0x0009) PCI slot 3
PCI Adapter ID (vendor 0x10a9, device 0x0003) PCI slot 4
PCI Adapter ID (vendor 0x1045, device 0xc861) PCI slot 5
HUB in Module 001c01/Slot 0: Revision 2 Speed 200.00 Mhz (enabled)
IP35prom in Module 001c01/Slot n0: Revision 6.210
USB controller: type OHCI



gfxinfo (out of date, it uses 1920x1200 now, will update this later):

Code:

Graphics board 0 is "ODYSSEY" graphics.
Managed (":0.0") 1600x1200
BUZZ version B.2
PB&J version 1
128MB memory
Banks: 4, CAS latency: 3
Monitor 0 type: HWP 4819
Channel 0:
Origin = (0,0)
Video Output: 1600 pixels, 1200 lines, 75.00Hz (1600x1200_75)



Internal PCI cards:

Code:

MAudio Revolution 7.1
SGI Copper Gbit Ethernet
QLA12160 dual-channel U160 SCSI
LSI SAS3442X-R SAS/SATA PCIX card (4 internal ports, 4 external ports via an SFF8470 socket)



SCSI/SATA Devices:

Code:

0,1: OCZ Vertex2E 60GB 2.5" SSD (fw 1.37) held in an ACARD ARS_2160 SCSI/SATA bridge box (68pin version).
1,3: Yamaha 16/10/40 CDRW (internal).
1,4: TOSHIBA DVD-ROM SD-M1711.1005 (internal).
3,7: HP ScanJet 6300C Scanner.
4,0: OCZ Vertex2E 120GB 2.5" SSD (dks4d0s6 90GB partition holds /var, dks4d0s7 20GB partition holds /usr).
4,1: OCZ Vertex2E 240GB 2.5" SSD (main 'ian' user account)
4,2: Seagate 600GB 15K SAS (ST3600057SS).
4,3: Samsung Spinpoint F1 1TB 7200rpm SATA, Model HD103UJ.



Here are some diskperf results on the drives, done with all background processes running as normal, ie. mediad, httpd, firefox, etc.

0,1: OCZ Vertex2E 60GB 2.5" SSD. Current "system" disk (test must be executed somewhere such as /tmp, as /var and /usr are on a different device).

Code:

# req_size  fwd_wt  fwd_rd  bwd_wt  bwd_rd  rnd_wt  rnd_rd
#  (bytes)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)
#---------------------------------------------------------
4096   15.61   15.08   15.61   15.04   15.61   11.21
8192   27.60   26.31   27.61   25.78   27.60   20.25
16384   45.27   41.56   45.17   39.89   45.16   33.62
32768   65.11   57.68   65.03   53.21   65.06   49.96
65536   83.16   82.34   83.60   78.51   83.59   74.61
131072   83.90   88.09   84.12   86.10   83.72   82.44
262144   85.92   92.85   86.19   92.64   85.38   89.58
524288   86.54   95.23   86.59   95.01   86.45   94.01
1048576   85.89   95.07   86.03   95.20   85.67   94.25
2097152   83.88   93.19   84.30   93.03   84.22   92.20
4194304   80.13   88.45   80.15   88.15   80.08   87.66



4,0: OCZ Vertex2E 120GB 2.5" SSD, holds /usr and /var. Connected via the SAS card for max speed. Notice the much better 4K random I/O speeds.

Code:

# req_size  fwd_wt  fwd_rd  bwd_wt  bwd_rd  rnd_wt  rnd_rd
#  (bytes)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)
#---------------------------------------------------------
4096   39.75   33.27   39.94   33.07   39.70   22.56
8192   69.13   57.90   69.70   53.91   69.68   30.46
16384  109.03   92.98  107.86   79.81  110.71   54.90
32768  153.29  131.74  153.74  103.54  153.02   90.66
65536  192.25  170.21  188.90  151.19  193.54  137.95
131072  220.07  209.69  217.79  195.31  218.80  182.74
262144  238.19  233.02  234.89  228.38  238.00  220.60
524288  247.30  251.77  248.45  248.34  246.96  246.07
1048576  252.74  263.24  253.12  261.81  252.85  261.06
2097152  256.56  269.65  255.59  269.41  255.87  269.02
4194304  257.14  274.21  262.54  273.46  257.71  273.34
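
To put a number on the 4K difference between the two connections, here is a small Python sketch that parses diskperf-style rows and compares the 4096-byte results for the bridge-connected 60GB SSD (0,1) and the SAS-connected 120GB SSD (4,0). The parser is a minimal one written for this example, not part of diskperf itself.

```python
COLUMNS = ("fwd_wt", "fwd_rd", "bwd_wt", "bwd_rd", "rnd_wt", "rnd_rd")


def parse_diskperf(text):
    """Map each request size (bytes) to a dict of MB/s figures per column."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split()
        table[int(fields[0])] = dict(zip(COLUMNS, map(float, fields[1:])))
    return table


# 4096-byte rows copied from the two tables above.
bridge = parse_diskperf("""
# 0,1: 60GB SSD behind the SCSI/SATA bridge
4096   15.61   15.08   15.61   15.04   15.61   11.21
""")
sas = parse_diskperf("""
# 4,0: 120GB SSD on the SAS/SATA card
4096   39.75   33.27   39.94   33.07   39.70   22.56
""")

for column in ("rnd_wt", "rnd_rd"):
    ratio = sas[4096][column] / bridge[4096][column]
    print(f"{column}: SAS-connected SSD is {ratio:.2f}x the bridged SSD")
```

On these figures the SAS-connected drive manages about 2.5X the 4K random write rate and about 2X the 4K random read rate of the bridged one. Note the two SSDs differ in capacity, so not all of the gap is necessarily down to the connection.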



4,1: OCZ Vertex2E 120GB 3.5" SSD, used to hold the main /ian user account (not yet done a test on the 240GB 2.5" upgrade).

Code:

#  req_size  fwd_wt  fwd_rd  bwd_wt  bwd_rd  rnd_wt  rnd_rd
#   (bytes)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)
#----------------------------------------------------------
4096   41.03   32.78   40.87   32.40   40.89   18.71
8192   71.14   57.90   71.43   55.72   71.25   35.13
16384  112.41   92.70  112.71   84.98  112.62   62.55
32768  155.51  134.62  155.55  114.20  155.22  101.51
65536  195.52  178.54  194.84  165.57  194.62  148.27
131072  222.14  214.24  222.33  207.04  221.96  192.56
262144  239.02  239.37  239.96  236.09  238.97  227.54
524288  249.26  255.97  250.19  254.05  249.32  250.51
1048576  254.55  266.98  254.66  265.47  254.36  263.74
2097152  258.39  272.93  257.60  272.10  257.53  270.96
4194304  258.92  275.82  259.48  274.96  259.66  275.08
8388608  259.94  277.27  261.89  276.97  260.37  276.86



4,2: Seagate 600GB 15K SAS (ST3600057SS), test not yet run.


4,3: 1TB 7200rpm SATA data disk (Samsung HD103UJ):

Code:

# req_size  fwd_wt  fwd_rd  bwd_wt  bwd_rd  rnd_wt  rnd_rd
#  (bytes)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)
#---------------------------------------------------------
16384   64.18   65.03   47.89   33.18    4.44    2.27
32768   75.78   99.78   88.04   37.53    8.73    4.42
65536  106.05  111.59  106.76   42.43   16.23    8.48
131072  112.12  114.28  107.58   49.08   27.45   15.77
262144  106.30  113.90  106.95   56.80   41.58   27.56
524288  106.15  113.59  106.60   55.57   57.88   44.60
1048576  112.30  113.14  107.58   79.46   74.57   62.15
2097152  111.39  110.91  106.38   85.67   86.97   80.43
4194304  111.30  109.20  108.82  102.21   98.86   93.09



Fujitsu 300GB 15000rpm MBA3300NC, old system disk.

Code:

# req_size  fwd_wt  fwd_rd  bwd_wt  bwd_rd  rnd_wt  rnd_rd
#  (bytes)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)
#---------------------------------------------------------
16384   59.87   68.97   30.55   27.13   10.23    5.30
32768   80.08   93.85   34.26   33.87   17.71   10.04
65536   94.37  114.01   56.67   38.45   29.14   18.21
131072   92.95  117.03   43.56   43.48   32.49   31.45
262144  102.52  124.69   44.35   44.50   51.88   48.31
524288  107.48  124.89   66.54   67.23   72.67   67.22
1048576  109.85  124.88   89.21   90.17   91.55   86.17
2097152  112.99  124.93  105.37  102.76  104.32   99.83
4194304  114.38  124.92  110.83  109.11  111.25  109.44
8388608  114.84  119.09  114.52  115.07  114.61  114.00
16777216  115.02  121.53  114.63  117.28  114.43  117.12



147GB 15K 'Worldview' OEM (old backup disk).

Code:

# req_size  fwd_wt  fwd_rd  bwd_wt  bwd_rd  rnd_wt  rnd_rd
#  (bytes)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)
#---------------------------------------------------------
16384   61.32   74.49   14.59    4.32    8.84    5.25
32768   71.57   74.14   24.75    9.11   14.93    9.83
65536   76.83   78.69   34.40   20.56   24.20   17.13
131072   78.59   78.80   43.84   25.16   38.69   27.67
262144   78.96   78.77   57.23   45.79   52.63   39.79
524288   74.64   78.79   62.99   55.96   58.63   52.34
1048576   78.83   78.73   71.28   56.21   60.90   56.55
2097152   78.94   78.78   71.63   71.03   67.03   68.72
4194304   78.89   78.74   76.42   71.49   74.51   71.58
8388608   78.85   78.83   71.87   74.01   76.84   74.17
16777216   77.99   78.83   72.37   75.36   77.69   75.38



Seagate 450GB 15K SAS (ST3450857SS), old general data disk.

Code:

#  req_size  fwd_wt  fwd_rd  bwd_wt  bwd_rd  rnd_wt  rnd_rd
#   (bytes)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)
#----------------------------------------------------------
4096   38.83   40.26   19.03   13.10    3.11    1.34
8192   68.98   70.94   34.82   20.83    6.22    2.67
16384  110.34  112.64   61.36   31.14   12.28    5.35
32768  154.20  156.89   99.76   42.16   22.76   10.52
65536  194.83  178.83  143.22   57.72   41.58   19.79
131072  206.30  198.62  170.62   63.65   69.34   35.18
262144  206.03  207.51  163.79   95.62   98.67   59.84
524288  205.83  207.88   95.40   96.06   95.99   92.94
1048576  206.58  207.95  149.61  145.27  127.19  123.02
2097152  206.24  207.90  152.67  150.97  158.85  156.52
4194304  206.77  207.91  184.26  171.54  180.34  176.95
8388608  206.14  207.91  193.90  189.59  191.76  189.09



Reference info: RAID results using one dual-channel SCSI card, comparing LSI 22320 to QLA12160/66, tested
with Maxtor Atlas 15K II 146GB disks:

Code:

Summary for 'n' disks on 2 channels, optimised for HD video, fwd read/write speeds only.

         |   FORWARD WRITE   |   FORWARD READ
         |     LSI      QLA  |     LSI      QLA
-----------------------------------------------
2 disks  |  172.02   172.76  |  159.08   159.26
4 disks  |  209.30   266.74  |  186.62   237.78
6 disks  |  209.36   269.59  |  190.81   245.07
12 disks |  209.30   269.34  |  200.07   260.37


Except for using two disks, the QLA is significantly faster than the LSI.
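
For anyone wanting the gap as a percentage, a short Python sketch using the figures from the table above (the data structure and function name are just for this example):

```python
# disks: (LSI write, QLA write, LSI read, QLA read), all in MB/s, from the table.
results = {
    2: (172.02, 172.76, 159.08, 159.26),
    4: (209.30, 266.74, 186.62, 237.78),
    6: (209.36, 269.59, 190.81, 245.07),
    12: (209.30, 269.34, 200.07, 260.37),
}


def pct_faster(qla, lsi):
    """How much faster the QLA figure is than the LSI figure, in percent."""
    return (qla / lsi - 1) * 100


for disks, (lsi_wt, qla_wt, lsi_rd, qla_rd) in results.items():
    print(f"{disks:2d} disks: QLA write +{pct_faster(qla_wt, lsi_wt):.1f}%, "
          f"read +{pct_faster(qla_rd, lsi_rd):.1f}%")
```

With two disks the cards are within half a percent of each other; from four disks upwards the QLA is roughly 27% to 30% ahead, with the LSI write speed stuck at about 209 MB/s however many disks are added.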

Cheers! :)

Ian.

PS. For an Alias render benchmark I've done (complex scene), this system beats a dual-600 Octane2. For a simple scene though, the
Octane2 is faster. The more complex the task, the better Fuel and O3K systems run. I've accumulated data for dozens of systems, results
now available on my site!
(40Kbit dialup - yuck!!...)

I did a swap; my two Fuel 700/V12 systems for a 900/V10 and an Onyx2 deskside, and helping
with some other stuff.

Cheers! :)

Ian.

PS. Bothans?... It was a slaughter... >-)
eMGee writes:
> That's a very nice system there Ian! Even with 8MB L2 cache memory, impressive!

Thanks!! :)


> So, this must be your personal system, as you mentioned. If you get the chance, and of course if you feel like it,
> could you perhaps post some pictures? All the added expansion/option cards, the whole setup, would be interesting
> to see. :)

Will do! Some time in January.

Ian.
Hope you all had a good Christmas, etc.!!

I've finished doing the usual benchmark tests from my SGI General Performance Comparisons page (Inventor, GIMP, etc.),
and also Quake1 / Quake2 tests (finally worked out how to run the tests single-buffered). I've also taken a few pics,
though the lighting/sharpness isn't all that great as the system lives under my desk, so natural light is mostly absent.

Side of system:




Front (not found a black fascia for the CDRW yet):




Rear connections (just ordered a VHDCI/MicroD cable for the scanner, so the remaining SCSI socket will be used soon):




Overall internals:




PCI cards (top to bottom: Revolution 7.1, copper Gbit NIC, QLA12160/66, LSI 21320):




SCSI disks (top: Maxtor Atlas 15K II 147GB 8K147L0 68pin; bottom: Fujitsu MAT3300NP 300GB 10K 68pin. The Maxtor was a good
find as it has a valid warranty until October 2011):




The main CPU board (sorry for lack of focus, hard for my camera's flash to work well at such short range):




Happy new year everyone!! 8)

Ian.
sybrfreq writes:
> Somebody was a very good boy this year

Nah, somebody just worked their butt off. :D (ie. me)


directedition writes:
> Are you open to trades? How do first born sons sound?

Ooo, you'd have to do way better than that... would be well into the Lex Luthor scale of things. :D You know,
as in babes, Australia, The Universe, that sort of thing...


Meanwhile, as a final closing treat for 2008, I've just uploaded the results of my latest benchmarking venture,
namely running Alias and Maya renders using two quite different scenes (complex scene with Alias, simple scene
with Maya; links are on the left hand side in the index frame):

http://www.sgidepot.co.uk/perfcomp.html

or if you don't want the frames layout, here are direct links to the Alias and Maya pages.

Hope you find the results interesting! Main thing to take away from the data is that, in general, rendering a complex
scene will scale much better using multiple CPUs on an O3K-class system, though the Onyx2 does rather well.

That's it for this year! Catchya next year... :D

Ian.

-----

Jan/09 Edit: I've moved the screenshots from here into the main announcement thread.
JacquesT writes:
> Very nice Ian!

Thanks! 8-)


> If I'm ever up in Scotland (But why?! It's colder there!!! 8-) ) I'll have to pop in and see this beast!

Feel free! :D


> Happy New Year!

Ditto!!


> (What's next, a quad 1ghz Tezro?)

That's the plan, hehe...

Ian.
kramlq writes:
> On the other hand, the 350MHz RM7000 (the last official SGI CPU available for an R5k style O2) seems to have had architectural improvements.
> It is faster than many CPUs in the R10k/R12k series. I found it to be roughly equivalent to my 300MHz R12k O2 for some tasks. It really depends
> on what test is being run - the RM7k might be better in some integer tasks, but R12k is usually better in any floating point tasks. ...

That's very true. For the Alias test I ran (complex scene), the R7K/350 was beaten by the R10K/250 in O2, whereas for the Maya test (simple scene)
the R7K/350 beats the R12K/270, though probably not as good as R12K/300. For the C-ray test, even an R10K/195 beat the R7K/350. By contrast,
for code compilation the R7K/350 beats an R12K/300. I've not done the relevant GIMP tests yet, but they will probably be around the R12K/270
to R12K/300 range.


> The R12k 400MHz is the fastest official CPU, and it also has a 2Mb L2 cache, so that is faster than any official CPU in the R5k series.

Indeed, and it also beats the R7K/600 in many cases, but again it varies. For code compilation, the R7K/600 is faster than R12K/400 when
using GCC, but the R12K is faster when using MIPS Pro. Can't comment much beyond this, not enough data yet.

If anyone can lend me an R7K/600 module then I'll run all the tests. Still a fair few holes in my results tables atm.


> My experiences are only from running applications/compilers etc on the machine. ...

You're lucky, code compilation is one area where the R7K does well in O2.


> ... I never ran benchmarks. Ian Mapleson has, ...

That's putting it mildly. :D Way too many late nights... zzz...


> ... And just to be clear, this only applies to the above CPU's in an O2 - the same CPU in an Octane will be much better, as Octane has a
> better architecture.

Absolutely, any R12K in Octane will leave O2 in the dust. For code compilation, an R10K/250 Octane beats an R7K/600 O2 with MIPS Pro,
while for GCC an Octane R12K/360 beats an O2 R7K/600. There's a huge cost difference for these configs; Octane is much cheaper.
However, as has often been said, O2 is low-power, low-noise, compact, etc.

Full details here:

http://www.sgidepot.co.uk/perfcomp.html

Ian.

_________________
SGI Systems/Parts/Spares/Upgrades For Sale: http://www.sgidepot.co.uk/sgidepot/
[email protected], [email protected], +44 (0)131 476 0796, check my auctions on eBid!
Sick of eBay? Try eBid instead! Safe, secure, cheaper, and buying is free! 8) Sign up here.
jan-jaap wrote:
When you use MIPSpro, I assume you build optimized binaries for each CPU?


Nope, I just run the Makefile as-is, since that's a more realistic test of what an ordinary user would experience. Besides, I'm not interested
in getting an optimised binary of the compiled program, just how long the compilation takes.

Ian.
directedition wrote:
I think there's a disconnect here. The benchmark being measured is compile time. There's no real point in spending time adding
processor-specific optimizations when you're only measuring time spent compiling a given application. Hence, "I just run the Makefile as-is".


Correct! 8)

The compiled program is not used for anything. It's how long it takes to compile the program that I'm interested in. For O2, with the
current generation of GCC/SGI compilers, the results show GCC runs better with an R7K/600 compared to R12K/400, whereas for
MIPS Pro the R12K is better. In both cases though, an Octane or Fuel will obviously be even faster. Here's the raw results page
without frames.

Ian.
jan-jaap writes:
> ... GCC schedules instructions for an R4x00. ...

GCC also knows about R5K/R7K, which is why it runs better on such systems.


> ... I don't know what flags SGI used when they built MIPSpro, but I doubt that they optimized the binaries exclusively for R4x00.

You might be surprised. The compiler binaries installed on my Fuel are, as far as I can be bothered to check offhand, all MIPS3.
In fact, /usr/bin/driver is MIPS2.

Ian.
jan-jaap writes:
> GCC can target those CPUs, but the GCC binaries themselves are built without target specific flags (just '-O2 -g') and therefore default to R4x00 (that's
> simply the default target for GCC). If they run (relatively) faster on a certain CPU, it is because that CPU is architecturally closer to an R4k than some other CPU.

This sounds a bit weird to me, ie. the idea that a particular MIPS3 binary that does lots of int processing should run better on an R5K than an R10K, given that
every other int test I know of is faster on R10K. Check the SPECint95 results for R10K/195 O2 vs. R5K/180SC O2 for example, every test is 2X
better (or more) on the R10K (the gcc test gives 4.57 on the R5K vs. 9.02 on the R10K). I know this is all with respect to a SUN reference system,
but even so...


> -- I'll see if I can do an R10k profile-optimized bootstrap of a GCC 4.4 snapshot. See if it does much for the speed of GCC itself.

Remember the version I used for testing was V3.4.6.


> I can tell MIPSpro (with '-r10000 -mips3') to schedule instructions such that it favors R10k, but use no instructions exclusively available on MIPS4 (so the
> resulting binary runs on R4k, but favors R10k). ...

So perhaps Neko GCC has been compiled MIPS3 but with code that favours R5K? That would explain it.


> ... R10k is an out-of-order CPU, so it can rearrange the instruction stream on the fly to a certain extent and would therefore be more tolerant to code
> optimized for a CPU other than itself. ...

Doesn't seem to help it much for running GCC though. :)


> ... R8000 is an in-order CPU which is why it was so lousy at running contemporary (R4k optimized) code.

Was it you I was talking to about sorting a decent ATLAS build or something, perhaps making an R8K-optimised Blender build? :)

Ian.
QuicksilverG4 wrote: There is no SGI solution that's as easy and no where near as cheap.


That's very true. Doing it with SGIs is just more interesting and a bit of a challenge, which is why I'm doing it that way,
using a combination of O2/Octane with video options, though I don't need to use any professional apps. However,
I'll be using a PC for doing the final format conversion, probably buying an i7 920 system in May/June.

Can't give specific details yet though, not had time to build the final setup and do proper tests. Done early tests
which worked ok, capturing with O2, converting to uncompressed AVI and then to DivX using a PC, but the available
sw and codecs have improved since then. The one thing I have done is buy the commercial version of DivX.

pub_bronx, the AV board for O2 is cheap, and the digvid board for O2 isn't too bad either really given the much higher
cost of the equivalent option for Octane, etc. Software is an entirely different issue though and for many is often the
main stickler with using SGIs. In theory your position ought to be the same as mine, ie. only really need cut & paste, so
the supplied tools ought to suffice. I've yet to explore Premiere but it runs ok on O2.

I've been accumulating 300GB disks for my setup, have 9 of them now. Just need some spare time. *sigh*

Ian.
A follow-up to this with some solid results...

I tested an R5K/300 O2 fitted with the dig vid board and dual-channel Adaptec, with an array containing eight old/slow 73GB drives,
running Illusion 6.1v8; diskperf gave:


# req_size  fwd_wt  fwd_rd  bwd_wt  bwd_rd  rnd_wt  rnd_rd
#  (bytes)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)
#---------------------------------------------------------
1245184   50.89   57.31   49.65   49.15   49.64   50.25
2490368   52.83   60.90   52.19   55.90   52.25   56.42


which is more than enough for D1 PAL. I used an Octane with DIGVID as a D1 source and then tested Illusion on the O2 for real-time
capture and playback, which did indeed work ok with the material stored as RGB frames.
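
As a rough sanity check on the "more than enough for D1 PAL" point, here is a back-of-envelope calculation. It assumes the standard 720x576 active picture at 25 frames/sec, and that "RGB frames" means 3 bytes per pixel (4 if a padding/alpha byte is stored); Illusion's real on-disk format may differ, and whether diskperf counts a MB as 2^20 or 10^6 bytes is also an assumption, so treat this as a sketch only. For what it's worth, the 1245184-byte request size looks consistent with one 720x576x3 frame (1244160 bytes) rounded up to a multiple of 4096.

```python
# Rough sustained data rate for uncompressed D1 PAL stored as RGB frames.
# Assumptions (not taken from the diskperf run itself): 720x576 active
# picture, 25 fps, 3 bytes/pixel for RGB or 4 bytes/pixel if a padding or
# alpha byte is kept, and 1 MB = 2**20 bytes.

WIDTH, HEIGHT, FPS = 720, 576, 25

def rate_mb_per_s(bytes_per_pixel):
    """Return the required throughput in MB/s (1 MB = 2**20 bytes)."""
    return WIDTH * HEIGHT * bytes_per_pixel * FPS / 2**20

# Slowest figure in the diskperf table above (bwd_rd at 1245184 bytes).
SLOWEST_MEASURED = 49.15

for name, bpp in (("RGB", 3), ("RGBA", 4)):
    need = rate_mb_per_s(bpp)
    print(f"{name}: {need:.1f} MB/s needed, "
          f"headroom {SLOWEST_MEASURED - need:.1f} MB/s")
```

Under those assumptions the array has headroom even in the 4 bytes/pixel case, which fits with real-time capture and playback working in practice.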

Biggest difference though is the Octane (a single-R12K/400) was - for example - way faster than the O2 for exporting a clip
out as Targa files, and the GUI interface responsiveness was far superior on the Octane.

But for practical purposes, the O2 does indeed work fine for real-time capture using one of these Adaptec dual-channel cards
and the kernel mod. Here's the O2's hinv, notice the extra SCSI channels:


CPU: MIPS R5000 Processor Chip Revision: 10.0
FPU: MIPS R5000 Floating Point Coprocessor Revision: 10.0
1 300 MHZ IP32 Processor
Main memory size: 1024 Mbytes
Secondary unified instruction/data cache size: 1 Mbyte on Processor 0
Instruction cache size: 32 Kbytes
Data cache size: 32 Kbytes
FLASH PROM version 4.18
Integral SCSI controller 0: Version ADAPTEC 7880
Disk drive: unit 1 on SCSI controller 0 (unit 1)
Disk drive: unit 2 on SCSI controller 0 (unit 2)
CDROM: unit 3 on SCSI controller 0
Integral SCSI controller 1: Version ADAPTEC 7880
PCI SCSI controller 3: Version ADAPTEC 7880
Disk drive: unit 1 on SCSI controller 3 (unit 1)
Disk drive: unit 2 on SCSI controller 3 (unit 2)
Disk drive: unit 3 on SCSI controller 3 (unit 3)
Disk drive: unit 4 on SCSI controller 3 (unit 4)
PCI SCSI controller 4: Version ADAPTEC 7880
Disk drive: unit 1 on SCSI controller 4 (unit 1)
Disk drive: unit 2 on SCSI controller 4 (unit 2)
Disk drive: unit 3 on SCSI controller 4 (unit 3)
Disk drive: unit 4 on SCSI controller 4 (unit 4)
On-board serial ports: tty1
On-board serial ports: tty2
On-board EPP/ECP parallel port
CRM graphics installed
Integral Ethernet: ec0, version 1
Iris Audio Processor: version A3 revision 0
PCI Adapter ID (vendor 0x9004, device 0x8078) PCI slot 1
PCI Adapter ID (vendor 0x9004, device 0x8078) PCI slot 2
PCI Adapter ID (vendor 0x1011, device 0x0001) PCI slot 3
PCI Adapter ID (vendor 0x9004, device 0x8278) PCI slot 4
PCI Adapter ID (vendor 0x9004, device 0x8278) PCI slot 5
Video: MVP unit 0 version 1.4
AV: AV2 Card version 0, Camera not connected.
Vice: TRE

Graphics board 0 is "CRM" graphics.
Managed (":0.0") 1280x1024
32 + 32 bitplanes
board revision 2, CRM revision C, GBE revision B
Monitor 0 type: Unknown
Channel 0:
Origin = (0,0)
Video Output: 1280 pixels, 1024 lines, 50.00Hz (1280x1024_50)



Cheers! :)

Ian.
eMGee wrote: From what I can tell, for regular VHS, that you'd need something like a DMediaPro DM6 (for SD, more shouldn't be needed for this specific task at least) with optional DM5 /VBOB plus a VTR (not to mention cables) and either a VTR that is merely VHS/SD, yet with SDI I/O, or something to feed the VHS/SD into the board/VBOB somehow.


For VHS, that's overkill. One can capture VHS just fine using an O2 or Cosmo2 Indy/Indigo2/Octane. Likewise, for anything digital that
does not need HD, an O2 with digvid works fine. I've been testing an O2 today with Illusion and it captures real-time D1 PAL no problem,
and playback is ok too. Editing is probably best done by moving the data to an Octane2/Fuel/Tezro though, as even a single-CPU R12K/400
is so much faster at (for example) exporting a clip as Targa files than a typical R5K/300 O2 (yes, there are faster O2s, but even an R5K/300 O2
is more expensive than an R12K/400 Octane). Come to think of it, an Octane with the older DIGVID is not that costly now; for digital SD
work it would be better than O2 by far.

I hadn't considered using Illusion before, but now I know it works ok I will probably use it instead of MovieMaker for editing. As for capture,
O2 with analogue AV and dmrecord works fine, though for tapes with degraded quality it is better to use Indigo2 with IMPACT Compression
or Octane with Octane Compression (I have both) since Cosmo2 is better at coping with noisy signals than O2's VICE hardware.

Even including the cost of a decent PC for final format conversion, having an O2/Octane/Fuel with the relevant lesser but perfectly ok options
would be much cheaper than anything that requires a DM5/VBOB.

For reference, I have about 300 tapes to digitise, so this is a key area of study for me.

Ian.
jan-jaap writes:
> Does anybody have any experience with the Miranda mini converters, e.g. SDI <-> composite or SDI <-> component? ...

I've never been able to find one, at least not at a useful price.


> ... A PowerSeries with SDI video, how cool is that ;)

Nice!! I have a VideoCreator unit, but until I get my Crimson working it's going nowhere.

Ian.
jan-jaap writes:
> Wait a second -- you said only one of the two channels was external. ...

That was 2 months ago. :D


> ... Did you hack a flatcable into the case ...

Of course! 8)

Ian.
Thanks!! What version of Blender did you use? (2.44 I hope)

Is there a hinv of your system somewhere I can link to on my results page?


Follow-up for this thread's main topic...

I tried Illusion on an R12K/400 O2 with the same dual-channel Adaptec card and the performance was
much better for using the application, but about the same for diskperf results. I'm wondering if accessing
both channels through one 32bit/33MHz PCI card is causing an issue somehow. Will try next with the
built-in UW channel plus just one of the extra card's channels...
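
For reference while testing this, a quick sketch of the theoretical ceilings involved. The figures are assumptions rather than measurements: one 32-bit transfer per clock at 33.33 MHz for PCI, and 40 MB/s per channel for Ultra Wide SCSI (which is what the AIC-7880 class controllers in the hinv normally run at). Real throughput is always lower because of protocol and arbitration overhead, so this only shows where the limits sit, not what is actually causing the plateau.

```python
# Theoretical ceilings for the dual-channel card setup described above.
# Assumptions: 33.33 MHz PCI clock with one 32-bit transfer per clock, and
# 40 MB/s per channel for Ultra Wide SCSI. These are peak figures only.

PCI_WIDTH_BITS = 32
PCI_CLOCK_MHZ = 33.33
UW_SCSI_MB_S = 40.0
CHANNELS_ON_CARD = 2

pci_peak = PCI_WIDTH_BITS / 8 * PCI_CLOCK_MHZ    # MB/s (decimal megabytes)
scsi_peak = UW_SCSI_MB_S * CHANNELS_ON_CARD      # MB/s across both channels
measured_best = 60.90                            # best diskperf figure (fwd_rd)

print(f"PCI 32bit/33MHz peak : {pci_peak:.1f} MB/s")
print(f"2 x UW SCSI peak     : {scsi_peak:.1f} MB/s")
print(f"Measured best        : {measured_best:.2f} MB/s "
      f"({measured_best / scsi_peak:.0%} of SCSI peak, "
      f"{measured_best / pci_peak:.0%} of PCI peak)")
```

On paper neither limit is reached, so if the single card is the issue it is more likely down to overhead or how the two channels share the bus than to raw bandwidth; the test with the built-in UW channel should help narrow that down.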

The R12K did make a huge difference for using Illusion. Much more responsive menus, and the app loaded
up almost as fast as it does on an Octane/400.

Ian.
I'd like to know that too.

Also, is anyone using ShotMaker? (rewrite of MovieMaker, optimised for uncompressed video)

SGI released a license some time ago:

FEATURE SHOTMAKER sgifd 1.000 01-jan-0 0 8D523EC13F12C3B65C47 \
HOSTID=ANY vendor_info="ShotMaker" SN=133659 \
ISSUER="Silicon Graphics, Inc." NOTICE="Courtesy" ck=77

but I've not managed to find the installation source yet.

Setting up a capture array is easy. For SD, even a QLA1080 or 1280 will do, though I use a QLA12160.

Ian.
nekonoko wrote:
I thought you asked for that tardist a while back - I put it up on Nekochan FTP (under /pub/irix/Graphics).


Oh! Not quite sure what happened there. Perhaps I was unable to download it at the time for some reason.

Thanks!! Downloading now... I'll add it to my site later this evening.

Ian.
Thought you might like to know, seems to work fine on Fuel/900 with 6.5.26.

One question though - why is it so much slower than, say, V2.44? (the one I've been using for benchmarking)

On my Fuel/900, V2.44 does the test scene in 4 mins 16 secs, but the V2.48 build does it in 5 mins 30 seconds,
ie. 30% slower. What has changed?
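
To be precise about that figure, here is the arithmetic using only the two times quoted above. "Slower" can be read two ways (extra wall-clock time, or lost throughput), so the sketch prints both.

```python
# Render times quoted above for the same test scene on the Fuel/900.
t_244 = 4 * 60 + 16    # Blender 2.44: 4 mins 16 secs -> 256 s
t_248 = 5 * 60 + 30    # Blender 2.48: 5 mins 30 secs -> 330 s

extra_time = (t_248 - t_244) / t_244     # how much longer 2.48 takes
lost_throughput = 1 - t_244 / t_248      # how much less work per second

print(f"2.48 takes {extra_time:.1%} longer than 2.44")
print(f"2.48 throughput is {lost_throughput:.1%} lower than 2.44")
```

That works out to roughly 29% more render time, or about 22% lower throughput, depending on which way round one measures it.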

Supporting lots of threads is not that useful until the app is first optimised for one thread.

Ian.
I'm working on a charitable PC build for the Learn Engineering YouTube channel. Please PM/email/call if you'd like to contribute! Donations of items I can sell to provide funds are also welcome.
[email protected]
+44 (0)131 476 0796
+44 (0)7434 635 121
skywriter writes:
> render optimization comes and goes, depending on the whims of (usually) Ton :) somethings may get tweaked at the expense of
> others, depending on real world work rather than the test.blend.

Yeah but still, 30%?? Kinda huge drop IMO. Makes a nonsense of inter-version comparisons, which is why I'm sticking to 2.44 for
benchmarking. And remember this is just for a very simple scene; quite possible the speed loss would be much worse for a complex scene.


> fwiw - although the renderer is multithreaded, not all parts, or indeed the most computationally expensive parts are parallelized.

I noticed. Personally I don't like the way the threaded stuff works at all. Sometimes it doesn't start using more than one thread
until after the first block has been completed, there's a major stutter when each thread completes which can be nasty if a whole bunch
finish at more or less the same time, and once the no. of remaining blocks is less than the no. of threads then the parallelism drops
off completely (very poor if the last coupla blocks happen to be complex). Ah well, better than nothing. :)

I know it's just a very simple test program by comparison, but I like the way C-ray does it. Parallelism is always maxed as much as it
can be since it's done by scanline.
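
The tail-off effect can be shown with a toy model. This is not Blender's or C-ray's actual scheduler, just a greedy "next free thread takes the next unit" simulation, and the block counts and costs are made-up numbers for illustration: 16 large blocks where the last two are 5X as expensive, versus the same work cut into 30 smaller scanline-like slices per block.

```python
import heapq

def makespan(costs, workers):
    """Greedy schedule: each free worker takes the next unit in order.
    Returns the time at which the last worker finishes."""
    finish = [0.0] * workers               # min-heap of per-worker finish times
    for cost in costs:
        start = heapq.heappop(finish)      # earliest-free worker takes the unit
        heapq.heappush(finish, start + cost)
    return max(finish)

def utilisation(costs, workers):
    """Fraction of total worker-time spent doing useful work."""
    return sum(costs) / (workers * makespan(costs, workers))

WORKERS = 4

# Hypothetical image: 16 big blocks, the last two of which are 5X as costly.
blocks = [10.0] * 14 + [50.0] * 2

# Same work, but each block is cut into 30 equal, much smaller units
# (a crude stand-in for splitting the work by scanline).
slices = [cost / 30 for cost in blocks for _ in range(30)]

for name, units in (("blocks", blocks), ("slices", slices)):
    print(f"{name}: finished at t={makespan(units, WORKERS):.1f}, "
          f"utilisation {utilisation(units, WORKERS):.0%}")
```

With these made-up costs the block version ends with two threads idle while the other two grind through the expensive blocks, giving 75% utilisation, whereas the finely sliced version keeps all four threads busy almost to the end. Finer units cost more in scheduling overhead in a real renderer, which this model ignores.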

Ian.
SGI Japan sold the Asterix x86 systems for a while. I wonder if their site still has the info...

Ian.
skywriter writes:
> it's possible that there could have been an increase on an intel/pc architecture. it's been a long time since anyone cared about sgi/irix
> performance in the vast majority of the blender community.

Hmm, I don't see how that could occur though, after all the build is compiled for MIPS. How could a general 'method' of rendering a scene
within the application be magically faster after a revision on an x86 platform? ie. the algorithm is surely unrelated to the final compiled
code for whatever platform.


> if you bashed your head against the blender development community you would find they haven't been interested so much as doing it
> right, but getting something at all done. the threading and distributed rendering is really, really, really, primitive.

So I see. Oh well.

Ian.