The collected works of LionSGI

hi!

i bought an O200 a long time ago, but it is dead. it does nothing, it only makes a lot of noise :D i had some time and soldered an MSC cable. with this i am able to access the MSC and it works, but the O200 outputs nothing on the serial port... what could the problem be? i wanted to buy a carrier logic board, but i could not find any in Europe for weeks, then i stopped searching for it. my boards are: 030-1025-002 (1M cache support), 013-1896-001 (dual R10K/180/1M) and 030-1235-001 (PCI board). i suspect the carrier logic is dead. now i found one for sale: 030-1389-001 (it is a 4M cache support board). would it work with dual 1M CPUs and the PCI board? what else should i do/look for? maybe it's working... or am i just too n00b? :P

thanks!
Richard
thank you!

unfortunately i could not find a matching combo (motherboard + cpu) for sale, so i have to keep looking for one. is there any documentation which describes which cpu goes with which motherboard for the O200? i don't know O200s: does the motherboard clock the cpu, or do the cpus clock themselves (and where is this info held, NVRAM?) i bought a complete system that used to work at some point, but no one knows what happened to it. it could be that the ram modules are faulty, is there a way to test them in other computers? i have an octane1/r12k too (it works well :D ). my friend has an r5k/200 o2. i have access to some SUNs too...

regards,
Richard
i've got 6 memory modules in my O200, 4 of them (the inner ones; the 2 innermost slots are empty) are yellow. i've noticed that they don't sit very well in their sockets: you can move them 1-2 millimeters sideways when inserted, and there is only one clip per socket. feels like kinda lame pc... :( does faulty memory affect serial terminal output? i also put the system disk into my octane, it looked ok (no hw damage and the fs was clean). i will play with the memory modules. the MSC port works so nicely :D although i have no special plans for the O200, it would be nice if it would run, and maybe i could find some task that i could use it for...

i see, you are parting out a purple i2... i have 2 of them, both are defective :( both are R10k, a max impact and a high impact. the upper board of the impact is faulty. one does nothing, the other messes up the screen with random pixels (later with matrix like patterns, here is the screenshot http://www.kozarmisleny.hu/_lionsgi/sgi ... enshot.rgb ), but the machine is stable... it could be a cold solder joint somewhere near the framebuffer RAM or DAC, or faulty RAM or DAC. i've disassembled the boards and resoldered some pins, but nothing changed. it used to work fine for 2 years, then it started messing up when cold; after 10 mins it went away. it did this for half a year and got worse from week to week... i like my purple maximpact the most of all my machines, it was my first SGI.

one PSU is also bad, it has capacitor problems (mostly it doesn't start; if it starts, it works fine)
is the flash memory in the O200 the DALLAS chip? it is socketed, so it could be moved into the "new" carrier logic, to ensure it will work with the cpu that was in the faulty one...
i had some spare time this weekend :D and i decided to disassemble my dead o200... i completely disassembled the dual R10k/180/1M board. i cleaned the pins and so on... there is a thin something between the cpu and the heatsink. on one cpu this was completely loose, so i guess that cpu did not have (or had bad) contact with the heatsink. now i screwed the cpu board back together with only the other cpu, and got this:

1A 000: Starting PROM Boot process
1A 000:
1A 000:
1A 000: IP27 PROM SGI Version 6.17 built 04:52:09 PM Sep 21, 1998
1A 000: using BaseIO nic
1A 000: Testing/Initializing memory ............... DONE
1A 000: Copying PROM code to memory ............... DONE
1A 000: Discovering local IO ...................... DONE
1A 000: update_klcfg_cpuinfo: Couldn't find my structure.
1A 000: Discovering CrayLink connectivity .........
1A 000: Local hub CrayLink is down.
1A 000: *** Local network link down
1A 000: DONE
1A 000: Found 1 objects (1 hubs, 0 routers) in 38663 usec
1A 000: Waiting for peers to complete discovery.... DONE
1A 000: No other nodes present; becoming global master
1A 000: Global master is /hw/module/1/slot/MotherBoard
1A 000: Testing/Initializing all memory ........... DONE
1A 000: *** Nasid 0: CPU B was previously Present & Enabled but is now Absent
1A 000: *** Nasid 0: Memory bank 1 was previously Present & Enabled but is now Absent
1A 000: *** Nasid 0: Memory bank 2 was previously Present & Enabled but is now Absent
1A 000: *** Nasid 0: Memory bank 1 was previously had 64 MB but now has 0 MB
1A 000: *** Nasid 0: Memory bank 2 was previously had 64 MB but now has 0 MB
1A 000: Checking partitioning information ......... DONE
1A 000: No other nodes present; becoming partition master
1A 000: *** No console found. Searching for console...
1A 000: *** Found console on /hw/module/1/slot/io1.
1A 000: *** You can change the console by setting the ConsolePath variable
1A 000: *** Setting ConsolePath variable and resetting.
1A 000: Starting PROM Boot process

then i disassembled the whole thing again, and i noticed that there is a pin missing from the second plastic socket that sits between the cpu and the pcb. should this pin be missing? after turning it on i got this:

1B 000: Testing/Initializing all memory ........... DONE
1B 000: *** Nasid 0: CPU A was previously Present & Enabled but is now Present & Disabled
1B 000: *** Nasid 0: CPU B was previously Absent but is now Present & Enabled
1B 000: Checking partitioning information ......... DONE

how many working CPUs do i have then? one? :D
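the state changes the PROM reports can also be pulled out of a captured MSC log mechanically. below is a minimal python sketch, assuming the log has been saved as plain text with each message on one line; the function name `status_changes` and the regular expression are illustrative only, not anything SGI ships:

```python
import re

# Matches PROM lines such as:
#   1B 000: *** Nasid 0: CPU B was previously Absent but is now Present & Enabled
# The pattern is an assumption derived from the log lines quoted in this thread.
CHANGE = re.compile(
    r"\*\*\* Nasid (?P<nasid>\d+): (?P<unit>.+?) was previously "
    r"(?P<before>.+?) but (?:is )?now (?P<after>.+)$"
)

def status_changes(log_text):
    """Return (unit, before, after) tuples for every state-change line."""
    changes = []
    for line in log_text.splitlines():
        m = CHANGE.search(line.rstrip())
        if m:
            changes.append((m.group("unit"), m.group("before"), m.group("after")))
    return changes

SAMPLE = """\
1A 000: *** Nasid 0: CPU B was previously Present & Enabled but is now Absent
1B 000: *** Nasid 0: CPU B was previously Absent but is now Present & Enabled
1A 000: *** Nasid 0: Memory bank 1 was previously had 64 MB but now has 0 MB
1A 000: Checking partitioning information ......... DONE
"""

for unit, before, after in status_changes(SAMPLE):
    print(f"{unit}: {before} -> {after}")
```

a unit that ends up "Present & Disabled" or "Absent" is one the PROM is not using for that boot, which is one way to count the CPUs it actually accepted.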

after restarting i then got this:

1B 000: Starting PROM Boot process
1B 000: serial_pio failed with 2 errors
1B 000: serial_pio failed:
1B 000: RSLT serial_pio FAIL diag_rc = 63
1B 000:
1B 000: diag_serial_pio: /hw/module/1/slot/io7: FAILED
1B 000:
1B 000:
1B 000: IP27 PROM SGI Version 6.17 built 04:52:09 PM Sep 21, 1998
1B 000: using BaseIO nic
1B 000: Testing/Initializing memory ............... DONE
1B 000: Copying PROM code to memory ............... DONE
1B 000: Discovering local IO ...................... DONE
1B 000: Discovering CrayLink connectivity .........
1B 000: Local hub CrayLink is down.
1B 000: *** Local network link down
1B 000: DONE
1B 000: Found 1 objects (1 hubs, 0 routers) in 38388 usec
1B 000: Waiting for peers to complete discovery.... DONE
1B 000: No other nodes present; becoming global master
1B 000: Global master is /hw/module/1/slot/MotherBoard
1B 000: Testing/Initializing all memory ........... DONE
1B 000: Checking partitioning information ......... DONE
1B 000: No other nodes present; becoming partition master
1B 000: *** No console found. Searching for console...
1B 000: *** Found console on /hw/module/1/slot/io1.
1B 000: *** You can change the console by setting the ConsolePath variable
1B 000: *** Setting ConsolePath variable and resetting.
1B 000: Starting PROM Boot process
1B 000: serial_pio failed with 2 errors
1B 000: serial_pio failed:
1B 000: RSLT serial_pio FAIL diag_rc = 63
1B 000:
1B 000: diag_serial_pio: /hw/module/1/slot/io7: FAILED

what does it mean, in simple terms? there is no serial terminal and no scsi hdd attached to it...

thanks!
so, one of my CPUs is dead, that's for sure. i dismounted it, and now the machine runs with only one CPU. i get all this output on the MSC port. when i plug a serial cable into the serial port, i get only garbage. the cable is working though, since it works on my octane. on the octane i can go into the PROM and do things, but on this o200 only garbage comes out (like viewing an executable file with an ASCII editor). how do i access the PROM on the o200? (i am afraid i have to look for a working carrier + cpu...)

here is the MSC output i now get each time:

MSC> pwr u
ok
1A 000: Starting PROM Boot process
1A 000: serial_pio failed with 2 errors
1A 000: serial_pio failed:
1A 000: RSLT serial_pio FAIL diag_rc = 63
1A 000:
1A 000: diag_serial_pio: /hw/module/1/slot/io7: FAILED
1A 000:
1A 000:
1A 000: IP27 PROM SGI Version 6.17 built 04:52:09 PM Sep 21, 1998
1A 000: using BaseIO nic
1A 000: Testing/Initializing memory ............... DONE
1A 000: Copying PROM code to memory ............... DONE
1A 000: Discovering local IO ...................... DONE
1A 000: update_klcfg_cpuinfo: Couldn't find my structure.
1A 000: Discovering CrayLink connectivity .........
1A 000: Local hub CrayLink is down.
1A 000: *** Local network link down
1A 000: DONE
1A 000: Found 1 objects (1 hubs, 0 routers) in 38496 usec
1A 000: Waiting for peers to complete discovery.... DONE
1A 000: No other nodes present; becoming global master
1A 000: Global master is /hw/module/1/slot/MotherBoard
1A 000: Testing/Initializing all memory ........... DONE
1A 000: Checking partitioning information ......... DONE
1A 000: No other nodes present; becoming partition master
1A 000: *** No console found. Searching for console...
1A 000: *** Found console on /hw/module/1/slot/io1.
1A 000: *** You can change the console by setting the ConsolePath variable
1A 000: *** Setting ConsolePath variable and resetting.
1A 000: Starting PROM Boot process
1A 000: serial_pio failed with 2 errors
1A 000: serial_pio failed:
1A 000: RSLT serial_pio FAIL diag_rc = 63
1A 000:
1A 000: diag_serial_pio: /hw/module/1/slot/io7: FAILED

maybe a bad serial port? there's a module near the serial ports, stuck onto the carrier logic. there is nothing on this little PCB. if i remove it, i get no BaseIO error...

thanks!
i could not get it going, so i bought another o200. it is working, but has only one cpu. i played around with it: i tried to put the degraded 2 cpu module (with the dead cpu dismounted, probably burnt due to bad heatsink contact) in the "new" carrier logic, and it wanted to boot. :D then i moved the cpu from the "new" module to the old degraded one as a second cpu. it did not do anything. then i dismounted the new cpu from the old module (it became degraded again); it still did not do anything :S. then i moved the new cpu back onto the new module, and it runs well. what could have happened to the old module? why did it not work at least in degraded mode again? i asked some people on the chat about this, and they said i have to upgrade the PROM. which chip is the PROM on the carrier logic? is it the little PLCC one near the pci board socket? what if i swap these chips (the old one originally had a dual cpu module, the new one a single)?

thanks!
hello!

now the origin complains, that some (1 :P ) of the processor(s) have old firmware, that i should upgrade... where do i get newer firmware for IP27 (Origin200/180/1M)?

thanks!
joerg wrote:
It's part of every IRIX installation and it's located under "/usr/cpu/firmware". Just use "flash" and it will use the default location.

Regards
Joerg


hello! i installed IRIX 6.5.22f yesterday. it did some upgrade, but it still complains about the old CPU flash. i ran the flash command by hand, but it did nothing (perhaps the installer already flashed the newest PROM that i have, so there is nothing more to do unless i get even newer PROMs). as far as i can remember i have v6.150 now, it was 6.31 before... any ideas?
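a side note on comparing these revision numbers: 6.150 only counts as newer than 6.31 if the parts are compared as separate integers. the python sketch below assumes (this is an assumption, not something confirmed in this thread) that PROM revisions are dotted integers like major.minor; `prom_rev` is a made-up helper name:

```python
def prom_rev(text):
    """Split a revision such as '6.150' into a tuple of ints: (6, 150)."""
    return tuple(int(part) for part in text.strip().split("."))

old, new = "6.31", "6.150"

# Component-wise comparison: (6, 150) vs (6, 31)
print(prom_rev(new) > prom_rev(old))

# Reading the same strings as decimal fractions gives the opposite answer,
# because 6.150 is then just 6.15
print(float(new) > float(old))
```

so if the revisions are dotted integers, going from 6.31 to 6.150 is an upgrade, even though it looks like a step back when read as a decimal.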

thanks!
hello!

this message comes on the console:

Starting up the system...

WARNING: Some CPUs have old firmware.  Please update node board flash proms.

and this is hinv -mv:

origin 4# hinv -mv
Location: /hw/module/1/slot/MotherBoard/node
PIMM_1XT5_1MB Board: barcode DWK946     part 013-1895-001 rev  C
Location: /hw/module/1/slot/MotherBoard/node/xtalk/8
IP29 Board: barcode DBF494     part 030-1025-002 rev  L
Location: /hw/module/1/slot/MotherBoard/node/xtalk/8/pci/2
1 180 MHZ IP27 Processor
CPU: MIPS R10000 Processor Chip Revision: 2.6
FPU: MIPS R10010 Floating Point Chip Revision: 2.6
CPU 0 at Module 1/Slot 1/Slice A: 180 Mhz MIPS R10000 Processor Chip (enabled)
Processor revision: 2.6. Scache: Size 1 MB Speed 120 Mhz  Tap 0x9
Main memory size: 384 Mbytes
Instruction cache size: 32 Kbytes
Data cache size: 32 Kbytes
Secondary unified instruction/data cache size: 1 Mbyte
Memory at Module 1/Slot 1: 384 MB (enabled)
Bank 0 contains 128 MB (Standard) DIMMS (enabled)
Bank 1 contains 64 MB (Standard) DIMMS (enabled)
Bank 2 contains 64 MB (Standard) DIMMS (enabled)
Bank 3 contains 128 MB (Standard) DIMMS (enabled)
Integral SCSI controller 0: Version QL1040B (rev. 2), single ended
Disk drive: unit 1 on SCSI controller 0 (unit 1)
Integral SCSI controller 1: Version QL1040B (rev. 2), single ended
CDROM: unit 2 on SCSI controller 1
IOC3/IOC4 serial port: tty1
IOC3/IOC4 serial port: tty2
IOC3 parallel port: plp1
Integral Fast Ethernet: ef0, version 1, module 1, slot MotherBoard, pci 2
Origin 200 base I/O, module 1 slot 1
PCI Adapter ID (vendor 0x10a9, device 0x0003) PCI slot 2
PCI Adapter ID (vendor 0x1077, device 0x1020) PCI slot 0
PCI Adapter ID (vendor 0x1077, device 0x1020) PCI slot 1
IOC3/IOC4 external interrupts: 1
HUB in Module 1/Slot 1: Revision 3 Speed 90.00 Mhz (enabled)
IP27prom in Module 1/Slot n1: Revision 6.150
origin 5#
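
since some memory banks were reported absent in earlier boots, it may be worth checking that the banks in this listing add up to the reported total. a minimal python sketch, assuming the hinv -mv output was saved as text; the patterns are taken from the lines above and `check_memory` is a made-up helper name:

```python
import re

# Patterns based on the hinv -mv lines quoted in this post
BANK = re.compile(r"Bank (\d+) contains (\d+) MB")
TOTAL = re.compile(r"Main memory size: (\d+) Mbytes")

def check_memory(hinv_text):
    """Return (reported_total, sum_of_banks) in MB from hinv -mv output."""
    banks = {int(n): int(mb) for n, mb in BANK.findall(hinv_text)}
    total = int(TOTAL.search(hinv_text).group(1))
    return total, sum(banks.values())

SAMPLE = """\
Main memory size: 384 Mbytes
Bank 0 contains 128 MB (Standard) DIMMS (enabled)
Bank 1 contains 64 MB (Standard) DIMMS (enabled)
Bank 2 contains 64 MB (Standard) DIMMS (enabled)
Bank 3 contains 128 MB (Standard) DIMMS (enabled)
"""

total, banks = check_memory(SAMPLE)
print(total, banks, "OK" if total == banks else "MISMATCH")
```

here 128 + 64 + 64 + 128 = 384 MB, which matches the reported main memory size, so all four banks are being counted.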

origin 14# uname -aR
IRIX64 origin 6.5 6.5.22m 10070055 IP27

so, how does IRIX know that the CPU flash is old, when the newest version it knows about is already flashed?

thanks!
i successfully managed to kill my "new" origin200. i only put the dual cpu module in it (with one cpu) that it used to work with, and now it does nothing anymore. so i put back the single cpu module (all CPUs are 180/1M) that i installed IRIX with 2 days ago; it still does nothing, though the MSC works well :P i am tired of the origin200 by now. i'm sure it saved something in the PROM, or in the DALLAS chip, or wherever... is there a way to reset these settings? can someone give me a full MSC command list?

thanks!