new Fuel machine - Page 1

OK, you were right. It's addictive...
Three weeks ago I asked you to help me with my new Octane (it works now, so thanks again for all the help), and today I have an SGI Fuel machine in my computer room.
So, again, a few questions:
1. It has some memory problems, I guess... After I press the power button, the LED blinks white (the screen is blank), and after an additional 15-30 seconds it starts to boot.
I knew it would act this way, as the eBay seller told me about it... The problem is (I guess) with one of the SIMMs, as after booting the OS sees only 1.5 GB of RAM instead of 2 GB.
I can remove 2 SIMMs as I don't need 2 GB, but I have a problem removing the blue shield that goes from the cooler to the RAM slots. Any hints? Do I have to remove the HD first?
2. Can I use USB pen drives, as it has USB ports? I tried, but with no luck. What FS type should the pen drive be? How do I mount it?
3. Seems like my machine has a PCI sound card of some sort. I'll take photos later and ask you to help me set it up, as it seems not to work, IMO.
4. There's also some other PCI card with connectors that look like FireWire or something... Photos to come later.
5. The last question is kind of general. Are there any good practices for which services from init level 2 (the default init level) could be stopped? Starting them at boot time makes the boot process a bit longer.

So that's all for now.
Collecting SGI machines (as well as my other '80s/'90s machines) is just a hobby, but I want to start a hobby project soon, trying to create some images with gfx tools from the '90s, just to see how different the tools were back then and how much progress we have made with the gfx tools we use nowadays.

_________________
8bit guy with: :Fuel: :Octane:
crrn wrote:
I have a problem removing the blue shield that goes from the cooler to the RAM slots. Any hints? Do I have to remove the HD first?

Have you tried doing it this way?
crrn wrote:
2. Can I use USB pen drives, as it has USB ports? I tried, but with no luck. What FS type should the pen drive be? How do I mount it?

Unfortunately, no.

_________________
:Indigo: :Indigo: :Indy: :Indy: :Indigo2: :Indigo2IMP: :Octane: :Fuel:
crrn wrote:
It has some memory problems, I guess... After I press the power button, the LED blinks white (the screen is blank), and after an additional 15-30 seconds it starts to boot.
Sounds suspiciously normal. :D The LED blinks white and the monitor is blank during power-on diagnostics - that's also roughly the right amount of time for power-on diagnostics with the amount of memory you have installed.

crrn wrote:
The problem is (I guess) with one of the SIMMs, as after booting the OS sees only 1.5 GB of RAM instead of 2 GB. I can remove 2 SIMMs as I don't need 2 GB
There's at least a small chance you might be able to revive the missing memory by clearing and resetting the Power On Diagnostic logs. Doesn't always work, but here's an example of success with the hardware-similar Origin 300: viewtopic.php?f=3&t=16725059

Prior to accessing your Fuel's innards, disconnect the power cord; the system isn't fully powered down unless it is completely disconnected from power.

Before you try the POD mode revival trick, I'd suggest using the link GL1zdA provided to ensure your Fuel's memory is installed in the correct alternating-slot configuration, and that each DIMM is fully seated. Those plus-sized IP35 DIMMs have enough mass to suffer inertia disconnects during shipment - if you're lucky, that'll be the underlying cause of the missing memory.

While you're in there, locate the 9-pin serial port on the system board. Connecting your serial terminal emulator of choice (38,400/8/N/1) to that port with a null modem cable is the best way to access both the L1 controller and PROM of your Fuel. The L1 will power up as soon as you re-connect power; once you power up the system you can access the PROM/POD by pressing the control and d keys on the serial terminal simultaneously.

crrn wrote:
Can I use USB pen drives as it has USB ports?
IRIX only supports a very limited subset of the USB standard.

The Fuel (IP35) Hardware Aggregator is a good resource for known-to-work Fuel hardware peripherals.

_________________
***********************************************************************
Welcome to ARMLand - 0/0x0d00
running...(sherwood-root 0607201829)
* InfiniteReality/Reality Software, IRIX 6.5 Release *
***********************************************************************
Embrace the madness, crrn! :D

Audio card: Get the model number off of it. Photos are fine, but you can just look up the board in the aggregator or on the Fuel NekoWiki page. There are only a few that are supported, but random no-name USB audio adapters seem to work very well (again, see the aggregator or just go here: viewtopic.php?f=3&t=16725440&p=7341670)

_________________
Then? :IRIS3130: ... Now? :O3x02L: :1600SW: +MLA :Fuel: :Octane2: :Octane: :Indigo2IMP: ... Other: DEC :BA213: :BA123: Sun , DG AViiON , NeXT :Cube:
OK, more problems occurred...

I tried the PROM/POD procedure, but it hangs after I type 'pod' at the PROM command line.
So I tried removing the DIMMs and putting them back in different configurations (actually I tried all possible configs), but nothing seems to work. The machine doesn't even start; it flashes white (3 flashes) and then starts to flash red. When I reinstall all 4 DIMMs it boots, but now it seems to see only 512 MB. :(
I assume I broke something in the DIMMs or the memory slots... which is difficult to understand, as I was extra careful and unplugged the power every time I touched the DIMMs.

Do you think that there is anything else I can do before buying new DIMMs on eBay?

I remember about the photos. Will post them later, no worries.

_________________
So no combination of two of your four DIMMs inserted in slots 0 & 2 will allow it to pass POST, but all four slots full will? That's odd. Myself, I'd retry the combinations and make doubly certain that for each DIMM insertion the card-edge fingers really go in the socket all the way - I've had that problem before with these critters... By "make sure" I mean grab a flashlight and stick your head in the machine, and if need be use an inspection mirror.

Also, I suspect this is where having a terminal attached to the motherboard serial port might be helpful as you power up.

_________________
crrn wrote:
I tried the PROM/POD procedure, but it hangs after I type 'pod' at the PROM command line.
I'd suggest trying the POD sequence again.

When you enter POD mode the system runs a comprehensive memory test, which might need a fair amount of time as it attempts to examine the disabled memory. The 'go cac' command is an indirect acknowledgement that POD mode wasn't built for speed. Once you get that far it allows POD to run in cached memory - but you'll still have to wait for the initial memory tests to run.

_________________
recondas wrote:
... might need a fair amount of time as it attempts to examine the disabled memory ...

Oooh ! Oooh ! Wait ! Wait !

Yes, Gunther ?

There's a command to reset the memory. The bootup sequence will disable memory slots that it sees as bad. If you move memory around to test, you can end up with a bunch of slots marked "bad" that really aren't. And it doesn't always re-identify the "bad" memory as good on the next boot sequence.

I forget the command (Oldtimer's kicking in) but it's saved my arse a few times.
hamei wrote:
Oooh ! Oooh ! Wait ! Wait !

There's a command to reset the memory.... I forget the command (Oldtimer's kicking in) but it's saved my arse a few times.
enableall?

It was included in the POD mode sequence linked above, but not explicitly mentioned in this thread. So that everyone is on the same page, here it is again:
recondas wrote:
To access POD mode, stop at the PROM command line (item 5 in the PROM menu list), and sequentially run the following commands from the command line:
    pod
    go cac
    clearalllogs
    initalllogs
    flush
    reset (the system will restart)

After the system restarts, go back into the PROM monitor and execute:
    enableall
    update
    reset (the system will restart)

_________________
OK, so here's the status:
1. Connected via null modem cable to the L1 on the motherboard.
2. It seems that 3 banks were marked disabled, but after the POD procedure all banks are enabled and the Fuel can see 2 GB of RAM.
So thanks for the help in this area so far.
But...

It boots into the OS and I can even log in to my user account, but starting apps like Blender, Photoshop, or anything bigger than xterm causes a black screen and a reboot.
All I can read (because the console output disappears too quickly) is some 'cache error'.

So IMO it might mean that something's wrong with the RAM. But which DIMM? How can I disable it? Any other options?
TIA

_________________
I had an Indigo2 that could crash like that. We suspected faulty memory and even exchanged the motherboard, but it was actually the processor cache, and replacing the PIMM fixed it. But it was one very specific Maya command that would make it crash, not "a couple of big applications".

Do Fuels still have IDE tests? Older machines use those to test the memory, CPU, etc. more extensively than POST. Not super reliable, but better than nothing.

Test the system (actually boot into IRIX and run Photoshop or whatever is making it crash) with one DIMM set at a time. You might just have to run with reduced memory until you can order new DIMMs.

Note that I'm not a Fuel expert and have never seen one in real life...

_________________
:Onyx: (Maradona) :Octane: (DavidVilla) A1186 (Xavi)
A1370 (Messi) dp43tf (Puyol) A1387 (Abidal) A1408 (Guardiola)

"InfiniteReality Graphics - Power Through Complexity"
crrn wrote:
All I can read (because the console output disappears too quickly) is some 'cache error'.

So IMO it might mean that something's wrong with the RAM. But which DIMM? How can I disable it? Any other options?
Before you remove any RAM, I'd suggest taking a look at the syslog to see if there are any entries that might give you a better idea of what's going on. If you can't access the syslog from the GUI tool in the Toolchest without generating a crash, you may have to print the contents (of /var/adm/SYSLOG) into a terminal window. You might also find entries related to the crash/error in the L1 log (connect the serial terminal and run the 'log' command at the L1 prompt).

If you aren't able to find anything useful in either log, then you could try booting IRIX from the serial terminal. Configure the serial terminal with a fairly large scroll back buffer (say 5k lines or so). Once IRIX is up and running try running a few of the commands/programs that previously caused the crash. If the system does crash the error message should remain in the serial terminal window as the Fuel crashes.

If any of these methods work, save the error message and post a copy here.
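
If the log turns out to be too long to eyeball, a small filter can narrow it down. What follows is only a minimal sketch in Python 3, assuming a copy of /var/adm/SYSLOG has been moved to a machine that has Python installed. The function name and the keyword list are made up for illustration and are not part of IRIX; adjust the pattern to whatever your log actually contains.

```python
import re

# Assumption: these keywords are a guess at what a hardware crash leaves behind;
# adjust the pattern to match what your log actually contains.
PATTERN = re.compile(r"fatal|panic|cache error", re.IGNORECASE)


def suspicious_lines(path, pattern=PATTERN):
    """Yield (line_number, text) for every log line that matches the pattern."""
    # errors="replace" keeps the scan going if the log holds odd bytes.
    with open(path, errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if pattern.search(line):
                yield number, line.rstrip("\n")
```

Looping over `suspicious_lines("SYSLOG.copy")` (the file name is hypothetical) and printing each pair gives a short list of line numbers to jump to in the full log.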

_________________
Here's part of SYSLOG
Code:
Jan 15 21:43:59 6D:Fuel sn0log: The following are messages stored in the flashlog from a previous system boot.
Jan 15 21:43:59 6D:Fuel sn0log: Flashlog for /hw/module/001c01/node/hub/mon
Jan 15 21:43:59 5D:Fuel sn0log: A Info: *** Local network link down
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: Serial #: 080069104199
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: HARDWARE ERROR STATE:
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: +  Errors on node Nasid 0x0 (0)
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: +    IP35 in /hw/module/001c01/node [serial number MET870]
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: +      BEDROCK signalled following errors.
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: +        BEDROCK PI 0 Error Interrupt Register: 0x15000000
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: +          24: Memory/Directory uncorrectable error, access bits uncorrectable error, or protocol error
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: +          26: CPU A SysAD Data quality bad during data cycle
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: +          28: CPU A received uncorrectable error during a cached load
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: +        BEDROCK PI 0 Error spool A:
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: +            *** 34 Access Errors to unpopulated memory skipped
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: +        BEDROCK PI 1 Error Interrupt Register: 0x1000000
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: +          24: Memory/Directory uncorrectable error, access bits uncorrectable error, or protocol error
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: +        BEDROCK MD Mem Error Register: 0x1300ff008de62700
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: +          32<->03: bad entry pointer 0x11bcc4e0 << 3 = (0x8de62700)
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: +          47<->40: bad memory syndrome 0xff
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: +          57<->56: uncorrectable memory write ecc (valid, overrun)
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: +          61<->60: uncorrectable memory read ecc (valid)
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: +      CPU cache on cpu 0
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: +        Cache error register: 0xffffffff88062700
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: +          Secondary cache error: sidx 0x62700  Way 1 addr 0x62700
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: +          27:  D, Uncorrectable data array error, way 1
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: End Hardware Error State
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: ++FRU ANALYSIS BEGIN
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: No rules triggered:  Insufficient data
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal:
Jan 15 21:43:59 5D:Fuel sn0log: Timeout Histogram is empty.
Jan 15 21:43:59 5D:Fuel sn0log:
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: ++FRU ANALYSIS END
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: PANIC: Cache Error (unrecoverable secondary cache error) Eframe = 0x838
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal:
Jan 15 21:43:59 5D:Fuel sn0log: Dumping to /hw/module/001c01/Ibrick/xtalk/15/pci/1/scsi_ctlr/0/target/1/lun/0/disk/partition/1/block at block 0, space: 0x2000 pages
Jan 15 21:43:59 5D:Fuel sn0log: A Info: System dump startedA Fatal: Dumping low memory...A Fatal:
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: Dumping static kernel pages...A Fatal:
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: Dumping pfdat pages...A Fatal:
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: Dumping backtrace pages...A Fatal:
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: Dumping dynamic kernel pages...A Fatal: WARNING: A Fatal: Cached read: Poison Access Violation, node 0x0 paddr 0xe20900A Info: *** Local network link down
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: Restarting the machine...
Jan 15 21:43:59 6D:Fuel sn0log: End of flashlog for /hw/module/001c01/node/hub/mon
Jan 15 21:43:59 6D:Fuel sn0log: End of flashlog messages.
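
The register values in the flashlog can be cross-checked against the bit numbers printed underneath them. The following is a minimal Python sketch of that arithmetic; the helper names `set_bits` and `field` are made up for illustration and are not part of any IRIX tool, and the only inputs are the hex values copied from the log above.

```python
def set_bits(value):
    """Return the positions of all bits set in an integer, lowest first."""
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


def field(value, high, low):
    """Extract the inclusive bit range high..low from an integer."""
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)


# Register values copied from the flashlog above.
pi0 = 0x15000000            # BEDROCK PI 0 Error Interrupt Register
pi1 = 0x1000000             # BEDROCK PI 1 Error Interrupt Register
md_mem = 0x1300FF008DE62700  # BEDROCK MD Mem Error Register

print(set_bits(pi0))                   # [24, 26, 28]
print(set_bits(pi1))                   # [24]
print(hex(field(md_mem, 32, 3)))       # 0x11bcc4e0 (the "bad entry pointer")
print(hex(field(md_mem, 32, 3) << 3))  # 0x8de62700
print(hex(field(md_mem, 47, 40)))      # 0xff (the "bad memory syndrome")
print(bin(field(md_mem, 57, 56)))      # 0b11
print(bin(field(md_mem, 61, 60)))      # 0b1
```

The decoded bits line up with the log: PI 0 reports bits 24, 26 and 28, PI 1 reports only bit 24, and the pointer and syndrome fields match the values printed next to them. This only confirms that the log is internally consistent; it does not say which part is at fault.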


_________________
crrn wrote:
All I can read (because the console output disappears too quickly) is some 'cache error'.

Cache memory is on the processor module.
It probably won't help, but you should try reseating the processor module as well (these things can get a bit loose during shipping).
If reseating the PM doesn't help, all may not be lost. In addition to the usual spare parts sources, you may be able to find another nekochan member with a PM from a Fuel processor upgrade. I haven't noticed much demand for leftover 500 MHz Fuel PMs, so that route should be relatively inexpensive.

_________________
@recondas, @ShadeOfBlue, all
Thanks for help so far.

Do you really think that it could be the PM? I can't read anything interesting in the SYSLOG, so I'm not sure.
To me the problem looks like a broken memory DIMM. I've seen several examples of this kind of crash behaviour, but that was in the PC world.

The thing is, I can't reproduce the error anymore. I ran my Fuel almost all day today and started the trial/demo version of Photoshop, Soofice, Blender, even demos, and I can't get the computer to crash (which normally would be a good sign ;) ).
Anyway, I'll try to stress test it (any ideas how to do it with memory-intensive apps?) and report back later.

One more question comes to mind: why did it report a broken memory DIMM previously, and now we think it could be the PM?
I assumed IT IS MEMORY, but now I'm confused...

_________________
crrn wrote:
One more question comes to mind: why did it report a broken memory DIMM previously, and now we think it could be the PM?
I assumed IT IS MEMORY, but now I'm confused...

Cache sits between memory and the CPU, so if the cache is broken, reading/writing the whole RAM becomes unreliable. If it were one DIMM, you wouldn't end up with all banks disabled, and it's unlikely that all your DIMMs are broken. Since you live near Warsaw, you can PM me if you would like to test your Fuel with a different set of DIMMs - I have four 256 MB modules left over after upgrading my Fuel, which I can lend you.

_________________
crrn wrote:
Do you really think that it could be the PM? I can't read anything interesting in the SYSLOG, so I'm not sure.


This is why I think it's the processor module, or rather a poor connection between the PM and the motherboard:
crrn wrote:
Code:
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: +          26: CPU A SysAD Data quality bad during data cycle
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: +          28: CPU A received uncorrectable error during a cached load
[...]
Jan 15 21:43:59 5D:Fuel sn0log: A Fatal: PANIC: Cache Error (unrecoverable secondary cache error) Eframe = 0x838

The SysAD bus connects the CPU to the memory controller and the L2 cache.

A poor connection between the processor module and the motherboard can also cause such errors (e.g. if the pins that carry power don't make a good contact, it can cause such problems during heavy load), so reseat the module even if it currently works. Make sure you don't touch any of the chips or the connector or the pads on the motherboard, and use an ESD wrist strap (or hold the chassis with your other hand while you touch any components inside). Don't place any components on the carpet. (Sorry if this seems obvious to you, but some people do that and it decreases the lifespan of the parts, so it's worth repeating :) )

If you're lucky, it is just a bad connection to the motherboard and everything will be OK once you reseat the processor module. If you're not, one of the L2 cache chips is malfunctioning and you will need a new processor module.

Another possibility is a broken fan or high ambient temperature. You can look at the output of "l1cmd env" (run this as root) during heavy load. If any of the fans report 0 RPM, you have a problem (but this is easily fixed by replacing the fan :) ).
Normally, the system would shut down in such cases, but due to a bad batch of DS1780 monitoring chips, this option is disabled by most users with systems that have the defective monitoring chip.

crrn wrote:
Anyway, I'll try to stress test it (any ideas how to do it with memory-intensive apps?) and report back later.

I once wrote a simple program for testing memory (it simply allocated as much RAM as it could get, then filled it with various patterns and verified the contents), but I'd have to dig it up from backups and I won't have the time to do that until next week.
The best option is probably to render a complex scene in Blender. There were some benchmark scenes posted for it here on the forum (or was that for Maya...).
Other than that, try to abuse it a bit :) Run something else while Blender is rendering, e.g. some demos.

Oh, you could also run the "confidence tests" (located somewhere in the toolchest menus) to test the hardware (I'm not sure if it does any memory tests, but it's worth a shot).
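
The fill-and-verify idea described above (allocate a buffer, fill it with patterns, check the contents) can be sketched in a few lines. This is not the program mentioned in the post, which is not shown in the thread; it is a generic Python 3 illustration, and the function name, pattern bytes, and buffer size are all assumptions. A user-space test like this cannot choose physical addresses and runs through the CPU caches, so treat a clean result as weak evidence only.

```python
def pattern_test(size_bytes, patterns=(0x00, 0xFF, 0xAA, 0x55)):
    """Fill a buffer with each pattern byte, read it back, and collect mismatches.

    Returns a list of (pattern, offset, value_read) tuples; empty means no errors seen.
    """
    buffer = bytearray(size_bytes)
    errors = []
    for pattern in patterns:
        # Write phase: fill the whole buffer with the pattern byte.
        buffer[:] = bytes([pattern]) * size_bytes
        # Verify phase: a clean buffer contains nothing but the pattern byte.
        if buffer.count(pattern) != size_bytes:
            for offset, value in enumerate(buffer):
                if value != pattern:
                    errors.append((pattern, offset, value))
    return errors


# 4 MiB keeps the demo quick; raise it toward the amount of free RAM to stress more.
mismatches = pattern_test(4 * 1024 * 1024)
print("mismatches:", len(mismatches))
```

Alternating patterns such as 0xAA and 0x55 flip every bit between passes, which is why they are a common choice for this kind of check.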
OK, I had some time yesterday to play with my Fuel again.
In general I think it works with 1 GB of RAM.
When I inserted 2 GB and tried to stress test it with Blender (created a huge scene with lots of polygons), it crashed at different moments/situations.
With 1 GB it works OK!

Before testing 2 GB I tightened the PM and everything else that could possibly cause connection problems, with no luck.
As it's my hobby machine and 1 GB seems to be enough for the moment, I'll stay with it.

Now it's time to see what PCI cards I have. Will post photos when I'm back home today.
