SGI: Hardware

Fuel won't power on (Looks like it's not a bad power supply)

I have a Fuel that won't power up. Pressing the power button does nothing, the light bar never flashes, etc. I assumed it was a bad power supply at first, but now I am not so sure. It has the 430 watt NMB power supply, and when plugged in the 5V standby rail is on and reasonably close to +5V. For further testing I removed it from the Fuel, plugged in a couple of old hard drives I keep around for load-testing PSUs, and connected the PS_ON pin of the WTX 24-pin connector to ground. All of the hard drives spun up and stayed running as long as PS_ON was grounded, which makes me think that at least the +5V and +12V rails are OK.

I've tried connecting a serial cable to the internal L1 port at 38400 8N1 and see no signs of life after hitting CTRL+T several times with or without a null modem adapter. Any ideas on what to check next?
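
For anyone repeating this check from a script instead of a terminal emulator, here is a minimal Python sketch of the same probe: open the port at 38400 8N1, send Ctrl+T a few times, and collect whatever comes back. It assumes a Unix-like host and uses only the standard library. The function names and the device path in the comment are placeholders of mine, not anything SGI-specific.

```python
import os
import select
import termios

# Ctrl+T as a terminal sends it: ASCII 0x14.
CTRL_T = bytes([ord("T") & 0x1F])


def open_8n1_38400(path):
    """Open a serial device in raw mode at 38400 baud, 8 data bits, no parity, 1 stop bit."""
    fd = os.open(path, os.O_RDWR | os.O_NOCTTY | os.O_NONBLOCK)
    attrs = termios.tcgetattr(fd)
    attrs[0] = 0                                             # iflag: no input translation
    attrs[1] = 0                                             # oflag: no output translation
    attrs[2] = termios.CS8 | termios.CREAD | termios.CLOCAL  # cflag: 8N1, ignore modem lines
    attrs[3] = 0                                             # lflag: no echo, no line editing
    attrs[4] = termios.B38400                                # input speed
    attrs[5] = termios.B38400                                # output speed
    termios.tcsetattr(fd, termios.TCSANOW, attrs)
    return fd


def probe(fd, attempts=5, timeout=1.0):
    """Send Ctrl+T `attempts` times and return every byte received in reply."""
    received = b""
    for _ in range(attempts):
        os.write(fd, CTRL_T)
        readable, _, _ = select.select([fd], [], [], timeout)
        if readable:
            received += os.read(fd, 4096)
    return received


# Example use (needs real hardware; the path is a placeholder):
#   fd = open_8n1_38400("/dev/ttyUSB0")
#   print(probe(fd) or "no response")
#   os.close(fd)
```

An empty result here means the same thing as a blank terminal window: either the L1 is not running or the cabling (straight-through vs. null modem) is wrong.
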
I had a similar problem a year ago and spent a lot of time modifying a standard ATX power supply to replace the presumably dead original NMB. In the end, I found out that the V12 was faulty. I replaced it with a V10 I had in stock and... tadaaa!

More information here .

Did you try to connect to the L1? What's the output of the "power" command?
:Onyx2: : oxygen / :A3504L: :A3504L: : neon (16xItanium2 1.6, L2 9MB) / :O200: :O200: : beryllium
:Fuel: : nitrogen / :Octane2: : carbon / :Octane: : fluorine
:O2: : hydrogen (R10k 195, 512Mo) / :O2: : sodium (R5k 180, 512Mo) / :O2: : R5k 180->200 motherboard and PM only
:Indigo2IMP: : helium (R10k 195, HighImpact, 160Mo) / :Indigo2IMP: : boron / :Indigo: : magnesium
:4D70G: 4D70GT : my very first one (now property of musée bolo and the foundation mémoires informatiques )
See the hinv/gfxinfo posts here .
I tried connecting to the L1 via the internal serial port with a terminal emulator set to 38400 8,N,1 and didn't get any output. Should the L1 be up and running as long as +5V standby is up?

I don't have any spare Fuel graphics cards lying around, but I removed the card that was installed, and the machine still will not power on.
ClarusWorks wrote: I removed the card that was installed, and the machine still will not power on.

A fuel will definitely not power up without a working graphics card installed. btdt.

I'm not sure that having 5V means much. The big problem with Fuel power supplies is not that the supply itself dies: there's a little SGI interface chip which dies, so the supply never gets a turn-on signal.

(That's a grossly simplified description; I've forgotten the details, but essentially that's what happens. There's a thread here by Kubatysko describing replacing the Fuel power supply with a standard one, and he goes into the problem in depth.)
I never thought that a fat man's face would ever look so sweet ...
Thanks for that bit of info. Even if the main CPU won't power on, will the L1 come up without a graphics card? It'd be useful to get some signs of life and maybe some debug info out of this motherboard, otherwise I'm trying to troubleshoot a large multilayer board I have no schematics for with a multimeter and luck.

As for the PSU issues, I've read up on those, and my understanding is that the only differences between a Fuel PSU and a standard ATX supply with appropriate current ratings are the pinout and two pins used for fan monitoring. The pinouts I've seen show only one thing that could be a power-on signal: PS_ON, pin 23. This is the pin I connected to GND when testing the PSU with the junk hard drives. The threads I've read about Fuel PSUs seem to imply that a correctly rewired regular ATX PSU will work with a Fuel if you disable environmental monitoring. I do have another PSU on the way; once I get it I will try rewiring it and see if that does anything.

I also think I remember at least one person who managed to fake out the fan speed signal with a 555 timer. If I can get this motherboard working I'd be happy to look at those fan signals with an oscilloscope and try to solve this mystery (lots of modern ATX power supplies do have a fan speed signal they feed back to the motherboard, it's possible that signal would work, or might work with some extra electronics added - usually the tachometer pulse off of a fan needs to be pulled up to some reference voltage and cleaned up a bit before something that wants to see square waves will like it).
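
To put rough numbers on the 555 idea: PC fans commonly emit two tach pulses per revolution, so a faked signal needs a frequency of RPM / 60 * 2. The sketch below computes that, along with the standard 555 astable frequency f = 1.44 / ((R1 + 2*R2) * C). The 3000 RPM target and the component values are illustrative assumptions only; nobody in this thread has measured what the Fuel actually expects on those pins.

```python
def tach_frequency_hz(rpm, pulses_per_rev=2):
    """Tach pulse frequency for a fan speed (2 pulses/rev is typical, assumed here)."""
    return rpm / 60.0 * pulses_per_rev


def rpm_from_frequency(freq_hz, pulses_per_rev=2):
    """Fan speed that a monitor would infer from a given pulse frequency."""
    return freq_hz * 60.0 / pulses_per_rev


def astable_555_frequency_hz(r1_ohms, r2_ohms, c_farads):
    """Output frequency of a 555 in the classic astable configuration."""
    return 1.44 / ((r1_ohms + 2.0 * r2_ohms) * c_farads)


def astable_555_duty_cycle(r1_ohms, r2_ohms):
    """Fraction of each period the 555 output spends high."""
    return (r1_ohms + r2_ohms) / (r1_ohms + 2.0 * r2_ohms)


# Assumed target: make the monitor believe the fan spins at about 3000 RPM.
target_hz = tach_frequency_hz(3000)

# Assumed parts: R1 = 1 kOhm, R2 = 68 kOhm, C = 100 nF.
actual_hz = astable_555_frequency_hz(1_000, 68_000, 100e-9)

print(f"target {target_hz:.1f} Hz, 555 gives {actual_hz:.1f} Hz "
      f"(reads as ~{rpm_from_frequency(actual_hz):.0f} RPM, "
      f"duty {astable_555_duty_cycle(1_000, 68_000):.0%})")
```

Keeping R1 small relative to R2 pushes the duty cycle toward 50%, which is closer to what a real fan's tach output looks like.
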
ClarusWorks wrote: Even if the main CPU won't power on, will the L1 come up without a graphics card?

That, I can't say. For some reason I wanted to try it headless. Fuel was running fine, I took out the graphics card, it was dead as a doornail. Put the graphics card back in, started right up.

I also had a V12 that went bad. Had the same behaviour.

So I'm about 99% sure they won't run without a graphics card.
Actually, I've not tested whether the L1 would come up without a graphics card. But I'm sure the Fuel won't start without one.
On the other hand, as soon as I typed "env" or "power" I got a message back.

Regarding the ATX power supply, that's right: the pinout is definitely different but can easily be rewired. But don't forget that those WTX-like power supplies were designed for server equipment. The maximum current they can provide on one of the rails (I don't remember which one) is much higher than consumer PC specifications, even for hard-core gamer equipment. So select your ATX replacement PSU with care.

As for how to fool the environmental monitoring, I think there was a post recently from someone who attached the original Fuel PSU fan to his ATX PSU so that it reports its RPM to the monitoring... or maybe I'm wrong.
The replacement PSU I was intending to use, assuming this is actually a PSU failure, is a Zippy/Emacs PSM-6600P ATX server PSU (you've probably never heard of the name, but they build reliable stuff with good caps, etc. as opposed to the $30 '1000 Watt' PSUs that have tons of ripple and fuses that look suspiciously like jumper wire). It has the following ratings:
5V @ 30A
12V @ 26A
12V @ 20A
-12V @ 0.8A
3.3V @ 30A
5V Standby @ 2A
Combined 5V and 3.3V max is 50A
Combined 12V max is 40A

This is higher than the 430 watt NMB on everything except the 3.3V rail, which is 45A on the NMB 430W unit, but only 27 on the 460 watt Sparkle PSU or 24 on the 460 watt NMB that later Fuels used. Interestingly, the later 460 watt NMB actually has lower max ratings on both the +5 and +3.3 rails, and only marginally more +12V (18 and 18 vs 16 and 18). I'm guessing the combined max of +5 and +3.3 is higher, maybe the early NMB power supplies liked to cook themselves due to overcurrent on those rails.
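
The comparison above can be written down as a quick sanity check. This is only a sketch using the ratings quoted in this post (I have not re-checked them against the labels), and the helper names are made up for illustration. It also shows why the Zippy's combined limits matter: the per-rail maxima add up to more than the combined figures allow.

```python
# 3.3V rail ratings in amps, as quoted in this thread.
RAIL_3V3_AMPS = {
    "NMB 430W": 45,
    "Sparkle 460W": 27,
    "NMB 460W": 24,
    "Zippy PSM-6600P": 30,
}


def rail_watts(volts, amps):
    """Power available on a single rail."""
    return volts * amps


def covers_3v3(candidate, reference, ratings=RAIL_3V3_AMPS):
    """True if the candidate's 3.3V rail is rated at least as high as the reference's."""
    return ratings[candidate] >= ratings[reference]


def combined_limit_binds(per_rail_amps, combined_max_amps):
    """True if the rails cannot all be loaded to their individual maxima at once."""
    return sum(per_rail_amps) > combined_max_amps


for psu in ("NMB 430W", "Sparkle 460W", "NMB 460W"):
    verdict = "covers" if covers_3v3("Zippy PSM-6600P", psu) else "falls short of"
    print(f"Zippy 3.3V rail {verdict} the {psu}")

# Zippy combined limits from the list above: 5V + 3.3V <= 50A, both 12V rails <= 40A.
print("5V/3.3V combined limit binds:", combined_limit_binds([30, 30], 50))
print("12V combined limit binds:", combined_limit_binds([26, 20], 40))
```

So the replacement matches or beats the two 460 watt supplies on 3.3V but not the original 430 watt NMB, and whether that matters depends on how much 3.3V current a Fuel really draws, which nothing in this thread pins down.
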
ClarusWorks wrote: ... maybe the early NMB power supplies liked to cook themselves due to overcurrent on those rails.

The power supplies don't generally cook themselves. The stupid little interface chip dies and then you can't tell it to turn on.
I'm suspecting there's something screwy going on here: Today I was doing some probing around the L1, was originally seeing the 3.3V supply to the L1 PROM low (1.6V), but then I went to probe a second time, and now I'm getting 3.3V and the L1 is alive. I can now confirm that a Fuel L1 will in fact come up with no CPU or graphics card attached :-)

ALERT: Unknown PSC: 15
INFO: Cannot disable power supply: 17


SGI SN1 L1 Controller
Firmware Image B: Rev. 1.12.6, Built 04/22/2002 08:13:40


001a01-L1>
001a01-L1>?
ERROR: command not found.
001a01-L1>help
Commands are:
* autopower|apwr junkbus|jb|bedrock brick
partdb cpu nia|ni|ctc nib
iia|ii|cti iib config|cfg debug
display|dsp env fan help|hlp
history|hist l1dbg link log
ioport|ioprt istat l1 leds
margin|mgn network pimm port|prt
power|pwr reset|rst nmi softreset|softrst
select|sel serial eeprom uart
usb verbose router|rtr date
nvram security nextgen flash
reboot_l1 version|ver pbay test|tst
scan pci
enter 'hlp <cmd>' for more help on a single command.
001a01-L1>serial all

Data                            Location      Value
------------------------------  ------------  --------
Local System Serial Number      EEPROM        08:00:69:10:48:94
Local Brick Serial Number       EEPROM        MLA346
Reference Brick Serial Number   NVRAM         MLA346

EEPROM      Product Name    Serial      Part Number           Rev  T/W
----------  --------------  ----------  --------------------  ---  ------
NODE        IP34            MLA346      030_1707_003          H    00
MAC         MAC ADDRESS     NA          NA                    NA   NA
PIMM        no hardware detected
XIO         no hardware detected

EEPROM      JEDEC Info                Part Number         Rev  Speed (ns)
----------  ------------------------  ------------------  ---  ----------
DIMM 0      CE0000000000000028260D00  M3 47L6510BT1-CA0   0B   10.0
DIMM 2      no hardware detected
DIMM 1      CE0000000000000028130A00  M3 47L6510BT1-CA0   0B   10.0
DIMM 3      no hardware detected

001a01-L1>leds
CPU A: 0xff: Console poll found data for reading
CPU B: 0x00: PLED_RESET: Slave loop (0x00/0x45=okay, solid 0x00=possibly hung)
CPU C: 0x00: PLED_RESET: Slave loop (0x00/0x45=okay, solid 0x00=possibly hung)
CPU D: 0x11: PLED_INITICACHE
001a01-L1>
001a01-L1>pwr
Supply          State Voltage    Margin  Value
--------------  ----- ---------  ------- -----
12V               off    5.563V      N/A
12V IO             NC    0.000V      N/A
5V                 NC    0.676V      N/A
3.3V               NC    1.531V  default     0
2.5V              off    1.157V  default     0
1.5V               NC    1.255V  default     0
5V aux             NC    2.314V      N/A
3.3V aux           NC    0.000V      N/A
PIMM0 12V bias  <not present>
unknown            NC ERROR (-204)  default ERROR (-207)
Asterix CPU     <not present>
PIMM0 1.5V      <not present>
PIMM0 3.3V aux  <not present>
PIMM0 5V aux    <not present>
XIO 12V bias    <not present>
XIO 5V          <not present>
XIO 2.5V        <not present>
XIO 3.3V aux    <not present>
001a01-L1>
hamei wrote: The power supplies don't generally cook themselves. The stupid little interface chip dies and then you can't tell it to turn on.

This is a bit of a crazy question, but... Do we know, or can we find out enough about the "stupid little interface chip" to be able to produce a replacement or some other workaround in order to revive the power supply instead of hacking in an ATX supply as a replacement?
I don't think there is an "interface chip". The ATX supply mods, with the exception of the mysterious fan signals, appear to just swap wires around on the 24-pin and 8-pin connectors. The chip that some people seem to have had fail in their NMB Fuel PSUs is a PIC microcontroller that the mystery fan signals, as well as some other stuff, are connected to. Actually powering on the PSU is just a matter of pulling the power-on pin low, no chips involved. It's possible that the fan signal is somehow special; it's also possible that all the PIC is doing is filtering the tachometer signal from the mechanical fan itself, since those signals tend to be extremely dirty by default.
ClarusWorks wrote: I can now confirm that a Fuel L1 will in fact come up with no CPU or graphics card attached :-)

That's good to know. The last time mine died, I could get nothing from the L1.

So it went in the garbage ....
ClarusWorks wrote: The chip that some people seem to have had fail in their NMB Fuel PSUs is a PIC microcontroller that the mystery fan signals as well as some other stuff are connected to. Actually powering on the PSU is just a matter of pulling the power on pin low, no chips involved.

How do you plan to do that? With a toggle switch on the front? To actually use it as a Fuel replacement you need to have that interface chip functioning. Or go through the entire Kubatysko hardware scenario, with faking out the tach signal, etc. etc.

That's not exactly a direct replacement :(

Also, it's not just in the NMB supply, it's in all of them :(

It's a total pain in the ass :(

Knowing what's in the microcontroller would solve everything but that's apparently not so simple :(
The power-on pin is a normal ATX power supply thing. Your standard PC PSU has it as well, just in a different location. If that weren't the case, Kubatysko's ATX mod would not work at all. Faking out the tach signal is only there to keep environmental monitoring happy. My idea would be, instead of faking it, to use a PSU with a 3-wire fan (the third wire is the tach) and run that signal into the tach pin on the 24-pin connector.
Go for it :P Hope you are successful ....
So, here's a quick update on where I am after looking at the machine again tonight:

Sometimes the L1 comes up, sometimes it does not. Unplugging/replugging the AC cord from the power supply 2 or 3 times will eventually get it to come up. Possible strike against the power supply there.

When the L1 comes up, it can read the serial numbers on the motherboard and PIMM but not the V10 card; XIO reports "no hardware detected". If I try to power the machine up with environmental monitoring on, I get some possibly bogus temperature and voltage warnings:


SGI SN1 L1 Controller
Firmware Image A: Rev. 1.12.6, Built 04/22/2002 08:13:40


001a01-L1>pwr up
001a01 ATTN: 12V low fault limit reached  5.563V.

001a01 ATTN: brick auto power down in 30 seconds

001a01-L1>
001a01 ATTN: brick auto power down in 25 seconds
env off
001a01-L1>
001a01 ATTN: power down aborted, environmental monitor reset

001a01 ATTN: NODE 0 fault temperature reached @ 89 C/ 192 F.


If I disable environmental monitoring, the machine will power up, and one of two things happens: If the V10 card is installed, the light bar stays on solid red. If the V10 card is not installed, the light bar turns red for a few seconds, and then just keeps flashing white forever.

I tried running some of the tests in the L1, and think there might be some bad DS1780s on this board (which could explain the crazy temperature readings and low 12V warning). It also looks like I may want to invest in a Dallas chip, or apply a Dremel tool and CR2032 holder to this one.


001a01-L1>log
02/07/06 06:28:15 checksum Error - common header initialized
02/07/06 06:28:15 nvram checksum error - initializing core data.
02/07/06 06:28:15 nvram checksum error - initializing extended data.
02/07/06 06:28:15 nvram checksum error - log pointers invalid, log pointers rest
02/07/06 06:28:15 L1 booting...
02/07/06 06:28:15 NVRAM doesn't match EEPROM
02/07/06 06:28:15 USB0: waiting on open
02/07/06 06:28:15 checking power status
02/07/06 06:28:15 power up (COMMAND)
02/07/06 06:28:15 12V low fault limit reached  5.563V.
02/07/06 06:28:15 power down (COMMAND)
02/07/06 06:28:15 NODE 0 fault temperature reached @ 89 C/ 192 F.
02/07/06 06:28:15 checking power status
02/07/06 06:28:15 power up (COMMAND)
02/07/06 06:28:15 power down (PANEL)
001a01-L1>ERROR: Reset error: scan error - unable to reset scan hardware

001a01-L1>pwr
Supply          State Voltage    Margin  Value
--------------  ----- ---------  ------- -----
12V                on    5.563V      N/A
12V IO             NC   12.188V      N/A
5V                 NC    5.096V      N/A
3.3V               NC    1.531V  default     0
2.5V               on    1.157V  default     0
1.5V               NC    1.255V  default     0
5V aux             NC    2.314V      N/A
3.3V aux           NC    0.000V      N/A
PIMM0 12V bias     NC    5.813V      N/A
unknown            NC    1.209V  default     0
Asterix CPU        on    1.311V  default    93
PIMM0 1.5V         NC    1.311V  default     0
PIMM0 3.3V aux     NC    1.600V      N/A
PIMM0 5V aux       NC    2.418V      N/A
XIO 12V bias    <not present>
XIO 5V          <not present>
XIO 2.5V        <not present>
XIO 3.3V aux    <not present>
001a01-L1>tst ioexp get all
0:      BUS SELECT I/O Expander: 0x78
1:  MARGIN CONTROL I/O Expander: 0x00
2:        LED0 LOW I/O Expander: 0x05
3:  ASIC CONTROL 1 I/O Expander: 0x7b
4:  ASIC CONTROL 2 I/O Expander: 0x33
5:    POWER ENABLE I/O Expander: 0xfa
6:      XB CONTROL I/O Expander: 0xb7
7:    PIMM CONTROL I/O Expander: 0x30
8:     ODYSSEY GFX I/O Expander: not present
9: DISPLAY CONTROL I/O Expander: 0x17
001a01-L1>tst i2cNODE 0 DS1780 Company ID miscompare, bus: 2 addr: 2c act: 59 ea
PIMM DS1780 Company ID miscompare, bus: 4 addr: 2e act: 5d exp: da

error -1 on tst.
001a01-L1>tst i2cNODE 0 DS1780 Company ID miscompare, bus: 2 addr: 2c act: 59 ea
PIMM DS1780 Company ID miscompare, bus: 4 addr: 2e act: 5d exp: da

error -1 on tst.


At this point, I think what I'm planning to do is go ahead and convert an ATX PSU since I already have one on the way. If the board still acts crazy after that, I'll procure a motherboard from somewhere and hope my PIMM and memory are good. I have no idea if the V10 card is broken, or if there's an I2C bus issue preventing the L1 from seeing it.
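
For reference, here is a small sketch that makes the "possibly bogus" readings easier to see. It takes the first eight rows of the powered-up `pwr` listing, reads the nominal voltage out of each rail name, and flags anything more than 10% off. The 10% window is an arbitrary assumption for illustration; the L1's real fault limits are not known from this thread, and the parser is only meant for rows shaped like these.

```python
import re

# First eight rows of the `pwr` output captured with the machine powered up.
PWR_OUTPUT = """\
12V     on    5.563V      N/A
12V IO     NC   12.188V      N/A
5V     NC    5.096V      N/A
3.3V     NC    1.531V  default     0
2.5V     on    1.157V  default     0
1.5V     NC    1.255V  default     0
5V aux     NC    2.314V      N/A
3.3V aux     NC    0.000V      N/A
"""

# Rail name, then state, then the measured voltage.
ROW = re.compile(r"^(?P<name>.+?)\s+(?P<state>on|off|NC)\s+(?P<volts>\d+\.\d+)V\b")


def nominal_volts(name):
    """Nominal voltage taken from a rail name such as '3.3V aux', or None."""
    match = re.match(r"(\d+(?:\.\d+)?)V", name)
    return float(match.group(1)) if match else None


def parse_pwr(text):
    """Return (name, nominal, measured) for every row whose name starts with a voltage."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue
        nominal = nominal_volts(match.group("name"))
        if nominal is not None:
            rows.append((match.group("name"), nominal, float(match.group("volts"))))
    return rows


def out_of_tolerance(rows, tolerance=0.10):
    """Rows whose reading differs from nominal by more than `tolerance` (assumed 10%)."""
    return [
        (name, nominal, measured, measured / nominal)
        for name, nominal, measured in rows
        if abs(measured - nominal) > tolerance * nominal
    ]


for name, nominal, measured, ratio in out_of_tolerance(parse_pwr(PWR_OUTPUT)):
    print(f"{name:<10} nominal {nominal:>5.2f}V  measured {measured:>6.3f}V  ({ratio:.0%})")
```

With these readings, 12V IO and 5V come out within the window while the main 12V rail reads at roughly 46% of nominal, which fits the suspicion that the monitoring chips, rather than the rails themselves, are at fault.
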
ClarusWorks wrote: Sometimes the L1 comes up, sometimes it does not. Unplugging/replugging the AC cord from the power supply 2 or 3 times will eventually get it to come up.

On a Fuel? This is common on O350s. Haven't heard of it on the Fuel :(