SGI: Hardware

Fuel error (ODYSSEY, power supply?)

Hi.

For months, I've been having an issue with my Fuel. I thought it was the power supply (NMB GM430WTXW01SSV rev B, manufactured in the 38th week of 2003), so I modified an old ATX 2.2 PC power supply (Xilence XP420) according to this post. Unfortunately, that didn't work any better.
Today, I installed the original power supply back into the chassis. When I connect with PuTTY, I get the following:

Code:

ALERT: Error reading the ODYSSEY temperature sensor, no acknowledge
ALERT: Error reading monitor ODYSSEY interrupt status 1: no acknowledge
ALERT: Error reading monitor ODYSSEY interrupt status 1: no acknowledge

I tried re-seating the V12, without success.

What should I do? Does anyone have an idea?

Thanks in advance.

BetXen
:Onyx2: : oxygen / :A3504L: :A3504L: : neon (16xItanium2 1.6, L2 9MB) / :O200: :O200: : beryllium
:Fuel: : nitrogen / :Octane2: : carbon / :Octane: : fluorine
:O2: : hydrogen (R10k 195, 512MB) / :O2: : sodium (R5k 180, 512MB) / :O2: : R5k 180->200 motherboard and PM only
:Indigo2IMP: : helium (R10k 195, HighImpact, 160MB) / :Indigo2IMP: : boron / :Indigo: : magnesium
:4D70G: 4D70GT : hydrogen, my very first one (now property of musée bolo and the foundation mémoires informatiques )
See the hinv/gfxinfo posts here .
Some additional outputs... Hopefully they help to understand what is happening!

Code:

001a01-L1>env

************************************************
ATTENTION: Environmental monitoring is disabled!
************************************************

Description    State       Warning Limits     Fault Limits       Current
-------------- ----------  -----------------  -----------------  -------
           12V   Disabled  10%  10.80/ 13.20  20%   9.60/ 14.40    0.000
        12V IO   Disabled  10%  10.80/ 13.20  20%   9.60/ 14.40    3.563
            5V   Disabled  10%   4.50/  5.50  20%   4.00/  6.00    0.416
          3.3V   Disabled  10%   2.97/  3.63  20%   2.64/  3.96    0.585
          2.5V   Disabled  10%   2.25/  2.75  20%   2.00/  3.00    0.000
          1.5V   Disabled  10%   1.35/  1.65  20%   1.20/  1.80    0.000
        5V AUX   Disabled  10%   4.50/  5.50  20%   4.00/  6.00    5.122
      3.3V AUX   Disabled  10%   2.97/  3.63  20%   2.64/  3.96    3.285
 PIMM 12V BIAS   Disabled  10%  10.80/ 13.20  20%   9.60/ 14.40    3.563
          SRAM   Disabled  10%   2.25/  2.75  20%   2.00/  3.00    0.052
          VCPU   Disabled  10%   1.13/  1.38  20%   1.00/  1.50    0.014
     PIMM 1.5V   Disabled  10%   1.35/  1.65  20%   1.20/  1.80    0.042
 PIMM 3.3V AUX   Disabled  10%   2.97/  3.63  20%   2.64/  3.96    3.285
   PIMM 5V AUX   Disabled  10%   4.50/  5.50  20%   4.00/  6.00    5.148
  XIO 12V BIAS   Disabled  10%  10.80/ 13.20  20%   9.60/ 14.40    0.000
        XIO 5V   Disabled  10%   4.50/  5.50  20%   4.00/  6.00    0.000
      XIO 2.5V   Disabled  10%   2.25/  2.75  20%   2.00/  3.00    0.000
  XIO 3.3V AUX   Disabled  10%   2.97/  3.63  20%   2.64/  3.96    0.000

Description     State       Warning RPM  Current RPM
--------------- ----------  -----------  -----------
FAN  0  EXHAUST   Disabled          920            0
FAN  1       HD   Disabled         1560            0
FAN  2      PCI   Disabled         1120            0
FAN  3    XIO 1   Disabled         1600            0
FAN  4    XIO 2   Disabled         1600            0
FAN  5       PS   Disabled         1349            0

                              Advisory   Critical   Fault      Current
Description       State       Temp       Temp       Temp       Temp
----------------- ----------  ---------  ---------  ---------  ---------
 0 NODE 0           Wait Pwr    [Autofan Control]    80C/176F   23C/ 73F
 1 NODE 1           Wait Pwr    [Autofan Control]    80C/176F   24C/ 75F
 2 NODE 2           Wait Pwr    [Autofan Control]    80C/176F   23C/ 73F
 3 PIMM             Wait Pwr    [Autofan Control]    80C/176F   23C/ 73F
 4 ODYSSEY          Wait Pwr    [Autofan Control]    80C/176F    0C/ 32F
 5 BEDROCK          Wait Pwr  Not currently available


************************************************
ATTENTION: Environmental monitoring is disabled!
************************************************
It seems I have a 12V problem... "12V IO" and "PIMM 12V BIAS" are both at 3.563V!
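For reference, the warning and fault limits in the env table are just plus or minus 10% and 20% of the nominal rail voltage. Here is a minimal Python sketch of that arithmetic; the function names are made up for illustration and are not anything the L1 provides. Keep in mind that the readings above were taken in the "Wait Pwr" state, so a main rail reading low may simply reflect standby rather than a fault.

```python
def limits(nominal, pct):
    """Return (low, high) for a symmetric tolerance of pct percent."""
    delta = nominal * pct / 100.0
    return round(nominal - delta, 2), round(nominal + delta, 2)


def classify(nominal, reading, warn_pct=10, fault_pct=20):
    """Classify a reading against the 10% warning and 20% fault bands."""
    warn_lo, warn_hi = limits(nominal, warn_pct)
    fault_lo, fault_hi = limits(nominal, fault_pct)
    if not fault_lo <= reading <= fault_hi:
        return "fault"
    if not warn_lo <= reading <= warn_hi:
        return "warning"
    return "ok"


# Values taken from the env output in this thread.
print(limits(12.0, 10))        # warning band for a 12V rail
print(limits(12.0, 20))        # fault band for a 12V rail
print(classify(12.0, 3.563))   # the "12V IO" reading
print(classify(5.0, 5.122))    # the "5V AUX" reading
```

With these inputs, the 12V IO reading lands far outside the fault band while the AUX rails are within tolerance, which matches what the table shows.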

Code:

001a01-L1>power
Supply          State Voltage    Margin  Value
--------------  ----- ---------  ------- -----
           12V    off    0.000V      N/A
        12V IO     NC    3.563V      N/A
            5V     NC    0.416V      N/A
          3.3V     NC    0.585V   normal     0
          2.5V    off    0.000V   normal     0
          1.5V     NC    0.000V   normal     0
        5V AUX     NC    5.122V      N/A
      3.3V AUX     NC    3.285V      N/A
 PIMM 12V BIAS     NC    3.563V      N/A
          SRAM     NC    0.052V   normal     0
          VCPU    off    0.014V   normal   112
     PIMM 1.5V     NC    0.042V   normal     0
 PIMM 3.3V AUX     NC    3.285V      N/A
   PIMM 5V AUX     NC    5.148V      N/A
  XIO 12V BIAS     NC ERROR (-404)      N/A
        XIO 5V     NC ERROR (-404)      N/A
      XIO 2.5V    off ERROR (-404)   normal     0
  XIO 3.3V AUX     NC ERROR (-404)      N/A
WARNING: power appears off, console unavailable
VCPU = 112?? What does that mean?
All XIO values report an error... :roll:
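To pick out which supplies fail to read at all, as opposed to reading a low voltage, the "power" output can be filtered with a few lines of Python. This is only a sketch: the regular expression is my own guess at the row layout based on the rows shown here, not an official description of the L1 output format.

```python
import re

# Sample rows copied from the `power` output in this thread.
SAMPLE = """\
12V IO     NC    3.563V      N/A
5V AUX     NC    5.122V      N/A
XIO 12V BIAS     NC ERROR (-404)      N/A
XIO 2.5V    off ERROR (-404)   normal     0
"""

# Assumed layout: supply name (may contain spaces), state, then either a
# voltage such as "3.563V" or an error such as "ERROR (-404)".
ROW = re.compile(
    r"^\s*(?P<supply>.+?)\s+(?P<state>\S+)\s+"
    r"(?P<voltage>ERROR \(-?\d+\)|-?\d+\.\d+V)"
)


def failed_reads(text):
    """Return the supply names whose voltage column holds an error."""
    failed = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match and match.group("voltage").startswith("ERROR"):
            failed.append(match.group("supply"))
    return failed


print(failed_reads(SAMPLE))
```

On the full output above, only the four XIO rows would be reported, which suggests the failure is tied to whatever sits in the XIO slot rather than to the rails measured on the main board.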

And from time to time...

Code:

power on 001a01 appears off!
WARNING: power on 001a01 appears off!
ALERT: Error reading the ODYSSEY temperature sensor, no acknowledge
ALERT: Error reading monitor ODYSSEY interrupt status 1: no acknowledge

escaping to L1 system controller
Some new outputs, inspired by this post.

Code:

01a01-L1>serial all

Data                            Location      Value
------------------------------  ------------  --------
Local System Serial Number      EEPROM        08:00:69:10:7C:A7
Local Brick Serial Number       EEPROM        NCJ712
Reference Brick Serial Number   NVRAM         NCJ712


EEPROM      Product Name    Serial         Part Number           Rev  T/W
----------  --------------  -------------  --------------------  ---  ------
NODE        IP34            NCJ712         030_1707_004          C    00
MAC         MAC ADDRESS     NA             NA                    NA   NA
PIMM        IP34PIMM        NNN514         030_1932_001          B    00
XIO         ASTODY          MLD922         030_1726_003          E    00

EEPROM     JEDEC-SPD Info           Part Number        Rev  Speed  SGI
---------- ------------------------ ------------------ ---- ------ --------
DIMM 0     CE000000000000000C989800 M3 46L2820DT2-CA0   2D   10.0  N/A
DIMM 2     CE000000000000000C313000 M3 47L6423DT2-CA0   2D   10.0  N/A
DIMM 1     CE000000000000000C7F9800 M3 46L2820DT2-CA0   2D   10.0  N/A
DIMM 3     CE000000000000000CEB2300 M3 47L6423DT2-CA0   2D   10.0  N/A

Code:

001a01-L1>power up
*** Ctrl-D ***
ALERT: Error reading monitor ODYSSEY interrupt status 1: no acknowledge
ALERT: Error reading monitor ODYSSEY interrupt status 1: no acknowledge
ALERT: Error initializing the ODYSSEY monitor, no acknowledge
ALERT: Error reading monitor ODYSSEY interrupt status 1: no acknowledge
ALERT: Error configuring ODYSSEY power (XIO 12V BIAS) monitoring: no acknowledge
ALERT: Error configuring ODYSSEY power (XIO 5V) monitoring: no acknowledge
ALERT: Error configuring ODYSSEY power (XIO 2.5V) monitoring: no acknowledge
ALERT: Error configuring ODYSSEY power (XIO 3.3V AUX) monitoring: no acknowledge
ALERT: Error configuring ODYSSEY temperature monitoring: no acknowledge
ALERT: Error enabling fans on device 0
ALERT: Error enabling fans on device 0
ALERT: Error initializing the ODYSSEY monitor, no acknowledge
ALERT: Error reading monitor ODYSSEY interrupt status 1: no acknowledge
ALERT: Error configuring ODYSSEY power (XIO 12V BIAS) monitoring: no acknowledge
ALERT: Error configuring ODYSSEY power (XIO 5V) monitoring: no acknowledge
ALERT: Error configuring ODYSSEY power (XIO 2.5V) monitoring: no acknowledge
ALERT: Error configuring ODYSSEY power (XIO 3.3V AUX) monitoring: no acknowledge
ALERT: Error configuring ODYSSEY temperature monitoring: no acknowledge
ALERT: Error reading monitor ODYSSEY interrupt status 1: no acknowledge
ALERT: Error disabling fans on device 4
ALERT: Error reading monitor ODYSSEY interrupt status 1: no acknowledge
ALERT: Error disabling fans on device 4
ALERT: Error disabling fans on device 4
ALERT: Error disabling fans on device 4
ALERT: Error disabling fans on device 4
ALERT: Error disabling fans on device 4
ALERT: Error disabling fans on device 4
ALERT: Error reading monitor ODYSSEY interrupt status 1: no acknowledge
ALERT: Error disabling fans on device 4
ALERT: Error initializing the ODYSSEY monitor, no acknowledge
ALERT: Error reading monitor ODYSSEY interrupt status 1: no acknowledge
ALERT: Error configuring ODYSSEY power (XIO 12V BIAS) monitoring: no acknowledge
ALERT: Error configuring ODYSSEY power (XIO 5V) monitoring: no acknowledge
ALERT: Error configuring ODYSSEY power (XIO 2.5V) monitoring: no acknowledge
ALERT: Error configuring ODYSSEY power (XIO 3.3V AUX) monitoring: no acknowledge
ALERT: Error configuring ODYSSEY temperature monitoring: no acknowledge
ALERT: Error initializing the ODYSSEY monitor, no acknowledge
ALERT: Error reading monitor ODYSSEY interrupt status 1: no acknowledge
ALERT: Error configuring ODYSSEY power (XIO 12V BIAS) monitoring: no acknowledge
ALERT: Error configuring ODYSSEY power (XIO 5V) monitoring: no acknowledge
ALERT: Error configuring ODYSSEY power (XIO 2.5V) monitoring: no acknowledge
ALERT: Error configuring ODYSSEY power (XIO 3.3V AUX) monitoring: no acknowledge
ALERT: Error configuring ODYSSEY temperature monitoring: no acknowledge
ERROR: I2C:no acknowledge

Code:

001a01-L1>cfg
:0  - 001a01

001a01-L1>ver
L1 1.40.4 (Image B), Built 09/29/2005 13:43:39    [Fuel/PE/O300 1MB image]
Did your environment chips ever work in this machine? Were they intentionally disabled?
smit happens.

:Fuel: bigred , 900MHz R16K, 4GB RAM, V12 DCD, 6.5.30
:Indy: indy , 150MHz R4400SC, 256MB RAM, XL24, 6.5.10
:Indigo2IMP: purplehaze , R10000, Solid IMPACT
probably posted from bruce, Quad 2.5GHz PowerPC 970MP, 16GB RAM, Mac OS X 10.4.11
plus IBM POWER6 p520 * Apple Network Server 500 * HP C8000 * BeBox * Solbourne S3000 * Commodore 128 * many more...
Yes. They were working, and I disabled them when I first tested the ATX power supply, because no fan was connected. I should probably turn monitoring back on, since I've reinstalled the original power supply.

I had a look at the NMB PS internals yesterday. The fan only has two wires and no additional circuit board. The fan is directly connected to the lower PCB.
BetXen wrote:

Code:

ALERT: Error reading monitor ODYSSEY interrupt status 1: no acknowledge
ALERT: Error reading monitor ODYSSEY interrupt status 1: no acknowledge
ALERT: Error initializing the ODYSSEY monitor, no acknowledge
ALERT: Error reading monitor ODYSSEY interrupt status 1: no acknowledge
ALERT: Error configuring ODYSSEY power (XIO 12V BIAS) monitoring: no acknowledge
ALERT: Error configuring ODYSSEY power (XIO 5V) monitoring: no acknowledge
ALERT: Error configuring ODYSSEY power (XIO 2.5V) monitoring: no acknowledge
ALERT: Error configuring ODYSSEY power (XIO 3.3V AUX) monitoring: no acknowledge
ALERT: Error configuring ODYSSEY temperature monitoring: no acknowledge
...
ERROR: I2C:no acknowledge

ODYSSEY is the internal name for VPro graphics. There's only one XIO slot in a Fuel and it's the graphics slot.

It appears to me that the L1 has trouble communicating (via I2C) with the 'monitor' chip on the V12, which monitors various voltages and a temperature reading. It would make sense for this monitor chip to be similar to the notorious environmental monitoring chips used on the Fuel main board.
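One quick way to check how strongly the log points at a single device is to tally the distinct messages in a saved capture. A rough Python sketch follows; the LOG excerpt is a handful of lines copied from the output quoted above, and the helper names are hypothetical.

```python
from collections import Counter

# A short excerpt of the alerts quoted in this thread.
LOG = """\
ALERT: Error reading monitor ODYSSEY interrupt status 1: no acknowledge
ALERT: Error initializing the ODYSSEY monitor, no acknowledge
ALERT: Error configuring ODYSSEY power (XIO 5V) monitoring: no acknowledge
ALERT: Error reading monitor ODYSSEY interrupt status 1: no acknowledge
ALERT: Error enabling fans on device 0
ERROR: I2C:no acknowledge
"""


def summarize(log):
    """Count how often each distinct non-empty line occurs."""
    return Counter(line.strip() for line in log.splitlines() if line.strip())


def no_ack_share(log):
    """Return (lines mentioning 'no acknowledge', total non-empty lines)."""
    lines = [line for line in log.splitlines() if line.strip()]
    hits = [line for line in lines if "no acknowledge" in line]
    return len(hits), len(lines)


for message, count in summarize(LOG).most_common():
    print(count, message)
print(no_ack_share(LOG))
```

If nearly every line is a "no acknowledge" against ODYSSEY, as it is here, that is consistent with the card's monitor chip not answering on the bus.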

I'd take out the V12 and inspect it for a DS1780 or other Dallas chips. If you can buy or borrow another card, that would also make it easier to confirm which part is at fault.

FWIW: I've got 'no acknowledge' on I2C reads to the CPU of my Fuel. But there's apparently no environmental monitoring going on there, so the damage is limited to the CPU serial number disappearing from 'hinv -mv' :?
Now this is a deep dark secret, so everybody keep it quiet :)
It turns out that when reset, the WD33C93 defaults to a SCSI ID of 0, and it was simpler to leave it that way... -- Dave Olson, in comp.sys.sgi

Currently in commercial service: :Onyx2: (2x) :O3x02L:
In the museum : almost every MIPS/IRIX system.
Wanted : GM1 board for Professional Series GT graphics (030-0076-003, 030-0076-004)
Thanks jj,

I still have the V10. I'll have a try.

Apparently all 5V and 3.3V voltages are OK. However, the 12V reading is very strange (~3.5V). Is that normal before power-on? Could it be a cause of the problem? I guess the I2C is on the 5V or 3.3V line, right?
I feel so stupid!

With the V10, it just works... The end :D

Actually:
- "env" gives the same values, including the strange 12V IO at 3.563V
- no more "NC ERROR (-404)" after "power", for any of the XIO voltages
- "pwr up" and... tadaaa!

It's so nice to see it running again. I'd almost lost hope.

Any idea what I could try to resurrect the V12?
I've had a look at some other env outputs. For instance, jj's Origin 350 reports the following.

Code:

XIO 12V BIAS <not present>
XIO 5V <not present>
XIO 2.5V <not present>
XIO 3.3V AUX <not present>


Since the O350 can accommodate a V10/V12, does this mean the system knows whether an additional XIO board is present or not? Then in my case, it detects that something is installed, but cannot get any answer over I2C?
Maybe I should try removing the graphics board and repeating the test to see whether I get the same "<not present>" message as jj.
FYI, the Fuel has 5 DS1780 monitoring chips: 3 on the motherboard (one between the PCI slots, one near the external connectors, and one near the PSU socket), 1 on the PIMM, and 1 on the V10/V12. If I recall correctly, that one is near the connector header, somewhere near the 1x2 and 2x4 pin headers and the smaller heatsink.
The V10/V12 also has an I2C EEPROM, and the two are likely on the same I2C bus (though I do know there's also an I2C multiplexer on the board), so when the board is installed and powered, you might be able to scan the I2C bus and see whether the DS1780 responds at all...
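If someone wants to try such a scan, here is a minimal Python sketch of the idea. Everything about the setup is an assumption on my part: a Linux host with an I2C adapter exposed as /dev/i2c-N, the third-party smbus2 package, and a candidate address range of 0x2C to 0x2F, which is typical for this family of monitor chips but should be checked against the DS1780 datasheet. Whether the bus is reachable at all behind the multiplexer is not something this thread confirms. The scan logic takes a probe function, so it can be tried without any hardware.

```python
def scan(probe, addresses=range(0x2C, 0x30)):
    """Return the addresses for which probe(addr) does not raise OSError.

    The default range is an assumed set of candidate addresses, not a
    value taken from this thread.
    """
    found = []
    for addr in addresses:
        try:
            probe(addr)
        except OSError:
            continue  # no acknowledge from this address
        found.append(addr)
    return found


def smbus_probe(bus_number):
    """Build a probe backed by smbus2 (assumed Linux host and adapter)."""
    from smbus2 import SMBus  # imported lazily so scan() works without it

    bus = SMBus(bus_number)
    return bus.read_byte  # a read that is not acknowledged raises OSError


def fake_probe(addr):
    """Stand-in for real hardware: only address 0x2D acknowledges."""
    if addr != 0x2D:
        raise OSError("no acknowledge")


print([hex(a) for a in scan(fake_probe)])
```

On real hardware the call would look like scan(smbus_probe(1)), with the bus number depending on the adapter. An empty result would match what the L1 reports, a chip that never acknowledges.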
[click for links to hinv] JP: :Fuel: | :O2: | :Indy: || PL: [ :Fuel: :O2: :O2+: :Indy: ]