SGI: Hardware

A mysterious shut-down...

I normally leave my Fuel running 24x7; however, the other day when I came home it was down, and after rebooting I found this in syslog:
unix: |$(0x15a)WARNING: 001a01 ATTN: 1.5V low warning limit reached @ 1.340V.
unix: |$(0x15a)WARNING: 001a01 ATTN: 1.5V low warning limit reached @ 1.269V.
unix: |$(0x160)WARNING: 001a01 ATTN: 1.5V level stabilized @ 1.452V.
unix: |$(0x15a)WARNING: 001a01 ATTN: 1.5V low warning limit reached @ 1.255V.
unix: |$(0x160)WARNING: 001a01 ATTN: 1.5V level stabilized @ 1.354V.
unix: |$(0x158)WARNING: 001a01 ATTN: 1.5V low fault limit reached @ 1.199V.
unix: WARNING: Auto power down will be delayed until shutdown is complete.
unix: |$(0x163)WARNING: 001a01 ATTN: power down aborted, environmental monitor reset
Xsession: mephisto: logout
INFO: The system is shutting down.
INFO: Please wait.
4D:IRIS /usr/etc/eventmond[794]: The child process was killed by the signal 9
0D:IRIS inetd[265]: inetd received SIGTERM; terminating.
3F:IRIS syslogd: going down on signal 15

What in your opinion is the exact reason for that? Can I do anything about it?
371- 528 - 818 - ?
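
For anyone who wants to see how far the rail actually sagged, the readings in the log above can be pulled out with a short script. This is only a sketch: it assumes the kernel always formats these messages as "ATTN: <rail>V <event> @ <reading>V." the way they appear in this one log, and the names ENV_LINE, parse_env_events and SAMPLE are made up for illustration.

```python
import re

# Assumption: message layout is taken from the syslog excerpt in this thread only.
ENV_LINE = re.compile(
    r"ATTN:\s+(?P<rail>[\d.]+)V\s+(?P<event>.+?)\s+@\s+(?P<reading>[\d.]+)V\."
)

def parse_env_events(lines):
    """Return (rail_volts, event_text, reading_volts, deviation_pct) tuples."""
    events = []
    for line in lines:
        m = ENV_LINE.search(line)
        if m is None:
            continue  # not a voltage message (e.g. "power down aborted")
        rail = float(m.group("rail"))
        reading = float(m.group("reading"))
        # Deviation of the measured value from the nominal rail, in percent.
        deviation = (reading - rail) / rail * 100.0
        events.append((rail, m.group("event"), reading, deviation))
    return events

# The lines from the original post, copied verbatim.
SAMPLE = [
    "unix: |$(0x15a)WARNING: 001a01 ATTN: 1.5V low warning limit reached @ 1.340V.",
    "unix: |$(0x15a)WARNING: 001a01 ATTN: 1.5V low warning limit reached @ 1.269V.",
    "unix: |$(0x160)WARNING: 001a01 ATTN: 1.5V level stabilized @ 1.452V.",
    "unix: |$(0x15a)WARNING: 001a01 ATTN: 1.5V low warning limit reached @ 1.255V.",
    "unix: |$(0x160)WARNING: 001a01 ATTN: 1.5V level stabilized @ 1.354V.",
    "unix: |$(0x158)WARNING: 001a01 ATTN: 1.5V low fault limit reached @ 1.199V.",
    "unix: |$(0x163)WARNING: 001a01 ATTN: power down aborted, environmental monitor reset",
]

if __name__ == "__main__":
    for rail, event, reading, dev in parse_env_events(SAMPLE):
        print(f"{rail:.1f}V rail: {reading:.3f}V ({dev:+.1f}%) {event}")
```

Going by these numbers, the final reading before the fault was roughly 20% under nominal. The script only shows what the monitor reported; it cannot tell you whether the rail really dropped or the monitoring circuit misread it.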
Oskar45 wrote: I normally leave my Fuel running 24x7; however, the other day when I came home it was down, and after rebooting I found this in syslog:

unix: |$(0x15a)WARNING: 001a01 ATTN: 1.5V low warning limit reached @  1.340V.
unix: |$(0x15a)WARNING: 001a01 ATTN: 1.5V low warning limit reached @  1.269V.
unix: |$(0x160)WARNING: 001a01 ATTN: 1.5V level stabilized @  1.452V.
unix: |$(0x15a)WARNING: 001a01 ATTN: 1.5V low warning limit reached @  1.255V.
unix: |$(0x160)WARNING: 001a01 ATTN: 1.5V level stabilized @  1.354V.
unix: |$(0x158)WARNING: 001a01 ATTN: 1.5V low fault limit reached @  1.199V.

What in your opinion is the exact reason for that?

Mainboard monitoring failure ...
Oskar45 wrote: Can I do anything about it?

You have a service contract. Get a new one. In the meantime, < l1cmd env off > will get you back up and running. That's what us normal mortals have to do :) The environment monitoring on the Fuel sucks. SGI probably lost plenty of money by cheaping out on the components there. Save a nickel, lose a dollar: MBA-think. Oops.
hamei wrote: You have a service contract. Get a new one.

Heh, hamei, you really recall that I've a service contract!!! Congrats to your brain - should be preserved in some nicely-labelled jar!!! Anyway, I'll be fine with a new board in a few days. Don't want to think about what it would cost me without a contract, though!
Oskar45 wrote:
hamei wrote: You have a service contract. Get a new one.

Heh, hamei, you really recall that I've a service contract!!! Congrats to your brain - should be preserved in some nicely-labelled jar!!!

It was, originally. Abby something ...
I had the same problem recently and had to kill the environmental monitoring. I don't have a contract though :(
hamei wrote: It was, originally. Abby something ...
:-D :-D :-D
zafunk wrote: I had the same problem recently and had to kill the environmental monitoring.

Hmm, I'm not sure that killing env monitoring will save you. As I've noted in my original post of this thread, the messages are written by the kernel so they will be in the syslog even if eventmond is not running. Also, I suspect my box would have been killed anyway regardless of whether eventmond was running or not ...
Oskar45 wrote: Hmm, I'm not sure that killing env monitoring will save you. As I've noted in my original post of this thread, the messages are written by the kernel so they will be in the syslog even if eventmond is not running. Also, I suspect my box would have been killed anyway regardless of whether eventmond was running or not ...


Hmm.... well, only time will tell. So far, turning the monitoring off has saved me, but I may have to get a new mobo eventually :(
Oskar45 wrote:
zafunk wrote: I had the same problem recently and had to kill the environmental monitoring.

Hmm, I'm not sure that killing env monitoring will save you. As I've noted in my original post of this thread, the messages are written by the kernel so they will be in the syslog even if eventmond is not running. Also, I suspect my box would have been killed anyway regardless of whether eventmond was running or not ...


If you're shutting off env monitoring on the L1 (which is the case here) it won't report anything to the kernel. No-one said anything about turning off eventmond - this is more low level :)
Twitter: @neko_no_ko
IRIX Release 4.0.5 IP12 Version 06151813 System V
Copyright 1987-1992 Silicon Graphics, Inc.
All Rights Reserved.
zafunk wrote: Hmm.... well, only time will tell. So far, turning the monitoring off has saved me, but I may have to get a new mobo eventually :(

Going on two years now here .... if it finally dies it'll go into the scrapper. The Fuel just isn't that great a computer to spend SGI's version of money on.
agreed :twisted:
hamei wrote:
zafunk wrote: Hmm.... well, only time will tell. So far, turning the monitoring off has saved me, but I may have to get a new mobo eventually :(

Going on two years now here .... if it finally dies it'll go into the scrapper. The Fuel just isn't that great a computer to spend SGI's version of money on.


Yes, sadly the Fuel was more of a MIPS PC, like the Alpha PCs of those days or the 604e-based IBMs.
r-a-c.de
foetz wrote: Yes, sadly the Fuel was more of a MIPS PC, like the Alpha PCs of those days or the 604e-based IBMs.

Didn't mean to come off as quite so negative about the Fuel - it's not a bad computer, actually. It's just that SGI still had their heads up their asses when they priced it. Okay, the mainboard and CPU are low-volume, high-cost items. But $600 for an off-the-shelf peecee case? The exact same one they used in the 230 and 330 machines? And the rest of their Fuel prices are equally nonsensical. No wonder they are bankrupt. No one who isn't a fanboy (us) is gonna spend six times what something is worth just because it says SGI on the outside. And then they shit on their fanboy constituency. Great.
You know, it sounds like these monitoring boards can probably be fixed... judging by what Oskar posted earlier, looks like it may be capacitors or something oscillating in the buffer amps. Anyone got a really good clear picture (IC part number readability is good) of what one looks like, working or not?
:O3000: <> :O3000: :O2000: :Tezro: :Fuel: x2+ :Octane2: :Octane: x3 :1600SW: x2 :O2: x2+ :Indigo2IMP: :Indigo2: x2 :Indigo: x3 :Indy: x2+

Once you step up to the big iron, you learn all about physics, electrical standards, and first aid - usually all in the same day
nekonoko wrote:
Oskar45 wrote:
zafunk wrote: I had the same problem recently and had to kill the environmental monitoring.

Hmm, I'm not sure that killing env monitoring will save you. As I've noted in my original post of this thread, the messages are written by the kernel so they will be in the syslog even if eventmond is not running. Also, I suspect my box would have been killed anyway regardless of whether eventmond was running or not ...


If you're shutting off env monitoring on the L1 (which is the case here) it won't report anything to the kernel. No-one said anything about turning off eventmond - this is more low level :)

Hmm - does that mean, if you shut off monitoring on the L1, the box would not die even if the voltage gets too low???
Oskar45 wrote: Hmm - does that mean, if you shut off monitoring on the L1, the box would not die even if the voltage gets too low???


That's correct.
nekonoko wrote:
Oskar45 wrote: Hmm - does that mean, if you shut off monitoring on the L1, the box would not die even if the voltage gets too low???


That's correct.

If that's so, there seems then to be no need in this case to get a new mobo... :P
Oskar45 wrote: If that's so, there seems then to be no need in this case to get a new mobo... :P


Pretty much ... as hamei said, he's been running his Fuel (which has the same problem) for two years without an issue :)
nekonoko wrote:
Oskar45 wrote: If that's so, there seems then to be no need in this case to get a new mobo... :P


Pretty much ... as hamei said, he's been running his Fuel (which has the same problem) for two years without an issue :)

OK... as I already got a new mobo, I'll ask my local SGI support to take it back again and give me my old one :)
Oskar45 wrote: OK... as I already got a new mobo, I'll ask my local SGI support to take it back again and give me my old one :)


Heh, I wouldn't do that in your case as you're entitled, but for others it's a handy tip that can keep otherwise dead machines alive ;)

So far my Fuel hasn't exhibited this problem (fingers crossed) ...