The collected works of t-rexky

Good day,

I finally had a chance to spend a bit more time with my two Tezro machines and I noticed that the Odyssey V12 video boards seem to run a bit warm on both of them. Once the machines have been running for a while, I frequently get messages like the following on the console and in the logs:

Code:
Dec 23 20:41:41 4A:tezro unix: |$(0x15f)WARNING: 001c01 ATTN: ODY zone advisory limit reached 50 C/ 122 F  Fan: 87
Dec 23 20:42:41 4A:tezro unix: |$(0x162)WARNING: 001c01 ATTN: Cooling system stabilized
Dec 23 21:12:03 4A:tezro unix: |$(0x15f)WARNING: 001c01 ATTN: ODY zone advisory limit reached 50 C/ 122 F  Fan: 87
Dec 23 21:13:03 4A:tezro unix: |$(0x162)WARNING: 001c01 ATTN: Cooling system stabilized


These occur with virtually no load on the video hardware, so on the surface it would seem that the V12 boards are running somewhat warmer than intended. Interestingly, both machines exhibit exactly the same behaviour, so I doubt that it is a hardware fault. For reference, both V12 boards have the DCD option and both machines are equipped with DM3 cards. Removing the DM3 card appears to mostly correct this, but I have still seen an occasional message:

Code:
Dec 26 16:05:17 4A:tezro unix: |$(0x15f)WARNING: 001c01 ATTN: ODY zone advisory limit reached 51 C/ 123 F  Fan: 80
Dec 26 16:08:17 4A:tezro unix: |$(0x162)WARNING: 001c01 ATTN: Cooling system stabilized
Dec 26 23:11:55 4A:tezro unix: |$(0x15f)WARNING: 001c01 ATTN: ODY zone advisory limit reached 50 C/ 122 F  Fan: 80
Dec 26 23:12:15 4A:tezro unix: |$(0x162)WARNING: 001c01 ATTN: Cooling system stabilized
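
For anyone who wants to track how often these advisories fire, here is a minimal Python sketch that pulls the timestamp, zone, temperature and fan value out of log lines like the ones above. The pattern is inferred only from the sample lines in this post, not from any SGI documentation, and the function name is just an illustration.

```python
import re

# Field layout inferred from the sample syslog lines quoted above (an assumption).
ADVISORY = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8}).*?"
    r"(?P<zone>\w+) zone advisory limit reached\s+"
    r"(?P<temp_c>\d+) C/\s*(?P<temp_f>\d+) F\s+Fan:\s*(?P<fan>\d+)"
)

def parse_advisories(lines):
    """Return (timestamp, zone, temp_c, fan) for every advisory line."""
    events = []
    for line in lines:
        match = ADVISORY.search(line)
        if match:
            events.append((
                match.group("stamp"),
                match.group("zone"),
                int(match.group("temp_c")),
                int(match.group("fan")),
            ))
    return events

sample = [
    "Dec 26 16:05:17 4A:tezro unix: |$(0x15f)WARNING: 001c01 ATTN: ODY zone advisory limit reached 51 C/ 123 F  Fan: 80",
    "Dec 26 16:08:17 4A:tezro unix: |$(0x162)WARNING: 001c01 ATTN: Cooling system stabilized",
    "Dec 26 23:11:55 4A:tezro unix: |$(0x15f)WARNING: 001c01 ATTN: ODY zone advisory limit reached 50 C/ 122 F  Fan: 80",
]

events = parse_advisories(sample)
for event in events:
    print(event)
```

The "Cooling system stabilized" lines are skipped on purpose, so the result lists only the moments the advisory limit was reached.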


With the DM3 board removed, it seems that the temperatures are just marginally below the advisory level:

Code:
tezro 1# l1cmd env
Environmental monitoring is enabled and running.

Description    State       Warning Limits     Fault Limits       Current
-------------- ----------  -----------------  -----------------  -------
          1.8V    Enabled  10%   1.62/  1.98  20%   1.44/  2.16    1.875
           12V    Enabled  10%  10.80/ 13.20  20%   9.60/ 14.40   12.000
        12V #2    Enabled  10%  10.80/ 13.20  20%   9.60/ 14.40   12.125
          3.3V    Enabled  10%   2.97/  3.63  20%   2.64/  3.96    3.474
          2.5V    Enabled  10%   2.25/  2.75  20%   2.00/  3.00    2.613
        12V IO    Enabled  10%  10.80/ 13.20  20%   9.60/ 14.40   12.063
        5V AUX    Enabled  10%   4.50/  5.50  20%   4.00/  6.00    5.070
      3.3V AUX    Enabled  10%   2.97/  3.63  20%   2.64/  3.96    3.268
            5V    Enabled  10%   4.50/  5.50  20%   4.00/  6.00    5.070
  XIO 12V BIAS    Enabled  10%  10.80/ 13.20  20%   9.60/ 14.40   12.063
        XIO 5V    Enabled  10%   4.50/  5.50  20%   4.00/  6.00    5.070
      XIO 2.5V    Enabled  10%   2.25/  2.75  20%   2.00/  3.00    2.574
  XIO 3.3V AUX    Enabled  10%   2.97/  3.63  20%   2.64/  3.96    3.285
 IP53 3.3V AUX    Enabled  10%   2.97/  3.63  20%   2.64/  3.96    3.302
   IP53 5V AUX    Enabled  10%   4.50/  5.50  20%   4.00/  6.00    5.044
      IP53 12V    Enabled  10%  10.80/ 13.20  20%   9.60/ 14.40   11.875
     IP53 VCPU    Enabled  10%   1.13/  1.38  20%   1.00/  1.50    1.283
     IP53 SRAM    Enabled  10%   2.25/  2.75  20%   2.00/  3.00    2.574
     IP53 1.5V    Enabled  10%   1.35/  1.65  20%   1.20/  1.80    1.551

Description     State       Warning RPM  Current RPM
--------------- ----------  -----------  -----------
FAN  0   NODE 1    Enabled         1800         2109
FAN  1   NODE 2    Enabled         1800         2136
FAN  2   NODE 3    Enabled         1800         2149
FAN  3    PCI 1    Enabled         1350         1430
FAN  4    PCI 2    Enabled         1350         1493
FAN  5       HD    Enabled         1620         4218
FAN  6    ODY 1    Enabled         1300         2220
FAN  7    ODY 2    Enabled         1300         2083

                              Advisory   Critical   Fault      Current
Description       State       Temp       Temp       Temp       Temp
----------------- ----------  ---------  ---------  ---------  ---------
0 INTERFACE 0       Enabled    [Autofan Control]    76C/168F   39C/102F
1 INTERFACE 1       Enabled    [Autofan Control]    76C/168F   35C/ 95F
2 INTERFACE 2       Enabled    [Autofan Control]    76C/168F   34C/ 93F
3 INTERFACE 3       Enabled    [Autofan Control]    76C/168F   41C/105F
4 ODYSSEY           Enabled    [Autofan Control]    76C/168F   50C/122F
5 NODE              Enabled    [Autofan Control]    76C/168F   55C/131F
6 BEDROCK           Enabled    [Autofan Control]    85C/185F   55C/131F

                     Zone Temp     Target    Current   Zone Fan   Curr/Min
Zone Name  State     Sensors       Average   Average   Index      Fan %
---------  --------  ------------  --------  --------  ---------  ---------
Node        Enabled           5,6  62C/143F  55C/131F          0   46%/ 46%
PCI         Enabled       0,1,2,3  45C/113F  37C/ 98F        3,4   57%/ 57%
ODY         Enabled             4  50C/122F  50C/122F          6   78%/ 64%
HD          Enabled             5  40C/104F  55C/131F          5   80%/ 38%
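
To compare tables like this one between machines, a small Python sketch can flag every zone whose current average is at or above its target. The row pattern is inferred from the zone table in this post only, and the function name is my own invention.

```python
import re

# Row layout inferred from the `l1cmd env` zone table above (an assumption).
ZONE_ROW = re.compile(
    r"^(?P<name>\w+)\s+(?P<state>\w+)\s+(?P<sensors>[\d,]+)\s+"
    r"(?P<target_c>\d+)C/\s*\d+F\s+(?P<current_c>\d+)C/\s*\d+F\s+"
    r"(?P<fans>[\d,]+)\s+(?P<fan_now>\d+)%/\s*(?P<fan_min>\d+)%"
)

def zones_at_or_over_target(text):
    """Return {zone: (current_c, target_c)} for zones at or above target."""
    hot = {}
    for line in text.splitlines():
        row = ZONE_ROW.match(line.strip())
        if row is None:
            continue  # header, separator or blank line
        current_c = int(row.group("current_c"))
        target_c = int(row.group("target_c"))
        if current_c >= target_c:
            hot[row.group("name")] = (current_c, target_c)
    return hot

table = """\
Node        Enabled           5,6  62C/143F  55C/131F          0   46%/ 46%
PCI         Enabled       0,1,2,3  45C/113F  37C/ 98F        3,4   57%/ 57%
ODY         Enabled             4  50C/122F  50C/122F          6   78%/ 64%
HD          Enabled             5  40C/104F  55C/131F          5   80%/ 38%
"""

hot_zones = zones_at_or_over_target(table)
print(hot_zones)
```

On the table above this flags ODY (50C against a 50C target) and also HD (55C against a 40C target).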


I looked through the forums here and came across a number of posts with sample environmental monitor output from Tezros. The temperatures in them all seem lower than what I see on both of my machines. Can anyone offer any thoughts on this? Should I worry?

Thank you.
canavan wrote:
Is this a rack mount tezro? I don't see the "BOOST" zone or Fans 8-10 listed.


Hi. Both units are "desktops", not rack mount...
diegel wrote:
I don't think this is normal. That's the temperature of my Tezro running 24/7 at 20C room temperature:
Code:
                     Zone Temp     Target    Current   Zone Fan   Curr/Min
Zone Name  State     Sensors       Average   Average   Index      Fan %
---------  --------  ------------  --------  --------  ---------  ---------
Node        Enabled           5,6  62C/143F  47C/116F          0   46%/ 46%
PCI         Enabled       0,1,2,3  45C/113F  32C/ 89F        3,4   57%/ 57%
ODY         Enabled             4  50C/122F  41C/105F          6   64%/ 64%
HD          Enabled             5  40C/104F  44C/111F          5   49%/ 38%
There is probably dust in the machines, or some part is blocking the airflow.


Thank you for the information. Both of these machines were very clean when I picked them up, with just a trace of black deposits on some of the parts. Nonetheless, the machine I am currently using was very thoroughly cleaned and inspected before I started using it on a more regular basis. I don't believe that there is anything blocking the cooling flow either, so I am completely baffled. I wonder if perhaps the temperature sensors have somehow drifted out of calibration...
recondas wrote:
The suggestion to make sure the fans in your Tezro are unobstructed is a good one - continued operation at those temperatures might well cause damage. The DM3 is positioned fairly close to the perforated metal shroud that encloses the V12, so it's likely the graphics in your Tezros run cooler without the DM3 because of improved airflow.

I think SGI was aware that graphics boards in certain tower Tezro configurations could run hot. Later revisions of the V12 used in the Tezro included what appears to be an (unused) 3-pin fan header - the one in the photo is an 030-1884-002 Revision D. To my knowledge that V12 fan header was never used - along with the development of 256 and 512MB versions of the V12, it may have fallen victim to SGI's decision to scale back/drop MIPS-based systems.


I will have a look under the shield to see if the V12 card in my Tezro has the header. I will also look through the manuals to see how to remove the V12. I am wondering if perhaps there is some heat transfer issue between the chips and the heatsink? It looked to me like the shield has been previously removed so perhaps something went awry in the process...

recondas wrote:
You might also compare the L1 revision in your Tezros with some of those posted in the hinv forums (that have cooler graphics temperatures). I don't know for certain that the temperature threshold/fan speed levels used by the L1's "Autofan Control" changed between Tezro L1 revisions, but I did notice differences in the Autofan Control between different L1 revisions in the very similar O350.


I recently upgraded the L1 to the firmware version that is supplied with 6.5.30 and I have not seen any concrete difference. Unfortunately I cannot upgrade it any further since I do not have a maintenance contract with SGI. I exchanged some emails with their support, but it looks like enthusiasts are out of luck as far as patches are concerned :( .

recondas wrote:
The Tezro I have arrived with a dead V12. There have been other mentions in the forum of failed Tezro graphics boards. Concerned that those failures might be heat related I placed a low-noise fan on the perforated metal cover directly above the heat sink on the V12. The additional fan lowered the average operating temperature of the V12 in my Tezro from 50C to 40C (though the DM3s in your Tezros will make adding a fan directly above the V12 heat sink slightly more challenging).


I will have another good look at the internals to see if there is something I can do to enhance the baseline cooling. Speaking as an engineer, there are a number of things in the baseline cooling design that I do not really understand. For example, if the V12 is this sensitive to overheating, the positioning of the cooling fans, the primary heatsink and the perforations in the cover shield do not seem ideal...

Thank you very much for your comprehensive response!
Hi,

I picked up a non-working HP 735/99 a few months ago. I finally had a bit of time to replace all the electrolytics in the power supply, as some of them had vented and damaged the PCB. Following the repair the power supply's red "pilot" LED illuminates with the mains power plugged in, but the machine still does not power up. I contacted ASTEC and HP with a request for some technical information or a schematic but was not successful in obtaining any. HP does not have it and ASTEC will not release it because the power supply is OEM to HP.

I am curious if anyone has any technical information on this power supply. The HP 735 service handbook does not even provide the pinouts, let alone any more detailed information. The power supply HP P/N is 0950-2081 and the ASTEC model is BM200-3601. Any help would be much appreciated before I dive in and try to troubleshoot it blind...

Thanks!
Thanks for the encouragement! I tried to contact ASTEC again with a request for a schematic or service manual. I also started looking at the pinout and the circuit, but it will be a very slow process.
Back in the day Sun released non-expiring, any-node demo licenses for WorkShop 3 and WorkShop 5. The relevant page can still be found at:

http://wayback-beta.archive.org/web/20080220024130/http://www.sun.com/software/licensingcenter/sundev.xml

Unfortunately I am not able to find the referenced WS5.0_SPARC license file on the Oracle web site or anywhere else for that matter. Does anyone have it backed up by any chance?

Many thanks.
I recently came across a number of SPARCstation 20 boxes and decided to restore a few of them to a "new" condition. I am quite impressed with the overall design and build quality of the units. However, when cleaning and inspecting the power supplies I discovered that some of the output filter capacitors had begun to vent electrolyte (not surprisingly). The power supplies were still functioning correctly and the ripple was within a reasonable range, but I expect that the deterioration would rapidly accelerate under use.

I have now rebuilt the power supplies and replaced all of the aluminum electrolytic capacitors with high quality long life components. This reduced the output voltage ripple and will hopefully allow the supplies to carry on for another 20 years. If anyone is interested, I would be more than happy to post the replacement parts list.
It's my pleasure.

I am attaching a PDF with a DigiKey parts list and with the original component details for the FDK PEX668-31 power supplies that my SPARCstations were equipped with. Note that some of the capacitors are installed on daughter boards and are either very difficult or impossible to rework without removing the daughter boards. The parts most likely to fail are C32, C35 and C27, C29 on the output filter side, but I replaced all of the capacitors on my power supplies, including the input side (line) filter.

Please be careful when working with the power supply as the input capacitor stores enough energy to be lethal - check its voltage to confirm that it is fully discharged before handling.
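
To put a rough number on that warning, the energy stored in a capacitor is E = 1/2 * C * V^2. The short Python sketch below evaluates it for placeholder values (470 uF charged to 325 V, roughly the peak of rectified 230 V mains). These figures are assumptions for illustration only and were not measured on the PEX668-31.

```python
def stored_energy_joules(capacitance_farads, voltage_volts):
    """Energy held by a charged capacitor: E = 1/2 * C * V^2."""
    return 0.5 * capacitance_farads * voltage_volts ** 2

# Placeholder values, NOT taken from the PEX668-31 (assumption for illustration).
ASSUMED_CAPACITANCE_F = 470e-6   # 470 uF bulk input capacitor
ASSUMED_VOLTAGE_V = 325.0        # approximate peak of rectified 230 V mains

energy = stored_energy_joules(ASSUMED_CAPACITANCE_F, ASSUMED_VOLTAGE_V)
print(f"Stored energy: {energy:.1f} J")
```

With these assumed values the result is on the order of a couple of dozen joules, which is why measuring the capacitor voltage before touching anything is worth the extra minute.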
I finally put some photos together for my flickr account that illustrate the capacitor replacement. It should be enough to provide a rough guide and illustrate the scope of work:

http://www.flickr.com/photos/t-rexky/sets/72157640695991374/

Enjoy.
Indeed, and I had to do that for my "primary" machine. However, if the installed NVRAM is original and still carries the orange barcode sticker, the NVRAM content can be recreated from the sticker. A complete guide to reprogramming the NVRAM can be found at:

http://www.squirrel.com/sun-nvram-hostid.faq.html

The SPARCstation 20 uses an STM M48T18-150 Timekeeper NVRAM with built-in battery. An exact replacement can be obtained from DigiKey under their part number 497-2838-5-ND. Note that the mechanical envelope has been changed slightly by STM and this new part will no longer fit into the plastic chip carrier / frame that the SPARCstation 20 uses to ease the chip removal. But it fits perfectly into the socket and functions perfectly as well.
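
As a rough illustration of what the reprogramming involves, here is a Python sketch of the 16-byte IDPROM layout as I understand it from the FAQ linked above: a format byte of 0x01, the machine type, the 6-byte Ethernet address, a 4-byte date, a 3-byte serial number, and a final checksum byte that is the XOR of the preceding 15 bytes. Treat the layout as an assumption to verify against the FAQ. The machine type 0x72 and the address bytes below are placeholders, not values read from a real sticker.

```python
from functools import reduce

def build_idprom(machine_type, ethernet, date, serial):
    """Assemble 16 IDPROM bytes; the last byte is the XOR of the first 15."""
    body = bytes([0x01, machine_type]) + bytes(ethernet) + bytes(date) + bytes(serial)
    if len(body) != 15:
        raise ValueError("expected 6 Ethernet, 4 date and 3 serial bytes")
    checksum = reduce(lambda acc, byte: acc ^ byte, body, 0)
    return body + bytes([checksum])

def hostid(idprom):
    """Host ID as hex: the machine type byte followed by the 3 serial bytes."""
    return bytes([idprom[1]]) + idprom[12:15]

# Placeholder values only (assumptions, not from a real machine).
idprom = build_idprom(
    machine_type=0x72,                              # assumed SPARCstation 20 type
    ethernet=[0x08, 0x00, 0x20, 0xC0, 0xFF, 0xEE],
    date=[0x00, 0x00, 0x00, 0x00],
    serial=[0xC0, 0xFF, 0xEE],
)
print(idprom.hex())
print(hostid(idprom).hex())
```

A quick sanity check on any candidate IDPROM is that XOR-ing all 16 bytes together gives zero.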
Yes, I am still looking for the license since I was unable to locate it. Thank you.
Thanks for the suggestion. All of the electrolytics have already been replaced, as some of them leaked and started corroding the board...
Have you used the capacitor models and part numbers from my list or just "any" capacitors of the same capacitance and voltage rating?

In switching power supplies it is critical to use the correct low impedance capacitors in order for them to function correctly. When I put the replacement list together I matched the specifications of the new capacitors to those of the originals, ensuring that they are at least as good in terms of impedance. In general I pick the absolute best quality long-life parts for the recap jobs to make sure that they survive for a very long time. Note, however, that too low an impedance is also not good, as it may cause power supply instabilities.
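
The selection rule can be sketched in Python as a simple datasheet comparison. This is only an illustration: the `Capacitor` fields, the sample values and especially the `min_impedance_ratio` floor of 0.5 are hypothetical choices of mine, not figures from my parts list or from any manufacturer guidance.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Capacitor:
    capacitance_uf: float
    voltage_v: float
    impedance_ohm: float   # datasheet impedance, e.g. at 100 kHz
    ripple_ma: float       # rated ripple current

def is_suitable(original, candidate, min_impedance_ratio=0.5):
    """True if the candidate matches or betters the original without
    dropping below a (hypothetical) impedance floor."""
    floor = min_impedance_ratio * original.impedance_ohm
    return (
        candidate.capacitance_uf == original.capacitance_uf
        and candidate.voltage_v >= original.voltage_v
        and candidate.ripple_ma >= original.ripple_ma
        and floor <= candidate.impedance_ohm <= original.impedance_ohm
    )

# Made-up example parts for illustration only.
original = Capacitor(1000.0, 16.0, 0.060, 1200.0)
long_life = Capacitor(1000.0, 25.0, 0.045, 1500.0)
ultra_low = Capacitor(1000.0, 16.0, 0.010, 2500.0)
general_purpose = Capacitor(1000.0, 16.0, 0.250, 600.0)

print(is_suitable(original, long_life))
print(is_suitable(original, ultra_low))
print(is_suitable(original, general_purpose))
```

Only the first candidate passes: the second falls below the assumed impedance floor, and the third is worse than the original in both impedance and ripple current.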

Are you sure that nothing got damaged during your recap and that you soldered and assembled the power supply correctly?
All switch-mode power supplies use similar fundamental topologies, but detailed implementations vary. The best thing I would suggest is contacting Fuji in an attempt to obtain the power supply schematic. It is a long shot, but being persistent helps and I was able to do this successfully for other power supplies in the past...

Beyond this, a good electronics repair shop would be able to troubleshoot and repair the unit, but this approach would be rather expensive. It would be cheaper to purchase a replacement power supply.

I am sorry that I cannot be of more help.
This is absolutely the wrong forum for this, but I have almost fully bootstrapped gcc-4.6 on NEXTSTEP 3.3. It takes somewhere around a week on my 33 MHz TurboColor for just C and C++...
For those interested, the Previous NeXT emulator has just been updated to version 1.6. Andreas keeps chipping away at it and it is now superbly usable!

http://www.nextcomputers.org/forums/viewtopic.php?p=23501#23501
Raion-Fox wrote: Try the Misc forum next time.


My post was in response to:

ClassicHasClass wrote: I guess I should try bootstrapping an old gcc at some point.


So I fully intended for it to go into this thread despite it being somewhat off-topic. It illustrates the ridiculous amount of time that modern gcc takes to bootstrap on vintage hardware.
robespierre wrote: Apple seems to be the only company whose capacitors all leak. (It doesn't happen at all on NeXT computers, or Suns, or SGIs...)
It seems they used a harsher post-solder cleaning regimen that was destructive to the rubber seals of the capacitors.

There is no general problem with SMD capacitors leaking. You can replace them with direct substitutes.


Well, absolutely not true. The early SMD electrolytic capacitors are absolutely notorious for failing. NeXT used two alternate suppliers for theirs, with one supplier's parts holding up reasonably well while the other's leak like faucets. There is a large supply of new old stock NeXT sound boards and many of them are damaged because of leaked electrolyte. These are boards that have never been powered up since they left the factory.

Anyhow, have a look here for my NeXTstation Turbo Color recap: https://flic.kr/s/aHsjyr4fMY

And also here for my SPARCstation 20 PSU recap: https://flic.kr/s/aHsjS55diC
kjaer wrote: If you look at Suns, NeXTs, SGIs of the same vintage... you'll find they haven't used any SMT electrolytics anywhere. That's why they "don't leak".


Also absolutely not true. Please see my post above...

All of my vintage equipment has been very carefully recapped with modern equivalent electrolytics. They have been selected to fit physically and electrically, including ESR, ripple current rating, etc. Notwithstanding any potential surprises with batch issues this should make the equipment good for another ~30 years...