SGI: Hardware

4D/240GTX troubleshooting - Page 1

Considering the case has "Sep 1989" stamped on it in several places, this twin-tower beast is in pretty good shape. It powers up just fine, and I've rigged up a serial port converter according to this page:
http://www.meadow.net/pinouts.html#sgikeydb9

Connecting a VT220 to the first serial port gives me the (surprisingly) familiar POST messages:
Code:
Version 4D1-4.0A IP7 OPT Tue May  2 11:26:34 PDT 1989 SGI

Memory address pattern test from B0000000 to B0014000
Looking for additional CPUs!
CPU 1 recognized
CPU 2 recognized
CPU 3 recognized
Total of 4 CPUs running!
Testing all UART's ports                        PASSED.
Timer/Clock                                     PASSED.
Floating Point Unit                             PASSED.
Sync Bus Controller                             PASSED.
I/O Mapper, INT Vectors, MODE Regs              PASSED.
SCSI Controller                                 PASSED.
Data and Instruction Caches                     PASSED.
Full Memory Test                                PASSED.
MP Caches                                       PASSED.
Initialize local hardware!

CPU     LOCAL DIAG      1st CACHE       MEMORY          MP CACHES
0       Passed          Passed          Passed          Passed
1       Passed          Passed          Passed          Passed
2       Passed          Passed          Passed          Passed
3       Passed          Passed          Passed          Passed

Sizing and clearing 56 Mbytes of memory!   Initialize local hardware!
Loading Monitor ... Done!
nv ram checksum incorrect, check with printenv and fix with 'setenv ENV_VAR'
where ENV_VAR is a non-volatile environment variable such as bootmode

Error-- keyboard not responding

Error-- cannot open console "gfx(0)"


System Maintenance Menu
1) Start System
2) Install System Software
3) Run Diagnostics
4) Recover System
5) Enter Command Monitor

Option?


I have also connected the same serial adapter to the serial port on one of the GM. It prints out a sequence of POST messages and concludes with:
Code:
GM PROM diags PASSED.

It prints this series of messages twice, actually, on first power-up. Then it presents a prompt-looking "gm>" but doesn't take any input.

The LED status digit near the front power switch alternates between 1 and 2 continuously. This is normal while in PROM, right?

There's no keyboard attached, so I expect the no keyboard and gfx(0) errors. All the other diagnostics seem to pass, but I'm experiencing two problems:
1) I am unable to enter anything into the terminal -- it doesn't accept or respond to any input over the VT.
2) When plugging in a monitor using the primary graphics output, I'm presented with a garbled, black and white console screen which echoes the output on the serial port. You can almost make out what's on the screen, but clearly there's something wrong with the graphics subsystem.

I have a couple old SGI keyboards from this era, and am planning to make a keyboard adapter as well. But beyond that, I'm not sure where to begin troubleshooting the console and graphics issues. I hope it's salvageable! The lack of SGI documentation is unfortunate -- or can anyone point me to a reference more specific than This Old SGI (which is excellent but doesn't have a lot of detail about this system and the options available)? Are there scans of the original docs somewhere?

Also, is http://sgistuff.g-lenerz.de/ down or is it just me? (I just get a "500 Internal Server Error" message.)

I'll post some pics on here if I can figure out how to do that... :)

Cheers!

_________________
:Onyx2RM: :Onyx2: :O200: :4D70G: :Fuel: :Indigo: :Octane2: :O2: :Indigo2: :Indy:
First of all can you tell us the exact pinout for the serial converter? Because you list the wrong section for the serial port on the Power Series. It should have been:
http://www.meadow.net/pinouts.html#sgiDB9F
I sometimes refer to it as a 4D-serial port, so for connecting a 4D-DB9 female to IBM-DB9 male serial port you need a 4D-DB9 male to IBM-DB9 female null modem cable. (for a stand alone terminal you need a 9-25 converter) I have such a cable for my Crimson and PI, and it took me a few tries to get it right.

This might explain why you do get output but no input response from the 4D-240. From "This old SGI" page:
Quote:
Buried deep within the GT graphics subsystem is a little-known interface which resides on the GM1 card. The GM1 appears to be built around a MC68020 CPU running at 16 Mhz. If you examine the card, you will notice a female DB9 connector. This connector provides an RS232C serial interface (9600,N,8,1) with more or less the same pinout as the standard SGI DB9 Serial connector. The only difference is that this serial interface doesn't support any of the hardware handshaking lines, so you can get away with simply connecting pins 2,3 and 7 via an appropriate adaptor cable (or you can use the same type of adaptor cable needed to connect an ASCII terminal to serial port 1 for debugging purposes. See the chapter on 4D Series Serial Port Pinouts for more details.)
So how about the GM> prompt? Can you enter stuff in there? Then chances are that your handshaking is correct for the GM serial port but incorrect for accessing the 240 serial port.
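
As a side note on the "9600,N,8,1" setting quoted above, here is a minimal sketch of what that framing works out to on the wire. The function names are made up for illustration; the only inputs are the line settings from the quote.

```python
def frame_bits(data_bits=8, parity="N", stop_bits=1):
    """Bits sent per character: one start bit, the data bits,
    an optional parity bit, and the stop bit(s)."""
    parity_bits = 0 if parity.upper() == "N" else 1
    return 1 + data_bits + parity_bits + stop_bits


def chars_per_second(baud, data_bits=8, parity="N", stop_bits=1):
    """Upper bound on character throughput for a given line setting."""
    return baud / frame_bits(data_bits, parity, stop_bits)


if __name__ == "__main__":
    # 9600,N,8,1 as quoted from This Old SGI
    print(frame_bits(8, "N", 1))
    print(chars_per_second(9600))
```

If the terminal is set to a different framing than the port, you typically get garbage rather than nothing, so clean output with no input points at wiring rather than at these settings.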

That output doesn't look good. I hope it improves once you pull out the GT boards and have a good look at them. Incidentally, can you take a photo of the GTX boardset? It can be anything from 3 to 5 or even 6 boards. GM2 and GE4 should always be there, and they are bridged by a small board. You can have up to two RM1s and an RV1.5 or RV2.

Quote:
Also, is http://sgistuff.g-lenerz.de/ down or is it just me? (I just get a "500 Internal Server Error" message.)
http://www.g-lenerz.de/retro-collection.html might suggest that only his php link with apache is slightly shitty. standard html pages do load fine.

_________________
:Crimson: :PI: :Indigo: :O2: :Indy: :Indigo2: :Indigo2IMP: :O2000: :Onyx2:
European nekoware mirror, updated twice a day: http://www.mechanics.citg.tudelft.nl/~everdij/nekoware
ftp://mech001.citg.tudelft.nl rsync mech001.citg.tudelft.nl::nekoware
zuluchas wrote:
nv ram checksum incorrect, check with printenv and fix with 'setenv ENV_VAR'
where ENV_VAR is a non-volatile environment variable such as bootmode

Wow, I've never seen a PowerSeries with a dead NVRAM battery, but it seems it can happen after all.
The battery is on the IO2.

But maybe a 'resetenv' and running the beast for a while does the trick.

zuluchas wrote:
I have also connected the same serial adapter to the serial port on one of the GM. It prints out a sequence of POST messages and concludes with:
Code:
GM PROM diags PASSED.

It prints this series of messages twice, actually, on first power-up. Then it presents a prompt-looking "gm>" but doesn't take any input.

I think I managed to get some limited interaction with it once but it's a long time ago. It's not very useful.
zuluchas wrote:
The LED status digit near the front power switch alternates between 1 and 2 continuously. This is normal while in PROM, right?

Correct.
zuluchas wrote:
There's no keyboard attached, so I expect the no keyboard and gfx(0) errors. All the other diagnostics seem to pass, but I'm experiencing two problems:
1) I am unable to enter anything into the terminal -- it doesn't accept or respond to any input over the VT.

You probably missed a wire in the adapter cable. With communication working the other way, I would guess you've got it set for 9600/n/8/1.

zuluchas wrote:
2) When plugging in a monitor using the primary graphics output, I'm presented with a garbled, black and white console screen which echoes the output on the serial port. You can almost make out what's on the screen, but clearly there's something wrong with the graphics subsystem.

Looks like a version of the infamous 'pinstripe of death'. Either that, or you need to reseat the boards, and especially the bridge board connecting the RM's and DG at the front. Unfortunately, GTX graphics seem to all suffer from the infamous pinstripe now. I haven't seen a working set in years.
zuluchas wrote:
I have a couple old SGI keyboards from this era, and am planning to make a keyboard adapter as well. But beyond that, I'm not sure where to begin troubleshooting the console and graphics issues. I hope it's salvageable! The lack of SGI documentation is unfortunate -- or can anyone point me to a reference more specific than This Old SGI (which is excellent but doesn't have a lot of detail about this system and the options available)? Are there scans of the original docs somewhere?

I have an "Owners Guide" for a PowerSeries, but it's not very in depth. You weren't supposed to mess with these systems, you were supposed to call SGI. "This old SGI" has more information than the manual. I've got a PDF of the Crimson Field Service Handbook, but once again, not much.

If you install IRIX 5.3 on it, the "Diagnostics 5.3" CD might be able to find the problem. PM me about that one once you're ready. But I'm pretty sure it will tell you there's a problem with one of the RMs

zuluchas wrote:
Also, is http://sgistuff.g-lenerz.de/ down or is it just me? (I just get a "500 Internal Server Error" message.)

Yeah, Gerhard is having some issues with his current hoster and is moving to a new domain.

_________________
Now this is a deep dark secret, so everybody keep it quiet :)
It turns out that when reset, the WD33C93 defaults to a SCSI ID of 0, and it was simpler to leave it that way... -- Dave Olson, in comp.sys.sgi

Currently in commercial service: Image :Octane2: :Onyx2: (2x) :0300:
In the museum: almost every MIPS/IRIX system.
I put together a keyboard adapter and am now able to communicate with the PROM. Unfortunately, I'll be away for a while so won't have time to put together a new serial adapter and re-test. I did use the layout listed in the link you mentioned (#sgiDB9F) but had moved on to the keyboard link since that was the next project. I've got IRIX 3.3.2 on tape, which I'd like to try out, as well as 4.0.something. I've also managed to pick up a Crimson with VGX, so may try swapping in the graphics boards -- I assume this is a valid combination for the 4d/240 since it supports VGX, right? Would I have to upgrade the frontplane as well?

Here are the PNs for the boards currently installed in the 4d/240 and some pics:
013-0206-001 (tape controller)
013-0203-003 (disk controller)
030-0118-001 (IO2)
2x 030-0135-001 (dual R3k @ 25MHz boards)
030-0117-001 (MC2)
030-0154-003 (RV1.5 w/o alpha)
030-0078-002 (RM1 w/o alpha)
030-0078-001 (RM1 w/ alpha)
030-0075-001 (GE4)
030-0116-002 (some unlisted graphics option board with 2x RGBS BNC, 2x "composite" BNC, and DB-9 "trigger" ports)
030-0139-001 (frontplane)

Sorry the pics are crappy -- they're from a phone camera.

On another note, the two stackable drive bays are slightly different from one another on the inside. It looks like you could just keep stacking them on up, although I've never seen a pic with more than two shown. Does anyone know if there was a limit to the number of bays stackable on a system, and which is supposed to be the top bay of the two (if there is a preferred top bay)?

In the prom, the tape drive is recognized by hinv. I haven't had a chance to install a SCSI HDD or anything else yet. I tried reseating the connector boards and the problem got worse! Will reseat again and see how it goes, but looks like this may be another version of the pinstripe issue. :(

_________________
:Onyx2RM: :Onyx2: :O200: :4D70G: :Fuel: :Indigo: :Octane2: :O2: :Indigo2: :Indy:
Thanks for the photos, i actually meant one photo of the system when powered on, but this looks neato as well :)
Quote:
030-0116-002 (some unlisted graphics option board with 2x RGBS BNC, 2x "composite" BNC, and DB-9 "trigger" ports)

I found something on:
http://groups.google.nl/group/comp.sys. ... c4a8deff72
and the VP1 is mentioned here:
http://groups.google.nl/group/comp.sys. ... 68f54c4c93
Could this be the VP1 digitizer? Can you check the board for a two-letter-one-number indication, usually next to the SGI part number. If you see a VP1 then you're the lucky owner of a rare video digitizer option for the GTX.

Where is the GM2 BTW?

Anyway, the best way to isolate problems is to swap boards. The minimum gfx setup is an IO2, one IP7 and the MC2, followed by the GM2 bridged with GE4 and RV1.5. This uses only the RV1.5 framebuffer. If this already produces the pinstripe, then the RV1.5 board has bad memory somewhere. If reseating makes things worse, then also look at the backplane and check for bent pins or dirt.
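
To make the comparison concrete, here is a small sketch that checks the part list posted above against that minimum set. The mapping of part numbers to board names follows the OP's list; treating the dual-R3000 boards (030-0135-001) as IP7 is an assumption based on the POST banner, and the unidentified 030-0116-002 board is left out.

```python
# Minimum graphics-capable set as described above.
MINIMUM_GFX_SET = {"IO2", "IP7", "MC2", "GM2", "GE4", "RV1.5"}

# Boards as listed earlier in the thread (part number -> name).
listed_boards = {
    "030-0118-001": "IO2",
    "030-0135-001": "IP7",   # assumption: the dual-R3k CPU boards are the IP7s
    "030-0117-001": "MC2",
    "030-0154-003": "RV1.5",
    "030-0078-002": "RM1",
    "030-0078-001": "RM1",
    "030-0075-001": "GE4",
}


def missing_boards(installed, required=MINIMUM_GFX_SET):
    """Return the required board names that do not appear in `installed`."""
    return sorted(set(required) - set(installed))


if __name__ == "__main__":
    print(missing_boards(listed_boards.values()))
```

Run against the posted list, the only name that comes back is the GM2, which is why I'm asking about it.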

_________________
:Crimson: :PI: :Indigo: :O2: :Indy: :Indigo2: :Indigo2IMP: :O2000: :Onyx2:
dexter1 wrote:
Where is the GM2 BTW?

My thought exactly. But since he's got output on the screen he probably forgot to list it.

_________________
Currently in commercial service: Image :Octane2: :Onyx2: (2x) :0300:
In the museum: almost every MIPS/IRIX system.
Shaking with excitement when you took the photos of the boards?

Your collection is growing :)
Hehe, yeah, that and this phone camera needs full daylight conditions to take decent pics...but there's only so much a halogen lamp can do for you at night!

The collection is growing but floor space is not keeping pace!

_________________
:Onyx2RM: :Onyx2: :O200: :4D70G: :Fuel: :Indigo: :Octane2: :O2: :Indigo2: :Indy:
Okay, I've gotten the serial console to work. I ended up going with the DB9M-DB25F configuration and realized I didn't need a null modem to get it working with the VT220. That did the trick, so I can now, in theory, do some real troubleshooting. I will stick a 2GB drive in and see if I can netboot it on 5.3. @jan-jaap, I guess I'm ready for the 5.3 diagnostics!

I've got the serial cable hooked up to the GM board and can communicate with it just fine as well. There is a "test" function, but not sure what to do with it.

_________________
:Onyx2RM: :Onyx2: :O200: :4D70G: :Fuel: :Indigo: :Octane2: :O2: :Indigo2: :Indy:
A couple of updates:

I've been able to netboot sash 3.3.2 and 5.3 over ethernet (using an AUI-10bT adapter) from the onyx2. I've had a few issues copying the miniroot over the net, though I think they are more SCSI- than net-related.

I've found a few differences between my experience and the This Old SGI notes about booting and about using CD-ROMs. One thing I noticed was that I couldn't boot to an arbitrary file via the boot command in the PROM. I had to specify the boot path beforehand via the environment variable "bootfile" and then just type "boot". Although the SCSI CD drive was recognized as an "unknown / removable" scsi device, I couldn't figure out how to boot sash from it. I tried dksc(0,6,8)sash.IP7 and scsi(0,6,8)sash.IP7 with no luck. Once I netbooted sash 5.3, there was no problem addressing the cd drive as dksc(0,6,x).
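
For anyone following along, here is a rough sketch of how I read those PROM paths. The field names (controller, unit, partition) are my interpretation of the three numbers, not something taken from SGI documentation, and the parser is purely illustrative.

```python
import re
from typing import NamedTuple


class PromPath(NamedTuple):
    device: str      # driver name, e.g. "dksc" or "scsi"
    controller: int  # assumed meaning of the first number
    unit: int        # assumed meaning of the second number (the SCSI ID)
    partition: int   # assumed meaning of the third number
    filename: str    # whatever follows the closing parenthesis


_PATH_RE = re.compile(r"^(\w+)\((\d+),(\d+),(\d+)\)(.*)$")


def parse_prom_path(path: str) -> PromPath:
    """Split a path like 'dksc(0,6,8)sash.IP7' into its parts."""
    m = _PATH_RE.match(path.strip())
    if m is None:
        raise ValueError(f"not a device(c,u,p)file style path: {path!r}")
    dev, ctrl, unit, part, fname = m.groups()
    return PromPath(dev, int(ctrl), int(unit), int(part), fname)


if __name__ == "__main__":
    print(parse_prom_path("dksc(0,6,8)sash.IP7"))
```

Read this way, both paths I tried point at the same place: unit 6 on controller 0, partition 8, file sash.IP7.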

I think the SCSI cabling may be in disarray. I'm not sure what the recommended layout is, but when I tried what seemed to make sense, it didn't work. When I received the system, there was a standard 50-pin internal SCSI ribbon cable running from the controller to the drive tower. There it got weird. The tower's sideplane was connected at the second-to-last port on that cable, then the tape drive as the next-to-last, and nothing on the end. That would seem to form a "Y", wouldn't it? Not so good for SCSI, but switching the sideplane to the last plug on the ribbon didn't work. By keeping the original Y configuration and adding a drive to the last plug (with termination on) and an ID of 2 (not 1, that doesn't work) I've got a working configuration with a tape at ID 7 and a drive at ID 2. I could then add a SCSI CD in the first stackable drive unit on top at ID 6. As mentioned before, the two stackable modules are slightly different in their layout and SCSI cabling, and I'm wondering if there is a right or wrong ordering for stacking them. The one I tested with has two 50-pin scsi connectors on the sideplane; the other module has only one. It seems to work (as in, hinv finally recognizes three scsi devices), but then I have trouble copying the miniroot to disk with SCSI timeout errors and such.

I've attached some pics of the modules and cabling for reference.

SCSI to-do: try different stackable module configurations, different SCSI HDDs, take the tape out and examine the jumpers to figure out what its SCSI settings are.

In terms of graphics, I've taken all the boards out and examined their pins. None are broken, bent, or missing. When I remove the RM boards from the system, the symptoms change but do not improve: essentially the screen is much whiter, but just as garbled. Looks like the GTX is on its last legs.

I can confirm that the "unidentified" board is labeled VP1. And, yes, there's a GM2 in there as well.

Gfx to-do: try the crimson's VGX boards in the 240?

Unfortunately, I'll be away from tomorrow for a long while. Any thoughts appreciated in the meantime!

_________________
:Onyx2RM: :Onyx2: :O200: :4D70G: :Fuel: :Indigo: :Octane2: :O2: :Indigo2: :Indy:
I was always under the impression that the hard disk liked to be at ID 1, the tape at ID 2 and a cd drive (notably an older Toshiba as the PROM likes them more than other brands) at ID 4.
Also, check your termination for any issues.

_________________
:Crimson: :Onyx: :O2000: :PI: :Indigo: :Indigo: :O2: :Indigo2: :Indigo2: :Indigo2IMP: :Indigo2IMP: :Indy: :Indy: :Indy: :Cube:

Image <-------- A very happy forum member.
I'll try those combinations. I was unable to get the drive to be recognized at ID 1 even with the tape removed from the chain. It showed up in hinv as occupying every SCSI ID on the bus instead.

Do I understand correctly that the last SCSI device in the chain is the one that needs the "TERM" jumper for SCSI buses w/o an actual physical terminator? That's one of the reasons I'm confused by the apparent Y cabling configuration. Does anyone know how these drive tower sideplanes actually work? Are they just a straight SCSI bus extension or do they act as a SCSI device proxy for everything attached above (that would be weird!)?

_________________
:Onyx2RM: :Onyx2: :O200: :4D70G: :Fuel: :Indigo: :Octane2: :O2: :Indigo2: :Indy:
zuluchas wrote:
I'll try those combinations. I was unable to get the drive to be recognized at ID 1 even with the tape removed from the chain. It showed up in hinv as occupying every SCSI ID on the bus instead.

My Indigo2 had that happen when the drive was set to an invalid ID.

_________________
:O3000: :1600SW: :Indigo2IMP: :0300:

"Remember, if they can't find you handsome, they should at least find you handy."
That also happens when you set something to ID 0, which confuses the controller.

_________________
:Crimson: :Onyx: :O2000: :PI: :Indigo: :Indigo: :O2: :Indigo2: :Indigo2: :Indigo2IMP: :Indigo2IMP: :Indy: :Indy: :Indy: :Cube:

Image <-------- A very happy forum member.
FWIW: This system is old enough that it requires a CD-ROM drive with a special firmware which basically mimics a harddisk when first powered up. It is usually marked "P6-CDROM". This one isn't. If you put a generic SCSI CD-ROM in, you won't be able to boot the system from it.

Second: the board in the "stackable" drive module looks like a Y-junction in the SCSI chain. But that's not how SCSI works. SCSI is a chain, terminated at both ends.
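
To spell out the rules being discussed, here is a minimal sketch of a sanity check for a single narrow SCSI chain. It assumes the controller sits at ID 0 (as mentioned elsewhere in this thread), that the controller end is already terminated, and that devices are listed in physical cable order. The `Device` class and `check_chain` function are hypothetical helpers, not any real tool.

```python
from dataclasses import dataclass

HOST_ID = 0  # assumption: the controller keeps ID 0


@dataclass
class Device:
    name: str
    scsi_id: int
    terminated: bool = False


def check_chain(devices):
    """Return a list of problems for devices given in cable order,
    controller end first. An empty list means nothing obvious is wrong."""
    problems = []
    seen = {}
    for d in devices:
        if not 0 <= d.scsi_id <= 7:
            problems.append(f"{d.name}: ID {d.scsi_id} is outside 0-7")
        elif d.scsi_id == HOST_ID:
            problems.append(f"{d.name}: ID {d.scsi_id} clashes with the controller")
        elif d.scsi_id in seen:
            problems.append(f"{d.name}: ID {d.scsi_id} already used by {seen[d.scsi_id]}")
        else:
            seen[d.scsi_id] = d.name
    for i, d in enumerate(devices):
        is_last = i == len(devices) - 1
        if d.terminated and not is_last:
            problems.append(f"{d.name}: terminated but not at the end of the cable")
        if is_last and not d.terminated:
            problems.append(f"{d.name}: last on the cable but not terminated")
    return problems


if __name__ == "__main__":
    chain = [Device("tape", 7), Device("cdrom", 6), Device("disk", 2, terminated=True)]
    print(check_chain(chain))
```

A real Y in the cabling cannot be expressed as a single ordered list like this, which is the point: if the sideplane wiring truly branched, there would be no single "last device" to terminate.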

_________________
Currently in commercial service: Image :Octane2: :Onyx2: (2x) :0300:
In the museum: almost every MIPS/IRIX system.
She's up and running IRIX 3.3.2 now!

viewtopic.php?f=7&t=16722120

And with some part swaps is a pretty functional 4d/240 VGX. Now to do something about a mouse....

_________________
:Onyx2RM: :Onyx2: :O200: :4D70G: :Fuel: :Indigo: :Octane2: :O2: :Indigo2: :Indy:
I've got some 8MB sticks for the MC2, so I took out all of the old 2MB ones first. I've been reading old posts here and discovered that you can mix 8mb and 2mb sticks (contrary to This Old SGI), but that there are some limitations.

Success mixing RAM sizes on MC2, suggests limitation is in PROM version:
viewtopic.php?f=5&t=6727&hilit=mc2

Well, what I found was that when I put known perfectly good 8MB sticks in the MC2, they came up as 2MB! I tried it with 8x 8MB sticks, arranged in matching adjacent pairs, each set of four identical -- the 4D sees 16MB. I tried it with 12x 8MB and got "24MB." This is without any 2MB sticks in at all, so it has nothing to do with the supposed inability to mix 8MB and 2MB on the same board.
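
The arithmetic behind that, as a quick sketch (the variable and function names are just for illustration; the numbers are the ones reported above):

```python
# Sticks installed -> MB reported by the 4D, from the two tests above.
observations = {8: 16, 12: 24}

STICK_SIZE_MB = 8  # what the sticks actually are


def implied_stick_size(sticks, reported_mb):
    """Size per stick that the reported total works out to."""
    return reported_mb / sticks


if __name__ == "__main__":
    for sticks, reported in observations.items():
        expected = sticks * STICK_SIZE_MB
        per_stick = implied_stick_size(sticks, reported)
        print(f"{sticks} sticks: expected {expected} MB, "
              f"reported {reported} MB, i.e. {per_stick} MB per stick")
```

Both tests work out to exactly 2 MB per stick, against expected totals of 64 MB and 96 MB, so each 8MB stick is consistently being sized as if it were a 2MB one.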

Is this a PROM version limitation, as suggested in the above post? Or am I missing something?
I don't see any jumpers on the MC2 to change ram type. I don't see any PROM settings or commands. The "resetenv" command suggested earlier does not exist.

The MC2 board is P/N: 030-0117-001 rev D, so maybe too early a version for 8MB sticks?

When first turned on, the 4D comes up with:
Code:
Version 4D1-4.0A IP7 OPT Tue May  2 11:26:34 PDT 1989 SGI


Then in the PROM, the version given is:
Code:
PROM Monitor Version 4D1-4.0A PROM IP5 OPT Fri Jul 14 09:28:31 PDT 1989 SGI

_________________
:Onyx2RM: :Onyx2: :O200: :4D70G: :Fuel: :Indigo: :Octane2: :O2: :Indigo2: :Indy:
I have an MC2 030-0117-039 Rev A, but I have never populated it with 8mb sticks since they are in my Crimson at the moment, and I do not have an IP7 anymore.

I'm sure JanJaap knows this.

from the 4dfaq it states that you cannot mix 2mb and 8mb on an MC2.

_________________
:Crimson: :PI: :Indigo: :O2: :Indy: :Indigo2: :Indigo2IMP: :O2000: :Onyx2:
dexter1 wrote:
I'm sure JanJaap knows this.

Uhhhh :oops:

dexter1 wrote:
from the 4dfaq it states that you cannot mix 2mb and 8mb on an MC2.

There was a link to my adventures with a Crimson in this thread. It worked there, but a Crimson doesn't have an MC2.

I already promised the OP to check out my systems. I also have a couple of MC2's in storage. Another possibility is the firmware he has on the IP5, which is not exactly the latest version.

_________________
Currently in commercial service: Image :Octane2: :Onyx2: (2x) :0300:
In the museum: almost every MIPS/IRIX system.
I'm thinking that the PROM must be the problem for the MC2 board. I look forward to hearing what you find in the attic, Jan-jaap!

Just out of interest, I'll try replicating the IP17 mix of 8MB and 2MB on the same board (in the Crimson, of course, not the 4D) and let you know what happens (and what version of the PROM that system is using).

_________________
:Onyx2RM: :Onyx2: :O200: :4D70G: :Fuel: :Indigo: :Octane2: :O2: :Indigo2: :Indy: