The collected works of ThaddeusW

So my 1GB memory kit arrived today: two 512MB DIMMs (holy cow, are they huge!). As soon as I got home I prepped an old Linux laptop with minicom and a LapLink cable and proceeded to install the DIMMs into the first unit in the pile. I typed power up and the first O300 came to life, awesome! After the system started I switched to console mode by hitting Ctrl-D and watched as the memory check reported an error: "swapping bank 0 with bank 1". Rats, do I have a bad DIMM? So I opened the system, swapped the memory modules and powered it back up. Same message.
Further down the console boot log it says:
memory enabled = 512MB
memory disabled = 256MB

Thinking I might have been sent two mismatched DIMMs, I took them out and inspected them carefully, even comparing chip part numbers. Identical. Okay, let's try the next system: power up, and what do you know, no memory errors and "memory enabled = 1024MB". I went back to the first unit and checked the DIMM sockets for dirt: nothing, clean as a whistle. I tried installing the memory again, wiggling the DIMMs to make sure they were firmly seated, and still got the same memory error messages.

Now here is where it gets interesting. The first unit I opened still had the two screws securing the access lid. The other four units have been missing their screws since the memory was pulled. Whoever pulled the memory apparently knew this unit had none and never bothered to open it. This leads me to believe that the memory controller had gone bad before the system was decommissioned. From what I know, nodes can be operated without memory when linked to a unit or units with memory, correct? They probably just ran the unit without memory.

I have not yet tested the other 3 systems but I will tomorrow when I have some more time. Looks like one system is a partial lemon :cry: . Any way to fix this problem or is this a bad logic board?

_________________
Coming soon! :O3x05R: On their way: :Onyx2: :Octane:
Jack of all trades at a small specialized laser and electron beam welding company.

I do everything from IT to designing, building and programming custom control systems. My latest project was an automatic fixture for a helium mass spectrometer leak detector. I also do a lot of repair work, as some of our machines are decades old with all sorts of upgrades tacked on.
recondas wrote:
The suggestions to clear the power on diagnostic (aka POD) logs are good ones.

To access POD mode, stop at the PROM command line (item 5 in the PROM menu list), and sequentially run the following commands from the command line:
  • pod
  • go cac
  • clearalllogs
  • initalllogs
  • flush
  • reset (the system will restart)

After the system restarts, go back into the PROM monitor and execute:
  • enableall
  • update
  • reset (the system will restart)
If the memory errors don't reappear, you're probably safe.

If they do reappear, then I'd suggest trying that pair of DIMMs in the second set of memory slots. Slots 1 and 3 normally hold the first pair, slots 2 and 4 the second pair. The error message you received, "swapping bank 0 with bank 1", makes it sound like the system has disabled bank 0 and is expecting to find the memory in bank 1. It's a long shot that may not work, but if it does it beats replacing the logic board.


You sir are MY HERO!

I went from this: :(
Code:
**** System Configuration and Diagnostics Summary ****
CONFIG:
No. of NODEs enabled    = 1
No. of NODEs disabled   = 0
No. of CPUs enabled     = 4
No. of CPUs disabled    = 0
Mem enabled             = 512 MB
Mem disabled            = 256 MB
No. of RTRs enabled     = 0
No. of RTRs disabled    = 0

DIAG RESULTS:
/hw/module/001c16/node/mem: MEMBANK(S) 0  disabled
Reason:
Bank 0: Some DIMMs failed mem test.
**** End System Configuration and Diagnostics Summary ****


To this! :D

Code:
**** System Configuration and Diagnostics Summary ****
CONFIG:
No. of NODEs enabled    = 1
No. of NODEs disabled   = 0
No. of CPUs enabled     = 4
No. of CPUs disabled    = 0
Mem enabled             = 1024 MB
Mem disabled            = 0 MB
No. of RTRs enabled     = 0
No. of RTRs disabled    = 0

DIAG RESULTS:
ALL DIAGS PASSED.
**** End System Configuration and Diagnostics Summary ****


Thank you for the clear and precise instructions.
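
For anyone who ends up checking a pile of these units, here is a rough Python sketch that reads a captured "System Configuration and Diagnostics Summary" (for example, pasted out of a minicom capture) and reports whether memory came up clean. The function names `parse_summary` and `memory_ok` are made up for illustration, and the parsing assumes the dump looks exactly like the two summaries above; other PROM versions may format things differently.

```python
import re


def parse_summary(text):
    """Split a summary dump into a dict of CONFIG values and a list of DIAG lines."""
    config = {}
    diags = []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and the "**** ... ****" banner lines.
        if not line or line.startswith("****"):
            continue
        if line == "CONFIG:":
            section = "config"
        elif line == "DIAG RESULTS:":
            section = "diag"
        elif section == "config":
            # Lines look like "Mem enabled             = 512 MB".
            m = re.match(r"(.+?)\s*=\s*(\d+)", line)
            if m:
                config[m.group(1).strip()] = int(m.group(2))
        elif section == "diag":
            diags.append(line)
    return config, diags


def memory_ok(text):
    """True when no memory is disabled and the only diag line is the all-passed one."""
    config, diags = parse_summary(text)
    return config.get("Mem disabled", 0) == 0 and diags == ["ALL DIAGS PASSED."]
```

Feed it the text of one summary at a time; `memory_ok` only looks at the "Mem disabled" count and the DIAG RESULTS section, so it will not notice disabled CPUs or nodes.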
