The collected works of 2ndadamdick

It would also matter where your I/O cards are, right ??

The two nodeboards each have a connection to each Xbow (and there are two), and one of them takes ownership of each I/O card. Normally one CPU/heart takes ownership, and all communication and ISRs go through that CPU/heart.

If you look in /var/sysgen/system/irix.sm around line 573 (for IRIX 6.5.28), you'll find directives to control nodeboard ownership and interrupt routing

This way, if you had a lot of heavy I/O on Xbow #1 and had the choice of CPUs with different cache sizes (which would make a largish difference for ISRs that are always getting triggered; having 2MB of cache could make a large difference in some cases) or a more powerful CPU, you could make sure the BEST CPU is servicing the hardware on that Xbow, and that CPU will handle the extra work and interruptions

Line 573: contains the NOINTR directive, which excludes CPUs from servicing interrupts
Line 583: contains DEVICE_ADMIN, which assigns CPU ownership for devices on that Xbow (picking a local CPU would make sense: even if you assign a CPU that's on a Xbow not connected to the I/O port, you still have to go through the owning nodeboard's heart, so the path is I/O->Xbow->owning heart->router->nodeboard xbow->nodeboard heart->CPU compared to Xbow->heart->CPU)
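
To put rough numbers on the two paths just described, here is a minimal Python sketch that only counts hops. The hop lists are copied from the post; treating every hop as equal cost is an assumption for illustration, not a measurement of real latency, and the names `REMOTE_PATH`, `LOCAL_PATH` and `hop_count` are made up for this example.

```python
# Paths a packet takes from the I/O card to the servicing CPU, as listed above.
# Assumption: every hop costs the same; real per-hop latencies will differ.
REMOTE_PATH = ["I/O", "Xbow", "owning heart", "router",
               "nodeboard xbow", "nodeboard heart", "CPU"]
LOCAL_PATH = ["I/O", "Xbow", "heart", "CPU"]


def hop_count(path):
    """Number of links crossed between the first and last element."""
    return len(path) - 1


if __name__ == "__main__":
    print("remote hops:", hop_count(REMOTE_PATH))
    print("local hops:", hop_count(LOCAL_PATH))
```

Under that equal-cost assumption the remote route crosses twice as many links as the local one, which is the argument for pointing a device at a CPU on its own Xbow.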

/var/sysgen/system/numa.sm also contains some NUMA directives, which are handy (esp. on a busy system, or a system low on memory that has an application that isn't NUMA aware). Migration is turned off by default - turning it on could make a large difference for processes that are memory hungry and have CPUs serially access large amounts of memory. In this case, though, oview would show a large amount of traffic through the Xbow/routers. Migration is one of the cooler features of SGI's NUMA hardware; they widely state it's one of the things that make NUMA high performance, yet it's disabled by default - I guess they have reasons, but ..... I suppose it doesn't matter now that memory is so cheap - but you can also crank down the kernel replication so it's not using as much memory on each nodeboard
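
As a rough illustration of why migration can help a process that keeps hitting memory on another node, here is a toy Python model. It is only a sketch under stated assumptions: the `Page` class, `count_remote_accesses` and the `threshold` parameter are hypothetical names for this example, and the rule (move a page once a remote node has referenced it `threshold` more times than its home node) is a simplification, not IRIX's actual algorithm or its numa.sm tunables.

```python
from collections import defaultdict


class Page:
    """A memory page with one reference counter per node (toy model)."""

    def __init__(self, home_node):
        self.home_node = home_node
        self.refs = defaultdict(int)

    def access(self, node, threshold):
        """Record an access; migrate if `node` out-references home by `threshold`.

        Returns True if the page migrated on this access.
        """
        self.refs[node] += 1
        if (node != self.home_node
                and self.refs[node] - self.refs[self.home_node] >= threshold):
            self.home_node = node
            self.refs.clear()  # start counting afresh after the move
            return True
        return False


def count_remote_accesses(trace, home_node, threshold=None):
    """Count remote accesses in `trace` (a list of node ids).

    threshold=None models migration being disabled.
    """
    page = Page(home_node)
    remote = 0
    for node in trace:
        if node != page.home_node:
            remote += 1
        if threshold is not None:
            page.access(node, threshold)
    return remote


if __name__ == "__main__":
    trace = [1] * 100  # node 1 hammers a page that lives on node 0
    print("migration off:", count_remote_accesses(trace, home_node=0))
    print("migration on: ", count_remote_accesses(trace, home_node=0, threshold=4))
```

In this model a page shared evenly between two nodes never migrates, because neither node gets ahead by the threshold; that is the kind of ping-pong a threshold is meant to avoid.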
chervarium wrote:
SAQ wrote: Like I said, it seems to have evened out later.

I think I recall reading that (in addition to the cache bit) SGI NUMA archs try to keep things local to a node to minimize remote memory access, and since I have only one processor per node, that would necessarily mean keeping it to a single processor.

I just wasn't sure - MP SPARCstations recommend the faster processor be the first one, wasn't sure if SGI had gotten away from this (seems like they have).

On the plus side, it's noticeably faster than the old I2 ;)

Just like any other NUMA computer... The bootmaster is nothing but a normal node after the kernel is up. One can certainly "influence" the scheduling of the bottom halves of the interrupt handlers, but to what extent? There are no real interrupts in the XIO environment, there are only packets routed through the fabric.


Well, not in the classic sense: the processor isn't doing a PUSHA/IRET (as on an i386 arch) and everything is still bridged by the fabric (I thought the CPUs each still had one main interrupt on the Origins ???????), but nonetheless the CPU has to stop executing what it's doing, change contexts and start hashing through a different part of memory entirely. Actually, doing a PC-style interrupt with an OS that doesn't do very good protection is likely far less overhead than in IRIX. But either way it must have a very measurable impact on the L1/L2 cache stats. I also thought IRIX (within the Xbow) let you manage both the top and bottom halves of the ISR.... Hmmmm, I'm going to start thrashing through code; I've never looked at the way IRIX handles interrupts, or where (if at all) it's set up for low latency and good RT performance

I mean: (forgive my incorrect / incomplete asm example, it's been a while)

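ISR_IS_HERE: push ax ; save the registers we clobber
push bx
push dx
mov dx, 3F7h ; port numbers above 0FFh must be addressed through DX
in al, dx ; read the byte from the device
mov bx, [offset_of_filo_ring] ; current write offset into the ring
mov [bx], al ; store the byte
inc word ptr [offset_of_filo_ring] ; advance the write offset (no wrap check)
mov al, 20h
out 20h, al ; send EOI to the master PIC
pop dx
pop bx
pop ax
iret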

That's a pretty short amount of work on plain old PeeCees (yes I know, old non-protected mode code; all the assembly I did 15 years ago was on embedded computers and was mostly just doing basic I/O for disks, and some user interface. I used A86/D86, which didn't stress correct syntax either, so ....)

Does anyone have much low-level knowledge of the LINC chip that's on the more complex cards? I keep thinking about it: having a 132MHz R4650 just to bridge the two PCI buses (PPCI, CPCI) seems like it would just add more latency than an FPGA or some custom silicon (well, other custom silicon)

Is the LINC there to translate the very large addresses used in the large SN0/1 systems to something that fits nicely into the PCI spec ???

Or was the LINC more for performance, and more to co-process the DMA (the LINC seems to do a lot of scatter/gather)

Is the LINC code standard across all LINC-bearing cards, or is the LINC code tweaked for the specific card it's on? I suppose it makes a lot more sense if each LINC uses custom code, and is there to allow commodity silicon for I/O to keep the costs low while still co-processing it to keep the impact on the CPUs lower. It just seems that it would still be cheaper and faster to do even the most complex function (DMA scatter/gather) in a (fp)gate array
pan1k wrote: Hey guys! I brought my SGI to work so I wont have to pay for the electric bill ;-) but it won't display any video now. Do you think all 3 monitors are unable to sync? One of them displays the login screen for a sec then goes to standby mode. I tried going in through the console and resetting it, but I can't even get into the diagnostics screen when you first turn the machine on. Any ideas? At home I have a 19" Sony Trinitron...


Can you set up the network where you are (you'll likely have to set up the network settings on a PC to match your SGI since you can't log in, but ...) and then telnet into it and try to find a res that works ???? (or did you drag the same monitors to work as well ??)


I've had a lot of Sun Trinitron monitors that badly misbehaved with SGIs. I thought my first SGI was defective because the Sun monitor (it still had the 13W3 plug though) looked like it should work, but it did all sorts of weird stuff and could never seem to stay locked onto the sync for very long

Are these Sun Trinitron monitors, or real SGI Trinitron monitors ???

Even so, you'll probably be able to telnet in and use setmon to set the resolution to something the monitors will sync to
I remember reading that SGI kind of did a nasty FU license scheme on their R-Bricks

Apparently when you first install an R-Brick it tries to figure out which O3400+ you are and then locks the configuration in the R-Brick, and you can't use it for other systems

I also remember reading about a clear routine to put the R-Bricks back to factory state

Can anybody shed some light on this ??


Thanks, Adam
ShadeOfBlue wrote: According to this document (thanks, Ian!), the SSN (system serial number) located in the NVRAM of the brick's L1 controller has to be set to L0000000 for the brick to obtain a new SSN from the neighbouring bricks.

SGI doc #108-0240-002 wrote: When an R brick needs to be added to a system or moved to a different system, SGI
authorized personnel can clear the SSN that is stored in the L1 controller’s NVRAM.
[...]

To clear the SSN, SGI authorized personnel use a temporary authenticator generation
software program, which is located on a secure Web server that can be accessed via
WebSAFE (use your SGI login and password). Using the brick serial number, this software
generates an authenticator (four alphanumeric fields [total of 18 characters]) that is based
on the brick public key and the current date and time.

The SGI authorized personnel input this authenticator into the L1 controller
validation/serial number change software. The L1 controller software compares the
authenticator to the brick public key. When the authenticator matches the brick public key
and it has not expired, the L1 controller software clears the brick SSN. When the
authenticator does not match the public key or if the authenticator has expired, the L1
controller software does not change the SSN.


However, according to the same document, the L1 controller's NVRAM chip is socketed, so if the SSN is stored in cleartext, it could be rewritten without much trouble, hypothetically speaking.


I remember reading about a back door for clearing it - has anybody heard of this ??

WTF reason in the world would cause SGI to integrate asymmetric crypto keying into the R-Bricks ?????
zarma wrote: And I forgot to say: on our SGI it takes 2 hours to install everything from a clean install, without any particular hardware, when it takes 1 day on Linux with $5000 of special hardware to get the same result :lol:


Yeah, it still only runs on the Nvidia FX4x00/5x00 line of professional and *pricey* video cards. Even though they're sitting on a fair bit of power, and if they chose to they could use Nvidia CUDA to directly use the video hardware's 128 streaming units to co-process video functions - they've decided all rendering will be in software now 'for better portability and compatibility'. Although all it will take is one vendor using the full power of the video system on the FX cards - like Discreet did on SGIs - and Discreet will be embarrassed by the performance and quickly start using the GPU to run code

Sadly Discreet more or less locks down the kernel on their machines, and you can't do anything to customize it without being blacklisted for any support (which is good for the normal Joe Blow but could be a big problem for more complex, bigger and custom installations). The one bright side is they're finally giving up on their 'Stone Tax'. Some of the biggest post houses are still keeping their higher-end SGI computers (O3Ks) to run as CXFS metadata servers. But with CXFS being the only 'killer product' to sell (in that sector), SGI is all but ignoring the sector and isn't even pushing CXFS (which could generate insane sales/money if they cheapened it up and packaged it with their *UGH* Altix machines). Maybe now that they're using Nvidia GPUs they'll get back in the game and start selling Discreet on Altix, which would give some killer performance to high-end users - who so far see no replacement for their large Onyx4/Fire machines in sight. Although Discreet has moved almost all (all ???) of the killer Fire features (like timeline) into Flame 2008

CXFS as a singular product, if marketed properly, could make it into a large number of datacenters and installations with multi-machine, high-bandwidth shared-data needs. There are some solutions that compete - but they force you into buying everything from them (drives, enclosures, computers, cables etc...) and are mostly just for video networks - and not suited to data

SGI could have capitalized on the *HUGE* performance increase the NASA Columbia cluster got from CXFS - but they made like *one* press release and that was it. NASA provided more advertisement for CXFS just with the technical paper they released about the upgrade/speedup. Although it's unlikely IBM would ever start pushing CXFS as an option instead of GPFS but ... who knows ...... I think SGI should reelect Neko as chairman and everybody else should sit on the board via proxy votes !!! Knowing SGI's history, if they were given the resources they could expand CXFS even more (features like, for example, machines being able to access the cache of other machines via FC/RDMA if the data exists there, instead of going to the hard disks) and make it much less 'rough around the edges'

Who else has a SAN file system that competes (in any sector) with CXFS? There's Avid's solution, and EMC - who else (that doesn't license EMC) ????

The transition to PC machines has been a rough road for Discreet - one that Discreet was pushed into taking because the execs at SGI spent 1998-2003 (ish ???? when did Mr B go ??) using R&D money to smoke crack in the ghetto. I don't think you can consider asking the crack dealers if they'd exchange all their 'rock' for an Octane valid market research - and just because they set all the racks from that 128 processor O2k that was exchanged for 'services for their girls' sideways so they can use them as benches doesn't mean the line is doomed and not useful because it's too 'hard' to use - and it's certainly not a reflection on IRIX not being 'user friendly' (yeah, as a 'sofa' it is hard to use, esp. on the ass - they probably left the sofa bit out of the market research though, so they could justify their crack-induced direction of the company when they weren't all messed up on it. As if being too hard to sit on ever stopped a geek from working on a giant computer - I guess the middle Crays were nice that way though)!
Sol wrote: How many zeroes do we stick on the check for this puppy?

:)


Which puppy: CXFS (tens of thousands) or Discreet Flame (tens of thousands .... humpf .....)?

Thank GOD FOR CVD and the rather 1st rate debugging suite !!!!!!
Gray Fox wrote:
pinball_0 wrote:
I bought an SGI 230 (800MHz PIII). Got it, fired it up... booted up to Red 7.1 and had a login screen. Thinking I am NOT going to get in to snoop, thought I'd try logging in as root with no passwd... no success, so tried one more time as root and passwd as root


I think admins make it too easy for passwords. When I was back in school, our admin used the word hammer for the administrator password and for the BIOS password. He didn't even care if people had it really. He never changed it. I knew it from 9th grade till I completed school and it has always been the same. Probably still is the same up to today.


I think almost everybody now is getting better about passwords - in fact the government and educational institutions in Canada go to the other extreme and are often complete fanatics about passwords (even for the ordinary job), often to the point of making the password much too difficult for those not equipped with a Kodak memory (no dictionary words, palindromes, reverse dictionary words, 8 characters or more, no embedded dates, and must contain both letters and numbers) - sometimes even 8-12 digit random passwords - so that the exact opposite is the result and you see little handwritten notes on the back of the machine, under the desk, in the organizer etc ....
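
To show how restrictive that list of rules is in practice, here is a minimal Python sketch of such a checker. It is an illustration under stated assumptions, not any institution's real policy: `DICTIONARY` is a tiny hypothetical word list, "dictionary word" is read as an exact match, and an "embedded date" is assumed to mean a four-digit year from 1900 to 2099.

```python
import re

# Hypothetical tiny word list; a real check would load a full dictionary file.
DICTIONARY = {"hammer", "password", "root", "admin"}


def policy_violations(password, dictionary=DICTIONARY):
    """Return the list of rules from the post that `password` breaks."""
    violations = []
    lowered = password.lower()
    if len(password) < 8:
        violations.append("shorter than 8 characters")
    if not (re.search(r"[A-Za-z]", password) and re.search(r"[0-9]", password)):
        violations.append("must contain both letters and numbers")
    if lowered in dictionary:
        violations.append("dictionary word")
    if lowered[::-1] in dictionary:
        violations.append("reversed dictionary word")
    if len(password) > 1 and lowered == lowered[::-1]:
        violations.append("palindrome")
    # Assumption: an "embedded date" is a 4-digit year between 1900 and 2099.
    if re.search(r"(19|20)[0-9][0-9]", password):
        violations.append("embedded date")
    return violations


if __name__ == "__main__":
    for candidate in ("hammer", "remmah", "abc1991cba", "k7qz2mfx9t"):
        print(candidate, "->", policy_violations(candidate) or "ok")
```

A password like the last one passes every rule, and it is exactly the kind that ends up on a handwritten note under the desk.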

It's a bit different now though - it's almost impossible to put any machine on the internet without it being scanned by robots many times a day looking for weak passwords. I've never had anything intentional, but I work with a lot of small companies in my little home town - and it's very hard to boss around the boss who's used to being master of his domain, and sometimes they'll use a password that's weak just to prove a point (which ends up proving my point). It's normally not the root account that's compromised (I make a unique password for each customer's root access), and the handful of times it's happened it's been just to send spam

The *one* time I ever had root compromised was on one of the first firewalls, a development machine for an accounting network. This machine just had some script-kiddy programs/scripts on it to hack other machines and upload the results - but he did a terrible job cleaning up his tracks. And it turns out that wasn't even a weak password - that was in the early days of Linux meets Internet meets script kiddies, and they got into the system through one of the many buffer overflows that existed back then (it was one of the rpc daemons). Now thankfully privilege separation has become standard - or a standard option with many key packages - and it's much easier to chroot jail them to limit the damage (plus many daemons have traps to try to detect unknown buffer overflow exploits). Plus I was being kind of lazy, not using xinetd to limit many services to local only, or just commenting them out of inetd.conf altogether !
ka0s wrote: Can anyone confirm that the 2.44 version runs 11% faster on a SGI then the 2.45 version.
And does anyone have a clue why this is?


Different versions of GCC (notably later 3.x to 4.0) can cause slowdowns. And of course comparing something compiled with MIPSpro to something compiled with GCC will normally show a fairly large difference
pinball_0 wrote: I have been browsing with it and works well.

I brought up a shell in it and see some unix familiar stuff!!

Again I am AMAZED that it is working this WELL with less than minimum RAM requirements.

I have had Safari seize once or twice with the rotating mouse pointer, but still was able to force quit from the menu and then restarted the app.


TOO COOL !!!

I LOVE MY SGI's AND I AM STARTING TO LOVE THE MAC


Yeah it's kind of amazing what a real-ish, modern-ish and tweaked to the hardware (ish) OS can *actually do* !!!!!!!!

Too often market forces insist the answer is simply faster hardware (willy-nilly - without specifics on how the hardware should be faster) - it doesn't take very long running software that's made to run on current (or even current-X generation) hardware to prove that's only *at best* half the solution

The Wintel world combined with sloppy programmers and programming models (or lack thereof) has held back performance as much as software (Windows XP - or any Windows running on multi CPU/core processors is a perfect example)

Also, with traditional 'real'(ish) computers/OSes the minimum specs were the least you needed to expect every qualified feature or application to work. Now they seem to be a mere starting point where you can expect *some* of the software to work. Before, minimum specs tended to be a 'safe' middle ground for the customer; now they seem to be only the bare minimum - and what you get is simply what you get. What's exceedingly shameless/lazy on the programmers' part is for software to be spec'd with hardware that both the developers and the users can't generally attain for the next few years - and this seems to be the 'status quo' for desktop/"personal computers"

Generally, having the developer spend a little more time & paying for it in the (distributed) price is cheaper than having 'cheap' inefficient code wasting CPU cycles or doing stuff like copying buffers a zillion times because it's the first thing that popped into the programmer's head - and he never gave it a second look

BTW, congrats on a good deal !!!!!!!!!!!! I personally always favoured Sun because of the 'bang for the buck' when comparing used Suns to used/worshipped Macs. But I definitely appreciated the hardware platform, and it seemed to zip along running PPC64 Linux - without ever a crash. However, once they changed to Intel the magic was lost for me ..... !!!!(!!!) (!!!!!!!!!!!!!)

I think I read this online. But in general in the old days:
"People would pay twice as much for Mac's that performed twice as well"
"Now: People just pay twice as much"

Although I can't say I blame them (I mean, they have to _seem_ like they worship their customers and pretend it's a style/class thing - but really they're only after the bottom line. However, customer/fanbase loyalty affects said bottom line, esp. with Apple(TM)). Moving to Wintel/MacOSX with a 'Doze option on Intel must be making them far more money (in the short term _at least_, if they don't piss off too many of their 'fan base', which I think is more accurate than 'customers' for most of the users). Giving their customers *many* OS choices can't hurt either, and finally Intel is *starting* to live up to the promises (albeit they're almost a decade late!) which have killed many an arch (MIPS, Alpha, PPC, at least for desktops/servers that are 'mainstream', and I use that term loosely) and slashing their R&D budgets for bringing current products into their next generation (at least in desktop machines, and it seems also at a slower rate in servers)