Yeah I'm on 6.5.30.
So I've read this, particularly the section "Checking for Excessive Paging and Swapping":
http://techpubs.sgi.com/library/tpl/cgi ... /ch10.html
In particular:
Code:
-p vflt/s
Frequency with which a process accessed a page that was not in memory. Compare this number between times of good and bad performance. If the onset of poor performance is associated with a sharp increase of vflt/s, swap I/O may be a problem even if %vswp is low or 0.
I have written a little test harness to gather par and sar output, and tested it as root with different RAM configurations.
Code:
mapleleaf 8# cat /usr/people/oo/bin/dm1.par_sched
#!/bin/sh
set -e
# http://techpubs.sgi.com/library/tpl/cgi-bin/getdoc.cgi/0650/bks/SGI_Admin/books/IA_ConfigOps/sgi_html/ch10.html
date=`date '+%Y-%m-%d_%H%M'`
if [ -z "$1" -o -z "$2" ]; then
# Missing arguments are an error: print usage to stderr and exit non-zero.
echo "usage: $0 <tag> <duration>s" >&2
exit 1
fi
tag=$1
duration=$2
script=`basename $0`
export TMPDIR=/var/tmp/$USER/$tag/$script/$date
[ ! -d $TMPDIR ] && mkdir -p $TMPDIR
tmpfile=`mktemp -p $TMPDIR XXXXXXXX`
echo "dm1: output to $TMPDIR"
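# Take a system activity sample before the recording.
/usr/lib/sa/sadc 1 1 $tmpfile.sa.out
# Trace dmrecord under par; a failure is reported but does not stop the script,
# so the closing sadc sample and the sar report are still produced.
par -rQQ dmrecord -B auto -t $duration -C -v -2 -p video -p audio $tmpfile.mv > $tmpfile.$script.out || \
{
rc=$?
echo "error: par/dmrecord failed with exit status $rc" >&2
}
# Take a second sample into the same file so sar can report the interval.
/usr/lib/sa/sadc 1 1 $tmpfile.sa.out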
echo "sar report"
echo "=========="
sar -A -f $tmpfile.sa.out
echo "files"
echo "====="
echo $TMPDIR
ls -al $TMPDIR
I am seeing increased vflt/s (page faults where the page is valid but not in memory).
Tested with 256 MB: all good, 171 vflt/s (this is typical of the results over a few runs).
Code:
mapleleaf 11# sar -p -f /var/tmp/root/256/dm1.par_sched/2012-02-07_2112/gWYG2417.sa.out
IRIX mapleleaf 6.5 07202013 IP32 02/07/12
21:12:33  vflt/s dfill/s cache/s pgswp/s pgfil/s  pflt/s  cpyw/s steal/s  rclm/s
21:12:44  171.87   69.49  102.03    0.00    0.09   16.93   11.29   75.13    0.00
Tested with 384 MB (these results are from runs where it failed to record 10 s of video, after several successful attempts):
Code:
mapleleaf 15# sar -p -f /var/tmp/root/384/dm1.par_sched/2012-02-07_2126/wkII1425.sa.out
IRIX mapleleaf 6.5 07202013 IP32 02/07/12
21:26:28  vflt/s dfill/s cache/s pgswp/s pgfil/s  pflt/s  cpyw/s steal/s  rclm/s
21:26:30  933.17  374.52  556.73    0.00    0.48   87.02   62.02  399.52    0.00
mapleleaf 16#
mapleleaf 16# sar -p -f /var/tmp/root/384/dm1.par_sched/2012-02-07_2126/ZlPq1410.sa.out
IRIX mapleleaf 6.5 07202013 IP32 02/07/12
21:26:21  vflt/s dfill/s cache/s pgswp/s pgfil/s  pflt/s  cpyw/s steal/s  rclm/s
21:26:23 1353.85  544.06  806.99    0.00    0.70  128.67   89.51  583.22    0.00
Much higher vflt/s (roughly five to eight times the 256 MB figure), but pgswp/s is zero and pgfil/s is close to zero, and those are the pages retrieved from disk.
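To put a number on that difference, here is a minimal sketch that pulls the vflt/s column out of `sar -p` text output and prints the ratio against the 256 MB baseline. The function names `vflt_of` and `vflt_ratio` are made up for illustration, the parsing assumes the column order shown in the reports above, and it needs an awk that supports `-v` (on IRIX that probably means nawk rather than the old awk, which I have not verified). The sample rows are copied from the reports above.

```shell
#!/bin/sh
# Sketch only: vflt_of and vflt_ratio are illustrative names, not IRIX tools.

# Print column 2 (vflt/s) of every data row in `sar -p` text output.
# A data row starts with an HH:MM:SS timestamp followed by a number;
# the banner line and the column-header row are skipped.
vflt_of() {
    awk '$1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ && $2 ~ /^[0-9.]+$/ { print $2 }' "$@"
}

# Print how many times larger the second rate is than the first.
vflt_ratio() {
    awk -v good="$1" -v bad="$2" 'BEGIN { printf "%.1f\n", bad / good }'
}

# Baseline: the 256 MB report, fed in as text.
good=`printf '%s\n' \
    'IRIX mapleleaf 6.5 07202013 IP32 02/07/12' \
    '21:12:33 vflt/s dfill/s cache/s pgswp/s pgfil/s pflt/s cpyw/s steal/s rclm/s' \
    '21:12:44 171.87 69.49 102.03 0.00 0.09 16.93 11.29 75.13 0.00' | vflt_of`

vflt_ratio "$good" 933.17     # first failing 384 MB run, prints 5.4
vflt_ratio "$good" 1353.85    # second failing 384 MB run, prints 7.9
```

Against a real capture the same function would sit at the end of a pipe, e.g. `sar -p -f <file>.sa.out | vflt_of`.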
So what I have is a "non-disk page fault" condition when I *add* more RAM to the box. It doesn't seem to matter what arrangement of RAM is added (64 MB or 128 MB sticks), just that it goes wrong above 256 MB of RAM.
So something is getting too big for something. It could just be a poorly written dmrecord. It came from 6.3 and the age of 256 MB of RAM. People have talked about it being crappy; perhaps it's failing to localize its resources effectively once RAM gets to the crazy heights of 384 MB and beyond. You can watch RAM deplete the longer you run it, and since it's spooling to disk through ICE it shouldn't really be consuming anything more than a static pool, not consuming RAM indefinitely. I'm going to write my own.