scsimon: Display temperature of SCSI drives

I finally got some hobby time this afternoon and wrote a small program which displays the temperature, uptime and self-test status of SCSI drives. It can also issue self-tests.

I've tested it on only one drive, so it may not work everywhere -- let me know if something breaks :)

Example usage:

Code: Select all

> sudo ./scsimon /dev/scsi/sc0d1l0
Inquiry response:       [SEAGATE ST336752LC      0004]
Current temperature:    55C/131F
Maximum temperature:    65C/149F
Drive uptime (total):   524 minutes (= 0d 8h 44m)
Next internal test in:  120 minutes
Self-test data (newest first):
0: at: 8h      type: bg extended       result: completed OK
1: at: 6h      type: bg extended       result: completed OK
2: at: 5h      type: bg short          result: completed OK
3: at: 5h      type: bg short          result: aborted (SEND DIAG.)
4: at: 0h      type: bg short          result: completed OK
5: at: 0h      type: bg short          result: completed OK
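
The derived figures in this output (Fahrenheit values and the d/h/m uptime breakdown) are simple integer arithmetic. Below is a minimal sketch of how they could be reproduced. The function names are hypothetical, and the truncating conversion is an assumption inferred from readings such as 32C/89F elsewhere in this thread, not taken from scsimon.c.

```python
def celsius_to_fahrenheit(celsius: int) -> int:
    # Assumption: truncating integer math, as C would do for non-negative values.
    return celsius * 9 // 5 + 32


def format_temperature(celsius: int) -> str:
    # Mirrors the "55C/131F" style seen in the sample output.
    return f"{celsius}C/{celsius_to_fahrenheit(celsius)}F"


def format_uptime(minutes: int) -> str:
    # Split total minutes into days, hours and minutes.
    days, rest = divmod(minutes, 24 * 60)
    hours, mins = divmod(rest, 60)
    return f"{minutes} minutes (= {days}d {hours}h {mins}m)"


if __name__ == "__main__":
    print(format_temperature(55))  # 55C/131F
    print(format_uptime(524))      # 524 minutes (= 0d 8h 44m)
```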


Download:
scsimon.gz
gzipped mips3 binary
(8.74 KiB) Downloaded 75 times


Installing:
  • Unpack the program: "gunzip scsimon.gz"
  • Make it executable: "chmod +x scsimon"
  • Place it anywhere you like

You must run this program as root.

Compiled with MIPSpro 7.4.4m: "cc -O2 -mips3 -o scsimon scsimon.c"

Source code (IRIX only):
scsimon.c
source code
(10.32 KiB) Downloaded 69 times


Current version: 1.2

Enjoy :)
Thanks!
I was able to get the temps for hard drives on different controllers without any problem:

Code: Select all

# ./scsimon /dev/scsi/sc0d1l0
Inquiry response: [SEAGATE ST336753LC      0005]
Current temperature: 34C/93F
Maximum temperature: 68C/154F
Self-test data:
0: uptime=0h  code='bg short'  result='completed OK'
1: uptime=0h  code='bg short'  result='completed OK'
# ./scsimon /dev/scsi/sc0d2l0
Inquiry response: [SEAGATE ST336753LC      0005]
Current temperature: 32C/89F
Maximum temperature: 68C/154F
Self-test data:
0: uptime=0h  code='bg short'  result='completed OK'
1: uptime=0h  code='bg short'  result='completed OK'
# ./scsimon /dev/scsi/sc7d2l0
Inquiry response: [SEAGATE ST373405LC      2203]
Current temperature: 32C/89F
Maximum temperature: 65C/149F
Self-test data:
# ./scsimon /dev/scsi/sc11d2l0
Inquiry response: [SGI     ST373307LC      2743]
Current temperature: 26C/78F
Maximum temperature: 68C/154F
Self-test data:
0: uptime=1h  code='bg short'  result='completed OK'
1: uptime=1h  code='bg short'  result='completed OK'
The drive on controller 7 didn't return any self-test data, though.
***********************************************************************
Welcome to ARMLand - 0/0x0d00
running...(sherwood-root 0607201829)
* InfiniteReality/Reality Software, IRIX 6.5 Release *
***********************************************************************

Code: Select all

Inquiry response: [SEAGATE ST373454LC      D404]
Current temperature: 41C/105F
Maximum temperature: 68C/154F
Self-test data:


No self-test data with this hardware, apparently...
:Octane2: Dual R14K@600MHz, 2GB RAM, V12, 1x72GB HDD
:O2: R10K@175MHz, 512MB RAM, 1x72GB HDD
:Cube: 68040@33MHz, 128MB RAM, NeXTdimension 32MB, 2x 4.3GB HDD

...And lots of other UNIX-like systems for which there is no icon.
Thanks for testing it :)

I see my 2nd generation 15k drive runs much hotter than the newer-gen ones.

As for the missing self-test data, it was either cleared or the drive was never instructed to perform a self-test. There is a command which tells it to do that; I'll see if I can implement it.
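
For anyone curious why the list can come back empty: in the SCSI self-test results log page (page code 0x10), unused entries are zero-filled, so a drive that never ran a test has nothing to show. The following is a rough sketch of decoding that page from a raw buffer, assuming the standard 20-byte entry layout from the SPC specification. It is not taken from scsimon.c; the function name is hypothetical, and the label strings reuse scsimon's wording where the thread shows it and are invented otherwise.

```python
import struct

# Self-test code (bits 7-5 of entry byte 4); labels partly assumed.
SELF_TEST_TYPES = {0: "default", 1: "bg short", 2: "bg extended",
                   4: "abort", 5: "fg short", 6: "fg extended"}
# Self-test result (bits 3-0 of entry byte 4); labels partly assumed.
SELF_TEST_RESULTS = {0: "completed OK", 1: "aborted (SEND DIAG.)",
                     2: "aborted", 3: "unknown error",
                     4: "failed (segment unknown)", 5: "failed (first segment)",
                     6: "failed (second segment)", 7: "failed (other segment)",
                     15: "in progress"}
ENTRY_SIZE = 20  # 4-byte parameter header + 16 bytes of data


def parse_self_test_log(page: bytes):
    """Return (power-on hours, type, result) for each used log entry."""
    if page[0] & 0x3F != 0x10:
        raise ValueError("not a self-test results log page")
    (page_length,) = struct.unpack_from(">H", page, 2)
    end = min(len(page), 4 + page_length)
    entries = []
    for offset in range(4, end - ENTRY_SIZE + 1, ENTRY_SIZE):
        entry = page[offset:offset + ENTRY_SIZE]
        if not any(entry[4:]):
            continue  # zero-filled entry: never used or cleared
        test_type = entry[4] >> 5
        result = entry[4] & 0x0F
        (hours,) = struct.unpack_from(">H", entry, 6)
        entries.append((hours,
                        SELF_TEST_TYPES.get(test_type, "reserved"),
                        SELF_TEST_RESULTS.get(result, "reserved")))
    return entries
```

A page whose entries are all zero-filled yields an empty list, which matches the bare "Self-test data:" line in the reports above.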

Seagate drives also store the total number of hours they've been powered up; the next version of the program will be able to read that as well :)
@ShadeOfBlue: Thanks!!

Code: Select all

Octane2 8# ./scsimon /dev/scsi/sc0d1l0
Inquiry response: [IBM     DDYS-T36950M    SC4D]
Current temperature: 39C/102F
Maximum temperature: 85C/185F
Self-test data:
Octane2 9# ./scsimon /dev/scsi/sc0d2l0
Inquiry response: [SEAGATE SX1181677LCV    C00C]
Current temperature: 35C/95F
Maximum temperature: 65C/149F
Self-test data:
0: uptime=2h  code='bg short'  result='completed OK'
1: uptime=1h  code='bg short'  result='completed OK'
Octane2 10# ./scsimon /dev/scsi/sc0d3l0
Inquiry response: [SGI     ST318404LC      3126]
ERROR: Device does not support temperature readout.
Self-test data:
0: uptime=2h  code='bg short'  result='completed OK'
1: uptime=1h  code='bg short'  result='completed OK'
:Octane2: 2xR12000 400MHz, 4GB RAM, V12
SGI - the legend will never die!!
Axatax wrote:

Code: Select all

Inquiry response: [SEAGATE ST373454LC      D404]
Current temperature: 41C/105F
Maximum temperature: 68C/154F
Self-test data:


No self-test data with this hardware, apparently...

Almost the same disk (72GB Seagate 15K.3), but this one works for me:

O300 :

Code: Select all

# ./scsimon /dev/scsi/sc0d1l0
Inquiry response: [SEAGATE ST373453LC      9507]
Current temperature: 39C/102F
Maximum temperature: 68C/154F
Self-test data:
0: uptime=7h  code='bg extended'  result='completed OK'


Octane2:

Code: Select all

# ./scsimon /dev/scsi/sc0d1l0
Inquiry response: [SEAGATE ST373453LC      9507]
Current temperature: 44C/111F
Maximum temperature: 68C/154F
Self-test data:
0: uptime=7h  code='bg extended'  result='completed OK'

Looks like the O300 does a better job of cooling, even though it has had a "silent fan mod".
To accentuate the special identity of the IRIS 4D/70, Silicon Graphics' designers selected a new color palette. The machine's coating blends dark grey, raspberry and beige colors into a pleasing harmony. ( IRIS 4D/70 Superworkstation Technical Report )
ShadeOfBlue wrote: I finally got some hobby time this afternoon and wrote a tiny program which displays the temperature and self-test status of SCSI drives.


Many thanks.

This output is from my O2:

Code: Select all

# ./scsimon /dev/scsi/sc0d2l0
Inquiry response: [SEAGATE ST318406LC      0108]
ERROR: Device does not support temperature readout.
Self-test data:
0: uptime=1h  code='bg short'  result='completed OK'
1: uptime=1h  code='bg short'  result='completed OK'


An 18 GB Seagate drive. Is it too old? :-(
Update: the new version also displays drive uptime and can trigger self-tests. Use the -t option, e.g. "-t short" for a short self-test; the short notation "-ts" works as well.
It can be downloaded from the original post.

Only background self tests ("short" and "extended") plus "abort" are supported.
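
On the wire, these three actions map onto the self-test code field of the SCSI SEND DIAGNOSTIC command. Here is a minimal sketch of building that 6-byte CDB, assuming the field layout from the SPC specification. The function name and the "extended" and "abort" keywords are hypothetical labels for illustration, and this is not how scsimon.c is known to do it.

```python
SEND_DIAGNOSTIC = 0x1D  # SCSI operation code

# Self-test code values defined by SPC for the supported actions.
SELF_TEST_CODES = {
    "short": 0b001,     # background short self-test
    "extended": 0b010,  # background extended self-test
    "abort": 0b100,     # abort a running background self-test
}


def build_self_test_cdb(kind: str) -> bytes:
    """Build a SEND DIAGNOSTIC CDB for a background self-test action."""
    code = SELF_TEST_CODES[kind]  # KeyError for unsupported kinds
    # Byte 1 carries the self-test code in bits 7-5; the PF, SelfTest,
    # DevOffL and UnitOffL bits stay 0. Bytes 3-4 (parameter list
    # length) are 0 because no parameter data is sent.
    return bytes([SEND_DIAGNOSTIC, code << 5, 0, 0, 0, 0])


if __name__ == "__main__":
    print(build_self_test_cdb("short").hex())  # 1d2000000000
```

Foreground tests use different codes (0b101 and 0b110) and are left out here, matching the restriction described above.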

The temperature readings are currently sourced from the drive's temperature log page, but it appears not all drives have that. There's a second "Informational Exceptions" log page, which also has temperature data, but the program doesn't look at that right now :)
I'll get to it tomorrow.
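
To illustrate the two sources mentioned here, the sketch below decodes a temperature from either log page, given the raw bytes returned by LOG SENSE. It assumes the SPC layouts: Temperature page 0x0D with parameters 0x0000 (current) and 0x0001 (reference, presumably what scsimon prints as "Maximum temperature"), and Informational Exceptions page 0x2F with the most recent reading in the third data byte of parameter 0x0000. All function names are hypothetical, and treating 0xFF as "not available" is an assumption applied to both pages.

```python
import struct


def _parameters(page: bytes):
    """Yield (parameter code, data bytes) for each log parameter."""
    (page_length,) = struct.unpack_from(">H", page, 2)
    end = min(len(page), 4 + page_length)
    offset = 4
    while offset + 4 <= end:
        code, _control, length = struct.unpack_from(">HBB", page, offset)
        yield code, page[offset + 4:offset + 4 + length]
        offset += 4 + length


def _valid(value: int):
    # Assumption: 0xFF means the drive cannot report a temperature.
    return None if value == 0xFF else value


def from_temperature_page(page: bytes):
    """Return (current, reference) in Celsius from log page 0x0D."""
    current = reference = None
    for code, data in _parameters(page):
        if len(data) < 2:
            continue
        if code == 0x0000:
            current = _valid(data[1])
        elif code == 0x0001:
            reference = _valid(data[1])
    return current, reference


def from_informational_exceptions_page(page: bytes):
    """Return the most recent temperature from log page 0x2F, or None."""
    for code, data in _parameters(page):
        if code == 0x0000 and len(data) >= 3:
            return _valid(data[2])  # data[0] and data[1] are ASC and ASCQ
    return None
```

A caller would try the Temperature page first and fall back to the Informational Exceptions page when the drive rejects it or returns no current value; the fallback has no reference temperature to report.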
Thanks!

Code: Select all

Pollux 22# ./scsimon /dev/scsi/sc0d1l0
Inquiry response:       [FUJITSU MAW3147NC       0104]
Current temperature:    34C/93F
Maximum temperature:    65C/149F
Self-test data (newest first):
0: uptime=1491h  code='bg short'  result='completed OK'
ah, very handy. much thanks! :D
r-a-c.de
Rats, SATA is a no-show

Code: Select all

fool 12# ./scsimon /hw/scsi/sc2d0l0
Inquiry response:       [ATA     Hitachi HDP72505A5CA]
ERROR: Device does not support temperature readout.
Self-test data (newest first):
0: uptime=169h  code='default'  result='completed OK'
1: uptime=169h  code='default'  result='completed OK'
2: uptime=168h  code='default'  result='completed OK'
3: uptime=168h  code='default'  result='completed OK'
4: uptime=168h  code='default'  result='completed OK'

Works nifty on the scsi drives tho. Thank you.
ShadeOfBlue wrote: Update: the new version also displays drive uptime and can trigger self-tests. Use the -t option, e.g. "-t short" for a short self-test; the short notation "-ts" works as well.
It can be downloaded from the original post.
Here are the results of a test run using the updated version with the extended test option:

Code: Select all

# ./scsimon -te /dev/scsi/sc0d2l0
Performing self-test ...
Inquiry response:       [SEAGATE ST336753LC      0005]
Current temperature:    31C/91F
Maximum temperature:    68C/154F
Drive uptime (total):   1477790 minutes (= 1026d 5h 50m)
Next internal test in:  46 minutes
Self-test data (newest first):
0: uptime=0h  code='bg extended'  result='in progress'
1: uptime=0h  code='bg short'  result='completed OK'
2: uptime=0h  code='bg short'  result='completed OK'
And after the extended test had completed:

Code: Select all

# ./scsimon /dev/scsi/sc0d2l0
Inquiry response:       [SEAGATE ST336753LC      0005]
Current temperature:    33C/95F
Maximum temperature:    68C/154F
Drive uptime (total):   1477803 minutes (= 1026d 6h 3m)
Next internal test in:  34 minutes
Self-test data (newest first):
0: uptime=24630h  code='bg extended'  result='completed OK'
1: uptime=0h  code='bg short'  result='completed OK'
2: uptime=0h  code='bg short'  result='completed OK'
Update: A new version is available (see original post).

I've added the second method of getting drive temperatures, so everyone who got an error regarding that before, please try again now.

I've also polished the self-test output a bit, added some option descriptions (run scsimon without arguments to see them) and added a new '-v' option, which displays the program's version and exits.
Works with SATA ! Woo Hoo !!

Code: Select all

fewel 4# ./scsimon /hw/scsi/sc2d0l0
Inquiry response:       [ATA     Hitachi HDP72505A5CA]
Current temperature:    35C/95F
Self-test data (newest first):
0: at: 169h    type: default           result: completed OK
1: at: 169h    type: default           result: completed OK
2: at: 168h    type: default           result: completed OK
3: at: 168h    type: default           result: completed OK
4: at: 168h    type: default           result: completed OK


Grazie grazie, yi wan grazie !
The newer version works on my O2 with old disks. Only the reported drive uptime is implausibly large.

Code: Select all

./scsimon /dev/scsi/sc0d1l0
Inquiry response:   [SGI     IBM DNES-309170YSA30]
Current temperature:   39C/102F
Drive uptime (total):   1684282112 minutes (= 1169640d 8h 32m)
Next internal test in:   1684275712 minutes
ERROR: Device does not support self-test readout.
:O2: R7000/600 576MB Ram CDRW 18+9Gb HDD
http://www.tomosgi.co.cc