IPMI - Control of Your Hardware

See http://openipmi.sourceforge.net/

IPMI (in practice, the ipmitool command) is a way for software to ask the hardware what is going on. What are the temperatures of things? Has some hardware failure been detected (or recovered from)? In short, on higher-end server machines (e.g. Dell PowerEdge), IPMI lets me write scripts that tell me whether I should be paying attention to something. This is especially important when my remote cluster (see CSG Cluster) is miles away from my desk. What follows is a sanitized description (nothing important deleted, I hope) of what I've figured out about Dell PowerEdge machines, Remote Access Controllers and IPMI.

  • Configure your DHCP server to hand out the IPMI address. I use 192.168.12.X for IPMI when eth0 is 192.168.2.X.

  • Plug the Ethernet cable into eth0 (which may be NIC1 or NIC2). Here's what I've seen:
    • PowerEdge 1850 - Use NIC1 (right hand NIC)
    • PowerEdge 1425 - Use NIC1 (top NIC)
    • PowerEdge 1950 - Use NIC2 (right hand NIC)
    • PowerEdge 2950 - Use NIC2 (right hand NIC)

  • At boot, press Ctrl-E while the NIC is initializing and set the following:
      IPMI over LAN                  On
      NIC Selection                  Failover
        This is not available on the 1850 or 1425
        Use Dedicated if a Remote Access Controller is present
      LAN parameters
        (how IP is set)              DHCP
          Use Static for the PE 1850. Apparently it cannot reliably use my DHCP server
          Use Static if a Remote Access Controller is present
        LAN Enable                   off
      LAN user configuration
        account                      SOMEUSER
        password                     PW4SOMEUSER
    
  • When a Remote Access Controller is present, the IPMI account you set in the BIOS is also the userid/password for the RAC web interface. Using a modern browser (e.g. Firefox), visit the IPMI address you assigned. You'll get a prompt to accept a certificate and then a prompt for a userid and password. Use the values you set above.

  • At least on some machines, you will find that you cannot add additional userids/passwords for IPMI or the web interface this way. Dell's OpenManage can do this stuff too (and better, I expect). Unfortunately, I've found that OM drags in too much cruft to actually work on my systems. You CAN define additional IPMI users with ipmitool like this:
      export IPMI_PASSWORD=PW4SOMEUSER
      ipmitool -I lan -U SOMEUSER -E -H IPMIIP user list
        Shows all the users that are defined. Note the ID field
        as this number is how you reference userids in other commands.
      ipmitool -I lan -U SOMEUSER -E -H IPMIIP user set name ID NEWUSER
      ipmitool -I lan -U SOMEUSER -E -H IPMIIP user set password ID PW4NEWUSER
      ipmitool -I lan -U SOMEUSER -E -H IPMIIP user enable ID
      ipmitool -I lan -U SOMEUSER -E -H IPMIIP channel setaccess 1 ID privilege=4
     
     See 'man ipmitool' for this syntax.
    
    I've not yet figured out how to set additional RAC users with ipmitool. Privilege 4 grants the IPMI LAN (administrator) privilege, but not RAC privilege.
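
    The user-creation steps above can be wrapped in one shell function. This is only a sketch: add_ipmi_user and IPMI_CMD are hypothetical names of my own, not part of ipmitool, and LAN channel 1 and privilege 4 are simply carried over from the commands above. It uses `user enable ID`, the form documented in the ipmitool man page. Setting IPMI_CMD="echo ipmitool" gives a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch: create an IPMI LAN user with the ipmitool calls listed above.
# IPMI_CMD and add_ipmi_user are hypothetical names, not part of ipmitool.
# Set IPMI_CMD="echo ipmitool" for a dry run that only prints the commands.
IPMI_CMD=${IPMI_CMD:-ipmitool}

# usage: add_ipmi_user IPMIIP ADMINUSER ID NEWUSER NEWPASSWORD
# The admin password is taken from IPMI_PASSWORD because of -E.
# Assumes none of the arguments contain whitespace.
add_ipmi_user() {
    host=$1 admin=$2 id=$3 newuser=$4 newpw=$5
    base="$IPMI_CMD -I lan -U $admin -E -H $host"
    # Stop at the first step that fails.
    $base user set name "$id" "$newuser" &&
    $base user set password "$id" "$newpw" &&
    $base user enable "$id" &&
    $base channel setaccess 1 "$id" privilege=4
}

# Example (ID 3 is an assumption; pick a free ID from "user list"):
#   export IPMI_PASSWORD=PW4SOMEUSER
#   add_ipmi_user IPMIIP SOMEUSER 3 NEWUSER PW4NEWUSER
```

    Run `user list` afterwards to confirm that the new entry shows up under the ID you chose.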

  • The simplest ipmitool command just returns some basic status like this:
      ipmitool -I lan -U SOMEUSER -a -H IPMIIP chassis status
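
    For scripts it helps to pull a single field out of that status. A minimal sketch, assuming the output contains a line of the form "System Power : on" (which is what I see from ipmitool); power_state is a hypothetical helper name, not an ipmitool feature. Note that -a prompts for the password, so a script would use -E with IPMI_PASSWORD instead.

```shell
#!/bin/sh
# power_state is a hypothetical helper, not part of ipmitool.
# It reads "chassis status" output on stdin and prints the value of the
# "System Power" line (assumed format: "System Power         : on").
power_state() {
    awk -F: '/^System Power/ { gsub(/[ \t]+/, "", $2); print $2 }'
}

# Example use (needs a reachable machine and IPMI_PASSWORD in the environment):
#   ipmitool -I lan -U SOMEUSER -E -H IPMIIP chassis status | power_state
```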
    
  • Now that ipmitool can ask your machine for data, you can write a little shell script and run it daily from a crontab entry. The heart of the script should do something like this for each machine:
      export IPMI_PASSWORD=PW4SOMEUSER
      ipmitool -I lan -U SOMEUSER -E -H IPMIIP sel list
    
    This command shows you the events that have been logged. Some are not very interesting (e.g. the cover has been opened). Others only need watching, e.g. the occasional parity error that has been recovered from. A little script magic can filter out what you want (or don't) and mail it to you. Once you've decided the logged entries are not interesting, you can clear the SEL with:
      export IPMI_PASSWORD=PW4SOMEUSER
      ipmitool -I lan -U SOMEUSER -E -H IPMIIP sel clear
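
    The daily check described above could be sketched roughly as follows. Everything site-specific here is an assumption of mine: the HOSTS list, the BORING pattern (I'm guessing that cover-open events mention "Chassis intrusion"; check your own sel list output), the MAILTO address, the function names, and the availability of a mail command that accepts -s. IPMI_PASSWORD must already be exported, as in the examples above.

```shell
#!/bin/sh
# Sketch of a daily SEL check. HOSTS, BORING, MAILTO, IPMI_USER and the
# function names are hypothetical; adjust them to your site.
HOSTS=${HOSTS:-"192.168.12.11 192.168.12.12"}
BORING=${BORING:-"Chassis intrusion"}   # extended regex of events to ignore
MAILTO=${MAILTO:-root}
IPMI_USER=${IPMI_USER:-SOMEUSER}

# Drop SEL lines that match the BORING pattern (case-insensitive).
filter_sel() {
    grep -E -i -v "$BORING" || true   # grep exits 1 when nothing is left
}

# Print a "== host ==" header plus the remaining events for each host
# that has any. Errors from ipmitool are kept so they get mailed too.
sel_report() {
    for host in $HOSTS; do
        events=$(ipmitool -I lan -U "$IPMI_USER" -E -H "$host" sel list 2>&1 | filter_sel)
        [ -n "$events" ] && printf '== %s ==\n%s\n' "$host" "$events"
    done
    return 0
}

# Send mail only when there is something to read.
mail_report() {
    report=$(sel_report)
    [ -z "$report" ] || printf '%s\n' "$report" | mail -s "IPMI SEL report" "$MAILTO"
}

# When installed as a cron script, end the file with:
#   mail_report
```

    A crontab line such as `30 6 * * * /usr/local/sbin/sel-report.sh` (the path is hypothetical) would then run it once a day. I would keep `sel clear` out of the script and run it by hand after reading the report.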