|
Web Interface for CollectD

CollectD has lots going for it IMHO, especially compared to Ganglia (see my appeal for help on Ganglia). It doesn't cost much to run. It's very easy to set up. On the other hand, you cannot mix statistics from 32 and 64 bit versions of Linux.

The web interface is dismal, to say the least - it's a mess to set up, does almost zero error checking, and leaves you to guess what needs to be installed and where things must live. I spent the better part of a day once getting the web interface to come up, and then it was nothing like what I wanted. That was a couple of years ago, and I had hoped someone would create a web interface that came closer to what I wanted. No joy - so I've done enough for me. Maybe what I've done will be close to what you want - or at least a decent start.

Here's the story. I manage a set of clusters for the CSG. I don't really want collectd to do much more than give me trends of what is happening. I want to look at a week's or a month's data and see "how full" the cluster is. I want to see if the network is too busy or if the nodes are paging too much. I don't want paper reports to pore over - just the big picture. Collectd data is just what I want. Viewing the data is another issue.

CollectD Setup

In our MOSIX cluster we pretend there are two types of nodes - a client and a gateway. Gateways are the nodes users log in to and initiate tasks to run on the cluster. The gateways send the processes to the clients, where they do most of their execution. For our very CPU intensive world, this model works remarkably well. Here are the collectd config files we use:
The server is a machine that is not in the cluster, and its sole task is to collect data from the other machines. It also collects a modest amount of data on itself. I have more than one server - at least one for each architecture (32 or 64 bit Intel machines). The client nodes collect just the basics. The gateway nodes collect the most data. Each sends its collectd data to a server which runs the same architecture. You may well want substantially more than my gateway collects, but adding that is easy.

Making Images

The web site which actually displays my data is not part of the cluster, so one problem I have is getting my collectd data where it can be displayed. At one point I was regularly copying the data from the server to the web machine. This worked until I added my first 64 bit machine, and then the web server could not generate graphs from data written on a different machine type. I solved this by generating the graphs on the server machine with a little crontab job like the following. Later another crontab script copies the PNG files to the web server. This means there's more copying than is strictly necessary, but at least it works.
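The job boils down to "run the image generator once per time period, on the machine that holds the data". Here is a minimal sketch of such a driver, written in Python rather than as the shell crontab job used on this site. The directory names, the list of periods and, above all, the arguments handed to makeimages.php are assumptions made up for illustration; check the real script before relying on any of them.

```python
import shlex
import subprocess

# Hypothetical locations, not the paths used on the real server.
SCRIPT_DIR = "/usr/local/collectd-web"
IMAGE_DIR = "/usr/local/collectd-web/images"
# Periods to graph; week and month are the views discussed in the text.
PERIODS = ("week", "month")


def build_commands(script_dir=SCRIPT_DIR, image_dir=IMAGE_DIR, periods=PERIODS):
    """Return one command (as an argument list) per time period.

    The argument order "period, output directory" is an assumption;
    makeimages.php may expect something different.
    """
    return [
        ["php", f"{script_dir}/makeimages.php", period, image_dir]
        for period in periods
    ]


def run_all(commands, dry_run=True):
    """Print each command and execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the job at the first failing step,
            # so cron can mail the error instead of hiding it.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default: show what would be executed.
    run_all(build_commands())
```

A crontab entry would then call this driver at whatever interval suits you, with a second entry later on to copy the PNG files to the web server. Keeping the dry run as the default makes it easy to see what a change will do before cron runs it for real.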
The script shown above relies on having the PHP command line interface installed, so you can run PHP scripts. The makeimages.php script is a highly modified version of the script provided by the collectd folks. It does piles of checking for missing things, and when things fail, it generates long, elaborate error messages so you have a chance to actually figure out what the heck is going on. You'll need it.

In the summer of 2008 I added another step invoking clusterload.pl, a Perl script I wrote, to come up with a pseudo-load for the entire cluster. Two numbers are plotted - one for the average load per node and one showing the total load (the sum of the loads on all nodes). The script above generates two cluster load images (e.g. MACC.total.week.png) for each time period. These images are then referenced in the HTML and PHP scripts used to display information about the cluster.

I'm not going to explain all the details of the scripts - that's left as an exercise for the reader. You should modify these scripts for your own site, but at least you won't have to work nearly as hard at it as I did the first time.

Most of the files you need are here. These are renamed as *.txt files so you can browse through them and save the files. You'll need to rename the .txt files as php files etc. All the files should be in one directory (see the shell script above) where they can be invoked. Run your version of this script to generate the PNG images. After it completes, you may want to run another script of your own to copy the newly created images somewhere your web server can display them. You'll have to construct a few extra scripts to drive this, as well as set up the proper crontab entries.

Sorry, I don't know the dependencies any more. If you try this, let me know the dependencies you find and I'll come back and document them here.

Showing the CollectD Data

Now you have a pile of images in your web space. Displaying these is straightforward.
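If you want a quick starting point before building real pages, here is a minimal sketch in Python that writes an index.html showing every PNG in a directory. It is not the PHP used on this site; the function names, the page title and the one-image-per-heading layout are all assumptions chosen for illustration.

```python
import html
from pathlib import Path


def build_index(image_names, title="Cluster statistics"):
    """Return an HTML page showing each PNG under a heading with its name."""
    parts = [
        "<!DOCTYPE html>",
        f"<html><head><title>{html.escape(title)}</title></head><body>",
        f"<h1>{html.escape(title)}</h1>",
    ]
    for name in sorted(image_names):
        # Escaping is enough for simple file names such as
        # MACC.total.week.png; names with spaces would also need URL quoting.
        safe = html.escape(name, quote=True)
        parts.append(f"<h2>{safe}</h2>")
        parts.append(f'<p><img src="{safe}" alt="{safe}"></p>')
    parts.append("</body></html>")
    return "\n".join(parts) + "\n"


def write_index(image_dir):
    """Write index.html next to the PNG files found in image_dir."""
    image_dir = Path(image_dir)
    names = [p.name for p in image_dir.glob("*.png")]
    target = image_dir / "index.html"
    target.write_text(build_index(names), encoding="utf-8")
    return target
```

Calling write_index on the directory your copy job fills gives you one page with every graph on it. From there it is a small step to group the images by time period or by node.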
Check out http://csg.sph.umich.edu/docs/cluster/stats/ for how this all looks at my site. All the files to create the images are here. Feel free to swipe them and change them for your needs. Of course, feel free to crib the HTML files you see so you can build your own HTML files from collectd images. If anyone is interested, I'd be glad to share the PHP files that generate the pages you see above; just ask. If you just can't get to the files, maybe this zip file will work.

Good luck, and if you come up with something you like better for your cluster or your server machines, I'd be pleased to hear about it. |