#  us:       CITI - Center for Information Technology Integration, University of Michigan
#
#  contact:  links at http://www.citi.umich.edu/projects/ntap/
#
#  hello:    this file is intended to guide someone through the process of
#            configuring a PMP after it's had the CITI PMP RPM installed
#            on it.  after these post-install tweaks, the PMP should be ready
#            to plop onto the network and get down to business.
#
#  note:     if you haven't just installed the CITI PMP RPM onto a machine,
#            some/most of these steps may not be possible.

__________________________________________________________________
  Overview
__________________________________________________________________

  This file assumes that you have done an RPM install of the PMP
  software; however, the steps involved are applicable to any PMP setup.

  This file is loosely chunked into GARA-related setup, Globus-related
  setup, Walden-related setup, and other-stuff, with "mandatory" and
  "optional" steps for each.

  What's GARA?
     It provides us with the ability to securely schedule remote
     programs to run at a certain time while performing fine-grained
     authorization checks.

  What's Globus?
     Globus is a (very large) grid-computing framework that includes a
     per-resource (i.e., PMP) sturdy authentication daemon, as well as
     libraries that GARA is built on top of.

  What's Walden?
     The plain-vanilla Globus authentication mechanisms don't scale at
     all (which is a big problem in a grid environment); also, the
     original authorization setup in GARA didn't scale either. Globus
     delegates its authentication decisions to Walden (which scales
     beautifully, significantly reducing administrative headaches).
     Walden also uses XACML, a policy specification language that is
     significantly more expressive and easier to use than the older
     KeyNote engine. GARA will soon also leverage Walden for its
     authorization duties, creating a clean parallel between the Globus
     and GARA components.

  What's NDT/web100?
     The web100 project consists of a modified Linux kernel that logs
     statistics from the TCP/IP stack, a userland library that exposes
     the data, various utilities to view the data, a server program
     that runs a fixed type of performance test, a couple of remote
     clients, and a minimal webserver that enables clients to run a
     test through a web browser. At CITI, we've modified the web100
     server and its Java-based web client to integrate it as a
     "first-mile" test in the NTAP framework, and also fixed an
     exploitable security hole in the web server.

  Once you've finished going through this file, you can run our PMP
  verifier script, which ferrets out many common configuration
  problems. The script can be run at any time:

        % sudo /usr/local/ntap2/pmp/bin/ntap-postinstall-verify.sh

  Also, don't forget the online FAQ -- it's linked off of the project
  homepage:  http://www.citi.umich.edu/projects/ntap

  Good luck!
_________________________________________________________________________________________

__________________________________________________________________
  [Post-Install] PMP prerequisites (OK to install the RPM first)
__________________________________________________________________

  1. Kerberos
     Download a recent version of Kerberos. A binary version that
     requires only unpacking is available at:

        http://web.mit.edu/Kerberos/www/dist/krb5/1.3/krb5-1.3.2-i686-pc-linux-gnu.tar

     Unpack the tarball -- it will extract to a .asc and a .tar.gz
     file. The .tar.gz is the actual Kerberos installation. To unpack
     it, use the "-C /" flag, like this:

        % sudo tar xvzf theKerberosStuff.tar.gz -C /

     It should install `kinit' and whatnot into /usr/local/bin/.

     Now, set up /etc/krb.conf and /etc/krb5.conf as appropriate for
     your domain. Make sure that you are able to `kinit' successfully.

  2. OpenSSL
     In order to run the post-install verifier correctly, make sure
     that /usr/bin/openssl is present (it may well be by default).
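
  A quick way to confirm both prerequisites is a small shell check like
  the sketch below. The helper names check_prereq and check_all_prereqs
  are made up for illustration (they are not part of the PMP RPM); the
  two paths are the ones named in the steps above.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with the PMP RPM): report whether a
# prerequisite binary is present at the path this file expects.

check_prereq() {
    # $1 = label to print, $2 = path that should be an executable file
    if [ -f "$2" ] && [ -x "$2" ]; then
        echo "ok: $1 ($2)"
    else
        echo "MISSING: $1 ($2)"
        return 1
    fi
}

check_all_prereqs() {
    # Paths taken from the Kerberos and OpenSSL steps above.
    status=0
    check_prereq kinit   /usr/local/bin/kinit || status=1
    check_prereq openssl /usr/bin/openssl     || status=1
    return $status
}
```

  Calling check_all_prereqs prints one line per prerequisite and returns
  non-zero if either is missing. It only checks that the binaries
  exist; you still need to confirm by hand that `kinit' succeeds
  against your realm.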
_________________________________________________________________________________________
_________________________________________________________________________________________

__________________________________________________________________
  [Post-Install] Mandatory GARA configuration
__________________________________________________________________

  1. /etc/ld.so.conf
     You will need to add the following line to this file if it is not
     already present:

        /usr/local/openssl-0.9.7a/demos/engines/rsaref

     This makes visible a library used by GARA (although maybe only
     when you're actually building GARA, which you're probably not
     doing). After adding this line, you need to rerun "ldconfig".

  2. Authorization in GARA
     NOTE: this step is chiefly involved with the mandatory Walden
           step #4 below.

     Authorization for the PMPs largely occurs at the gatekeeper level.
     Currently, the default authorization checks done by GARA are based
     on AFS PTS group memberships. The program

        /usr/local/gara-1.2.2/resource_manager/programs/mod_pts

     is used to do the lookups. However, to ease debugging and simplify
     some configurations, it is possible to use a flatfile for the
     group membership check instead of actually contacting an AFS PTS
     server.

     Switching between the "real mod_pts" and the "flatfile mod_pts" is
     accomplished by recompiling the binary. Two teeny scripts that
     handle that are in the directory:

        /usr/local/gara-1.2.2/resource_manager/programs/

     .. and are called mod_pts.compile.afs and mod_pts.compile.noafs.
     The default mod_pts binary uses the _flatfile_ approach.

     The flatfile approach actually uses two files: one for the fake
     group names (mod_pts.conf) and one for the fake group memberships
     (mod_pts.acl). The formats are simple; just look at the sample
     CITI setup.

     The actual authorization policies here are implemented with
     KeyNote. Contact CITI for more information.

     One last thing: authorization in GARA can also be done with
     PERMIS, which uses an LDAP directory (not to be confused with the
     upcoming authentication mechanism with the Globus grid-mapfile in
     LDAP) to store its policies. This is more of a proof-of-concept,
     though, as signed policy certificates are currently unavailable
     and communications aren't secure. Nevertheless, an RPM for
     overlaying PERMIS onto an existing PMP installation will be made
     available off of the NTAP project web page.

  3. GARA's diffserv manager
     The GARA resource manager (the scheduling and authorization
     component of NTAP), called the "diffserv manager", has an
     /etc/init.d/ script that governs it. The RPM added the init
     script, 'diffserv_mgr', to the init.d directory and ran chkconfig
     to add it to the runlevels, but the RPM did not start the daemon
     for you. Either reboot the machine or run
     'sudo /etc/init.d/diffserv_mgr start' and verify that a process
     named 'diffserv_manager' is running.

__________________________________________________________________
  [Post-Install] Optional GARA configuration
__________________________________________________________________

  1. /usr/local/gara-1.2.2/resource_manager/programs/mod_pts.acl
     This file is used when (1) GARA is doing authorization via AFS PTS
     group membership and (2) /resource_manager/programs/mod_pts was
     compiled to avoid a lengthy AFS PTS callout by using a flatfile
     ACL. Note that this method of authorization is rather brittle,
     because each PMP keeps its own copy of "mod_pts.acl" around. You
     may therefore need to add yourself to this file.

  2. Rebuilding GARA
     You'll probably want to email me with your questions
     (richterd@citi.umich.edu). However, for the undaunted, there's a
     funny thing about building GARA on a "new machine" -- e.g. a
     Fedora Core 1 Linux machine. Apparently GARA's build-stuff was set
     up to expect an older version of libtool.
     If you want to rebuild GARA (e.g., to get PERMIS support into the
     diffserv_manager), you'll either need to fix their build issues or
     get a copy of the older libtool, like this one:

        [richterd@l99 SPECS]$ libtool --version
        ltmain.sh (GNU libtool) 1.4.2 (1.922.2.54 2001/09/11 03:33:37)

     Wait! I've tossed that version of libtool into ntap2/gara-1.2.2/,
     so you can just use that one, if needed.
_________________________________________________________________________________________
_________________________________________________________________________________________

__________________________________________________________________
  [Post-Install] Mandatory Globus configuration
__________________________________________________________________

  1. /etc/services
     You will need to add the following line to this file if it is not
     already present:

        gsigatekeeper   2119/tcp        # Globus Gatekeeper

     This shows that the globus_gatekeeper will be listening on port
     2119.

  2. /etc/grid-security/grid-mapfile
     This file is essentially an allow-list of DNs that the
     globus_gatekeeper uses to authenticate remote clients. Add DNs as
     they appear on the kx509 certs, one per line.

     Note that, since this is on a per-user basis, this is a difficult
     administration point. However, parallel work at CITI focuses on
     moving the grid-mapfile into an LDAP directory, which will
     significantly ease PMP setup. Contact CITI for more information.

     To make things easy on yourself, you may/should add an
     administrator's DN to the grid-mapfile; to find your DN, acquire
     Kerberos credentials with `kinit', acquire kX509 credentials with
     `kx509', and flush the cert to a file with `kxlist -p'. With the
     kxlist command, you should see "subject=" and then a DN; wrap that
     DN in double-quotes and paste it into the grid-mapfile, then put a
     space, then put the administrator's local username (from
     /etc/passwd) after it (like the other entries).
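
     The quoting rule above can be sketched as a small shell function.
     This is only an illustration: gridmap_entry is a made-up helper
     name, and it assumes you hand it the "subject=" line exactly as
     kxlist displayed it, plus the local username.

```shell
#!/bin/sh
# Hypothetical helper (not part of Globus or the PMP RPM): turn a
# "subject=<DN>" line and a local username into a grid-mapfile entry.

gridmap_entry() {
    # $1 = a line containing "subject=" followed by the DN
    # $2 = the local username from /etc/passwd
    # Strip everything up to and including "subject=" (and any spaces
    # right after it); what remains is treated as the DN.
    dn=$(printf '%s\n' "$1" | sed -n 's/^.*subject=[[:space:]]*//p')
    [ -n "$dn" ] || return 1
    # Entry format from the text above: quoted DN, a space, the username.
    printf '"%s" %s\n' "$dn" "$2"
}
```

     For example, with an invented DN and username,
     gridmap_entry 'subject=/C=US/O=Example/CN=Jane Admin' jadmin
     prints the line
     "/C=US/O=Example/CN=Jane Admin" jadmin
     which you would then append to /etc/grid-security/grid-mapfile as
     root.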

     Now, one more thing to prevent future headaches: different
     versions of OpenSSL display DNs from (k)X509 certs differently,
     unfortunately. Also, Globus uses its own OpenSSL library for its
     cert-handling, which means that the Kerberos utilities and Globus
     could be using different libraries to convert, display, and
     compare DNs.

     Fortunately, it's easy to find out what Globus thinks your DN
     looks like: run the postinstall-verifier script in
     /usr/local/ntap2/pmp/bin/ and if it fails the basic Globus
     authentication step because of an authentication failure, we can
     fix it. Open up the Globus gatekeeper's log in
     /usr/local/globus-2.4/var/ and go to the end (right after you
     failed the test). Look for something about a gridmap callout
     error, or a line that starts "[WALDEN] user:" just above another
     that says "[WALDEN] srvc: jobmanager". The value of the
     walden-user field is what Globus thinks you are -- don't sweat,
     it's the same cert, just a different representation of the DN.
     E.g., you'll often see "UID" in one DN and "USERID" in another, or
     "emailAddress" and "Email". Just put in an entry for the
     kxlist-style and one for the Globus-style, if you're having this
     problem.

     UPDATE: please read the online FAQ for an explanation of how
             Walden will obviate the (formerly absolute) need for the
             grid-mapfile.

  3. /etc/grid-security/hostcert.pem and /etc/grid-security/hostkey.pem
     The globus_gatekeeper needs a certificate to identify the PMP
     machine itself. In a "normal" Globus setup, one would use the
     "openssl" utility to generate a certificate request; eventually,
     the Globus folks then send you a signed certificate. In our setup,
     the KCA signs all certificates, so you need to have your KCA admin
     make such a certificate for each PMP you have.

     The files storing each such certificate and associated private key
     must be root-readable *only* -- Globus will fail with a cryptic
     error message if these permissions are set incorrectly.
     Here is what your permissions should look like:

        -r--------  1 root root  887 Apr 21 12:46 buffalo.hostkey.pem
        -r--------  1 root root 1468 Apr 21 12:46 buffalo.hostcert.pem
        lrwxrwxrwx  1 root root   20 Apr 21 12:47 hostcert.pem -> buffalo.hostcert.pem
        lrwxrwxrwx  1 root root   19 Apr 21 12:47 hostkey.pem -> buffalo.hostkey.pem

     Here the files hostcert.pem and hostkey.pem are the names by which
     the Globus software accesses your certificate and private key
     files. Here they are shown as soft links pointing to the actual
     files holding the certificate and key; you could also skip the
     layer of indirection and store the certificate and key in the
     hostcert.pem and hostkey.pem files directly.

  4. /etc/grid-security/certificates/
     This directory contains certificates from the various authorities
     the Globus software may perform authentication checks with. You
     need a cert and signing policy for each CA in here.

  5. xinetd
     Now that the globus_gatekeeper service has been added (by the
     installer) to xinetd's purview, xinetd needs to be restarted,
     e.g., "/etc/init.d/xinetd restart". Thereafter, verify that the
     gatekeeper started with "netstat -na | grep 2119" -- if you see a
     line of output then the gatekeeper is now listening on port 2119.

__________________________________________________________________
  [Post-Install] Optional Globus configuration
__________________________________________________________________

  1. ~/.globus/
     In certain instances, you may need a "dot-globus" directory in
     $HOME (normally when you'll be running command-line jobs with
     globusrun or globus_client, etc). I forget the specifics right now
     -- see globus.org. Basically,

        mkdir ~/.globus && sudo cp -r /etc/grid-security/certificates ~/.globus/

  2. $X509_USER_PROXY
     If you want to run command-line jobs with globusrun or
     globus_client, you'll probably be manually acquiring credentials
     (e.g. with the kinit, kx509, kxlist -p dance) and may need to set
     up some environment variables for kxlist and globus_client to use.
     I set and export my X509_USER_PROXY to: /tmp/X509_proxy_richterd .
     Now, I can kinit && kx509 && kxlist -p to manually park my
     short-term cert in that file and globus_client will use it.

  3. globus-user-env.sh and globus-user-env.csh
     Also, if you want to run command-line jobs with globusrun, you'll
     probably need a bunch of shell environment stuff set up for you.
     Globus has two shell scripts, /etc/globus-user-env.{,c}sh, that
     set up your PATH, LD-stuff, etc. So you should source one of these
     scripts (as appropriate for your shell) before you run
     "globusrun".
_________________________________________________________________________________________
_________________________________________________________________________________________

__________________________________________________________________
  [Post-Install] Mandatory Walden configuration
__________________________________________________________________

  1. Java JRE/SDK
     If you haven't yet installed a Java runtime on your system, now is
     the time to do it. http://java.sun.com will have downloads
     available (using an RPM is really the easiest way, and it's nearly
     fool-proof). A Java (more precisely, JDK) version >= 1.4.2 is
     needed.

  2. JAVA_HOME
     Once Java's installed, note which directory it was installed in --
     e.g., mine is in /usr/java/j2sdk1.4.2_05 . Take that path and add
     it to your environment like this:

        export JAVA_HOME="/usr/java/j2sdk1.4.2_05"
        export PATH="${PATH}:${JAVA_HOME}/bin"

     The idea is that (1) Java needs to know about JAVA_HOME when you
     are personally running things (likely while tracking down
     configuration issues) and that (2) you'll want the `java`
     executable, etc, in your PATH.

     Next, you need to make sure that the JAVA_HOME defined in one of
     Walden's scripts corresponds to the version-of/path-to your Java
     installation. In a text editor, open up the perl script
     /usr/local/walden/mgridauth/scripts/mgridauthd-bin .
     The first real line will look like:

        $JAVA_HOME = "/usr/java/j2sdk1.4.2_05";

     Modify that path so that it points to your main Java directory.
     Note that the "$JAVA_EXEC" variable defined underneath
     "$JAVA_HOME" should not need to be modified.

  3. Java Keystore
     In order for Walden's policy daemon to perform an LDAP group
     lookup during its authentication/authorization duties, it needs to
     trust the LDAP server in order to establish a secure connection.
     Since the policy daemon is written in Java, the way to do this is
     to import CITI's KCA certificate into the Java Keystore (in
     $JAVA_HOME/jre/lib/security/cacerts).

     A copy of CITI's KCA cert was installed by the PMP RPM into the
     file /usr/local/walden/citi_ca.crt . Given that file path, and
     given that you need root privileges to do this, the way to do this
     is something like:

        % sudo keytool -import -file /usr/local/walden/citi_ca.crt \
               -keystore cacerts -alias citi_ca

     NOTE: that command should all be on one line (the trailing
           backslash above is only a line continuation).

     NOTE: the keystore is named above by the bare filename "cacerts",
           so either run the command from within
           $JAVA_HOME/jre/lib/security/ or give the full path to
           cacerts instead.

     NOTE: the first thing that keytool should ask you is for the
           keystore password. The default password is the string
           "changeit" (minus the double-quotes). BTW, you should
           change it.

     When the keytool asks for confirmation, say yes.

  4. Guest accounts
     One of the great things that Walden does is make it so that every
     user need not have a unique account on every machine; instead,
     once a user is fully authenticated and authorized, Walden can
     assign their job to a guest account, e.g. "guest01", for the
     duration of the network test.

     The number of guest accounts you would like to set up, as well as
     the naming format for them, is part of the next step, #5. If you
     want to set up four accounts, you can crack open /etc/passwd and
     just make entries for them. Those accounts are only used for their
     UID/GIDs -- they are not and should not be login-capable accounts.
     When I made guest users on our PMPs, I named them "mgrid01",
     "mgrid02", etc, and the /etc/passwd entries look like:

        mgrid01:x:33308:33308::/home/mgrid01:/bin/false
        mgrid02:x:33309:33309::/home/mgrid02:/bin/false
        mgrid03:x:33310:33310::/home/mgrid03:/bin/false
        mgrid04:x:33311:33311::/home/mgrid04:/bin/false

     After that, GARA needs to be made aware of those accounts. This is
     accomplished by editing the file
     /usr/local/gara-1.2.2/etc/mod_pts.acl . Don't fret -- that's not
     really AFS/PTS stuff. Following the format in the file, add the
     guest accounts (whatever you named them, both in /etc/passwd and
     the localhost-policy.xml file from the next step, #5) to the group
     "citi". My entries look like:

        mgrid01: citi
        mgrid02: citi
        mgrid03: citi
        mgrid04: citi

  5. /etc/grid-security/localhost-policy.xml
     Now you must configure the authentication policy that Walden will
     use (refer to the website for information on Walden); if you will
     only be using the /etc/grid-security/grid-mapfile authentication
     method, you can skip this step. Anyway, tailor the policy X(AC)ML
     file.

  6. sunxacml.jar
     Now you just need to drop the XACML (the policy language used by
     Walden) java jarfile --
     /usr/local/walden/sunxacml-1.2/lib/sunxacml.jar -- into
     $JAVA_HOME/jre/lib/ext/ .

  7. Walden's mgridauthd
     The RPM copied the init script for Walden's policy daemon to
     /etc/init.d/mgridauthd and ran chkconfig to add it to the
     runlevels. However, the RPM doesn't start it automatically; either
     reboot or run 'sudo /etc/init.d/mgridauthd start' and verify that
     a Java process is running MgridAuthServer.
_________________________________________________________________________________________
_________________________________________________________________________________________

__________________________________________________________________
  [Post-Install] Mandatory NDT/Web100 configuration
__________________________________________________________________

  1. Web100 kernel
     The PMP RPM copied another RPM into /usr/local/ndt (it is likely
     kernel-web100-2.4.26-2.3.8.i686.rpm). Install that kernel by
     running, from that directory:

        % sudo rpm -ivh kernel-web100-2.4.26-2.3.8.i686.rpm

     NOTE: do not reboot yet!

  2. Bootloader setup
     Now that the kernel has been installed in /boot/, the bootloader
     needs to be configured to boot that kernel. We use grub; our
     entries look like this (note that I set the "default" option to
     "0" and I added my kernel as the topmost of the kernel entries in
     /boot/grub/grub.conf):

        title Web100 Fedora Core (2.4.26-2.3.8)
                root (hd0,0)
                kernel /vmlinuz-2.4.26-2.3.8 ro root=LABEL=/ rhgb
                initrd /initrd-2.4.26-2.3.8.img

     Now, reboot. If the machine comes up, all of the NTAP services
     should be running -- check with ntapctl or the
     postinstall-verifier. If the kernel doesn't work, we'll have to
     get a new Web100-enabled kernel -- either as an RPM from the
     Web100 project itself, or as a manually-configured and -patched
     thing.
_________________________________________________________________________________________

__________________________________________________________________
  [Post-Install] Optional portal configuration
__________________________________________________________________

  1. Adding an alternate traceroute-like utility
     Currently, the testpilot supports normal plain-vanilla traceroute
     and tcptraceroute. If one wants to add another utility, it's
     relatively simple; it consists of installing your utility on the
     PMPs, adding its name to two configuration files, selecting its
     default command-line argument(s) (e.g., `traceroute -n'), possibly
     trimming its output (if too dissimilar to traceroute; however,
     most tend to conform for parsing-compatibility), and either
     setting it as the default tracer program or giving it a
     command-line flag.

     a. install the utility: e.g., I copied `tcptraceroute' to
        /usr/local/bin/ on all of CITI's PMPs.

     b. add to config files: on each PMP, you need to add the same
        entry to two files:
        /usr/local/gara-1.2.2/etc/diffserv_manager.conf and
        /usr/local/gara-1.2.2/etc/paramfile.conf. Make sure to append
        "-client" to your tracer's name, as per the other entries.

     c. command-line args: choose the arguments that make the tracer
        display its output as IP addresses (not hostnames) and show one
        hop per line (normally the default anyway). Now, edit
        /usr/local/ntap2/webserver/bin/testpilot.py and search for
        "defaultTracerouteRSL". Copy it to a new variable called, e.g.,
        "myOwnTracerRSL"; change the argument(s) only in the
        "option-string-a" field to whatever your tracer needs.

     d. (maybe) trim output: you need to get only lines that have the
        IPs you want listed in exactly the order you want so an
        accurate pathmap can be constructed. For instance, normal
        traceroute has on its first line of output the source and
        destination IPs, but you (likely) don't want it to schedule
        first a source-dest test and THEN schedule all the hops along
        the way. If the output ends up having 1 IP per line and has its
        lines numbered " 1 111.111.111.111\n 2 111.111.111.112\n ....",
        then you can use "defaultTracerouteEvalFormatter". Else, write
        one to match what you need.

     e. wire into testpilot: finally, edit `testpilot.py' and search
        for "^(traceroute|tcptraceroute)$"; add a pipe-character ("|")
        after tcptraceroute and then your tracer's name. Then, when
        "--tracer yourTracer" is given on the commandline, yours will
        be selected. Lastly, search for "self.addTracerProgram" and see
        where traceroute and tcptraceroute are added. Copy one of their
        calls and change the "name" to your tracer's name, "rsl" to
        your tracer's RSL, and "evalFormatter" to your custom one (if
        needed) or the "defaultTracerouteEvalFormatter" if it works for
        you. You're done.
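
     To get a feel for the trimming described in step (d), here is a
     standalone shell sketch. It is not testpilot code (testpilot's
     formatters live in testpilot.py); trim_tracer_output is a made-up
     name, and dropping unresponsive hops is an assumption about what
     you want. It keeps only lines whose first field is a hop number
     and whose second field looks like an IPv4 address, and prints them
     in the numbered one-IP-per-line shape mentioned above.

```shell
#!/bin/sh
# Hypothetical pre-filter (not part of testpilot): read tracer output
# on stdin and print " <hop> <ip>" lines only.

trim_tracer_output() {
    # Keep a line only if field 1 is all digits (the hop number) and
    # field 2 is a dotted-quad address. This drops the header line that
    # names the destination, and hops that only printed "* * *".
    awk '$1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ {
             printf "%2d %s\n", $1, $2
         }'
}
```

     You could try it by piping your tracer through it, e.g.
     "traceroute -n somehost | trim_tracer_output", and comparing the
     result with what plain traceroute produces before deciding whether
     the default formatter is enough.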
_________________________________________________________________________________________
_________________________________________________________________________________________

__________________________________________________________________
  [NOTE: this is all out-of-date; ignore for now]
__________________________________________________________________

  !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
  !!!!  Ignore this item for now; we have a workaround  !!!!
  !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

  1. /etc/grid-security/hostcert-ldap.pem and
     /etc/grid-security/hostkey-ldap.pem
     Just as with the mandatory Step 3 of the Globus config above,
     certificates are needed, only now it's to secure the PMPs' LDAP
     connections with SSL. Again, you should have your KCA generate and
     sign a cert; also, set it up with similar permissions to the
     hostcert in Globus Step 3 above. Then, you need to edit the
     machine's ldap.conf file (often /etc/ldap.conf ) and ensure this
     is present:

        TLS_REQCERT demand
        TLS_CERT /etc/grid-security/hostcert-ldap.pem
        TLS_KEY /etc/grid-security/hostkey-ldap.pem
_________________________________________________________________________________________