How to create self-contained Solaris 10 x86 Jumpstart kit

I was required to create a self-contained, single DVD to automate the installation of Solaris 10 on x86_64. I could not find any up-to-date straight forward guide which can explain how to do it, so I do it here. This is not an explanation for dummies, so you must know (to some degree, of course) what you’re doing.

I will describe the procedure in whole, and will explain in greater details below, if I see fit. A section which will be explained later will be marked with (*) at the end of the line.

  • Install Solaris 10 x86 on a machine. Many actions will happen on this little server…
  • Setup your Solaris installation according to your likings. Make sure you have your beloved users, your passwords, your configurations. Don’t mind much about networking configurations (IP, Netmask, etc) - as they will be unconfigured for the image.
  • Create a Flash Image (flar) of the system (*)
  • Copy the contents of the installation DVD to a directory inside your system. Let’s call it /tmp/dvd
  • Remove /tmp/dvd/Solaris_10/Product directory. You will not need it.
  • Extract the contents of /tmp/dvd/x86.miniroot to /tmp/miniroot (*)
  • Perform several actions with the extracted miniroot (*)
  • Re-archive the contents of the x86.miniroot and place them instead of /tmp/dvd/boot/x86.miniroot
  • Place the flar file inside /tmp/dvd/flash
  • Edit your jumpstart files inside /tmp/dvd/.install_config (*)
  • Edit /tmp/dvd/boot/grub/menu.lst boot loader to add an entry for your installation (*)
  • Create an ISO from the DVD directory (*)
  • Burn the DVD and try to use it

And now for the drill-down

Creating a flash image

Use the command flarcreate to create your own flash image:

flarcreate -n sol10_automation -c -x /tmp /tmp/sol10_auto.flar

This should do the work. Remember - /tmp will not be persistent across reboots! Make sure your files are not there before you reboot the system!

Extracting/Archiving the x86.miniroot

To do so, you need to run the command /boot/solaris/bin/root_archive

Extracting the image can be done like this:

/boot/solaris/boot/root_archive unpack /tmp/dvd/boot/x86.miniroot /tmp/miniroot

Archiving the image can be done like this:

/boot/solaris/boot/root_archive pack /tmp/dvd/boot/x86.miniroot /tmp/miniroot

Actions to perform on the extracted miniroot

Three actions are to be performed on the extracted miniroot. In our example, it resides on /mnt/miniroot.

First, you need to remove the default sysidcfg (which is a symbolic link)

rm /mnt/miniroot/etc/sysidcfg

Now, you have to place your custom sysidcfg in there, instead.

This is an example of my own sysidcfg file:

name_service=NONE
network_interface=nge0 {primary hostname=sol10
ip_address=10.10.10.10
netmask=255.0.0.0
default_route=NONE
protocol_ipv6=no }
nfs4_domain=dynamic
service_profile=open
root_password=12wR2rF34t
security_policy=NONE
system_locale=en_US.UTF-8
timezone=GMT
timeserver=localhost
keyboard=US-English
terminal=xterm

The root password is encrypted. Take it from your own /etc/shadow file. For more information about sysidcfg file, check out Sun site.

Following that, you need to edit a specific file in the miniroot. Edit /tmp/miniroot/usr/sbin/install.d/profind and search for the cdrom() function. Search the line

if [ -f /tmp/.preinstall ]; then

and hash (remark) it. Don’t forget to remark the closing “fi” below.

Jumpstart contents

This has to be inside /tmp/dvd/.install_config . Edit the file /tmp/dvd/.install_config/rules and make sure it has only one line (in our example. If you know what you’re doing with Jumpstart, go ahead!)

any -   x86-begin any_machine  x86-end

This line will match any hardware, run x86-begin script (from that same directory) on it prior to running the installation itself, and run x86-end script on it after the installation phase. It allows up further customisation during installs (verify what type of RAID, check memory, whatever). The installation profile itself is the file any_machine.

You will need to run “check” on the file to build the rules.ok file

cd /tmp/dvd/.install_conf

/tmp/dvd/Solaris_10/Misc/jumpstart_sample/check

Lets look at my any_machine file:

install_type    flash_install
archive_location local_file /cdrom/flash/sol10_auto.flar
partitioning    explicit
filesys         any 8196 swap
filesys         any 10240 /
filesys         any free /storage

Notice that the installation type is “flash_install” and that the location of the file is local, inside /cdrom (where the bootable dvd will be mounted) inside a directory called flash. Partitioning is defined here, explicitly.

For more information about Jumpstart, search in Sun site. They have plenty of information.

Edit Grub

Add the following entry to your /tmp/dvd/boot/grub/menu.lst file

title Solaris10 Jumpstart
kernel /boot/multiboot kernel/unix - install -B \
install_media=cdrom
module /boot/x86.miniroot

Make sure it is the default option for grub.

Creating DVD ISO from the directory

We’re almost done. To create a DVD iso file from the directory, perform the following actions:

cd /tmp/dvd

mkisofs -b boot/grub/stage2_eltorito -c .catalog -no-emul-boot -boot-load-size 4 -boot-info-table -relaxed-filenames -l -ldots -r -N -d -D -V SOL_10_1008_X86 -o /tmp/sodvd.iso .

Don’t ignore the “.” at the end!

(This specific line was tested on Linux, but there is no reason for it not to work on any modern Solaris system)

Appendix

You would like to keep your /tmp/dvd directory somewhere else, or you will lose it on your next reboot.

This sums it up. Let me know if the procedure is broken somehow.

Poor man’s load balancing

I was requested to setup a “poor man’s” load balancing. The server accesses various HTTP servers with a GET command, and the customer fears that the server’s IP will get blocked. To work around this, or at least - to minimize the problem, the customer has purchased three IP addresses.

I have assigned all three addresses to the server, and was into smart routing as a solution. I did not want to capture all outbound communication, but only HTTP (port 80). Also - the system is Centos, which means there are only few available iptbles modules, so “random” module is not an option. Compiling a new kernel for a server across an ocean didn’t sound like the best idea, so I have attempted to work with the available tools.

With net.ipv4.conf.default.rp_filter set to 0 in /etc/sysctl.conf , and with real internet IPs (won’t work above NAT), I added the following rules to the mangle table:

iptables -t mangle -N CUST_OUTPUT
iptables -t mangle -A CUST_OUTPUT -o ! lo -p tcp -m recent –remove -j MARK –set-mark 1
iptables -t mangle -A OUTPUT -o ! lo -p tcp –dport 80 -m state –state NEW -m recent –update –seconds 60 –hitcount 6 -j  CUST_OUTPUT
iptables -t mangle -A OUTPUT -o ! lo -p tcp –dport 80 -m state –state NEW -m recent –update –seconds 60 –hitcount 4 -j MARK –set-mark 3
iptables -t mangle -A OUTPUT -m state –state NEW -m mark –mark 3 -j ACCEPT
iptables -t mangle -A OUTPUT -o ! lo -p tcp –dport 80 -m state –state NEW -m recent –update –seconds 60 –hitcount 2 -j MARK –set-mark 2
iptables -t mangle -A OUTPUT -m state –state NEW -m mark –mark 2 -j ACCEPT
iptables -t mangle -A OUTPUT -o ! lo -p tcp –dport 80 -m state –state NEW -m recent –set

To the nat table, I have added these following rules:

iptables -t nat -A POSTROUTING -m mark –mark 2 -j SNAT –to 1.1.1.2
iptables -t nat -A POSTROUTING -m mark –mark 3 -j SNAT –to 1.1.1.3
iptables -t nat -A POSTROUTING -m mark –mark 1 -j SNAT –to 2.3.4.5

(replaced the real customer’s IPs with the junk above). Of course - all three IP addresses are accessible from the Internet and fully routable.

After a short run, I saw the following lines when running iptables -t nat -L -v

Chain POSTROUTING (policy ACCEPT 39055 packets, 2495K bytes)
pkts bytes target     prot opt in     out     source               destination
143  8580 SNAT       all  –  any    any     anywhere             anywhere            MARK match 0×2 to:1.1.1.2
144  8640 SNAT       all  –  any    any     anywhere             anywhere            MARK match 0×3 to:1.1.1.3
72  4320 SNAT       all  –  any    any     anywhere             anywhere             MARK match 0×1 to:2.3.4.5

Statistically, this is quite ok. Mark 0×1 forces the same route as Mark 0×0, so this is rather balanced.

Works like a charm, for now :-)

Persistent raw devices for Oracle RAC with iSCSI

If you’re into Oracle RAC over iSCSI, you should be rather content - this configuration is a simple and supported. However, working with some iSCSI target devices, order and naming is not consistent between both Oracle nodes.

The simple solutions are by using OCFS2 labels, or by using ASM, however, if you decide to place your voting disks and cluster registry disks on raw devices, you are to face a problem.

iSCSI on RHEL5:

There are few guides, but the simple method is this:

  1. Configure mapping in your iSCSI target device
  2. Start the iscsid and iscsi services on your Linux
    • service iscsi start
    • service iscsid start
    • chkconfig iscsi on
    • chkconfig iscsid on
  3. Run “iscsiadm -m discovery -t st -p target-IP
  4. Run “iscsiadm -m node -L all
  5. Edit /etc/iscsi/send_targets and add to it the IP address of the target for automatic login on restart

You need to configure partitioning according to the requirements.

If you are to setup OCFS2 volumes for the voting and for the cluster registry, there should not be a problem as long as you use labels, however, if you require raw volumes, you need to change udev to create your raw devices for you.

On a system with persistent disk naming, follow this process, however, on a system with changing disk names (every reboot names are different), the process can become a bit more complex.

First, detect your scsi_id for each device. While names might change upon reboots, scsi_ids do not.

scsi_id -g -u -s /block/sdc

Replace sda with the device name you are looking for. Notice that /block/sda is a reference to /sys/block/sdc

Use the scsi_id generated by that to create the raw devices. Edit /etc/udev/rules.d/50-udev.rules and find line 298. Add a line below with the following contents:

KERNEL==”sd*[0-9]“, ENV{ID_SERIAL}==”14f70656e66696c000000000004000000010f00000e000000″, SYMLINK+=”disk/by-id/$env{ID_BUS}-$env{ID_SERIAL}-part%n” ACTION==”add” RUN+=”/bin/raw /dev/raw/raw%n %N”

Things to notice:

  1. The ENV{ID_SERIAL} is the same scsi_id obtained earlier
  2. This line will create a raw device in the name of raw and number in /dev/raw for each partition
  3. If you want to differtiate between two (or more) disks, change the name from raw to an aduqate name, like “crsa”, “crsb”, etc, for example:

KERNEL==”sd*[0-9]“, ENV{ID_SERIAL}==”14f70656e66696c000000000005000000010f00000e000000″, SYMLINK+=”disk/by-id/$env{ID_BUS}-$env{ID_SERIAL}-part%n” ACTION==”add” RUN+=”/bin/raw /dev/raw/crs%n %N”

Following these changes, run “udevtrigger” to reload the rules. Be advised that “udevtrigger” might reset network connection.

Protect Vmware guest under RedHat Cluster

Most documentation on the net is about how to run a cluster-in-a-box under Vmware. Very few seem to care about protecting Vmware guests under real RedHat cluster with a shared storage.

This article is just about it. While I would not recommend using Vmware in such a setup, it has been the case, and that Vmware guest actually resides on the shared storage. To relocate it is out of the question, so migrating it together with other resources is the only valid option.

To do so, I have created a simple script which will accept start/stop/status arguments. The Vmware guest VMX is hard-coded into the script, but in an easy-to-change format. This script will attempt to freeze the Vmware guest, and only if it fails, to shut it down. Mind you that the blog’s HTML formatting might alter quotation marks into UTF-8 marks which will not be understood well by shell.

#!/bin/bash
# This script will start/stop/status VMware machine
# Written by Ez-Aton
# http://www.tournament.org.il/run
 
# Hardcoded. Change to match your own settings!
VMWARE="/export/vmware/hosts/Windows_XP_Professional/Windows XP Professional.vmx"
VMRUN="/usr/bin/vmrun"
TIMEOUT=60
 
function status () {
  # This function will return success if the VM is up
  $VMRUN list | grep "$VMWARE" &>/dev/null
  if [[ "$?" -eq "0" ]]
  then
    echo "VM is up"
    return 0
  else
    echo "VM is down"
    return 1
  fi
}
 
function start () {
  # This function will start the VM
  $VMRUN start "$VMWARE"
  if [[ "$?" -eq "0" ]]
  then
    echo "VM is starting"
    return 0
  else
    echo "VM failed"
    return 1
  fi
}
 
function stop () {
  # This function will stop the VM
  $VMRUN suspend "$VMWARE"
  for i in `seq 1 $TIMEOUT`
  do
    if status
    then
      echo
    else
      echo "VM Stopped"
      return 0
    fi
    sleep 1
  done
  $VMRUN stop "$VMWARE" soft
}
 
case "$1" in
start)     start
        ;;
stop)      stop
        ;;
status)   status
        ;;
esac
RET=$?
 
exit $RET

Since the formatting is killed by the blog, you can find the script here: vmware1

I intend on building a “real” RedHat Cluster agent script, but this should do for the time being.

Enjoy!

Xen Networking - Bonding with VLAN Tagging

The simple scripts in /etc/xen/scripts which manage networking are fine for most usages, however, when your server is using bonding together with VLAN tagging (802.11q) you should consider an alternative.

A PDF document written by Mark Nielsen, GPS Senior Consultant, Red Hat, Inc (I lost the original link, sorry) named “BOND/VLAN/Xen Network Configuration” as a service to the community, game me few insights on the subject. Following one of its references, I saw a bit more elegant method of doing a bridging setup under RedHat, which takes managing the bridges away from xend, and leaves it at the system level. Lets see how it’s done on RedHat style Linux distribution.

Manage your normal networking configurations

If you’re using VLAN tagging over bonding, than you should have to setup a bonding device (be it bond0) which has definitions such as this:

/etc/sysconfig/network-scripts/ifcfg-eth0 and /etc/sysconfig/network-scripts/ifcfg-eth1

DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes
ISALIAS=no

/etc/sysconfig/network-scripts/ifcfg-bond0

DEVICE=bond0
BOOTPROTO=none
BONDING_OPTS=”mode=1 miimon=100″
ONBOOT=yes

This is rather stright-forward, and should be rather a default for such a setup. Now comes the more interesting part. Originally, the next configuration part would be bond0.2 and bond0.3 (in my example). The original configuration would have looked like this (this is in bold because I tend to fast-read myself, and tend to miss things too often). This is not how it should look when we’re done!

/etc/sysconfig/network-scripts/ifcfg-bond0.2 (same applies to ifcfg-bond0.3)

DEVICE=bond0.2
BOOTPROTO=static
IPADDR=192.168.0.2
NETMASK=255.255.255.0
ONBOOT=yes
VLAN=yes

Configure bridging

To setup a bridge device for bond0.2, replace the mentioned above ifcfg-bond0.2 with this new /etc/sysconfig/network-scripts/ifcfg-bond0.2

DEVICE=bond0.2
BOOTPROTO=static
ONBOOT=yes
VLAN=yes
BRIDGE=xenbr0

Now, create a new file /etc/sysconfig/network-scripts/ifcfg-xenbr0

DEVICE=xenbr0
BOOTPROTO=static
IPADDR=192.168.0.2
NETMASK=255.255.255.0
ONBOOT=yes
TYPE=bridge

Now, on network restart, the bridge will be brought up, holding the right IP address - all done by initscripts, with no Xen intervention. You will want to repeat the last the “Configure bridge” part for any additional bridge you want to be enabled for Xen machines.

Don’t let Xen bring any bridges up

This is the last part of our drill, and it is very important. If you don’t do it, you’ll get a nice networking mess. As said before, Xen (community), by default, can’t handle bondings or vlan tags, so it will attempt to create or modify bridges to eth0 or the likes. Edit /etc/xen/xend-config.sxp and remark any line containing a directive containing starting with “network-script“. Such a directive would be, for example

(network-script network-bridge)

Restart xend and restart networking. You should now be able to configure VMs to use xenbr0 and xenbr1, etc (according to your own personal settings).

MySQL permissions for LVM Snapshots

aking LVM snapshots as a mean of backing up MySQL is rather simple, as can be described here. However, if you are into security, you would strive to grant minimal permissions for the action to the MySQL user. Per MySQL Documentation, the required privileges is “RELOAD”. That should be enough, granted on *.*, of course.

Raw devices for Oracle on RedHat (RHEL) 5

There is a major confusion among DBAs regarding how to setup raw devices for Oracle RAC or Oracle Clusterware. This confusion is caused by the turn RedHat took in how to define raw devices.

Raw devices are actually a manifestation of character devices pointing to block devices. Character devices are non-buffered, so they act as FIFO, and have no OS cache, which is why Oracle likes them so much for Clusterware CRS and voting.

On other Unix types, commonly there are two invocations for each disk device - a block device (i.e /dev/dsk/c0d0t0s1) and a character device (i.e. /dev/rdsk/c0d0t0s1). This is not the case for Linux, and thus, a special “raw”, aka character, device is to be defined for each partition we want to participate in the cluster, either as CRS or voting disk.

On RHEL4, raw devices were setup easily using the simple and coherent file /etc/sysconfig/rawdevices, which included an internal example. On RHEL5 this is not the case, and customizing in a rather less documented method the udev subsystem is required.

Check out the source of this information, at this entry about raw devices. I will add it here, anyhow, with a slight explanation:

1. Add to /etc/udev/rules.d/60-raw.rules:

ACTION==”add”, KERNEL==”sdb1″, RUN+=”/bin/raw /dev/raw/raw1 %N”

2. To set permission (optional, but required for Oracle RAC!), create a new /etc/udev/rules.d/99-raw-perms.rules containing lines such as:

KERNEL==”raw[1-2]“, MODE=”0640″, GROUP=”oinstall”, OWNER=”oracle”

Notice this:

  1. The raw-perms.rules file name has to begin with the number 99, which defines its order during rules apply, so that it will be used after all other rules take place. Using lower numbers might cause permissions to be incorrect.
  2. The following permissions have to apply:
  • OCR Device(s): root:oinstall , mode 0640
  • Voting device(s): oracle:oinstall, mode 0666
  • You don’t have to use raw devices for ASM volumes on Linux, as the ASMLib library is very effective and easier to manage.

    Xen VMs performance collection

    Unlike VMware Server, Xen’s HyperVisor does not allow an easy collection of performance information. The management machine, called “Domain-0″ is actually a privileged virtual machine, and thus - get its own small share of CPUs and RAM. Collecting performance information on it will lead to, well, collecting performance information for a single VM, and not the whole bunch.

    Local tools, such as “xentop” allows collection of information, however, combining this with Cacti, or any other SNMP-based collection tool is a bit tricky.

    A great solution is provided by Ian P. Christian in his blog post about Xen monitoring. He has created a Perl script to collect information. I have taken the liberty to fix several minor things with his permission. The modified scripts are presented below. Name the script (according to your version of Xen) “/usr/local/bin/xen_stats.pl” and set it to be executable:

    For Xen 3.1

    ?Download xen_stats.pl
    #!/usr/bin/perl -w
     
    use strict;
     
    # declare...
    sub trim($);
    #<a href="/blog/uploads/xen_cloud.tar.gz" title="xen_cloud.tar.gz" target="_blank">xen_cloud.tar.gz</a>
    # we need to run 2 iterations because CPU stats show 0% on the first, and I'm putting .1 second betwen them to speed it up
    my @result = split(/\n/, `xentop -b -i 2 -d.1`);
     
    # remove the first line
    shift(@result);
     
    shift(@result) while @result && $result[0] !~ /^xentop - /;
     
    # the next 3 lines are headings..
    shift(@result);
    shift(@result);
    shift(@result);
    shift(@result);
     
    foreach my $line (@result)
    {
      my @xenInfo = split(/[\t ]+/, trim($line));
      printf("name: %s, cpu_sec: %d, cpu_percent: %.2f, vbd_rd: %d, vbd_wr: %d\n",
        $xenInfo[0],
        $xenInfo[2],
        $xenInfo[3],
        $xenInfo[14],
        $xenInfo[15]
        );
    }
     
    # trims leading and trailing whitespace
    sub trim($)
    {
      my $string = shift;
      $string =~ s/^\s+//;
      $string =~ s/\s+$//;
      return $string;
    }

    For Xen 3.2 and Xen 3.3

    ?Download xen_stats.pl
    #!/usr/bin/perl -w
     
    use strict;
     
    # declare…
    sub trim($);
     
    # we need to run 2 iterations because CPU stats show 0% on the first, and I’m putting .1 second between them to speed it up
    my @result = split(/\n/, `/usr/sbin/xentop -b -i 2 -d.1`);
     
    # remove the first line
    shift(@result);
    shift(@result) while @result && $result[0] !~ /^[\t ]+NAME/;
    shift(@result);
     
    foreach my $line (@result)
    {   
            my @xenInfo = split(/[\t ]+/, trim($line));
            printf(“name: %s, cpu_sec: %d, cpu_percent: %.2f, vbd_rd: %d, vbd_wr: %d\n“,
            $xenInfo[0],
            $xenInfo[2],
            $xenInfo[3],
            $xenInfo[14],
            $xenInfo[15]
            );
    }   
    # trims leading and trailing whitespace
    sub trim($)
    {   
            my $string = shift;
            $string =~ s/^\s+//;
            $string =~ s/\s+$//;
            return $string;
    }

    Cron settings for Domain-0

    Create a file “/etc/cron.d/xenstat” with the following contents:

    # This will run xen_stats.pl every minute
    */1 * * * * root /usr/local/bin/xen_stats.pl > /tmp/xen-stats.new && cat /tmp/xen-stats.new > /var/run/xen-stats

    SNMP settings for Domain-0

    Add the line below to “/etc/snmp/snmpd.conf” and then restart the snmpd service

    extend xen-stats   /bin/cat /var/run/xen-stats

    Cacti

    I reduced Ian Cacti script to be based on a per-server setup, meaning this script gets the host (dom-0) name from Cacti, but cannot support live migrations. I will try to deal with combining live migrations with Cacti in the future.

    Download and extract my modified xen_cloud.tar.gz file. Extract it, place the script and config in its relevant location, and import the template into Cacti. It should work like charm.

    A note - the PHP script will work only on PHP5 and above. Works flawlessly on Centos5.2 for me.

    Oracle RAC with EMC iSCSI Storage Panics

    I have had a system panicking when running the mentioned below configuration:

    • RedHat RHEL 4 Update 6 (4.6) 64bit (x86_64)
    • Dell PowerEdge servers
    • Oracle RAC 11g with Clusterware 11g
    • EMC iSCSI storage
    • EMC PowerPate
    • Vote and Registry LUNs are accessible as raw devices
    • Data files are accessible through ASM with libASM

    During reboots or shutdowns, the system used to panic almost before the actual power cycle. Unfortunately, I do not have a screen capture of the panic…

    Tracing the problem, it seems that iSCSI, PowerIscsi (EMC PowerPath for iSCSI) and networking services are being brought down before “killall” service stops the CRS.

    The service file init.crs was never to be executed with a “stop” flag by the start-stop of services, as it never left a lock file (for example, in /var/lock/subsys), and thus, its existence in /etc/rc.d/rc6.d and /etc/rc.d/rc0.d is merely a fake.

    I have solved it by changing /etc/init.d/init.crs script a bit:

    • On “Start” action, touch a file called /var/lock/subsys/init.crs
    • On “Stop” action, remove a file called /var/lock/subsys/init.crs

    Also, although I’m not sure about its necessity, I have changed init.crs script SYSV execution order in /etc/rc.d/rc0.d and /etc/rc.d/rc6.d from wherever it was (K96 in one case and K76 on another) to K01, so it would be executed with the “stop” parameter early during shutdown or reboot cycle.

    It solved the problem, although future upgrades to Oracle ClusterWare will require being aware of this change.

    Minimal Centos5 and SSH Server with X forwarding

    Installation of minimal selection of Centos 5 - only base, core and dialup selections, will leave a small-sized system, with the minimum required.

    This is a good setup to start setting up a firewall, and it lets you add any required package later using ‘yum’.

    This said, you cannot forward X over SSH at this state. A minimal setup is a minimal setup, after all. I was searching for the package required to allow X forwarding over SSH, and found it - xorg-x11-xauth

    Install this one, and X will be forwarded over your SSH connection (you still need to connect with the ‘-X’ flag from the client).