Using NGINX for your static content

A client has a group of six or so servers that run some PHP-based applications. To reduce the load on Apache (the number of concurrent threads), we thought it would be a good idea to move static content (images, CSS, JavaScript, etc.) onto another server. Past experience suggested that NGINX would be a good choice for this.

It soon became clear that managing the static content on this new server was going to be a little challenging. Suppose we have three application servers and want to use the same static content server for all of them: we need a way to divide the content up. One way would be to keep the content for each application server in a different folder and change the URLs to reflect that:

http://static.company.com/app1/foo.css

http://static.company.com/app2/bar.jpg

http://static.company.com/app3/moo.js

While this would work, we would have to go back and change all the paths in the application to add the folder, and then manage the content in those folders. Annoying and not very portable. Another thought was to use a different virtual server in NGINX for each application server:

http://app1-static.company.com/foo.css

http://app2-static.company.com/bar.jpg

http://app3-static.company.com/moo.js

Then we create a CNAME record for each one, so that app1-static (and likewise app2-static and app3-static) is a CNAME for static. Now we only need to change the hostname in the URLs for the static content: we put the static hostname in the application’s config, and the path portion stays the same.
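
As a rough illustration of the "only the hostname changes" idea, here is a minimal shell sketch. The static_url function is hypothetical (it is not part of the application); it assumes the appN-static.company.com naming from the examples above and simply builds the URL from an application name and an unchanged path.

```shell
#!/bin/sh
# Hypothetical helper: build the static-content URL for an application.
# $1 = application name (e.g. app1), $2 = path exactly as the app already uses it.
# The "-static" suffix and the company.com domain follow the examples above.
static_url() {
  # "${2#/}" drops one leading slash so we never emit a double slash.
  printf 'http://%s-static.company.com/%s\n' "$1" "${2#/}"
}

static_url app1 /foo.css   # prints http://app1-static.company.com/foo.css
static_url app2 bar.jpg    # prints http://app2-static.company.com/bar.jpg
```

In a real application the hostname would come from the application’s config rather than being hard-coded like this.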

Now the problem becomes maintaining the static content. In the SVN repository for the application, the static content is maintained alongside the rest of the application. In development, the static content server can simply be set to the same machine as the application server. But how do we maintain just the static content in the virtual servers on the static content server? Answer: we don’t. Make the static content server a caching proxy instead: it gets the content from the application server, caches it and delivers it to the end user. In the NGINX config, we can make it a caching proxy for just our hosts. Say we have the application servers app1, app2 and app3, and the static content CNAME records app1-static, app2-static and app3-static. We can then use the following in the NGINX config:

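# $host is a built-in NGINX variable and cannot be reassigned with set, so capture into our own variable
if ($host ~* ^(.+)-static\.company\.com$) {
  set $app $1;
}
# proxy_pass with a variable is resolved at request time, so a resolver directive must be configured
proxy_pass http://$app.company.com;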

As long as each CNAME points to this static content server and the hostname follows the format shown, it will fetch, cache and deliver our content for us. We don’t have to worry about keeping the files on the static content server up to date with the apps. Note that the snippet above covers only the proxying; the caching itself is switched on with NGINX’s proxy_cache_path and proxy_cache directives. How you tune the remaining cache settings depends on how often you want to refresh the cache, how large it can be, and so on.
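
To make the hostname rule easier to check outside NGINX, here is a small shell sketch of the same mapping. The upstream_for_host function is hypothetical and only mirrors the idea (strip the -static suffix to find the application server); it is not something NGINX itself runs.

```shell
#!/bin/sh
# Hypothetical mirror of the NGINX rule:
# map "<app>-static.company.com" to "<app>.company.com".
upstream_for_host() {
  # Lowercase first, since the NGINX match (~*) is case-insensitive.
  ufh_host=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$ufh_host" in
    ?*-static.company.com)
      # Remove the "-static.company.com" suffix, then re-add the domain.
      printf '%s.company.com\n' "${ufh_host%-static.company.com}" ;;
    *)
      # Not one of our static hostnames: report failure instead of proxying.
      return 1 ;;
  esac
}

upstream_for_host app1-static.company.com               # prints app1.company.com
upstream_for_host www.example.org || echo "no match"    # prints no match
```

Rejecting hostnames that do not match is worth keeping in the real config too, so the server does not become a proxy for arbitrary hosts.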

rsync backup to Amazon EBS storage

I’d been toying with the idea of using Amazon Web Services storage as a remote backup facility for a dedicated server or VPS. The problem has been that there doesn’t seem to be a really good way to mount the Amazon storage remotely so that you get full backups without moving the whole backup set each time. I tried some FUSE-based systems that make the AWS storage look like a drive and ran rsync against them, but it was really slow, particularly for one application I had with thousands of small files.

What I really wanted was a way to mount an EBS volume as an ext3 filesystem and just rsync my data to it. I did not want to pay to leave an EC2 instance running for the other 23.5 hours of the day when my backup wasn’t running (now, with the micro instance, this isn’t such a big deal). So what I decided to try was to boot an EC2 instance, attach and mount the EBS volume, transfer some ssh keys, and do the rsync to that EC2 instance. Launch the whole thing from a daily cron job and you have an inexpensive remote snapshot of your data.

Using Amazon’s AWS SDK for PHP I was able to write a simple script that would do this:

<?php
require_once '/usr/share/pear/AWSSDKforPHP/sdk.class.php';
$notify = 'youremail@yourdomain.com';
$diskvol = 'vol-XXXXXX';
$diskdev = '/dev/sdc';
$queue = 'https://queue.amazonaws.com/NNNNNNN/rsync';
$waitsleep = 60; # check every minute
$waitcycles = 60; # for up to an hour
$ec2 = new AmazonEC2();
$startup = base64_encode(file_get_contents("startup.sh"));
$opts = array(
 'KeyName' => 'keyname',
 'InstanceType' => 't1.micro',
 'SecurityGroup' => 'secgroup',
 'Placement' => array('AvailabilityZone'=>'us-east-1d'),
 'UserData' => $startup,
);
$response = $ec2->run_instances('ami-XXXXXX', 1, 1, $opts);
$instanceid = $response->body->instancesSet->item->instanceId;
# wait for it to boot: poll every 10 seconds, up to $passes times
$dnsfinal = '';
$passes = 25;
$count = 0;
$running = false;
while ( !$running && $count < $passes ) {
  $count++;
  sleep(10);
  $resp = $ec2->describe_instances();
  $x = $resp->body->reservationSet->item->count();
  for ($i=0; $i<$x; $i++) {
    $inst = $resp->body->reservationSet->item[$i]->instancesSet->item;
    # only interested in the instance we just launched
    if ( (string) $inst->instanceId == (string) $instanceid
         && (string) $inst->instanceState->name == "running" ) {
      $dnsfinal = (string) $inst->dnsName;
      $running = true;
      break;
    }
  }
}
# if the machine never came up, report it, clean up and stop here
if ( !$running ) {
  alert("AWS instance did not reach the running state: " . $instanceid);
  $ec2->terminate_instances($instanceid);
  exit(1);
}
# the machine is up - attach the disk volume
$resp = $ec2->attach_volume($diskvol, $instanceid, $diskdev);
# The machine should have the command script, watch the queue and wait for something to happen
$sqs = new AmazonSQS();
for ($i=0; $i<$waitcycles; $i++) {
  sleep($waitsleep);
  $response = $sqs->receive_message($queue);
  $message = urldecode($response->body->ReceiveMessageResult->Message->Body);
  $handle = $response->body->ReceiveMessageResult->Message->ReceiptHandle;
  # got a message
  if ( strlen($message) > 0 ) {
    # all done - what happened?
    alert("AWS job completed, status: " . $message);
    $sqs->delete_message($queue, $handle);
    break;
  }
  if ( $i == ($waitcycles - 1)) {
    alert("AWS job did not complete within $waitcycles cycles.");
  }
}
# terminate this instance
$response = $ec2->terminate_instances($instanceid);
if (!$response->isOK()) {
  alert("Shutdown of machine failed: " . $instanceid);
}
function alert($msg) {
  global $notify;
  $ses = new AmazonSES();
  $message = array(
   "Subject" => array("Data"=>"AWS Job Status"),
   "Body" => array("Text"=> array("Data"=>$msg)),
  );
  $r = $ses->send_email('fromemailaddress', array('ToAddresses'=>$notify), $message);
}

The machine boots and receives the startup.sh script (passed in as user data above), which does the backup from the EC2 instance side using an AWS shell library:

#!/bin/bash
diskok=0
sshok=0
mkdir /mnt/backup
sleep 10
mount /dev/sdc1 /mnt/backup
sleep 10
if [ -d "/mnt/backup/lost+found" ]
 then
 diskok=1
 date > /mnt/backup/starttime
 else
 diskok=0
fi
curl https://s3.amazonaws.com/XYZ/aws -o /root/aws
echo YOURAWSCRED> ~/.awssecret
echo YOURAWSSEC>> ~/.awssecret
chmod 600 ~/.awssecret
perl ~/aws --install
if [ $diskok -eq 0 ]
 then
 aws send-message /NNNNNNNN/rsync -message "diskerror"
 exit 1
fi
mkdir /root/.ssh
cat > /root/.ssh/id_rsa <<id_rsaEOM
-----BEGIN RSA PRIVATE KEY-----
YOUR PRIVATE KEY 
-----END RSA PRIVATE KEY-----
id_rsaEOM
chmod 600 /root/.ssh/id_rsa
# known_hosts entry for yourhost.com (placeholder, like the private key above)
cat > /root/.ssh/known_hosts <<known_hostsEOM
YOUR HOST KEY
known_hostsEOM
if rsync -a -e 'ssh' backup@yourhost.com:/backup/daily.0 /mnt/backup/.
 then
 disksize=`df -h`
 aws send-message /NNNNNNN/rsync -message "complete $disksize"
 date > /mnt/backup/endtime
 else
 aws send-message /NNNNNNNN/rsync -message "error"
fi

There are probably a lot of improvements that could be made, but it is pretty handy. Depending on how much the data changes from one day to the next, this sometimes only takes a couple of minutes to run.
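
One possible improvement, sketched here as an assumption rather than something from the original setup: instead of the fixed sleep 10 calls in startup.sh, poll until the attached device actually shows up. The wait_for_path function below is a hypothetical helper; the retry count and delay are arbitrary defaults.

```shell
#!/bin/sh
# Hypothetical helper: wait_for_path PATH [TRIES] [DELAY]
# Checks for PATH up to TRIES times, sleeping DELAY seconds after each miss.
# Returns 0 as soon as PATH exists, 1 if it never appeared.
wait_for_path() {
  wfp_path=$1
  wfp_tries=${2:-30}
  wfp_delay=${3:-2}
  while [ "$wfp_tries" -gt 0 ]; do
    [ -e "$wfp_path" ] && return 0
    wfp_tries=$((wfp_tries - 1))
    sleep "$wfp_delay"
  done
  return 1
}

# Possible use in startup.sh (device name taken from the script above):
#   wait_for_path /dev/sdc1 30 2 || { echo "volume never appeared" >&2; exit 1; }
```

This keeps the fast case fast (the script continues as soon as the volume is attached) while still giving up cleanly if the attach never happens.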

Extending LVM partitions on a Linux VMware guest

Sometimes you need more space. If you’ve done what I’ve done and used the CentOS default partitions on VMware ESXi guests, and you suddenly find that a machine needs a larger disk, this is how I’ve done it.

  1. Make the disk larger in ESXi
  2. umount /backup
  3. vgchange -a n vg_backup
  4. echo "1" > /sys/block/sdb/device/rescan (may need to reboot)
  5. fdisk /dev/sdb
  6. Delete the partition that you want to make larger (assuming it is at the end of the disk), then add it back with the same starting cylinder and a new, larger ending cylinder.
  7. pvresize /dev/sdb1
  8. vgchange -a y vg_backup
  9. lvextend -l +100%FREE /dev/mapper/vg_backup-lv_backup
  10. lvdisplay (make sure it’s bigger)
  11. e2fsck -f /dev/mapper/vg_backup-lv_backup
  12. resize2fs /dev/mapper/vg_backup-lv_backup
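
If this is the root partition or you can’t unmount it, you can still do this, but you will have to skip the steps that take the volume offline (the umount and the vgchange -a n) and add a couple of reboots (after resizing the disk in ESXi and after the final resize2fs). Your mileage may vary.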

A better way I have since discovered is not to use LVM at all. Just use an ext4 partition, make the disk larger in ESXi, and then change the partition table and grow the filesystem in the guest OS.
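
For the LVM route, the commands from steps 7 to 12 above can be generated for review before anything is run. The sketch below only prints the commands; lvm_grow_plan is a hypothetical helper, and it assumes the volume group and logical volume names contain no hyphens (device-mapper doubles hyphens inside names, which this does not handle).

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the LVM part of the procedure.
# $1 = physical volume, $2 = volume group, $3 = logical volume.
# Assumes VG/LV names without hyphens, so /dev/mapper/<vg>-<lv> is correct.
lvm_grow_plan() {
  lgp_dev="/dev/mapper/$2-$3"
  cat <<EOF
pvresize $1
vgchange -a y $2
lvextend -l +100%FREE $lgp_dev
e2fsck -f $lgp_dev
resize2fs $lgp_dev
EOF
}

# Names taken from the steps above.
lvm_grow_plan /dev/sdb1 vg_backup lv_backup
```

Printing the plan first makes it easy to double-check the device names before pasting the commands into a root shell.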