Tuesday, November 13, 2007

Lab 4 Schema Clarifications

I came away from our lab 4 design session with a couple of wrong impressions regarding the schema, and Sam cleared them up for me yesterday. Here's a less ambiguous version of the message format we're supposed to use (I would have updated it on the lab page, but I'm not on campus at the moment).


<idea guid="">
  <initiated date="" technology="(wstechnology)">Name provided by user</initiated>
  <submitted date="" technology="(submit server tech)">RY name of submit server creator</submitted>
  <spam>true</spam>
  <domain>www.foo.com</domain>
  <body>Foo and gunk are better for this site than xs</body>
</idea>
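
As a rough sketch of how a submit server might assemble this message in PHP: the function and argument names below are hypothetical, and the date format and technology strings in the usage note are placeholders, since the lab doesn't pin them down here.

```php
<?php
// Escape text so characters such as & and < cannot break the XML.
function xmlEscape($text) {
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// $initiated and $submitted are arrays with 'date', 'technology' and 'name' keys
// (a hypothetical shape chosen for this sketch).
function buildIdeaMessage($guid, $initiated, $submitted, $isSpam, $domain, $body) {
    $xml  = sprintf("<idea guid=\"%s\">\n", xmlEscape($guid));
    $xml .= sprintf("  <initiated date=\"%s\" technology=\"%s\">%s</initiated>\n",
        xmlEscape($initiated['date']),
        xmlEscape($initiated['technology']),
        xmlEscape($initiated['name']));
    $xml .= sprintf("  <submitted date=\"%s\" technology=\"%s\">%s</submitted>\n",
        xmlEscape($submitted['date']),
        xmlEscape($submitted['technology']),
        xmlEscape($submitted['name']));
    $xml .= sprintf("  <spam>%s</spam>\n", $isSpam ? 'true' : 'false');
    $xml .= sprintf("  <domain>%s</domain>\n", xmlEscape($domain));
    $xml .= sprintf("  <body>%s</body>\n", xmlEscape($body));
    $xml .= "</idea>";
    return $xml;
}
```

Calling it with something like buildIdeaMessage('abc-123', array('date' => '2007-11-13', 'technology' => 'PHP', 'name' => 'Jane'), array('date' => '2007-11-13', 'technology' => 'PHP', 'name' => 'John'), true, 'www.foo.com', 'Foo & gunk') returns the message as a string, with the & in the body escaped.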


You should notice that the wsuser that we POSTed from our submit script is getting thrown away. I assumed that since we went to the trouble of POSTing it, we'd definitely use it-- and I ended up putting the user-provided name in the submitted element.

Looking back, I don't think we got our schema right. I think we're trying to cram too much information into too few elements. Sure, it keeps us from having to add an additional element (<user>, for example) to store the information; but the result is that we've lost some information (wsuser, which is less important) and made the format more confusing (which is more important).

Saturday, November 10, 2007

SQS: Queue Length / Auth Signature

To get the queue length, as well as the visibility timeout, you make a request using the GetQueueAttributes action. The PHP library I'm using to make my calls to SQS doesn't support this call (must have been written before the 2007-05-01 release of SQS) so my options are to find a new library, or to write my own function to do this.

I decided to try writing my own first, and while researching this I found something I had been looking for while doing lab 4: how to compute the authorization header, or Signature.

The process is as follows: you take the query parameters, sort them by key (ignoring case, as in the example below), and concatenate them all end to end (key preceding value). Don't include the ?, &, or = signs. Then you calculate the HMAC-SHA1 signature of that string (using your secret access key) and base64-encode the result.

Here's the example Amazon gives on their site.

The following request:

?Action=CreateQueue
&QueueName=queue2
&AWSAccessKeyId=0A8BDF2G9KCB3ZNKFA82
&SignatureVersion=1
&Expires=2007-01-12T12:00:00Z
&Version=2006-04-01


translates into the following string:

ActionCreateQueueAWSAccessKeyId0A8BDF2G9KCB3ZNKFA82Expires2007-01-12T12:00:00ZQueueNamequeue2SignatureVersion1Version2006-04-01

which when hashed with the secret key (fake-secret-key, used in this example) yields:

wlv84EOcHQk800Yq6QHgX4AdJfk=
(URL encoded version: wlv84EOcHQk800Yq6QHgX4AdJfk%3D)
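
Here is a minimal sketch of those steps in PHP. It assumes the hash extension is available (hash_hmac, PHP 5.1.2 or later) instead of the PEAR package; the function names are my own, not part of any SQS library.

```php
<?php
// Builds the string to sign: keys sorted case-insensitively, then each
// key immediately followed by its value, with no ?, & or = separators.
function buildStringToSign($params) {
    uksort($params, 'strcasecmp');
    $str = '';
    foreach ($params as $key => $value) {
        $str .= $key . $value;
    }
    return $str;
}

// HMAC-SHA1 of the string to sign, base64-encoded. The fourth argument
// asks hash_hmac() for raw bytes rather than a hex string.
function signRequest($params, $secretKey) {
    $raw = hash_hmac('sha1', buildStringToSign($params), $secretKey, true);
    return base64_encode($raw);
}

// The parameters from Amazon's example, in the order they appear in the request.
$params = array(
    'Action'           => 'CreateQueue',
    'QueueName'        => 'queue2',
    'AWSAccessKeyId'   => '0A8BDF2G9KCB3ZNKFA82',
    'SignatureVersion' => '1',
    'Expires'          => '2007-01-12T12:00:00Z',
    'Version'          => '2006-04-01',
);

$signature = signRequest($params, 'fake-secret-key');
echo buildStringToSign($params), "\n"; // the concatenated string
echo $signature, "\n";                 // base64 signature
echo urlencode($signature), "\n";      // form to place in the query string
```

The first line printed is the concatenated string quoted above. I haven't checked the signature itself against Amazon's published value, so treat that comparison as something to verify yourself.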


I looked at my PHP library, and sure enough here are the methods that create the signature. They require the PEAR Crypt_HMAC package.


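// Converts a hex-encoded digest to raw bytes, then base64-encodes them.
function hex2b64($str) {
  $raw = '';
  for ($i=0; $i < strlen($str); $i+=2) {
    // Each pair of hex digits is one byte.
    $raw .= chr(hexdec(substr($str, $i, 2)));
  }
  return base64_encode($raw);
}

// Computes the base64 HMAC-SHA1 signature of $str using the secret access key.
function constructSig($str) {
  // Plain assignment: "=& new" is deprecated and fails to parse on newer PHP.
  $hasher = new Crypt_HMAC($this->secretKey, "sha1");
  $signature = $this->hex2b64($hasher->hash($str));
  return($signature);
}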
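
The hex2b64 step exists because the digest comes back as a hex string, while the signature has to be the base64 of the raw bytes. As a small sketch (assuming the hash extension's hash_hmac is available, and using a made-up input string), the hex route and the raw route give the same answer:

```php
<?php
// Standalone version of the conversion: hex digest -> raw bytes -> base64.
function hex2b64($hex) {
    return base64_encode(pack('H*', $hex));
}

$stringToSign = 'ActionCreateQueue'; // hypothetical input for illustration
$secretKey    = 'fake-secret-key';

// Route 1: hex digest, converted by hand.
$viaHex = hex2b64(hash_hmac('sha1', $stringToSign, $secretKey));

// Route 2: ask hash_hmac() for raw bytes directly.
$viaRaw = base64_encode(hash_hmac('sha1', $stringToSign, $secretKey, true));

var_dump($viaHex === $viaRaw); // bool(true)
```

So if the PEAR package isn't installed, the second route is probably the shorter path to the same signature.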

Lab 5 : Approval Client

I'm just about finished with my approval client. The things I still need to do are:

A) figure out how to get a count of the number of items in the queue (my library doesn't support that function)
B) Make my SOAP calls to the WHOIS service (is there a specific service we're supposed to be using for this?)

I've got my prototype running here. I'm using Flex for the front end, with data provided by a PHP backend. My plug for Flex follows...

Based on anecdotal evidence (i.e., conversations I've had) I think that Adobe Flex is one of the most misconstrued technologies in our department. I just wanted to take a few lines and address some of the misconceptions I've heard as I've discussed Flex with fellow students.

Things you've probably heard about Adobe Flex:
1. It's proprietary.
2. It uses Flash.
3. It costs hundreds of dollars for the compiler/IDE.
4. Flex data services costs several thousand dollars per processor to deploy.

While there is some truth to all the statements above, the bottom line with regard to cost is this: Flex is free.

You can download the free SDK (incidentally, all you need to compile and deploy Flex applications) here

If you'd like an IDE, go here to download a free academically licensed version of Flex Builder 2, with Charting. You'll need to provide some identification.

Follow links here to download Flex Data Services Express (licensed free of charge on up to 1 processor). I should mention that I've used data in all the Flex applications I've developed, and I have yet to try Flex Data Services. There is a wide variety of options for getting data to your application. I've used Java web services, PHP pages returning XML, AMF (ActionScript's serialized data format) streams, and a couple of others.

And yes, it does run in a Flash player (the resemblance to Flash ends there), but that does have its advantages. As long as your platform has a Flash player, your application will look and function exactly the same on Linux, a Mac, a Windows PC, or whatever. And as far as RIAs go, that's saying something.

All things accounted for, it's cheaper to develop in Flex than it is to develop in AJAX. And the applications look better consistently across platforms. So if "cost" and "Flash" are the only things holding you back from checking it out, you really ought to look into it.

Thursday, November 8, 2007

Lab 4 - Submit Server

Once again, I got my best advice at the end of the lab. I'll share it with you-- use a library when trying to interact with the SQS.

The first approach I tried that failed was using cURL to send the PUT request. You can see the code I used for that in my previous POST. Although it works in general, I wasn't having much success using it with SQS. Perusing the SQS documentation a little more closely revealed that I was not sending the correct headers. Here's a link to it.

I didn't find that link until I had given up on cURL and switched to PEAR's HTTP_Request package. As far as general purpose HTTP request packages go, the interface is much easier to use. I added most of the headers, but was having trouble formatting the Authorization header correctly. That's when I got the advice to use a library.

I checked out a couple of SQS libraries for PHP; the one I ended up going with was one I found on Amazon's site. The documentation is wanting, but I was able to figure out what to do. It makes use of the PEAR extensions. I didn't have to make the changes to the PEAR code that the site suggests (I don't know if that's because the code was already correct, or because I wasn't exercising the faulty code).

Here's the code I used to make the request:

function submitToQueue($xml) {
  $q = new sqs('ACCESS-KEY', 'SECRET-KEY', 'http://queue.amazonaws.com/');
  $queueId="A3N3IV5XJH079S/processing";
  $q->putMessage($xml, $queueId, 1000);
}


And that was it.

Saturday, November 3, 2007

GET / POST / PUT Using PHP

The simplest way by far to do a GET in PHP (if you just want the returned contents and don't care about the headers) is to use file_get_contents. It's useful for quickly getting the contents of a file or of a web page. For example, this function from lab 2 retrieves data from S3.


function queryS3($path) {
  $contents = file_get_contents('http://s3.amazonaws.com/cs462-data/' . $path);

  return $contents;
}
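
One thing the function above doesn't show is failure handling: file_get_contents returns false (and raises a warning) when the read fails. A hedged sketch of a wrapper, with a hypothetical name, that turns a failed read into null:

```php
<?php
// Returns the contents at $location, or null if the read failed.
function fetchOrNull($location) {
    // @ suppresses the warning PHP raises on failure; the result is checked instead.
    $contents = @file_get_contents($location);
    return ($contents === false) ? null : $contents;
}

// Demonstration with a local temporary file, so nothing here touches the network.
$tmp = tempnam(sys_get_temp_dir(), 'get');
file_put_contents($tmp, 'hello');

$found   = fetchOrNull($tmp);
$missing = fetchOrNull($tmp . '.does-not-exist');
unlink($tmp);
```

The same wrapper accepts a URL such as the S3 address above, provided allow_url_fopen is enabled in php.ini.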



For an HTTP POST, I use the cURL library. If you've never used cURL before, there's a slight learning curve for doing a POST. I was able to figure it out after looking at a couple of samples (the PHP documentation isn't much help). The key is creating an associative array from your POST data fields.

Here's an example from Lab 3, where my submit script (which is actually a Submit object) finally submits the idea to the submit server.


function sendToSubmitAppServer() {
  $url = "http://sslb-p.webappwishlist.com:8080/submit";
  $useragent="Johns Web Service Server, version 7";

  // The POST fields, keyed by field name.
  $data = array();
  $data['domain'] = $this->domain;
  $data['name'] = $this->name;
  $data['idea'] = $this->idea;

  $ch = curl_init();
  curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
  curl_setopt($ch, CURLOPT_URL,$url);
  curl_setopt($ch, CURLOPT_POST, 1);
  curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
  // Without CURLOPT_RETURNTRANSFER, curl_exec() prints the response
  // and returns true or false.
  $result = curl_exec ($ch);
  curl_close ($ch);
}
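
A detail worth knowing about the associative array: when CURLOPT_POSTFIELDS is given an array, cURL sends the body as multipart/form-data. If the receiving server expects an ordinary URL-encoded form body, you can encode the array into a string first. A small sketch with made-up field values:

```php
<?php
// Hypothetical field values for illustration.
$data = array(
    'domain' => 'www.foo.com',
    'name'   => 'John Doe',
    'idea'   => 'Foo & gunk',
);

// Encode as application/x-www-form-urlencoded; the separator is passed
// explicitly so the result does not depend on php.ini settings.
$encoded = http_build_query($data, '', '&');

echo $encoded, "\n"; // domain=www.foo.com&name=John+Doe&idea=Foo+%26+gunk
```

Passing $encoded instead of $data to CURLOPT_POSTFIELDS should then make cURL send a URL-encoded form body.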


Doing HTTP PUT requests was a little more tricky, because PHP doesn't offer a straightforward way of specifying the body of a PUT request, specifically where the body comes from a local string variable. I found a solution here (it basically says all the stuff I just said, with a code example). Here's my code example from lab 4, where I'm sending the XML data file to SQS. I can't be sure that it works, as I don't know how to test whether something was sent to the queue. I'll update this example if I find any errors during the next couple of days.


function submitToQueue($xml) {
  $url = "http://queue.amazonaws.com/A3N3IV5XJH079S/processing";
  $useragent="Distributed Systems/v3.4 (compatible; Mozilla 7.0; MSIE 8.5; http://classes.eclab.byu.edu/462/)";

  // Copy the XML into an in-memory stream so cURL can read it like a file.
  $fh = fopen('php://memory', 'rw');
  fwrite($fh, $xml);
  rewind($fh);

  $ch = curl_init();
  curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
  curl_setopt($ch, CURLOPT_INFILE, $fh);
  curl_setopt($ch, CURLOPT_INFILESIZE, strlen($xml));
  curl_setopt($ch, CURLOPT_TIMEOUT, 10);
  curl_setopt($ch, CURLOPT_PUT, 1);
  curl_setopt($ch, CURLOPT_URL, $url);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

  $result = curl_exec($ch);
  curl_close($ch);
  fclose($fh);
}


It opens a memory stream as it would a file stream, and then uses CURLOPT_INFILE to specify the data to be sent in the body. I ran into a situation like this before (not while using PUT), which I solved by actually writing data to a file and then passing the file back in. Talk about a hack...
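
A different technique that avoids the stream altogether is to set the request method with CURLOPT_CUSTOMREQUEST and hand the string to CURLOPT_POSTFIELDS. This is only a sketch: the function names are hypothetical, the Content-Type header is my assumption, and I haven't tried it against SQS.

```php
<?php
// Returns the cURL options for a PUT whose body is an in-memory string.
function putOptions($url, $body) {
    return array(
        CURLOPT_URL            => $url,
        CURLOPT_CUSTOMREQUEST  => 'PUT',   // override the HTTP method
        CURLOPT_POSTFIELDS     => $body,   // a string is sent as-is as the body
        CURLOPT_HTTPHEADER     => array('Content-Type: text/xml'), // assumed
        CURLOPT_RETURNTRANSFER => true,    // return the response instead of printing it
        CURLOPT_TIMEOUT        => 10,
    );
}

// Sends the PUT and returns the response body, or false on failure.
function putString($url, $body) {
    $ch = curl_init();
    curl_setopt_array($ch, putOptions($url, $body));
    $result = curl_exec($ch);
    curl_close($ch);
    return $result;
}
```

Keeping the options in their own function makes it easy to inspect what will be sent without actually making a request.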

Friday, November 2, 2007

Updating PHP on Fedora Core 4

Using Amazon's standard EC2 images as the base image for your servers probably means you're going to be running Fedora Core 4, with outdated versions of PHP and MySQL (in my case, PHP 5.0.4). I've run into several problems with this, as I've wanted to use some of PHP's more advanced functionality that is either not included by default with PHP 5.0 or not included at all.

An example of functionality not included by default would be JSON, and an example of functionality not included at all would be memory-based streams. Both have had, or are having, applications to this lab. I finally decided I needed to have PHP 5.2 installed on my servers. The only problem is that none of the repositories configured by default have PHP 5.2 packages for Fedora 4 (this is what has hindered me in the past).

I finally solved my problem after stumbling upon this site: remi.collet.free.fr (I knew my 3 years of French education in high school would pay off some day) that has a repository with lots of packages including update packages for MySQL and PHP for FC4.

I'll outline the steps I followed to update PHP 5, and include the commands (which happen to be included on the above site, here, although it may take a few minutes to find them if you don't know French).

1. I downloaded (using wget) the repository configuration file to the repository configuration directory.

cd /etc/yum.repos.d/
wget http://remi.collet.free.fr/rpms/remi-fedora.repo

2. I made a yum call, enabling the repository in the process (as it's disabled by default).

yum --enablerepo=remi install php-5.2.4

3. I restarted Apache.

apachectl restart

And that's it. I now have PHP 5.2.4 running on Fedora 4. I'm sure that an expert could have accomplished it some other way, but as a relative n00b, I have to admit I quite rely on yum.

Thursday, November 1, 2007

Lab 3 - List App/Web Server Integration

I got off to an early start on lab 3, and had most of my code working in a couple hours. The thing that hung me up for two weeks was trying to figure out how the step "Register with the SSLB" fit into this lab. I finally asked Sam about it and he told me it was a mistake. Looking back I realize that I deserved to wallow around in confusion for not having asked the question sooner.

I used PHP again; I think I've pretty much given up on Python this semester... there's always next semester. The template engine I've been using, Smarty, is pretty powerful. Not that I need to exercise its full power for this project, but I really like it. I actually like it so much that I've switched to it from XSLT on another project I'm working on.

One of the other challenges I had during this lab was figuring out how to do the URL rewrites, as the keyword /submit had to go to the submit page, and /everything-else had to go to the idea list for the domain 'everything-else'. I'm sure there are Perl gurus out there who could have whipped up a regex in 5 seconds to handle that... unfortunately I'm not one of them. What I did was rewrite all URLs to go to a driver script that parsed the original URL, and then used PHP objects to generate the appropriate response in each case. These PHP objects were converted from the scripts I originally planned to redirect to for each action.
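
The driver-script idea can be sketched roughly like this. The function name, the returned array shape, and the 'home' case are hypothetical, not my actual driver:

```php
<?php
// Maps a request URI to an action. '/submit' is the submit page; any other
// first path segment is treated as the domain whose ideas should be listed.
function routeRequest($requestUri) {
    // Keep only the path, dropping any query string.
    $path     = parse_url($requestUri, PHP_URL_PATH);
    $segments = explode('/', trim((string) $path, '/'));
    $first    = $segments[0];

    if ($first === 'submit') {
        return array('action' => 'submit', 'domain' => null);
    }
    if ($first === '') {
        // Bare "/" request; what to show here is an assumption.
        return array('action' => 'home', 'domain' => null);
    }
    return array('action' => 'list', 'domain' => $first);
}
```

In the driver script, this would be called as routeRequest($_SERVER['REQUEST_URI']), with the result used to pick which PHP object builds the response.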

Testing this lab also turned up a bug in my listing app server. It was a subtle bug that manifested itself by returning only 1 idea for a given domain (even in cases where there was more than 1). Once I identified it, I feared the worst. It took me 30 minutes to track down the source, which turned out to be a missing '$' on my loop variable (which was used to address an array). I won the book in class for the PHP quiz (for identifying that MySQL wasn't enabled by default in a PHP installation)... but I don't know enough to understand why $myarray[i] (should have been $myarray[$i]) didn't cause a more visible error. I'll have to check my error reporting settings in my php.ini file.
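
For what it's worth, PHP 5 treats a bare i as an undefined constant and quietly substitutes the string 'i', so $myarray[i] reads $myarray['i']. That key doesn't exist, so the result is null and only a notice is raised (newer PHP versions reject the bare constant outright). A sketch of one way to make this kind of mistake loud, using a hypothetical handler name and reading the 'i' key directly to stand in for the bug:

```php
<?php
// Report everything, including notices.
error_reporting(E_ALL);

// Turn any reported notice or warning into an exception.
function throwOnError($severity, $message, $file, $line) {
    throw new ErrorException($message, 0, $severity, $file, $line);
}
set_error_handler('throwOnError');

$ideas  = array('first idea', 'second idea');
$caught = null;

try {
    // 'i' stands in for what PHP 5 makes of the bare constant i.
    $idea = $ideas['i'];
} catch (ErrorException $e) {
    $caught = $e->getMessage();
}

restore_error_handler();
echo $caught, "\n"; // the undefined-index message, now impossible to miss
```

With a handler like this in place during development, the missing '$' would likely have stopped the page on the first request instead of silently returning one idea.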

I have to say I've been having a great experience in this class, overall. The greater emphasis on architecture and design, and lesser emphasis on KLOCs has been an effective approach. I can think of 2 other CS classes (off the top of my head) I have taken that could benefit from this model.