Ryan Kanno: The diary of an Enginerd in Hawaii

Everything you’ve ever thought, but never had the balls to say.

Archive for the ‘HowTo’ Category

Backing up your Subversion (SVN) repository on Dreamhost with cron

Two events spurred me to write this blog.

First, my 2-year-old “Subversion + Dreamhost + Post-Commit” blog post still gets quite a number of hits. Second, after the latest Dreamhost outage move, I’m beginning to feel a little more vigilant about backing up my data.

As a standard disclaimer, if you’re not familiar with the Unix shell, I highly suggest you not try this unless under the supervision of someone who reads Perl books for fun. By accessing your Dreamhost shell, you can seriously f-up your account and I will not fix it for you. You have been warned. :) (Don’t you just love smileys?)

Setup

There are a few prerequisites to being able to back up your SVN repository.

  1. First and foremost, you must have already installed a SVN repository into your Dreamhost account via the control panel.
  2. Second, you must know how to SSH into your Dreamhost account. As an FYI, you sorta-kinda-need to know what that means in order to follow this tutorial.

Grabbing the backup script

Wait, you didn’t think I was writing my own, right? In any case, if you actually installed/compiled Subversion on your own, it would’ve contained this file, hot-backup.py. Fortunately for us, Dreamhost has this file conveniently available at /usr/bin/svn-hot-backup, but it’s an older version of the backup script. There are some subtle differences, like being unable to pass in the number of backups you want the script to manage. Personally, I like to be on the edge, so let’s get the latest version. Execute the following commands from your home directory.

$ cd ~
$ mkdir scripts
$ cd scripts
$ wget http://svn.collab.net/repos/svn/trunk/tools/backup/hot-backup.py.in
$ mv hot-backup.py.in svn-hot-backup.py

The commands issued above created a directory called scripts in your home directory, switched into the directory, downloaded the latest hot-backup.py file from CollabNet, and renamed it to svn-hot-backup.py. Now that you have the file, you’ll need to make a few edits. Personally, I’m accustomed to vi, but pick your poison (pico, nano, text editor of your choice) and find these two values (they should be close to the top of the file in consecutive lines).

# Path to svnlook utility
svnlook = r"@SVN_BINDIR@/svnlook"

# Path to svnadmin utility
svnadmin = r"@SVN_BINDIR@/svnadmin"

and change them to the following:

# Path to svnlook utility
svnlook = r"/usr/bin/svnlook"

# Path to svnadmin utility
svnadmin = r"/usr/bin/svnadmin"

(If you’re wondering, if and when you compile/install Subversion yourself, these two variables would have been automagically filled in for you.)
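If you’d rather not hand-edit the file, the same substitution can be sketched in a few lines of Python. This is only a sketch: the function name fill_svn_bindir is mine, /usr/bin is the Dreamhost location used above, and it’s written for a current Python 3 rather than the python2.4 on the Dreamhost box.

```python
def fill_svn_bindir(script_text, bindir="/usr/bin"):
    """Replace every @SVN_BINDIR@ placeholder with a concrete directory."""
    # Strip a trailing slash so we never produce "/usr/bin//svnlook".
    return script_text.replace("@SVN_BINDIR@", bindir.rstrip("/"))


# The two lines as they appear in the downloaded template.
template = (
    'svnlook = r"@SVN_BINDIR@/svnlook"\n'
    'svnadmin = r"@SVN_BINDIR@/svnadmin"\n'
)
print(fill_svn_bindir(template))
```

To apply it for real, you would read svn-hot-backup.py, pass its contents through the function, and write the result back.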

The Python script we downloaded not only performs a hotcopy of your svn directory, but can also archive it and manage a set number of copies. Pretty neat, right?
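To give a feel for what “manage a set number of copies” means, here is a rough sketch of that kind of pruning. It is not the code from hot-backup.py; the function name, the REPO-REVISION.zip naming pattern, and the rule that 0 or less means “keep everything” are assumptions I made for illustration.

```python
import re


def backups_to_delete(filenames, repo_name, num_backups):
    """Return the oldest archives beyond the newest `num_backups`."""
    if num_backups <= 0:
        return []  # assumption: 0 or less means "never prune"
    pattern = re.compile(r"^%s-(\d+)\.zip$" % re.escape(repo_name))
    found = []
    for name in filenames:
        match = pattern.match(name)
        if match:
            # Sort on the numeric revision, not on the string.
            found.append((int(match.group(1)), name))
    found.sort()  # oldest revision first
    return [name for _, name in found[:-num_backups]]


names = ["repo-1.zip", "repo-10.zip", "repo-2.zip", "other-5.zip"]
print(backups_to_delete(names, "repo", 2))
```

Sorting on the integer revision matters here: sorted as plain strings, repo-10.zip would land before repo-2.zip and the wrong archive would be pruned.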

Preparing for the backups

Before you can actually back up your SVN repository, you’ll want to create a directory structure to manage your backups. Execute the following commands from your home directory.

$ cd ~
$ mkdir backup
$ cd backup
$ mkdir svn
$ cd ~/scripts

The commands issued above created a directory called backup in your home directory, switched into the directory, and created another directory called svn within the backup directory. We’ll be using this directory to store all your backups. Finally, we switched back into the scripts directory created in the previous steps. Now that we have the backup script and directory structure to manage the backups, let’s test it out!

Before you can back up your repository, you’ll have to know the name of the Subversion repository you’re trying to back up. To find the name of your repository, you can either look in the svn directory in your home directory, or you can check out the ID value in your Subversion Goodies control panel. In any case, remember the name of your SVN repository and issue the following commands.

$ cd ~/scripts/
$ python2.4 svn-hot-backup.py --archive-type=zip --num-backups=10 ~/svn/REPOSITORY_NAME_HERE/ ~/backup/svn/

Note: change the value of REPOSITORY_NAME_HERE to the ID of the SVN repository you want backed up.

You should see the following in the console:

Beginning hot backup of '/home/USERNAME/svn/REPOSITORY_NAME_HERE/'.
Youngest revision is REVISION_NUMBER
Backing up repository to '/home/USERNAME/backup/svn/REPOSITORY_NAME_HERE-701'...
Done.
Archiving backup to '/home/USERNAME/backup/svn/REPOSITORY_NAME_HERE-701.zip'...
Archive created, removing backup '/home/USERNAME/backup/svn/REPOSITORY_NAME_HERE-701'...

If you see output like the above, the backup was a success! (The 701 in the file names was my youngest revision at the time; yours will differ.) You can even check on the file by changing into the backup/svn directory!
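If you want a slightly stronger check than eyeballing the directory, something like the following sketch can confirm the archive is actually a readable zip. The function name archive_is_healthy is mine, and this uses Python 3’s standard zipfile module, not anything that ships with the backup script.

```python
import zipfile


def archive_is_healthy(path):
    """Return True if `path` is a non-empty zip whose members pass their CRC checks."""
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        # testzip() returns the name of the first corrupt member, or None.
        return archive.testzip() is None and len(archive.namelist()) > 0
```

You would call it with the path printed on the “Archiving backup to” line. It only proves the zip is intact, not that the repository inside is restorable, so an occasional test restore is still worth doing.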

Voila! (But there’s more)

Automating the backups

Now that you actually have the script backing up your SVN repository, let’s automate them! To do so, we’ll use the handy cron daemon. Cron has similarities to the Windows task scheduler in that it provides a service that enables a user to execute commands at a specified date/time or set intervals. To tell cron the tasks you want to execute, you’ll need to load a configuration file called a crontab. You can read more about it here and here. In any case, here’s what my crontab configuration file looks like.

MAILTO=ryankanno@CHANGE_TO_YOUR_EMAIL.com
# minute (0-59),
# |      hour (0-23),
# |      |       day of the month (1-31),
# |      |       |       month of the year (1-12),
# |      |       |       |       day of the week (0-6 with 0=Sunday).
# |      |       |       |       |       commands
  0      0       *       *       *      /usr/bin/python2.4 /home/USERNAME/scripts/svn-hot-backup.py --archive-type=zip --num-backups=10 /home/USERNAME/svn/REPOSITORY_NAME/ /home/USERNAME/backup/svn/

Create a file in your scripts directory called svn_backup_once_a_day.cron and copy the contents above into your file. I’ve set up my crontab to back up my svn repository once a day, at midnight.

Note: change the value of ryankanno@CHANGE_TO_YOUR_EMAIL.com to your email address (or comment the line out with a # if you don’t want emails sent to you), USERNAME to your Dreamhost username, and REPOSITORY_NAME to your Subversion repository.
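If the five leading columns look cryptic, here is a small sketch that splits a crontab entry into its named fields. The names FIELD_NAMES and parse_cron_line are my own, and it only handles plain entries like the one above (no MAILTO lines, no comments).

```python
FIELD_NAMES = ("minute", "hour", "day_of_month", "month", "day_of_week")


def parse_cron_line(line):
    """Split a crontab entry into (schedule dict, command string)."""
    # Split on whitespace at most five times so the command stays in one piece.
    parts = line.split(None, 5)
    if len(parts) != 6:
        raise ValueError("expected five time fields followed by a command")
    schedule = dict(zip(FIELD_NAMES, parts[:5]))
    return schedule, parts[5]


entry = "  0      0       *       *       *      /usr/bin/python2.4 /home/USERNAME/scripts/svn-hot-backup.py --num-backups=10"
schedule, command = parse_cron_line(entry)
print(schedule)
print(command)
```

Minute 0 and hour 0 with a * in the remaining three fields reads as “at midnight, every day of every month”.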

Once you have this file called svn_backup_once_a_day.cron in your scripts directory, load the file into your crontab by issuing the following command:

$ crontab svn_backup_once_a_day.cron

As an FYI, this will replace your old crontab. If you have other items already running on cron, it’s a good idea to list them via the crontab -l command first and carry them over into the new file. If you want to make sure that your cron will run, you can test it by temporarily setting the minute and hour fields to a time a few minutes from now. I’ll leave this as an exercise to the reader. :)
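Since clobbering an existing crontab is the easiest way to hurt yourself here, this is a hedged sketch of merging instead of replacing. The names merge_crontab and install_entry are mine; install_entry assumes your cron implementation accepts “crontab -” to read from standard input, which is common but worth checking with man crontab first. It needs Python 3.7 or newer.

```python
import subprocess


def merge_crontab(existing_text, new_entry):
    """Append `new_entry` to the existing crontab text unless it is already there."""
    lines = existing_text.splitlines()
    entry = new_entry.strip()
    if entry not in [line.strip() for line in lines]:
        lines.append(entry)
    return "\n".join(lines) + "\n"


def install_entry(new_entry):
    """Read the current crontab, merge in one entry, and load the result."""
    current = subprocess.run(["crontab", "-l"], capture_output=True, text=True)
    # crontab -l exits non-zero when there is no crontab yet.
    existing = current.stdout if current.returncode == 0 else ""
    merged = merge_crontab(existing, new_entry)
    # Assumption: "crontab -" reads the new table from stdin.
    subprocess.run(["crontab", "-"], input=merged, text=True, check=True)
```

Only merge_crontab is safe to play with freely; install_entry really does rewrite your crontab when you call it.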

Storing your backups

Though out of scope of this blog, you’ll still have to store your backups somewhere. Please just don’t leave them in your Dreamhost account. Your best bet is probably to get an Amazon S3 account and store your backups there. Personally, I like to run another script immediately after the hotcopy finishes that pushes the backup to my S3 account. Other options include scp/sftp’ing the backups to your home machine. Here’s a link to read more about that option.
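As a rough idea of what a “push it somewhere else” step could look like, here is a sketch that finds the newest archive and builds an scp command for it. This is not the script I run; the function names and the me@home.example.com destination are placeholders, and it assumes you already have key-based SSH set up to the other machine.

```python
import os


def newest_archive(backup_dir, suffix=".zip"):
    """Return the most recently modified archive in `backup_dir`, or None."""
    candidates = [
        os.path.join(backup_dir, name)
        for name in os.listdir(backup_dir)
        if name.endswith(suffix)
    ]
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)


def scp_command(archive_path, destination):
    """Build the argument list for copying one archive off the box."""
    # destination is hypothetical, e.g. "me@home.example.com:backups/"
    return ["scp", archive_path, destination]
```

Passing the list from scp_command to subprocess.run(..., check=True) right after the hotcopy finishes would do the actual copy.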

Voila! Enjoy!



Upgrading your DVR: How to increase your DVR’s recording time

This blog is for all my Hawaii television addicts.

Since I rarely have time to watch live television, my Oceanic Time Warner DVR is constantly filled to max capacity. This means I’m always battling my inner demons on what shows I have to erase… Rock of Love, A Shot at Love, Flavor of Love… you know, all the good stuff. To solve my problem, I’ve finally decided to invest the $150 to upgrade my DVR and increase its total number of recording hours.

Luckily for you, I’ll walk you through the steps to upgrade your own DVR!

As a standard disclaimer, if you attempt to upgrade your own DVR and f-it up, I can’t and won’t fix it. So… if technology scares you, please parents, do not try this unless supervised by your technology-oriented youngster. If you don’t understand what SATA, external enclosures, or hard drives mean, do not, and I repeat do not try this at home!

The setup

Before you can upgrade your DVR, you’ll need to make sure that you have the Scientific Atlanta Explorer 8300HD. Just match what your DVR looks like to the one in the picture. It’s not that hard. This is what mine looks like: the front and the back. I do know for a fact that Oceanic has a few versions of their cable boxes out in the wild. I’m pretty sure you can upgrade (some of) the other models as well, but I’ve personally only upgraded the 8300HD. So if you want to be ballsy and upgrade a different cable box, feel totally free - just be warned that this guide won’t apply to you. I’m not even sure if you can still turn in your old cable box because of the demand for HDTV in Hawaii, but calling up Oceanic can’t hurt.

Aside from owning an 8300HD, you’ll need three additional components to make this upgrade work. I’ve included links to where I purchased the following items. Fear not, I don’t make any commissions on these links so feel free to buy these products from anywhere you see fit.

Here are a few pictures of the aforementioned items.

[Pictures: external SATA enclosure; Maxtor SATA 500 GB hard drive; eSATA to SATA cable; everything unpacked]

The results

First, make sure your 8300HD is turned off. Place the hard drive into the external enclosure. Next, after connecting the external SATA enclosure to the 8300HD (with the SATA to eSATA cable), power the external hard drive before turning the DVR box back on. Note, it’s extremely important that the external SATA enclosure be turned on prior to the cable box being powered on. Once booted, the 8300HD should recognize a new, external data source and prompt you to format the new drive. The following message should appear:

[Picture: format hard drive prompt]

Once formatted, you should see a success message:

[Picture: format success!]

Voila! DVR Upgraded!

The benes

There are numerous benefits to increasing your DVR’s total recording time.

  • No more having to rush home because you forgot the DVR is full.
  • No more making those life-altering decisions about what movies to delete.
  • Being able to store almost a year’s worth of reality crap is fun!

Of course, there’s the almost 4X increase in the DVR’s recording time as you can see by the following before and after pictures. Not bad!

[Pictures: before upgrade; after upgrade]

The cons

There’s no such thing as a free pass in life… so here are a few of the cons.

  • As I wrote earlier, the external hard drive needs to be powered on before your cable box. This means one of two things. Either you always turn the external drive on first or leave it on permanently. Since I know I could never remember to do the former, I’ve decided to leave the device on permanently - meaning a slightly larger electricity bill. As someone trying to get off the grid, that makes me sad.
  • You can’t rip the recorded video off the external hard drive. Unfortunately, the data is encrypted. Unless you’re a cryptographic expert, worked on the 8300HD, or have a few Beowulf clusters, deal with it. You won’t be able to share your recordings.
  • 150 bucks is a lot to spend on easing one’s mind, but I think it’s money well spent considering the prices here and here.

Some linkage

Of course I couldn’t have upgraded my DVR without the Internet. Here’s a link to the forums and guides I read to assist me along the way. Check them out, some of them are quite interesting.

Finally, check out my flickr set if you need to see any more pictures!

Enjoy!



Using the extra() QuerySet modifier in Django for WeGoEat

Since I actually used this method to reduce the number of “explicit” SQL calls made in WeGoEat, I figured I’d write a little blog explaining the context in which it was used, and maybe, just maybe, it’ll help shed some light on how others can take advantage of this neat little function.

Background

As a Django “proof-of-concept”, I’m working on a local restaurant review site for my home state of Hawai`i. (I actually just released it yesterday). For each restaurant, I want to be able to calculate the average of all reviews and display this listing in a paginated view. (Yes, I do realize there’s no average rating, but that has to do with there being no users. ;P).

The Problem

Having a serious “wtf was I thinking” moment, I initially wrote a Restaurant model function that returned the average (review) rating for each restaurant instance. Little did I realize that when I actually displayed the restaurants’ average ratings, I would be making an additional SQL avg() call for every restaurant. Though I’m paging “n” records at a time, this function added “n” extra SQL calls to every view that contained a restaurant listing.

In pseudo-code, my initial naive function resembled the following: (I’m sure we’re all guilty of writing something of the sort… ok, fine, I know I was. ;P)

    from django.db import connection

    def get_average_review(self):
        # Naive version: runs one extra AVG() query per restaurant instance.
        query = 'QUERY TO GET AVERAGE (SELECT AVG(rating)...)'  # I have the real query below
        cursor = connection.cursor()  # get a cursor from the connection
        cursor.execute(query)
        return cursor.fetchall()

Duh.
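To make the cost concrete without dragging in Django, here is a toy reproduction using Python’s built-in sqlite3 module. The table names, restaurant names, and ratings are made up for the demo and are not the WeGoEat schema; the point is only the query count.

```python
import sqlite3


def build_demo_db():
    """Create a throwaway in-memory database with three restaurants."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE restaurant (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE review (restaurant_id INTEGER, rating REAL);
        INSERT INTO restaurant VALUES (1, 'Aloha Grill');
        INSERT INTO restaurant VALUES (2, 'Banzai Bowl');
        INSERT INTO restaurant VALUES (3, 'Coco Cafe');
        INSERT INTO review VALUES (1, 4);
        INSERT INTO review VALUES (1, 5);
        INSERT INTO review VALUES (2, 3);
    """)
    return conn


def naive_listing(conn):
    """Fetch the list, then run one AVG() query per row, counting queries."""
    queries = 1
    rows = conn.execute("SELECT id, name FROM restaurant ORDER BY id").fetchall()
    listing = []
    for restaurant_id, name in rows:
        avg = conn.execute(
            "SELECT AVG(rating) FROM review WHERE restaurant_id = ?",
            (restaurant_id,),
        ).fetchone()[0]
        queries += 1  # one more round trip for every restaurant on the page
        listing.append((name, avg))
    return listing, queries


listing, queries = naive_listing(build_demo_db())
print(listing, queries)
```

With three restaurants that is four queries: one for the list plus one per row, which is exactly the “n” extra calls described above.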

Here’s a picture of the number of queries it took:

[Picture: “Duh”]

The “extra()” solution

After profiling my application and realizing what a bone-headed mistake I made, I began researching the extra() QuerySet modifier. Yes, I realize that these extra lookups aren’t the most portable and often violate the DRY principle, but it’ll probably suffice for most of my personal projects. :)

Since I’m already retrieving a list of Restaurants and filtering them via letter, island, and what not, I figured I could add an average rating subquery. The entire call looks as such:

    restaurants = Restaurant.objects.filter(name__istartswith=letter).extra(
        select={'avg_rating': """SELECT AVG(overall_rating)
                                 FROM restaurants_restaurant AS res, reviews_review, django_content_type
                                 WHERE restaurants_restaurant.id = res.id
                                 AND res.id = reviews_review.object_id
                                 AND reviews_review.content_type_id = django_content_type.id
                                 AND django_content_type.model = 'restaurant'"""},
    )

As you can see, I’m exploiting the fact that restaurants_restaurant will be available from the Restaurant.objects.filter() call. (I know, I know… bad for portability).

But voila!
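The trick underneath is just a correlated subquery: the inner SELECT refers to the row the outer query is currently on. Here is a standalone sketch with Python’s sqlite3 module; the tables and data are invented for the demo and are much simpler than the real restaurant/review/content-type tables.

```python
import sqlite3


def build_demo_db():
    """Create a throwaway in-memory database with three restaurants."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE restaurant (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE review (restaurant_id INTEGER, rating REAL);
        INSERT INTO restaurant VALUES (1, 'Aloha Grill');
        INSERT INTO restaurant VALUES (2, 'Banzai Bowl');
        INSERT INTO restaurant VALUES (3, 'Coco Cafe');
        INSERT INTO review VALUES (1, 4);
        INSERT INTO review VALUES (1, 5);
        INSERT INTO review VALUES (2, 3);
    """)
    return conn


def listing_with_subquery(conn):
    """Fetch names and average ratings in a single query."""
    sql = """
        SELECT r.name,
               (SELECT AVG(v.rating)
                FROM review AS v
                WHERE v.restaurant_id = r.id) AS avg_rating  -- refers to the outer row
        FROM restaurant AS r
        ORDER BY r.id
    """
    return conn.execute(sql).fetchall()


print(listing_with_subquery(build_demo_db()))
```

One round trip no matter how many restaurants are on the page, and a restaurant with no reviews simply comes back with a NULL (None) average.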

Now, in my templates, when I iterate over the restaurants, I can issue the following:

{% for restaurant in restaurant_list %}
<tr>
    <td><a href="{{ restaurant.get_absolute_url }}">{{ restaurant.name }}</a></td>
    <td>{% if restaurant.avg_rating %}
        {% load show_stars %}
        <span class="average-rating">
        {% show_stars restaurant.avg_rating of 5 round to quarter %}
        </span>
        {% endif %}</td>
</tr>
{% endfor %}

Notice how I used my show_stars template tag that I blogged about a few weeks ago to display the average restaurant rating. (Cheap shameless plug, but damn effective! :P) I’d link to a page in action, but since I just opened up my site to a few select users, I’ll update this post when I actually have any reviews. :P

Oh, and before I forget, thanks to my co-worker Stephen for assisting me with my SQL issues! :)

Here’s a picture of the final result:

[Picture: “Yay”]

Note:

As an added bonus, I also realized a few other ‘spots’ where the .extra() QuerySet modifier would come in handy. Since I’m also using the wonderful django-voting application from Jonathan Buchanan, I came across this post about accessing a dictionary via a template in the Django-users Google Group.

Basically, I had come across the same issue as the poster. Since I allow users to vote on reviews (similar to Amazon, Yelp, etc.), I wanted to retrieve the score of each Review instance to display on a paginated listing of all Reviews. Using the same extra() modifier, I was able to inject the total number of votes and the score when I retrieved all Reviews as such:

Btw, I just injected most of the code from Jonathan’s template tag. :)

    .extra(select={'total_votes': """SELECT COUNT(vote)
                                     FROM votes AS v, reviews_review AS rev, django_content_type
                                     WHERE reviews_review.id = rev.id
                                     AND v.object_id = reviews_review.id
                                     AND v.content_type_id = django_content_type.id
                                     AND django_content_type.model = 'review'""",

                   'score': """SELECT SUM(vote)
                               FROM votes AS v, reviews_review AS rev, django_content_type
                               WHERE reviews_review.id = rev.id
                               AND v.object_id = reviews_review.id
                               AND v.content_type_id = django_content_type.id
                               AND django_content_type.model = 'review'"""},)

Pretty neat right?

Now, when iterating through the reviews, I can use the following:

{% for review in object_list %}
	<tr>
		<td><a href="{{ review.content_object.get_absolute_url }}">{{ review.content_object.name }}</a></td>
		<td><a href="{% url profile-detail username=review.user.username %}">{{ review.user.username }}</a></td>
		<td><nobr>{% load show_stars %}
			<span class="rating">{% show_stars review.overall_rating of 5 round to half %}</span>
			</nobr>
		</td>
		<td>"<span style="font-weight:bold; color:#092e20;">{{ review.get_recommendation_display }}</span>"</td>
		<td><span style="font-size:.875em;">{{ review.submit_date|timesince }} ago</span></td>
		<td>Total of {{ review.score|default:0 }} from {{ review.total_votes }} {{ review.total_votes|pluralize:"person,people" }}.</td>
	</tr>
{% endfor %}

Hope y’all learned something like I did! :) Oh, and before I forget my standard disclaimer, “since this is on my blog, feel free to take/use/steal/distribute/copy/modify any code you see fit, but if you find any bugs, have any comments, or think the code can be cleaner, I’d love to hear from you.”

Enjoy!



How to tie your tie (like a Merovingian).

I’ve always wondered how the Merovingian in the Matrix tied his tie. It’s bothered me for years… I guess it’s that attention to detail that keeps me up coding at 2am. :)

And then (as if through divine intervention) I was sent this video link on YouTube.

Yes, I know… I’m a nerd. (But you still love me)


Update: As Rudy commented, in the Matrix, the tie is actually ‘backwards’. Apparently, there are nerdier people out there than me because I totally forgot about that detail. ;)

Thanks Rudy!

In any case, a link was conveniently provided… (though, personally, I still like the other video.)



My Dreamhost + Django + Subversion Setup

Since I haven’t put out a technical article in a while, this blog will explain how I’ve set up Dreamhost + Django + Subversion to play nicely together in a seamless development environment via a shared hosting provider. Hopefully someone, somewhere can find this information useful and insightful in their own development environment.

The very first thing I did was unleash my first Django web application on Dreamhost. Thanks to an excellent tutorial from Jeff Croft, a detailed explanation about FastCGI contained within the Django documentation, and a few helpful pointers on the Dreamhost wiki, I was able to get my application deployed in a matter of a few hours.

You can check it out here!

However, after going through Jeff’s excellent tutorial, I still wasn’t completely satisfied with my Django deployment on Dreamhost. Something was missing. There wasn’t a seamless way to continue development on my home machine, deploy to a test environment, and still keep my live site intact. After all, I’m a true believer in the open source dictum of ‘release early, release often‘, and without a way to test my application on a live server, I wasn’t happy with my configuration management.

Ideally, I envisioned having a live web application (i.e. http://www.wegoeat.com/) and another url that I could deploy my beta releases to (i.e. http://beta.wegoeat.com/). From a configuration management standpoint, I would tag each major release build, then create a branch from that tag to maintain the release over its life (bug fixes, minor enhancements). Thus, the live site would be updated from the branches directory, while the beta url would update from the trunk in my Subversion repository. To summarize the ‘extra’ steps I took to ensure a smoother deployment cycle, I’ve put together the following action list.

  1. The very first thing I did was follow Jeff’s tutorial - instead of creating a single directory in my django_projects directory, I created two. One was named ‘project_live’ and the other ‘project_beta’.
  2. Next, I checked out the appropriate source files from the appropriate locations in my Subversion repository. The ‘project_live’ directory came from my branches directory and represents my ‘live’ site. The ‘project_beta’ directory came from the trunk and represents my ‘beta’ site. Obviously, the settings.py file for the Django applications as well as the configuration files for FastCGI were different according to the directories. Since my settings will probably be very different than your settings, I’ll leave this as an exercise to the reader.
  3. Note, as far as Dreamhost goes, I created two domain entries, one @ http://www.wegoeat.com that will host my live site, and another @ http://beta.wegoeat.com that will be my beta site.
  4. I followed my own tutorial and created a post-commit hook to update the appropriate Dreamhost directories when I committed to the repository.

And voila! We’re done.

Now, I can develop on my home machine where I’ve checked out the trunk of my Subversion repository. Whenever I commit, the post-commit hook updates the project_beta directory on my Dreamhost account, and all the while, my live site is still functioning.
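I’m not reproducing my actual hook here, but the decision it has to make can be sketched as below. Everything in it is an assumption for illustration: the function name, the idea of feeding it the output of svnlook dirs-changed, and the rule that commits under branches map to project_live.

```python
def working_copies_to_update(changed_dirs):
    """Map changed repository paths to the checkouts that need an `svn update`."""
    targets = []
    for path in changed_dirs:
        top_level = path.strip("/").split("/")[0]
        if top_level == "trunk":
            target = "project_beta"  # beta site tracks trunk
        elif top_level == "branches":
            target = "project_live"  # live site tracks the release branch
        else:
            continue  # tags and anything else trigger nothing
        if target not in targets:
            targets.append(target)
    return targets


# Paths in the shape that `svnlook dirs-changed` prints, one per line.
print(working_copies_to_update(["trunk/restaurants/", "trunk/templates/"]))
```

A post-commit hook receives the repository path and the revision number as its two arguments, so a hook built on this would collect the changed directories for that revision and run svn update in each directory the function returns.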

Stay tuned for my next blog where I discuss how to get Custom PHP + MediaWiki + EAccelerator playing nicely together on Dreamhost!


