Ryan Kanno: The diary of an Enginerd in Hawaii

Everything you’ve ever thought, but never had the balls to say.

Lessons Learned: Google App Engine + App Engine Patch + Django + Boto

Update:

Mitch Garnaat from CloudRight has pointed out that you can actually set the policy of the S3 file in the set_contents_from_file call instead of making another roundtrip request to S3 (which also saves you some coin). Thanks Mitch!

Btw, I’m using App Engine Patch 1.0 and Boto 1.6a.


Sorry I haven’t updated my blog in a few months, but I’ve been a little busy. With that said, along with Erlang, I’ve been playing around with Google App Engine, App Engine Patch (for Django support), and the Boto library (for Amazon S3 support). After not having touched Python code in a few months, I wanted to document some of my lessons learned to help other developers who may be in a similar boat.

Lessons learned

  • If you’re upgrading the App Engine Patch, make sure you don’t have the App Engine library installed in a hidden directory
  • Uploading bulk data changed ever so slightly
  • If you’re not running off of Boto’s trunk, you’ll need to patch your Boto installation to work with App Engine.

Make sure the App Engine library isn’t installed in a hidden directory

Apparently, Google’s SDK 1.1.9 doesn’t like to rely on files that won’t be uploaded with your application – and hidden directories are no longer uploaded. I was running into the dreaded purple-nurple screen of death. Thank goodness for this AppEngine Google Group post, but I’m still not even sure when this popped up considering Google’s articles still refer to this setup.

Bulk upload

Compared to the previous SDK I was playing around with, bulk uploading changed significantly. I recall having to patch Goog’s bulkupload.py file to get unicode support. However, their new remote api tool has definitely fixed this issue, so +1 for Googs. People are reporting that uploading unicode is still broken, but it’s not. Or at least it wasn’t for me. Second, if you’re like me and don’t read documentation, you’ll find out (the hard way) that the method signature to HandleEntity changed. Instead of accepting a datastore.Entity, it’s now expecting a db.Model object.

Note: When actually running the remote api tool, you’ll also want to make sure your PYTHONPATH includes your current project. (Another one-liner in the documentation. :P )

Integrating Boto + App Engine

I wasn’t running off of Boto’s trunk and I was getting an obscure type conversion error. Being too lazy to check out the source, I jumped to their issue tracker and found a patch (halfway down the page) by one of the App Engine Patch lead devs. Apply the patch and you’ll be on your way to uploading images/data from App Engine into Amazon S3! If you’re looking for example code, I’ve included a small snippet of what I tested.

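    # Imports assumed by this snippet (the original post did not show them):
    #   import uuid
    #   from django.conf import settings
    #   from boto.s3.connection import S3Connection as Connection
    #   from boto.s3.key import Key

    @staticmethod
    def upload_to_s3(original_filename, photo):
        """ Upload a photo file, storing its original name as metadata in an S3 bucket """
        connection = Connection(settings.AWS_ACCESS_KEY_ID, settings.AWS_SECRET_ACCESS_KEY)
        bucket = connection.get_bucket(settings.AWS_IMAGE_BUCKET_NAME)
        # Use a random UUID as the S3 key so uploads never collide on filename
        photo_uuid = str(uuid.uuid4())
        new_key = Key(bucket)
        new_key.key = photo_uuid
        new_key.set_metadata('original_filename', original_filename)
        # Setting the policy here avoids a second request just to change the ACL
        new_key.set_contents_from_file(photo, policy='public-read')
        return photo_uuid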

Note: I only tested the code above with small images ~300-500K in size and it seemed to work perfectly fine (with no load! :P ). As always, feel free to use, steal, take, and/or copy anything on this blog. Hopefully somewhere, someone on the Interwebs will find these tips handy!
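Since the upload uses policy='public-read', the stored object can be fetched anonymously afterwards. Here is a minimal sketch of the key naming and the resulting URL. It assumes the virtual-hosted-style S3 address form and a bucket name without dots. new_photo_key and public_url are hypothetical helpers, not part of Boto, and the sketch is written for a current Python 3 rather than the Python App Engine ran at the time.

```python
import uuid


def new_photo_key():
    """Return a random key name, the same way upload_to_s3 picks one."""
    return str(uuid.uuid4())


def public_url(bucket_name, key_name):
    """Build the URL of a public-read object (assumed virtual-hosted style)."""
    return "https://%s.s3.amazonaws.com/%s" % (bucket_name, key_name)


if __name__ == "__main__":
    key = new_photo_key()
    # "my-images" is a made-up bucket name for illustration only
    print(public_url("my-images", key))
```

You would store the returned UUID alongside your model and build the URL whenever you need to render the image.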

Enjoy!



Seven Things Meme aka Damn you, Harper Reed!

Internet memes rock my socks!

Unfortunately, I was tagged by Harper for a “7 things” meme where I have to divulge seven tidbits of randomness about myself. I’m generally not a big fan of these things since most of you on the Interweb couldn’t care less about me, but it is Harper… and well, he is my new bicycle, so here goes:

1. I have “girlie” handwriting

I blame this 1000% all on my sister. Throughout the years, people have told me that I write like a girl – whatever that means. I don’t know how many days of my childhood I spent playing “school”, but it’s quite apparent that it was damn too many. Thanks Stace!

2. I used to play an inordinate amount of video games

They say a picture is worth a thousand words, but this one just might make you speechless. Damn you, John Carmack, for making one of the greatest games of my generation and *almost* failing me out of college. I’ve just recently been sucked into this wonderful World of Warcraft. God, help me.

3. I <3 puzzles

Whether it’s the crosswords, sudoku, kakuro, chess, or that damn Bejeweled, I absolutely love puzzles. I typically have a really short attention span, but puzzles have been known to captivate me for hours on end.

4. I am ultra-competitive

I don’t know if it was the math games my mom challenged me with as a child, but I’m probably on the extremely far end of the competitiveness scale. It could be eating (grr, Kobayashi, here I come), games at Dave and Buster’s, or just a game of pickup basketball, I’ll compete at anything. And if you’re lucky, I’ll probably throw in some trash talking as well.

5. I am a total sucker for 80’s cartoons

Whether it’s G.I. Joe, Transformers, M.A.S.K., Thundercats, Voltron, or Silverhawks, I’m an uber 80’s cartoon fan. I don’t know if I should be admitting this as a 30 year old, but I have an extensive collection of Transformers and G.I. Joes safely stored away at my parents’ house. Thanks, Mom & Dad!

6. I love to read

I’m not sure when this happened in my life, but I’ve come to love reading. When I was younger, I absolutely loved Encyclopedia Brown and the Choose Your Own Adventure series. As I’ve gotten older, I’ve stuck to the more non-fiction variety, and I’m currently working on these books here. One of these days, I’m hoping someone will get me an Amazon Kindle. =P (*hint* *hint*)

7. Hi, my name is Ryan, and I’m a caffeine Coke Zero addict

Since my parents never allowed me to drink caffeine as a child, I don’t know exactly when this happened. It could have been all those all-nighters in grad school trying to finish up the last bits of code, or the late nights playing online games. In any case, I drink a six pack a day, as a conservative estimate, but I’m really, really trying to stop. I think.

Whew!

And there you have it folks – a few lame tidbits about me. Now, according to Harper, to make this official, I have to tag 7 other unfortunate online souls. So… I tag:

  • Ed S – I met Ed through a basketball league, and it just so happens we share a similar interest in all things technology. Not only does he manage one of the larger media sites in Hawaii, but he is also a featured blogger @ the Honolulu Advertiser.
  • Stephen F – A former co-worker, I’m hoping that this will kick-start his blogging. Since he’s busy with kids, school, and work, I’m not sure he’ll have time for this, but hey, it’s worth a try. He knows all things .Net and Ruby.
  • Kevin M – We met during our freshman year at U of I in calculus. He’s a .Net machine working for Clarity Consulting and is full of zany ideas. He was even featured on Microsoft’s Coding4Fun for his DIY foosball hack. I’m sure he’ll have something great to write.
  • Greg Y – We used to work in the same organization, and I’d help him out with small projects. He’s leading the Hawaii Web 2.0 charge in the organization and also masquerades as a part-time blogger @ Pulp Connection.
  • Mark Q – One of my current co-workers who knows all things CSS and Mac. He’s a Symfony user *cough*fanboy*cough* and is currently trying to convince me to switch over to the darkside. “Resistance is futile.”
  • Scott V – We met during our freshman year at U of I playing what else but Quake. Not only is he half man, half crazy, but he’s also the nerdvana of all things system administration. He currently works at Threadless with Harper causing all sorts of havoc.
  • Trent N – I mentored Trent through HiTechQuest. He’s a young’un but has all types of technological potential. Since he’s the youngest I’ve tagged, I’m sure he’ll have all sorts of interesting things to teach us old folk.

And if you’re one of the chosen ones, here are the rules:

  • Link your original tagger(s), and list these rules on your blog.
  • Share seven facts about yourself in a post – some random, some weird.
  • Tag seven people at the end of your post by leaving their names and the links to their blogs.
  • Let them know they’ve been tagged by leaving a comment on their blogs and/or Twitter.



Another turning point, a fork stuck in the road…

“50 years from now, when you look back at your life, don’t you want to say you had the guts to get in the car?”

Shia LaBeouf as Sam Witwicky, Transformers

As some of you may know, I recently left my position as a Navy contractor at COMPACFLT for another position at a small, local technology startup. Having come from a family of “lifers”, it wasn’t an easy decision. Even though I absolutely loved both the work and people, I felt that I would have too many regrets not taking this opportunity.

So with that said, here are a few of my lasting memories:

  • Late nights in the dreaded server room with Wally and co.
  • The lunchroom gang – Our topics ran the gamut, from MMA to the next 4chan meme.
  • Admiral’s Cup – We won a basketball game! (Not by forfeit!)
  • Ping pong Fridays!
  • Almost killing my co-worker Cindy with the Wii controller… and then my yo-yo.
  • Monday morning chats with Gits about his weekend PS3 adventures.
  • Steve’s yellow lunch truck

Though I came in as a contractor, I feel as though I’m leaving as family. So with that said, farewell, good luck, and Godspeed to everyone at FLT!

Click on my picture to check out my flickr set!



Using Capistrano to deploy to WebFaction

There’s nothing I love more than sweet automation.

After spending the better part of an hour searching the great Googs, there was only a single blog I could find describing how to use Capistrano to deploy to WebFaction. Unfortunately, Justin was describing a Capistrano 1.4 deployment. I found a few posts on the WebFaction forums, but nothing concrete. So after a few hours fiddling with the technology, here’s how I configured my Rails 2.1.1 project to use Capistrano 2.5 to deploy to WebFaction.

Assumptions

Before getting started, I’m going to assume the following:

  • I’m assuming you’ve already used the one-click WebFaction goodness to create a brand new Rails application in ~/webapps/<application_name>. If you don’t know what I’m referring to, make sure to check out the Rails and Typo Demo screencast. Make sure you have a domain, application, and website configured.
  • I’m also going to assume that your nifty Rails application is safely stored away in either a Subversion or Git repository and you’ve frozen Rails in your application.
  • Finally, I’m going to assume you set up your database via WebFaction’s control panel.

Installing Capistrano

The very first thing you have to do is install Capistrano on your local machine by issuing the following command:

$ gem install -y capistrano

After installing Capistrano, the next step is to “capify” your local Rails project. Change into your project’s root directory and issue the following command:

$ capify .

This configures your Rails project to play nicely with Capistrano. Two files should’ve been created: Capfile in the project root and config/deploy.rb. The deploy.rb file contains the deployment configuration specific to your Rails application.

Configuring WebFaction

Jumping back to WebFaction, I followed a few of the steps in Justin’s blog. First things first, ssh into your WebFaction account and create a directory called webapps-releases in your home directory. This directory is where we’re going to deploy the application to.

Since you’ve already configured a Rails application at ~/webapps/<application_name>, change into that directory. You should see a standard Rails project with the exception of an extra file called autostart.cgi. Remove everything in the directory except the autostart.cgi file by issuing the following commands:

$ cd ~/webapps/<application_name>
$ mv autostart.cgi ~/
$ rm -rf *
$ mv ~/autostart.cgi .

Once the directory is clear, create a symlink to the log directory that will be in the webapps-releases directory we created earlier.

$ ln -s ~/webapps-releases/<application_name>/shared/log ~/webapps/<application_name>/log

Note: I’m assuming here that the WebFaction app and the Rails application have identical names.

Next, open up your favorite editor of choice (*cough*Vi*cough*) and edit the autostart.cgi file. Jump to the end of the file and comment out the following line:

# os.system('/usr/local/bin/mongrel_rails start -d -e production -P /home/<webfaction_username>/webapps/<webfaction_app_name>/log/mongrel.pid -p <port>')

and right below it, cut and paste the following:

  os.system('/usr/local/bin/mongrel_rails start -c /home/<webfaction_username>/webapps-releases/<webfaction_app_name>/current -d -e production -P /home/<webfaction_username>/webapps/<webfaction_app_name>/log/mongrel.pid -p <port>')

Creating your custom deploy.rb

After configuring WebFaction, we have to configure the Capistrano application deployment configuration. On your local machine, find the file config/deploy.rb and replace it with the one below.

set :webfaction_username, "<webfaction_username>"
set :webfaction_db_type, "<webfaction_db_type>"
set :webfaction_db, "<webfaction_db>"
set :webfaction_db_username, "<webfaction_db_username>"
set :webfaction_port, "<webfaction_port (get from autostart.cgi)>"
set :database_yml_template, "database.example.yml"
 
set :application, "test"
set :deploy_to, "/home/#{webfaction_username}/webapps-releases/#{application}"
 
set :scm, :subversion
set :scm_user, "<scm_username>"
set :scm_password, Proc.new { Capistrano::CLI.password_prompt("Subversion password for #{scm_user}: ") }
set :repository, Proc.new { "--username #{scm_user} --password #{scm_password} --no-auth-cache <http://path/to/your/svn/goes/here/>"} 
 
set :user, "#{webfaction_username}"
set :use_sudo, false 
 
set :domain, "<webfaction_domain>"
 
role :app, domain
role :web, domain
role :db,  domain, :primary => true
 
desc "Symlink public to what webfaction expects the webroot to be"
task :after_symlink, :roles => :web do
  run "ln -nfs #{release_path}/public /home/#{webfaction_username}/webapps/#{application}/"
end
 
namespace :deploy do
 
  # Taken from http://jonathan.tron.name/2006/07/15/capistrano-password-prompt-tips 
  # Thanks Jonathan! :)
  desc "Creates the database configuration on the fly"
  task :create_database_configuration, :roles => :app do
    require "yaml"
    set :production_db_password, proc { Capistrano::CLI.password_prompt("Remote production database password: ") }
 
    db_config = YAML::load_file("config/#{database_yml_template}")
    db_config.delete('test')
    db_config.delete('development')
 
    db_config['production']['adapter'] = "#{webfaction_db_type}"
    db_config['production']['database'] = "#{webfaction_db}"
    db_config['production']['username'] = "#{webfaction_db_username}"
    db_config['production']['password'] = production_db_password
    db_config['production']['host'] = "localhost"
 
    put YAML::dump(db_config), "#{release_path}/config/database.yml", :mode => 0664
  end
 
  after "deploy:update_code", "deploy:create_database_configuration"
 
  desc "Redefine deploy:start"
  task :start, :roles => :app do
    invoke_command "/usr/local/bin/mongrel_rails start -c #{deploy_to}/current -d -e production -P /home/#{webfaction_username}/webapps/#{application}/log/mongrel.pid -p #{webfaction_port}", :via => run_method
  end
 
  desc "Redefine deploy:restart"
  task :restart, :roles => :app do
    invoke_command "/usr/local/bin/mongrel_rails restart -c #{deploy_to}/current -P /home/#{webfaction_username}/webapps/#{application}/log/mongrel.pid", :via => run_method
  end
 
  desc "Redefine deploy:stop"
  task :stop, :roles => :app do
    invoke_command "/usr/local/bin/mongrel_rails stop -c #{deploy_to}/current -P /home/#{webfaction_username}/webapps/#{application}/log/mongrel.pid", :via => run_method
  end
end
Note: Change all the values in tags like <webfaction_username>, <webfaction_db>, <webfaction_db_username>, etc. to those values that fit your configuration!
Otherwise, this file in itself won’t do you any good.

Props out to Jonathan for the fantastic Capistrano tips!

After copying the deploy.rb file and editing the appropriate variables, run the following command in your Rails project’s root directory:

$ cap deploy:setup

This command creates the appropriate directory structure for Capistrano on the deployment server based upon values set in your deploy.rb. Next, run the following command to check your dependencies.

$ cap deploy:check

If everything is successful, you should see a message that reads something like…

You appear to have all necessary dependencies installed

Next, push your code out to the server using the following command:

$ cap deploy:update

Finally, to start up your application run the following Capistrano command:

$ cap deploy:start

Now, you should be able to run the standard Capistrano tasks to deploy your application to WebFaction!

Explanation

Most techies like to have an explanation of what’s going on with the Capistrano deploy.rb. I could probably write another blog about it, but I’m lazy (and pressed for time). The :create_database_configuration task basically writes the database.yml production configuration on the fly (courtesy of this blog posting).
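For readers who don’t speak Ruby, here is a rough sketch of the same transformation in Python on a plain dict, leaving the YAML parsing, the password prompt, and the upload out. production_only and the sample values are made up for illustration. Unlike the Ruby task, it creates a production section if the template lacks one.

```python
import copy


def production_only(template, adapter, database, username, password):
    """Return a copy of the template holding only a filled-in 'production' section."""
    config = copy.deepcopy(template)  # leave the caller's template untouched
    config.pop("test", None)
    config.pop("development", None)
    production = config.setdefault("production", {})
    production.update({
        "adapter": adapter,
        "database": database,
        "username": username,
        "password": password,
        "host": "localhost",
    })
    return config


# Hypothetical stand-in for the parsed contents of database.example.yml
template = {
    "development": {"adapter": "sqlite3"},
    "test": {"adapter": "sqlite3"},
    "production": {"adapter": "sqlite3", "encoding": "utf8"},
}
result = production_only(template, "mysql", "blog_db", "blog_user", "s3cret")
print(result)
```

Settings already present in the template’s production section (like the encoding above) survive, which is why the task starts from a template instead of building the file from scratch.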

The basic gist of the rest of the script is that WebFaction is proxying a Mongrel instance. The Capistrano deploy.rb overrides the original deploy:start, deploy:stop, and deploy:restart tasks to run Mongrel commands that WebFaction can understand. Typically, the default Capistrano tasks run script/spin and reaper, but it was easier just to redefine the tasks. If anyone has any tips/suggestions to improve the script, I’m all ears!

Voila! (Enjoy)



web2email.py – A web to email Python backup script

I’m back, at least for the time being. There’s definitely a calm before the impending storm, but until then, I’m back posting little tidbits of uselessness. Enjoy!

Python goodness

While introducing the concept of automation to a friend of mine, I came across a requirement to archive a series of URLs on a daily basis. Luckily for me, the URLs consisted primarily of plain text. Loading up VIM, I concocted this Python script in a few hours – most of which was spent searching Googs <3.

If you're looking for a true web crawler, this won't be for you - though loading up lxml/Beautiful Soup, cssutils, and a Javascript parser to determine what artifacts need to be downloaded shouldn’t be all that difficult…

But, I’ll leave that as an exercise for the reader (That’s you, btw!)

In any case, the following script fetches a URL and emails the page, either through Googs or Webfaction using SMTP-AUTH or through a plain SMTP server of your choosing. Sorta-kinda like having your own WayBackMachine. Cut and paste the following into a neat file called web2email.py.

#! /usr/bin/env python2.5
# -*- coding: utf-8 -*-
#
# Copyright (c) 2008 Ryan Kanno (ryankanno@localkinegrinds.com)
# License: GNU GPLv3
 
import urllib2
 
import smtplib
from email.MIMEMultipart import MIMEMultipart
from email.MIMEBase import MIMEBase
from email.MIMEText import MIMEText
from email.Utils import COMMASPACE, formatdate
from email import Encoders
import datetime
 
from optparse import OptionParser
import os, sys, logging
 
__doc__ = """
 
This script retrieves a URL and sends its contents via email to 
a list of recipients.  Typically, this script is run from a cron
job that sends emails to a Gmail account to archive the contents
of a URL.
 
Mail can be sent via normal or authenticated SMTP.  Tested using 
Gmail SMTP (authenticated), Webfaction SMTP (authenticated), and
localhost (normal).
 
Example:
 
Sends the contents of http://www.espn.com to friend@domain.com using your Gmail settings
 
    python web2email.py -u gmail_username \
                        -p gmail_password \
                        -f gmail_username@gmail.com \
                        -r friend@domain.com http://www.espn.com
 
Sends the contents of http://www.espn.com to friend@domain.com using your Webfaction settings
 
    python web2email.py -u webfaction_username \
                        -p webfaction_password \
                        -f webfaction_account@webfaction_domain.com \
                        -s smtp.webfaction.com \
                        -r friend@domain.com http://www.espn.com
 
Sends the contents of http://www.espn.com to friend@domain.com using your local settings
 
    python web2email.py -f your_email@domain.com \
                        -s localhost \
                        --port 25 \
                        -r friend@domain.com http://www.espn.com
"""
 
__author__  = "ryankanno@localkinegrinds.com"
__url__     = "http://blog.localkinegrinds.com"
__version__ = "0.1"
 
USAGE = "usage: %prog [options] url" 
DESC  = __doc__.split('\n\n')[0]
 
def configure_logging(log_level, format='%(asctime)s %(levelname)s %(message)s'):
    logging.basicConfig(level=log_level, format=format)
 
def _validate_options_and_args(parser, options, args):
    logging.debug("Validating options and arguments.")
    if (len(args) != 1):
        parser.error("Incorrect number of arguments.  Script expects 1 (URL to backup), but received %i." % len(args))
        sys.exit(2) # Command line syntax error
    elif not options.recipients: 
        parser.error("You must include at least one recipient.")
        sys.exit(1) 
    elif (options.username and options.password is None) or (options.username is None and options.password is not None):
        parser.error("You must include both a username and password.")
        sys.exit(1) 
    elif not options.from_email:
        parser.error("You must include a valid from email address.")
        sys.exit(1) 
 
def getPage(url):
    logging.debug("Attempting to retrieve %s" % url)
    try:
        response = urllib2.urlopen(url)
        return response.read()
    except urllib2.HTTPError, e:
        logging.error("HTTPError (%s) occurred retrieving %s" % (e.code, url))
        sys.exit(1)
    except urllib2.URLError, e:
        logging.error("URLError (%s) occurred retrieving %s" % (e.reason, url))
        sys.exit(1)
 
def mail(send_from, send_to, subject, text, content_type, files=[], server='localhost', port=25, username=None, password=None):
 
    def _auth(server, port, username, password):
        # Log the username only; never write the password to the log
        logging.debug("Attempting to send email via %s:%i as user %s." % (server, port, username))
        smtp = smtplib.SMTP(server, port) 
        smtp.ehlo()
        smtp.starttls()
        smtp.ehlo()
        smtp.login(username, password)
        smtp.sendmail(username, send_to, msg.as_string())
        smtp.close()
 
    def _unauth(server, port):
        logging.debug("Attempting to send email via %s:%i" % (server, port))
        smtp = smtplib.SMTP(server, port)
        smtp.sendmail(send_from, send_to, msg.as_string())
        smtp.close()
 
    assert type(send_to)==list
 
    msg=MIMEMultipart()
    msg['From'] = send_from
    msg['To'] = COMMASPACE.join(send_to)
    msg['Date'] = formatdate(localtime=True)
    msg['Subject'] = subject
 
    text = MIMEText(text)
    text.set_type(content_type)
    text.set_param('charset', 'UTF-8')
 
    msg.attach(text)
 
    for f in files:
        part = MIMEBase('application', "octet-stream")
        part.set_payload(open(f, "rb").read())
        Encoders.encode_base64(part)
        part.add_header('Content-Disposition', 'attachment; filename="%s"' % os.path.basename(f))
        msg.attach(part)
 
    if not username and not password:
        _unauth(server, port)
    else:
        _auth(server, port, username, password) 
 
def main():
    parser = OptionParser(usage=USAGE, description=DESC)
 
    parser.add_option("-u", "--username", dest="username", metavar="USER", help="Username to SMTP server")
    parser.add_option("-p", "--password", dest="password", metavar="PWD", help="Password to SMTP server")
    parser.add_option("-s", "--server", dest="server", metavar="SERVER", help="SMTP server (Defaults to Gmail)")
    parser.add_option("--port", dest="port", metavar="PORT", type="int", help="SMTP server port (Defaults to Gmail)")
    parser.add_option("-f", "--from", dest="from_email", metavar="FROM", help="From address")
    parser.add_option("-r", "--recipient", action="append", dest="recipients", metavar="RCPT", type="string", help="Email recipient")
    parser.add_option('-t', '--test', action="store_true", dest="test", metavar="TEST", help="Run tests")
    parser.add_option('-v', '--verbose', action='store_const', dest='log_level', const=logging.DEBUG, help='Verbose output')
    parser.set_defaults(server="smtp.gmail.com", port=587, test=False, log_level=logging.INFO)
    (options, args) = parser.parse_args()
 
    _validate_options_and_args(parser, options, args)
    configure_logging(options.log_level)
 
    if options.test:
        _test() # Too lazy to write a test for this script.  @TODO - use mocks 
 
    # Retrieve URL and return html
    html = getPage(args[0])
 
    # Send mail with returned html as body 
    mail(options.from_email, options.recipients, 
         '%s @ %s' % (args[0], (datetime.datetime.now().strftime("%A %B %d %I:%M:%S %p %Y"))), 
         html, 'text/html', 
         server=options.server, port=options.port, username=options.username, password=options.password)
 
    # Return with appropriate exit code
    sys.exit(0)
 
def _test():
    import doctest
    doctest.testmod(sys.modules[__name__])
 
if __name__ == '__main__':
    main()
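The script above targets Python 2.5 (urllib2, email.MIMEText and friends). If you’re on Python 3, the message-building half of mail() might look roughly like the sketch below. build_message is a hypothetical name. It uses the standard library’s email.message.EmailMessage, and it leaves out attachments and the actual SMTP send.

```python
from email.message import EmailMessage
from email.utils import formatdate


def build_message(send_from, send_to, subject, html):
    """Build an HTML email for a list of recipients (no attachments)."""
    if not isinstance(send_to, list):
        raise TypeError("send_to must be a list of addresses")
    msg = EmailMessage()
    msg["From"] = send_from
    msg["To"] = ", ".join(send_to)
    msg["Date"] = formatdate(localtime=True)
    msg["Subject"] = subject
    # Sets Content-Type to text/html with a UTF-8 charset
    msg.set_content(html, subtype="html", charset="utf-8")
    return msg


if __name__ == "__main__":
    # Addresses and page content here are placeholders
    message = build_message("me@example.com", ["friend@example.com"],
                            "http://www.espn.com @ today", "<p>archived page</p>")
    print(message["Subject"], message.get_content_type())
```

The resulting object can be handed to smtplib.SMTP.send_message once you’ve connected (and logged in, for the SMTP-AUTH case).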

All right stop, cron time! (imagine a 90’s pop song)

As an added bonus, you can install this script to run via cron so you’ll magically end up with webpages archived in your inbox! Neat. You can read my previous post on cron, or you can create the following crontab.

MAILTO=ryankanno@CHANGE_TO_YOUR_EMAIL.com
# minute (0-59),
# |      hour (0-23),
# |      |       day of the month (1-31),
# |      |       |       month of the year (1-12),
# |      |       |       |       day of the week (0-6 with 0=Sunday).
# |      |       |       |       |       commands
  0      0       *       *       *      /usr/bin/python2.5 /PATH/TO/web2email.py -u GMAIL_USER -p GMAIL_PWD -f FROM_USER -r RECIPIENT URL

As a side note, don’t forget to put double quotes around the URL if it contains spaces!

Note: change the value of ryankanno@CHANGE_TO_YOUR_EMAIL.com to your email address (or comment the line out with a # if you don’t want emails sent to you), GMAIL_USER to your Google username, GMAIL_PWD to your Google password, FROM_USER to the from address in the mail header, RECIPIENT to the recipient email address, and URL to the URL you want backed up.

I know, I know. The critics.

The critics will say that your Gmail username and password are in cleartext. I know. They are. So… I’m hoping that since you just need an archive of a publicly available URL on the Internets, the data doesn’t need to be super-duper-Fort-Knox-protected. If it does, this script isn’t for you. :( Oh, yeah, before I forget… here’s a hint… *cough*create another Google account*cough*. With that said, archive to your heart’s content!

Enjoy!


