In the past month or so, I’ve had a great experience learning about the limits of loading data into my relational databases for The Geno.me. I started with rake tasks using ActiveRecord inserts. Slow. Then I tried SQL inserts. Slow. Then I tried bulk SQL inserts. Better. Then I realized that loading from large flat files wouldn’t be super realistic in deployment (the way I was set up). Sqlite3 in development and PG in production on Heroku just wouldn’t cut it. I’d ultimately have ~250 GB of genetic data loaded before I even launch the application. What to do?
Amazon RDS. Scalable, simple (ish), and pay for what you use. Perfect. Getting set up…a bit more involved.
1. Register for your Amazon Web Services account
This is a pretty easy task. Go here. Follow directions.
2. Download and Configure the RDS Command Line Toolkit
The command line toolkit can be found here. Download and put into a directory that you can easily find. Open the readme.txt and let’s work through those directions.
1. Make sure Java is installed.
2. Set up environment variables as follows:
AWS_RDS_HOME = Directory where the RDS files were saved.
JAVA_HOME = Java installation directory
AWS_CREDENTIAL_FILE = ${AWS_RDS_HOME}/credential-file-path.template
First road bump for me. I’m new to Macs. How to set an Environment Variable? This StackOverflow post directed me nicely.
Open launchd.conf:
$ mate /etc/launchd.conf
Here is what I added to my launchd.conf file:
setenv AWS_RDS_HOME /Developer/RDSCli-1.4.007
setenv JAVA_HOME /Library/Java/Home
setenv AWS_CREDENTIAL_FILE /Developer/RDSCli-1.4.007/credential-file-path.template
3. Add your security credentials to the AWS_CREDENTIAL_FILE.
$ mate ${AWS_RDS_HOME}/credential-file-path.template
Find your security credentials by logging into your AWS account, and click on your name in the upper right corner, and they are listed under ‘security credentials.’ Add them to the file, and we are almost there.
4. Add ${AWS_RDS_HOME}/bin to your path.
Again, I’m new to Macs. How to do this? They actually tell you in the Amazon RDS user-guide, but I can’t find that pdf right now, so here’s the command for other newbies like myself:
$ export PATH=$PATH:$AWS_RDS_HOME/bin
Now all should be working! Test by typing rds --help. If a list of rds commands appeared, then all is good in the hood!
3. Configure the RDS Add-on With Heroku
If you hadn’t before this step, by now you should have initialized a DB instance on Amazon, otherwise you will have no endpoint/credentials to provide Heroku!
Heroku provides very clear directions for this here.
One note: When adding security group permissions, the command they list tripped me up a bit.
rds-authorize-db-security-group-ingress default \
--ec2-security-group-name default \
--ec2-security-group-owner-id 098166147350
I didn’t realize this was meant to be one line and that the backslashes should be ignored. Could have been me being stupid, but I spent over 10 minutes on this. Yep, me being stupid.
4. Update your Gemfile to Include ‘mysql2′ in Production
I still like Sqlite3 in development on my local machine, and I was actually having some major issues getting mysql2 to work locally. So, I placed ‘mysql2′ into the :production group and called it a day. Of course, ‘sqlite3′ should be in the development group. This took a while before Heroku stopped yelling at me, though I can’t remember why.
5. The Database URL That worked for Heroku’s Configuration
For some reason, it took me a while to reason out the proper database URL to give to Heroku. They tell you to use:
url=mysql://user:pass@rdshostname.amazonaws.com/databasename
However, this didn’t seem to work for me. Some really fancy googling led me to place the following:
url=mysql2://user:pass@rdshostname.amazonaws.com/databasename
I guess there’s something about the gems that were different? IDK. It works now.
So far I’m VERY happy with Amazon RDS. More importantly, they allow you to import flat files (CSV) of data as an easy way to fill the database (which is the whole reason I did this). Getting their import to work turned out to be another adventure. But that…is for another time. Thanks for reading, and hope this helps any other beginners like myself!
Wow thank you for that tip… I didn’t know I had to update the URL to mysql2:// either, and my searches (until your blog) came up blank.
So glad this helped! Thanks for reading.