Rake task to transfer a Rails database, say from MySQL to Postgres and back again

Posted by face on August 31, 2008


development:
  adapter: mysql
  database: legaltorrents_development
  username: fred
  password: password
  socket: /tmp/mysql.sock

production:
  adapter: postgresql
  database: legaltorrents_production
  username: fred
  password: password
  host: localhost


Thanks to Ruby on Rails, transferring and converting database from one database platform to another only takes a few lines of code. There are rake tasks for dumping to YAML and back. However the existing YAML scripts I found had issues with some of our data and then failed for blobs. This script will only work with a “rails style” database. By “rails style” I mean any database where every table has a unique key named “id”.

A special thanks to Matson Systems, Inc. for having me write this script for LegalTorrents and then contribute it to all under a BSD license.


Be sure to read the warnings in script. Here is a complete example converting a production postgresql database into a development mysql database. Let’s start with config/database.yml found to the left. Now both schemas must be identical. For this example let’s ensure both schemas are at the same migration:
1
2
rake db:migrate
env RAILS_ENV=production rake db:migrate
Ok, let’s get the rake task and run it. Running the rake task may take a long time.
1
2
3
4
5
cd lib/tasks
wget 'http://github.com/face/rails_db_convert_using_adapters/tree/master%2Fconvert.rake?raw=true' -O convert.rake

# and run it
rake db:convert:prod2dev
You should now have a copy of the production data in the development database.


Update Oct 1, 2008:Fixed a bug today for Rails 2.1.1. Also got rid of the hash of data that was a relic from an early version of the script that used a single model object.

Here is the full code to convert.rake:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
#
# Convert/transfer data from production => development.    This facilitates
# a conversion one database adapter type to another (say postgres -> mysql )
#
# WARNING 1: this script deletes all development data and replaces it with
#                     production data
#
# WARNING 2: This script assumes it is the only user updating either database.
#                     Database integrity could be corrupted if other users where 
#                     writing to the databases.
#
# Usage:  rake db:convert:prod2dev
#
# It assumes the development database has a schema identical to the production 
# database, but will delete any data before importing the production data
#
# A couple of the outer loops evolved from 
#    http://snippets.dzone.com/posts/show/3393
#
# For further instructions see 
#    http://myutil.com/2008/8/31/rake-task-transfer-rails-database-mysql-to-postgres
#
# The master repository for this script is at github:
#    http://github.com/face/rails_db_convert_using_adapters/tree/master
#
#
# Author: Rama McIntosh
#         Matson Systems, Inc.
#         http://www.matsonsystems.com
#
# This rake task is released under this BSD license:
#
# Copyright (c) 2008, Matson Systems, Inc. All rights reserved.
# 
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 
# * Redistributions of source code must retain the above copyright
#   notice, this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above copyright
#   notice, this list of conditions and the following disclaimer in the
#   documentation and/or other materials provided with the distribution.
# * Neither the name of Matson Systems, Inc. nor the names of its
#   contributors may be used to endorse or promote products derived
#   from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
# FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
# COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
# BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
# ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.

# PAGE_SIZE is the number of rows updated in a single transaction.
# This facilitates tables where the number of rows exceeds the systems
# memory
PAGE_SIZE=10000

namespace :db do
  namespace :convert do    
    desc 'Convert/import production data to development.   DANGER Deletes all data in the development database.   Assumes both schemas are already migrated.'
    task :prod2dev => :environment do

      # We need unique classes so ActiveRecord can hash different connections
      # We do not want to use the real Model classes because any business
      # rules will likely get in the way of a database transfer
      class ProductionModelClass < ActiveRecord::Base
      end
      class DevelopmentModelClass < ActiveRecord::Base
      end

      skip_tables = ["schema_info", "schema_migrations"]
      ActiveRecord::Base.establish_connection(:production)
      (ActiveRecord::Base.connection.tables - skip_tables).each do |table_name|

        ProductionModelClass.set_table_name(table_name)
        DevelopmentModelClass.set_table_name(table_name)
        DevelopmentModelClass.establish_connection(:development)
        DevelopmentModelClass.reset_column_information
        ProductionModelClass.reset_column_information
        DevelopmentModelClass.record_timestamps = false

        # Page through the data in case the table is too large to fit in RAM
        offset = count = 0;
        print "Converting #{table_name}..."; STDOUT.flush
        # First, delete any old dev data
        DevelopmentModelClass.delete_all
        while ((models = ProductionModelClass.find(:all, 
            :offset=>offset, :limit=>PAGE_SIZE)).size > 0)

          count += models.size
          offset += PAGE_SIZE

          # Now, write out the prod data to the dev db
          DevelopmentModelClass.transaction do
            models.each do |model|
              new_model = DevelopmentModelClass.new(model.attributes)
              new_model.id = model.id
              new_model.save(false)
            end
          end
        end
        print "#{count} records converted\n"
      end
    end
  end
end

Digg! Delicious! Technorati Blinklist Furl Reddit

Conditionally caching actions when logged in for Rails 2.1 and 2.0

Posted by face on August 31, 2008

Screen shot of Legaltorrents.com

As some of you know I’ve been working full time for Matson Systems, Inc. building out LegalTorrents. I must apologize I have been neglecting parts of my blog. Fortunately, I’ve been swamped building out cool features for LegalTorrents and Matson wants to contribute back. After this caching article look for a rake task to convert a Rails app from one database platform to another, then a plugin for generating Activity Streams.

As a new community, we have to support tens of thousands of registered users. Yet on any given news day we need to support hundreds of thousands of non registered users.


This can be done using action caching and very modest hardware requirements. Given huge hardware resources, using memcached would solve the issue. However, meeting initial demands can be done using action caching with very modest hardware requirements. We use the built-in rails action caching to disk with a TTL hack from cron. We don’t want our logged-in users subjected to a TTL, as their changes should be instantaneous. We simply don’t cache actions for users who are logged in, and provide cached pages for everyone else. As the number of registered users grows… then we’ll use memcached.


Let’s begin with conditional caching in Rails 2.1 (if 2.0, see below). Conditional action caching is a new feature of the Rails 2.1 API. First off, pre Rails 2.1 the default was disk. In rails 2.1, the default is RAM. Not going to work on limited resources:


# Put this in RAILS_ROOT/config/initializers/something.rb
ActionController::Base.cache_store = :file_store, "#{RAILS_ROOT}/tmp/cache"
Ok, the application controller is a great place to decide if we want to cache:

class ApplicationController < ActionController::Base
  # ...
  protected
  # of course logged_in? needs to be defined...restful_authentication is what I recommend.
  def do_caching?
    !logged_in? && flash.empty?
  end
Now we can use the new Rails 2.1 :if feature to conditionally cache actions:

class TorrentsController < ApplicationController
  # ...
  caches_action :show , :if => Proc.new { |controller|
                          controller.send(:do_caching?) }

The final piece of the puzzle is a TTL. We use find to remove files older than 10 minutes, giving us a 10 minute TTL:

# This cron entry that runs every 10 minutes and removes any files older than 10 minutes named '*.cache'
3,13,23,33,43,53 * * * * find /home/ltdeploy/legaltorrents/tmp/cache -mmin +10 -name '*.cache' -exec rm -f {} \;
And that is it. For those of you not familiar with caches action here is a more complex example for a page that integrates the will_paginate plugin using :cache_path:

class CategoriesController < ApplicationController

  caches_action :show, 
    :if => Proc.new { |controller| controller.send(:do_caching?) }, 
    :cache_path => Proc.new { |c|
      c.params[:page] ?
      "#{c.request.host}.#{c.request.port}/#{c.send(:category_path,c.params[:id])}/page/#{c.params[:page]}" :
      "#{c.request.host}.#{c.request.port}/#{c.send(:category_path,c.params[:id])}/page/1"
    }

End of Story for Rails 2.1



Now, Rails 2.0 doesn’t have :if in caches_action. To work around this we used a simple monkey patch:


class ApplicationController < ActionController::Base
  # ...
  protected
  # Overrides Rails core to do action_cache when not logged in...Only works in Rails 2.0 and maybe earlier
  def perform_caching
    @@perform_caching && !logged_in? && flash.empty?
   end
Then we cache as normal:

class TorrentsController < ApplicationController
  # ...
  caches_action :show


Digg! Delicious! Technorati Blinklist Furl Reddit

Finally got NetShare to work on my iphone 3G

Posted by face on August 01, 2008

Screen shot of NetShare

I was so excited, Apple and AT&T approved tethering for my iPhone…but it was gone, no wait, gizmodo say’s it’s back! (click on picture to the left which is the “App Store” link in the gizmodo article to install NetShare right now via iTunes). Update:The link is broken now, NetShare is not available in the US app store. For us lucky few who purchased it….it still works…

However, after installing NetShare via iTunes onto my iphone I could not get it to work. I followed the instructions only to get:

Safari can’t open the page. Safari can’t open the page “http://www.apple.com/”. The error was: “unknown error” (NSURLErrorDomain:-1) Please choose Report Bugs to Apple from the Safari menu, note the error number, and describe what you did before you saw this message.

After fiddling with it for a few minutes I found a solution. The iphone appeard to be still trying to using the WiFi connection and not 3G.

So, right before the final step of launching NetShare on the iphone (i.e. after setting up the adHoc network on the iphone to your PC), launch Safari on the iPhone and surf to a page. This will cause the little WiFi icon to change to 3G at the top of your phone. Now lanuch NetShare that you have the 3G icon back and everything will work like a dream.

I can even use the ssh client on my OSX laptop to connect to remote servers using connect.c to proxy ssh through the iphone. To get this to work was something like:


wget http://www.taiyo.co.jp/~gotoh/ssh/connect.c
gcc connect.c -o connect -lresolv
sudo cp connect /usr/local/bin/.
ssh -o 'ProxyCommand /usr/local/bin/connect -5 -S 10.10.10.1:1080 %h %p' 206.71.190.75
# Of course the ProxyCommand can go in your ~/.ssh/config

Have iPhone, Have Laptop, will travel!


Digg! Delicious! Technorati Blinklist Furl Reddit