Posts Tagged ‘spock proxy’

MySQL data sharding using Spock Proxy

Tuesday, August 12th, 2008

Yesterday at the Silicon valley MySQL Meetup, Frank of Spock.com talked about Spock Proxy. Spock Proxy is a fork of MySQL proxy which has been built to meet the data sharding needs of Spock.com, the people search engine.

Here are some highlights:

  • Spock.com’s web interface is built on Rails and they use ActiveRecords as their O-R layer for MySQL data access
  • Spock has around 1,000 web servers using Rails and they connect to MySQL slaves and masters using Spock Proxy
  • Spock Proxy acts like a normal MySQL engine, except that it transparently talks to other MySQL servers. At spock they use 4 master and 4 slaves each having their own Spock Proxy.
  • The Web servers each have one connection open to the Spock Proxy while the proxy may have 100s of pooled connections
  • The Proxy tokenizes a SQL statement and figures out the target shard for the query. The query must have a shard_key. The shard_key is stored in a Universal DB which stores the dictionary of the partitioned tables, shard hostname/user/password, ranges and range for auto_incremented columns
  • It currently supports only range based partitioning — while a lot of partitioning is done based on hashing, but should not be a big deal to change
  • The current alpha version is very much suited to meet Spock’s internal needs, but I’m sure people will take this up to generalize
  • Unsupported query constructs (like inner queries, group by, multi-table joins) may not throw exceptions. DDLs are also not supported