When your software is profitable, a number of the tables can develop fairly massive — I’m taking a look at you customers desk. When you’re curious sufficient, you periodically examine how your database performs. If any sluggish question pops up within the metrics, there’s a fantastic probability that some index is lacking.
State of the DB engines
Whereas most fashionable database engines can create indexes in an asynchronous, non–blocking method, it’s good to get conversant in all of the exceptions from this rule. I extremely suggest studying the documentation of PostgreSQL, MySQL. I’m unsure about SQLite right here because the documentation doesn’t clearly state that. Nevertheless, my fast chit–chat with LLM might offer you some insights.
What’s the issue then?
As you already know CREATE INDEX
assertion will likely be dealt with asynchronously by the database if acceptable algorithm is used. Because of this no reads, writes and replace will likely be blocked.
Usually, for the Rails software, you’ll run a migration by way of Ruby course of throughout deployment, utilizing all of the goodies from ActiveRecord::Migration
class and its environment.
Let’s think about the database schema like that:
ActiveRecord::Schema[7.1].outline(model: 2024_06_13_121701) do
create_table "customers", drive: :cascade do |t|
t.string "electronic mail"
t.boolean "energetic"
t.datetime "created_at", null: false
t.datetime "updated_at", null: false
finish
finish
Think about you need to shortly discover all these customers which didn’t activate the account:
Consumer.the place(energetic: false)
When you have sufficient customers, talking of dozens or tons of of tens of millions, doing full desk scan might merely kill the database efficiency. Full desk scan occurs when database has no index to make use of and have to examine each row whether or not it meets the standards.
I’ll keep on with PostreSQL within the examples if not said in any other case.
Apparent answer
Let’s add the index then:
➜ trololo git:(grasp) bin/rails g migration AddIndexOnActiveUsers
invoke active_record
create db/migrate/20240613121751_add_index_on_active_users.rb
Implementation:
# PostgreSQL
class AddIndexOnActiveUsers < ActiveRecord::Migration[7.1]
disable_ddl_transaction!
def change
add_index :customers, :energetic, algorithm: :concurrently
finish
finish
# MySQL
class AddIndexOnActiveUsers < ActiveRecord::Migration[7.1]
def change
add_index :customers, :energetic, algorithm: :inplace
finish
finish
Let’s run it:
➜ trololo git:(grasp) bin/rails db:migrate
== 20240613121751 AddIndexOnActiveUsers: migrating ============================
-- add_index(:customers, :energetic)
-> 0.0013s
== 20240613121751 AddIndexOnActiveUsers: migrated (0.0013s) ===================
Okay, it was fairly quick, proper? That’s appropriate on your dev machine, not essentially for the manufacturing setup. Now amplify this time 0.0013s
by 6 or 7 orders of magnitude in case your desk is sufficiently big:
➜ trololo git:(grasp) bin/rails db:migrate
== 20240613121751 AddIndexOnActiveUsers: migrating ============================
-- add_index(:customers, :energetic)
-> 13000.4928s
== 20240613121751 AddIndexOnActiveUsers: migrated (0.0013s) ===================
I’ll do the maths for you: 1300.4928s means 21 minutes 40.49 seconds. However it may be even longer — don’t ask me how I discovered about this.
Whereas the method of migration will find yourself finally, the migration blocking some other deployments to your software throughout this time could also be unacceptable for varied causes:
- you can not launch some other change to manufacturing till migration completes
- one thing different goes improper and it is advisable quickly deploy a hotfix, however you possibly can’t because the deployment is blocked by lengthy working migration™
- course of supervisor on a deployment machine might count on output inside, e.g. 5 minutes. Lengthy working migration will get killed in such situation.
What to do then?
Merely skip the migration physique for RAILS_ENV=manufacturing
:
class AddIndexOnActiveUsers < ActiveRecord::Migration[7.1]
disable_ddl_transaction!
def change
add_index :customers, :energetic, algorithm: :concurrently, if_not_exists: true until Rails.env.manufacturing?
finish
finish
The migration will likely be executed on manufacturing setting, however could have no impact in your database. schema.rb
or construction.sql
(relying on what you employ) will likely be aligned. Adequate entry may also seem in schema_migrations
desk. All of the builders could have index added on their native databases, check setting will likely be aligned too.
However Hey, the place’s my index on manufacturing?! you may ask. And that’s a fairly legitimate query. What you’ll want is a solution to run a rails runner or rake job on manufacturing. Relying on the capabilities you’ve, you may select to:
- Run it inside
bin/rails console
:
# PostgreSQL
ActiveRecord::Migration.add_index :customers, :energetic, algorithm: :concurrently, if_not_exists: true
# MySQL
ActiveRecord::Migration.add_index :customers, :energetic, algorithm: :inplace, if_not_exists: true
- Do the identical by way of
bin/rails runner
:
# PostgreSQL
bin/rails r "ActiveRecord::Migration.add_index :customers, :energetic, algorithm: :concurrently, if_not_exists: true"
#MySQL
bin/rails r "ActiveRecord::Migration.add_index :customers, :energetic, algorithm: :inplace, if_not_exists: true"
- Final, however not least, implement a
Rake
job. It has the benefit that it needs to be dedicated to the repository so that you don’t lose the historical past what’s occurred:
# PostgreSQL
namespace :indexes do
job add_index_on_active_users: :setting do
ActiveRecord::Migration.add_index :customers, :energetic, algorithm: :concurrently, if_not_exists: true
finish
finish
# MySQL
namespace :indexes do
job add_index_on_active_users: :setting do
ActiveRecord::Migration.add_index :customers, :energetic, algorithm: :inplace, if_not_exists: true
finish
finish
Execute it with bin/rails indexes:add_index_on_active_users
.
For the final choice it’s additionally simple to reinforce it with logging to simply determine execution in Grafana, Datadog or some other software you employ on your logs.
namespace :indexes do
job add_index_on_active_users: :setting do
Rails.logger.data("job indexes:add_index_on_active_users began")
ActiveRecord::Migration.add_index :customers, :energetic, algorithm: :concurrently, if_not_exists: true
Rails.logger.data("job indexes:add_index_on_active_users completed”)
finish
finish
Tiny particulars
When you’re conscious sufficient, you in all probability noticed if_not_exists: true
flag. We like idempotence and that’s the rationale. If anybody runs this job once more, nothing will occur. When you favor to see ActiveRecord::StatementInvalid
as a substitute, be at liberty to skip it.
As talked about within the preface, to make use of acceptable algorithm for index creation algorithm: :concurrently
for PostgreSQL and algorithm: inplace
for MySQL needs to be specified.
There’s one other quirk for PostgreSQL:
One other distinction is {that a} common
CREATE INDEX
command will be carried out inside a transaction block, howeverCREATE INDEX CONCURRENTLY
can not.
Every Rails migration execution is wrapped inside transaction. To disable this conduct, it is advisable use disable_ddl_transaction!
methodology inside your migration. In any other case, following error will pop up:
bin/rails aborted!
StandardError: An error has occurred, this and all later migrations canceled: (StandardError)
PG::ActiveSqlTransaction: ERROR: CREATE INDEX CONCURRENTLY can not run inside a transaction block
/Customers/fidel/code/fidel/trololo/db/migrate/20240613121751_add_index_on_active_users.rb:5:in `change'
Attributable to:
ActiveRecord::StatementInvalid: PG::ActiveSqlTransaction: ERROR: CREATE INDEX CONCURRENTLY can not run inside a transaction block (ActiveRecord::StatementInvalid)
/Customers/fidel/code/fidel/trololo/db/migrate/20240613121751_add_index_on_active_users.rb:5:in `change'
Attributable to:
PG::ActiveSqlTransaction: ERROR: CREATE INDEX CONCURRENTLY can not run inside a transaction block (PG::ActiveSqlTransaction)
/Customers/fidel/code/fidel/trololo/db/migrate/20240613121751_add_index_on_active_users.rb:5:in `change'
Duties: TOP => db:migrate
(See full hint by working job with --trace)
Nevertheless, it’s not an issue for our customized script or Rake
job run on manufacturing setting.