In the past couple of weeks I've been researching databases and optimization techniques and I just wanted to share something I discovered.
I'm using:
I have an "items" table with 15 columns (5 string, 5 integer, 1 text, 2 date and 2 float) and I'm trying to add 100,000 items to the table.
Here are the four methods I tried.

The baseline: plain ActiveRecord, creating each record with its own `Item.create` call. This takes 262 seconds (2.62 ms per item).
Create an array of 100,000 models with `Item.new`, then pass it to `Item.import`:

```ruby
items = []
100_000.times { items << Item.new(...) }
Item.import items
```
This technique takes 144 seconds (1.44 ms per item), of which the `Item.new` calls take 43.3 seconds and `Item.import` takes 100.7 seconds.

Improvement: takes ~55% of the plain ActiveRecord time (~1.8× faster).
Instead of creating models, just create an array of arrays:

```ruby
fields = %w{col1 col2 col3 ...}
items = []
100_000.times { items << ['value1', 'value2', ...] }
Item.import fields, items
```
This technique takes 54 seconds (0.54 ms per item), of which creating the 100,000 arrays takes 0.22 seconds and the insert takes 53.8 seconds.

Improvement: takes ~21% of the plain ActiveRecord time (~4.9× faster).
This time I created a CSV file and imported it with PostgreSQL's `COPY` command (note that the CSV file has to be on the database server's file system):

```ruby
File.open('/tmp/items.csv', 'w') do |f|
  100_000.times { f.write(...) }
end
ActiveRecord::Base.connection.execute(
  "COPY items FROM '/tmp/items.csv' csv;")
```
This technique takes 2.22 seconds: 0.84 seconds to create the CSV and 1.38 seconds for the `COPY` command.

Improvement: takes ~0.85% of the plain ActiveRecord time (~118× faster).
Technique | Total time (seconds) | Time per item (ms) | Time vs. plain ActiveRecord |
---|---|---|---|
Plain ActiveRecord | 262 | 2.62 | 100% |
activerecord-import (models) | 144 | 1.44 | 55% |
activerecord-import (arrays) | 54 | 0.54 | 21% |
COPY | 2.22 | 0.02 | 0.85% |
You can see the code for this benchmark here.
Hope you found this useful,

David.