Degradation of indexing speed in SQLite 3.6.20
Hardware
laptop with Intel(R) Core(TM)2 Duo CPU T5470 1.60GHz (working as 800MHz CPU for low heating)
1 GB RAM
HDD Western Digital Scorpio Black 2.5" 7200 rpm 16Mb (FS ext3)
OS Debian (lenny+testing+sid).
SQLite build
My debian package has these options to compile SQLite 3.6.20:
You can get binary and sources packages from http://mobigroup.ru/debian/pool/main/s/sqlite3/ (current build is 3.6.20-mobigroup.1).
Test results
test.tcl
Upd.
I did perform additional test for speed of select query. Is used the real query from telephony billing system.
Select Time for
select rowid from test where name='Группа направлений 0' and destcode=500 and delete_date IS NULL
Note: the query returns 1...3 rows. Time is measured by SQLite timer as user+sys times.
The test results
There is no speed degradation of no-indexed database.
As temporary fix of indexing problem we can use single-column realtime index and create complex index after full population of database (I split database monthly).
Upd.
I did perform the tests on server hardware (with 8 Gb RAM). As we can see the index creating speed degradation is result of index size more than SQLite page cache size.
The last index size must be more than 2Gb and can't be effectively created with cache_size equal to 2Gb. This command interrupted manually by Ctrl+C keys.
laptop with Intel(R) Core(TM)2 Duo CPU T5470 1.60GHz (working as 800MHz CPU for low heating)
1 GB RAM
HDD Western Digital Scorpio Black
OS Debian (lenny+testing+sid).
SQLite build
My debian package has these options to compile SQLite 3.6.20:
DEB_OPT_FLAG := -O2 -fno-strict-aliasing \
-DSQLITE_CORE=1 \
-DSQLITE_ENABLE_COLUMN_METADATA \
-DSQLITE_ENABLE_STAT2 \
-DSQLITE_DEFAULT_RECURSIVE_TRIGGERS=1 \
-DSQLITE_ENABLE_COMPRESS \
-DSQLITE_ENABLE_FTS3 \
-DSQLITE_ENABLE_INET \
-DSQLITE_ENABLE_MD5 \
-DSQLITE_ENABLE_RTREE \
-DSQLITE_ENABLE_TABLEFUNC \
-DSQLITE_ENABLE_VIRTUALTEXT \
-DSQLITE_ENABLE_ICU \
-DSQLITE_ENABLE_UUID \
-DSQLITE_ENABLE_VERSIONING \
-DSQLITE_SOUNDEX \
-DSQLITE_TEMP_STORE=3 \
-DSQLITE_DEFAULT_PAGE_SIZE=4096 \
-DSQLITE_DEFAULT_CACHE_SIZE=128000 \
-DSQLITE_DEFAULT_TEMP_CACHE_SIZE=8000 \
-DSQLITE_DEFAULT_FILE_FORMAT=4
LDFLAGS=-luuid -lz -ldl `icu-config --ldflags`
You can get binary and sources packages from http://mobigroup.ru/debian/pool/main/s/sqlite3/ (current build is 3.6.20-mobigroup.1).
Test results
┌────────────────────┬────┬───┬─────┬──────┬──────┬───────────────────────────┐
│Groups count (N) │1 │10 │100 │1000 │10000 │ │
├────────────────────┴────┴───┴─────┴──────┴──────┴───────────────────────────┤
│DB population │
├────────────────────┬────┬───┬─────┬──────┬──────┬───────────────────────────┤
│Creation time │2s │20s│205s │2225s │21329s│./test.tcl groups N │
│DB size │2,6M│26M│255M │2,6G │26G │items 25000 │
├────────────────────┴────┴───┴─────┴──────┴──────┴───────────────────────────┤
│Indexing after DB population │
├────────────────────┬────┬───┬─────┬──────┬──────┬───────────────────────────┤
│Single-column │ │ │ │ │ │ │
│indexing time │<1s │5s │49s │574s │ │CREATE INDEX single_idx on │
│DB size │2,9M│29M│290M │2,9G │- │test(destcode) │
├────────────────────┼────┼───┼─────┼──────┼──────┼───────────────────────────┤
│Analyze time │<1s │<1s│28s │588s │- │ANALYZE │
├────────────────────┴────┴───┴─────┴──────┴──────┴───────────────────────────┤
│Indexing after DB population │
├────────────────────┬────┬───┬─────┬──────┬──────┬───────────────────────────┤
│Multi-column │ │ │ │ │ │CREATE INDEX complex_idx │
│indexing time │1s │6s │80s │>5000s│ │on test │
│DB size │4,0M│40M│397M │ - │- │(destcode,name,delete_date)│
├────────────────────┼────┼───┼─────┼──────┼──────┼───────────────────────────┤
│Analyze time │<1s │4s │211s │- │- │ANALYZE │
├────────────────────┴────┴───┴─────┴──────┴──────┴───────────────────────────┤
│Single-column indexing with DB population │
├────────────────────┬────┬───┬─────┬──────┬──────┬───────────────────────────┤
│ │ │ │ │ │ │./test.tcl groups N │
│Creation time/DB │3s │29s│658s │>5000s│ │items 25000 index single │
│size │2,9M│29M│290M │ - │- │ │
├────────────────────┴────┴───┴─────┴──────┴──────┴───────────────────────────┤
│Multi-column indexing with DB population │
├────────────────────┬────┬───┬─────┬──────┬──────┬───────────────────────────┤
│ │ │ │ │ │ │./test.tcl groups N │
│Creation time/DB │3s │36s│1875s│ │ │items 25000 index complex │
│size │4,0M│40M│ 397M│- │- │ │
└────────────────────┴────┴───┴─────┴──────┴──────┴───────────────────────────┘
test.tcl
#!/usr/bin/tclsh8.5
# Use as ./test.tcl groups X items Y [index single|complex]
package require sqlite3
array set opts $argv
puts "Test of $opts(groups) with $opts(items) items in each group"
catch {file delete test$opts(groups).db}
catch {file delete test$opts(groups).db-journal}
sqlite3 db test$opts(groups).db
db eval {PRAGMA cache_size=16000}
#db eval {PRAGMA synchronous=OFF}
db eval {
CREATE TABLE test ( -- справочник направлений для телефонии
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
save_date REAL NOT NULL DEFAULT (julianday('now')),
delete_date REAL,
name text collate russian not null, -- группа направлений
destname text collate russian not null, -- название направления
destcode integer not null, -- префикс E.164 номера телефона
rcode text collate NOCASE not null, -- код региона
price real not null, -- стоимость минуты разговора
currency text collate NOCASE not null default 'RUB'
)
}
if {[info exists opts(index)] && $opts(index) eq {single}} {
db eval {CREATE INDEX single_idx on test(destcode)}
}
if {[info exists opts(index)] && $opts(index) eq {complex}} {
db eval {CREATE INDEX complex_idx on test(destcode,name,delete_date)}
}
for {set j 0} {$j<[expr $opts(groups)]} {incr j} {
set name "Группа направлений $j"
db transaction {
for {set i 0} {$i<$opts(items)} {incr i} {
set destname "Направление $i"
set destcode [string range [expr round(1000000000000000*rand())] 0 [expr round(15*rand())]]
set price [expr round(10000*rand())/100.]
# puts "insert into test (name,destname,destcode,rcode,price) values ($name,$destname,$destcode,'',$price)"
db eval {insert into test (name,destname,destcode,rcode,price) values ($name,$destname,$destcode,'',$price)}
}
}
}
db close
Upd.
I did perform additional test for speed of select query. Is used the real query from telephony billing system.
Select Time for
select rowid from test where name='Группа направлений 0' and destcode=500 and delete_date IS NULL
┌────────────────────┬────┬───┬─────┬──────┬──────┐
│Groups count (N) │1 │10 │100 │1000 │10000 │
├────────────────────┼────┼───┼─────┼──────┼──────┤
│without index │0.1s│1s │8s │78s │796s │
├────────────────────┼────┼───┼─────┼──────┼──────┤
│single index │0s │0s │0.08s│0.4s │- │
└────────────────────┴────┴───┴─────┴──────┴──────┘
Note: the query returns 1...3 rows. Time is measured by SQLite timer as user+sys times.
The test results
There is no speed degradation of no-indexed database.
As temporary fix of indexing problem we can use single-column realtime index and create complex index after full population of database (I split database monthly).
Upd.
I did perform the tests on server hardware (with 8 Gb RAM). As we can see the index creating speed degradation is result of index size more than SQLite page cache size.
$ time ./test.tcl groups 1000 items 25000; \
time sqlite3 test1000.db "pragma cache_size=500000;CREATE INDEX single_idx on test(destcode);"
Test of 1000 with 25000 items in each group
real 10m7.098s
user 9m36.932s
sys 0m17.545s
real 3m5.097s
user 2m55.639s
sys 0m4.392s
$ time ./test.tcl groups 2000 items 25000; \
time sqlite3 test2000.db "pragma cache_size=500000;CREATE INDEX single_idx on test(destcode);";
Test of 2000 with 25000 items in each group
real 21m7.168s
user 19m13.756s
sys 0m33.994s
real 6m42.785s
user 6m12.483s
sys 0m17.749s
$ > time ./test.tcl groups 4000 items 25000; \
time sqlite3 test4000.db "pragma cache_size=500000;CREATE INDEX single_idx on test(destcode);";
Test of 4000 with 25000 items in each group
real 40m36.081s
user 38m28.492s
sys 1m11.944s
real 13m42.090s
user 12m48.424s
sys 0m28.150s
$ time ./test.tcl groups 8000 items 25000; \
time sqlite3 test8000.db "pragma cache_size=500000;CREATE INDEX single_idx on test(destcode);"
Test of 8000 with 25000 items in each group
real 80m36.335s
user 76m46.752s
sys 2m9.436s
^CError: interrupted
real 214m12.327s
user 23m39.909s
sys 1m18.873s
The last index size must be more than 2Gb and can't be effectively created with cache_size equal to 2Gb. This command interrupted manually by Ctrl+C keys.
Comments