понедельник, 14 декабря 2009 г.

The small signature for sqlite3-rdiff

Depends: current test version of sqlite3-rdiff

The sqlite3-rdiff utility produces a signature file of size about 10% of original database size. Yes, it's better than coping an entire database but it's not good enough for production use. So I wrote a new algorithm that can build the small signatures. The algorithm calculates checksums for set of rows and so with N rows in each set the signature size will be decreased by the factor of N! Of course the delta file size will be increased but only a little for most databases.

My code uses a hack


select rowid/N as rowid, sum(murmurhash('$unixepoch',$cols)) from ... group by rowid


This solution is not good and may be used for testing only. This code will fail with N>16.


$ time ./sqlite3-rdiff signature slave.db slave.db.signature
signature slave.db slave.db.signature --table-name % --rows-per-hash 1
=7 system_config
=160 center
=7 role
=1 macroregion
=988 point
=7 region
=1841 user
=2867166 document_status
=25 comment
=250985 document_comment
=0 operation
=16 datatype
=7 type_template
=51 template
=370 template_field
=511335 document
=1593416 document_field
=757015 document_file
=7336 document_photo
=16 status_switch
=491129 message
=7 system_config_history
=160 center_history
=7 role_history
=1 macroregion_history
=988 point_history
=7 region_history
=1841 user_history
=2869992 document_status_history
=25 comment_history
=251191 document_comment_history
=0 operation_history
=16 datatype_history
=7 type_template_history
=51 template_history
=370 template_field_history
=511335 document_history
=1593416 document_field_history
=757015 document_file_history
=7336 document_photo_history
=16 status_switch_history

real 0m55.309s
user 0m53.075s
sys 0m1.444s


$ time ./sqlite3-rdiff delta slave.db.signature master.db slave.db.delta
delta slave.db.signature master.db slave.db.delta --table-name % --rows-per-hash 1
-0 +0 system_config
-0 +0 center
-0 +0 role
-0 +0 macroregion
-0 +0 point
-0 +0 region
-0 +0 user
-687 +2922 document_status
-0 +0 comment
-50 +230 document_comment
-0 +0 operation
-0 +0 datatype
-0 +0 type_template
-0 +0 template
-0 +0 template_field
-0 +550 document
-0 +1348 document_field
-0 +970 document_file
-0 +2 document_photo
-0 +0 status_switch
-0 +4443 message
-0 +0 system_config_history
-0 +0 center_history
-0 +0 role_history
-0 +0 macroregion_history
-0 +0 point_history
-0 +0 region_history
-0 +0 user_history
-0 +4202 document_status_history
-0 +0 comment_history
-0 +230 document_comment_history
-0 +0 operation_history
-0 +0 datatype_history
-0 +0 type_template_history
-0 +0 template_history
-0 +0 template_field_history
-0 +550 document_history
-0 +1348 document_field_history
-0 +970 document_file_history
-0 +2 document_photo_history
-0 +0 status_switch_history

real 4m20.282s
user 3m18.276s
sys 0m5.944s



$ time ./sqlite3-rdiff --rows-per-hash 16 signatureN slave.db slave.db.signature16
signature slave.db slave.db.signature16 --table-name % --rows-per-hash 16
=1 system_config
=19 center
=1 role
=1 macroregion
=90 point
=1 region
=185 user
=179198 document_status
=3 comment
=15687 document_comment
=0 operation
=2 datatype
=1 type_template
=10 template
=57 template_field
=45270 document
=227657 document_field
=63430 document_file
=495 document_photo
=3 status_switch
=30697 message
=1 system_config_history
=11 center_history
=1 role_history
=1 macroregion_history
=62 point_history
=1 region_history
=116 user_history
=179375 document_status_history
=2 comment_history
=15700 document_comment_history
=0 operation_history
=2 datatype_history
=1 type_template_history
=4 template_history
=24 template_field_history
=31959 document_history
=99589 document_field_history
=47314 document_file_history
=459 document_photo_history
=2 status_switch_history

real 3m40.539s
user 1m40.978s
sys 0m8.097s


$ time ./sqlite3-rdiff --rows-per-hash 16 deltaN slave.db.signature16 master.db slave.db.delta16
delta slave.db.signature16 master.db slave.db.delta16 --table-name % --rows-per-hash 16
-0 +0 system_config
-0 +0 center
-0 +0 role
-0 +0 macroregion
-0 +0 point
-0 +0 region
-0 +0 user
-5808 +8042 document_status
-0 +0 comment
-480 +654 document_comment
-0 +0 operation
-0 +0 datatype
-0 +0 type_template
-0 +0 template
-0 +0 template_field
-16 +553 document
-16 +1352 document_field
-16 +974 document_file
-4 +4 document_photo
-0 +0 status_switch
-16 +4452 message
-0 +0 system_config_history
-0 +0 center_history
-0 +0 role_history
-0 +0 macroregion_history
-0 +0 point_history
-0 +0 region_history
-0 +0 user_history
-16 +4211 document_status_history
-0 +0 comment_history
-16 +238 document_comment_history
-0 +0 operation_history
-0 +0 datatype_history
-0 +0 type_template_history
-0 +0 template_history
-0 +0 template_field_history
-16 +558 document_history
-16 +1357 document_field_history
-16 +978 document_file_history
-11 +11 document_photo_history
-0 +0 status_switch_history

real 3m28.369s
user 1m51.347s
sys 0m7.928s



$ ls -lh|grep slave.db|awk '{print $5 "\t" $8}'
1.5G slave.db
3.4M slave.db.delta
4.3M slave.db.delta16
160M slave.db.signature
13M slave.db.signature16

суббота, 12 декабря 2009 г.

sqlite3-rdiff: master-slave replication for SQLite

Link: http://mobigroup.ru/files/sqlite-ext/sqlite3-rdiff

Depends: tcl 8.5, sqlite3, murmurhash SQLite extension

I'm glad to annonce the sqlite3-diff utility for SQLite replication. Are used the ROWID value as unique key of row and murmurhash for build signatures for each row. It's enough for master-slave replication. The INTEGER PRIMARY KEY is not mandatory becouse sqlite3-rdiff may store the ROWID values.

The signature and delta files are valid SQLite3 databases too. They can be dumped/restored, updated and analyzed by any SQLite3 client application.

Hash collisions are resolved by using unique salt for each signature. So after first sync the theoretical frequency of collision (when any record is changed but the hash is same) is about 10^-9, after second sync - 10^-18, after N syncs - 10^-N*9. The salt is unix epoch time which is saved by using "PRAGMA user_version" in signature and delta databases.

Note: the master-master replication is now unsupported becouse is can't be so universal and are needed some conflict-resolution strategies. But it possible with uuid extension to have really unique identifier for each record and use it as key for master-master replication (and of course the ROWID values must be ignored). But the signature and delta files will be too big.

Usage

NAME

sqlite3-rdiff - compute and apply signature-based row differences for SQLite3 databases

SYNOPSYS
sqlite3-rdiff [options] signature old-file signature-file

sqlite3-rdiff [options] delta signature-file new-file delta-file

sqlite3-rdiff [options] patch old-file delta-file result-file


DESCRIPTION

Signature is solving by murmurhash2 algorithm for full row. Hash collision is resolved by using unique salt for each replication session.

OPTIONS:

--table-name regular expression for table names, use ? as any single symbol and % for any set of symbols.

AUTHOR
Alexey Pechnikov <pechnikov@mobigroup.ru>

SEE ALSO
The main rdiff-backup web page is at http://geomapx.blogspot.com/2009/12/sqlite3-rdiff-master-slave-replication.html.


Example.

The test on my Core2 Duo laptop.


# add test triggers to block table inserts
# the replication utility may disable triggers when patching database
$ sqlite3 2009-11-01.db "create trigger telephony_log_i before insert on telephony_log \
begin select raise(ABORT,'trigger fire');end"

# make slave database with some changes
$ cp 2009-11-01.db 2009-11-01.db.slave
$ sqlite3 2009-11-01.db.slave "delete from telephony_log where rowid%100000=0;"
$ sqlite3 2009-11-01.db.slave "delete from telephony_log_error where rowid%10=0;"

# how many rows are deleted in previous commands?
$ sqlite3 2009-11-01.db "select count(*) from telephony_log where rowid%100000=0;"
14
$ sqlite3 2009-11-01.db "select count(*) from telephony_log_error where rowid%10=0;"
41


$ time ./sqlite3-rdiff signature 2009-11-01.db.slave 2009-11-01.db.signature
signature 2009-11-01.db.slave 2009-11-01.db.signature --table-name %
=1465379 telephony_log
=303709 telephony_log_rating
=377 telephony_log_error
=140 telephony_log_counter

real 0m40.361s
user 0m28.850s
sys 0m0.936s


$ time ./sqlite3-rdiff delta 2009-11-01.db.signature 2009-11-01.db 2009-11-01.db.delta
delta 2009-11-01.db.signature 2009-11-01.db 2009-11-01.db.delta --table-name %
-0 +14 telephony_log
-0 +0 telephony_log_rating
-0 +41 telephony_log_error
-0 +0 telephony_log_counter

real 1m50.502s
user 1m34.338s
sys 0m1.268s


$ time ./sqlite3-rdiff patch 2009-11-01.db.slave 2009-11-01.db.delta 2009-11-01.db.replicated
patch 2009-11-01.db.slave 2009-11-01.db.delta 2009-11-01.db.replicated --table-name %
-0 +14 telephony_log (2 triggers disabled)
-0 +0 telephony_log_rating (0 triggers disabled)
-0 +41 telephony_log_error (0 triggers disabled)
-0 +0 telephony_log_counter (0 triggers disabled)

real 0m24.624s
user 0m0.168s
sys 0m2.764s

$ ls -lh|grep 2009-11-01.db|awk '{print $5 "\t" $9}'
342M 2009-11-01.db
44K 2009-11-01.db.delta
342M 2009-11-01.db.replicated
23M 2009-11-01.db.signature
342M 2009-11-01.db.slave


We can check the success of replication by starting replication again. There are no new changes in replicated database and so the we may get the empty patch.


$ ./sqlite3-rdiff signature 2009-11-01.db.replicated 2009-11-01.db.signature2
signature 2009-11-01.db.replicated 2009-11-01.db.signature2 --table-name %
=1465393 telephony_log
=303709 telephony_log_rating
=418 telephony_log_error
=140 telephony_log_counter

$ ./sqlite3-rdiff delta 2009-11-01.db.signature2 2009-11-01.db 2009-11-01.db.delta2
delta 2009-11-01.db.signature2 2009-11-01.db 2009-11-01.db.delta2 --table-name %
-0 +0 telephony_log
-0 +0 telephony_log_rating
-0 +0 telephony_log_error
-0 +0 telephony_log_counter

$ ./sqlite3-rdiff patch 2009-11-01.db.replicated 2009-11-01.db.delta2 2009-11-01.db.replicated2
patch 2009-11-01.db.replicated 2009-11-01.db.delta2 2009-11-01.db.replicated2 --table-name %
-0 +0 telephony_log (1 triggers disabled)
-0 +0 telephony_log_rating (0 triggers disabled)
-0 +0 telephony_log_error (0 triggers disabled)
-0 +0 telephony_log_counter (0 triggers disabled)


All right, the patch is empty.

Upd.

The test on the Core Quad desktop with single 10 000 rpm HDD. Database work.db.copy is used for some tests and has different schema but the sqlite3-rdiff work fine on it.


$ time ./sqlite3-rdiff signature work.db.copy work.db.signature
signature work.db.copy work.db.signature --table-name %
=7 system_config
=160 center
=7 role
=1 macroregion
=988 point
=7 region
=1841 user
=2863845 document_status
=25 comment
=250561 document_comment
=0 operation
=16 datatype
=7 type_template
=51 template
=370 template_field
=510515 document
=1591502 document_field
=755659 document_file
=7329 document_photo
=16 status_switch
=486684 message
=7 system_config_history
=160 center_history
=7 role_history
=1 macroregion_history
=988 point_history
=7 region_history
=1841 user_history
=2863845 document_status_history
=25 comment_history
=250561 document_comment_history
=0 operation_history
=16 datatype_history
=7 type_template_history
=51 template_history
=370 template_field_history
=510515 document_history
=1591502 document_field_history
=755659 document_file_history
=7329 document_photo_history
=16 status_switch_history

real 1m58.328s
user 0m53.179s
sys 0m4.604s


$ time ./sqlite3-rdiff delta work.db.signature work.db work.db.delta
delta work.db.signature work.db work.db.delta --table-name %
-0 +0 system_config
-0 +0 center
-0 +0 role
-0 +0 macroregion
-0 +0 point
-0 +0 region
-0 +0 user
-995 +4316 document_status
-0 +0 comment
-205 +629 document_comment
-0 +0 operation
-0 +0 datatype
-0 +0 type_template
-0 +0 template
-0 +0 template_field
-0 +820 document
-0 +1914 document_field
-0 +1356 document_file
-0 +7 document_photo
-0 +0 status_switch
-0 +4445 message
-0 +0 system_config_history
-0 +0 center_history
-0 +0 role_history
-0 +0 macroregion_history
-0 +0 point_history
-0 +0 region_history
-0 +0 user_history
-0 +6147 document_status_history
-0 +0 comment_history
-0 +630 document_comment_history
-0 +0 operation_history
-0 +0 datatype_history
-0 +0 type_template_history
-0 +0 template_history
-0 +0 template_field_history
-0 +820 document_history
-0 +1914 document_field_history
-0 +1356 document_file_history
-0 +7 document_photo_history
-0 +0 status_switch_history

real 5m11.592s
user 3m23.989s
sys 0m5.704s


$ time ./sqlite3-rdiff patch work.db.copy work.db.delta work.db.copy
patch work.db.copy work.db.delta work.db.copy --table-name %
-0 +0 system_config (3 triggers disabled)
-0 +0 center (3 triggers disabled)
-0 +0 role (3 triggers disabled)
-0 +0 macroregion (3 triggers disabled)
-0 +0 point (3 triggers disabled)
-0 +0 region (3 triggers disabled)
-0 +0 user (3 triggers disabled)
-995 +4316 document_status (3 triggers disabled)
-0 +0 comment (3 triggers disabled)
-205 +629 document_comment (3 triggers disabled)
-0 +0 operation (3 triggers disabled)
-0 +0 datatype (3 triggers disabled)
-0 +0 type_template (3 triggers disabled)
-0 +0 template (3 triggers disabled)
-0 +0 template_field (3 triggers disabled)
-0 +820 document (3 triggers disabled)
-0 +1914 document_field (3 triggers disabled)
-0 +1356 document_file (3 triggers disabled)
-0 +7 document_photo (3 triggers disabled)
-0 +0 status_switch (3 triggers disabled)
-0 +4445 message (0 triggers disabled)
system_config_versioning
center_versioning
role_versioning
macroregion_versioning
point_versioning
region_versioning
user_versioning
document_status_versioning
comment_versioning
document_comment_versioning
operation_versioning
datatype_versioning
type_template_versioning
template_versioning
template_field_versioning
document_versioning
document_field_versioning
document_file_versioning
document_photo_versioning
status_switch_versioning

real 0m0.726s
user 0m0.500s
sys 0m0.124s


$ ls -lh|grep work.db|awk '{print $5 "\t" $8}'
1.5G work.db
2.0G work.db.copy
4.0M work.db.delta
159M work.db.signature

четверг, 10 декабря 2009 г.

The SQLite History extension

The article is not completed!

The History SQLite extension may be used for table history logging.

This extension has been tested with SQLite versions 3.6.x running under Linux Debian.

The code is public domain.

Link: http://mobigroup.ru/files/sqlite-ext/history/

Suggests: ENV extension

Usage

There are anly 2 functions in the extension:

history(table ,column) - add history logging support for the table [table] by key column [column].
unhistory(table) - drop history logging triggers for the table [table]. The history table is saved! Only triggers will be drops.

The *_history table consists all fields of the [table] table without any checks or constraints and some additional fields:

CREATE TABLE [table]_history(
...
_date REAL,
_action TEXT,
_user TEXT,
_host TEXT
);

The _action field has action names where 'I' - insert, 'U' - update, 'D' - delete and '' - initial import.

The index for common queries to history table is built autimatically:

sqlite> explain query plan select * from sessions_history where key='test key 0' order by _date;
0|0|TABLE sessions_history WITH INDEX sessions_history_complex_idx ORDER BY


Note: the last record version does not have duplicate in the history table. So "insert" operations does not do logging but all "update" and "delete" operations do logged.

example.sql

g$ sqlite3
SQLite version 3.6.21
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .load './libsqliteversioning.so'

sqlite>
.load './libsqlitehistory.so'

CREATE TABLE IF NOT EXISTS sessions (
id INTEGER PRIMARY KEY,
key text not null unique on conflict replace,
value text not null
);
insert into sessions (key,value) values ('test key','test value');
select history('sessions', 'key'); -- create history table and triggers
select history('sessions', 'key'); -- no action
insert into sessions (key,value) values ('test key','test value');
.schema

insert into sessions (key,value) values ('test key 0','test value 0');
insert into sessions (key,value) values ('test key 1','test value 1');
insert into sessions (key,value) values ('test key 2','test value 2');

.header on
select * from sessions_history;

update sessions set value='new value' where key='test key 0';
delete from sessions where key='test key 0';
select datetime(_date),* from sessions_history where key='test key 0' order by _date desc;

.header off
explain query plan select * from sessions_history where key='test key 0' order by _date;
select unhistory('sessions');
.schema


Example results

sqlite> .read example.sql

CREATE TABLE sessions (
id INTEGER PRIMARY KEY,
key text not null unique on conflict replace,
value text not null
);
CREATE TABLE sessions_history(
id INT,
"key" TEXT,
value TEXT,
_date REAL,
_action TEXT,
_user TEXT,
_host TEXT
);
CREATE INDEX sessions_history_complex_idx on sessions_history(key, _date);
CREATE TRIGGER sessions_history_d BEFORE DELETE ON sessions
BEGIN INSERT INTO sessions_history SELECT *, julianday('now'), 'D', sqlite_user(),
sqlite_host() FROM sessions WHERE rowid=OLD.rowid;
END;
CREATE TRIGGER sessions_history_i AFTER INSERT ON sessions
BEGIN INSERT INTO sessions_history (key, _date, _action, _user, _host)
VALUES (NEW.key, julianday('now'), 'I', sqlite_user(), sqlite_host());
END;
CREATE TRIGGER sessions_history_u BEFORE UPDATE ON sessions BEGIN
SELECT RAISE (ABORT,'The key field "key" can not be modified') WHERE NEW.key != OLD.key;
INSERT INTO sessions_history SELECT *, julianday('now'), 'U', sqlite_user(),
sqlite_host() FROM sessions WHERE rowid=OLD.rowid;
END;
id|key|value|_date|_action|_user|_host
|test key||2455178.31008024|||
1|test key|test value|2455178.31008028|D||
|test key||2455178.31008028|I||
|test key 0||2455178.31008029|I||
|test key 1||2455178.31008029|I||
|test key 2||2455178.31008029|I||
datetime(_date)|id|key|value|_date|_action|_user|_host
2009-12-12 19:26:30|3|test key 0|new value|2455178.3100803|D||
2009-12-12 19:26:30|3|test key 0|test value 0|2455178.3100803|U||
2009-12-12 19:26:30||test key 0||2455178.31008029|I||
0|0|TABLE sessions_history WITH INDEX sessions_history_complex_idx ORDER BY

CREATE TABLE sessions (
id INTEGER PRIMARY KEY,
key text not null unique on conflict replace,
value text not null
);
CREATE TABLE sessions_history(
id INT,
"key" TEXT,
value TEXT,
_date REAL,
_action TEXT,
_user TEXT,
_host TEXT
);
CREATE INDEX sessions_history_complex_idx on sessions_history(key, _date);


Saving the host and user information
The SQLite is commonly used as embedded database engine and has no host and user access control mechanisms. But we can get this information from environment variables or by call of application functions. These two variants are described below.

ENV extension provided sqlite_* functions
The Env extension provide access to enviromnent variables, current user information, etc.

$ SQLITE_USER=USERNAME SQLITE_HOST=HOSTNAME sqlite3
SQLite version 3.6.21
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> select sqlite_user();
USERNAME
sqlite> select sqlite_host();
HOSTNAME


Application-defined sqlite_* functions

# install functions to db
proc ::dataset::install {} {
foreach func {sqlite_user sqlite_host} {
$db function $func [namespace current]::$func
}
}
proc ::dataset::sqlite_user {} {
return [ns_user id]
}
proc ::dataset::sqlite_host {} {
return [ns_conn server]
}

::dataset::eval db :memory: {
ns_html::puts [db onecolumn {select sqlite_user()}]
ns_html::br
ns_html::puts [db onecolumn {select sqlite_host()}]
}


TODO

1. Multikey support
versioning(table ,column1, column2, ...)

2. No-key support
versioning(table)

О master-master репликации

Нежданно-негаданно встретилась ссылка на следующий проект: rubyrep - репликация для PostgreSQL и MySQL.

Непосредственно реализацию не смотрел, а вот дизайн системы мне понравился. А вот насчет master-master режима есть разумное сомнение - разработчики предлагают некоторые алгоритмы разрешения конфликтов, но, по сути, если архитектура реплицируемой базы позволяет возникновение конфликтов, это проблема архитектуры, а отнюдь не репликатора. Вместо того, чтобы изменять одну и ту же таблицу на разных хостах, можно создать набор схем (schema) так, что каждый хост работает со своей схемой, тогда при репликации каждый хост будет мастером для "своей" схемы и слэйвом для "чужих". В эскулайте то же самое реализуется созданием набора баз данных (притом часть идентификаторов должны иметь или сквозную нумерацию, или быть уникальными uuid).

среда, 9 декабря 2009 г.

Трэйсер запросов в tclsqlite

В новом релизе 3.6.21 появилась возможность полноценной трассировки запросов (раньше значения забинденных переменных не выводились). Тест у меня для этой функции давно написан, его и привожу.

Кстати, нюанс - функция трассировки имеет возможность отменить выполнение запроса, так что аудит запросов в эскулайт легко делается встроенными средствами. Замечу, что инструкции PRAGMA не выводятся.


$ ./test_trace.tcl
{DROP TABLE IF EXISTS events}
{CREATE TABLE events (id INTEGER PRIMARY KEY,value INTEGER)}
{SAVEPOINT _tcl_transaction}
{insert into events (value) values (0)}
{insert into events (value) values (1)}
{insert into events (value) values (2)}
{insert into events (value) values (3)}
{insert into events (value) values (4)}
{insert into events (value) values (5)}
{insert into events (value) values (6)}
{insert into events (value) values (7)}
{insert into events (value) values (8)}
{insert into events (value) values (9)}
{insert into events (value) values ('value')}
COMMIT
{CREATE INDEX events_value_idx ON events(value);}
{select value from events}


test_trace.tcl

#!/usr/bin/tclsh8.5
package require sqlite3
sqlite3 db trace.db

proc trace {args} {
puts "$args"
return SQLITE_OK
}
db trace trace
db eval {PRAGMA legacy_file_format = off}
db eval {DROP TABLE IF EXISTS events}
db eval {CREATE TABLE events (id INTEGER PRIMARY KEY,value INTEGER)}
db transaction {
for {set i 0} {$i<10} {incr i} {
set value [expr {$i % 500000}]
db eval {insert into events (value) values ($value)}
}
set value value
db eval {insert into events (value) values ($value)}
}
db eval {CREATE INDEX events_value_idx ON events(value);}
db onecolumn {select value from events}
db close


Что интересно, системные запросы тоже выводятся. Даже не знаю пока, баг это или фича, написал в рассылку эскулайта, посмотрим, что ответят. С одной стороны, полезно для отладки, с другой - придется как-то отделять введенные пользователем запросы от сгенерированных в "движке".


ANALYZE
{SELECT idx, stat FROM 'main'.sqlite_stat1}
{SELECT idx,sampleno,sample FROM 'main'.sqlite_stat2}
VACUUM
{ATTACH '' AS vacuum_db;}
{BEGIN EXCLUSIVE;}
{SELECT 'CREATE TABLE vacuum_db.' || substr(sql,14) FROM sqlite_master WHERE type='table'
AND name!='sqlite_sequence' AND rootpage>0}
{CREATE TABLE vacuum_db.events (id INTEGER PRIMARY KEY,value INTEGER)}
{CREATE TABLE vacuum_db.sqlite_stat1(tbl,idx,stat)}
{CREATE TABLE vacuum_db.sqlite_stat2(tbl,idx,sampleno,sample)}
{SELECT 'CREATE INDEX vacuum_db.' || substr(sql,14) FROM sqlite_master WHERE sql LIKE
'CREATE INDEX %' }
{CREATE INDEX vacuum_db.events_value_idx ON events(value)}
{SELECT 'CREATE UNIQUE INDEX vacuum_db.' || substr(sql,21) FROM sqlite_master WHERE sql
LIKE 'CREATE UNIQUE INDEX %'}
{SELECT 'INSERT INTO vacuum_db.' || quote(name) || ' SELECT * FROM main.' || quote(name) ||
';'FROM main.sqlite_master WHERE type = 'table' AND name!='sqlite_sequence' AND rootpage>0}
{INSERT INTO vacuum_db.'events' SELECT * FROM main.'events';}
{INSERT INTO vacuum_db.'sqlite_stat1' SELECT * FROM main.'sqlite_stat1';}
{INSERT INTO vacuum_db.'sqlite_stat2' SELECT * FROM main.'sqlite_stat2';}
{SELECT 'DELETE FROM vacuum_db.' || quote(name) || ';' FROM vacuum_db.sqlite_master WHERE
name='sqlite_sequence' }
{SELECT 'INSERT INTO vacuum_db.' || quote(name) || ' SELECT * FROM main.' || quote(name) ||
';' FROM vacuum_db.sqlite_master WHERE name=='sqlite_sequence';}
{INSERT INTO vacuum_db.sqlite_master SELECT type, name, tbl_name, rootpage, sql FROM
main.sqlite_master WHERE type='view' OR type='trigger' OR (type='table' AND rootpage=0)}


As you can see internal system queries are visualized too. It's may be useful for developers but very bad for users becouse it's impossible to log only user-applied queries.

The test script:

#!/usr/bin/tclsh8.5
package require sqlite3
sqlite3 db trace.db

proc trace {args} {
puts "$args"
return SQLITE_OK
}
db trace trace
db eval {ANALYZE}
db eval {VACUUM}
db close


Upd.
In sqlite-users maillist Roger Binns answer by my question about universal mechanism for logging user queries:


> I want to build a universal extension for logging all user queries. I can write
> this with trace function but it's difficult or impossible to split system and
> user queries.

There are two ways that can be done. The first is to require your extension
and its API be used at which you can then filter and log as needed.

The way I would go about it is not using a SQLite extension, but instead
provide a shared library with logging functions. You can ensure the shared
library is loaded before SQLite (use LD_PRELOAD on Linux/Unix and something
similar on Windows) and hook the SQLite APIs. This requires no change to
the application.

For apswtrace I can output all SQL and bindings, rows returned, how long
each query took, most popular queries, queries that took the longest
(individually and ones that have the same text) etc. This doesn't require
any code changes in the Python application.

This unobtrusive mechanism also allows you to get more detailed. For
example you can track how often SQLite is busy (install a busy handler,
transparently call the user one if needed). You can even use it to fake
things. For example historically this mechanism has been used to fake the
current time, but you could fake busy results to help with testing. If you
also intercept read and write system calls you can even correspond I/O
activity to the queries that cause it. You can even capture stack
backtraces when tying them to SQL calls. There are so many possibilities.

(C) Alexey Pechnikov aka MBG, mobigroup.ru