Initial Commit

This commit is contained in:
Riley Schneider
2025-12-03 16:38:10 +01:00
parent c5e26bf594
commit b732d8d4b5
17680 changed files with 5977495 additions and 2 deletions


@@ -0,0 +1,172 @@
=head1 NAME
DBD::SQLite::Cookbook - The DBD::SQLite Cookbook
=head1 DESCRIPTION
This is the L<DBD::SQLite> cookbook.
It is intended to provide a place to keep a variety of functions and
formulas for use in callback APIs in L<DBD::SQLite>.
=head1 AGGREGATE FUNCTIONS
=head2 Variance
This is a simple aggregate function which returns a variance. It is
adapted from an example implementation in pysqlite.
package variance;
sub new { bless [], shift; }
sub step {
my ( $self, $value ) = @_;
push @$self, $value;
}
sub finalize {
my $self = $_[0];
my $n = @$self;
# Variance is NULL unless there is more than one row
return undef unless $n > 1;
my $mu = 0;
foreach my $v ( @$self ) {
$mu += $v;
}
$mu /= $n;
my $sigma = 0;
foreach my $v ( @$self ) {
$sigma += ($v - $mu)**2;
}
$sigma = $sigma / ($n - 1);
return $sigma;
}
# NOTE: If you use an older DBI (< 1.608),
# use $dbh->func(..., "create_aggregate") instead.
$dbh->sqlite_create_aggregate( "variance", 1, 'variance' );
The function can then be used as:
SELECT group_name, variance(score)
FROM results
GROUP BY group_name;
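As a minimal end-to-end sketch, the aggregate can be registered on an in-memory database and queried as described. It assumes DBD::SQLite is installed; the sample rows are made up for illustration, and the guard in finalize returns NULL for any group with fewer than two rows.

```perl
use strict;
use warnings;
use DBI;

package variance;

sub new { bless [], shift; }

sub step {
    my ( $self, $value ) = @_;
    push @$self, $value;
}

sub finalize {
    my $self = $_[0];
    my $n    = @$self;
    return undef unless $n > 1;    # NULL for fewer than two rows
    my $mu = 0;
    $mu += $_ for @$self;
    $mu /= $n;
    my $sigma = 0;
    $sigma += ( $_ - $mu )**2 for @$self;
    return $sigma / ( $n - 1 );
}

package main;

# Assumption: an in-memory database with made-up sample rows.
my $dbh = DBI->connect( "dbi:SQLite:dbname=:memory:", "", "",
    { RaiseError => 1 } );
$dbh->sqlite_create_aggregate( "variance", 1, 'variance' );
$dbh->do("CREATE TABLE results (group_name TEXT, score REAL)");
my $insert = $dbh->prepare("INSERT INTO results VALUES (?, ?)");
$insert->execute(@$_) for ( [ a => 2 ], [ a => 4 ], [ a => 6 ], [ b => 5 ] );

my $rows = $dbh->selectall_arrayref(<<'SQL');
SELECT group_name, variance(score)
FROM results
GROUP BY group_name
ORDER BY group_name
SQL

for my $row (@$rows) {
    my ( $group, $variance ) = @$row;
    printf "%s: %s\n", $group, defined $variance ? $variance : 'NULL';
}
```

Group "a" has three rows and yields a sample variance; group "b" has a single row, so the aggregate returns NULL for it.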
=head2 Variance (Memory Efficient)
A more efficient variance function, optimized for memory usage at the
expense of precision:
package variance2;
sub new { bless {sum => 0, count=>0, hash=> {} }, shift; }
sub step {
my ( $self, $value ) = @_;
my $hash = $self->{hash};
# by truncating and hashing, we can consume many more data points
$value = int($value); # change depending on need for precision
# use sprintf for arbitrary fp precision
if (exists $hash->{$value}) {
$hash->{$value}++;
} else {
$hash->{$value} = 1;
}
$self->{sum} += $value;
$self->{count}++;
}
sub finalize {
my $self = $_[0];
# Variance is NULL unless there is more than one row
return undef unless $self->{count} > 1;
# calculate avg
my $mu = $self->{sum} / $self->{count};
my $sigma = 0;
while (my ($h, $v) = each %{$self->{hash}}) {
$sigma += (($h - $mu)**2) * $v;
}
$sigma = $sigma / ($self->{count} - 1);
return $sigma;
}
The function can then be used as:
SELECT group_name, variance2(score)
FROM results
GROUP BY group_name;
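The precision trade-off can be seen without a database. The sketch below is a standalone variant of the histogram approach in which the bucket key is built with sprintf, as the code comment suggests. The names bucket_key and histogram_variance are hypothetical, and note that sprintf rounds whereas int() truncates.

```perl
use strict;
use warnings;

# Hypothetical helper: round a value to $digits decimals to form a bucket key.
sub bucket_key {
    my ( $value, $digits ) = @_;
    return sprintf( "%.${digits}f", $value ) + 0;
}

# Sample variance computed from a histogram of bucketed values.
sub histogram_variance {
    my ( $digits, @values ) = @_;
    my %hist;
    my ( $sum, $count ) = ( 0, 0 );
    for my $value (@values) {
        my $key = bucket_key( $value, $digits );
        $hist{$key}++;
        $sum += $key;
        $count++;
    }
    return undef unless $count > 1;
    my $mu    = $sum / $count;
    my $sigma = 0;
    while ( my ( $key, $n ) = each %hist ) {
        $sigma += ( ( $key - $mu )**2 ) * $n;
    }
    return $sigma / ( $count - 1 );
}

my @scores = ( 1.25, 1.25, 2.75, 2.75 );
my $coarse = histogram_variance( 0, @scores );    # buckets 1 and 3
my $fine   = histogram_variance( 2, @scores );    # buckets 1.25 and 2.75
printf "coarse=%.4f fine=%.4f\n", $coarse, $fine;
```

Memory use grows with the number of distinct bucket keys rather than the number of rows, so coarser keys use less memory but move the result further from the exact variance.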
=head2 Variance (Highly Scalable)
A third variance implementation, designed for arbitrarily large data sets:
package variance3;
sub new { bless {mu=>0, count=>0, S=>0}, shift; }
sub step {
my ( $self, $value ) = @_;
$self->{count}++;
my $delta = $value - $self->{mu};
$self->{mu} += $delta/$self->{count};
$self->{S} += $delta*($value - $self->{mu});
}
sub finalize {
my $self = $_[0];
# Variance is NULL unless there is more than one row
return undef unless $self->{count} > 1;
return $self->{S} / ($self->{count} - 1);
}
The function can then be used as:
SELECT group_name, variance3(score)
FROM results
GROUP BY group_name;
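The running update in step is the single-pass scheme commonly attributed to Welford: it keeps only a count, a running mean and a running sum of squared deviations. As a sanity check, the sketch below drives the same logic by hand and compares it with a plain two-pass computation. The package name WelfordVariance, the helper two_pass_variance and the sample data are hypothetical; skipping undefined values is an assumption of this sketch.

```perl
use strict;
use warnings;

package WelfordVariance;    # hypothetical name, same logic as variance3

sub new { bless { mu => 0, count => 0, S => 0 }, shift }

sub step {
    my ( $self, $value ) = @_;
    return unless defined $value;    # assumption: ignore NULLs
    $self->{count}++;
    my $delta = $value - $self->{mu};
    $self->{mu} += $delta / $self->{count};
    $self->{S}  += $delta * ( $value - $self->{mu} );
}

sub finalize {
    my $self = $_[0];
    return undef unless $self->{count} > 1;    # guard against division by zero
    return $self->{S} / ( $self->{count} - 1 );
}

package main;

# Reference implementation: mean first, then squared deviations.
sub two_pass_variance {
    my @v = @_;
    return undef unless @v > 1;
    my $mu = 0;
    $mu += $_ for @v;
    $mu /= @v;
    my $ss = 0;
    $ss += ( $_ - $mu )**2 for @v;
    return $ss / ( @v - 1 );
}

my @scores = ( 2, 4, 4, 4, 5, 5, 7, 9 );
my $agg    = WelfordVariance->new;
$agg->step($_) for @scores;
my $streaming = $agg->finalize;
my $reference = two_pass_variance(@scores);
printf "streaming=%.6f two-pass=%.6f\n", $streaming, $reference;
```

Because the state is three scalars regardless of input size, memory use stays constant however many rows are aggregated.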
=head1 SUPPORT
Bugs should be reported via the CPAN bug tracker at
L<http://rt.cpan.org/NoAuth/ReportBug.html?Queue=DBD-SQLite>
=head1 TO DO
=over
=item *
Add more and varied cookbook recipes, until we have enough to
turn them into a separate CPAN distribution.
=item *
Create a series of test scripts that validate the cookbook recipes.
=back
=head1 AUTHOR
Adam Kennedy E<lt>adamk@cpan.orgE<gt>
=head1 COPYRIGHT
Copyright 2009 - 2012 Adam Kennedy.
This program is free software; you can redistribute
it and/or modify it under the same terms as Perl itself.
The full text of the license can be found in the
LICENSE file included with this module.


@@ -0,0 +1,514 @@
=head1 NAME
DBD::SQLite::Fulltext_search - Using fulltext searches with DBD::SQLite
=head1 DESCRIPTION
=head2 Introduction
SQLite is bundled with an extension module called "FTS" for full-text
indexing. Tables with this feature enabled can be efficiently queried
to find rows that contain one or more instances of some specified
words (also called "tokens"), in any column, even if the table contains many
large documents.
The first full-text search modules for SQLite were called C<FTS1> and C<FTS2>
and are now obsolete. The latest version is C<FTS4>, but it shares many
features with the former module C<FTS3>, which is why parts of the
API and parts of the documentation still refer to C<FTS3>; from a client
point of view, both can be considered largely equivalent.
Detailed documentation can be found
at L<http://www.sqlite.org/fts3.html>.
=head2 Short example
Here is a very short example of using FTS :
$dbh->do(<<"") or die $dbh->errstr;
CREATE VIRTUAL TABLE fts_example USING fts4(content)

my $sth = $dbh->prepare("INSERT INTO fts_example(content) VALUES (?)");
$sth->execute($_) foreach @docs_to_insert;
my $results = $dbh->selectall_arrayref(<<"");
SELECT docid, snippet(fts_example) FROM fts_example WHERE content MATCH 'foo'

The key points in this example are :
=over
=item *
The syntax for creating FTS tables is
CREATE VIRTUAL TABLE <table_name> USING fts4(<columns>)
where C<< <columns> >> is a list of column names. Columns may be
typed, but the type information is ignored. If no columns
are specified, the default is a single column named C<content>.
In addition, FTS tables have an implicit column called C<docid>
(or also C<rowid>) for numbering the stored documents.
=item *
Statements for inserting, updating or deleting records
use the same syntax as for regular SQLite tables.
=item *
Full-text searches are specified with the C<MATCH> operator, and an
operand which may be a single word, a word prefix ending with '*', a
list of words, a "phrase query" in double quotes, or a boolean combination
of the above.
=item *
The builtin function C<snippet(...)> builds a formatted excerpt of the
document text, where the words pertaining to the query are highlighted.
=back
There are many more details to building and searching
FTS tables, so we strongly invite you to read
the full documentation at L<http://www.sqlite.org/fts3.html>.
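For readers who want something to run, here is a self-contained sketch of the short example. It assumes DBD::SQLite is installed with FTS4 available (the bundled SQLite normally has it) and uses an in-memory database; the sample documents are invented for illustration.

```perl
use strict;
use warnings;
use DBI;

# Assumption: in-memory database, so nothing is written to disk.
my $dbh = DBI->connect( "dbi:SQLite:dbname=:memory:", "", "",
    { RaiseError => 1 } );

$dbh->do("CREATE VIRTUAL TABLE fts_example USING fts4(content)");

# Made-up documents; two of them contain the term "foo".
my @docs_to_insert = (
    'the quick brown fox',
    'foo bar baz',
    'nothing to see here',
    'another foo document',
);
my $sth = $dbh->prepare("INSERT INTO fts_example(content) VALUES (?)");
$sth->execute($_) foreach @docs_to_insert;

# The MATCH operand is passed as a bind value.
my $results = $dbh->selectall_arrayref(
    "SELECT docid, snippet(fts_example) FROM fts_example "
      . "WHERE content MATCH ? ORDER BY docid",
    undef, 'foo'
);
print "$_->[0]: $_->[1]\n" for @$results;
```

Each result row holds the docid assigned at insertion time and the excerpt built by snippet(), with the matching term highlighted.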
=head1 QUERY SYNTAX
Here are some explanations of FTS queries, borrowed from
the SQLite documentation.
=head2 Token or token prefix queries
An FTS table may be queried for all documents that contain a specified
term, or for all documents that contain a term with a specified
prefix. The query expression for a specific term is simply the term
itself. The query expression used to search for a term prefix is the
prefix itself with a '*' character appended to it. For example:
-- Virtual table declaration
CREATE VIRTUAL TABLE docs USING fts3(title, body);
-- Query for all documents containing the term "linux":
SELECT * FROM docs WHERE docs MATCH 'linux';
-- Query for all documents containing a term with the prefix "lin".
SELECT * FROM docs WHERE docs MATCH 'lin*';
If a search token (on the right-hand side of the MATCH operator)
begins with "^" then that token must be the first in its field of
the document : so for example C<^lin*> matches
'linux kernel changes ...' but does not match 'new linux implementation'.
=head2 Column specifications
Normally, a token or token prefix query is matched against the FTS
table column specified as the right-hand side of the MATCH
operator. Or, if the special column with the same name as the FTS
table itself is specified, against all columns. This may be overridden
by specifying a column-name followed by a ":" character before a basic
term query. There may be space between the ":" and the term to query
for, but not between the column-name and the ":" character. For
example:
-- Query the database for documents for which the term "linux" appears in
-- the document title, and the term "problems" appears in either the title
-- or body of the document.
SELECT * FROM docs WHERE docs MATCH 'title:linux problems';
-- Query the database for documents for which the term "linux" appears in
-- the document title, and the term "driver" appears in the body of the document
-- ("driver" may also appear in the title, but this alone will not satisfy the
-- query criteria).
SELECT * FROM docs WHERE body MATCH 'title:linux driver';
=head2 Phrase queries
A phrase query is a query that retrieves all documents that contain a
nominated set of terms or term prefixes in a specified order with no
intervening tokens. Phrase queries are specified by enclosing a space
separated sequence of terms or term prefixes in double quotes ("). For
example:
-- Query for all documents that contain the phrase "linux applications".
SELECT * FROM docs WHERE docs MATCH '"linux applications"';
-- Query for all documents that contain a phrase that matches "lin* app*".
-- As well as "linux applications", this will match common phrases such
-- as "linoleum appliances" or "link apprentice".
SELECT * FROM docs WHERE docs MATCH '"lin* app*"';
=head2 NEAR queries
A NEAR query is a query that returns documents that contain two or
more nominated terms or phrases within a specified proximity of each
other (by default with 10 or fewer intervening terms). A NEAR query is
specified by putting the keyword "NEAR" between two phrase, term or
prefix queries. To specify a proximity other than the default, an
operator of the form "NEAR/<N>" may be used, where <N> is the maximum
number of intervening terms allowed. For example:
-- Virtual table declaration.
CREATE VIRTUAL TABLE docs USING fts4();
-- Virtual table data.
INSERT INTO docs VALUES('SQLite is an ACID compliant embedded relational database management system');
-- Search for a document that contains the terms "sqlite" and "database" with
-- not more than 10 intervening terms. This matches the only document in
-- table docs (since there are only six terms between "SQLite" and "database"
-- in the document).
SELECT * FROM docs WHERE docs MATCH 'sqlite NEAR database';
-- Search for a document that contains the terms "sqlite" and "database" with
-- not more than 6 intervening terms. This also matches the only document in
-- table docs. Note that the order in which the terms appear in the document
-- does not have to be the same as the order in which they appear in the query.
SELECT * FROM docs WHERE docs MATCH 'database NEAR/6 sqlite';
-- Search for a document that contains the terms "sqlite" and "database" with
-- not more than 5 intervening terms. This query matches no documents.
SELECT * FROM docs WHERE docs MATCH 'database NEAR/5 sqlite';
-- Search for a document that contains the phrase "ACID compliant" and the term
-- "database" with not more than 2 terms separating the two. This matches the
-- document stored in table docs.
SELECT * FROM docs WHERE docs MATCH 'database NEAR/2 "ACID compliant"';
-- Search for a document that contains the phrase "ACID compliant" and the term
-- "sqlite" with not more than 2 terms separating the two. This also matches
-- the only document stored in table docs.
SELECT * FROM docs WHERE docs MATCH '"ACID compliant" NEAR/2 sqlite';
More than one NEAR operator may appear in a single query. In this case
each pair of terms or phrases separated by a NEAR operator must appear
within the specified proximity of each other in the document. Using
the same table and data as in the block of examples above:
-- The following query selects documents that contain an instance of the term
-- "sqlite" separated by two or fewer terms from an instance of the term "acid",
-- which is in turn separated by two or fewer terms from an instance of the term
-- "relational".
SELECT * FROM docs WHERE docs MATCH 'sqlite NEAR/2 acid NEAR/2 relational';
-- This query matches no documents. There is an instance of the term "sqlite" with
-- sufficient proximity to an instance of "acid" but it is not sufficiently close
-- to an instance of the term "relational".
SELECT * FROM docs WHERE docs MATCH 'acid NEAR/2 sqlite NEAR/2 relational';
Phrase and NEAR queries may not span multiple columns within a row.
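The "intervening terms" arithmetic used in the NEAR examples can be checked with a few lines of plain Perl. This is only an illustration of the counting rule for single terms, not of how SQLite implements NEAR; the function names are hypothetical and the tokenization (lowercased runs of word characters) is a simplifying assumption.

```perl
use strict;
use warnings;

# Smallest number of terms lying between any occurrence of $term_a and
# any occurrence of $term_b, or undef if either term is absent.
sub min_intervening_terms {
    my ( $text, $term_a, $term_b ) = @_;
    my @terms = map { lc($_) } ( $text =~ /\w+/g );
    my ( @pos_a, @pos_b );
    for my $i ( 0 .. $#terms ) {
        push @pos_a, $i if $terms[$i] eq lc($term_a);
        push @pos_b, $i if $terms[$i] eq lc($term_b);
    }
    my $best;
    for my $i (@pos_a) {
        for my $j (@pos_b) {
            next if $i == $j;
            my $gap = abs( $i - $j ) - 1;    # order does not matter
            $best = $gap if !defined $best || $gap < $best;
        }
    }
    return $best;
}

# True if the two terms are within $max_gap intervening terms (default 10).
sub near_match {
    my ( $text, $term_a, $term_b, $max_gap ) = @_;
    $max_gap = 10 unless defined $max_gap;
    my $gap = min_intervening_terms( $text, $term_a, $term_b );
    return ( defined $gap && $gap <= $max_gap ) ? 1 : 0;
}

my $doc = 'SQLite is an ACID compliant embedded relational database '
        . 'management system';
my $gap = min_intervening_terms( $doc, 'sqlite', 'database' );
printf "gap=%d NEAR/6=%d NEAR/5=%d\n", $gap,
    near_match( $doc, 'database', 'sqlite', 6 ),
    near_match( $doc, 'database', 'sqlite', 5 );
```

Six terms separate "SQLite" from "database" in the sample document, which is why the NEAR/6 query matches and the NEAR/5 query does not.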
=head2 Set operations
The three basic query types described above may be used to query the
full-text index for the set of documents that match the specified
criteria. Using the FTS query expression language it is possible to
perform various set operations on the results of basic queries. There
are currently three supported operations:
=over
=item *
The AND operator determines the intersection of two sets of documents.
=item *
The OR operator calculates the union of two sets of documents.
=item *
The NOT operator may be used to compute the relative complement of one
set of documents with respect to another.
=back
The AND, OR and NOT binary set operators must be entered using capital
letters; otherwise, they are interpreted as basic term queries instead
of set operators. Each of the two operands to an operator may be a
basic FTS query, or the result of another AND, OR or NOT set
operation. Parentheses may be used to control precedence and grouping.
The AND operator is implicit for adjacent basic queries without any
explicit operator. For example, the query expression "implicit
operator" is a more succinct version of "implicit AND operator".
Boolean operations as just described correspond to the so-called
"enhanced query syntax" of sqlite; this is the version compiled
with C<DBD::SQLite>, starting from version 1.31.
A former version, called the "standard query syntax", used to
support tokens prefixed with '+' or '-' signs (for token inclusion
or exclusion); if your application needs to support this old
syntax, use L<DBD::SQLite::FTS3Transitional> (published
in a separate distribution) for doing the conversion.
=head1 TOKENIZERS
=head2 Concept
The behaviour of full-text indexes strongly depends on how
documents are split into I<tokens>; therefore FTS table
declarations can explicitly specify how to perform
tokenization:
CREATE ... USING fts4(<columns>, tokenize=<tokenizer>)
where C<< <tokenizer> >> is a sequence of space-separated
words that triggers a specific tokenizer. Tokenizers can
be SQLite builtins, written in C code, or Perl tokenizers.
Both kinds are explained below.
=head2 SQLite builtin tokenizers
SQLite comes with some builtin tokenizers (see
L<http://www.sqlite.org/fts3.html#tokenizer>) :
=over
=item simple
Under the I<simple> tokenizer, a term is a contiguous sequence of
eligible characters, where eligible characters are all alphanumeric
characters, the "_" character, and all characters with Unicode codepoints
greater than or equal to 128. All other characters are discarded when
splitting a document into terms. They serve only to separate adjacent
terms.
All uppercase characters within the ASCII range (Unicode codepoints less
than 128), are transformed to their lowercase equivalents as part of
the tokenization process. Thus, full-text queries are case-insensitive
when using the simple tokenizer.
=item porter
The I<porter> tokenizer uses the same rules to separate the input
document into terms, but as well as folding all terms to lower case it
uses the Porter Stemming algorithm to reduce related English language
words to a common root.
=item icu
The I<icu> tokenizer uses the ICU library to decide how to
identify word characters in different languages; however, this
requires SQLite to be compiled with the C<SQLITE_ENABLE_ICU>
pre-processor symbol defined. So, to use this tokenizer, you need to
edit F<Makefile.PL> to add this flag in C<@CC_DEFINE>, and then
recompile C<DBD::SQLite>; of course, the prerequisite is to have
an ICU library available on your system.
=item unicode61
The I<unicode61> tokenizer works very much like "simple" except that it
does full unicode case folding according to rules in Unicode Version
6.1 and it recognizes unicode space and punctuation characters and
uses those to separate tokens. By contrast, the simple tokenizer only
does case folding of ASCII characters and only recognizes ASCII space
and punctuation characters as token separators.
By default, "unicode61" also removes all diacritics from Latin script
characters. This behaviour can be overridden by adding the tokenizer
argument C<"remove_diacritics=0">. For example:
-- Create tables that remove diacritics from Latin script characters
-- as part of tokenization.
CREATE VIRTUAL TABLE txt1 USING fts4(tokenize=unicode61);
CREATE VIRTUAL TABLE txt2 USING fts4(tokenize=unicode61 "remove_diacritics=1");
-- Create a table that does not remove diacritics from Latin script
-- characters as part of tokenization.
CREATE VIRTUAL TABLE txt3 USING fts4(tokenize=unicode61 "remove_diacritics=0");
Additional options can customize the set of codepoints that unicode61
treats as separator characters or as token characters -- see the
documentation in L<http://www.sqlite.org/fts3.html#unicode61>.
=back
If a more complex tokenizing algorithm is required, for example to
implement stemming, discard punctuation, or to recognize compound words,
use the perl tokenizer to implement your own logic, as explained below.
=head2 Perl tokenizers
=head3 Declaring a perl tokenizer
In addition to the builtin SQLite tokenizers, C<DBD::SQLite>
implements a I<perl> tokenizer, that can hook to any tokenizing
algorithm written in Perl. This is specified as follows :
CREATE ... USING fts4(<columns>, tokenize=perl '<perl_function>')
where C<< <perl_function> >> is a fully qualified Perl function name
(i.e. prefixed by the name of the package in which that function is
declared). So for example if the function is C<my_func> in the main
program, write
CREATE ... USING fts4(<columns>, tokenize=perl 'main::my_func')
=head3 Writing a perl tokenizer by hand
That function should return a code reference that takes a string as
single argument, and returns an iterator (another function), which
returns a tuple C<< ($term, $len, $start, $end, $index) >> for each
term. Here is a simple example that tokenizes on words according to
the current perl locale :
sub locale_tokenizer {
return sub {
my $string = shift;
use locale;
my $regex = qr/\w+/;
my $term_index = 0;
return sub { # closure
$string =~ /$regex/g or return; # either match, or no more token
my ($start, $end) = ($-[0], $+[0]);
my $len = $end-$start;
my $term = substr($string, $start, $len);
return ($term, $len, $start, $end, $term_index++);
}
};
}
There must be three levels of subs, in a kind of "Russian dolls" structure,
because :
=over
=item *
the external, named sub is called whenever accessing a FTS table
with that tokenizer
=item *
the inner, anonymous sub is called whenever a new string
needs to be tokenized (either for inserting new text into the table,
or for analyzing a query).
=item *
the innermost, anonymous sub is called repeatedly for retrieving
all terms within that string.
=back
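The three levels can be exercised by hand, outside of SQLite, to see what each one returns. The sketch below uses a hypothetical word_tokenizer with the same shape as the example above, but without "use locale" so that the result does not depend on the environment.

```perl
use strict;
use warnings;

# Level 1: named sub, returns the per-string tokenizer.
sub word_tokenizer {    # hypothetical name
    # Level 2: called once per string to tokenize, returns an iterator.
    return sub {
        my $string     = shift;
        my $term_index = 0;
        # Level 3: called repeatedly, returns one term per call.
        return sub {
            $string =~ /\w+/g or return;    # empty list when exhausted
            my ( $start, $end ) = ( $-[0], $+[0] );
            my $len  = $end - $start;
            my $term = substr( $string, $start, $len );
            return ( $term, $len, $start, $end, $term_index++ );
        };
    };
}

my $make_iterator = word_tokenizer();
my $next_token    = $make_iterator->('new linux kernel');

my @tokens;
while ( my @token = $next_token->() ) {
    push @tokens, [@token];
    print join( ', ', @token ), "\n";
}
```

Each printed line is one tuple of term, length, start offset, end offset and term index, in the order the iterator produced them.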
=head3 Using Search::Tokenizer
Instead of writing tokenizers by hand, you can grab one of those
already implemented in the L<Search::Tokenizer> module. For example,
if you want to ignore differences between accented characters, you can
write :
use Search::Tokenizer;
$dbh->do(<<"") or die $dbh->errstr;
CREATE ... USING fts4(<columns>,
tokenize=perl 'Search::Tokenizer::unaccent')

Alternatively, you can use L<Search::Tokenizer/new> to build
your own tokenizer. Here is an example that treats compound
words (words with an internal dash, dot or slash) as single tokens :
sub my_tokenizer {
return Search::Tokenizer->new(
regex => qr{\p{Word}+(?:[-./]\p{Word}+)*},
);
}
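The effect of that regex can be previewed in plain Perl, without Search::Tokenizer or a database. This sketch only applies the pattern to an invented sample string; it does not reproduce anything else the module may do to the terms.

```perl
use strict;
use warnings;

# Same pattern as above: word characters, optionally joined by "-", "." or "/".
my $compound_word = qr{\p{Word}+(?:[-./]\p{Word}+)*};

my $text  = 'state-of-the-art parser v1.2 handles and/or queries';
my @terms = ( $text =~ /$compound_word/g );

print join( ' | ', @terms ), "\n";
```

Here "state-of-the-art", "v1.2" and "and/or" each come out as a single term, where a plain word-character split would have broken them apart.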
=head1 Fts4aux - Direct Access to the Full-Text Index
The content of a full-text index can be accessed through the
virtual table module "fts4aux". For example, assuming that
our database contains a full-text indexed table named "ft",
we can declare :
CREATE VIRTUAL TABLE ft_terms USING fts4aux(ft)
and then query the C<ft_terms> table to access the
list of terms, their frequency, etc.
Examples are documented in
L<http://www.sqlite.org/fts3.html#fts4aux>.
=head1 How to save database space
By default, FTS stores a complete copy of the indexed documents,
together with the fulltext index. On a large collection of documents,
this can consume quite a lot of disk space. However, FTS has some
options for compressing the documents, or even for not storing them at
all -- see L<http://www.sqlite.org/fts3.html#fts4_options>.
In particular, the option for I<contentless FTS tables> only stores
the fulltext index, without the original document content. This is
specified as C<content="">, as in the following example :
CREATE VIRTUAL TABLE t1 USING fts4(content="", a, b)
Data can be inserted into such an FTS4 table using INSERT
statements. However, unlike ordinary FTS4 tables, the user must supply
an explicit integer docid value. For example:
-- This statement is OK:
INSERT INTO t1(docid, a, b) VALUES(1, 'a b c', 'd e f');
-- This statement causes an error, as no docid value has been provided:
INSERT INTO t1(a, b) VALUES('j k l', 'm n o');
Of course your application will need an algorithm for finding
the external resource corresponding to any I<docid> stored within
SQLite.
When using placeholders, the docid must be explicitly typed to
INTEGER, because this is a "hidden column" for which sqlite
is not able to automatically infer the proper type. So the following
doesn't work :
my $sth = $dbh->prepare("INSERT INTO t1(docid, a, b) VALUES(?, ?, ?)");
$sth->execute(2, 'aa', 'bb'); # constraint error
but it works with an explicit cast :
my $sql = "INSERT INTO t1(docid, a, b) VALUES(CAST(? AS INTEGER), ?, ?)";
my $sth = $dbh->prepare($sql);
$sth->execute(2, 'aa', 'bb');
or with an explicitly typed L<DBI/bind_param> :
use DBI qw/SQL_INTEGER/;
my $sql = "INSERT INTO t1(docid, a, b) VALUES(?, ?, ?)";
my $sth = $dbh->prepare($sql);
$sth->bind_param(1, 2, SQL_INTEGER);
$sth->bind_param(2, "aa");
$sth->bind_param(3, "bb");
$sth->execute();
It is not possible to UPDATE or DELETE a row stored in a contentless
FTS4 table. Attempting to do so is an error.
Contentless FTS4 tables also support SELECT statements. However, it is
an error to attempt to retrieve the value of any table column other
than the docid column. The auxiliary function C<matchinfo()> may be
used, but C<snippet()> and C<offsets()> may not, so if such
functionality is needed, it has to be directly programmed within the
Perl application.
=head1 AUTHOR
Laurent Dami E<lt>dami@cpan.orgE<gt>
=head1 COPYRIGHT
Copyright 2014 Laurent Dami.
Some parts borrowed from the L<http://sqlite.org> documentation, copyright 2014.
This documentation is in the public domain; you can redistribute
it and/or modify it under the same terms as Perl itself.


@@ -0,0 +1,288 @@
package DBD::SQLite::GetInfo;
use 5.006;
use strict;
use warnings;
use DBD::SQLite;
# SQL_DRIVER_VER should be formatted as dd.dd.dddd
my $dbdversion = $DBD::SQLite::VERSION;
$dbdversion .= '_00' if $dbdversion =~ /^\d+\.\d+$/;
my $sql_driver_ver = sprintf("%02d.%02d.%04d", split(/[\._]/, $dbdversion));
# Full list of keys and their return types: DBI::Const::GetInfo::ODBC
# Most of the key definitions can be gleaned from:
#
# https://docs.microsoft.com/en-us/sql/odbc/reference/syntax/sqlgetinfo-function
our %info = (
20 => 'N', # SQL_ACCESSIBLE_PROCEDURES - No stored procedures to access
19 => 'Y', # SQL_ACCESSIBLE_TABLES - SELECT access to all tables in table_info
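0 => 0, # SQL_ACTIVE_CONNECTIONS - No maximum connection limit (same key as SQL_MAX_DRIVER_CONNECTIONS below; the later entry wins)
116 => 0, # SQL_ACTIVE_ENVIRONMENTS - No "active environment" limit
1 => 0, # SQL_ACTIVE_STATEMENTS - No concurrent activity limit (same key as SQL_MAX_CONCURRENT_ACTIVITIES below; the later entry wins)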
169 => 127, # SQL_AGGREGATE_FUNCTIONS - Supports all SQL-92 aggregate functions
117 => 0, # SQL_ALTER_DOMAIN - No ALTER DOMAIN support
86 => 1, # SQL_ALTER_TABLE - Only supports ADD COLUMN and table rename (not listed in enum) in ALTER TABLE statements
10021 => 0, # SQL_ASYNC_MODE - No asynchronous support (in vanilla SQLite)
120 => 0, # SQL_BATCH_ROW_COUNT - No special row counting access
121 => 0, # SQL_BATCH_SUPPORT - No batches
82 => 0, # SQL_BOOKMARK_PERSISTENCE - No bookmark support
114 => 1, # SQL_CATALOG_LOCATION - Database comes first in identifiers
10003 => 'Y', # SQL_CATALOG_NAME - Supports database names
41 => '.', # SQL_CATALOG_NAME_SEPARATOR - Separated by dot
42 => 'database', # SQL_CATALOG_TERM - SQLite calls catalogs databases
92 => 1+4+8, # SQL_CATALOG_USAGE - Supported in calls to DML & table/index definition (no procedures or permissions)
10004 => 'UTF-8', # SQL_COLLATION_SEQ - SQLite 3 uses UTF-8 by default
87 => 'Y', # SQL_COLUMN_ALIAS - Supports column aliases
22 => 0, # SQL_CONCAT_NULL_BEHAVIOR - 'a'||NULL = NULL
# SQLite has no CONVERT function, only CAST. However, it converts to every "affinity" it supports.
#
# The only SQL_CVT_* types it doesn't support are date/time types, as it has no concept of
# date/time values once inserted. These are only convertible to text-like types. GUIDs are in
# the same boat, having no real means of switching to a numeric format.
#
# text/binary types = 31723265
# numeric types = 28926
# date/time types = 1802240
# total = 33554431
48 => 1, # SQL_CONVERT_FUNCTIONS - CAST only
53 => 31723265+28926, # SQL_CONVERT_BIGINT
54 => 31723265+28926, # SQL_CONVERT_BINARY
55 => 31723265+28926, # SQL_CONVERT_BIT
56 => 33554431, # SQL_CONVERT_CHAR
57 => 31723265+1802240, # SQL_CONVERT_DATE
58 => 31723265+28926, # SQL_CONVERT_DECIMAL
59 => 31723265+28926, # SQL_CONVERT_DOUBLE
60 => 31723265+28926, # SQL_CONVERT_FLOAT
173 => 31723265, # SQL_CONVERT_GUID
61 => 31723265+28926, # SQL_CONVERT_INTEGER
123 => 31723265+1802240, # SQL_CONVERT_INTERVAL_DAY_TIME
124 => 31723265+1802240, # SQL_CONVERT_INTERVAL_YEAR_MONTH
71 => 31723265+28926, # SQL_CONVERT_LONGVARBINARY
62 => 31723265+28926, # SQL_CONVERT_LONGVARCHAR
63 => 31723265+28926, # SQL_CONVERT_NUMERIC
64 => 31723265+28926, # SQL_CONVERT_REAL
65 => 31723265+28926, # SQL_CONVERT_SMALLINT
66 => 31723265+1802240, # SQL_CONVERT_TIME
67 => 31723265+1802240, # SQL_CONVERT_TIMESTAMP
68 => 31723265+28926, # SQL_CONVERT_TINYINT
69 => 33554431, # SQL_CONVERT_VARBINARY
70 => 33554431, # SQL_CONVERT_VARCHAR
122 => 33554431, # SQL_CONVERT_WCHAR
125 => 33554431, # SQL_CONVERT_WLONGVARCHAR
126 => 33554431, # SQL_CONVERT_WVARCHAR
74 => 1, # SQL_CORRELATION_NAME - Table aliases are supported, but must be named differently
127 => 0, # SQL_CREATE_ASSERTION - No CREATE ASSERTION support
128 => 0, # SQL_CREATE_CHARACTER_SET - No CREATE CHARACTER SET support
129 => 0, # SQL_CREATE_COLLATION - No CREATE COLLATION support
130 => 0, # SQL_CREATE_DOMAIN - No CREATE DOMAIN support
131 => 0, # SQL_CREATE_SCHEMA - No CREATE SCHEMA support
132 => 16383-2-8-4096, # SQL_CREATE_TABLE - Most of the functionality of CREATE TABLE support
133 => 0, # SQL_CREATE_TRANSLATION - No CREATE TRANSLATION support
134 => 1, # SQL_CREATE_VIEW - CREATE VIEW, no WITH CHECK OPTION support
23 => 2, # SQL_CURSOR_COMMIT_BEHAVIOR - Cursors are preserved
24 => 2, # SQL_CURSOR_ROLLBACK_BEHAVIOR - Cursors are preserved
10001 => 0, # SQL_CURSOR_SENSITIVITY - Cursors have a concept of snapshots, though this depends on the transaction type
2 => \&sql_data_source_name, # SQL_DATA_SOURCE_NAME - The DSN
25 => \&sql_data_source_read_only, # SQL_DATA_SOURCE_READ_ONLY - Might have a SQLITE_OPEN_READONLY flag
16 => \&sql_database_name, # SQL_DATABASE_NAME - Self-explanatory
119 => 0, # SQL_DATETIME_LITERALS - No support for SQL-92's super weird date/time literal format (ie: {d '2999-12-12'})
17 => 'SQLite', # SQL_DBMS_NAME - You are here
18 => \&sql_dbms_ver, # SQL_DBMS_VER - This driver version
170 => 1+2, # SQL_DDL_INDEX - Supports CREATE/DROP INDEX
26 => 8, # SQL_DEFAULT_TXN_ISOLATION - Default is SERIALIZABLE (See "PRAGMA read_uncommitted")
10002 => 'N', # SQL_DESCRIBE_PARAMETER - No DESCRIBE INPUT support
# XXX: MySQL/Oracle fills in HDBC and HENV, but information on what should actually go there is
# hard to acquire.
# 171 => undef, # SQL_DM_VER - Not a Driver Manager
# 3 => undef, # SQL_DRIVER_HDBC - Not a Driver Manager
# 135 => undef, # SQL_DRIVER_HDESC - Not a Driver Manager
# 4 => undef, # SQL_DRIVER_HENV - Not a Driver Manager
# 76 => undef, # SQL_DRIVER_HLIB - Not a Driver Manager
# 5 => undef, # SQL_DRIVER_HSTMT - Not a Driver Manager
6 => 'libsqlite3odbc.so', # SQL_DRIVER_NAME - SQLite3 ODBC driver (if installed)
77 => '03.00', # SQL_DRIVER_ODBC_VER - Same as sqlite3odbc.c
7 => $sql_driver_ver, # SQL_DRIVER_VER - Self-explanatory
136 => 0, # SQL_DROP_ASSERTION - No DROP ASSERTION support
137 => 0, # SQL_DROP_CHARACTER_SET - No DROP CHARACTER SET support
138 => 0, # SQL_DROP_COLLATION - No DROP COLLATION support
139 => 0, # SQL_DROP_DOMAIN - No DROP DOMAIN support
140 => 0, # SQL_DROP_SCHEMA - No DROP SCHEMA support
141 => 1, # SQL_DROP_TABLE - DROP TABLE support, no RESTRICT/CASCADE
142 => 0, # SQL_DROP_TRANSLATION - No DROP TRANSLATION support
143 => 1, # SQL_DROP_VIEW - DROP VIEW support, no RESTRICT/CASCADE
# NOTE: This is based purely on what sqlite3odbc supports.
#
# Static CA1: NEXT, ABSOLUTE, RELATIVE, BOOKMARK, LOCK_NO_CHANGE, POSITION, UPDATE, DELETE, REFRESH,
# BULK_ADD, BULK_UPDATE_BY_BOOKMARK, BULK_DELETE_BY_BOOKMARK = 466511
#
# Forward-only CA1: NEXT, BOOKMARK
#
# CA2: READ_ONLY_CONCURRENCY, LOCK_CONCURRENCY
144 => 0, # SQL_DYNAMIC_CURSOR_ATTRIBUTES1 - No dynamic cursor support
145 => 0, # SQL_DYNAMIC_CURSOR_ATTRIBUTES2 - No dynamic cursor support
146 => 1+8, # SQL_FORWARD_ONLY_CURSOR_ATTRIBUTES1
147 => 1+2, # SQL_FORWARD_ONLY_CURSOR_ATTRIBUTES2
150 => 0, # SQL_KEYSET_CURSOR_ATTRIBUTES1 - No keyset cursor support
151 => 0, # SQL_KEYSET_CURSOR_ATTRIBUTES2 - No keyset cursor support
167 => 466511, # SQL_STATIC_CURSOR_ATTRIBUTES1
168 => 1+2, # SQL_STATIC_CURSOR_ATTRIBUTES2
27 => 'Y', # SQL_EXPRESSIONS_IN_ORDERBY - ORDER BY allows expressions
8 => 63, # SQL_FETCH_DIRECTION - Cursors support next, first, last, prior, absolute, relative
84 => 2, # SQL_FILE_USAGE - Single-tier driver, treats files as databases
81 => 1+2+8, # SQL_GETDATA_EXTENSIONS - Same as sqlite3odbc.c
88 => 3, # SQL_GROUP_BY - SELECT columns are independent of GROUP BY columns
28 => 4, # SQL_IDENTIFIER_CASE - Not case-sensitive, stored in mixed case
29 => '"', # SQL_IDENTIFIER_QUOTE_CHAR - Uses " for identifiers, though supports [] and ` as well
148 => 0, # SQL_INDEX_KEYWORDS - No support for ASC/DESC/ALL for CREATE INDEX
149 => 0, # SQL_INFO_SCHEMA_VIEWS - No support for INFORMATION_SCHEMA
172 => 1+2, # SQL_INSERT_STATEMENT - INSERT...VALUES & INSERT...SELECT
73 => 'N', # SQL_INTEGRITY - No support for "Integrity Enhancement Facility"
89 => \&sql_keywords, # SQL_KEYWORDS - List of non-ODBC keywords
113 => 'Y', # SQL_LIKE_ESCAPE_CLAUSE - Supports LIKE...ESCAPE
78 => 1, # SQL_LOCK_TYPES - Only NO_CHANGE
10022 => 0, # SQL_MAX_ASYNC_CONCURRENT_STATEMENTS - No async mode
112 => 1_000_000, # SQL_MAX_BINARY_LITERAL_LEN - SQLITE_MAX_SQL_LENGTH
34 => 1_000_000, # SQL_MAX_CATALOG_NAME_LEN - SQLITE_MAX_SQL_LENGTH
108 => 1_000_000, # SQL_MAX_CHAR_LITERAL_LEN - SQLITE_MAX_SQL_LENGTH
97 => 2000, # SQL_MAX_COLUMNS_IN_GROUP_BY - SQLITE_MAX_COLUMN
98 => 2000, # SQL_MAX_COLUMNS_IN_INDEX - SQLITE_MAX_COLUMN
99 => 2000, # SQL_MAX_COLUMNS_IN_ORDER_BY - SQLITE_MAX_COLUMN
100 => 2000, # SQL_MAX_COLUMNS_IN_SELECT - SQLITE_MAX_COLUMN
101 => 2000, # SQL_MAX_COLUMNS_IN_TABLE - SQLITE_MAX_COLUMN
30 => 1_000_000, # SQL_MAX_COLUMN_NAME_LEN - SQLITE_MAX_SQL_LENGTH
1 => 1021, # SQL_MAX_CONCURRENT_ACTIVITIES - Typical filehandle limits
31 => 1_000_000, # SQL_MAX_CURSOR_NAME_LEN - SQLITE_MAX_SQL_LENGTH
0 => 1021, # SQL_MAX_DRIVER_CONNECTIONS - Typical filehandle limits
10005 => 1_000_000, # SQL_MAX_IDENTIFIER_LEN - SQLITE_MAX_SQL_LENGTH
102 => 2147483646*65536, # SQL_MAX_INDEX_SIZE - Tied to DB size, which is theoretically 140TB
32 => 1_000_000, # SQL_MAX_OWNER_NAME_LEN - SQLITE_MAX_SQL_LENGTH
33 => 1_000_000, # SQL_MAX_PROCEDURE_NAME_LEN - SQLITE_MAX_SQL_LENGTH
34 => 1_000_000, # SQL_MAX_QUALIFIER_NAME_LEN - SQLITE_MAX_SQL_LENGTH
104 => 1_000_000, # SQL_MAX_ROW_SIZE - SQLITE_MAX_SQL_LENGTH (since INSERT has to be used)
103 => 'Y', # SQL_MAX_ROW_SIZE_INCLUDES_LONG
32 => 1_000_000, # SQL_MAX_SCHEMA_NAME_LEN - SQLITE_MAX_SQL_LENGTH
105 => 1_000_000, # SQL_MAX_STATEMENT_LEN - SQLITE_MAX_SQL_LENGTH
106 => 64, # SQL_MAX_TABLES_IN_SELECT - 64 tables, because of the bitmap in the query optimizer
35 => 1_000_000, # SQL_MAX_TABLE_NAME_LEN - SQLITE_MAX_SQL_LENGTH
107 => 0, # SQL_MAX_USER_NAME_LEN - No user support
37 => 'Y', # SQL_MULTIPLE_ACTIVE_TXN - Supports multiple txns, though not nested
36 => 'N', # SQL_MULT_RESULT_SETS - No batches
111 => 'N', # SQL_NEED_LONG_DATA_LEN - Doesn't care about LONG
75 => 1, # SQL_NON_NULLABLE_COLUMNS - Supports NOT NULL
85 => 1, # SQL_NULL_COLLATION - NULLs first on ASC (low end)
49 => 4194304+1, # SQL_NUMERIC_FUNCTIONS - Just ABS & ROUND (has RANDOM, but not RAND)
9 => 1, # SQL_ODBC_API_CONFORMANCE - Same as sqlite3odbc.c
152 => 1, # SQL_ODBC_INTERFACE_CONFORMANCE - Same as sqlite3odbc.c
12 => 0, # SQL_ODBC_SAG_CLI_CONFORMANCE - Same as sqlite3odbc.c
15 => 0, # SQL_ODBC_SQL_CONFORMANCE - Same as sqlite3odbc.c
10 => '03.00', # SQL_ODBC_VER - Same as sqlite3odbc.c
115 => 1+8+16+32+64, # SQL_OJ_CAPABILITIES - Supports all OUTER JOINs except RIGHT & FULL
90 => 'N', # SQL_ORDER_BY_COLUMNS_IN_SELECT - ORDER BY columns don't have to be in the SELECT list
38 => 'Y', # SQL_OUTER_JOINS - Supports OUTER JOINs
153 => 2, # SQL_PARAM_ARRAY_ROW_COUNTS - Only has row counts for executed statements
154 => 3, # SQL_PARAM_ARRAY_SELECTS - No support for arrays of parameters
80 => 0, # SQL_POSITIONED_STATEMENTS - No support for positioned statements (WHERE CURRENT OF or SELECT FOR UPDATE)
79 => 31, # SQL_POS_OPERATIONS - Supports all SQLSetPos operations
21 => 'N', # SQL_PROCEDURES - No procedures
40 => '', # SQL_PROCEDURE_TERM - No procedures
93 => 4, # SQL_QUOTED_IDENTIFIER_CASE - Even quoted identifiers are case-insensitive
11 => 'N', # SQL_ROW_UPDATES - No fancy cursor update support
39 => '', # SQL_SCHEMA_TERM - No schemas
91 => 0, # SQL_SCHEMA_USAGE - No schemas
43 => 2, # SQL_SCROLL_CONCURRENCY - Updates/deletes on cursors lock the database
44 => 1+16, # SQL_SCROLL_OPTIONS - Only supports static & forward-only cursors
14 => '\\', # SQL_SEARCH_PATTERN_ESCAPE - Default escape character for LIKE is \
13 => \&sql_server_name, # SQL_SERVER_NAME - Just $dbh->{Name}
94 => '', # SQL_SPECIAL_CHARACTERS - Other drivers tend to stick to the ASCII/Latin-1 range, and SQLite uses all of
# the lower 7-bit punctuation for other things
155 => 7, # SQL_SQL92_DATETIME_FUNCTIONS - Supports CURRENT_(DATE|TIME|TIMESTAMP)
156 => 1+2+4+8, # SQL_SQL92_FOREIGN_KEY_DELETE_RULE - Support all ON DELETE options
157 => 1+2+4+8, # SQL_SQL92_FOREIGN_KEY_UPDATE_RULE - Support all ON UPDATE options
158 => 0, # SQL_SQL92_GRANT - No users; no support for GRANT
159 => 0, # SQL_SQL92_NUMERIC_VALUE_FUNCTIONS - No support for any of the listed functions
160 => 1+2+4+512+1024+2048+4096+8192, # SQL_SQL92_PREDICATES - Supports the important comparison operators
161 => 2+16+64+128, # SQL_SQL92_RELATIONAL_JOIN_OPERATORS - Supports the important ones except RIGHT/FULL OUTER JOINs
162 => 0, # SQL_SQL92_REVOKE - No users; no support for REVOKE
163 => 1+2+8, # SQL_SQL92_ROW_VALUE_CONSTRUCTOR - Supports most row value constructors
164 => 2+4, # SQL_SQL92_STRING_FUNCTIONS - Just UPPER & LOWER (has SUBSTR, but not SUBSTRING and SQL-92's weird TRIM syntax)
165 => 1+2+4+8, # SQL_SQL92_VALUE_EXPRESSIONS - Supports all SQL-92 value expressions
118 => 1, # SQL_SQL_CONFORMANCE - SQL-92 Entry level
83 => 0, # SQL_STATIC_SENSITIVITY - Cursors would lock the DB, so only old data is visible
50 => 8+16+256+1024+16384+131072, # SQL_STRING_FUNCTIONS - LTRIM, LENGTH, REPLACE, RTRIM, CHAR, SOUNDEX
95 => 1+2+4+8+16, # SQL_SUBQUERIES - Supports all of the subquery types
51 => 4, # SQL_SYSTEM_FUNCTIONS - Only IFNULL
45 => 'table', # SQL_TABLE_TERM - Tables are called tables
109 => 0, # SQL_TIMEDATE_ADD_INTERVALS - No support for INTERVAL
110 => 0, # SQL_TIMEDATE_DIFF_INTERVALS - No support for INTERVAL
52 => 0x20000+0x40000+0x80000, # SQL_TIMEDATE_FUNCTIONS - Only supports CURRENT_(DATE|TIME|TIMESTAMP)
46 => 2, # SQL_TXN_CAPABLE - Full transaction support for both DML & DDL
72 => 1+8, # SQL_TXN_ISOLATION_OPTION - Supports read uncommitted and serializable
96 => 1+2, # SQL_UNION - Supports UNION and UNION ALL
47 => '', # SQL_USER_NAME - No users
166 => 1, # SQL_STANDARD_CLI_CONFORMANCE - X/Open CLI Version 1.0
10000 => 1992, # SQL_XOPEN_CLI_YEAR - Year for V1.0
);
sub sql_dbms_ver {
my $dbh = shift;
return $dbh->FETCH('sqlite_version');
}
sub sql_data_source_name {
my $dbh = shift;
return "dbi:SQLite:".$dbh->{Name};
}
sub sql_data_source_read_only {
my $dbh = shift;
my $flags = $dbh->FETCH('sqlite_open_flags') || 0;
return $dbh->{ReadOnly} || ($flags & DBD::SQLite::OPEN_READONLY()) ? 'Y' : 'N';
}
sub sql_database_name {
my $dbh = shift;
my $databases = $dbh->selectall_hashref('PRAGMA database_list', 'seq');
return $databases->{0}{name};
}
sub sql_keywords {
# SQLite keywords minus ODBC keywords
return join ',', (qw<
ABORT AFTER ANALYZE ATTACH AUTOINCREMENT BEFORE CONFLICT DATABASE DETACH EACH EXCLUSIVE
EXPLAIN FAIL GLOB IF IGNORE INDEXED INSTEAD ISNULL LIMIT NOTNULL OFFSET
PLAN PRAGMA QUERY RAISE RECURSIVE REGEXP REINDEX RELEASE RENAME REPLACE ROW
SAVEPOINT TEMP TRIGGER VACUUM VIRTUAL WITHOUT
>);
}
sub sql_server_name {
my $dbh = shift;
return $dbh->{Name};
}
1;
__END__


@@ -0,0 +1,824 @@
#======================================================================
package DBD::SQLite::VirtualTable;
#======================================================================
use strict;
use warnings;
use Scalar::Util qw/weaken/;
our $VERSION = '1.66';
our @ISA;
#----------------------------------------------------------------------
# methods for registering/destroying the module
#----------------------------------------------------------------------
sub CREATE_MODULE { my ($class, $mod_name) = @_; }
sub DESTROY_MODULE { my ($class, $mod_name) = @_; }
#----------------------------------------------------------------------
# methods for creating/destroying instances
#----------------------------------------------------------------------
sub CREATE { my $class = shift; return $class->NEW(@_); }
sub CONNECT { my $class = shift; return $class->NEW(@_); }
sub _PREPARE_SELF {
my ($class, $dbh_ref, $module_name, $db_name, $vtab_name, @args) = @_;
my @columns;
my %options;
# args containing '=' are options; others are column declarations
foreach my $arg (@args) {
if ($arg =~ /^([^=\s]+)\s*=\s*(.*)/) {
my ($key, $val) = ($1, $2);
$val =~ s/^"(.*)"$/$1/;
$options{$key} = $val;
}
else {
push @columns, $arg;
}
}
# build $self
my $self = {
dbh_ref => $dbh_ref,
module_name => $module_name,
db_name => $db_name,
vtab_name => $vtab_name,
columns => \@columns,
options => \%options,
};
weaken $self->{dbh_ref};
return $self;
}
sub NEW {
my $class = shift;
my $self = $class->_PREPARE_SELF(@_);
bless $self, $class;
}
sub VTAB_TO_DECLARE {
my $self = shift;
local $" = ", ";
my $sql = "CREATE TABLE $self->{vtab_name}(@{$self->{columns}})";
return $sql;
}
sub DROP { my $self = shift; }
sub DISCONNECT { my $self = shift; }
#----------------------------------------------------------------------
# methods for initiating a search
#----------------------------------------------------------------------
sub BEST_INDEX {
my ($self, $constraints, $order_by) = @_;
my $ix = 0;
foreach my $constraint (grep {$_->{usable}} @$constraints) {
$constraint->{argvIndex} = $ix++;
$constraint->{omit} = 0;
}
# stupid default values -- subclasses should put real values instead
my $outputs = {
idxNum => 1,
idxStr => "",
orderByConsumed => 0,
estimatedCost => 1.0,
estimatedRows => undef,
};
return $outputs;
}
sub OPEN {
my $self = shift;
my $class = ref $self;
my $cursor_class = $class . "::Cursor";
return $cursor_class->NEW($self, @_);
}
#----------------------------------------------------------------------
# methods for insert/delete/update
#----------------------------------------------------------------------
sub _SQLITE_UPDATE {
my ($self, $old_rowid, $new_rowid, @values) = @_;
if (! defined $old_rowid) {
return $self->INSERT($new_rowid, @values);
}
elsif (!@values) {
return $self->DELETE($old_rowid);
}
else {
return $self->UPDATE($old_rowid, $new_rowid, @values);
}
}
sub INSERT {
my ($self, $new_rowid, @values) = @_;
die "INSERT() should be redefined in subclass";
}
sub DELETE {
my ($self, $old_rowid) = @_;
die "DELETE() should be redefined in subclass";
}
sub UPDATE {
my ($self, $old_rowid, $new_rowid, @values) = @_;
die "UPDATE() should be redefined in subclass";
}
#----------------------------------------------------------------------
# remaining methods of the sqlite API
#----------------------------------------------------------------------
sub BEGIN_TRANSACTION {return 0}
sub SYNC_TRANSACTION {return 0}
sub COMMIT_TRANSACTION {return 0}
sub ROLLBACK_TRANSACTION {return 0}
sub SAVEPOINT {return 0}
sub RELEASE {return 0}
sub ROLLBACK_TO {return 0}
sub FIND_FUNCTION {return 0}
sub RENAME {return 0}
#----------------------------------------------------------------------
# utility methods
#----------------------------------------------------------------------
sub dbh {
my $self = shift;
return ${$self->{dbh_ref}};
}
sub sqlite_table_info {
my $self = shift;
my $sql = "PRAGMA table_info($self->{vtab_name})";
return $self->dbh->selectall_arrayref($sql, {Slice => {}});
}
#======================================================================
package DBD::SQLite::VirtualTable::Cursor;
#======================================================================
use strict;
use warnings;
sub NEW {
my ($class, $vtable, @args) = @_;
my $self = {vtable => $vtable,
args => \@args};
bless $self, $class;
}
sub FILTER {
my ($self, $idxNum, $idxStr, @values) = @_;
die "FILTER() should be redefined in cursor subclass";
}
sub EOF {
my ($self) = @_;
die "EOF() should be redefined in cursor subclass";
}
sub NEXT {
my ($self) = @_;
die "NEXT() should be redefined in cursor subclass";
}
sub COLUMN {
my ($self, $idxCol) = @_;
die "COLUMN() should be redefined in cursor subclass";
}
sub ROWID {
my ($self) = @_;
die "ROWID() should be redefined in cursor subclass";
}
1;
__END__
=head1 NAME
DBD::SQLite::VirtualTable -- SQLite virtual tables implemented in Perl
=head1 SYNOPSIS
# register the virtual table module within sqlite
$dbh->sqlite_create_module(mod_name => "DBD::SQLite::VirtualTable::Subclass");
# create a virtual table
$dbh->do("CREATE VIRTUAL TABLE vtbl USING mod_name(arg1, arg2, ...)");
# use it as any regular table
my $sth = $dbh->prepare("SELECT * FROM vtbl WHERE ...");
B<Note> : VirtualTable subclasses or instances are not called
directly from Perl code; everything happens indirectly through SQL
statements within SQLite.
=head1 DESCRIPTION
This module is an abstract class for implementing SQLite virtual tables,
written in Perl. Such tables look like regular tables, and are accessed
through regular SQL instructions and regular L<DBI> API; but the implementation
is done through hidden calls to a Perl class.
This is the same idea as Perl's L<tied variables|perltie>, but
at the SQLite level.
The current abstract class cannot be used directly, so the
synopsis above is just to give a general idea. Concrete, usable
classes bundled with the present distribution are :
=over
=item *
L<DBD::SQLite::VirtualTable::FileContent> : implements a virtual
column that exposes file contents. This is especially useful
in conjunction with a fulltext index; see L<DBD::SQLite::Fulltext_search>.
=item *
L<DBD::SQLite::VirtualTable::PerlData> : binds to a Perl array
within the Perl program. This can be used for simple import/export
operations, for debugging purposes, for joining data from different
sources, etc.
=back
Other Perl virtual tables may also be published separately on CPAN.
The following chapters document the structure of the abstract class
and explain how to write new subclasses; this is meant for
B<module authors>, not for end users. If you just need to use a
virtual table module, refer to that module's documentation.
=head1 ARCHITECTURE
=head2 Classes
A virtual table module for SQLite is implemented through a pair
of classes :
=over
=item *
the B<table> class implements methods for creating or connecting
a virtual table, for destroying it, for opening new searches, etc.
=item *
the B<cursor> class implements methods for performing a specific
SQL statement
=back
=head2 Methods
Most methods in both classes are not called directly from Perl
code : instead, they are callbacks, called from the sqlite kernel.
Following common Perl conventions, such methods have names in
uppercase.
=head1 TABLE METHODS
=head2 Class methods for registering the module
=head3 CREATE_MODULE
$class->CREATE_MODULE($sqlite_module_name);
Called when the client code invokes
$dbh->sqlite_create_module($sqlite_module_name => $class);
The default implementation is empty.
=head3 DESTROY_MODULE
$class->DESTROY_MODULE();
Called automatically when the database handle is disconnected.
The default implementation is empty.
=head2 Class methods for creating a vtable instance
=head3 CREATE
$class->CREATE($dbh_ref, $module_name, $db_name, $vtab_name, @args);
Called when sqlite receives a statement
CREATE VIRTUAL TABLE $db_name.$vtab_name USING $module_name(@args)
The default implementation just calls L</NEW>.
=head3 CONNECT
$class->CONNECT($dbh_ref, $module_name, $db_name, $vtab_name, @args);
Called when attempting to access a virtual table that had been created
during a previous database connection. The creation arguments were stored
within the sqlite database and are passed again to the CONNECT method.
The default implementation just calls L</NEW>.
=head3 _PREPARE_SELF
$class->_PREPARE_SELF($dbh_ref, $module_name, $db_name, $vtab_name, @args);
Prepares the data structure for a virtual table instance. C<@args> is
just the collection of strings (comma-separated) that were given
within the C<CREATE VIRTUAL TABLE> statement; each subclass should
decide what to do with this information.
The method parses C<@args> to differentiate between I<options>
(strings of shape C<$key>=C<$value> or C<$key>=C<"$value">, stored in
C<< $self->{options} >>), and I<columns> (other C<@args>, stored in
C<< $self->{columns} >>). It creates a hashref with the following fields :
=over
=item C<dbh_ref>
a weak reference to the C<$dbh> database handle (see
L<Scalar::Util> for an explanation of weak references).
=item C<module_name>
name of the module as declared to sqlite (not to be confused
with the Perl class name).
=item C<db_name>
name of the database (usually C<'main'> or C<'temp'>), but it
may also be an attached database
=item C<vtab_name>
name of the virtual table
=item C<columns>
arrayref of column declarations
=item C<options>
hashref of option declarations
=back
This method should not be redefined, since it performs
general work which is supposed to be useful for all subclasses.
Instead, subclasses may override the L</NEW> method.
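The parsing rule described above can be sketched in plain Perl, without a database. The helper name C<split_vtab_args> and the sample arguments are hypothetical and only serve as an illustration; the regular expression follows the rule that arguments of shape C<$key>=C<$value> are options and all other arguments are column declarations.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of DBD::SQLite) mirroring the rule above:
# arguments of shape key=value become options, the others are columns.
sub split_vtab_args {
    my @args = @_;
    my (@columns, %options);
    foreach my $arg (@args) {
        if ($arg =~ /^([^=\s]+)\s*=\s*(.*)/) {
            my ($key, $val) = ($1, $2);
            $val =~ s/^"(.*)"$/$1/;    # strip one pair of surrounding double quotes
            $options{$key} = $val;
        }
        else {
            push @columns, $arg;
        }
    }
    return (\@columns, \%options);
}

# Sample arguments, as they might appear in CREATE VIRTUAL TABLE ... USING mod(...)
my ($columns, $options) = split_vtab_args(
    'id INTEGER',
    'name TEXT',
    'source = src_table',
    'root="/foo/bar"',
);

print "columns: ", join(', ', @$columns), "\n";
print "$_ => $options->{$_}\n" for sort keys %$options;
```

A subclass would normally not repeat this work; it can read C<< $self->{columns} >> and C<< $self->{options} >> after calling C<_PREPARE_SELF>.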
=head3 NEW
$class->NEW($dbh_ref, $module_name, $db_name, $vtab_name, @args);
Instantiates a virtual table.
=head2 Instance methods called from the sqlite kernel
=head3 DROP
Called whenever a virtual table is destroyed from the
database through the C<DROP TABLE> SQL instruction.
Just after the C<DROP()> call, the Perl instance
will be destroyed (and will therefore automatically
call the C<DESTROY()> method if such a method is present).
The default implementation for DROP is empty.
B<Note> : this corresponds to the C<xDestroy> method
in the SQLite documentation; here it was not named
C<DESTROY>, to avoid any confusion with the standard
Perl method C<DESTROY> for object destruction.
=head3 DISCONNECT
Called for every virtual table just before the database handle
is disconnected.
Just after the C<DISCONNECT()> call, the Perl instance
will be destroyed (and will therefore automatically
call the C<DESTROY()> method if such a method is present).
The default implementation for DISCONNECT is empty.
=head3 VTAB_TO_DECLARE
This method is called automatically just after L</CREATE> or L</CONNECT>,
to register the columns of the virtual table within the sqlite kernel.
The method should return a string containing a SQL C<CREATE TABLE> statement;
but only the column declaration parts will be considered.
Columns may be declared with the special keyword "HIDDEN", which means that
they are used internally for the virtual table implementation, and are
not visible to users -- see L<http://sqlite.org/c3ref/declare_vtab.html>
and L<http://www.sqlite.org/vtab.html#hiddencol> for detailed explanations.
The default implementation returns:
CREATE TABLE $self->{vtab_name}(@{$self->{columns}})
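As an illustration, the string produced by this default implementation can be reproduced with a plain hashref. The table name, the column names and the use of a C<HIDDEN> column are assumptions chosen for the example.

```perl
use strict;
use warnings;

# Hypothetical instance data, shaped like the hashref built by _PREPARE_SELF
my $self = {
    vtab_name => 'vtbl',
    columns   => [ 'id INTEGER', 'name TEXT', 'secret TEXT HIDDEN' ],
};

my $sql = do {
    local $" = ", ";    # separator used when an array is interpolated in a string
    "CREATE TABLE $self->{vtab_name}(@{$self->{columns}})";
};

print "$sql\n";    # CREATE TABLE vtbl(id INTEGER, name TEXT, secret TEXT HIDDEN)
```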
=head3 BEST_INDEX
my $index_info = $vtab->BEST_INDEX($constraints, $order_by)
This is the most complex method to redefine in subclasses.
This method will be called at the beginning of a new query on the
virtual table; the job of the method is to assemble some information
that will be used
=over
=item a)
by the sqlite kernel to decide about the best search strategy
=item b)
by the cursor L</FILTER> method to produce the desired subset
of rows from the virtual table.
=back
By calling this method, the SQLite core is saying to the virtual table
that it needs to access some subset of the rows in the virtual table
and it wants to know the most efficient way to do that access. The
C<BEST_INDEX> method replies with information that the SQLite core can
then use to conduct an efficient search of the virtual table.
The method takes as input a list of C<$constraints> and a list
of C<$order_by> instructions. It returns a hashref of indexing
properties, described below; furthermore, the method also adds
supplementary information within the input C<$constraints>.
Detailed explanations are given in
L<http://sqlite.org/vtab.html#xbestindex>.
=head4 Input constraints
Elements of the C<$constraints> arrayref correspond to
specific clauses of the C<WHERE ...> part of the SQL query.
Each constraint is a hashref with keys :
=over
=item C<col>
the integer index of the column on the left-hand side of the constraint
=item C<op>
the comparison operator, expressed as a string containing
C<< '=' >>, C<< '>' >>, C<< '>=' >>, C<< '<' >>, C<< '<=' >> or C<< 'MATCH' >>.
=item C<usable>
a boolean indicating if that constraint is usable; some constraints
might not be usable because of the way tables are ordered in a join.
=back
The C<$constraints> arrayref is used both for input and for output.
While iterating over the array, the method should
add the following keys into usable constraints :
=over
=item C<argvIndex>
An index into the C<@values> array that will be passed to
the cursor's L</FILTER> method. In other words, if the current
constraint corresponds to the SQL fragment C<WHERE ... AND foo < 123 ...>,
and the corresponding C<argvIndex> takes value 5, this means that
the C<FILTER> method will receive C<123> in C<$values[5]>.
=item C<omit>
A boolean telling the sqlite core that it can safely skip
double-checking that constraint before returning the resultset
to the calling program; this means that the FILTER method has fulfilled
the filtering job on that constraint and there is no need to do any
further checking.
=back
The C<BEST_INDEX> method will not necessarily receive all constraints
from the SQL C<WHERE> clause : for example a constraint like
C<< col1 < col2 + col3 >> cannot be handled at this level.
Furthermore, the C<BEST_INDEX> method might decide to ignore some of the
received constraints. This is why a second pass over the results
will be performed by the sqlite core.
=head4 "order_by" input information
The C<$order_by> arrayref corresponds to the C<ORDER BY> clauses
in the SQL query. Each entry is a hashref with keys :
=over
=item C<col>
the integer index of the column being ordered
=item C<desc>
a boolean telling whether the ordering is DESCending or ascending
=back
This information could be used by some subclasses for
optimizing the query strategy; but usually the sqlite core will
perform another sorting pass once all results are gathered.
=head4 Hashref information returned by BEST_INDEX
The method should return a hashref with the following keys :
=over
=item C<idxNum>
An arbitrary integer associated with that index; this information will
be passed back to L</FILTER>.
=item C<idxStr>
An arbitrary string associated with that index; this information will
be passed back to L</FILTER>.
=item C<orderByConsumed>
A boolean telling the sqlite core if the C<$order_by> information
has been taken into account or not.
=item C<estimatedCost>
A float that should be set to the estimated number of disk access
operations required to execute this query against the virtual
table. The SQLite core will often call BEST_INDEX multiple times with
different constraints, obtain multiple cost estimates, then choose the
query plan that gives the lowest estimate.
=item C<estimatedRows>
An integer giving the estimated number of rows returned by that query.
=back
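The input and output conventions of C<BEST_INDEX> can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: it is written as a plain function named C<best_index_sketch> so that it runs without a database, the column names in C<@headers> are invented, and the cost estimate is a placeholder. In a real subclass the same logic would live in the C<BEST_INDEX> method.

```perl
use strict;
use warnings;

# Hypothetical column names of an imaginary virtual table
my @headers = qw/id name size/;

# Sketch of a BEST_INDEX body, as a plain function instead of a method
sub best_index_sketch {
    my ($constraints, $order_by) = @_;
    my @conditions;
    my $ix = 0;
    foreach my $constraint (grep { $_->{usable} } @$constraints) {
        my $col     = $constraint->{col};
        my $colname = $col == -1 ? 'rowid' : $headers[$col];
        push @conditions, "$colname $constraint->{op} ?";
        $constraint->{argvIndex} = $ix++;   # position of the value in FILTER's @values
        $constraint->{omit}      = 0;       # let sqlite double-check the rows
    }
    return {
        idxNum          => scalar @conditions,
        idxStr          => join(' AND ', @conditions),
        orderByConsumed => 0,               # $order_by is ignored in this sketch
        estimatedCost   => 1.0,             # placeholder value
        estimatedRows   => undef,
    };
}

my $constraints = [
    { col => 2, op => '>',  usable => 1 },
    { col => 1, op => '=',  usable => 0 },  # not usable: left untouched
    { col => 0, op => '<=', usable => 1 },
];
my $index_info = best_index_sketch($constraints, []);

print "$index_info->{idxStr}\n";    # size > ? AND id <= ?
```

Here C<idxStr> carries a C<WHERE> fragment that a cursor's C<FILTER> method could reuse, with the values arriving in the order given by the C<argvIndex> entries.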
=head3 OPEN
Called to instantiate a new cursor.
The default implementation appends C<"::Cursor"> to the current
classname and calls C<NEW()> within that cursor class.
=head3 _SQLITE_UPDATE
This is the dispatch method implementing the C<xUpdate()> callback
for virtual tables. The default implementation applies the algorithm
described in L<http://sqlite.org/vtab.html#xupdate> to decide
to call L</INSERT>, L</DELETE> or L</UPDATE>; so there is no reason
to override this method in subclasses.
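The dispatch rule can be summarized by the following sketch. The function name C<dispatch_target> is hypothetical; it only returns the name of the method that would be selected for a given argument list, following the same tests as the default implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: names the method that the dispatch rule selects
sub dispatch_target {
    my ($old_rowid, $new_rowid, @values) = @_;
    return 'INSERT' if !defined $old_rowid;   # no existing row: insertion
    return 'DELETE' if !@values;              # existing row, no new values: deletion
    return 'UPDATE';                          # existing row and new values: update
}

print dispatch_target(undef, undef, 'apple', 3), "\n";   # INSERT
print dispatch_target(7), "\n";                          # DELETE
print dispatch_target(7, 7, 'apple', 4), "\n";           # UPDATE
```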
=head3 INSERT
my $rowid = $vtab->INSERT($new_rowid, @values);
This method should be overridden in subclasses to implement
insertion of a new row into the virtual table.
The size of the C<@values> array corresponds to the
number of columns declared through L</VTAB_TO_DECLARE>.
The C<$new_rowid> may be explicitly given, or it may be
C<undef>, in which case the method must compute a new id
and return it as the result of the method call.
=head3 DELETE
$vtab->DELETE($old_rowid);
This method should be overridden in subclasses to implement
deletion of a row from the virtual table.
=head3 UPDATE
$vtab->UPDATE($old_rowid, $new_rowid, @values);
This method should be overridden in subclasses to implement
a row update within the virtual table. Usually C<$old_rowid> is equal
to C<$new_rowid>, which is a regular update; however, the rowid
could be changed from a SQL statement such as
UPDATE table SET rowid=rowid+1 WHERE ...;
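The three methods can be illustrated with a small in-memory row store. This is only a sketch under stated assumptions: the class name C<My::RowStore> is hypothetical, rows are kept in a hash keyed by rowid, and the class does not inherit from C<DBD::SQLite::VirtualTable> so that the example runs without a database. A real subclass would add that inheritance and keep the same method signatures.

```perl
package My::RowStore;    # hypothetical class name, not part of DBD::SQLite
use strict;
use warnings;

sub new { return bless { rows => {}, last_rowid => 0 }, shift }

sub INSERT {
    my ($self, $new_rowid, @values) = @_;
    if (defined $new_rowid) {
        # explicit rowid: remember it so that generated ids stay unique
        $self->{last_rowid} = $new_rowid if $new_rowid > $self->{last_rowid};
    }
    else {
        $new_rowid = ++$self->{last_rowid};    # compute a new id
    }
    $self->{rows}{$new_rowid} = [@values];
    return $new_rowid;
}

sub DELETE {
    my ($self, $old_rowid) = @_;
    delete $self->{rows}{$old_rowid};
    return;
}

sub UPDATE {
    my ($self, $old_rowid, $new_rowid, @values) = @_;
    delete $self->{rows}{$old_rowid} if $new_rowid != $old_rowid;
    $self->{rows}{$new_rowid} = [@values];
    return;
}

package main;

my $store  = My::RowStore->new;
my $first  = $store->INSERT(undef, 'apple', 3);   # rowid generated by the store
my $second = $store->INSERT(10, 'pear', 5);       # rowid given explicitly
$store->UPDATE($first, $first, 'apple', 4);       # regular update
$store->UPDATE($second, 11, 'pear', 5);           # the statement changed the rowid
$store->DELETE($first);

print join(',', sort { $a <=> $b } keys %{ $store->{rows} }), "\n";   # 11
```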
=head3 FIND_FUNCTION
$vtab->FIND_FUNCTION($num_args, $func_name);
When a function uses a column from a virtual table as its first
argument, this method is called to see if the virtual table would like
to overload the function. Parameters are the number of arguments to
the function, and the name of the function. If no overloading is
desired, this method should return false. To overload the function,
this method should return a coderef to the function implementation.
Each virtual table keeps a cache of results from L</FIND_FUNCTION> calls,
so the method will be called only once for each pair
C<< ($num_args, $func_name) >>.
=head3 BEGIN_TRANSACTION
Called to begin a transaction on the virtual table.
=head3 SYNC_TRANSACTION
Called to signal the start of a two-phase commit on the virtual table.
=head3 COMMIT_TRANSACTION
Called to commit a virtual table transaction.
=head3 ROLLBACK_TRANSACTION
Called to roll back a virtual table transaction.
=head3 RENAME
$vtab->RENAME($new_name)
Called to rename a virtual table.
=head3 SAVEPOINT
$vtab->SAVEPOINT($savepoint)
Called to signal the virtual table to save its current state
at savepoint C<$savepoint> (an integer).
=head3 ROLLBACK_TO
$vtab->ROLLBACK_TO($savepoint)
Called to signal the virtual table to return to the state
C<$savepoint>. This will invalidate all savepoints with values
greater than C<$savepoint>.
=head3 RELEASE
$vtab->RELEASE($savepoint)
Called to invalidate all savepoints with values
greater than or equal to C<$savepoint>.
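The relationship between the three savepoint methods can be sketched with an in-memory snapshot store. The class name C<My::SavepointDemo> and its internal fields are hypothetical, and the state is just a flat array copied at each savepoint; this shows one possible bookkeeping scheme, not what any bundled module does. Each method returns 0, like the default implementations.

```perl
package My::SavepointDemo;    # hypothetical class name
use strict;
use warnings;

sub new { return bless { data => [], saved => {} }, shift }

sub SAVEPOINT {
    my ($self, $savepoint) = @_;
    $self->{saved}{$savepoint} = [ @{ $self->{data} } ];   # shallow copy of the state
    return 0;
}

sub ROLLBACK_TO {
    my ($self, $savepoint) = @_;
    my $snapshot = $self->{saved}{$savepoint};
    $self->{data} = [@$snapshot] if $snapshot;
    # savepoints with greater values are no longer valid
    delete $self->{saved}{$_} for grep { $_ > $savepoint } keys %{ $self->{saved} };
    return 0;
}

sub RELEASE {
    my ($self, $savepoint) = @_;
    delete $self->{saved}{$_} for grep { $_ >= $savepoint } keys %{ $self->{saved} };
    return 0;
}

package main;

my $demo = My::SavepointDemo->new;
push @{ $demo->{data} }, 'a';
$demo->SAVEPOINT(0);
push @{ $demo->{data} }, 'b';
$demo->SAVEPOINT(1);
push @{ $demo->{data} }, 'c';

$demo->ROLLBACK_TO(0);    # state is ('a') again; savepoint 1 is discarded
my @after_rollback = sort { $a <=> $b } keys %{ $demo->{saved} };
$demo->RELEASE(0);        # savepoint 0 is discarded as well

print "data: @{ $demo->{data} }\n";    # data: a
```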
=head2 Utility instance methods
Methods in this section are in lower case, because they
are not called directly from the sqlite kernel; these
are utility methods to be called from other methods
described above.
=head3 dbh
This method returns the database handle (C<$dbh>) associated with
the current virtual table.
=head1 CURSOR METHODS
=head2 Class methods
=head3 NEW
my $cursor = $cursor_class->NEW($vtable, @args)
Instantiates a new cursor.
The default implementation just returns a blessed hashref
with keys C<vtable> and C<args>.
=head2 Instance methods
=head3 FILTER
$cursor->FILTER($idxNum, $idxStr, @values);
This method begins a search of a virtual table.
The C<$idxNum> and C<$idxStr> arguments correspond to values returned
by L</BEST_INDEX> for the chosen index. The specific meanings of
those values are unimportant to SQLite, as long as C<BEST_INDEX> and
C<FILTER> agree on what that meaning is.
The C<BEST_INDEX> method may have requested the values of certain
expressions using the C<argvIndex> values of the
C<$constraints> list. Those values are passed to C<FILTER> through
the C<@values> array.
If the virtual table contains one or more rows that match the search
criteria, then the cursor must be left pointing at the first
row. Subsequent calls to L</EOF> must return false. If no
rows match, then the cursor must be left in a state that will cause
L</EOF> to return true. The SQLite engine will use the
L</COLUMN> and L</ROWID> methods to access that row content. The L</NEXT>
method will be used to advance to the next row.
=head3 EOF
This method must return false if the cursor currently points to a
valid row of data, or true otherwise. This method is called by the SQL
engine immediately after each L</FILTER> and L</NEXT> invocation.
=head3 NEXT
This method advances the cursor to the next row of a
result set initiated by L</FILTER>. If the cursor is already pointing at
the last row when this method is called, then the cursor no longer
points to valid data and a subsequent call to the L</EOF> method must
return true. If the cursor is successfully advanced to
another row of content, then subsequent calls to L</EOF> must return
false.
=head3 COLUMN
my $value = $cursor->COLUMN($idxCol);
The SQLite core invokes this method in order to find the value for the
N-th column of the current row. N is zero-based so the first column is
numbered 0.
=head3 ROWID
my $value = $cursor->ROWID;
Returns the I<rowid> of the row that the cursor is currently pointing at.
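The interplay of the cursor methods can be illustrated with a cursor over an in-memory array, driven by a loop that imitates the calling sequence described above (C<FILTER>, then C<EOF>, C<COLUMN>, C<ROWID> and C<NEXT> until the end). The class name C<My::ArrayCursor>, the C<rows> field of the table object and the choice of C<position + 1> as rowid are assumptions made for this sketch. To stay runnable without a database the class does not inherit from C<DBD::SQLite::VirtualTable::Cursor>, which a real cursor class would do.

```perl
package My::ArrayCursor;    # hypothetical class name
use strict;
use warnings;

sub NEW {
    my ($class, $vtable, @args) = @_;
    return bless { vtable => $vtable, args => \@args }, $class;
}

sub FILTER {
    my ($self, $idxNum, $idxStr, @values) = @_;
    $self->{pos} = 0;    # this sketch ignores the index information: full scan
    return;
}

sub EOF {
    my ($self) = @_;
    return $self->{pos} >= @{ $self->{vtable}{rows} };
}

sub NEXT {
    my ($self) = @_;
    $self->{pos}++;
    return;
}

sub COLUMN {
    my ($self, $idxCol) = @_;
    return $self->{vtable}{rows}[ $self->{pos} ][$idxCol];
}

sub ROWID {
    my ($self) = @_;
    return $self->{pos} + 1;    # rowid derived from the array position
}

package main;

# Stand-in for a table object; only the rows matter here
my $vtable = { rows => [ [ 'apple', 3 ], [ 'pear', 5 ] ] };
my $cursor = My::ArrayCursor->NEW($vtable);

# Imitation of the calls normally issued by the sqlite kernel
my @seen;
$cursor->FILTER(0, "");
until ($cursor->EOF) {
    push @seen, join '|', $cursor->ROWID, map { $cursor->COLUMN($_) } 0 .. 1;
    $cursor->NEXT;
}
print "$_\n" for @seen;    # 1|apple|3 then 2|pear|5
```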
=head1 SEE ALSO
L<SQLite::VirtualTable> is another module for virtual tables written
in Perl, but designed for the reverse use case : instead of starting a
Perl program, and embedding the SQLite library into it, the intended
use is to start an sqlite program, and embed the Perl interpreter
into it.
=head1 AUTHOR
Laurent Dami E<lt>dami@cpan.orgE<gt>
=head1 COPYRIGHT AND LICENSE
Copyright Laurent Dami, 2014.
Parts of the code are borrowed from L<SQLite::VirtualTable>,
copyright (C) 2006, 2009 by Qindel Formacion y Servicios, S. L.
This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.
=cut


@@ -0,0 +1,333 @@
#======================================================================
package DBD::SQLite::VirtualTable::FileContent;
#======================================================================
use strict;
use warnings;
use base 'DBD::SQLite::VirtualTable';
my %option_ok = map {($_ => 1)} qw/source content_col path_col
expose root get_content/;
my %defaults = (
content_col => "content",
path_col => "path",
expose => "*",
get_content => "DBD::SQLite::VirtualTable::FileContent::get_content",
);
#----------------------------------------------------------------------
# object instantiation
#----------------------------------------------------------------------
sub NEW {
my $class = shift;
my $self = $class->_PREPARE_SELF(@_);
local $" = ", "; # for array interpolation in strings
# initial parameter check
!@{$self->{columns}}
or die "${class}->NEW(): illegal options: @{$self->{columns}}";
$self->{options}{source}
or die "${class}->NEW(): missing (source=...)";
my @bad_options = grep {!$option_ok{$_}} keys %{$self->{options}};
!@bad_options
or die "${class}->NEW(): bad options: @bad_options";
# defaults ... tempted to use //= but we still want to support perl 5.8 :-(
foreach my $k (keys %defaults) {
defined $self->{options}{$k}
or $self->{options}{$k} = $defaults{$k};
}
# get list of columns from the source table
my $src_table = $self->{options}{source};
my $sql = "PRAGMA table_info($src_table)";
my $dbh = ${$self->{dbh_ref}}; # can't use method ->dbh, not blessed yet
my $src_info = $dbh->selectall_arrayref($sql, {Slice => [1, 2]});
@$src_info
or die "${class}->NEW(source=$src_table): no such table in database";
# associate each source colname with its type info or " " (should eval true)
my %src_col = map { ($_->[0] => $_->[1] || " ") } @$src_info;
# check / complete the exposed columns
my @exposed_cols;
if ($self->{options}{expose} eq '*') {
@exposed_cols = map {$_->[0]} @$src_info;
}
else {
@exposed_cols = split /\s*,\s*/, $self->{options}{expose};
my @bad_cols = grep { !$src_col{$_} } @exposed_cols;
die "table $src_table has no column named @bad_cols" if @bad_cols;
}
for (@exposed_cols) {
die "$class: $self->{options}{content_col} cannot be both the "
. "content_col and an exposed col" if $_ eq $self->{options}{content_col};
}
# build the list of columns for this table
$self->{columns} = [ "$self->{options}{content_col} TEXT",
map {"$_ $src_col{$_}"} @exposed_cols ];
# acquire a coderef to the get_content() implementation, which
# was given as a symbolic reference in %options
no strict 'refs';
$self->{get_content} = \ &{$self->{options}{get_content}};
bless $self, $class;
}
sub _build_headers {
my $self = shift;
my $cols = $self->sqlite_table_info;
# headers : names of columns, without type information
$self->{headers} = [ map {$_->{name}} @$cols ];
}
#----------------------------------------------------------------------
# method for initiating a search
#----------------------------------------------------------------------
sub BEST_INDEX {
my ($self, $constraints, $order_by) = @_;
$self->_build_headers if !$self->{headers};
my @conditions;
my $ix = 0;
foreach my $constraint (grep {$_->{usable}} @$constraints) {
my $col = $constraint->{col};
# if this is the content column, skip because we can't filter on it
next if $col == 0;
# for other columns, build a fragment for SQL WHERE on the underlying table
my $colname = $col == -1 ? "rowid" : $self->{headers}[$col];
push @conditions, "$colname $constraint->{op} ?";
$constraint->{argvIndex} = $ix++;
$constraint->{omit} = 1; # SQLite doesn't need to re-check the op
}
# TODO : exploit $order_by to add ordering clauses within idxStr
my $outputs = {
idxNum => 1,
idxStr => join(" AND ", @conditions),
orderByConsumed => 0,
estimatedCost => 1.0,
estimatedRows => undef,
};
return $outputs;
}
#----------------------------------------------------------------------
# method for preventing updates
#----------------------------------------------------------------------
sub _SQLITE_UPDATE {
my ($self, $old_rowid, $new_rowid, @values) = @_;
die "attempt to update a readonly virtual table";
}
#----------------------------------------------------------------------
# file slurping function (not a method!)
#----------------------------------------------------------------------
sub get_content {
my ($path, $root) = @_;
$path = "$root/$path" if $root;
my $content = "";
if (open my $fh, "<", $path) {
local $/; # slurp the whole file into a scalar
$content = <$fh>;
close $fh;
}
else {
warn "can't open $path";
}
return $content;
}
#======================================================================
package DBD::SQLite::VirtualTable::FileContent::Cursor;
#======================================================================
use strict;
use warnings;
use base "DBD::SQLite::VirtualTable::Cursor";
sub FILTER {
my ($self, $idxNum, $idxStr, @values) = @_;
my $vtable = $self->{vtable};
# build SQL
local $" = ", ";
my @cols = @{$vtable->{headers}};
$cols[0] = 'rowid'; # replace the content column by the rowid
push @cols, $vtable->{options}{path_col}; # path col in last position
my $sql = "SELECT @cols FROM $vtable->{options}{source}";
$sql .= " WHERE $idxStr" if $idxStr;
# request on the index table
my $dbh = $vtable->dbh;
$self->{sth} = $dbh->prepare($sql)
or die DBI->errstr;
$self->{sth}->execute(@values);
$self->{row} = $self->{sth}->fetchrow_arrayref;
return;
}
sub EOF {
my ($self) = @_;
return !$self->{row};
}
sub NEXT {
my ($self) = @_;
$self->{row} = $self->{sth}->fetchrow_arrayref;
}
sub COLUMN {
my ($self, $idxCol) = @_;
return $idxCol == 0 ? $self->file_content : $self->{row}[$idxCol];
}
sub ROWID {
my ($self) = @_;
return $self->{row}[0];
}
sub file_content {
my ($self) = @_;
my $root = $self->{vtable}{options}{root};
my $path = $self->{row}[-1];
my $get_content_func = $self->{vtable}{get_content};
return $get_content_func->($path, $root);
}
1;
__END__
=head1 NAME
DBD::SQLite::VirtualTable::FileContent -- virtual table for viewing file contents
=head1 SYNOPSIS
Within Perl :
$dbh->sqlite_create_module(fcontent => "DBD::SQLite::VirtualTable::FileContent");
Then, within SQL :
CREATE VIRTUAL TABLE tbl USING fcontent(
source = src_table,
content_col = content,
path_col = path,
expose = "path, col1, col2, col3", -- or "*"
root = "/foo/bar",
get_content = Foo::Bar::read_from_file
);
SELECT col1, path, content FROM tbl WHERE ...;
=head1 DESCRIPTION
A "FileContent" virtual table is bound to some underlying I<source
table>, which has a column containing paths to files. The virtual
table behaves like a database view on the source table, with an added
column which exposes the content from those files.
This is especially useful as the "external content" source of a
fulltext table (see L<DBD::SQLite::Fulltext_search>) : the source
table stores some metadata about files, and the fulltext engine
can then index both the metadata and the file contents.
=head1 PARAMETERS
Parameters for creating a C<FileContent> virtual table are
specified within the C<CREATE VIRTUAL TABLE> statement, just
like regular column declarations, but with an '=' sign.
Authorized parameters are :
=over
=item C<source>
The name of the I<source table>.
This parameter is mandatory. All other parameters are optional.
=item C<content_col>
The name of the virtual column exposing file contents.
The default is C<content>.
=item C<path_col>
The name of the column in C<source> that contains paths to files.
The default is C<path>.
=item C<expose>
A comma-separated list (within double quotes) of source column names
to be exposed by the virtual table. The default is C<"*">, which means
all source columns.
=item C<root>
An optional root directory that will be prepended to the I<path> column
when opening files.
=item C<get_content>
Fully qualified name of a Perl function for reading file contents.
The default implementation just slurps the entire file into a string,
but this hook can point to a more sophisticated implementation, for
example a function that removes HTML tags. The hooked function is
called like this :
$file_content = $get_content->($path, $root);
=back
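As an illustration of the C<get_content> hook, here is a minimal sketch of a reader that strips markup before returning the text. The package and function name (C<Foo::Bar::read_from_file>) are hypothetical and only echo the SYNOPSIS; the tag removal is a naive regex, not a real HTML parser.

```perl
package Foo::Bar;   # hypothetical package, echoing the SYNOPSIS
use strict;
use warnings;

# Hypothetical hook: same calling convention as the default reader,
# but returns the file text with markup removed.
sub read_from_file {
  my ($path, $root) = @_;
  $path = "$root/$path" if $root;

  open my $fh, "<", $path
    or do { warn "can't open $path: $!"; return "" };
  my $content = do { local $/; <$fh> };   # slurp the whole file
  close $fh;

  $content =~ s/<[^>]*>/ /g;    # naive tag removal, not a real HTML parser
  $content =~ s/\s+/ /g;        # collapse runs of whitespace
  $content =~ s/^ | $//g;       # trim leading and trailing space
  return $content;
}

package main;
use File::Temp qw(tempfile);

# Scratch file so that the sketch is self-contained.
my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "<html><body><h1>Title</h1><p>Some <b>bold</b> text</p></body></html>";
close $fh;

my $text = Foo::Bar::read_from_file($file);
print "$text\n";
```

Such a function would be named in the C<get_content> parameter of the C<CREATE VIRTUAL TABLE> statement, as in the SYNOPSIS.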
=head1 AUTHOR
Laurent Dami E<lt>dami@cpan.orgE<gt>
=head1 COPYRIGHT AND LICENSE
Copyright Laurent Dami, 2014.
This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.
=cut

@@ -0,0 +1,488 @@
#======================================================================
package DBD::SQLite::VirtualTable::PerlData;
#======================================================================
use strict;
use warnings;
use base 'DBD::SQLite::VirtualTable';
use DBD::SQLite;
use constant SQLITE_3010000 => $DBD::SQLite::sqlite_version_number >= 3010000 ? 1 : 0;
use constant SQLITE_3021000 => $DBD::SQLite::sqlite_version_number >= 3021000 ? 1 : 0;
# private data for translating comparison operators from Sqlite to Perl
my $TXT = 0;
my $NUM = 1;
my %SQLOP2PERLOP = (
# TXT NUM
'=' => [ 'eq', '==' ],
'<' => [ 'lt', '<' ],
'<=' => [ 'le', '<=' ],
'>' => [ 'gt', '>' ],
'>=' => [ 'ge', '>=' ],
'MATCH' => [ '=~', '=~' ],
(SQLITE_3010000 ? (
'LIKE' => [ 'DBD::SQLite::strlike', 'DBD::SQLite::strlike' ],
'GLOB' => [ 'DBD::SQLite::strglob', 'DBD::SQLite::strglob' ],
'REGEXP'=> [ '=~', '=~' ],
) : ()),
(SQLITE_3021000 ? (
'NE' => [ 'ne', '!=' ],
'ISNOT' => [ 'defined', 'defined' ],
'ISNOTNULL' => [ 'defined', 'defined' ],
'ISNULL' => [ '!defined', '!defined' ],
'IS' => [ '!defined', '!defined' ],
) : ()),
);
#----------------------------------------------------------------------
# instantiation methods
#----------------------------------------------------------------------
sub NEW {
my $class = shift;
my $self = $class->_PREPARE_SELF(@_);
# verifications
my $n_cols = @{$self->{columns}};
$n_cols > 0
or die "$class: no declared columns";
!$self->{options}{colref} || $n_cols == 1
or die "$class: must have exactly 1 column when using 'colref'";
my $symbolic_ref = $self->{options}{arrayrefs}
|| $self->{options}{hashrefs}
|| $self->{options}{colref}
or die "$class: missing option 'arrayrefs' or 'hashrefs' or 'colref'";
# bind to the Perl variable
no strict "refs";
defined ${$symbolic_ref}
or die "$class: can't find global variable \$$symbolic_ref";
$self->{rows} = \ ${$symbolic_ref};
bless $self, $class;
}
sub _build_headers_optypes {
my $self = shift;
my $cols = $self->sqlite_table_info;
# headers : names of columns, without type information
$self->{headers} = [ map {$_->{name}} @$cols ];
# optypes : either $NUM or $TXT for each column
# (applying the type affinity algorithm from datatype3.html in sqlite doc)
$self->{optypes}
= [ map {$_->{type} =~ /INT|REAL|FLOA|DOUB/i ? $NUM : $TXT} @$cols ];
}
#----------------------------------------------------------------------
# method for initiating a search
#----------------------------------------------------------------------
sub BEST_INDEX {
my ($self, $constraints, $order_by) = @_;
$self->_build_headers_optypes if !$self->{headers};
# for each constraint, build a Perl code fragment. Those will be gathered
# in FILTER() for deciding which rows match the constraints.
my @conditions;
my $ix = 0;
foreach my $constraint (grep {$_->{usable} and exists $SQLOP2PERLOP{ $_->{op} } } @$constraints) {
my $col = $constraint->{col};
my ($member, $optype);
# build a Perl code fragment. Those fragments will be gathered
# and eval-ed in FILTER(), for deciding which rows match the constraints.
if ($col == -1) {
# constraint on rowid
$member = '$i';
$optype = $NUM;
}
else {
# constraint on regular column
my $opts = $self->{options};
$member = $opts->{arrayrefs} ? "\$row->[$col]"
: $opts->{hashrefs} ? "\$row->{$self->{headers}[$col]}"
: $opts->{colref} ? "\$row"
: die "corrupted data in ->{options}";
$optype = $self->{optypes}[$col];
}
my $op = $SQLOP2PERLOP{$constraint->{op}}[$optype];
if (SQLITE_3021000 && $op =~ /defined/) {
if ($constraint->{op} =~ /NULL/) {
push @conditions,
"($op($member))";
} else {
push @conditions,
"($op($member) && !defined(\$vals[$ix]))";
}
} elsif (SQLITE_3010000 && $op =~ /str/) {
push @conditions,
"(defined($member) && defined(\$vals[$ix]) && !$op(\$vals[$ix], $member))";
} else {
push @conditions,
"(defined($member) && defined(\$vals[$ix]) && $member $op \$vals[$ix])";
}
# Note : $vals[$ix] refers to an array of values passed to the
# FILTER method (see below); so the eval-ed perl code will be a
# closure on those values
# info passed back to the SQLite core -- see vtab.html in sqlite doc
$constraint->{argvIndex} = $ix++;
$constraint->{omit} = 1;
}
# further info for the SQLite core
my $outputs = {
idxNum => 1,
idxStr => (join(" && ", @conditions) || "1"),
orderByConsumed => 0,
estimatedCost => 1.0,
estimatedRows => undef,
};
return $outputs;
}
#----------------------------------------------------------------------
# methods for data update
#----------------------------------------------------------------------
sub _build_new_row {
my ($self, $values) = @_;
my $opts = $self->{options};
return $opts->{arrayrefs} ? $values
: $opts->{hashrefs} ? { map {$self->{headers}->[$_], $values->[$_]}
(0 .. @{$self->{headers}} - 1) }
: $opts->{colref} ? $values->[0]
: die "corrupted data in ->{options}";
}
sub INSERT {
my ($self, $new_rowid, @values) = @_;
my $new_row = $self->_build_new_row(\@values);
if (defined $new_rowid) {
not ${$self->{rows}}->[$new_rowid]
or die "can't INSERT : rowid $new_rowid already in use";
${$self->{rows}}->[$new_rowid] = $new_row;
}
else {
push @${$self->{rows}}, $new_row;
return $#${$self->{rows}};
}
}
sub DELETE {
my ($self, $old_rowid) = @_;
delete ${$self->{rows}}->[$old_rowid];
}
sub UPDATE {
my ($self, $old_rowid, $new_rowid, @values) = @_;
my $new_row = $self->_build_new_row(\@values);
if ($new_rowid == $old_rowid) {
${$self->{rows}}->[$old_rowid] = $new_row;
}
else {
delete ${$self->{rows}}->[$old_rowid];
${$self->{rows}}->[$new_rowid] = $new_row;
}
}
#======================================================================
package DBD::SQLite::VirtualTable::PerlData::Cursor;
#======================================================================
use strict;
use warnings;
use base "DBD::SQLite::VirtualTable::Cursor";
sub row {
my ($self, $i) = @_;
return ${$self->{vtable}{rows}}->[$i];
}
sub FILTER {
my ($self, $idxNum, $idxStr, @vals) = @_;
# build a method coderef to fetch matching rows
my $perl_code = 'sub {my ($self, $i) = @_; my $row = $self->row($i); '
. $idxStr
. '}';
# print STDERR "PERL CODE:\n", $perl_code, "\n";
$self->{is_wanted_row} = do { no warnings; eval $perl_code }
or die "couldn't eval q{$perl_code} : $@";
# position the cursor to the first matching row (or to eof)
$self->{row_ix} = -1;
$self->NEXT;
}
sub EOF {
my ($self) = @_;
return $self->{row_ix} > $#${$self->{vtable}{rows}};
}
sub NEXT {
my ($self) = @_;
do {
$self->{row_ix} += 1
} until $self->EOF
|| eval {$self->{is_wanted_row}->($self, $self->{row_ix})};
# NOTE: the eval above is required for cases when user data, injected
# into Perl comparison operators, generates errors; for example
# WHERE col MATCH '(foo' will die because the regex is not well formed
# (no matching parenthesis). In such cases no row is selected and the
# query just returns an empty list.
}
sub COLUMN {
my ($self, $idxCol) = @_;
my $row = $self->row($self->{row_ix});
my $opts = $self->{vtable}{options};
return $opts->{arrayrefs} ? $row->[$idxCol]
: $opts->{hashrefs} ? $row->{$self->{vtable}{headers}[$idxCol]}
: $opts->{colref} ? $row
: die "corrupted data in ->{options}";
}
sub ROWID {
my ($self) = @_;
return $self->{row_ix} + 1; # rowids start at 1 in SQLite
}
1;
__END__
=head1 NAME
DBD::SQLite::VirtualTable::PerlData -- virtual table hooked to Perl data
=head1 SYNOPSIS
Within Perl :
$dbh->sqlite_create_module(perl => "DBD::SQLite::VirtualTable::PerlData");
Then, within SQL :
CREATE VIRTUAL TABLE atbl USING perl(foo, bar, etc,
arrayrefs="some::global::var::aref");
CREATE VIRTUAL TABLE htbl USING perl(foo, bar, etc,
hashrefs="some::global::var::href");
CREATE VIRTUAL TABLE ctbl USING perl(single_col,
colref="some::global::var::ref");
SELECT foo, bar FROM atbl WHERE ...;
=head1 DESCRIPTION
A C<PerlData> virtual table is a database view on some data structure
within a Perl program. The data can be read or modified both from SQL
and from Perl. This is useful for simple import/export
operations, for debugging purposes, for joining data from different
sources, etc.
=head1 PARAMETERS
Parameters for creating a C<PerlData> virtual table are specified
within the C<CREATE VIRTUAL TABLE> statement, mixed with regular
column declarations, but with an '=' sign.
The only supported (and mandatory) parameter is the one that
specifies the Perl data structure to which the virtual table is bound.
It must be given as the fully qualified name of a global variable;
the parameter can be one of three different kinds :
=over
=item C<arrayrefs>
arrayref that contains an arrayref for each row.
Each such row will have a size equivalent to the number
of columns declared for the virtual table.
=item C<hashrefs>
arrayref that contains a hashref for each row.
Keys in each hashref should correspond to the
columns declared for the virtual table.
=item C<colref>
arrayref that contains a single scalar for each row;
obviously, this is a single-column virtual table.
=back
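To make the three shapes concrete, here is a minimal sketch holding the same rows in each form. The variable names and the C<column_value> helper are hypothetical; the helper only illustrates how a column index maps to a value in each shape.

```perl
use strict;
use warnings;

# The same two-column data (id, name) in the shapes described above.
our $as_arrayrefs = [ [1, 'foo'], [2, 'bar'] ];
our $as_hashrefs  = [ {id => 1, name => 'foo'}, {id => 2, name => 'bar'} ];
our $as_colref    = [ 'foo', 'bar' ];            # one scalar per row

my @headers = qw(id name);                       # declared column names

# Hypothetical helper: pick the value of column $idx_col from one row.
sub column_value {
  my ($kind, $row, $idx_col) = @_;
  return $kind eq 'arrayrefs' ? $row->[$idx_col]
       : $kind eq 'hashrefs'  ? $row->{ $headers[$idx_col] }
       : $kind eq 'colref'    ? $row
       : die "unknown kind '$kind'";
}

# Fetch the "name" of the second row from each structure.
my @names = (
  column_value(arrayrefs => $as_arrayrefs->[1], 1),
  column_value(hashrefs  => $as_hashrefs->[1],  1),
  column_value(colref    => $as_colref->[1],    0),
);
print "@names\n";
```

With C<hashrefs>, values are found by column name, so the order of keys in each hash does not matter; with C<arrayrefs>, position in the row is what counts.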
=head1 USAGE
=head2 Common part of all examples : declaring the module
In all examples below, the common part is that the Perl
program should connect to the database and then declare the
C<PerlData> virtual table module, like this :
# connect to the database
my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile", '', '',
{RaiseError => 1, AutoCommit => 1});
# or any other options suitable to your needs
# register the module
$dbh->sqlite_create_module(perl => "DBD::SQLite::VirtualTable::PerlData");
Then create a global arrayref variable, using C<our> instead of C<my>,
so that the variable is stored in the symbol table of the enclosing package.
package Foo::Bar; # could as well be just "main"
our $rows = [ ... ];
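The reason for C<our> can be seen in a minimal sketch: a variable looked up by its fully qualified name is only found if it lives in the package symbol table. The C<lookup> helper and the C<$hidden> variable are hypothetical and only illustrate name-based resolution.

```perl
package Foo::Bar;
use strict;
use warnings;

our $rows   = [ [1, 'one'], [2, 'two'] ];   # package variable
my  $hidden = [ [3, 'three'] ];             # lexical, not in the symbol table

# Hypothetical helper: resolve a fully qualified scalar name.
sub lookup {
  my ($name) = @_;
  no strict 'refs';          # symbolic references are needed here
  return ${$name};
}

my $found   = lookup('Foo::Bar::rows');     # the arrayref declared with "our"
my $missing = lookup('Foo::Bar::hidden');   # undef : "my" variables are invisible

printf "rows: %s, hidden: %s\n",
  (defined $found   ? scalar(@$found)   . " row(s)" : "not found"),
  (defined $missing ? scalar(@$missing) . " row(s)" : "not found");
```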
Finally, create the virtual table and bind it to the global
variable (here we assume that C<@$rows> contains arrayrefs) :
$dbh->do('CREATE VIRTUAL TABLE temp.vtab'
.' USING perl(col1 INT, col2 TEXT, etc,
arrayrefs="Foo::Bar::rows")');
In most cases, the virtual table will be for temporary use, which is
why this example prepends C<temp.> to the table
name : this tells SQLite to clean up that table when the database
handle is disconnected, without the need to emit an explicit DROP
statement.
Column names (and optionally their types) are specified in the
virtual table declaration, just like for any regular table.
=head2 Arrayref example : statistics from files
Let's suppose we want to perform some searches over a collection of
files, where search constraints may be based on some of the fields
returned by L<perlfunc/stat>, such as the size of the file or its last
modification time. Here is a way to do it with a virtual table :
my @files = ... ; # list of files to inspect
# apply the stat function to each file
our $file_stats = [ map { [ $_, stat $_ ] } @files];
# create a temporary virtual table
$dbh->do(<<"");
CREATE VIRTUAL TABLE temp.file_stats
USING perl(path, dev, ino, mode, nlink, uid, gid, rdev, size,
atime, mtime, ctime, blksize, blocks,
arrayrefs="main::file_stats");
# search files
my $sth = $dbh->prepare(<<"");
SELECT * FROM file_stats
WHERE mtime BETWEEN ? AND ?
AND uid IN (...)
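The column list above relies on each row holding the path followed by the 13 values returned by C<stat>. Here is a minimal sketch that checks this layout on a scratch file; the C<@columns> and C<%by_name> names are illustrative only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Column names in the order used by the CREATE VIRTUAL TABLE statement:
# the path first, then the 13 fields returned by stat().
my @columns = qw(path dev ino mode nlink uid gid rdev size
                 atime mtime ctime blksize blocks);

# Create a small scratch file so that the sketch is self-contained.
my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "hello";
close $fh;

# Same construction as in the example above, for a single file.
our $file_stats = [ map { [ $_, stat $_ ] } ($file) ];

my $row = $file_stats->[0];
my %by_name;
@by_name{@columns} = @$row;      # pair each value with its column name

printf "%d values per row, size=%d bytes\n", scalar(@$row), $by_name{size};
```

If a path cannot be stat-ed, C<stat> returns an empty list and the row would contain only the path, so it may be worth filtering the file list beforehand.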
=head2 Hashref example : unicode characters
Given any unicode character, the L<Unicode::UCD/charinfo> function
returns a hashref with various bits of information about that character.
So this can be exploited in a virtual table :
use Unicode::UCD 'charinfo';
our $chars = [map {charinfo($_)} 0x300..0x400]; # arbitrary subrange
# create a temporary virtual table
$dbh->do(<<"");
CREATE VIRTUAL TABLE charinfo USING perl(
code, name, block, script, category,
hashrefs="main::chars"
)
# search characters
my $sth = $dbh->prepare(<<"");
SELECT * FROM charinfo
WHERE script='Greek'
AND name LIKE '%SIGMA%'
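For reference, here is a minimal sketch of what the rows look like and of a Perl-side equivalent of the C<WHERE> clause. It uses a smaller, arbitrarily chosen range and skips code points for which C<charinfo> returns nothing, which is an assumption made here to keep every row a hashref.

```perl
use strict;
use warnings;
use Unicode::UCD 'charinfo';

# Same construction as above on a smaller range; undefined results
# (unassigned code points) are skipped so that every row is a hashref.
our $chars = [ grep { defined } map { charinfo($_) } 0x3A0 .. 0x3CF ];

# Perl-side equivalent of: WHERE script='Greek' AND name LIKE '%SIGMA%'
my @matches = grep {
     ($_->{script} // '') eq 'Greek'
  && ($_->{name}   // '') =~ /SIGMA/i
} @$chars;

# Each row is a hashref whose keys include the declared column names.
printf "U+%s %s\n", $_->{code}, $_->{name} for @matches;
```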
=head2 Colref example: SELECT WHERE ... IN ...
I<Note: The idea for the following example is borrowed from the
C<test_intarray.h> file in SQLite's source
(L<http://www.sqlite.org/src>).>
A C<colref> virtual table is designed to facilitate using an
array of values as the right-hand side of an IN operator. The
usual syntax for IN is to prepare a statement like this:
SELECT * FROM table WHERE x IN (?,?,?,...,?);
and then bind individual values to each of the ? slots; but this has
the disadvantage that the number of values must be known in
advance. Instead, we can store values in a Perl array, bind that array
to a virtual table, and then write a statement like this
SELECT * FROM table WHERE x IN perl_array;
Here is what such a program would look like :
# connect to the database
my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile", '', '',
{RaiseError => 1, AutoCommit => 1});
# Declare a global arrayref containing the values. Here we assume
# they are taken from @ARGV, but any other datasource would do.
# Note the use of "our" instead of "my".
our $values = \@ARGV;
# register the module and declare the virtual table
$dbh->sqlite_create_module(perl => "DBD::SQLite::VirtualTable::PerlData");
$dbh->do('CREATE VIRTUAL TABLE temp.intarray'
.' USING perl(i INT, colref="main::values")');
# now we can SELECT from another table, using the intarray as a constraint
my $sql = "SELECT * FROM some_table WHERE some_col IN intarray";
my $result = $dbh->selectall_arrayref($sql);
Beware that the virtual table is read-write, so the statement below
would push 99 into @ARGV !
INSERT INTO intarray VALUES (99);
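For comparison, the conventional C<IN (?,?,...)> form mentioned at the start of this section can also be generated at run time, at the cost of preparing a different statement for each list length. A minimal sketch, where C<@values> is a stand-in for values known only at run time and the table and column names are those of the example above:

```perl
use strict;
use warnings;

my @values = (3, 5, 8);      # stand-in for values known only at run time

# One "?" per value, joined by commas.
my $placeholders = join ",", ("?") x @values;
my $sql = "SELECT * FROM some_table WHERE some_col IN ($placeholders)";

print "$sql\n";
# The statement would then be run with the values as bind parameters,
# e.g. $dbh->selectall_arrayref($sql, {}, @values);
```

This variant needs no virtual table, but the SQL text changes with the number of values, and SQLite limits how many parameters a single statement may bind.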
=head1 AUTHOR
Laurent Dami E<lt>dami@cpan.orgE<gt>
=head1 COPYRIGHT AND LICENSE
Copyright Laurent Dami, 2014.
This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.
=cut