# NAME

Search::Fulltext - Fulltext search module

# SYNOPSIS

    use Search::Fulltext;
    use Test::More;    # provides is_deeply() and done_testing()

    my @docs = (
        'I like beer the best',
        'Wine makes people satisfied',  # does not include beer
        'Beer makes people happy',
    );

    my $fts = Search::Fulltext->new({
        docs => \@docs,
    });
    my $results = $fts->search('beer');
    is_deeply($results, [0, 2]);         # 1st & 3rd doc include 'beer'
    $results = $fts->search('beer AND happy');
    is_deeply($results, [2]);            # 3rd doc includes both 'beer' & 'happy'
    done_testing;

# DESCRIPTION

[Search::Fulltext](http://search.cpan.org/perldoc?Search::Fulltext) is a fulltext search module. Building an index and querying it takes only a few steps.

[Search::Fulltext](http://search.cpan.org/perldoc?Search::Fulltext) has a __pluggable tokenizer__ feature, which makes fulltext search possible for potentially any language.
Currently, __English__ and __Japanese__ fulltext search are officially supported,
although other languages that separate words with spaces can also be used.
See the [CUSTOM TOKENIZERS](#CUSTOM\_TOKENIZERS) section to learn how to search non-English languages.

__SQLite__'s __FTS4__ is used as the indexer.
The queries supported by FTS4 (`AND`, `OR`, `NEAR`, ...) are all available.
See the ["QUERIES"](#QUERIES) section for details.

# METHODS

## Search::Fulltext->new

Creates a fulltext index for the given documents.

- `@param docs` __\[required\]__

    Reference to an array whose elements are the documents to be searched.

- `@param index_file` __\[optional\]__

    File path to write the fulltext index to. By default, an in-memory index is used.

- `@param tokenizer` __\[optional\]__

    Name of the tokenizer to use. `simple` (default) and `porter` are always supported.
    `icu` and `unicode61` can be used if the SQLite library used via the [DBD::SQLite](http://search.cpan.org/perldoc?DBD::SQLite) module supports them.
    See [http://www.sqlite.org/fts3.html\#tokenizer](http://www.sqlite.org/fts3.html\#tokenizer) for more details on FTS4 tokenizers.

    The Japanese tokenizer `perl 'Search::Fulltext::Tokenizer::MeCab::tokenizer'` is also available once the
    [Search::Fulltext::Tokenizer::MeCab](http://search.cpan.org/perldoc?Search::Fulltext::Tokenizer::MeCab) module is installed.

    See [CUSTOM TOKENIZERS](#CUSTOM\_TOKENIZERS) section for developing other tokenizers.
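
The parameters above can be combined in a single call. The following is a minimal sketch, assuming a writable temporary directory; the file name `fulltext.idx` is hypothetical, and the documents are taken from the SYNOPSIS.

```perl
use strict;
use warnings;
use Search::Fulltext;
use File::Temp qw(tempdir);

my @docs = (
    'I like beer the best',
    'Wine makes people satisfied',
    'Beer makes people happy',
);

# Hypothetical location for the on-disk index; any writable path should do.
my $dir        = tempdir(CLEANUP => 1);
my $index_file = "$dir/fulltext.idx";

my $fts = Search::Fulltext->new({
    docs       => \@docs,
    index_file => $index_file,   # omit to keep the index in memory
    tokenizer  => 'porter',      # omit to use the default 'simple' tokenizer
});

# With Porter stemming, 'make' is expected to match 'makes' as well.
my $results = $fts->search('make');
print "matched doc indexes: @$results\n";
```

Only `docs` is required; the other two keys can be dropped independently of each other.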

## Search::Fulltext->search

Searches the documents for terms using the query language.

- `@returns`

    Reference to an array of the 0-based indexes of the `docs` passed to `Search::Fulltext->new` that match `query`.

- `@param query`

    Query to run against the documents.
    See ["QUERIES"](#QUERIES) section for types of queries.
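
Because `search` returns indexes rather than the documents themselves, a caller usually maps them back onto the original array. A short sketch of that step, reusing the SYNOPSIS documents (the variable names `$hits` and `@matched` are illustrative only):

```perl
use strict;
use warnings;
use Search::Fulltext;

my @docs = (
    'I like beer the best',
    'Wine makes people satisfied',
    'Beer makes people happy',
);
my $fts = Search::Fulltext->new({ docs => \@docs });

# search() returns an array reference of 0-based indexes into @docs.
my $hits = $fts->search('beer');

# An array slice turns the indexes back into the matching documents.
my @matched = @docs[@$hits];
print "$_\n" for @matched;
```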

# QUERIES

The simplest query would be a term.

    my $results = $fts->search('beer');

The other query types below, and combinations of them, can also be used.

    my $results = $fts->search('beer AND happy');
    my $results = $fts->search('satisfied OR happy');
    my $results = $fts->search('people NOT beer');
    my $results = $fts->search('make*');
    my $results = $fts->search('"makes people"');
    my $results = $fts->search('beer NEAR happy');
    my $results = $fts->search('beer NEAR/1 happy');

See [http://www.sqlite.org/fts3.html\#section\_3](http://www.sqlite.org/fts3.html\#section\_3) for an explanation of each type of query.
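
To compare how the query forms behave, it can help to run several of them against the same index and print what each one matches. This is only a sketch using the default tokenizer and the SYNOPSIS documents; the list of queries is an arbitrary selection.

```perl
use strict;
use warnings;
use Search::Fulltext;

my @docs = (
    'I like beer the best',
    'Wine makes people satisfied',
    'Beer makes people happy',
);
my $fts = Search::Fulltext->new({ docs => \@docs });

my @queries = (
    'beer AND happy',       # both terms
    'people NOT beer',      # first term without the second
    'make*',                # prefix match
    '"makes people"',       # exact phrase
    'beer NEAR/1 happy',    # terms within a given distance
);

# Print each query next to the indexes of the documents it matched.
for my $query (@queries) {
    my $hits = $fts->search($query);
    printf "%-20s => [%s]\n", $query, join(', ', @$hits);
}
```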

__NOTE:__ Some custom tokenizers might not support all of the queries above.
Check the documentation of each tokenizer before using complex queries.

# CUSTOM TOKENIZERS

Custom tokenizers can be implemented in pure Perl thanks to ["Perl\_tokenizers" in DBD::SQLite](http://search.cpan.org/perldoc?DBD::SQLite#Perl\_tokenizers).
[Search::Fulltext::Tokenizer::MeCab](http://search.cpan.org/perldoc?Search::Fulltext::Tokenizer::MeCab) is an example of a custom tokenizer.

See ["Perl\_tokenizers" in DBD::SQLite](http://search.cpan.org/perldoc?DBD::SQLite#Perl\_tokenizers) and [Search::Fulltext::Tokenizer::MeCab](http://search.cpan.org/perldoc?Search::Fulltext::Tokenizer::MeCab) module to learn how to develop custom tokenizers.
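
As a rough illustration of the shape such a tokenizer takes, the sketch below follows the closure pattern described in the DBD::SQLite documentation: an outer sub returns a function that receives the string, which in turn returns an iterator yielding `(term, length, start, end, index)` for each token. The package name `My::Tokenizer` is hypothetical, and the word-splitting rule (lowercased `\w+` runs) is only an assumption chosen for the example.

```perl
use strict;
use warnings;

package My::Tokenizer;    # hypothetical package name

# Outer sub: returns a function that builds a token iterator for one string.
sub tokenizer {
    return sub {
        my $string     = shift;
        my $term_index = 0;

        # Iterator: each call returns the next token, or an empty list at the end.
        return sub {
            $string =~ /\w+/g or return;
            my ($start, $end) = ($-[0], $+[0]);
            my $len  = $end - $start;
            my $term = lc substr($string, $start, $len);
            return ($term, $len, $start, $end, $term_index++);
        };
    };
}

package main;

# Exercise the tokenizer directly, without SQLite.
my $next = My::Tokenizer::tokenizer()->('Beer makes people happy');
while (my ($term, $len, $start, $end, $index) = $next->()) {
    print "$index: $term [$start, $end)\n";
}
```

By analogy with the MeCab tokenizer name shown under `@param tokenizer`, such a function would presumably be selected with `tokenizer => "perl 'My::Tokenizer::tokenizer'"`.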

# SUPPORTS

Bug reports and pull requests are welcome at [https://github.com/laysakura/Search-Fulltext](https://github.com/laysakura/Search-Fulltext)!

# VERSION

Version 1.03

# AUTHOR

Sho Nakatani <lay.sakura@gmail.com>, a.k.a. @laysakura