Home > mailing lists

Re: [HACKERS] WIP: BRIN bloom indexes - Mailing list pgsql-hackers

From	Sokolov Yura
Subject	Re: [HACKERS] WIP: BRIN bloom indexes
Date	October 27, 2017 22:27:43
Msg-id	[email protected] Whole thread Raw
In response to	Re: [HACKERS] WIP: BRIN bloom indexes (Nico Williams <[email protected]>)
List	pgsql-hackers

Tree view

On 2017-10-27 20:17, Nico Williams wrote:
> On Thu, Oct 19, 2017 at 10:15:32PM +0200, Tomas Vondra wrote:
> 
> A bloom filter index would, indeed, be wonderful.
> 
> Comments:
> 
> + * We use an optimisation that initially we store the uint32 values 
> directly,
> + * without the extra hashing step. And only later filling the bitmap 
> space,
> + * we switch to the regular bloom filter mode.
> 
> I don't think that optimization is worthwhile.  If I'm going to be 
> using
> a Bloom filter, it's because I expect to have a lot of rows.
> 
> (Granted, if I CREATE TABLE .. LIKE .. INCLUDING INDEXES, and the new
> table just won't have lots of rows, then I might want this 
> optimization,
> but I can always just drop the Bloom filter index, or not include
> indexes in the LIKE.)
> 
> + * XXX Perhaps we could save a few bytes by using different data 
> types, but
> + * considering the size of the bitmap, the difference is negligible.
> 
> A bytea is all that's needed.  See below.
> 
> + * XXX We could also implement "sparse" bloom filters, keeping only 
> the
> + * bytes that are not entirely 0. That might make the "sorted" phase
> + * mostly unnecessary.
> 
> Filter compression is not worthwhile.  You want to have a fairly 
> uniform
> hash distribution, and you want to end up with a Bloom filter that's 
> not
> much more than 50% full.  That means that when near 50% full the filter
> will not be very sparse at all.  Optimizing for the not common case
> doesn't seem worthwhile, and adds complexity.
> 
> + * XXX We can also watch the number of bits set in the bloom filter, 
> and
> + * then stop using it (and not store the bitmap, to save space) when 
> the
> + * false positive rate gets too high.
> 
> Ah, right, what you want is a "Scalable Bloom Filter".
> 
> A Scalable Bloom filter is actually a series of Bloom filters where all
> but the newest filter are closed to additions, and the newest filter is
> where you do all the additions.  You generally want to make each new
> filter bigger than the preceding one because when searching a Scalable
> Bloom filter you have to search *all* of them, so you want to minimize
> the number of filters.
> 
> Eventually, when you have enough sub-filters, you may want to re-write
> the whole thing so that you start with a single sub-filter that is 
> large
> enough.
> 
> The format of the bytea might then be something like:
> 
> <size><bitmap>[[<size><bitmap>[...]]
> 
> where the last bitmap is the filter that is open to additions.
> 
> I wonder if there are write concurrency performance considerations
> here...
> 
> It might be better to have a bytea value per-sub-filter so that there 
> is
> no lock contention for the closed filters.  The closed sub-filters are
> constant, thus not even shared locks are needed for them, and 
> especially
> not exclusive locks.
> 
> Writing to the filter will necessitate locking the entire open filter,
> or else byte-range locking on it.  Something to think about.
> 
>> Now, what about query performance? Unlike the "minmax" indexes, the
>> "bloom" filter can only handle equality conditions.
> 
> A Bloom filter has non-zero false positives for existence, but zero
> false positives for non-existence.
> 
> This means that a Bloom filter index can only be used for:
> 
> a) non-existence tests (with equality predicates, as you point out),
> 
> b) as an optimization to avoid slower index checks (or heap scans) when
>    the Bloom filter indicates non-existence.
> 
> (b) is really just an internal application of (a).
> 
> There might be applications where false positives might be ok in a 
> query
> like:
> 
>   SELECT a.* FROM a a JOIN b b USING (some_col);
> 
> but for most real-world queries like that, allowing false positives by
> default would be very bad.  An option in the index declaration could be
> used to enable existence equality predicates, but I would not include
> such an option initially -- let's see if anyone actually has a use case
> for it first.
> 
> Of course, for something like:
> 
>   SELECT a.*, b.* FROM a a JOIN b b USING (some_col);
> 
> a Bloom filter can only be used as an optimization to avoid using a
> slower index (or heap scan) on the inner table source.
> 
> What I'm getting at is that the query planner really needs to 
> understand
> that a Bloom filter is a probabilistic data structure.
> 
> Nico
> --

PostgreSQL has a lot of probabilistic indexes.

Some GIST opclasses returns false positives and tells to executor to
recheck their results.
Bitmap-index-scan, when there are a lot of result tuples, falls back to
storing only page numbers, without actual tid's, and executer then scans
heap-pages to find necessary tuples.

BRIN index at all just "recommends executor to scan that heap page". 
Thus
BRIN index is at whole just an optimisation (regardless is it `minmax` 
or
`bloom`).
So that is ok.

-- 
Sokolov Yura
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

From: Tom Lane
Date: 27 October 2017, 21:54:32
Subject: Re: [HACKERS] Index only scan for cube and seg

From: Sokolov Yura
Date: 27 October 2017, 22:53:08
Subject: Re: [HACKERS] logical decoding of two-phase transactions

Re: [HACKERS] WIP: BRIN bloom indexes - Mailing list pgsql-hackers

Previous

Next