Re: [HACKERS] [PATCH]: fix bug in SP-GiST box_ops - Mailing list pgsql-hackers
From | Nikita Glukhov |
---|---|
Subject | Re: [HACKERS] [PATCH]: fix bug in SP-GiST box_ops |
Date | |
Msg-id | [email protected] Whole thread Raw |
In response to | Re: [HACKERS] [PATCH]: fix bug in SP-GiST box_ops (Kyotaro HORIGUCHI <[email protected]>) |
Responses |
Re: [HACKERS] [PATCH]: fix bug in SP-GiST box_ops
|
List | pgsql-hackers |
On 30.01.2017 12:04, Kyotaro HORIGUCHI wrote: > Hello, > > At Mon, 30 Jan 2017 07:12:03 +0300, Nikita Glukhov <[email protected]> wrote > in <[email protected]>: >> Working on the tests for new SP-GiST opclasses for polygons and >> circles, I've found a bug in the SP-GiST box_ops (added in 9.6): some operators >> (&<, &>, $<|, |&>) have wrong tests in spg_box_quad_inner_consistent(). >> This obviously leads to incorrect results of a SP-GiST index scan (see tests >> in the attached patch, their results were taken from a sequential scan). > > Your problem is not necessarily evident for others. Perhaps you > have to provide an explanation and/or a test case that describes > what is wrong for you so that others can get a clue for this > problem. Simpler test is better. The problem is that there are mixed low/high values for x/y axes. For example, overLeft4D() compares x-RangeBox rect_box->range_box_x with y-Range &query->right, and also lower2D() here must use query->high instead of query->low. The corresponding test is very simple: insert 10000 nearly arbitrary boxes and see the difference between results of sequential scan and results of index scan. regression=# CREATE TEMPORARY TABLE quad_box_tbl (b box); regression=# INSERT INTO quad_box_tbl SELECT box(point(x * 10, y * 10), point(x * 10 + 5, y * 10 + 5)) FROM generate_series(1, 100) x, generate_series(1, 100) y; regression=# CREATE INDEX quad_box_tbl_idx ON quad_box_tbl USING spgist(b); regression=# SET enable_seqscan = ON; regression=# SET enable_indexscan = OFF; regression=# SET enable_bitmapscan = OFF; regression=# SELECT count(*) FROM quad_box_tbl WHERE b &< box '((100,200),(300,500))'; count ------- 2900 (1 row) regression=# SELECT count(*) FROM quad_box_tbl WHERE b &> box '((100,200),(300,500))'; count ------- 9100 (1 row) regression=# SELECT count(*) FROM quad_box_tbl WHERE b &<| box '((100,200),(300,500))'; count ------- 4900 (1 row) regression=# SELECT count(*) FROM quad_box_tbl WHERE b |&> box '((100,200),(300,500))'; count ------- 8100 (1 row) regression=# SET enable_seqscan = OFF; regression=# SET enable_indexscan = ON; regression=# SET enable_bitmapscan = ON; regression=# SELECT count(*) FROM quad_box_tbl WHERE b &< box '((100,200),(300,500))'; count ------- 2430 (1 row) regression=# SELECT count(*) FROM quad_box_tbl WHERE b &> box '((100,200),(300,500))'; count ------- 6208 (1 row) regression=# SELECT count(*) FROM quad_box_tbl WHERE b &<| box '((100,200),(300,500))'; count ------- 1048 (1 row) regression=# SELECT count(*) FROM quad_box_tbl WHERE b |&> box '((100,200),(300,500))'; count ------- 5084 (1 row) > The test, > > | +INSERT INTO quad_box_tbl > | + SELECT i, box(point(x, y), point(x + w, y + h)) > | + FROM (SELECT i, > | + random() * 1000 as x, random() * 1000 as y, > | + random() * 20 as w, random() * 20 as h > > is inserting quad_boxes generated using random numbers then, > > | +SELECT count(*) FROM quad_box_tbl WHERE b << box '((100,200),(300,500))'; > | + count > | +------- > | + 891 > > counting them in this way is unstable. Even though this were > stable due to a fixed initial, this would be unacceptable, I > think. This kind of test should use nonrandom numbers. I have replaced random data in this test with stable uniformly distributed data. I would also like to use existing box data from rect.data that is loaded to slow_emp4000 table in copy.sql test, but box.sql test is executed before copy.sql. > Even though I don't understand this in depth, the following > change seems somewhat wrong in direction. Changing the second > argument type seems breaking the basis of the design. > > | -lower2D(RangeBox *range_box, Range *query) > | +lower2D(RangeBox *range_box, double query) Maybe. I also thought that these changes are quite doubtful, so I decided to replace lowerEqual2D()/higherEqual2D() with overlower2D()/overhigher2D() and not to touch lower2D()/higher2D(). -- Nikita Glukhov Postgres Professional:http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
pgsql-hackers by date: