SlideShare a Scribd company logo
JSONB in PostgreSQL
Working with JSON in PostgreSQL vs. MongoDB
Dharshan Rangegowda
Founder, ScaleGrid.io | @dharshanrg
What is JSON?
● JSON stands for Javascript object
notation.
● Open standard format RFC 7159.
● Most popular format to store and
exchange documents.
Working with JSON in PostgreSQL vs. MongoDB
Why does PostgreSQL need to care about JSON?
• Schema flexibility
• Dealing with transient or changing columns.
• Nested objects
• Might not need to deserialize to query.
• Handling objects from other systems
• E.g. Stripe transaction
Working with JSON in PostgreSQL vs. MongoDB
PostgreSQL + JSON Timeline
Working with JSON in PostgreSQL vs. MongoDB
PostgreSQL JSON Support
• Wave 1: PostgreSQL 9.2 (2012) added support for the JSON datatype
• Text field with JSON validation
• No index support
• Wave 2: PostgreSQL 9.4 (2014) added support for JSONB datatype
• Binary data structure to store JSON
• Index support
Working with JSON in PostgreSQL vs. MongoDB
PostgreSQL JSON Support
• Wave 3: PostgreSQL 12 (2019) added support for SQL/JSON standard
• JSONPath support
• Powerful query and projection engine for JSON data
• Further improvements to JSONPath in PostgreSQL 13
• JSON roadmap
Working with JSON in PostgreSQL vs. MongoDB
JSON vs. JSONB
• JSONB is what you should be using (in most cases)
• However, there are some scenarios where JSON is useful:
• JSON preserves the original formatting (a.k.a whitespace)
• JSON preserves ordering of the keys
• JSON preserves duplicate keys
• JSON is faster to ingest vs. JSONB
Working with JSON in PostgreSQL vs. MongoDB
JSONB Anti Patterns
● What is the best way to use JSONB?
○ Do we even need columns any more?
○ Why not just use <int id, jsonb data>?
● JSONB has some high-level limitations you need to
be aware of:
○ Statistics collection
○ Storage bloat
● Commonly occurring fields should be stored as
columns.
○ Use JSONB for variable or intermittent columns.
Working with JSON in PostgreSQL vs. MongoDB
JSONB Anti Patterns
● PostgreSQL collect stats on column data distribution
○ Most common values (MCV)
○ Fraction of null values
○ Histogram of distribution
● No column statistics collected for JSONB
○ Query planner doesn’t have stats to make smart decisions
○ Could make wrong choice – cross join vs hash join etc
● More details in blog post - When To Avoid JSONB
In A PostgreSQL Schema
Working with JSON in PostgreSQL vs. MongoDB
JSONB Anti Patterns
● Storage bloat
○ Keys are stored in the data (Similar to MongoDB mmapv1)
○ Use smaller key names to reduce footprint
○ Relies on TOAST compression
○ Sample table with 1M rows (11GB of data)
○ PostgreSQL - 8.7 GB
○ MongoDB Snappy – 8GB, Zlib – 5.3 GB
Working with JSON in PostgreSQL vs. MongoDB
JSONB & TOAST
● If the size of your column exceeds the
TOAST_TUPLE_THRESHOLD (2KB default) data
could be moved to out of line storage - TOAST
● TOAST also provides compression (pglz)
○ Decent Compression
○ MongoDB WiredTiger snappy/zlib is potentially better
● To access the data it needs to be De’TOASTed
○ Could result in performance overhead
Working with JSON in PostgreSQL vs. MongoDB
JSONB Data Structures
Working with JSON in PostgreSQL vs. MongoDB
Images courtesy: https://erthalion.info/2017/12/21/advanced-json-benchmarks/
BSON Data Structures
Working with JSON in PostgreSQL vs. MongoDB
JSONB Operators
Working with JSON in PostgreSQL vs. MongoDB
Operator Description
->, ->> Get JSON object field by key
@>, <@ Does the left JSONB value contain the right JSONB path/value entries at
the top level?
?, ?!, ?& Does the string exist as a top-level key within the JSON value?
@@, @@> JSONPath operators
Full list of operators can be found in the docs – JSONB op table
JSONB Functions
• PostgreSQL provides a wide variety of functions to create and process
JSON data
• Creation functions
• Processing functions
Working with JSON in PostgreSQL vs. MongoDB
MongoDB Query language
• Query language based on JSON syntax
• db.books.find( {} ) , db.books.find( { publisher: "D" } )
• Array operators
• db.books.find( { tags: ["red", "blank"] } )
• AND and OR operators
• db.books.find( { $or: [ { publisher: "A" }, { criticrating: { $lt: 30 } } ] } )
Working with JSON in PostgreSQL vs. MongoDB
MongoDB Query language
• Query nested documents
• db.books.find( { "size.uom": "in" } )
• Query an Array of objects
• db.books.find( { 'instock.qty': { $lte: 20 } } ))
• Project fields to return from query
• db.books.find( {prints: 1}, { $or: [ { publisher: "A" }, { criticrating: { $lt: 30 } } ] } )
Working with JSON in PostgreSQL vs. MongoDB
JSONB Indexes
• JSONB provides a wide array of options to index your JSON data.
• We are going to dig into three types of indexes:
• GIN
• BTREE
• HASH
Working with JSON in PostgreSQL vs. MongoDB
JSONB Indexes : GIN
• GIN stands for “Generalized
Inverted Indexes”
• GIN supports two operator classes
• jsonb_ops
• ?, ?|, ?&, @>, @@, @?
• [Index each key and value]
• jsonb_pathops
• @>, @@, @?
• [Index only the values]
Copyright © ScaleGrid.io
JSON sample data
Fictional book database (apologies to any librarians in the audience):
Working with JSON in PostgreSQL vs. MongoDB
demo=# d+ books
Table "public.books"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------+------------------------+-----------+----------+---------+----------+--------------+-------------
id | integer | | not null | | plain | |
author | character varying(255) | | | | extended | |
isbn | character varying(25) | | | | extended | |
rating | integer | | | | plain | |
data | jsonb | | | | extended | |
Indexes:
"books_pkey" PRIMARY KEY, btree (id)
…..
JSON sample data
Working with JSON in PostgreSQL vs. MongoDB
demo=# select jsonb_pretty(data) from books where id = 1000021;
jsonb_pretty
------------------------------------
{ +
"tags": { +
"nk906585": { +
"ik844766": "iv364087"+
} +
}, +
"prints": [ +
{ +
"price": 100, +
"style": "hc" +
}, +
{ +
"price": 50, +
"style": "pb" +
} +
], +
"braille": false, +
"keywords": [ +
"abc", +
"kef", +
"keh" +
], +
"hardcover": true, +
"publisher": "nVxJVA8Bwx", +
"criticrating": 2 +
}
JSONB Indexes: GIN - ?
Find all books that are available in braille? Let’s create the GIN index on the ‘data’ JSONB column:
Working with JSON in PostgreSQL vs. MongoDB
CREATE INDEX datagin ON books USING gin (data);
demo=# select * from books where data ? 'braille';
id | author | isbn | rating | data
---------+-----------------+------------+--------+---------------------------------------------------------------------------------------------------------------------
---------------------------------
------------------
1000005 | XEI7xShT8bPu6H7 | 2kD5XJDZUF | 0 | {"tags": {"nk455671": {"ik937456": "iv506075"}}, "braille": true, "keywords": ["abc", "kef", "keh"], "hardcover": false,
"publisher": "zSfZIAjGGs", "
criticrating": 4}
.....
demo=# explain analyze select * from books where data ? 'braille';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on books (cost=12.75..1005.25 rows=1000 width=158) (actual time=0.033..0.039 rows=15 loops=1)
Recheck Cond: (data ? 'braille'::text)
Heap Blocks: exact=2
-> Bitmap Index Scan on datagin (cost=0.00..12.50 rows=1000 width=0) (actual time=0.022..0.022 rows=15 loops=1)
Index Cond: (data ? 'braille'::text)
Planning Time: 0.102 ms
Execution Time: 0.067 ms
(7 rows)
JSONB Indexes: GIN - ?
What if we wanted to find books that were in braille or in hardcover?
Working with JSON in PostgreSQL vs. MongoDB
demo=# explain analyze select * from books where data ?| array['braille','hardcover'];
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on books (cost=16.75..1009.25 rows=1000 width=158) (actual time=0.029..0.035 rows=15 loops=1)
Recheck Cond: (data ?| '{braille,hardcover}'::text[])
Heap Blocks: exact=2
-> Bitmap Index Scan on datagin (cost=0.00..16.50 rows=1000 width=0) (actual time=0.023..0.023 rows=15 loops=1)
Index Cond: (data ?| '{braille,hardcover}'::text[])
Planning Time: 0.138 ms
Execution Time: 0.057 ms
(7 rows)
JSONB Indexes: GIN
GIN index supports the “existence” operators only on “top level” keys. If the key is not at the top level, then
the index will not be used.
Working with JSON in PostgreSQL vs. MongoDB
demo=# select * from books where data->'tags' ? 'nk455671';
id | author | isbn | rating | data
---------+-----------------+------------+--------+---------------------------------------------------------------------------------------------------------------------
---------------------------------
------------------
1000005 | XEI7xShT8bPu6H7 | 2kD5XJDZUF | 0 | {"tags": {"nk455671": {"ik937456": "iv506075"}}, "braille": true, "keywords": ["abc", "kef", "keh"], "hardcover": false,
"publisher": "zSfZIAjGGs", "
criticrating": 4}
685122 | GWfuvKfQ1PCe1IL | jnyhYYcF66 | 3 | {"tags": {"nk455671": {"ik615925": "iv253423"}}, "publisher": "b2NwVg7VY3", "criticrating": 0}
(2 rows)
demo=# explain analyze select * from books where data->'tags' ? 'nk455671';
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Seq Scan on books (cost=0.00..38807.29 rows=1000 width=158) (actual time=0.018..270.641 rows=2 loops=1)
Filter: ((data -> 'tags'::text) ? 'nk455671'::text)
Rows Removed by Filter: 1000017
Planning Time: 0.078 ms
Execution Time: 270.728 ms
(5 rows)
JSONB Indexes: GIN
The way to check for existence in nested docs is to use “Expression indexes”. Let’s create an index on
data->tags:
Working with JSON in PostgreSQL vs. MongoDB
CREATE INDEX datatagsgin ON books USING gin (data->'tags');
demo=# select * from books where data->'tags' ? 'nk455671';
id | author | isbn | rating | data
---------+-----------------+------------+--------+-----------------------------------------------------------------------------------------------------------
1000005 | XEI7xShT8bPu6H7 | 2kD5XJDZUF | 0 | {"tags": {"nk455671": {"ik937456": "iv506075"}}, "braille": true, "keywords": ["abc", "kef", "keh"], "hardcover": false,
"publisher": "zSfZIAjGGs", "
criticrating": 4}
685122 | GWfuvKfQ1PCe1IL | jnyhYYcF66 | 3 | {"tags": {"nk455671": {"ik615925": "iv253423"}}, "publisher": "b2NwVg7VY3", "criticrating": 0}
(2 rows)
demo=# explain analyze select * from books where data->'tags' ? 'nk455671';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on books (cost=12.75..1007.75 rows=1000 width=158) (actual time=0.031..0.035 rows=2 loops=1)
Recheck Cond: ((data ->'tags'::text) ? 'nk455671'::text)
Heap Blocks: exact=2
-> Bitmap Index Scan on datatagsgin (cost=0.00..12.50 rows=1000 width=0) (actual time=0.021..0.021 rows=2 loops=1)
Index Cond: ((data ->'tags'::text) ? 'nk455671'::text)
Planning Time: 0.098 ms
Execution Time: 0.061 ms
(7 rows)
JSONB Indexes: GIN - @>
The “path” operator can be used for multi-level queries of your JSON data. Let’s use it similar to the ?
operator.
Working with JSON in PostgreSQL vs. MongoDB
select * from books where data @> '{"braille":true}'::jsonb;
demo=# explain analyze select * from books where data @> '{"braille":true}'::jsonb;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on books (cost=16.75..1009.25 rows=1000 width=158) (actual time=0.040..0.048 rows=6 loops=1)
Recheck Cond: (data @> '{"braille": true}'::jsonb)
Rows Removed by Index Recheck: 9
Heap Blocks: exact=2
-> Bitmap Index Scan on datagin (cost=0.00..16.50 rows=1000 width=0) (actual time=0.030..0.030 rows=15 loops=1)
Index Cond: (data @> '{"braille": true}'::jsonb)
Planning Time: 0.100 ms
Execution Time: 0.076 ms
(8 rows)
JSONB Indexes: GIN - @>
The "path" operator can be used for multi level queries of your JSON data.
Working with JSON in PostgreSQL vs. MongoDB
demo=# select * from books where data @> '{"publisher":"XlekfkLOtL"}'::jsonb;
id | author | isbn | rating | data
-----+-----------------+------------+--------+-------------------------------------------------------------------------------------
346 | uD3QOvHfJdxq2ez | KiAaIRu8QE | 1 | {"tags": {"nk88": {"ik37": "iv161"}}, "publisher": "XlekfkLOtL", "criticrating": 3}
(1 row)
demo=# explain analyze select * from books where data @> '{"publisher":"XlekfkLOtL"}'::jsonb;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on books (cost=16.75..1009.25 rows=1000 width=158) (actual time=0.491..0.492 rows=1 loops=1)
Recheck Cond: (data @> '{"publisher": "XlekfkLOtL"}'::jsonb)
Heap Blocks: exact=1
-> Bitmap Index Scan on datagin (cost=0.00..16.50 rows=1000 width=0) (actual time=0.092..0.092 rows=1 loops=1)
Index Cond: (data @> '{"publisher": "XlekfkLOtL"}'::jsonb)
Planning Time: 0.090 ms
Execution Time: 0.523 ms
JSONB Indexes: GIN - @>
The JSON queries can be nested to many levels. You can also use the ># operation but GIN does not
support it.
Working with JSON in PostgreSQL vs. MongoDB
demo=# select * from books where data @> '{"tags":{"nk455671":{"ik937456":"iv506075"}}}'::jsonb;
id | author | isbn | rating | data
---------+-----------------+------------+--------+---------------------------------------------------------------------------------------------------------------------
---------------------------------
------------------
1000005 | XEI7xShT8bPu6H7 | 2kD5XJDZUF | 0 | {"tags": {"nk455671": {"ik937456": "iv506075"}}, "braille": true, "keywords": ["abc", "kef", "keh"], "hardcover": false,
"publisher": "zSfZIAjGGs", "
criticrating": 4}
(1 row)
JSONB Indexes: GIN - jsonb_pathops
GIN also supports a “pathops” option to reduce the size of the GIN index.
From the docs:
“The technical difference between a jsonb_ops and a jsonb_path_ops GIN index is that the former creates
independent index items for each key and value in the data, while the latter creates index items only for
each value in the data.”
On my small dataset of 1M books, you can see that the pathops GIN index is smaller – you should test
with your dataset to understand the savings.
Working with JSON in PostgreSQL vs. MongoDB
CREATE INDEX dataginpathops ON books USING gin (data jsonb_path_ops);
public | dataginpathops | index | sgpostgres | books | 67 MB |
public | datatagsgin | index | sgpostgres | books | 84 MB |
JSONB Indexes: GIN - jsonb_pathops
Let’s rerun our query from before with the pathops index:
Working with JSON in PostgreSQL vs. MongoDB
demo=# select * from books where data @> '{"tags":{"nk455671":{"ik937456":"iv506075"}}}'::jsonb;
id | author | isbn | rating | data
---------+-----------------+------------+--------+---------------------------------------------------------------------------------------------------------------------
---------------------------------
------------------
1000005 | XEI7xShT8bPu6H7 | 2kD5XJDZUF | 0 | {"tags": {"nk455671": {"ik937456": "iv506075"}}, "braille": true, "keywords": ["abc", "kef", "keh"], "hardcover": false,
"publisher": "zSfZIAjGGs", "
criticrating": 4}
(1 row)
demo=# explain select * from books where data @> '{"tags":{"nk455671":{"ik937456":"iv506075"}}}'::jsonb;
QUERY PLAN
-----------------------------------------------------------------------------------------
Bitmap Heap Scan on books (cost=12.75..1005.25 rows=1000 width=158)
Recheck Cond: (data @> '{"tags": {"nk455671": {"ik937456": "iv506075"}}}'::jsonb)
-> Bitmap Index Scan on dataginpathops (cost=0.00..12.50 rows=1000 width=0)
Index Cond: (data @> '{"tags": {"nk455671": {"ik937456": "iv506075"}}}'::jsonb)
(4 rows)
JSONB Indexes: GIN - jsonb_pathops
The “jsonb_pathops” option supports only the @> operator.
Smaller index but more limited scenarios.
The following queries below can no longer leverage the GIN index:
Working with JSON in PostgreSQL vs. MongoDB
select * from books where data ? 'tags'; => Sequential scan
select * from books where data @> '{"tags" :{}}'; => Sequential scan
select * from books where data @> '{"tags" :{"k7888":{}}}' => Sequential scan
JSONB Indexes: B-tree
• B-tree indexes are the most common index type in relational databases.
• If you index an entire JSONB column with a B-tree index, the only useful
operators are the comparison operators:
• =, <, <=, >, >=
• Can be used only for whole object comparisons.
• Very limited use case.
Working with JSON in PostgreSQL vs. MongoDB
JSONB Indexes: B-tree
• Use B-tree “Expression indexes”
• B-tree expression indexes can support the common comparison operators '=', '<', '>', '>=', '<=‘ (which
GIN doesn't support).
• Retrieve all books with a data->criticrating > 4.
Working with JSON in PostgreSQL vs. MongoDB
demo=# select * from books where data->'criticrating' > 4;
ERROR: operator does not exist: jsonb >= integer
LINE 1: select * from books where data->'criticrating’ > 4;
^
HINT: No operator matches the given name and argument types. You might need to add explicit type casts.
#Lets cast JSONB to integer
demo=# select * from books where (data->'criticrating')::int4 > 4;
#If you are using a version prior to pg11 you need to query as text and then cast
demo=# select * from books where (data->>'criticrating')::int4 > 4;
JSONB Indexes: B-tree
For expression indexes, the index needs to be an exact match with the query expression:
Working with JSON in PostgreSQL vs. MongoDB
demo=# CREATE INDEX criticrating ON books USING BTREE (((data->'criticrating')::int4));
CREATE INDEX
demo=# explain analyze select * from books where (data->'criticrating')::int4 = 3;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Index Scan using criticrating on books (cost=0.42..4626.93 rows=5000 width=158) (actual time=0.069..70.221 rows=199883 loops=1)
Index Cond: (((data -> 'criticrating'::text))::integer = 3)
Planning Time: 0.103 ms
Execution Time: 79.019 ms
(4 rows)
demo=# explain analyze select * from books where (data->'criticrating')::int4 = 3;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Index Scan using criticrating on books (cost=0.42..4626.93 rows=5000 width=158) (actual time=0.069..70.221 rows=199883 loops=1)
Index Cond: (((data -> 'criticrating'::text))::integer = 3)
Planning Time: 0.103 ms
Execution Time: 79.019 ms
(4 rows)
1
From above we can see that the BTREE index is being used as expected.
JSONB Indexes: HASH
• If you are only interested in the "=" operator, then Hash indexes become interesting.
• Hash indexes tend to be smaller than B-tree indexes.
Working with JSON in PostgreSQL vs. MongoDB
CREATE INDEX publisherhash ON books USING HASH ((data->'publisher'));
demo=# select * from books where data->'publisher' = 'XlekfkLOtL'
demo-# ;
id | author | isbn | rating | data
-----+-----------------+------------+--------+-------------------------------------------------------------------------------------
346 | uD3QOvHfJdxq2ez | KiAaIRu8QE | 1 | {"tags": {"nk88": {"ik37": "iv161"}}, "publisher": "XlekfkLOtL", "criticrating": 3}
(1 row)
demo=# explain analyze select * from books where data->'publisher' = 'XlekfkLOtL';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Index Scan using publisherhash on books (cost=0.00..2.02 rows=1 width=158) (actual time=0.016..0.017 rows=1 loops=1)
Index Cond: ((data -> 'publisher'::text) = 'XlekfkLOtL'::text)
Planning Time: 0.080 ms
Execution Time: 0.035 ms
(4 rows)
JSONB Indexes: GIN - Trigram
• PostgreSQL supports string matching using Trigram indexes.
• Trigrams are basically words broken up into sequences of 3 letters.
• We can search for any arbitrary regex (not just left anchored).
Working with JSON in PostgreSQL vs. MongoDB
CREATE EXTENSION pg_trgm;
CREATE INDEX publisher ON books USING GIN ((data->'publisher') gin_trgm_ops);
demo=# select * from books where data->'publisher' LIKE '%I0UB%';
id | author | isbn | rating | data
----+-----------------+------------+--------+---------------------------------------------------------------------------------
4 | KiEk3xjqvTpmZeS | EYqXO9Nwmm | 0 | {"tags": {"nk3": {"ik1": "iv1"}}, "publisher": "MI0UBqZJDt", "criticrating": 1}
(1 row)
demo=# explain analyze select * from books where data->'publisher' LIKE '%I0UB%';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on books (cost=9.78..111.28 rows=100 width=158) (actual time=0.033..0.033 rows=1 loops=1)
Recheck Cond: ((data -&gt; 'publisher'::text) ~~ '%I0UB%'::text)
Heap Blocks: exact=1
-> Bitmap Index Scan on publisher (cost=0.00..9.75 rows=100 width=0) (actual time=0.025..0.025 rows=1 loops=1)
Index Cond: ((data -&gt; 'publisher'::text) ~~ '%I0UB%'::text)
Planning Time: 0.213 ms
Execution Time: 0.058 ms
(7 rows)
JSONB Indexes: GIN - Arrays
• GIN indexes are great for indexing arrays.
• Indexing and searching the keyword array.
Working with JSON in PostgreSQL vs. MongoDB
CREATE INDEX keywords ON books USING GIN ((data->'keywords') jsonb_path_ops);
demo=# select * from books where data->'keywords' @> '["abc", "keh"]'::jsonb;
id | author | isbn | rating | data
---------+-----------------+------------+--------+---------------------------------------------------------------------------------------------------------------------
--------------
1000003 | zEG406sLKQ2IU8O | viPdlu3DZm | 4 | {"tags": {"nk263020": {"ik203820": "iv817928"}}, "keywords": ["abc", "kef", "keh"], "publisher": "7NClevxuTM",
"criticrating": 2}
1000004 | GCe9NypHYKDH4rD | so6TQDYzZ3 | 4 | {"tags": {"nk780341": {"ik397357": "iv632731"}}, "keywords": ["abc", "kef", "keh"], "publisher": "fqaJuAdjP5",
"criticrating": 2}
(2 rows)
demo=# explain analyze select * from books where data->'keywords' @> '["abc", "keh"]'::jsonb;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on books (cost=54.75..1049.75 rows=1000 width=158) (actual time=0.026..0.028 rows=2 loops=1)
Recheck Cond: ((data -> 'keywords'::text) @> '["abc", "keh"]'::jsonb)
Heap Blocks: exact=1
-> Bitmap Index Scan on keywords (cost=0.00..54.50 rows=1000 width=0) (actual time=0.014..0.014 rows=2 loops=1)
Index Cond: ((data -> 'keywords'::text) @&amp;amp;amp;amp;amp;gt; '["abc", "keh"]'::jsonb)
Planning Time: 0.131 ms
Execution Time: 0.063 ms
(7 rows)
SQL/JSON
• SQL standard added support for JSON – SQL Standard-2016 (SQL/JSON).
• SQL/JSON Data model
• JSONPath
• SQL/JSON functions
• With PG12 release, PostgreSQL has one of the best implementations of
SQL/JSON.
Working with JSON in PostgreSQL vs. MongoDB
SQL/JSON 2016
● A sequence of SQL/JSON items, each item can be (recursively) any of:
○ SQL/JSON scalar — non-null value of SQL types: Unicode character string, numeric, Boolean
or datetime.
○ SQL/JSON null, value that is distinct from any value of any SQL type (not the same as NULL).
○ SQL/JSON arrays, ordered list of zero or more SQL/JSON items — SQL/JSON element
○ SQL/JSON objects — unordered collections of zero or more SQL/JSON members.
■ (key, SQL/JSON item)
Working with JSON in PostgreSQL vs. MongoDB
JSONPath
Working with JSON in PostgreSQL vs. MongoDB
.key Returns an object member with the specified key
[*] Wildcard array element accessor that returns all array elements
.* Wildcard member accessor that returns the values of all members located at the top level of
the current object
.** Recursive wildcard member accessor that processes all levels of the JSON hierarchy of the
current object and returns all the member values, regardless of their nesting level
JSONPath allows you to specify an expression (using a syntax similar to the
property access notation in Javascript) to query or project your JSON data.
SQL/JSON Functions
● PG 12 provides several functions to use JSONPATH to query your JSON
data
○ jsonb_path_exists - Checks whether JSON path returns any item for the
specified JSON value
○ jsonb_path_match - Returns the result of JSON path predicate check for
the specified JSON value.
○ jsonb_path_query - Gets all JSON items returned by JSON path for the
specified JSON value.
Working with JSON in PostgreSQL vs. MongoDB
JSONPath
Finding books by publisher?
Working with JSON in PostgreSQL vs. MongoDB
demo=# select * from books where data @@ '$.publisher == "ktjKEZ1tvq"';
id | author | isbn | rating | data
---------+-----------------+------------+--------+---------------------------------------------------------------------------------------------------------------------
-------------
1000001 | 4RNsovI2haTgU7l | GwSoX67gLS | 2 | {"tags": {"nk542369": {"ik55240": "iv305393"}}, "keywords": ["abc", "def", "geh"], "publisher": "ktjKEZ1tvq",
"criticrating": 0}
(1 row)
demo=# explain analyze select * from books where data @@ '$.publisher == "ktjKEZ1tvq"';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on books (cost=21.75..1014.25 rows=1000 width=158) (actual time=0.123..0.124 rows=1 loops=1)
Recheck Cond: (data @@ '($."publisher" == "ktjKEZ1tvq")'::jsonpath)
Heap Blocks: exact=1
-&amp;amp;amp;amp;gt; Bitmap Index Scan on datagin (cost=0.00..21.50 rows=1000 width=0) (actual time=0.110..0.110 rows=1 loops=1)
Index Cond: (data @@ '($."publisher" == "ktjKEZ1tvq")'::jsonpath)
Planning Time: 0.137 ms
Execution Time: 0.194 ms
(7 rows)
JSONPath
Add a JSONPath filter:
Working with JSON in PostgreSQL vs. MongoDB
select * from books where jsonb_path_exists(data,'$.publisher ?(@ == "ktjKEZ1tvq")');
Build complicated filter expressions:
select * from books where jsonb_path_exists(data, '$.prints[*] ?(@.style=="hc" && @.price == 100)');
Index support for JSONPath is very limited.
demo=# explain analyze select * from books where jsonb_path_exists(data,'$.publisher ?(@ == "ktjKEZ1tvq")');
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Seq Scan on books (cost=0.00..36307.24 rows=333340 width=158) (actual time=0.019..480.268 rows=1 loops=1)
Filter: jsonb_path_exists(data, '$."publisher"?(@ == "ktjKEZ1tvq")'::jsonpath, '{}'::jsonb, false)
Rows Removed by Filter: 1000028
Planning Time: 0.095 ms
Execution Time: 480.348 ms
(5 rows)
JSONPath: Projection JSON
Select the last element of the array
Working with JSON in PostgreSQL vs. MongoDB
demo=# select jsonb_path_query(data, '$.prints[$.size()]') from books where id = 1000029;
jsonb_path_query
------------------------------
{"price": 50, "style": "pb"}
(1 row)
Select only the hardcover prints from the array
demo=# select jsonb_path_query(data, '$.prints[*] ?(@.style=="hc")') from books where id = 1000029;
jsonb_path_query
-------------------------------
{"price": 100, "style": "hc"}
(1 row)
We can also chain the filters
demo=# select jsonb_path_query(data, '$.prints[*] ?(@.style=="hc") ?(@.price ==100)') from books where id = 1000029;
jsonb_path_query
-------------------------------
{"price": 100, "style": "hc"}
(1 row)
Roadmap
● Improvements to the JSONPath implementation in PG13
● Future Roadmap
Working with JSON in PostgreSQL vs. MongoDB
Questions?
Working with JSON in PostgreSQL vs. MongoDB

More Related Content

What's hot (20)

High Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouse
Altinity Ltd
 
Hudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilities
Nishith Agarwal
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
Databricks
 
Postgresql database administration volume 1
Postgresql database administration volume 1
Federico Campoli
 
Building a SIMD Supported Vectorized Native Engine for Spark SQL
Building a SIMD Supported Vectorized Native Engine for Spark SQL
Databricks
 
Elasticsearch for beginners
Elasticsearch for beginners
Neil Baker
 
Postgresql
Postgresql
NexThoughts Technologies
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
Ryan Blue
 
Webscale PostgreSQL - JSONB and Horizontal Scaling Strategies
Webscale PostgreSQL - JSONB and Horizontal Scaling Strategies
Jonathan Katz
 
10 Good Reasons to Use ClickHouse
10 Good Reasons to Use ClickHouse
rpolat
 
PostgreSQL: Advanced indexing
PostgreSQL: Advanced indexing
Hans-Jürgen Schönig
 
Demystifying flink memory allocation and tuning - Roshan Naik, Uber
Demystifying flink memory allocation and tuning - Roshan Naik, Uber
Flink Forward
 
A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides
Altinity Ltd
 
MongoDB Performance Tuning
MongoDB Performance Tuning
Puneet Behl
 
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
Altinity Ltd
 
Introduction to Spark with Python
Introduction to Spark with Python
Gokhan Atil
 
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
PostgreSQL-Consulting
 
Introduction to MongoDB
Introduction to MongoDB
Mike Dirolf
 
Altinity Quickstart for ClickHouse
Altinity Quickstart for ClickHouse
Altinity Ltd
 
MariaDB 10.11 key features overview for DBAs
MariaDB 10.11 key features overview for DBAs
Federico Razzoli
 
High Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouse
Altinity Ltd
 
Hudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilities
Nishith Agarwal
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
Databricks
 
Postgresql database administration volume 1
Postgresql database administration volume 1
Federico Campoli
 
Building a SIMD Supported Vectorized Native Engine for Spark SQL
Building a SIMD Supported Vectorized Native Engine for Spark SQL
Databricks
 
Elasticsearch for beginners
Elasticsearch for beginners
Neil Baker
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
Ryan Blue
 
Webscale PostgreSQL - JSONB and Horizontal Scaling Strategies
Webscale PostgreSQL - JSONB and Horizontal Scaling Strategies
Jonathan Katz
 
10 Good Reasons to Use ClickHouse
10 Good Reasons to Use ClickHouse
rpolat
 
Demystifying flink memory allocation and tuning - Roshan Naik, Uber
Demystifying flink memory allocation and tuning - Roshan Naik, Uber
Flink Forward
 
A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides
Altinity Ltd
 
MongoDB Performance Tuning
MongoDB Performance Tuning
Puneet Behl
 
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
Altinity Ltd
 
Introduction to Spark with Python
Introduction to Spark with Python
Gokhan Atil
 
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
PostgreSQL-Consulting
 
Introduction to MongoDB
Introduction to MongoDB
Mike Dirolf
 
Altinity Quickstart for ClickHouse
Altinity Quickstart for ClickHouse
Altinity Ltd
 
MariaDB 10.11 key features overview for DBAs
MariaDB 10.11 key features overview for DBAs
Federico Razzoli
 

Similar to Working with JSON Data in PostgreSQL vs. MongoDB (20)

Oh, that ubiquitous JSON !
Oh, that ubiquitous JSON !
Alexander Korotkov
 
You, me, and jsonb
You, me, and jsonb
Arielle Simmons
 
Jsquery - the jsonb query language with GIN indexing support
Jsquery - the jsonb query language with GIN indexing support
Alexander Korotkov
 
Postgres vs Mongo / Олег Бартунов (Postgres Professional)
Postgres vs Mongo / Олег Бартунов (Postgres Professional)
Ontico
 
The NoSQL Way in Postgres
The NoSQL Way in Postgres
EDB
 
NoSQL Best Practices for PostgreSQL / Дмитрий Долгов (Mindojo)
NoSQL Best Practices for PostgreSQL / Дмитрий Долгов (Mindojo)
Ontico
 
Power JSON with PostgreSQL
Power JSON with PostgreSQL
EDB
 
Postgrtesql as a NoSQL Document Store - The JSON/JSONB data type
Postgrtesql as a NoSQL Document Store - The JSON/JSONB data type
Jumping Bean
 
There is Javascript in my SQL
There is Javascript in my SQL
PGConf APAC
 
No sql way_in_pg
No sql way_in_pg
Vibhor Kumar
 
PG Day'14 Russia, Работа со слабо-структурированными данными в PostgreSQL, Ол...
PG Day'14 Russia, Работа со слабо-структурированными данными в PostgreSQL, Ол...
pgdayrussia
 
How to Use JSONB in PostgreSQL for Product Attributes Storage
How to Use JSONB in PostgreSQL for Product Attributes Storage
techprane
 
PostgreSQL 9.4 JSON Types and Operators
PostgreSQL 9.4 JSON Types and Operators
Nicholas Kiraly
 
Do More with Postgres- NoSQL Applications for the Enterprise
Do More with Postgres- NoSQL Applications for the Enterprise
EDB
 
NoSQL on ACID - Meet Unstructured Postgres
NoSQL on ACID - Meet Unstructured Postgres
EDB
 
Postgre(No)SQL - A JSON journey
Postgre(No)SQL - A JSON journey
Nicola Moretto
 
Mathias test
Mathias test
Mathias Stjernström
 
NoSQL для PostgreSQL: Jsquery — язык запросов
NoSQL для PostgreSQL: Jsquery — язык запросов
CodeFest
 
JSON Processing in the Database using PostgreSQL 9.4 :: Data Wranglers DC :: ...
JSON Processing in the Database using PostgreSQL 9.4 :: Data Wranglers DC :: ...
Ryan B Harvey, CSDP, CSM
 
NoSQL on ACID: Meet Unstructured Postgres
NoSQL on ACID: Meet Unstructured Postgres
EDB
 
Jsquery - the jsonb query language with GIN indexing support
Jsquery - the jsonb query language with GIN indexing support
Alexander Korotkov
 
Postgres vs Mongo / Олег Бартунов (Postgres Professional)
Postgres vs Mongo / Олег Бартунов (Postgres Professional)
Ontico
 
The NoSQL Way in Postgres
The NoSQL Way in Postgres
EDB
 
NoSQL Best Practices for PostgreSQL / Дмитрий Долгов (Mindojo)
NoSQL Best Practices for PostgreSQL / Дмитрий Долгов (Mindojo)
Ontico
 
Power JSON with PostgreSQL
Power JSON with PostgreSQL
EDB
 
Postgrtesql as a NoSQL Document Store - The JSON/JSONB data type
Postgrtesql as a NoSQL Document Store - The JSON/JSONB data type
Jumping Bean
 
There is Javascript in my SQL
There is Javascript in my SQL
PGConf APAC
 
PG Day'14 Russia, Работа со слабо-структурированными данными в PostgreSQL, Ол...
PG Day'14 Russia, Работа со слабо-структурированными данными в PostgreSQL, Ол...
pgdayrussia
 
How to Use JSONB in PostgreSQL for Product Attributes Storage
How to Use JSONB in PostgreSQL for Product Attributes Storage
techprane
 
PostgreSQL 9.4 JSON Types and Operators
PostgreSQL 9.4 JSON Types and Operators
Nicholas Kiraly
 
Do More with Postgres- NoSQL Applications for the Enterprise
Do More with Postgres- NoSQL Applications for the Enterprise
EDB
 
NoSQL on ACID - Meet Unstructured Postgres
NoSQL on ACID - Meet Unstructured Postgres
EDB
 
Postgre(No)SQL - A JSON journey
Postgre(No)SQL - A JSON journey
Nicola Moretto
 
NoSQL для PostgreSQL: Jsquery — язык запросов
NoSQL для PostgreSQL: Jsquery — язык запросов
CodeFest
 
JSON Processing in the Database using PostgreSQL 9.4 :: Data Wranglers DC :: ...
JSON Processing in the Database using PostgreSQL 9.4 :: Data Wranglers DC :: ...
Ryan B Harvey, CSDP, CSM
 
NoSQL on ACID: Meet Unstructured Postgres
NoSQL on ACID: Meet Unstructured Postgres
EDB
 
Ad

More from ScaleGrid.io (7)

What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
ScaleGrid.io
 
Redis vs. MongoDB: Comparing In-Memory Databases with Percona Memory Engine
Redis vs. MongoDB: Comparing In-Memory Databases with Percona Memory Engine
ScaleGrid.io
 
Introduction to Redis Data Structures: Sets
Introduction to Redis Data Structures: Sets
ScaleGrid.io
 
Introduction to Redis Data Structures: Sorted Sets
Introduction to Redis Data Structures: Sorted Sets
ScaleGrid.io
 
Cassandra vs. MongoDB
Cassandra vs. MongoDB
ScaleGrid.io
 
Introduction to Redis Data Structures: Hashes
Introduction to Redis Data Structures: Hashes
ScaleGrid.io
 
Introduction to Redis Data Structures
Introduction to Redis Data Structures
ScaleGrid.io
 
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
ScaleGrid.io
 
Redis vs. MongoDB: Comparing In-Memory Databases with Percona Memory Engine
Redis vs. MongoDB: Comparing In-Memory Databases with Percona Memory Engine
ScaleGrid.io
 
Introduction to Redis Data Structures: Sets
Introduction to Redis Data Structures: Sets
ScaleGrid.io
 
Introduction to Redis Data Structures: Sorted Sets
Introduction to Redis Data Structures: Sorted Sets
ScaleGrid.io
 
Cassandra vs. MongoDB
Cassandra vs. MongoDB
ScaleGrid.io
 
Introduction to Redis Data Structures: Hashes
Introduction to Redis Data Structures: Hashes
ScaleGrid.io
 
Introduction to Redis Data Structures
Introduction to Redis Data Structures
ScaleGrid.io
 
Ad

Recently uploaded (20)

Oracle Cloud Infrastructure Generative AI Professional
Oracle Cloud Infrastructure Generative AI Professional
VICTOR MAESTRE RAMIREZ
 
Analysis of the changes in the attitude of the news comments caused by knowin...
Analysis of the changes in the attitude of the news comments caused by knowin...
Matsushita Laboratory
 
FIDO Alliance Seminar State of Passkeys.pptx
FIDO Alliance Seminar State of Passkeys.pptx
FIDO Alliance
 
PyData - Graph Theory for Multi-Agent Integration
PyData - Graph Theory for Multi-Agent Integration
barqawicloud
 
FIDO Seminar: Evolving Landscape of Post-Quantum Cryptography.pptx
FIDO Seminar: Evolving Landscape of Post-Quantum Cryptography.pptx
FIDO Alliance
 
FIDO Seminar: Authentication for a Billion Consumers - Amazon.pptx
FIDO Seminar: Authentication for a Billion Consumers - Amazon.pptx
FIDO Alliance
 
Artificial Intelligence in the Nonprofit Boardroom.pdf
Artificial Intelligence in the Nonprofit Boardroom.pdf
OnBoard
 
Scaling GenAI Inference From Prototype to Production: Real-World Lessons in S...
Scaling GenAI Inference From Prototype to Production: Real-World Lessons in S...
Anish Kumar
 
Your startup on AWS - How to architect and maintain a Lean and Mean account
Your startup on AWS - How to architect and maintain a Lean and Mean account
angelo60207
 
MuleSoft for AgentForce : Topic Center and API Catalog
MuleSoft for AgentForce : Topic Center and API Catalog
shyamraj55
 
vertical-cnc-processing-centers-drillteq-v-200-en.pdf
vertical-cnc-processing-centers-drillteq-v-200-en.pdf
AmirStern2
 
“Why It’s Critical to Have an Integrated Development Methodology for Edge AI,...
“Why It’s Critical to Have an Integrated Development Methodology for Edge AI,...
Edge AI and Vision Alliance
 
TrustArc Webinar - 2025 Global Privacy Survey
TrustArc Webinar - 2025 Global Privacy Survey
TrustArc
 
Your startup on AWS - How to architect and maintain a Lean and Mean account J...
Your startup on AWS - How to architect and maintain a Lean and Mean account J...
angelo60207
 
Viral>Wondershare Filmora 14.5.18.12900 Crack Free Download
Viral>Wondershare Filmora 14.5.18.12900 Crack Free Download
Puppy jhon
 
Oracle Cloud and AI Specialization Program
Oracle Cloud and AI Specialization Program
VICTOR MAESTRE RAMIREZ
 
Bridging the divide: A conversation on tariffs today in the book industry - T...
Bridging the divide: A conversation on tariffs today in the book industry - T...
BookNet Canada
 
Enabling BIM / GIS integrations with Other Systems with FME
Enabling BIM / GIS integrations with Other Systems with FME
Safe Software
 
Providing an OGC API Processes REST Interface for FME Flow
Providing an OGC API Processes REST Interface for FME Flow
Safe Software
 
Can We Use Rust to Develop Extensions for PostgreSQL? (POSETTE: An Event for ...
Can We Use Rust to Develop Extensions for PostgreSQL? (POSETTE: An Event for ...
NTT DATA Technology & Innovation
 
Oracle Cloud Infrastructure Generative AI Professional
Oracle Cloud Infrastructure Generative AI Professional
VICTOR MAESTRE RAMIREZ
 
Analysis of the changes in the attitude of the news comments caused by knowin...
Analysis of the changes in the attitude of the news comments caused by knowin...
Matsushita Laboratory
 
FIDO Alliance Seminar State of Passkeys.pptx
FIDO Alliance Seminar State of Passkeys.pptx
FIDO Alliance
 
PyData - Graph Theory for Multi-Agent Integration
PyData - Graph Theory for Multi-Agent Integration
barqawicloud
 
FIDO Seminar: Evolving Landscape of Post-Quantum Cryptography.pptx
FIDO Seminar: Evolving Landscape of Post-Quantum Cryptography.pptx
FIDO Alliance
 
FIDO Seminar: Authentication for a Billion Consumers - Amazon.pptx
FIDO Seminar: Authentication for a Billion Consumers - Amazon.pptx
FIDO Alliance
 
Artificial Intelligence in the Nonprofit Boardroom.pdf
Artificial Intelligence in the Nonprofit Boardroom.pdf
OnBoard
 
Scaling GenAI Inference From Prototype to Production: Real-World Lessons in S...
Scaling GenAI Inference From Prototype to Production: Real-World Lessons in S...
Anish Kumar
 
Your startup on AWS - How to architect and maintain a Lean and Mean account
Your startup on AWS - How to architect and maintain a Lean and Mean account
angelo60207
 
MuleSoft for AgentForce : Topic Center and API Catalog
MuleSoft for AgentForce : Topic Center and API Catalog
shyamraj55
 
vertical-cnc-processing-centers-drillteq-v-200-en.pdf
vertical-cnc-processing-centers-drillteq-v-200-en.pdf
AmirStern2
 
“Why It’s Critical to Have an Integrated Development Methodology for Edge AI,...
“Why It’s Critical to Have an Integrated Development Methodology for Edge AI,...
Edge AI and Vision Alliance
 
TrustArc Webinar - 2025 Global Privacy Survey
TrustArc Webinar - 2025 Global Privacy Survey
TrustArc
 
Your startup on AWS - How to architect and maintain a Lean and Mean account J...
Your startup on AWS - How to architect and maintain a Lean and Mean account J...
angelo60207
 
Viral>Wondershare Filmora 14.5.18.12900 Crack Free Download
Viral>Wondershare Filmora 14.5.18.12900 Crack Free Download
Puppy jhon
 
Oracle Cloud and AI Specialization Program
Oracle Cloud and AI Specialization Program
VICTOR MAESTRE RAMIREZ
 
Bridging the divide: A conversation on tariffs today in the book industry - T...
Bridging the divide: A conversation on tariffs today in the book industry - T...
BookNet Canada
 
Enabling BIM / GIS integrations with Other Systems with FME
Enabling BIM / GIS integrations with Other Systems with FME
Safe Software
 
Providing an OGC API Processes REST Interface for FME Flow
Providing an OGC API Processes REST Interface for FME Flow
Safe Software
 
Can We Use Rust to Develop Extensions for PostgreSQL? (POSETTE: An Event for ...
Can We Use Rust to Develop Extensions for PostgreSQL? (POSETTE: An Event for ...
NTT DATA Technology & Innovation
 

Working with JSON Data in PostgreSQL vs. MongoDB

  • 1. JSONB in PostgreSQL Working with JSON in PostgreSQL vs. MongoDB Dharshan Rangegowda Founder, ScaleGrid.io | @dharshanrg
  • 2. What is JSON? ● JSON stands for Javascript object notation. ● Open standard format RFC 7159. ● Most popular format to store and exchange documents. Working with JSON in PostgreSQL vs. MongoDB
  • 3. Why does PostgreSQL need to care about JSON? • Schema flexibility • Dealing with transient or changing columns. • Nested objects • Might not need to deserialize to query. • Handling objects from other systems • E.g. Stripe transaction Working with JSON in PostgreSQL vs. MongoDB
  • 4. PostgreSQL + JSON Timeline Working with JSON in PostgreSQL vs. MongoDB
  • 5. PostgreSQL JSON Support • Wave 1: PostgreSQL 9.2 (2012) added support for the JSON datatype • Text field with JSON validation • No index support • Wave 2: PostgreSQL 9.4 (2014) added support for JSONB datatype • Binary data structure to store JSON • Index support Working with JSON in PostgreSQL vs. MongoDB
  • 6. PostgreSQL JSON Support • Wave 3: PostgreSQL 12 (2019) added support for SQL/JSON standard • JSONPath support • Powerful query and projection engine for JSON data • Further improvements to JSONPath in PostgreSQL 13 • JSON roadmap Working with JSON in PostgreSQL vs. MongoDB
  • 7. JSON vs. JSONB • JSONB is what you should be using (in most cases) • However, there are some scenarios where JSON is useful: • JSON preserves the original formatting (a.k.a whitespace) • JSON preserves ordering of the keys • JSON preserves duplicate keys • JSON is faster to ingest vs. JSONB Working with JSON in PostgreSQL vs. MongoDB
  • 8. JSONB Anti Patterns ● What is the best way to use JSONB? ○ Do we even need columns any more? ○ Why not just use <int id, jsonb data>? ● JSONB has some high-level limitations you need to be aware of: ○ Statistics collection ○ Storage bloat ● Commonly occurring fields should be stored as columns. ○ Use JSONB for variable or intermittent columns. Working with JSON in PostgreSQL vs. MongoDB
  • 9. JSONB Anti Patterns ● PostgreSQL collect stats on column data distribution ○ Most common values (MCV) ○ Fraction of null values ○ Histogram of distribution ● No column statistics collected for JSONB ○ Query planner doesn’t have stats to make smart decisions ○ Could make wrong choice – cross join vs hash join etc ● More details in blog post - When To Avoid JSONB In A PostgreSQL Schema Working with JSON in PostgreSQL vs. MongoDB
  • 10. JSONB Anti Patterns ● Storage bloat ○ Keys are stored in the data (Similar to MongoDB mmapv1) ○ Use smaller key names to reduce footprint ○ Relies on TOAST compression ○ Sample table with 1M rows (11GB of data) ○ PostgreSQL - 8.7 GB ○ MongoDB Snappy – 8GB, Zlib – 5.3 GB Working with JSON in PostgreSQL vs. MongoDB
  • 11. JSONB & TOAST ● If the size of your column exceeds the TOAST_TUPLE_THRESHOLD (2KB default) data could be moved to out of line storage - TOAST ● TOAST also provides compression (pglz) ○ Decent Compression ○ MongoDB WiredTiger snappy/zlib is potentially better ● To access the data it needs to be De’TOASTed ○ Could result in performance overhead Working with JSON in PostgreSQL vs. MongoDB
  • 12. JSONB Data Structures Working with JSON in PostgreSQL vs. MongoDB Images courtesy: https://erthalion.info/2017/12/21/advanced-json-benchmarks/
  • 13. BSON Data Structures Working with JSON in PostgreSQL vs. MongoDB
  • 14. JSONB Operators Working with JSON in PostgreSQL vs. MongoDB Operator Description ->, ->> Get JSON object field by key @>, <@ Does the left JSONB value contain the right JSONB path/value entries at the top level? ?, ?!, ?& Does the string exist as a top-level key within the JSON value? @@, @@> JSONPath operators Full list of operators can be found in the docs – JSONB op table
  • 15. JSONB Functions • PostgreSQL provides a wide variety of functions to create and process JSON data • Creation functions • Processing functions Working with JSON in PostgreSQL vs. MongoDB
  • 16. MongoDB Query language • Query language based on JSON syntax • db.books.find( {} ) , db.books.find( { publisher: "D" } ) • Array operators • db.books.find( { tags: ["red", "blank"] } ) • AND and OR operators • db.books.find( { $or: [ { publisher: "A" }, { criticrating: { $lt: 30 } } ] } ) Working with JSON in PostgreSQL vs. MongoDB
  • 17. MongoDB Query language • Query nested documents • db.books.find( { "size.uom": "in" } ) • Query an Array of objects • db.books.find( { 'instock.qty': { $lte: 20 } } )) • Project fields to return from query • db.books.find( {prints: 1}, { $or: [ { publisher: "A" }, { criticrating: { $lt: 30 } } ] } ) Working with JSON in PostgreSQL vs. MongoDB
  • 18. JSONB Indexes • JSONB provides a wide array of options to index your JSON data. • We are going to dig into three types of indexes: • GIN • BTREE • HASH Working with JSON in PostgreSQL vs. MongoDB
  • 19. JSONB Indexes : GIN • GIN stands for “Generalized Inverted Indexes” • GIN supports two operator classes • jsonb_ops • ?, ?|, ?&, @>, @@, @? • [Index each key and value] • jsonb_pathops • @>, @@, @? • [Index only the values] Copyright © ScaleGrid.io
  • 20. JSON sample data Fictional book database (apologies to any librarians in the audience): Working with JSON in PostgreSQL vs. MongoDB demo=# d+ books Table "public.books" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+------------------------+-----------+----------+---------+----------+--------------+------------- id | integer | | not null | | plain | | author | character varying(255) | | | | extended | | isbn | character varying(25) | | | | extended | | rating | integer | | | | plain | | data | jsonb | | | | extended | | Indexes: "books_pkey" PRIMARY KEY, btree (id) …..
  • 21. JSON sample data Working with JSON in PostgreSQL vs. MongoDB demo=# select jsonb_pretty(data) from books where id = 1000021; jsonb_pretty ------------------------------------ { + "tags": { + "nk906585": { + "ik844766": "iv364087"+ } + }, + "prints": [ + { + "price": 100, + "style": "hc" + }, + { + "price": 50, + "style": "pb" + } + ], + "braille": false, + "keywords": [ + "abc", + "kef", + "keh" + ], + "hardcover": true, + "publisher": "nVxJVA8Bwx", + "criticrating": 2 + }
  • 22. JSONB Indexes: GIN - ? Find all books that are available in braille? Let’s create the GIN index on the ‘data’ JSONB column: Working with JSON in PostgreSQL vs. MongoDB CREATE INDEX datagin ON books USING gin (data); demo=# select * from books where data ? 'braille'; id | author | isbn | rating | data ---------+-----------------+------------+--------+--------------------------------------------------------------------------------------------------------------------- --------------------------------- ------------------ 1000005 | XEI7xShT8bPu6H7 | 2kD5XJDZUF | 0 | {"tags": {"nk455671": {"ik937456": "iv506075"}}, "braille": true, "keywords": ["abc", "kef", "keh"], "hardcover": false, "publisher": "zSfZIAjGGs", " criticrating": 4} ..... demo=# explain analyze select * from books where data ? 'braille'; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on books (cost=12.75..1005.25 rows=1000 width=158) (actual time=0.033..0.039 rows=15 loops=1) Recheck Cond: (data ? 'braille'::text) Heap Blocks: exact=2 -> Bitmap Index Scan on datagin (cost=0.00..12.50 rows=1000 width=0) (actual time=0.022..0.022 rows=15 loops=1) Index Cond: (data ? 'braille'::text) Planning Time: 0.102 ms Execution Time: 0.067 ms (7 rows)
  • 23. JSONB Indexes: GIN - ? What if we wanted to find books that were in braille or in hardcover? Working with JSON in PostgreSQL vs. MongoDB demo=# explain analyze select * from books where data ?| array['braille','hardcover']; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on books (cost=16.75..1009.25 rows=1000 width=158) (actual time=0.029..0.035 rows=15 loops=1) Recheck Cond: (data ?| '{braille,hardcover}'::text[]) Heap Blocks: exact=2 -> Bitmap Index Scan on datagin (cost=0.00..16.50 rows=1000 width=0) (actual time=0.023..0.023 rows=15 loops=1) Index Cond: (data ?| '{braille,hardcover}'::text[]) Planning Time: 0.138 ms Execution Time: 0.057 ms (7 rows)
  • 24. JSONB Indexes: GIN GIN index supports the “existence” operators only on “top level” keys. If the key is not at the top level, then the index will not be used. Working with JSON in PostgreSQL vs. MongoDB demo=# select * from books where data->'tags' ? 'nk455671'; id | author | isbn | rating | data ---------+-----------------+------------+--------+--------------------------------------------------------------------------------------------------------------------- --------------------------------- ------------------ 1000005 | XEI7xShT8bPu6H7 | 2kD5XJDZUF | 0 | {"tags": {"nk455671": {"ik937456": "iv506075"}}, "braille": true, "keywords": ["abc", "kef", "keh"], "hardcover": false, "publisher": "zSfZIAjGGs", " criticrating": 4} 685122 | GWfuvKfQ1PCe1IL | jnyhYYcF66 | 3 | {"tags": {"nk455671": {"ik615925": "iv253423"}}, "publisher": "b2NwVg7VY3", "criticrating": 0} (2 rows) demo=# explain analyze select * from books where data->'tags' ? 'nk455671'; QUERY PLAN ---------------------------------------------------------------------------------------------------------- Seq Scan on books (cost=0.00..38807.29 rows=1000 width=158) (actual time=0.018..270.641 rows=2 loops=1) Filter: ((data -> 'tags'::text) ? 'nk455671'::text) Rows Removed by Filter: 1000017 Planning Time: 0.078 ms Execution Time: 270.728 ms (5 rows)
  • 25. JSONB Indexes: GIN The way to check for existence in nested docs is to use “Expression indexes”. Let’s create an index on data->tags: Working with JSON in PostgreSQL vs. MongoDB CREATE INDEX datatagsgin ON books USING gin (data->'tags'); demo=# select * from books where data->'tags' ? 'nk455671'; id | author | isbn | rating | data ---------+-----------------+------------+--------+----------------------------------------------------------------------------------------------------------- 1000005 | XEI7xShT8bPu6H7 | 2kD5XJDZUF | 0 | {"tags": {"nk455671": {"ik937456": "iv506075"}}, "braille": true, "keywords": ["abc", "kef", "keh"], "hardcover": false, "publisher": "zSfZIAjGGs", " criticrating": 4} 685122 | GWfuvKfQ1PCe1IL | jnyhYYcF66 | 3 | {"tags": {"nk455671": {"ik615925": "iv253423"}}, "publisher": "b2NwVg7VY3", "criticrating": 0} (2 rows) demo=# explain analyze select * from books where data->'tags' ? 'nk455671'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ Bitmap Heap Scan on books (cost=12.75..1007.75 rows=1000 width=158) (actual time=0.031..0.035 rows=2 loops=1) Recheck Cond: ((data ->'tags'::text) ? 'nk455671'::text) Heap Blocks: exact=2 -> Bitmap Index Scan on datatagsgin (cost=0.00..12.50 rows=1000 width=0) (actual time=0.021..0.021 rows=2 loops=1) Index Cond: ((data ->'tags'::text) ? 'nk455671'::text) Planning Time: 0.098 ms Execution Time: 0.061 ms (7 rows)
  • 26. JSONB Indexes: GIN - @> The “path” operator can be used for multi-level queries of your JSON data. Let’s use it similar to the ? operator. Working with JSON in PostgreSQL vs. MongoDB select * from books where data @> '{"braille":true}'::jsonb; demo=# explain analyze select * from books where data @> '{"braille":true}'::jsonb; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on books (cost=16.75..1009.25 rows=1000 width=158) (actual time=0.040..0.048 rows=6 loops=1) Recheck Cond: (data @> '{"braille": true}'::jsonb) Rows Removed by Index Recheck: 9 Heap Blocks: exact=2 -> Bitmap Index Scan on datagin (cost=0.00..16.50 rows=1000 width=0) (actual time=0.030..0.030 rows=15 loops=1) Index Cond: (data @> '{"braille": true}'::jsonb) Planning Time: 0.100 ms Execution Time: 0.076 ms (8 rows)
  • 27. JSONB Indexes: GIN - @> The "path" operator can be used for multi level queries of your JSON data. Working with JSON in PostgreSQL vs. MongoDB demo=# select * from books where data @> '{"publisher":"XlekfkLOtL"}'::jsonb; id | author | isbn | rating | data -----+-----------------+------------+--------+------------------------------------------------------------------------------------- 346 | uD3QOvHfJdxq2ez | KiAaIRu8QE | 1 | {"tags": {"nk88": {"ik37": "iv161"}}, "publisher": "XlekfkLOtL", "criticrating": 3} (1 row) demo=# explain analyze select * from books where data @> '{"publisher":"XlekfkLOtL"}'::jsonb; QUERY PLAN -------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on books (cost=16.75..1009.25 rows=1000 width=158) (actual time=0.491..0.492 rows=1 loops=1) Recheck Cond: (data @> '{"publisher": "XlekfkLOtL"}'::jsonb) Heap Blocks: exact=1 -> Bitmap Index Scan on datagin (cost=0.00..16.50 rows=1000 width=0) (actual time=0.092..0.092 rows=1 loops=1) Index Cond: (data @> '{"publisher": "XlekfkLOtL"}'::jsonb) Planning Time: 0.090 ms Execution Time: 0.523 ms
  • 28. JSONB Indexes: GIN - @> The JSON queries can be nested to many levels. You can also use the ># operation but GIN does not support it. Working with JSON in PostgreSQL vs. MongoDB demo=# select * from books where data @> '{"tags":{"nk455671":{"ik937456":"iv506075"}}}'::jsonb; id | author | isbn | rating | data ---------+-----------------+------------+--------+--------------------------------------------------------------------------------------------------------------------- --------------------------------- ------------------ 1000005 | XEI7xShT8bPu6H7 | 2kD5XJDZUF | 0 | {"tags": {"nk455671": {"ik937456": "iv506075"}}, "braille": true, "keywords": ["abc", "kef", "keh"], "hardcover": false, "publisher": "zSfZIAjGGs", " criticrating": 4} (1 row)
  • 29. JSONB Indexes: GIN - jsonb_pathops GIN also supports a “pathops” option to reduce the size of the GIN index. From the docs: “The technical difference between a jsonb_ops and a jsonb_path_ops GIN index is that the former creates independent index items for each key and value in the data, while the latter creates index items only for each value in the data.” On my small dataset of 1M books, you can see that the pathops GIN index is smaller – you should test with your dataset to understand the savings. Working with JSON in PostgreSQL vs. MongoDB CREATE INDEX dataginpathops ON books USING gin (data jsonb_path_ops); public | dataginpathops | index | sgpostgres | books | 67 MB | public | datatagsgin | index | sgpostgres | books | 84 MB |
  • 30. JSONB Indexes: GIN - jsonb_pathops Let’s rerun our query from before with the pathops index: Working with JSON in PostgreSQL vs. MongoDB demo=# select * from books where data @> '{"tags":{"nk455671":{"ik937456":"iv506075"}}}'::jsonb; id | author | isbn | rating | data ---------+-----------------+------------+--------+--------------------------------------------------------------------------------------------------------------------- --------------------------------- ------------------ 1000005 | XEI7xShT8bPu6H7 | 2kD5XJDZUF | 0 | {"tags": {"nk455671": {"ik937456": "iv506075"}}, "braille": true, "keywords": ["abc", "kef", "keh"], "hardcover": false, "publisher": "zSfZIAjGGs", " criticrating": 4} (1 row) demo=# explain select * from books where data @> '{"tags":{"nk455671":{"ik937456":"iv506075"}}}'::jsonb; QUERY PLAN ----------------------------------------------------------------------------------------- Bitmap Heap Scan on books (cost=12.75..1005.25 rows=1000 width=158) Recheck Cond: (data @> '{"tags": {"nk455671": {"ik937456": "iv506075"}}}'::jsonb) -> Bitmap Index Scan on dataginpathops (cost=0.00..12.50 rows=1000 width=0) Index Cond: (data @> '{"tags": {"nk455671": {"ik937456": "iv506075"}}}'::jsonb) (4 rows)
  • 31. JSONB Indexes: GIN - jsonb_pathops The “jsonb_pathops” option supports only the @> operator. Smaller index but more limited scenarios. The following queries below can no longer leverage the GIN index: Working with JSON in PostgreSQL vs. MongoDB select * from books where data ? 'tags'; => Sequential scan select * from books where data @> '{"tags" :{}}'; => Sequential scan select * from books where data @> '{"tags" :{"k7888":{}}}' => Sequential scan
  • 32. JSONB Indexes: B-tree • B-tree indexes are the most common index type in relational databases. • If you index an entire JSONB column with a B-tree index, the only useful operators are the comparison operators: • =, <, <=, >, >= • Can be used only for whole object comparisons. • Very limited use case. Working with JSON in PostgreSQL vs. MongoDB
  • 33. JSONB Indexes: B-tree • Use B-tree “Expression indexes” • B-tree expression indexes can support the common comparison operators '=', '<', '>', '>=', '<=‘ (which GIN doesn't support). • Retrieve all books with a data->criticrating > 4. Working with JSON in PostgreSQL vs. MongoDB demo=# select * from books where data->'criticrating' > 4; ERROR: operator does not exist: jsonb >= integer LINE 1: select * from books where data->'criticrating’ > 4; ^ HINT: No operator matches the given name and argument types. You might need to add explicit type casts. #Lets cast JSONB to integer demo=# select * from books where (data->'criticrating')::int4 > 4; #If you are using a version prior to pg11 you need to query as text and then cast demo=# select * from books where (data->>'criticrating')::int4 > 4;
  • 34. JSONB Indexes: B-tree For expression indexes, the index needs to be an exact match with the query expression: Working with JSON in PostgreSQL vs. MongoDB demo=# CREATE INDEX criticrating ON books USING BTREE (((data->'criticrating')::int4)); CREATE INDEX demo=# explain analyze select * from books where (data->'criticrating')::int4 = 3; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------- Index Scan using criticrating on books (cost=0.42..4626.93 rows=5000 width=158) (actual time=0.069..70.221 rows=199883 loops=1) Index Cond: (((data -> 'criticrating'::text))::integer = 3) Planning Time: 0.103 ms Execution Time: 79.019 ms (4 rows) demo=# explain analyze select * from books where (data->'criticrating')::int4 = 3; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------- Index Scan using criticrating on books (cost=0.42..4626.93 rows=5000 width=158) (actual time=0.069..70.221 rows=199883 loops=1) Index Cond: (((data -> 'criticrating'::text))::integer = 3) Planning Time: 0.103 ms Execution Time: 79.019 ms (4 rows) 1 From above we can see that the BTREE index is being used as expected.
  • 35. JSONB Indexes: HASH • If you are only interested in the "=" operator, then Hash indexes become interesting. • Hash indexes tend to be smaller than B-tree indexes. Working with JSON in PostgreSQL vs. MongoDB CREATE INDEX publisherhash ON books USING HASH ((data->'publisher')); demo=# select * from books where data->'publisher' = 'XlekfkLOtL' demo-# ; id | author | isbn | rating | data -----+-----------------+------------+--------+------------------------------------------------------------------------------------- 346 | uD3QOvHfJdxq2ez | KiAaIRu8QE | 1 | {"tags": {"nk88": {"ik37": "iv161"}}, "publisher": "XlekfkLOtL", "criticrating": 3} (1 row) demo=# explain analyze select * from books where data->'publisher' = 'XlekfkLOtL'; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------- Index Scan using publisherhash on books (cost=0.00..2.02 rows=1 width=158) (actual time=0.016..0.017 rows=1 loops=1) Index Cond: ((data -> 'publisher'::text) = 'XlekfkLOtL'::text) Planning Time: 0.080 ms Execution Time: 0.035 ms (4 rows)
  • 36. JSONB Indexes: GIN - Trigram • PostgreSQL supports string matching using Trigram indexes. • Trigrams are basically words broken up into sequences of 3 letters. • We can search for any arbitrary regex (not just left anchored). Working with JSON in PostgreSQL vs. MongoDB CREATE EXTENSION pg_trgm; CREATE INDEX publisher ON books USING GIN ((data->'publisher') gin_trgm_ops); demo=# select * from books where data->'publisher' LIKE '%I0UB%'; id | author | isbn | rating | data ----+-----------------+------------+--------+--------------------------------------------------------------------------------- 4 | KiEk3xjqvTpmZeS | EYqXO9Nwmm | 0 | {"tags": {"nk3": {"ik1": "iv1"}}, "publisher": "MI0UBqZJDt", "criticrating": 1} (1 row) demo=# explain analyze select * from books where data->'publisher' LIKE '%I0UB%'; QUERY PLAN -------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on books (cost=9.78..111.28 rows=100 width=158) (actual time=0.033..0.033 rows=1 loops=1) Recheck Cond: ((data -&gt; 'publisher'::text) ~~ '%I0UB%'::text) Heap Blocks: exact=1 -> Bitmap Index Scan on publisher (cost=0.00..9.75 rows=100 width=0) (actual time=0.025..0.025 rows=1 loops=1) Index Cond: ((data -&gt; 'publisher'::text) ~~ '%I0UB%'::text) Planning Time: 0.213 ms Execution Time: 0.058 ms (7 rows)
  • 37. JSONB Indexes: GIN - Arrays • GIN indexes are great for indexing arrays. • Indexing and searching the keyword array. Working with JSON in PostgreSQL vs. MongoDB CREATE INDEX keywords ON books USING GIN ((data->'keywords') jsonb_path_ops); demo=# select * from books where data->'keywords' @> '["abc", "keh"]'::jsonb; id | author | isbn | rating | data ---------+-----------------+------------+--------+--------------------------------------------------------------------------------------------------------------------- -------------- 1000003 | zEG406sLKQ2IU8O | viPdlu3DZm | 4 | {"tags": {"nk263020": {"ik203820": "iv817928"}}, "keywords": ["abc", "kef", "keh"], "publisher": "7NClevxuTM", "criticrating": 2} 1000004 | GCe9NypHYKDH4rD | so6TQDYzZ3 | 4 | {"tags": {"nk780341": {"ik397357": "iv632731"}}, "keywords": ["abc", "kef", "keh"], "publisher": "fqaJuAdjP5", "criticrating": 2} (2 rows) demo=# explain analyze select * from books where data->'keywords' @> '["abc", "keh"]'::jsonb; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on books (cost=54.75..1049.75 rows=1000 width=158) (actual time=0.026..0.028 rows=2 loops=1) Recheck Cond: ((data -> 'keywords'::text) @> '["abc", "keh"]'::jsonb) Heap Blocks: exact=1 -> Bitmap Index Scan on keywords (cost=0.00..54.50 rows=1000 width=0) (actual time=0.014..0.014 rows=2 loops=1) Index Cond: ((data -> 'keywords'::text) @&amp;amp;amp;amp;amp;gt; '["abc", "keh"]'::jsonb) Planning Time: 0.131 ms Execution Time: 0.063 ms (7 rows)
  • 38. SQL/JSON • SQL standard added support for JSON – SQL Standard-2016 (SQL/JSON). • SQL/JSON Data model • JSONPath • SQL/JSON functions • With PG12 release, PostgreSQL has one of the best implementations of SQL/JSON. Working with JSON in PostgreSQL vs. MongoDB
  • 39. SQL/JSON 2016 ● A sequence of SQL/JSON items, each item can be (recursively) any of: ○ SQL/JSON scalar — non-null value of SQL types: Unicode character string, numeric, Boolean or datetime. ○ SQL/JSON null, value that is distinct from any value of any SQL type (not the same as NULL). ○ SQL/JSON arrays, ordered list of zero or more SQL/JSON items — SQL/JSON element ○ SQL/JSON objects — unordered collections of zero or more SQL/JSON members. ■ (key, SQL/JSON item) Working with JSON in PostgreSQL vs. MongoDB
  • 40. JSONPath Working with JSON in PostgreSQL vs. MongoDB .key Returns an object member with the specified key [*] Wildcard array element accessor that returns all array elements .* Wildcard member accessor that returns the values of all members located at the top level of the current object .** Recursive wildcard member accessor that processes all levels of the JSON hierarchy of the current object and returns all the member values, regardless of their nesting level JSONPath allows you to specify an expression (using a syntax similar to the property access notation in Javascript) to query or project your JSON data.
  • 41. SQL/JSON Functions ● PG 12 provides several functions to use JSONPATH to query your JSON data ○ jsonb_path_exists - Checks whether JSON path returns any item for the specified JSON value ○ jsonb_path_match - Returns the result of JSON path predicate check for the specified JSON value. ○ jsonb_path_query - Gets all JSON items returned by JSON path for the specified JSON value. Working with JSON in PostgreSQL vs. MongoDB
  • 42. JSONPath Finding books by publisher? Working with JSON in PostgreSQL vs. MongoDB demo=# select * from books where data @@ '$.publisher == "ktjKEZ1tvq"'; id | author | isbn | rating | data ---------+-----------------+------------+--------+--------------------------------------------------------------------------------------------------------------------- ------------- 1000001 | 4RNsovI2haTgU7l | GwSoX67gLS | 2 | {"tags": {"nk542369": {"ik55240": "iv305393"}}, "keywords": ["abc", "def", "geh"], "publisher": "ktjKEZ1tvq", "criticrating": 0} (1 row) demo=# explain analyze select * from books where data @@ '$.publisher == "ktjKEZ1tvq"'; QUERY PLAN -------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on books (cost=21.75..1014.25 rows=1000 width=158) (actual time=0.123..0.124 rows=1 loops=1) Recheck Cond: (data @@ '($."publisher" == "ktjKEZ1tvq")'::jsonpath) Heap Blocks: exact=1 -&amp;amp;amp;amp;gt; Bitmap Index Scan on datagin (cost=0.00..21.50 rows=1000 width=0) (actual time=0.110..0.110 rows=1 loops=1) Index Cond: (data @@ '($."publisher" == "ktjKEZ1tvq")'::jsonpath) Planning Time: 0.137 ms Execution Time: 0.194 ms (7 rows)
  • 43. JSONPath Add a JSONPath filter: Working with JSON in PostgreSQL vs. MongoDB select * from books where jsonb_path_exists(data,'$.publisher ?(@ == "ktjKEZ1tvq")'); Build complicated filter expressions: select * from books where jsonb_path_exists(data, '$.prints[*] ?(@.style=="hc" && @.price == 100)'); Index support for JSONPath is very limited. demo=# explain analyze select * from books where jsonb_path_exists(data,'$.publisher ?(@ == "ktjKEZ1tvq")'); QUERY PLAN ------------------------------------------------------------------------------------------------------------ Seq Scan on books (cost=0.00..36307.24 rows=333340 width=158) (actual time=0.019..480.268 rows=1 loops=1) Filter: jsonb_path_exists(data, '$."publisher"?(@ == "ktjKEZ1tvq")'::jsonpath, '{}'::jsonb, false) Rows Removed by Filter: 1000028 Planning Time: 0.095 ms Execution Time: 480.348 ms (5 rows)
  • 44. JSONPath: Projection JSON Select the last element of the array Working with JSON in PostgreSQL vs. MongoDB demo=# select jsonb_path_query(data, '$.prints[$.size()]') from books where id = 1000029; jsonb_path_query ------------------------------ {"price": 50, "style": "pb"} (1 row) Select only the hardcover prints from the array demo=# select jsonb_path_query(data, '$.prints[*] ?(@.style=="hc")') from books where id = 1000029; jsonb_path_query ------------------------------- {"price": 100, "style": "hc"} (1 row) We can also chain the filters demo=# select jsonb_path_query(data, '$.prints[*] ?(@.style=="hc") ?(@.price ==100)') from books where id = 1000029; jsonb_path_query ------------------------------- {"price": 100, "style": "hc"} (1 row)
  • 45. Roadmap ● Improvements to the JSONPath implementation in PG13 ● Future Roadmap Working with JSON in PostgreSQL vs. MongoDB
  • 46. Questions? Working with JSON in PostgreSQL vs. MongoDB