Re: Shrinking TSvectors - Mailing list pgsql-general
From | Artur Zakirov |
---|---|
Subject | Re: Shrinking TSvectors |
Date | |
Msg-id | [email protected] |
In response to | Shrinking TSvectors (Howard News <[email protected]>) |
Responses | Re: Shrinking TSvectors |
List | pgsql-general |
On 05.04.2016 14:37, Howard News wrote:
> Hi,
>
> does anyone have any pointers for shrinking tsvectors
>
> I have looked at the contents of some of these fields and they contain
> many details that are not needed. For example...
>
> "'+1':935,942 '-0500':72 '-0578':932 '-0667':938 '-266':937 '-873':944
> '-9972':945 '/partners/application.html':222
> '/partners/program/program-agreement.pdf':271
> '/partners/reseller.html':181,1073 '01756':50,1083 '07767':54,1087
> '1':753,771 '12':366 '14':66 (...)"
>
> I am not interested in keeping the numbers or urls in the indexes.
>
> Thanks,
>
> Howard.
>

Hello,

You need to create a new text search configuration. Here is an example of the commands:

    CREATE TEXT SEARCH CONFIGURATION public.english_cfg (
        PARSER = default
    );
    ALTER TEXT SEARCH CONFIGURATION public.english_cfg
        ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                          word, hword, hword_part
        WITH pg_catalog.english_stem;

Only the word-like token types are mapped to a dictionary here, so tokens of any other type (numbers, URLs, file paths and so on) are discarded. Instead of "pg_catalog.english_stem" you can use your own dictionary.
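To see which token types produce the unwanted entries, and as one possible alternative way to arrive at a similar configuration, something along these lines might help. This is only a sketch: the name public.english_nonum is hypothetical and does not appear in the thread, and the token type names are those of the default parser.

```sql
-- Show how the default parser classifies each token
-- (alias is the token type, dictionaries is its current mapping).
SELECT alias, token, dictionaries
FROM ts_debug('english', 'home -9972 /partners/application.html');

-- Assumption: public.english_nonum is a hypothetical name for illustration.
-- Start from a copy of the built-in English configuration ...
CREATE TEXT SEARCH CONFIGURATION public.english_nonum (
    COPY = pg_catalog.english
);

-- ... and drop the mappings for numeric, URL-like and similar token types.
-- Tokens whose type has no mapping are not indexed.
ALTER TEXT SEARCH CONFIGURATION public.english_nonum
    DROP MAPPING FOR int, uint, float, sfloat, version,
                     url, url_path, host, file, email,
                     numword, numhword, hword_numpart;
```

Copying and then dropping mappings keeps any future changes to the choice of token types explicit: each removed type is listed by name, and a type can be restored later with ADD MAPPING if it turns out to be needed.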
Let's compare the new configuration with the built-in configuration "pg_catalog.english":

    postgres=# select to_tsvector('english_cfg', 'home -9972 /partners/application.html /partners/program/program-agreement.pdf');
     to_tsvector
    -------------
     'home':1
    (1 row)

    postgres=# select to_tsvector('english', 'home -9972 /partners/application.html /partners/program/program-agreement.pdf');
                                              to_tsvector
    -----------------------------------------------------------------------------------------------
     '-9972':2 '/partners/application.html':3 '/partners/program/program-agreement.pdf':4 'home':1
    (1 row)

You can get additional information about the configurations using \dF+:

    postgres=# \dF+ english
    Text search configuration "pg_catalog.english"
    Parser: "pg_catalog.default"
          Token      | Dictionaries
    -----------------+--------------
     asciihword      | english_stem
     asciiword       | english_stem
     email           | simple
     file            | simple
     float           | simple
     host            | simple
     hword           | english_stem
     hword_asciipart | english_stem
     hword_numpart   | simple
     hword_part      | english_stem
     int             | simple
     numhword        | simple
     numword         | simple
     sfloat          | simple
     uint            | simple
     url             | simple
     url_path        | simple
     version         | simple
     word            | english_stem

    postgres=# \dF+ english_cfg
    Text search configuration "public.english_cfg"
    Parser: "pg_catalog.default"
          Token      | Dictionaries
    -----------------+--------------
     asciihword      | english_stem
     asciiword       | english_stem
     hword           | english_stem
     hword_asciipart | english_stem
     hword_part      | english_stem
     word            | english_stem

-- 
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
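Applying the reduced configuration to existing data could look roughly like the following. This is a sketch under stated assumptions: the table "documents", its columns "id", "body" and "tsv", and the index name "documents_tsv_idx" are all hypothetical and do not come from the thread.

```sql
-- Assumptions: table "documents" with columns id, body (text) and
-- tsv (tsvector), plus an index documents_tsv_idx on tsv, are hypothetical.

-- Recompute the stored tsvectors with the reduced configuration.
UPDATE documents
SET tsv = to_tsvector('public.english_cfg', body);

-- Optionally rebuild the index so it no longer carries the old entries.
REINDEX INDEX documents_tsv_idx;

-- Check the average stored size of the tsvector column;
-- run this before and after the UPDATE to compare.
SELECT avg(pg_column_size(tsv)) AS avg_tsv_bytes
FROM documents;

-- Queries should normally be parsed with the same configuration.
SELECT id
FROM documents
WHERE tsv @@ to_tsquery('public.english_cfg', 'home');
```

If the tsvector column is maintained by a trigger or computed in an expression index, that definition would presumably need to reference the new configuration as well.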