text-builder-core-0.1: Internals of "text-builder"
Safe HaskellSafe-Inferred
LanguageHaskell2010

TextBuilderCore

Documentation

data TextBuilder

Specification of how to efficiently construct strict Text.

For this task it is much more efficient than Data.Text.Lazy.Builder.Builder and even the recently introduced Data.Text.Encoding.StrictTextBuilder.

Provides Semigroup and Monoid instances whose operations have O(1) complexity.
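As a minimal sketch of how the Semigroup and Monoid instances are meant to be used, the following composes small builders and runs the result once at the end. It uses only text, char and toText from this module; the names greeting and commaSeparated are hypothetical and exist only for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.List (intersperse)
import Data.Text (Text)
import TextBuilderCore (TextBuilder, char, text, toText)

-- | Hypothetical helper: combine three builders with (<>).
greeting :: Text -> TextBuilder
greeting name = text "Hello, " <> text name <> char '!'

-- | Hypothetical helper: fold a list of builders with the Monoid instance.
commaSeparated :: [Text] -> TextBuilder
commaSeparated = mconcat . intersperse (text ", ") . map text

-- | The builder is executed only here, producing a strict Text.
rendered :: Text
rendered = toText (greeting "world")
```

Composing first and calling toText once keeps all the work in a single pass over one array.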

Constructors

TextBuilder 

Fields

  • Int

    Estimated maximum size of the byte array to allocate.

    If the builder is empty, it must be 0. Otherwise it must be greater than or equal to the number of bytes to be written.

    Warning: Because "text" switched from UTF-16 to UTF-8 in version 2, Word16 is used as the byte when the "text" version is <2 and Word8 is used when it is >=2.

  • (forall s. MArray s -> Int -> ST s Int)

    Function that populates a preallocated byte array of the estimated maximum size specified above. It takes the offset at which to start writing and returns the offset after the last written element.

    Warning: The function must not write outside of the allocated array. Doing so can corrupt memory or crash the running application.

    Warning: Keep in mind that the array operates on Word8 values starting from text-2.0, but on Word16 values prior to that. This is because the "text" library switched from UTF-16 to UTF-8 in version 2. To deal with this you have the following options:

    1. Restrict the version of the "text" library in your package to >=2.
    2. Use helpers provided by this library, such as unsafeSeptets and unsafeReverseSeptets, which abstract over the differences in the underlying representation.
    3. Use CPP to conditionally compile your code for different versions of "text".
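The two fields can be illustrated with a hand-written builder. This is only a sketch: it assumes that MArray is the type from Data.Text.Array and that unsafeWrite from the same module is used for writing. The helper name asciiPair is hypothetical. Because it writes ASCII values through fromIntegral, the same code should fit both the Word8 and the Word16 representation without CPP.

```haskell
import qualified Data.Text as Text
import qualified Data.Text.Array as TextArray
import Data.Word (Word8)
import TextBuilderCore (TextBuilder (..), toText)

-- | Hypothetical helper: write exactly two ASCII code units.
-- The caller must pass values below 128.
asciiPair :: Word8 -> Word8 -> TextBuilder
asciiPair a b =
  TextBuilder
    2 -- maximum number of array elements this builder writes
    ( \array offset -> do
        -- Write relative to the given offset, never from 0,
        -- because earlier builders may already have filled the prefix.
        TextArray.unsafeWrite array offset (fromIntegral a)
        TextArray.unsafeWrite array (offset + 1) (fromIntegral b)
        -- Return the offset just past the last written element.
        pure (offset + 2)
    )

-- | Code units 104 and 105 are 'h' and 'i'.
hi :: Text.Text
hi = toText (asciiPair 104 105)
```

The size field and the number of writes have to agree: declaring 2 and writing 3 elements would violate the bounds warning above.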

Destructors

isEmpty :: TextBuilder -> Bool

Check whether the builder is empty.

toText :: TextBuilder -> Text

Execute the builder producing a strict text.
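A small sketch of the two destructors working together: check for emptiness first and only execute the builder when there is something to produce. The function renderMaybe is a hypothetical name, not part of this library.

```haskell
import qualified Data.Text as Text
import TextBuilderCore (TextBuilder, char, isEmpty, toText)

-- | Hypothetical helper: Nothing for an empty builder,
-- otherwise the strict Text it produces.
renderMaybe :: TextBuilder -> Maybe Text.Text
renderMaybe builder
  | isEmpty builder = Nothing
  | otherwise = Just (toText builder)

-- | A non-empty builder for demonstration.
single :: Maybe Text.Text
single = renderMaybe (char 'a')
```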

Constructors

Text

text :: Text -> TextBuilder

Strict text.

lazyText :: Text -> TextBuilder

Lazy text.

Character

char :: Char -> TextBuilder

Unicode character.

unicodeCodepoint :: Int -> TextBuilder

Safe Unicode codepoint with invalid values replaced by the char (codepoint 0xfffd), which is the same as what Data.Text.pack does.
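The replacement behaviour can be sketched as follows. The helper renderCodepoint is a hypothetical name. The surrogate 0xD800 is used as the invalid input on the assumption that it counts as invalid here, as it does for Data.Text.pack.

```haskell
import qualified Data.Text as Text
import TextBuilderCore (toText, unicodeCodepoint)

-- | Hypothetical helper: render a codepoint given as an Int.
renderCodepoint :: Int -> Text.Text
renderCodepoint = toText . unicodeCodepoint

-- | 0x41 is a valid codepoint, the letter A.
latinA :: Text.Text
latinA = renderCodepoint 0x41

-- | A lone surrogate is not a valid scalar value, so per the
-- description above it should come out as U+FFFD.
loneSurrogate :: Text.Text
loneSurrogate = renderCodepoint 0xD800
```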

Primitives

unsafeSeptets

Arguments

:: Int

Maximum size of the byte array to allocate.

Must be greater than or equal to the length of the list.

Warning: If it is smaller, the builder will write outside of the allocated array, which can corrupt memory or crash the program.

-> [Word8]

List of bytes to write.

Warning: It is your responsibility to ensure that the bytes are smaller than 128. Otherwise the produced text will have a broken encoding.

To ensure that the optimization kicks in, it is advised to construct the list using build.

-> TextBuilder 

Provides a unified way to deal with the byte array regardless of the version of the text library.

Keep in mind that prior to text-2.0, the array was operating on Word16 values due to the library abstracting over UTF-16. Starting from text-2.0, the array operates on Word8 values and the library abstracts over UTF-8.

This function is useful for building ASCII values.

>>> unsafeSeptets 3 (fmap (+48) [1, 2, 3])
"123"
>>> unsafeSeptets 4 (fmap (+48) [1, 2, 3])
"123"
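As a sketch of the advice to construct the list with build, the following renders a byte as two lowercase hexadecimal digits. It assumes that build refers to GHC.Exts.build; the names hexByte and hexDigit are hypothetical.

```haskell
import Data.Bits (shiftR, (.&.))
import qualified Data.Text as Text
import Data.Word (Word8)
import GHC.Exts (build)
import TextBuilderCore (TextBuilder, toText, unsafeSeptets)

-- | Hypothetical helper: map a nibble (0..15) to its ASCII hex digit.
-- 48 is '0'; 87 + 10 = 97 is 'a'. The result is always below 128.
hexDigit :: Word8 -> Word8
hexDigit nibble
  | nibble < 10 = 48 + nibble
  | otherwise = 87 + nibble

-- | Hypothetical helper: two septets, high nibble first.
-- The size argument 2 matches the length of the list exactly.
hexByte :: Word8 -> TextBuilder
hexByte w =
  unsafeSeptets
    2
    ( build
        ( \cons nil ->
            cons (hexDigit (w `shiftR` 4)) (cons (hexDigit (w .&. 15)) nil)
        )
    )

-- | 0xAF has nibbles 10 and 15, that is 'a' and 'f'.
example :: Text.Text
example = toText (hexByte 0xAF)
```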

unsafeReverseSeptets

Arguments

:: Int

Precise number of bytes in the list.

Needs to be precise, because writing happens in reverse order.

Warning: If it is smaller, the builder will write outside of the allocated array, which can corrupt memory or crash the program.

-> [Word8]

List of bytes to write in reverse order.

Warning: It is your responsibility to ensure that the bytes are smaller than 128. Otherwise the produced text will have a broken encoding.

To ensure that the optimization kicks in, it is advised to construct the list using build.

-> TextBuilder 

Same as unsafeSeptets, but writes the bytes in reverse order and requires the size to be precise.

>>> unsafeReverseSeptets 3 (fmap (+48) [1, 2, 3])
"321"
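Reverse writing suits decimal rendering, where digits are naturally produced from the least significant end. The sketch below assumes the behaviour shown in the example above (the first list element ends up last in the text). The names decimal and digitsLeastFirst are hypothetical. For simplicity it builds an ordinary list and measures it with length, so it does not follow the advice about build.

```haskell
import qualified Data.Text as Text
import Data.Word (Word8)
import TextBuilderCore (TextBuilder, toText, unsafeReverseSeptets)

-- | Hypothetical helper: ASCII digits of a number, least significant first.
-- Always yields at least one digit, so 0 becomes [48].
digitsLeastFirst :: Word -> [Word8]
digitsLeastFirst n =
  let (q, r) = n `quotRem` 10
      digit = 48 + fromIntegral r -- 48 is '0'
   in digit : if q == 0 then [] else digitsLeastFirst q

-- | Hypothetical helper: decimal rendering of a non-negative number.
-- The size passed is the exact list length, as required.
decimal :: Word -> TextBuilder
decimal n = unsafeReverseSeptets (length digits) digits
  where
    digits = digitsLeastFirst n

-- | 123 yields digits [51, 50, 49], which are written in reverse.
example :: Text.Text
example = toText (decimal 123)
```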