MySQL 9.4.0
Source Code Documentation
compressor.h
1/* Copyright (c) 2019, 2025, Oracle and/or its affiliates.
2
3 This program is free software; you can redistribute it and/or modify
4 it under the terms of the GNU General Public License, version 2.0,
5 as published by the Free Software Foundation.
6
7 This program is designed to work with certain software (including
8 but not limited to OpenSSL) that is licensed under separate terms,
9 as designated in a particular file or component or in included license
10 documentation. The authors of MySQL hereby grant you an additional
11 permission to link the program and your derivative works with the
12 separately licensed software that they have either included with
13 the program or referenced in the documentation.
14
15 This program is distributed in the hope that it will be useful,
16 but WITHOUT ANY WARRANTY; without even the implied warranty of
17 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
18 GNU General Public License, version 2.0, for more details.
19
20 You should have received a copy of the GNU General Public License
21 along with this program; if not, write to the Free Software
22 Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
23
24#ifndef MYSQL_BINLOG_EVENT_COMPRESSION_COMPRESSOR_H
25#define MYSQL_BINLOG_EVENT_COMPRESSION_COMPRESSOR_H
26
27#include <cstddef>
28#include <tuple>
30#include "mysql/containers/buffers/grow_constraint.h" // Grow_constraint
31#include "mysql/containers/buffers/managed_buffer_sequence.h" // Managed_buffer_sequence
32
33#include <limits> // std::numeric_limits
34
36
37namespace mysql::binlog::event::compression {
38
39/// Abstract base class for compressors.
40///
41/// Each subclass normally corresponds to a compression algorithm, and
42/// maintains the algorithm-specific state for it.
43///
44/// An instance of this class can be reused to compress several
45/// *frames*. A frame is a self-contained segment of data, in the
46/// sense that it can be decompressed without knowing about other
47/// frames, and compression does not take advantage of patterns that
48/// repeat between frames.
49///
50/// Input for a frame can be provided in pieces. All pieces for a
51/// frame will be compressed together; the compressor can take
52/// advantage of patterns that repeat across different pieces within the frame.
53/// Providing a frame in pieces is useful when not all input is known
54/// at once.
55///
56/// To compress one frame, use the API as follows:
57///
58/// 1. Repeat as many times as needed:
59///    1.1. Call @c feed to provide a piece of input.
60///    1.2. Call @c compress to consume the piece of input and possibly
61///         produce a prefix of the output.
62/// 2. Choose one of the following:
63///    2.1. Call @c finish to produce the remainder of the output for this
64///         frame.
65///    2.2. Call @c reset to abort this frame.
66///
67/// @note After 1.2, although the compression library has read all
68/// input given so far, it may not have produced all corresponding
69/// output. It usually holds some data in internal buffers, since it
70/// may be more compressible when more data has been given. Therefore,
71/// step 2.1 is always necessary in order to complete the frame.
72///
73/// @note To reuse the compressor object for another input, repeat the
74/// above procedure as many times as needed.
75///
76/// This class requires that the user provides a @c
77/// mysql::containers::buffers::Managed_buffer_sequence to
78/// store output.
79class Compressor {
80 public:
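82 using Managed_buffer_sequence_t = mysql::containers::buffers::Managed_buffer_sequence<>;
83 using Char_t = Managed_buffer_sequence_t::Char_t;
84 using Size_t = Managed_buffer_sequence_t::Size_t;
85 using Grow_constraint_t = mysql::containers::buffers::Grow_constraint;
88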
89 Compressor() = default;
90 Compressor(const Compressor &other) = delete;
91 Compressor(Compressor &&other) = delete;
92 Compressor &operator=(const Compressor &other) = delete;
93 Compressor &operator=(Compressor &&other) = delete;
94
95 virtual ~Compressor() = default;
96
97 /// @return the compression type.
98 type get_type_code() const;
99
100 /// Reset the frame.
101 ///
102 /// This cancels the current frame and starts a new one.
103 ///
104 /// This is allowed but unnecessary if the current frame has been
105 /// reset by @c finish or by an out_of_memory error from @c
106 /// compress.
107 void reset();
108
109 /// Submit data to be compressed.
110 ///
111 /// This will not consume any of the input; it should be followed by
112 /// a call to @c compress or @c finish.
113 ///
114 /// @note This object will not copy the input; the caller must
115 /// ensure that the input lives until it has been consumed or the
116 /// frame has been reset.
117 ///
118 /// @note Must not be called when there is still non-consumed input
119 /// left after a previous call to @c feed.
120 ///
121 /// @param input_data Data to be compressed. This object will keep a
122 /// shallow copy of the data and use it in subsequent calls to @c
123 /// compress or @c finish.
124 ///
125 /// @param input_size Size of data to be compressed.
126 template <class Input_char_t>
127 void feed(const Input_char_t *input_data, Size_t input_size) {
128 feed_char_t(reinterpret_cast<const Char_t *>(input_data), input_size);
129 }
130
131 /// Consume all input previously given in the feed function.
132 ///
133 /// This will consume the input, but may not produce all output;
134 /// there may be output still in compression library buffers. Use
135 /// the @c finish function to flush the output and end the frame.
136 ///
137 /// @param out Storage for compressed bytes. This may grow, if
138 /// needed.
139 ///
140 /// @retval success All input was consumed.
141 ///
142 /// @retval out_of_memory The operation failed due to an out of
143 /// memory error. The frame has been reset.
144 ///
145 /// @retval exceeds_max_size The @c out buffer was already at its
146 /// max capacity, and filled, and there were more bytes left to
147 /// produce. The frame has not been reset and it is not guaranteed
148 /// that all input has been consumed. The caller may resume
149 /// compression e.g. after increasing the capacity, or resetting
150 /// the output buffer (perhaps after moving existing data
151 /// elsewhere), or using a different output buffer, or similar.
152 [[nodiscard]] Compress_status compress(Managed_buffer_sequence_t &out);
153
154 /// Consume all input, produce all output, and end the frame.
155 ///
156 /// This will consume all input previously given by @c feed (it
157 /// internally calls @c compress). Then it ends the frame and
158 /// flushes the output, ensuring that all data that may reside in
159 /// the compression library's internal buffers gets compressed and
160 /// written to the output.
161 ///
162 /// The next call to @c feed will start a new frame.
163 ///
164 /// @param out Storage for compressed bytes. This may grow, if
165 /// needed.
166 ///
167 /// @retval success All input was consumed, all output was produced,
168 /// and the frame was reset.
169 ///
170 /// @retval out_of_memory The operation failed due to an out of
171 /// memory error, and the frame has been reset.
172 ///
173 /// @retval exceeds_max_size The @c out buffer was already at its
174 /// max capacity, and filled, and there were more bytes left to
175 /// produce. The frame has not been reset and it is not guaranteed
176 /// that all input has been consumed. The caller may resume
177 /// compression e.g. after increasing the capacity, or resetting
178 /// the output buffer (perhaps after moving existing data
179 /// elsewhere), or using a different output buffer, or similar.
180 [[nodiscard]] Compress_status finish(Managed_buffer_sequence_t &out);
181
182 /// Return a `Grow_constraint` that may be used with the
183 /// Managed_buffer_sequence storing the output, in order to
184 /// optimize memory usage for a particular compression algorithm.
185 ///
186 /// This may be implemented by subclasses such that it depends on
187 /// the pledged input size. Therefore, for the best grow
188 /// constraint, call this after set_pledged_input_size.
189 Grow_constraint_t get_grow_constraint_hint() const;
190
191 /// Declare that the input size will be exactly as given.
192 ///
193 /// This may allow compressors and decompressors to use memory more
194 /// efficiently.
195 ///
196 /// This function may only be called if `feed` has never been
197 /// called, or if the compressor has been reset since the last call
198 /// to `feed`. The pledged size will be set back to
199 /// pledged_input_size_unset next time this compressor is reset.
200 ///
201 /// It is required that the total number of bytes passed to `feed`
202 /// before the call to `finish` matches the pledged number.
203 /// Otherwise, the behavior of `finish` is undefined.
204 void set_pledged_input_size(Size_t size);
205
206 /// Return the size previously provided to `set_pledged_input_size`,
207 /// or `pledged_input_size_unset` if no pledged size has been set.
208 Size_t get_pledged_input_size() const;
209
210 private:
211 /// Worker function for @c feed, requiring the correct Char_t type.
212 ///
213 /// @see feed.
214 void feed_char_t(const Char_t *input_data, Size_t input_size);
215
216 /// Implement @c get_type_code.
217 virtual type do_get_type_code() const = 0;
218
219 /// Implement @c reset.
220 virtual void do_reset() = 0;
221
222 /// Implement @c feed.
223 ///
224 /// Unlike @c do_compress and @c do_finish, this returns no status;
225 /// it only stores the input for a later @c do_compress call.
226 virtual void do_feed(const Char_t *input_data, Size_t input_size) = 0;
227
228 /// Implement @c compress.
229 ///
230 /// This differs from @c compress in that it does not have to reset
231 /// the frame when returning out_of_memory; the caller does that.
232 [[nodiscard]] virtual Compress_status do_compress(
233 Managed_buffer_sequence_t &out) = 0;
234
235 /// Implement @c finish.
236 ///
237 /// This differs from @c finish in that it does not have to reset
238 /// the frame when returning out_of_memory; the caller does that.
239 ///
240 /// Implementations may assume that @c compress has been called,
241 /// since @c finish does that.
242 [[nodiscard]] virtual Compress_status do_finish(
243 Managed_buffer_sequence_t &out) = 0;
244
245 /// Implement @c get_grow_constraint_hint.
246 ///
247 /// In this base class, the function returns a default-constructed
248 /// Grow_constraint, i.e., one which does not limit the
249 /// Grow_calculator.
250 virtual Grow_constraint_t do_get_grow_constraint_hint() const;
251
252 /// Implement @c set_pledged_input_size.
253 ///
254 /// By default, this does nothing.
255 virtual void do_set_pledged_input_size([[maybe_unused]] Size_t size);
256
257 /// True when user has provided input that has not yet been consumed.
258 bool m_pending_input = false;
259
260 /// True when user has not provided any input since the last reset.
261 bool m_empty = true;
262
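263 /// The pledged input size in bytes, or pledged_input_size_unset.
264 Size_t m_pledged_input_size = pledged_input_size_unset;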
265};
266
267} // namespace mysql::binlog::event::compression
268
269#endif // MYSQL_BINLOG_EVENT_COMPRESSION_COMPRESSOR_H
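
The usage procedure in the class comment (feed, compress, then finish) can be illustrated with a small stand-in that does not depend on the MySQL headers. This is a minimal sketch under stated assumptions: Toy_compressor, Rle_compressor, Toy_status, and compress_two_pieces are hypothetical names, a std::string takes the place of Managed_buffer_sequence, and the bookkeeping in the public functions is a guess at how such a base class might behave, not a copy of compressor.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for Compress_status.
enum class Toy_status { success, out_of_memory, exceeds_max_size };

// Toy base class: public non-virtual functions track frame state and
// delegate the algorithm-specific work to private virtual do_* functions.
class Toy_compressor {
 public:
  virtual ~Toy_compressor() = default;

  void feed(const char *data, std::size_t size) {
    assert(!m_pending_input);  // the previous piece must be consumed first
    do_feed(data, size);
    m_pending_input = true;
  }

  Toy_status compress(std::string &out) {
    Toy_status status = do_compress(out);
    if (status == Toy_status::success) m_pending_input = false;
    if (status == Toy_status::out_of_memory) reset();
    return status;
  }

  Toy_status finish(std::string &out) {
    Toy_status status = compress(out);  // finish consumes pending input
    if (status != Toy_status::success) return status;
    status = do_finish(out);
    if (status != Toy_status::exceeds_max_size) reset();
    return status;
  }

  void reset() {
    do_reset();
    m_pending_input = false;
  }

 private:
  virtual void do_feed(const char *data, std::size_t size) = 0;
  virtual Toy_status do_compress(std::string &out) = 0;
  virtual Toy_status do_finish(std::string &out) = 0;
  virtual void do_reset() = 0;

  bool m_pending_input = false;
};

// Toy run-length encoder emitting <count><byte> pairs, count in 1..9.
// The current run stays buffered until a different byte arrives or the
// frame is finished, which is why finish is needed to complete a frame.
class Rle_compressor : public Toy_compressor {
  void do_feed(const char *data, std::size_t size) override {
    m_input = data;  // shallow copy: the caller keeps the data alive
    m_size = size;
  }
  Toy_status do_compress(std::string &out) override {
    for (std::size_t i = 0; i < m_size; ++i) {
      if (m_run_length > 0 && m_run_length < 9 && m_input[i] == m_run_byte) {
        ++m_run_length;
      } else {
        flush_run(out);
        m_run_byte = m_input[i];
        m_run_length = 1;
      }
    }
    m_size = 0;  // all input consumed
    return Toy_status::success;
  }
  Toy_status do_finish(std::string &out) override {
    flush_run(out);
    return Toy_status::success;
  }
  void do_reset() override {
    m_size = 0;
    m_run_length = 0;
  }
  void flush_run(std::string &out) {
    if (m_run_length == 0) return;
    out += static_cast<char>('0' + m_run_length);
    out += m_run_byte;
    m_run_length = 0;
  }

  const char *m_input = nullptr;
  std::size_t m_size = 0;
  char m_run_byte = 0;
  int m_run_length = 0;
};

// Compress one frame supplied as two pieces; returns the encoded frame.
std::string compress_two_pieces() {
  Rle_compressor compressor;
  std::string out;
  compressor.feed("aaab", 4);
  if (compressor.compress(out) != Toy_status::success) return {};
  compressor.feed("bbc", 3);
  if (compressor.compress(out) != Toy_status::success) return {};
  if (compressor.finish(out) != Toy_status::success) return {};
  return out;
}
```

With the pieces "aaab" and "bbc", the run of b spans both pieces and is encoded once, and the trailing c is emitted only by finish, so compress_two_pieces returns "3a3b1c".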
Abstract base class for compressors.
Definition: compressor.h:79
Size_t m_pledged_input_size
The pledged input size in bytes, or pledged_input_size_unset.
Definition: compressor.h:264
void reset()
Reset the frame.
Definition: compressor.cpp:30
virtual Compress_status do_compress(Managed_buffer_sequence_t &out)=0
Implement compress.
Grow_constraint_t get_grow_constraint_hint() const
Return a Grow_constraint that may be used with the Managed_buffer_sequence storing the output,...
Definition: compressor.cpp:71
Managed_buffer_sequence_t::Size_t Size_t
Definition: compressor.h:84
void set_pledged_input_size(Size_t size)
Declare that the input size will be exactly as given.
Definition: compressor.cpp:79
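
The contract documented for set_pledged_input_size can be sketched as a small bookkeeping class. This is an illustration under assumptions, not the MySQL implementation: Pledge_tracker and pledge_satisfied are hypothetical, feed takes only a byte count, the setter reports a rejected call through a bool return value, and the sentinel chosen for pledged_input_size_unset (the maximum std::size_t) is a guess, since the listing does not show the value.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical bookkeeping for a pledged input size.
class Pledge_tracker {
 public:
  using Size_t = std::size_t;

  // Assumed sentinel meaning "no pledge has been made".
  static constexpr Size_t pledged_input_size_unset =
      std::numeric_limits<Size_t>::max();

  // Accepted only while no input has been fed since the last reset.
  bool set_pledged_input_size(Size_t size) {
    if (!m_empty) return false;
    m_pledged = size;
    return true;
  }

  Size_t get_pledged_input_size() const { return m_pledged; }

  // Record that `size` more bytes were provided for the current frame.
  void feed(Size_t size) {
    m_fed += size;
    m_empty = false;
  }

  // True if finishing now would honor the pledge, or if none was made.
  bool pledge_satisfied() const {
    return m_pledged == pledged_input_size_unset || m_fed == m_pledged;
  }

  // Resetting the frame also clears the pledge.
  void reset() {
    m_fed = 0;
    m_empty = true;
    m_pledged = pledged_input_size_unset;
  }

 private:
  Size_t m_pledged = pledged_input_size_unset;
  Size_t m_fed = 0;
  bool m_empty = true;
};
```

A caller following this sketch would pledge the size first, feed exactly that many bytes, and check pledge_satisfied before finishing the frame.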
virtual Grow_constraint_t do_get_grow_constraint_hint() const
Implement get_grow_constraint_hint.
Definition: compressor.cpp:75
virtual type do_get_type_code() const =0
Implement get_type_code.
Compressor(const Compressor &other)=delete
virtual Compress_status do_finish(Managed_buffer_sequence_t &out)=0
Implement finish.
Compress_status finish(Managed_buffer_sequence_t &out)
Consume all input, produce all output, and end the frame.
Definition: compressor.cpp:56
Compressor & operator=(const Compressor &other)=delete
Size_t get_pledged_input_size() const
Return the size previously provided to set_pledged_input_size, or pledged_input_size_unset if no pled...
Definition: compressor.cpp:85
void feed_char_t(const Char_t *input_data, Size_t input_size)
Worker function for feed, requiring the correct Char_t type.
Definition: compressor.cpp:38
Compressor & operator=(Compressor &&other)=delete
bool m_empty
True when user has not provided any input since the last reset.
Definition: compressor.h:261
bool m_pending_input
True when user has provided input that has not yet been consumed.
Definition: compressor.h:258
type get_type_code() const
Definition: compressor.cpp:28
virtual void do_reset()=0
Implement reset.
Managed_buffer_sequence_t::Char_t Char_t
Definition: compressor.h:83
virtual void do_feed(const Char_t *input_data, Size_t input_size)=0
Implement feed.
mysql::containers::buffers::Grow_constraint Grow_constraint_t
Definition: compressor.h:85
mysql::containers::buffers::Managed_buffer_sequence<> Managed_buffer_sequence_t
Definition: compressor.h:82
Compress_status compress(Managed_buffer_sequence_t &out)
Consume all input previously given in the feed function.
Definition: compressor.cpp:47
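
One way to react to exceeds_max_size, as the documentation of compress suggests, is to move the produced output elsewhere, empty the output buffer, and call compress again. The following is a minimal sketch under assumptions: Bounded_output, Resumable_copier, Copy_status, and compress_with_draining are hypothetical, and the toy "compressor" copies bytes unchanged so that the resume logic stays visible.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical status type with only the two outcomes used here.
enum class Copy_status { success, exceeds_max_size };

// Hypothetical output buffer with a hard capacity limit.
struct Bounded_output {
  std::string data;
  std::size_t max_size;
  std::size_t free_space() const { return max_size - data.size(); }
};

// Toy "compressor" that copies input unchanged. It remembers how much of
// the current piece has been consumed, so an interrupted call can resume.
class Resumable_copier {
 public:
  void feed(const char *input, std::size_t size) {
    m_input = input;
    m_size = size;
    m_pos = 0;
  }

  Copy_status compress(Bounded_output &out) {
    while (m_pos < m_size) {
      // Output is full and more bytes remain: report it, keep the frame.
      if (out.free_space() == 0) return Copy_status::exceeds_max_size;
      out.data += m_input[m_pos++];
    }
    return Copy_status::success;
  }

 private:
  const char *m_input = nullptr;
  std::size_t m_size = 0;
  std::size_t m_pos = 0;
};

// Drive compress to completion. On exceeds_max_size, move the filled
// output into `archive`, empty the buffer, and resume.
std::string compress_with_draining(const std::string &input,
                                   std::size_t capacity) {
  assert(capacity > 0);  // a zero-capacity buffer could never make progress
  Resumable_copier copier;
  Bounded_output out{"", capacity};
  std::string archive;
  copier.feed(input.data(), input.size());
  while (copier.compress(out) == Copy_status::exceeds_max_size) {
    archive += out.data;
    out.data.clear();
  }
  archive += out.data;
  return archive;
}
```

The frame is not reset between attempts, so each retry continues from the first unconsumed byte instead of starting over.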
static constexpr Size_t pledged_input_size_unset
Definition: compressor.h:86
void feed(const Input_char_t *input_data, Size_t input_size)
Submit data to be compressed.
Definition: compressor.h:127
virtual void do_set_pledged_input_size(Size_t size)
Implement set_pledged_input_size.
Definition: compressor.cpp:89
Description of a heuristic to determine how much memory to allocate.
Definition: grow_constraint.h:64
Owned, non-contiguous, growable memory buffer.
Definition: managed_buffer_sequence.h:114
typename Buffer_sequence_view_t::Size_t Size_t
Definition: rw_buffer_sequence.h:110
typename Buffer_sequence_view_t::Char_t Char_t
Definition: rw_buffer_sequence.h:109
Container class that provides a sequence of buffers to the caller.
Grow_status
Error statuses for classes that use Grow_calculator.
Definition: grow_status.h:38