Description
Every time one calls `captures`, a new allocation for storing the location of captures is created. This allocation has size proportional to the number of captures in the regex.
This is also true for `captures_iter`, where every iteration results in a new allocation. An iterator could reuse the allocation in theory, but ownership of the captures is transferred to the caller. Even if we could reuse the capture locations, we couldn't give the caller a mutable borrow, since that immediately puts us in the "streaming iterator" conundrum.
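To make the "streaming iterator" problem concrete, here is a minimal sketch using a hypothetical `ReusingSplitter` type (it is not part of the regex crate and splits on spaces rather than matching a regex). Its `next_slots` method reuses one internal buffer and lends it out, so the returned slice borrows from `self`. That signature cannot be written as `Iterator::next`, because `Iterator::Item` has no way to name the lifetime of the `&mut self` borrow.

```rust
// Hypothetical type for illustration only; not part of the regex crate.
struct ReusingSplitter<'t> {
    text: &'t str,
    pos: usize,
    // Reused on every call: [start, end) byte offsets of the current word.
    slots: Vec<usize>,
}

impl<'t> ReusingSplitter<'t> {
    fn new(text: &'t str) -> Self {
        ReusingSplitter { text, pos: 0, slots: Vec::with_capacity(2) }
    }

    // "Lending" next: the result borrows from `self`, so the caller must
    // let go of it before asking for the next item. `Iterator::next`
    // cannot express this, since `Item` cannot refer to that borrow.
    fn next_slots(&mut self) -> Option<&[usize]> {
        let bytes = self.text.as_bytes();
        // Skip separators.
        while self.pos < bytes.len() && bytes[self.pos] == b' ' {
            self.pos += 1;
        }
        if self.pos >= bytes.len() {
            return None;
        }
        let start = self.pos;
        while self.pos < bytes.len() && bytes[self.pos] != b' ' {
            self.pos += 1;
        }
        // Overwrite the same buffer instead of allocating a new one.
        self.slots.clear();
        self.slots.push(start);
        self.slots.push(self.pos);
        Some(&self.slots)
    }
}
```

Holding two results at once (for example, collecting them into a `Vec`) is rejected by the borrow checker, which is exactly what an `Iterator` implementation would need to allow.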
The most sensible API I can think of is to:
- Permit the caller to build empty `Captures` values from a given `Regex` such that it has the right size.
- Pass a mutable borrow to a `Captures` to a call to `captures`, which lets the caller control the allocation.
It's not quite clear how to apply this to `captures_iter` while still implementing `Iterator`. I suspect we should probably borrow from the `io::BufRead::read_line` style methods, where the caller passes in the buffer to fill. e.g.,
impl Regex {
    // Returns empty storage for captures for use with read_captures.
    fn new_captures(&self) -> Captures { ... }

    // On successful match, returns true and sets capture locations.
    // Otherwise returns false.
    fn read_captures(&self, caps: &mut Captures, text: &str) -> bool { ... }

    fn read_captures_iter<'r, 't>(&'r self, text: &'t str) -> ReadCapturesIter<'r, 't> { ... }
}

struct ReadCapturesIter<'r, 't> { ... }

impl<'r, 't> ReadCapturesIter<'r, 't> {
    // On successful match, returns true and sets capture locations.
    // Otherwise returns false.
    fn captures(&mut self, caps: &mut Captures) -> bool { ... }
}
And I think this would work well.
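As a rough illustration of how the shape above might feel to a caller, here is a self-contained sketch. Everything in it is an assumption for demonstration: `PairMatcher` is a hypothetical stand-in for `Regex` that only finds `key=value` pairs of ASCII alphanumerics, `Slots` stands in for `Captures`, and `read_captures_at` is a helper I made up. Only the method names `new_captures`, `read_captures`, `read_captures_iter` and `captures` come from the proposal.

```rust
// Hypothetical stand-in for `Regex`; it only matches `key=value`.
struct PairMatcher;

// Hypothetical stand-in for `Captures`: one (start, end) span per group.
// Group 0 is the whole match, group 1 the key, group 2 the value.
struct Slots(Vec<Option<(usize, usize)>>);

impl Slots {
    fn get<'t>(&self, text: &'t str, i: usize) -> Option<&'t str> {
        let (s, e) = (*self.0.get(i)?)?;
        text.get(s..e)
    }
}

impl PairMatcher {
    const GROUPS: usize = 3;

    // Returns empty storage sized for this matcher.
    fn new_captures(&self) -> Slots {
        Slots(vec![None; Self::GROUPS])
    }

    // Made-up helper: search from `start`, writing into caller storage.
    fn read_captures_at(&self, caps: &mut Slots, text: &str, start: usize) -> bool {
        let bytes = text.as_bytes();
        let is_word = |b: u8| b.is_ascii_alphanumeric();
        for i in start..bytes.len() {
            if bytes[i] != b'=' {
                continue;
            }
            let mut k = i;
            while k > start && is_word(bytes[k - 1]) {
                k -= 1;
            }
            let mut v = i + 1;
            while v < bytes.len() && is_word(bytes[v]) {
                v += 1;
            }
            if k < i && v > i + 1 {
                // `clear` keeps the capacity, so no new allocation here.
                caps.0.clear();
                caps.0.push(Some((k, v)));
                caps.0.push(Some((k, i)));
                caps.0.push(Some((i + 1, v)));
                return true;
            }
        }
        false
    }

    // On successful match, returns true and sets capture locations.
    fn read_captures(&self, caps: &mut Slots, text: &str) -> bool {
        self.read_captures_at(caps, text, 0)
    }

    fn read_captures_iter<'r, 't>(&'r self, text: &'t str) -> ReadCapturesIter<'r, 't> {
        ReadCapturesIter { re: self, text, pos: 0 }
    }
}

struct ReadCapturesIter<'r, 't> {
    re: &'r PairMatcher,
    text: &'t str,
    pos: usize,
}

impl<'r, 't> ReadCapturesIter<'r, 't> {
    // Fills `caps` with the next match, or returns false when done.
    fn captures(&mut self, caps: &mut Slots) -> bool {
        if !self.re.read_captures_at(caps, self.text, self.pos) {
            return false;
        }
        // Resume the next search after the end of this match.
        self.pos = caps.0[0].map_or(self.text.len(), |(_, end)| end);
        true
    }
}
```

The caller would then drive it with a plain `while it.captures(&mut caps) { ... }` loop, much like looping over `read_line` with a reused `String`. The cost is that the loop is not an `Iterator`, so adapters like `map` and `filter` are unavailable.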
Main questions:
- Any alternatives?
- Do we replace the existing API with the one from above (in 1.0)? Or do we add it? My inclination is to add it, but I really hate expanding the API with more choices.