I am building an LD_PRELOAD library, which has been working fine so far.
The library intercepts libc APIs such as open/close, exec*, etc.
I recently added some code to be run during the library constructor/initialization phase.
Basically:
    lazy_static! {
        pub static ref APP64BITONLY_PATTERNS: RegexSet = {
            event!(Level::INFO, "APP64BITONLY_PATTERNS Reading....");
            let p: Vec<String> = if !CONFIG.app64bitonly_patterns.is_empty() {
                CONFIG.app64bitonly_patterns.iter().map(|v| {
                    if v.starts_with("^") {
                        render(v, &TEMPLATEMAP)
                    } else {
                        let mut x = "^".to_owned();
                        x.push_str("{{RE_WSROOT}}");
                        x.push_str("/");
                        x.push_str(v);
                        // eprintln!("REGEX: {}", x.as_str());
                        render(x.as_str(), &TEMPLATEMAP)
                    }
                }).collect()
            } else {
                vec!["^NOMATCH/.*$".to_owned()]
            };
            // eprintln!("p: {:?}", p);
            let x = RegexSet::new(&p).unwrap_or_else(|e| {
                errorexit!("WISK_ERROR: Error compiling list of regex in app_64bitonly_match: {:?}", e);
            });
            event!(Level::INFO, "APP64BITONLY_PATTERNS Reading....Done");
            let bt = Backtrace::new();
            event!(Level::INFO, "APP64BITONLY_PATTERNS {:?}", bt);
            x
        };
    }

    fn initialize() {
        lazy_static::initialize(&APP64BITONLY_PATTERNS);
    }
This just compiles a very small list of regex patterns. The log output is:
ONFIG Reading....Done
APP64BITONLY_PATTERNS Reading....
p: ["^/nobackup/sarvi/xewisktest/"]
APP64BITONLY_PATTERNS Reading....Done
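To rule out the string handling as the culprit, the pattern-building step can be exercised on its own, outside the constructor and without the regex crate. This is only a minimal sketch: `render_stub` and `build_patterns` are hypothetical names, and `render_stub` assumes that `render` does nothing more than substitute `{{RE_WSROOT}}`, which is not true of the real `render`/`TEMPLATEMAP` in general.

```rust
/// Hypothetical stand-in for the real `render`: it substitutes only the
/// `{{RE_WSROOT}}` placeholder (assumption; the real template map is not shown).
fn render_stub(template: &str, ws_root: &str) -> String {
    template.replace("{{RE_WSROOT}}", ws_root)
}

/// Builds the anchored pattern list the same way the initializer does:
/// patterns already starting with `^` are rendered as-is, everything else is
/// prefixed with `^{{RE_WSROOT}}/`, and an empty config yields one
/// never-matching pattern.
fn build_patterns(configured: &[&str], ws_root: &str) -> Vec<String> {
    if configured.is_empty() {
        return vec!["^NOMATCH/.*$".to_owned()];
    }
    configured
        .iter()
        .map(|v| {
            if v.starts_with('^') {
                render_stub(v, ws_root)
            } else {
                // `{{{{` and `}}}}` are format! escapes for literal `{{` and `}}`.
                let anchored = format!("^{{{{RE_WSROOT}}}}/{}", v);
                render_stub(&anchored, ws_root)
            }
        })
        .collect()
}
```

Calling `build_patterns(&["src/.*"], "/ws")` from an ordinary binary gives the list that would be handed to `RegexSet::new`, so it can be inspected or compiled in a normal process first.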
I also added backtrace-printing code at the end of the initializer:
    let bt = Backtrace::new();
    event!(Level::INFO, "APP64BITONLY_PATTERNS {:?}", bt);
strace shows that somewhere after the above, the shared library seems to be loaded or read again:
write(2, "APP64BITONLY_PATTERNS Reading....Done\n", 38) = 38
futex(0x7f0a67e97040, FUTEX_WAKE_PRIVATE, 2147483647) = 0
readlink("/proc/self/exe", "/usr/bin/bash", 256) = 13
openat(AT_FDCWD, "/ws/sarvi-sjc/wisktrack/lib64/libwisktrack.so", O_RDONLY|O_CLOEXEC) = 3
Questions:
Are there things that are not allowed to be done within library constructor functions, such as calling backtrace(), or things that use a lot of memory and might in turn force brk/mmap operations?
Would something like the backtrace() call involve reading one or more of these library files again?
Also, even without the backtrace code, the program that the LD_PRELOAD library is loaded into segfaults with almost no traceback.
This happens when the list of regex strings being compiled is large. When the config has 2 short patterns, the program runs fine. If I add a very long pattern that is then compiled, the main program segfaults in its own code, in nothing related to the LD_PRELOAD library.
Question: Is there anything in the Rust RegexSet code that requires a lot of memory, or that might make it unsuitable for being called from within the library initializer?
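One experiment that might separate "too much work in the constructor" from "RegexSet itself" is to defer the compilation until the first intercepted call needs it, rather than forcing it from `initialize()`. The following is a minimal sketch of that idea using only the standard library. `PATTERNS`, `INIT_RUNS` and `patterns()` are hypothetical names, and a plain `Vec<String>` stands in for the compiled `RegexSet`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how often the initializer body runs (for demonstration only).
static INIT_RUNS: AtomicUsize = AtomicUsize::new(0);

/// Placeholder for the compiled pattern set; the real code would store a RegexSet.
static PATTERNS: OnceLock<Vec<String>> = OnceLock::new();

/// Returns the pattern list, building it on first use instead of in the
/// library constructor. Later calls reuse the stored value.
fn patterns() -> &'static [String] {
    PATTERNS.get_or_init(|| {
        INIT_RUNS.fetch_add(1, Ordering::SeqCst);
        vec!["^NOMATCH/.*$".to_owned()]
    })
}
```

If the crash disappears when the work is moved out of the constructor this way, that would point at the constructor context rather than at the size of the compiled patterns.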