@thurstond thurstond commented Jun 6, 2024

This test case shows a limitation of DFSan's sscanf implementation (introduced in https://reviews.llvm.org/D153775): it simply ignores ordinary characters in the format string, instead of actually comparing them against the input. This may change the semantics of instrumented programs.

Importantly, this also means that DFSan's release_shadow_space.c test, which relies on sscanf to scrape the RSS from /proc/maps output, will incorrectly match lines that don't contain RSS information. As a result, it adds together numbers from irrelevant output (e.g., base addresses), resulting in test flakiness (#91287).

@llvmbot
Member

llvmbot commented Jun 6, 2024

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Thurston Dang (thurstond)

Changes

This test case shows a limitation of DFSan's sscanf implementation (introduced in https://reviews.llvm.org/D153775): it simply ignores ordinary characters in the format string, instead of actually comparing them against the input. This may change the semantics of instrumented programs.

Importantly, this also means that DFSan's release_shadow_space.c test, which relies on sscanf to scrape the RSS from /proc/maps output, will incorrectly match lines that don't contain RSS information. As a result, it is scraping numbers from irrelevant output (e.g., base addresses), and can therefore result in test flakiness
(#91287).


Full diff: https://github.com/llvm/llvm-project/pull/94700.diff

1 Files Affected:

  • (added) compiler-rt/test/dfsan/sscanf.c (+19)
diff --git a/compiler-rt/test/dfsan/sscanf.c b/compiler-rt/test/dfsan/sscanf.c
new file mode 100644
index 0000000000000..dbc2de4ba96c1
--- /dev/null
+++ b/compiler-rt/test/dfsan/sscanf.c
@@ -0,0 +1,19 @@
+// RUN: %clang_dfsan %s -o %t && %run %t
+// XFAIL: *
+
+#include <assert.h>
+#include <stdio.h>
+
+int main(int argc, char *argv[]) {
+  char buf[256] = "10000000000-100000000000 rw-p 00000000 00:00 0";
+  long rss = 0;
+  // This test exposes a bug in DFSan's sscanf, that leads to flakiness
+  // in release_shadow_space.c (see
+  // https://github.com/llvm/llvm-project/issues/91287)
+  if (sscanf(buf, "Garbage text before, %ld, Garbage text after", &rss) == 1) {
+    printf("Error: matched %ld\n", rss);
+    return 1;
+  }
+
+  return 0;
+}

@thurstond
Contributor Author

Relevant code in DFSan's scan_buffer:

static int scan_buffer(char *str, size_t size, const char *fmt,
                       dfsan_label *va_labels, dfsan_label *ret_label,
                       dfsan_origin *str_origin, dfsan_origin *ret_origin,
                       va_list ap) {
    ...
    if (*formatter.fmt_cur != '%') {
      // Ordinary character. Consume all the characters until a '%' or the end
      // of the string.
      for (; *(formatter.fmt_cur + 1) && *(formatter.fmt_cur + 1) != '%';
           ++formatter.fmt_cur) {
          // EDITOR'S NOTE: SHOULD THIS CHECK AGAINST THE INPUT STRING?
      }
      retval = formatter.scan();
      dfsan_set_label(0, formatter.str_cur(),
                      formatter.num_written_bytes(retval));
