C/C++ tools for code safety #
Dynamic analysis #
Valgrind #
- On-the-fly instrumentation of binaries at run time
- Works with binaries produced by any compiler, even without source code or debug symbols
- Drawback: binaries carry little information; in particular, they say nothing about how stack variables are laid out, so stack buffer overflows go undetected
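To make the drawback concrete, here is a minimal sketch with the same off-by-one read on the heap and on the stack. The function names are made up for illustration. Running the compiled program under `valgrind ./a.out` (Memcheck is the default tool) and calling `read_heap(4)` would be expected to produce an invalid-read report, while `read_stack(4)` typically goes unnoticed because the binary does not record where the array ends.

```cpp
#include <cassert>
#include <cstddef>

// Reads element i of a 4-element heap array.
// With i == 4 this reads one past the end of the heap block.
int read_heap(std::size_t i) {
    int* values = new int[4]{10, 20, 30, 40};
    int result = values[i];  // out of bounds when i >= 4
    delete[] values;
    return result;
}

// The same bug on the stack: the array bounds are not visible in the binary.
int read_stack(std::size_t i) {
    int values[4] = {10, 20, 30, 40};
    return values[i];  // out of bounds when i >= 4
}

// In-range calls are well defined and produce no report.
void valgrind_sketch_in_range() {
    assert(read_heap(1) == 20);
    assert(read_stack(3) == 40);
}
```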
LLVM sanitizers #
- Compile-time instrumentation: the compiler inserts checks while building the program from source
- Provided by the LLVM compiler suite (i.e., Clang)
- More information because source code is instrumented rather than binaries
Examples:
- AddressSanitizer: finds invalid memory accesses (e.g., out-of-bounds reads/writes, use-after-free)
- LeakSanitizer: finds memory leaks
- MemorySanitizer: finds uses of uninitialized memory
- UndefinedBehaviorSanitizer: finds null pointer dereferences, integer/float overflow, etc.
- ThreadSanitizer: finds improper uses of threads (data races)
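As a rough illustration of how the sanitizers are used, the sketch below contains one bug for AddressSanitizer and one for UndefinedBehaviorSanitizer. The function names are hypothetical. Building with something like `clang++ -g -fsanitize=address,undefined example.cpp` enables both; the reports only appear when an offending call actually runs.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// AddressSanitizer target: writes past a stack array when slot >= 4.
int store_and_sum(std::size_t slot, int value) {
    int slots[4] = {0, 0, 0, 0};
    slots[slot] = value;  // stack-buffer-overflow when slot >= 4
    return slots[0] + slots[1] + slots[2] + slots[3];
}

// UndefinedBehaviorSanitizer target: signed overflow when id == INT_MAX.
int next_id(int id) {
    return id + 1;
}

// In-range calls are well defined, so neither sanitizer reports anything.
void sanitizer_sketch_in_range() {
    assert(store_and_sum(2, 5) == 5);
    assert(next_id(INT_MAX - 1) == INT_MAX);
}
```

Calling `store_and_sum(4, 1)` or `next_id(INT_MAX)` in an instrumented build is what would be expected to trigger a report.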
Weaknesses of dynamic analysis #
- Dynamic analysis can only report bad behavior that actually happened
- To catch a bug, we need an input that actually triggers it; if no test supplies such an input, the bug goes unreported
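A small sketch of this weakness, using a made-up `average` function: the division by zero is only reachable with an empty vector, so a test suite that never passes one runs cleanly under any dynamic tool.

```cpp
#include <cassert>
#include <vector>

// Integer average of the values.
// Latent bug: divides by zero when the vector is empty.
int average(const std::vector<int>& values) {
    int sum = 0;
    for (int v : values) {
        sum += v;
    }
    return sum / static_cast<int>(values.size());  // undefined when size() == 0
}

// Tests that only use non-empty inputs never execute the bad behavior,
// so Valgrind or a sanitizer has nothing to report.
void dynamic_analysis_blind_spot() {
    assert(average({2, 4, 6}) == 4);
    assert(average({5}) == 5);
}
```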
Fuzzing #
- Blind fuzzing: throw many random inputs at a program
- Coverage-guided fuzzing:
  1. Take normal input, run it through the program, observe control flow
  2. Semi-randomly mutate the normal input
  3. Run program again and observe control flow
  4. Keep mutated inputs that changed control flow
  5. Return to step 2 with these kept inputs, infinitely
Common fuzzers: AFL and libFuzzer
Still cannot guarantee that a program is bug-free (if the fuzzer didn’t find anything in some amount of time, maybe we didn’t run it long enough)
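For a sense of what a fuzz target looks like, here is a minimal libFuzzer harness sketch. `LLVMFuzzerTestOneInput` is the entry point libFuzzer calls with each generated input; `reaches_hidden_branch` is a hypothetical stand-in for the code under test, and the `std::abort()` stands in for a real crash. Blind fuzzing would need to guess all four magic bytes at once, whereas coverage guidance can, in principle, keep inputs that get one comparison further and build on them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical code under test: true only for inputs that begin with "FUZZ".
// Each nested comparison is a separate branch the fuzzer can observe.
bool reaches_hidden_branch(const std::uint8_t* data, std::size_t size) {
    if (size >= 1 && data[0] == 'F') {
        if (size >= 2 && data[1] == 'U') {
            if (size >= 3 && data[2] == 'Z') {
                if (size >= 4 && data[3] == 'Z') {
                    return true;
                }
            }
        }
    }
    return false;
}

// Entry point that libFuzzer calls once per generated input.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* data, std::size_t size) {
    if (reaches_hidden_branch(data, size)) {
        std::abort();  // stand-in for a real bug, so the fuzzer records a crash
    }
    return 0;
}

// Inputs that stop short of the magic prefix pass through harmlessly.
void fuzz_harness_benign_inputs() {
    const std::uint8_t almost[] = {'F', 'U', 'Z'};
    assert(!reaches_hidden_branch(almost, 3));
    assert(LLVMFuzzerTestOneInput(almost, 3) == 0);
}
```

With Clang, a harness like this is typically built with `clang++ -g -fsanitize=fuzzer,address harness.cpp`; libFuzzer then supplies `main` and drives the mutation loop.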
Static analysis #
Linting #
- Basic static analysis: simple techniques to find obvious mistakes
- Person running linter can configure rules to enforce
- Example: clang-tidy, which can auto-fix some issues!
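As an example of a configurable rule with an auto-fix, the sketch below uses `NULL` where modern C++ prefers `nullptr`. The function name is made up. Running something like `clang-tidy -checks='-*,modernize-use-nullptr' -fix example.cpp -- -std=c++17` enables only that one check and would be expected to rewrite the comparison to use `nullptr`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returns the given name, or a placeholder when no name is supplied.
const char* name_or_default(const char* name) {
    if (name == NULL) {  // the modernize-use-nullptr check flags this NULL
        return "anonymous";
    }
    return name;
}

// Behavior is the same before and after the auto-fix.
void linting_sketch_usage() {
    assert(std::strcmp(name_or_default(nullptr), "anonymous") == 0);
    assert(std::strcmp(name_or_default("ada"), "ada") == 0);
}
```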
Dataflow analysis #
- Traces how values flow along every branch of the program (working from its abstract syntax tree) and looks for issues
Limitations:
- False positives: dataflow analysis follows every branch, even if it’s impossible for some condition to be true in real life
- If there are a lot of false positives (low signal-to-noise ratio), it’s difficult to actually figure out which issues pointed out are real problems
- Many static analyzers only analyze a single file at a time, so some bugs won’t be found if they are split across files
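The false-positive problem can be sketched with a function whose two conditions are correlated. A hypothetical analyzer that treats the two `if` statements as independent would see a path where the first branch is skipped (leaving `p` null) and the second is taken (dereferencing `p`), and would warn about a null dereference. That path cannot happen at run time because both conditions test the same `x`. Many real analyzers handle a case this simple, but the same pattern spread across more code is a common source of noise.

```cpp
#include <cassert>

// p is non-null exactly when x > 0, and it is only dereferenced when x > 0.
int scaled_value(int x) {
    int value = 21;
    int* p = nullptr;
    if (x > 0) {
        p = &value;
    }
    int factor = 2;  // unrelated work between the two checks
    if (x > 0) {
        return *p * factor;  // safe, but looks risky if the checks are not correlated
    }
    return 0;
}

// Both feasible paths are well defined.
void dataflow_sketch_usage() {
    assert(scaled_value(1) == 42);
    assert(scaled_value(-3) == 0);
}
```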
Limitations of static analysis #
- Hard to tell whether code is safe without broader context if we can only look at a few lines of code
- In general, it is impossible to recover that broader context automatically because of the halting problem
How to verify small snippets of code in isolation without broader context?
- This can be done by adding a little bit of information to the code
- This is what Rust does!
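To stay in C++ for illustration, here is a rough analogue of the idea rather than Rust itself: Clang's `[[clang::lifetimebound]]` attribute adds a small piece of information to a signature, saying that the returned reference may refer to the annotated parameter. With that annotation, Clang can warn at a call site that binds the result of a temporary argument, without looking at the function body. The `LIFETIME_BOUND` macro and the `longer` function are assumptions made for this sketch, and Rust's lifetime checking is considerably more thorough than this attribute.

```cpp
#include <cassert>
#include <string>

// Use the Clang attribute when available; expand to nothing elsewhere.
#if defined(__clang__)
#define LIFETIME_BOUND [[clang::lifetimebound]]
#else
#define LIFETIME_BOUND
#endif

// The annotations state that the result may alias either argument,
// so callers must keep both arguments alive while they use the result.
const std::string& longer(const std::string& a LIFETIME_BOUND,
                          const std::string& b LIFETIME_BOUND) {
    return a.size() >= b.size() ? a : b;
}

// Safe usage: both arguments outlive the returned reference.
void annotation_sketch_usage() {
    std::string x = "static";
    std::string y = "dynamic";
    const std::string& result = longer(x, y);
    assert(&result == &y);
}
```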