Exo 1 ====== Q1 --- Path B0-B1-B2-B5 : 1. Path predicate: y0=x0+1 and y0<=5 and x0<=0 and y1=y0*2 2. Condition To trigger the vulnerability (y0=x0+1 and y0<=5 and x0<=0 and y1=y0*2) and (y1 > 9 or y1<0) 3. solution ? y1 <= 2, e.g, y1=-2, thus y0=-1 and x0=-2 Q2 --- Path B0-B1-B2-B4-B1-B2-B5 1. Condition to follow this path and trigger the vuln.: y0=x0+1 and y0<=5 and x0>0 and y1=y0*3 and x1=x0+2 and y1<=5 and x1<=0 and y2=y1*2 and (y2 > 9 or y2<0) solutions ? x0>0 and x1=x0+2 et x1<0 and ... => no solution in Z Solution in signed integers (with wrap-around) : x0 = MAX_INT and y0=-1 and y1=-3 and x1=-2 and y2=-6 (assuming MAX_INT + 1 = -1) Q3 --- 1. Code with assertion: int x, y ; int T[10] ; read(x) ; assert (x 0) { assert (y=0 && y<10) (A4) T[y] = x ; } ; } ; 2. Abstract Interpretation with intervals iteration 1 out(B0) = x in [-inf, +inf] y in [-inf, +inf] in(B2) = x in [-inf, +inf] y in [-inf, 5] in(B4) = x in [1, +inf] y in [-inf, 5] in(B5) = x in [-inf, O] y in [-inf, 5] A0 is not discharged in B0 A1 and A2 are discharged in B4 A3 is discharged in B5 A4 is not discharged in B5 These results won't change in further iterations ... % ======================================================= DOWSER Q1 : Dowser objectives = - find buffer overflows in C/C++ pgms - based on a combination of static and dymanic analysis - static analysis to detect "hot spot", i.e., pgm locations where BoF are more likely to occure - dynamic analysis (actually concolic execution) to attempt to confirm the BoF, with two characteristics: - (pointer) value coverage, to increase the chance of getting inavlid memory access - taint analysis, to reduce the size of the symbolic inputs -> try to "stress" as much as possible the pointers used in loops to catch BoFs Workflow = 1. Static analysis on the source code to get loop ranks 2. Dynamic taint analysis to indentify the pgm inputs influencing the BoF prone loops 3. Symbolic execution to try to trigger the BoF 4. Use AdSan as an oracle, to detect BoF occurrences ... Q2: 1. input = source-level C/C++ pgm, translated in LLVM output = scores associated to pgm loops, indicating there "probability" to lead to a BoF 2. it's a static analysis 3. metric to select interesting loops = "complex array accesses", depending on the complexity of control and data-flow involved in the array index computations, such that - the pointer value is modified within the loop - complex "pointer arithmetic" (casts, non-constant operations, non local operations, etc.) - non trivial guards on pointer operations (i.e.. <> from comparusons with a constant) is modified in the loop 4. There exists at least trivial BoF which are not caught, e.g., int T[10] ; int x = 100 ; while (....) T[x] = ... /* x never modified in the loop */ Q3: 1. input = a pgm + ranked loops with a "vulnerable instruction group" + test inputs able to exercise these instrutions output = set of pgm inputs bytes influencing the execution of each vulnerable instruction group 2. dynamic analysis 3. track so-called "strict control dependencies", i.e., direct comparisons between taint variables and constants e.g : // x is tainted if (x == 'a') y = ... // y considered as tainted // x is tainted, z is not tainted if (x == z) y = ... // y not considered as tainted 4. false dependencies = nested if-then-else in an arbitary order (i.e, no real dependencies between the cases) partially solved byu shifting the order of unrelated input fields need some knowledge about he pgm input ... if (x == 'a') y = ... // y considered as tainted Q4: 1. input = a pgm + ranked loops with a "vulnerable instruction group" + set of input bytes able to influence the execution of these instructions output = BoF detected by AdSan 2. dynamic analysis (concolic execution) 3. Since symbolic execution can only execute *finite* paths, this technique cannot fully handle loops (the number of iterations has to be bounded), making the approach unsound in general (real bugs can be missed). Dowser proceeds in 2 steps: - learning, where paths traversing interesting loops are ranked according to their potential influence on pointer dereferences. This allows to prune "irrelevant" branches (the ones not leading to any new dereference addresses), and to flag branches according to their ability to exercise "unique" pointer access patterns. In this learning phase only a subset of the symbolic input is considered (for performance reasons). - symbolic execution, guided by the results from the learning phase, to cover at most as possible the various pointer access patterns in the target loop. However, Dowser can still miss interesting paths, the approach is therefore not sound. Conversly, even if false positives may happen, they do not lead to incorrect BoF reports (they only penalize the excution time). Q5: 1. Dowser cannot report false positives, but it can miss bugs (false negatives). 2. The same general approach (static analysis + guided symbolic execution) can be used for detecting other vulnerabilities, provided a suitable static analysis technique to identify the "hot spots" (+ a good guiding strategy). 3. In principle Dowser could work directly from the binary code, but its efficiency would certainly be lower (disassembling may lower a lot the accuracy of the static analysis step). 4. The vulnerability of Exercise 1 could be detected by Dowser (the weight of the loop is non-zero). Q5: