CS/SWE 795, Fall 2017, Homework 3, due 9/25/2017 4:00pm.

Introduction

In our discussions of JPF and Exe this week, we talked about how tools can represent the state of a program in terms of the constraints applied to “symbolic” variables. Exe’s approach for constraint tracking and propagation was to associate some symbolic structure with every variable (in C), and then instrument operations on those variables to make propagate constraints as appropriate.

This week, you will implement a simple system for tracking constraints on (concrete) integer variables in Java, using Phosphor to associate each variable with its symbolic data. Your tool does not need to do anything about these constraints (e.g. solve them, negate them or simplify them): it only needs to track them, and print them out at the end. Specifically, you will use Phosphor’s tainting functions to mark integer variables as “symbolic” (give them a taint identifying them as symbolic), propagate these marks through data flow, and track what constraints have been witnessed (through control flow).

As an example, in the code

we want i to be marked as simply a symbolic variable, j to be marked as a symbolic variable (with the symbolic value i + 5), and k to be a concrete variable. We also want to track the complete list of constraints that we have seen applied to these symbolic variables. At the end of this method, we’ve seen the constraint i + 5 > 0.

Getting Started

Academic honesty reminder: You may NOT share any of your code with anyone else. You may NOT post your code in a publicly viewable place (e.g. in a public GitHub repository).  You may face severe penalties for sharing your code, even “unintentionally.” Please review the course’s academic honesty policy.

This assignment totals 100 points, and will correspond to 10% of your final grade for this course.

Start out by importing the HW3 GitHub repository. Then, clone your private repository on your machine, and import it to your IDE. I’ve provided a starter project that uses Phosphor, configured with a lot of starter code. You will do all of your work for this assignment in this project.

Tracking data flow operators (70 points)

We need to track how variables are derived from each other. For instance, in the example code

We need to know that j is the result of the symbolic variable i being added to the concrete value 5. To set the scope of the rest of this section of the assignment, please know that we are considering only the following operators for this assignment:

  • IADD
  • ISUB
  • IDIV
  • IOR
  • IAND
  • IMUL
  • IXOR

We’re going to use Phosphor to track these data flow relationships. For this assignment, we’re going to interface with Phosphor in a new way. Last week, we used ASM to add code before Phosphor instrumented it. This week, we will be using ASM to add code while Phosphor instruments it, allowing us to change Phosphor’s handling of data flow propagation for these particular operations.

Part 1: Marking variables as symbolic

I’ve provided code in your repository to mark a variable as symbolic – you can use it out-of-the-box (in the bottom of PathUtils.java). Note that I am making the label of each taint be an “Expression” (imported from a JPF package), this will simplify the rest of the assignment.

Part 2: Overriding the behavior of integer instructions

Instead of writing a MethodVisitor to change what IADD does, we’re going to write a specialized (and mainly undocumented) TaintTagFactory, which is Phosphor’s API for changing taint tag propagation semantics. For this step, we will modify the handling of the simple integer operators outlined above, in the  stackOp method of PathConstraintTagFactory.java.

For these instructions (and only these instructions), change the behavior (remove the super.stackOp call), so that the corresponding edu.gmu.cs795.hw3.runtime.PathUtils method is called (e.g. edu.gmu.cs795.hw3.runtime.PathUtils.IADD). Note that because you are now inserting code within Phosphor’s instrumentation process, it will not automatically change your method name/descriptors. Instead, you have full access to everything going on (e.g. if you used the analyzer object to inspect the stack contents, you’d find the taint tags there). In particular, this gets a little sticky with that last argument that we pass around – the pre-allocated return type. That variable needs to (1) be passed to our I*** methods, and (2) unwrapped after invoking our method (to get the taint tag and return value on the stack). There is sample code to do this for IADD in a comment in your assignment.

Check your progress: These functions are currently implemented correctly (to add/subtract/etc and then return the value, but not to propagate the symbolic tag). Once you’ve implemented this step, the example should run (and not crash), but the symbolic tag won’t be propagated.

Part 3: Adding support to propagate symbolic tags

The last step of the data-flow tracking aspect of the assignment is to propagate symbolic tag information through the various PathUtils.I*** functions. We’re going to continue to use the JPF data structures to hold this information (the library is already included in your HW3 project).

Add code to each of the I*** functions in PathUtils to set the taint of the return, for instance, perhaps:

Note that you need to handle cases where only one of the operands (rVal or lVal) are symbolic, and also cases where both operands are symbolic. If both operands are concrete, then the return value taint should be null.

Check your progress: At this point, running the same test case (with mvn verify) should give you output similar to:

Tracking Constraints  (30 points)

The last component of this assignment is to track the constraints that we witness applied to variables as the program runs. I have provided instrumentation code (that you shouldn’t have to modify in any way) in PathConstraintTaintTagFactory that will call the  addConstraint method on PathUtils before each jump. For the one-argument jumps (IFEQ, IFNE, etc),  addConstraint(Taint<Expression> t, int opcode) will be called, passing the taint of the jump condition (null if it is not symbolic). For the two-argument jumps (IF_ICMP**),  public static void addConstraint(Taint<Expression> l, Taint<Expression> r, int v1, int v2, int opcode) will be called, passing both jump condition taints and both jump condition values.

Implement these routines so that they track the list of constraints as a conjunction of AND  za.ac.sun.cs.green.expr.Operation’s stored in the static field  edu.gmu.cs795.hw3.runtime.PathUtils.currentPathConstraints.

Check your progress: At this point, running the same test case (with mvn verify) should give you output similar to:

Testing notes

You can reset the current list of path constraints by calling  edu.gmu.cs795.hw3.runtime.PathUtils.resetPathConstraints(). You may find it helpful to write multiple test cases, and create assertions that certain expressions are equal to some string value.

Submission

Perform all of your work in your homework-3 git repository. Commit and push your assignment. Make sure that your name is specified somewhere in the README. The time that you push the code will be the time used to determine if it is on time, within the 24-hour grace period, or too late.

Make sure that your released code includes all of your files and builds properly. You can do this by clicking to download the archive, and inspecting/trying to build it on your own machine/vm. There is no need to submit binaries/jar files (the target/ directory is purposely ignored from git).

If you want to resubmit before the deadline: Simply commit and push again. I will grade the last commit that was received before the deadline. You will not be able to push after the deadline.

Reminder – Late policy: Late assignments will be accepted for 24 hours after the due date, for a penalty of 10%. After 24 hours have passed since the deadline, no late submissions will be accepted. Your release is what will be graded: if you forget to make a release, then we will assume that you chose to not submit the assignment.

Contact