This dataset was generated by running the entire test suite of each included project 10,000 times (with Maven) and recording whether the outcome of any test ever changed between runs. More documentation of the dataset is forthcoming, and a preliminary release of it is available here. There is also substantial companion material (including the build logs and test result files for each run of each test) that we plan to release. This is a work in progress and has not yet been published.