The content of this file is rather fixed. The Interpreter.checkFile() method checks its content, so if you make any change, you should use that method to verify that everything is fine. Its default content is that of this file.
To find out what to put in the HardDataFields section, log in to your CrowdFlower account after creating a CML template and uploading some fake data. Once logged in, go to https://crowdflower.com/jobs/$Job_Id.csv, where $Job_Id is the identifier of the job you plan to use. This should start the download of a CSV file containing the job results.
You do not need to order a job to download this file: if no results are available right now, which is most likely the case, you will simply download the structure of the results that CPS will parse. This is all you need.
The content of the HardDataFields section should match the structure of this CSV file.
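One way to see the structure of the downloaded file is to read its first row, which holds the column names. The following is only a minimal sketch: the function name and the sample header are hypothetical placeholders, and the real column names depend on your job and CML template.

```python
import csv
import io


def csv_column_names(csv_text):
    """Return the column names found on the first row of a CSV export.

    An empty export yields an empty list.
    """
    reader = csv.reader(io.StringIO(csv_text))
    return next(reader, [])


# Hypothetical header, for illustration only; a real export will differ.
sample = "column_a,column_b,column_c\n"
print(csv_column_names(sample))
```

With a real export, you would read the downloaded file's text and pass it to the function, then compare the returned names with what you put in HardDataFields.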
The media.xml file contains all the information you need in order to use your media. For instance, if you are using video clips extracted from several movies and defined by their timecodes, you should use this file's structure. While you can add any information you want about each media, the following fields are mandatory:
Corresponds to a number associated with each media. Two media cannot have the same id.
The name of the media, i.e. its URL.
Miscellaneous data to display along with the media, such as the name of the movie it is taken from. If you do not want to use it, leave it empty, but do not remove it.
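Since two media must not share an id, a quick check before running CPS can be useful. The sketch below is an assumption-laden illustration: the element names `medias`, `media` and `id` are hypothetical and may not match the actual tags used in media.xml, so adjust them to your file.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_media_ids(xml_text, media_tag="media", id_tag="id"):
    """Return the sorted list of ids that appear on more than one media.

    media_tag and id_tag are hypothetical element names; change them to
    match the tags actually used in your media.xml.
    """
    root = ET.fromstring(xml_text)
    ids = [
        (media.findtext(id_tag) or "").strip()
        for media in root.iter(media_tag)
    ]
    counts = Counter(ids)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical file content, for illustration only.
sample = """
<medias>
  <media><id>1</id></media>
  <media><id>2</id></media>
  <media><id>1</id></media>
</medias>
"""
print(duplicate_media_ids(sample))
```

An empty result means every id is unique under the assumed structure.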
During its execution, after each iteration, CPS will create a file for each axis along which it has performed comparisons. For example, if you asked it to place media along the axis1 and axis2 axes, it will create axis1.xml and axis2.xml in the data/previously directory.
These files contain the current state of the quicksort, i.e. the way the media identifiers are currently arranged along each axis. If you are not familiar with quicksort, you should read about it first. The identifiers are stored in arrays, each array waiting for the next iteration. These files save this structure, as you can see in this example file: identifiers are grouped within "array" nodes, which is pretty self-explanatory.
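To make the "arrays waiting for the next iteration" idea concrete, here is a minimal sketch that reads such a state and performs one generic quicksort iteration on it. Only the "array" node name comes from the description above; the root tag `axis`, the `id` child tag, both function names and the comparison function are hypothetical, and this is a textbook partition step, not necessarily how CPS itself proceeds.

```python
import xml.etree.ElementTree as ET


def read_axis_state(xml_text, id_tag="id"):
    """Return the identifiers grouped by "array" node, in document order.

    id_tag is a hypothetical name for the element holding one identifier.
    """
    root = ET.fromstring(xml_text)
    return [
        [(node.text or "").strip() for node in array.iter(id_tag)]
        for array in root.iter("array")
    ]


def quicksort_iteration(arrays, less_than):
    """Split every array of two or more identifiers around its first element.

    less_than(a, b) stands in for the outcome of a comparison between
    two media; arrays of length 0 or 1 are already settled.
    """
    next_arrays = []
    for array in arrays:
        if len(array) <= 1:
            next_arrays.append(array)
            continue
        pivot, rest = array[0], array[1:]
        lower = [x for x in rest if less_than(x, pivot)]
        upper = [x for x in rest if not less_than(x, pivot)]
        next_arrays.extend(part for part in (lower, [pivot], upper) if part)
    return next_arrays


# Hypothetical file content, for illustration only.
sample = "<axis><array><id>3</id><id>1</id><id>2</id></array></axis>"
state = read_axis_state(sample)
print(quicksort_iteration(state, lambda a, b: int(a) < int(b)))
```

Repeating the iteration until every array holds a single identifier gives the final order along the axis.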