Serves as an intermediary between the MySQLbase database in which HIT results are stored and CrowdFlower, our crowdsourcing provider.

Public Member Functions
	CrowdManager (Integer firstJobId, String key, String resultsUrl, MySQLbase baseSQL, MediaBase baseDIR, List< String > fieldsHardData)
	Creates an instance of the CrowdManager class.
Integer	createNewJob ()
	Creates a new job at CrowdFlower that is a copy of the "original" one (the one with the `originalJobId` identifier) and returns its identifier.
void	uploadHIT (Integer jobId)
	Uploads to CrowdFlower the CSV file corresponding to the required comparisons.
String	getHITresults (Integer jobId)
	Downloads results from the URL specified in the `resultsUrl` attribute and returns the path to it minus its extension (i.e. `'../data/HITresults/results42'` for the results stored in the `'../data/HITresults/results42.csv'` file).
void	readHITresults (String path)
	Adds the HIT results contained in the CSV file at `path` to the `dbSQL` myDataBases.MySQLbase database.
List< Integer >	getGreater ()
	Returns the value of greater.
List< Integer >	getSmaller ()
	Returns the value of smaller.
List< String >	getAxes ()
	Returns the value of axes.
Integer	getComparisonsNumber ()
	Returns the value of comparisonsNumber.
Static Public Member Functions
static void	thisIsTheEnd (String antechamberUrl, String results)
	Asks the `Antechamber` to send an email signaling the end of the Splitsort.
Private Attributes
List< Integer >	greater
	The IDs of the media alleged to be "greater".
List< Integer >	smaller
	The IDs of the media alleged to be "smaller".
List< String >	axes
	The axes along which each comparison must be performed.
Integer	comparisonsNumber
	The number of comparisons performed during the current iteration.
String	cfKey
	The key given by CrowdFlower to authenticate yourself.
Integer	originalJobId
	The identifier of the "original" job, i.e. the one from which every characteristic (webhook address, tags, CML configuration...) is copied to all the jobs created during the current execution of CPS.
String	resultsUrl
	The URL from which the HIT results are downloaded, most likely that of the Antechamber.
MediaBase	dbMedia
	The MediaBase instance in which the names of the media are stored.
MySQLbase	dbSQL
	The MySQLbase instance to which it is connected and in which HIT results must be stored.
Refiner	refiner
	The Refiner instance used to process the results of the HITs.

Detailed Description

Serves as an intermediary between the MySQLbase database in which HIT results are stored and CrowdFlower, our crowdsourcing provider.

This class performs several operations: after retrieving unknown comparisons, it submits them to the crowd using CrowdFlower's "self service". When the HITs are finished, it downloads the results from the Antechamber (see the online documentation), saves them, passes them to its Refiner instance (refiner) for parsing, and sends them to dbSQL, a myDataBases.MySQLbase instance in which they are stored.

Once this is done, it uses refiner to reduce the noise in the hard results and thus obtain refined results (more details in the Refiner documentation). These are then sent to dbSQL as well.

Author: Leo Perrin (perrin.leo@gmail.com)

Definition at line 39 of file CrowdManager.java.

Constructor & Destructor Documentation

crowdUser.CrowdManager.CrowdManager	(	Integer	firstJobId,
		String	key,
		String	resultsUrl,
		MySQLbase	baseSQL,
		MediaBase	baseDIR,
		List< String >	fieldsHardData
	)

Creates an instance of the CrowdManager class.

Initializes all of its attributes. The greater, smaller and axes attributes start as empty lists; they are filled afterward through the lists returned by their getters, for example crowdManagerInstance.getGreater().addAll(someListOfIntegers).

Parameters:

firstJobId	The identifier of a job in your CrowdFlower account already configured.
key	The key given by CrowdFlower.
resultsUrl	The URL where `results$JOB_ID.txt` is, most likely that of the `csv/` folder of the Antechamber.
baseSQL	The `MySQLbase` in which the results must be stored.
baseDIR	The `MediaBase` in which the names of the sorted media are stored.
fieldsHardData	The fields of the CSV file returned by the Antechamber containing the HITs' hard results.

Definition at line 110 of file CrowdManager.java.

  {
        this.comparisonsNumber = 0;
        this.resultsUrl = resultsUrl;
        this.greater = new ArrayList<Integer>();
        this.smaller = new ArrayList<Integer>();
        this.axes = new ArrayList<String>();
        this.originalJobId = firstJobId;
        this.cfKey = key;
        this.dbSQL = baseSQL;
        this.dbMedia = baseDIR ;
        this.refiner = new Refiner(fieldsHardData);
  }
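
As a minimal sketch of how the three lists can be filled after construction: the getters return the lists themselves, so adding to the returned list changes the attribute. `ComparisonLists` and `addComparison` are hypothetical names used only for illustration; they reproduce the three list attributes and their getters, not the real CrowdManager class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in reproducing only the three list attributes and their getters.
final class ComparisonLists {
    private final List<Integer> greater = new ArrayList<>();
    private final List<Integer> smaller = new ArrayList<>();
    private final List<String> axes = new ArrayList<>();

    List<Integer> getGreater() { return greater; }
    List<Integer> getSmaller() { return smaller; }
    List<String> getAxes() { return axes; }

    /** Registers one comparison (hypothetical helper): medium g against medium s along axis. */
    void addComparison(int g, int s, String axis) {
        // Each getter hands back the live list, so add() modifies the attribute.
        getGreater().add(g);
        getSmaller().add(s);
        getAxes().add(axis);
    }
}
```

The three lists are index-aligned: entry i of each list describes the same comparison, which is how uploadHIT() reads them.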

Member Function Documentation

Integer crowdUser.CrowdManager.createNewJob ( )

Creates a new job at CrowdFlower that is a copy of the "original" one (the one with the originalJobId identifier) and returns its identifier.

Creates a copy of the original job using an HTTP POST to the address given in CrowdFlower's documentation. When the POST is done, CrowdFlower returns a JSON-formatted string containing, among other things, the identifier of the job.

Returns: The identifier of the new job.

Definition at line 139 of file CrowdManager.java.

  {
        Integer id = 0;

        String postUrl = "http://api.crowdflower.c../..bs/" + this.originalJobId
                  + "/copy.json?key=" + this.cfKey ;
        // posting the request
        try
        {
              URL url = new URL(postUrl);
              HttpURLConnection conn = (HttpURLConnection) url.openConnection();
              // Building an HTTP request
              conn.setDoOutput(true);
              conn.setRequestMethod("POST");
              conn.connect();
              // getting the answer from CrowdFlower and reading the "id" field
              BufferedReader rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));
              String line, idString = "";
              while ((line = rd.readLine()) != null)
                    if (line.contains('"'+"id"+'"'))
                          // jump to the last occurrence of '"id":' and read the number; 5 is added to skip the 5 characters of '"id":' itself
                          for (int i = line.lastIndexOf('"' +"id"+'"')+5; i<line.length(); i++)
                                if ( line.charAt(i) == ',' )
                                      break;
                                else
                                      idString += line.charAt(i) ;
              // trim() guards against whitespace between the colon and the number
              id = Integer.parseInt(idString.trim());
              rd.close();
        } catch (Exception e) { e.printStackTrace(); }

        return id;
  }
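
The character-by-character scan for the "id" field can also be written with a regular expression. The following is a sketch only, assuming the response contains a numeric `"id"` field; `JobIdParser` and `parseJobId` are hypothetical names, and the sample response in `main` is made up for illustration. Unlike the method above, which uses the last occurrence of `"id"` on a line, this sketch returns the first match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class JobIdParser {
    // Matches "id" followed by a colon and an integer, tolerating whitespace.
    private static final Pattern ID_FIELD = Pattern.compile("\"id\"\\s*:\\s*(\\d+)");

    /** Returns the first numeric "id" found in the JSON text, or -1 if none is present. */
    static int parseJobId(String json) {
        Matcher m = ID_FIELD.matcher(json);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    public static void main(String[] args) {
        // Hypothetical response body, for illustration only.
        System.out.println(parseJobId("{\"id\": 4242, \"title\": \"copy\"}"));
    }
}
```

A full JSON parser would be more robust if the response nests other objects that also carry an "id" field.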

List<String> crowdUser.CrowdManager.getAxes ( )

Returns the value of axes.

Returns: axes

Definition at line 382 of file CrowdManager.java.

  { return this.axes; }

Integer crowdUser.CrowdManager.getComparisonsNumber ( )

Returns the value of comparisonsNumber.

Returns: comparisonsNumber

Definition at line 390 of file CrowdManager.java.

  { return this.comparisonsNumber; }

List<Integer> crowdUser.CrowdManager.getGreater ( )

Returns the value of greater.

Returns: greater

Definition at line 366 of file CrowdManager.java.

  { return this.greater ; }

String crowdUser.CrowdManager.getHITresults ( Integer jobId )

Downloads results from the URL specified in the resultsUrl attribute and returns the path to it minus its extension (i.e '../data/HITresults/results42' for the results stored in the '../data/HITresults/results42.csv' file).

First, you have to make sure the results are at the given address: was the correct webhook sent to the Antechamber? Is the processing of the raw results finished? An email is sent when both of these operations are finished.

If so, use this method to download the results$JOB_ID.txt file containing the answers of the workers and to save it as ../data/HITresults/results$JOB_ID.csv.

Parameters:

jobId The identifier of the job used to obtain comparisons. It is given by CrowdFlower's web application when a new job is created, as well as by the createNewJob() method.

Returns: The path to the csv file containing the results of the job.

Definition at line 259 of file CrowdManager.java.

  {
        try
        {
              // creating connection
              String urlString = this.resultsUrl + "csv/" + jobId + ".txt";
              System.out.println(urlString);
              URL url = new URL(urlString);
              System.out.println("Opening connection to " + urlString + "...");
              // reading input
              InputStream is = url.openStream();
              System.out.flush();
              // creating results$JOB_ID.csv file
              FileOutputStream fos=null;
              fos = new FileOutputStream("../data/HITresults/results" + jobId + ".csv");
              // writing to file
              int oneChar, count=0;
              while ((oneChar=is.read()) != -1)
              {
                    fos.write(oneChar);
                    count++;
              }
              // close everything
              is.close();
              fos.close();
              System.out.println("csv results file downloaded, " + count + " byte(s) copied");
        }
        catch (Exception e) { e.printStackTrace(); }
        // return the name of the file minus ".csv"
        return "results" + jobId ;
  }
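
The download step copies the stream one byte at a time and leaves both streams open if an exception occurs. As a sketch under stated assumptions, the same copy can be done with `java.nio.file.Files.copy` and try-with-resources. `ResultsSaver` and `save` are hypothetical names; the target directory is passed in as a parameter rather than hard-coded, and the sketch returns the full path minus ".csv", as the description above states.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class ResultsSaver {
    /**
     * Copies the stream to dir/results<jobId>.csv and returns that path
     * without its ".csv" extension.
     */
    static String save(InputStream in, Path dir, int jobId) throws IOException {
        Files.createDirectories(dir);
        Path target = dir.resolve("results" + jobId + ".csv");
        // try-with-resources closes the stream even if the copy fails
        try (InputStream source = in) {
            Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        }
        String full = target.toString();
        return full.substring(0, full.length() - ".csv".length());
    }
}
```

In the real method the stream would come from `url.openStream()`; any `InputStream` works, which also makes the copy easy to exercise without a network connection.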

List<Integer> crowdUser.CrowdManager.getSmaller ( )

Returns the value of smaller.

Returns: smaller

Definition at line 374 of file CrowdManager.java.

  { return this.smaller; }

void crowdUser.CrowdManager.readHITresults ( String path )

Adds the HIT results contained in the csv file in path to the dbSQL myDataBases.MySQLbase database.

The hard results are sent to refiner so that it can parse them. They are then sent to dbSQL to be stored. At this point, dbSQL contains all of the results obtained.

After that, the results are refined by refiner in order to reduce the noise; see the Refiner.getanotherlabel method for more details on how this is done. When this is over, the refined results are stored in dbSQL as well.

Parameters:

path	The path to the csv file containing the HIT results

See also: Refiner

Definition at line 306 of file CrowdManager.java.

  {
        List<String[]> entries ;
        // retrieve hard data from the csv file
        this.refiner.csv2List(path) ;
        entries = this.refiner.getData() ;
        this.dbSQL.insertHardDataInSQL(entries);
        // retrieve good data from the refiner
        this.refiner.getanotherlabel() ;
        entries = this.refiner.getData() ;
        this.dbSQL.insertResultsInSQL(entries,this.dbMedia) ;
  }

static void crowdUser.CrowdManager.thisIsTheEnd	(	String	antechamberUrl,
		String	results
	)		`[static]`

Asks the Antechamber to send an email signaling the end of the Splitsort.

Uses an HTTP POST (built with java.net) to the given URL to ask the Antechamber to send an e-mail containing the identifier of this last job, information on how to retrieve the results, and the content of the results String as an attachment.

Parameters:

antechamberUrl	The URL to which the information must be posted.
results	Content to be sent as an attachment file along with the e-mail. The name it will have is set in the Antechamber.

Definition at line 333 of file CrowdManager.java.

  {
        try
        {
              URL url = new URL(antechamberUrl);
              HttpURLConnection conn = (HttpURLConnection) url.openConnection();
              // Building an HTTP request
              conn.setDoOutput(true);
              conn.setRequestMethod("POST");
              OutputStreamWriter osw = new OutputStreamWriter(conn.getOutputStream());
              String data = URLEncoder.encode("signal", "UTF-8") + "=" + URLEncoder.encode("sort_finished","UTF-8")
                  + "&" + URLEncoder.encode("payload", "UTF-8") + "=" + URLEncoder.encode(results, "UTF-8");
              osw.write(data);
              osw.flush();
              osw.close();
              // getting the answer from the Antechamber
              BufferedReader rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));
              String line;
              while ((line = rd.readLine()) != null) {
                  System.out.println(line);
              }
              rd.close();
        } catch (Exception e) { e.printStackTrace(); }
  }
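
The request body built above is a standard form-encoded string. The sketch below isolates that step so it can be checked without opening a connection; `FormBody`, `encode` and `endOfSortBody` are hypothetical names, while the two field names (`signal`, `payload`) and the `sort_finished` value are the ones used by thisIsTheEnd().

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class FormBody {
    /** Encodes the fields as application/x-www-form-urlencoded, in insertion order. */
    static String encode(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        try {
            for (Map.Entry<String, String> f : fields.entrySet()) {
                body.add(URLEncoder.encode(f.getKey(), "UTF-8")
                        + "=" + URLEncoder.encode(f.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
        return body.toString();
    }

    /** Builds the body with the same two fields as thisIsTheEnd(). */
    static String endOfSortBody(String results) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("signal", "sort_finished");
        fields.put("payload", results);
        return encode(fields);
    }
}
```

Encoding matters here because the results string may contain `&` or `=`, which would otherwise be read as field separators by the receiving side.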

void crowdUser.CrowdManager.uploadHIT ( Integer jobId )

Uploads at CrowdFlower the csv file corresponding to the comparisons required.

The CSV (comma-separated values) file is generated first. The fields used are NOT set using config.xml; they are hard-coded. Indeed, CrowdFlower needs nothing more than the axis along which the comparison is performed, the identifiers of the media (so they can be sent back), their names (so they can be displayed) and a text associated with each one (for example to comply with a CC-BY license).

Once generated, it is sent using an HTTP PUT request containing the CSV file, the identifier of the job (jobId) and the key given by CrowdFlower (the cfKey attribute).

Parameters:

jobId The identifier of the job that will receive the data contained in the csv file.

Definition at line 187 of file CrowdManager.java.

  {
        this.comparisonsNumber += this.greater.size() ;
        String post = "";
        // writing the .csv file and the post String
        try{
              BufferedWriter fichier = new BufferedWriter
                              (new FileWriter("../data/HITdata/" + jobId + ".csv"));
              String line = "axis,idMedia1,idMedia2,urlMedia1,urlMedia2,miscData1,miscData2";
              fichier.write(line);
              fichier.newLine();
              post += line ;
              for (int i=0; i<this.greater.size(); i++)
              {
                    line = this.axes.get(i) + "," + this.greater.get(i) + "," + this.smaller.get(i) +
                        "," + this.dbMedia.getMedia(this.greater.get(i)) +
                        "," + this.dbMedia.getMedia(this.smaller.get(i)) +
                        ",\"" + this.dbMedia.getContent().get(this.greater.get(i)).get("miscData") +
                        "\",\"" + this.dbMedia.getContent().get(this.smaller.get(i)).get("miscData") + "\"";
                    fichier.write(line);
                    fichier.newLine();
                    post += "\n" + line ;
              }
              fichier.close();

        }
        catch (Exception e) { e.printStackTrace(); }
        
        // posting the file

        try
        {
              URL url = new URL("http://api.crowdflower.c../..bs/" + jobId +
                        "/upload.json?key=" + this.cfKey);
              HttpURLConnection conn = (HttpURLConnection) url.openConnection();
              // Building an HTTP request
              conn.setDoOutput(true);
              conn.setRequestMethod("PUT");
              conn.setRequestProperty("Content-Type", "text/csv");
              OutputStreamWriter osw = new OutputStreamWriter(conn.getOutputStream());
              osw.write(post);
              osw.flush();
              osw.close();
              
              // getting the answer from CrowdFlower
              BufferedReader rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));
              String line;
              while ((line = rd.readLine()) != null) {
                  System.out.println(line);
              }
              
              rd.close();
        } catch (Exception e) { e.printStackTrace(); }

  }
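
The CSV lines above wrap the miscData fields in quotes but do not escape quotes inside them, and the other fields are written as-is. As a sketch, a row can be built with RFC 4180-style quoting instead. `CsvRow`, `escape` and `join` are hypothetical names, and whether CrowdFlower's CSV parser expects exactly this quoting is an assumption.

```java
import java.util.List;
import java.util.stream.Collectors;

final class CsvRow {
    /** Quotes a field when it contains a comma, a quote or a line break; doubles inner quotes. */
    static String escape(String field) {
        if (field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    /** Joins the escaped fields with commas to form one CSV line (no line terminator). */
    static String join(List<String> fields) {
        return fields.stream().map(CsvRow::escape).collect(Collectors.joining(","));
    }
}
```

With this helper, a license text containing a comma or a quoted author name stays within a single column.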

Member Data Documentation

List<String> crowdUser.CrowdManager.axes [private]

The axes along which each comparison must be performed.

Definition at line 58 of file CrowdManager.java.

String crowdUser.CrowdManager.cfKey [private]

The key given by CrowdFlower to authenticate yourself.

If you have an account (and you definitely need one), it is available from your CrowdFlower account.

Definition at line 67 of file CrowdManager.java.

Integer crowdUser.CrowdManager.comparisonsNumber [private]

The number of comparisons performed during the current iteration.

Definition at line 62 of file CrowdManager.java.

MediaBase crowdUser.CrowdManager.dbMedia [private]

The MediaBase instance in which the names of the media are stored.

Definition at line 82 of file CrowdManager.java.

MySQLbase crowdUser.CrowdManager.dbSQL [private]

The MySQLbase instance to which it is connected and in which HIT results must be stored.

Definition at line 86 of file CrowdManager.java.

List<Integer> crowdUser.CrowdManager.greater [private]

The IDs of the media alleged to be "greater".

They are to be compared with those of the "smaller" attribute through HITs.

Definition at line 49 of file CrowdManager.java.

Integer crowdUser.CrowdManager.originalJobId [private]

The identifier of the "original" job, i.e. the one from which every characteristic (webhook address, tags, CML configuration...) is copied to all the jobs created during the current execution of CPS.

Definition at line 74 of file CrowdManager.java.

Refiner crowdUser.CrowdManager.refiner [private]

The Refiner instance used to process the results of the HITs.

Definition at line 90 of file CrowdManager.java.

String crowdUser.CrowdManager.resultsUrl [private]

The URL from which the HIT results are downloaded, most likely that of the Antechamber.

Definition at line 78 of file CrowdManager.java.

List<Integer> crowdUser.CrowdManager.smaller [private]

The IDs of the media alleged to be "smaller".

They are to be compared with those of the "greater" attribute through HITs.

Definition at line 54 of file CrowdManager.java.

The documentation for this class was generated from the following file: