Concepts
Each input for the analyst is called a Task. Tasks can be of different kinds, such as Image Annotation or Search Relevance, depending on the project. A task may go through multiple levels, depending on the required SLA. Within a single level, a task can be analysed by multiple analysts when Parallel execution is used. In that case, depending on the customer requirements, a majority rule can be adopted to determine the final value. Typically, the analysts are associated with a Data Service Partner organisation.
A Task typically contains a set of Input fields, which hold the data presented to the Analyst, and a set of Output fields, which hold the information curated by the Analyst.
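The majority rule mentioned above can be illustrated with a minimal sketch. This is not Taskmonk's implementation: the function name `majority_value` and the choice of a strict majority (more than half of the answers) are assumptions made for illustration, and the actual rule depends on the customer requirements.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_value(answers: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value given by more than half of the analysts, else None.

    Hypothetical helper: `answers` holds one output value per analyst
    who worked on the same task at the same level.
    """
    if not answers:
        return None
    value, count = Counter(answers).most_common(1)[0]
    # Strict majority (assumption): a tie or a plurality of 50% or less
    # yields no final value.
    return value if count * 2 > len(answers) else None
```

With three analysts answering `"cat"`, `"cat"` and `"dog"`, this sketch returns `"cat"`; with two analysts who disagree, it returns `None`, which a project could treat as a signal to escalate the task.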
All tasks belong to a Project, which is the basic unit for all configuration and role-based access control (RBAC). Fields, levels, permissions and users are assigned at the project level. Projects are typically created by the Annotation Partner or the Taskmonk Support team through the portal.
Tasks are grouped into logical units called Batches. Batches help in correlating the tasks between the customer and the annotation partner. Typically, a batch corresponds to an input file and its corresponding output file. New batches can be created using the Create Batch API.
Typically, tasks are completed by the analysts in the order in which they are added to TaskMonk. A batch can have a priority associated with it, which can be modified using the Edit Batch API. This helps when one set of tasks needs to be completed by the analysts before other tasks. Batches with higher priority are processed earlier.
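The ordering described above can be sketched as a sort by priority, with the order of addition as the tie-breaker. The `Batch` class, its field names and the convention that a larger number means a higher priority are all assumptions for illustration; they do not describe the Taskmonk data model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Batch:
    name: str
    priority: int     # assumption: a larger number means a higher priority
    added_order: int  # assumption: position in which the batch was added


def processing_order(batches: List[Batch]) -> List[str]:
    """Return batch names in the order they would be picked up."""
    # Higher priority first; among equal priorities, earlier batches first.
    ordered = sorted(batches, key=lambda b: (-b.priority, b.added_order))
    return [b.name for b in ordered]
```

For example, if batches `a`, `b` and `c` are added in that order and only `b` is given a higher priority, the sketch yields `b`, `a`, `c`.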
Input
Tasks can be input into a batch in Taskmonk through multiple methods using the APIs. A single task can be uploaded, or multiple tasks can be uploaded at the same time. There are two different APIs to upload a group of tasks:
Using the streaming client
The upload APIs return a job_id, which is used to track the upload of the tasks to the TaskMonk database. When the number of tasks is large, or when pre-processing needs to be done, the import process can take a few minutes. The job_id allows the user to query the status of the import process. TaskMonk also supports email and Slack notifications, which can inform the user when the import process is complete.
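Tracking an import by its job_id usually amounts to polling until the job reaches a final state. The sketch below shows that pattern only. The `get_job_status` callable is a placeholder for whatever client call returns the job status, and the status strings `"COMPLETED"` and `"FAILED"` are assumptions rather than documented Taskmonk values.

```python
import time
from typing import Callable


def wait_for_job(
    job_id: str,
    get_job_status: Callable[[str], str],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll a job until it reaches a final state, then return that state.

    `get_job_status` is a hypothetical callable supplied by the caller;
    the final states checked below are assumed names.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_job_status(job_id)
        if status in ("COMPLETED", "FAILED"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"job {job_id} did not finish within {timeout_seconds} seconds"
            )
        # Wait before asking again so the API is not flooded with requests.
        sleep(poll_seconds)
```

Passing the status lookup and the sleep function as parameters keeps the sketch independent of any particular client library and makes it easy to exercise with a fake status source.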
Output
As analysts work on the tasks, the user can get the status of the batch by calling the batch status API. This API returns the number of completed tasks and the number of pending tasks.
The user can retrieve the updated output for the tasks in multiple ways:
Real-time updates: The user can set up a streaming client, which is notified of each updated task as and when the Annotation Partner completes working on it.
Batch output: When the batch is fully completed by the annotation partner, the user can extract the complete output file. There are two methods to download the batch output:
Get output file. This API returns a job_id and a file_url, which will be valid once the job is complete.
Get dictionary output. This supports paginated output, and the number of tasks to be retrieved in each call can be specified.
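Paginated retrieval of a batch's output can be sketched as a loop that requests one page at a time until a short or empty page comes back. The `fetch_page` callable and its `(offset, page_size)` parameters are hypothetical stand-ins for the actual dictionary output call, whose real parameter names are not given here.

```python
from typing import Callable, Dict, Iterator, List

Task = Dict[str, object]


def iter_batch_output(
    fetch_page: Callable[[int, int], List[Task]],
    page_size: int = 100,
) -> Iterator[Task]:
    """Yield every task of a batch, fetching `page_size` tasks per call.

    `fetch_page(offset, page_size)` is a hypothetical callable that
    returns at most `page_size` tasks starting at `offset`.
    """
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            return
        yield from page
        # A page shorter than requested means there is nothing left.
        if len(page) < page_size:
            return
        offset += len(page)
```

Because the function is a generator, the caller can start processing tasks as soon as the first page arrives instead of holding the whole batch in memory.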