Need: A developer is needed to build a Scheduler that controls jobs and the running of custom objects within a processing engine.
We are building a processing engine to perform database actions with a user-configurable workflow. This is a three-tier system. Tier one is the database and the definition of objects. It will present a list of object definitions encoded in JSON for the upper layers to manipulate, and an implementation which can be executed at the command line in Linux. On top of this is a scheduler service responsible for executing and queuing database actions at the request of the user interface or in response to the successful or failed termination of other processes. The scheduler will be controlled by a user interface layer, which will communicate with the scheduler via HTTP.
We are contracting for a developer to build only the Scheduler component of the overall system.
The scheduler will control the running of objects within the processing engine. It will hold the current state of all workflows known to the system, and it will execute objects within the processing engine based on the rules held in the definition and configuration of the object files.
The scheduler is to run on a Linux (CentOS) box and should be written in a suitable systems-type programming language (Perl, Python or similar). It may be based on, and be an extension of, a pre-existing product, but if so the pre-existing product must be open source.
A workflow is represented visually as a flow diagram of interconnected nodes. Internally, nodes will be stored individually, each with its own configuration. Execution of a process node will trigger the next process node in the workflow. All nodes in the system are defined in JSON and are referred to as scheduler objects. Each object will include several data portions, depending on its state in the system:
Definition portion - a template for the configuration parameters required for this node type. Provided by the database engine.
Configuration portion - configuration details for this instance of the node. Provided by the user interface. Will include links to following nodes to control execution paths.
Payload portion - Space for information to be passed between nodes. The UI and scheduler are not required to parse this.
Status/Results portion - provided by the database process. Part of this will be displayed by the user interface to show the overall status of any particular workflow.
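To make the four portions concrete, here is a minimal sketch of how one scheduler object might look once encoded as JSON. The key names (`definition`, `configuration`, `payload`, `status`), the `file_watch` node type and the `next` link field are illustrative assumptions only; the actual schema is to be provided by the database engine.

```python
import json


def new_scheduler_object(definition, configuration):
    """Build a scheduler object holding the four portions described above.

    All key names here are hypothetical placeholders, not a fixed schema.
    """
    return {
        "definition": definition,        # template supplied by the database engine
        "configuration": configuration,  # instance settings supplied by the UI
        "payload": {},                   # opaque data passed between nodes
        "status": {},                    # filled in by the database process
    }


obj = new_scheduler_object(
    definition={"type": "file_watch", "parameters": {"directory": "string"}},
    configuration={"directory": "/tmp/incoming", "next": ["node-2"]},
)

# The object travels between tiers as JSON text.
encoded = json.dumps(obj)
```

Keeping the payload as an opaque value means the scheduler can store and forward it without needing to understand its contents.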
The scheduler will interface to the processing engine via command line execution. It will execute the process object as required, passing the whole of the object definition via stdin. Messages will be returned from the process object via stdout. These will include intermediate status information and final results information. Final results may also include further instructions to the scheduler (for instance, to execute the next object in the workflow) and a payload section of information for the following process.
The scheduler will present a control interface for the user interface to use. The user interface will probably be on a separate server, so this interface must be network capable. It will provide access to the object templates, methods to read and write workflows of joined, configured objects, and status reports for running or completed objects. It will also accept control commands, for example to queue the execution of objects or to manually terminate running objects.
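The stdin/stdout exchange described above could be sketched roughly as follows. This assumes, purely for illustration, that the process object writes one JSON message per line on stdout. The message fields (`kind`, `progress`, `next`) are hypothetical, and a small inline Python script stands in for a real process object.

```python
import json
import subprocess
import sys


def run_process_object(command, scheduler_object):
    """Run one process object, passing the whole object as JSON on stdin.

    Returns the exit code and the messages read from stdout, assuming
    (hypothetically) one JSON document per output line.
    """
    completed = subprocess.run(
        command,
        input=json.dumps(scheduler_object),
        capture_output=True,
        text=True,
        check=False,  # a failed termination is a normal outcome to report
    )
    messages = [
        json.loads(line)
        for line in completed.stdout.splitlines()
        if line.strip()
    ]
    return completed.returncode, messages


# Stand-in for a real process object: reads the object, emits two messages.
STAND_IN = (
    "import json, sys\n"
    "obj = json.load(sys.stdin)\n"
    "print(json.dumps({'kind': 'status', 'progress': 50}))\n"
    "print(json.dumps({'kind': 'result', 'ok': True,"
    " 'next': obj['configuration']['next']}))\n"
)

returncode, messages = run_process_object(
    [sys.executable, "-c", STAND_IN],
    {"configuration": {"next": ["node-2"]}, "payload": {}},
)
```

This version only collects messages after the process has exited. Reporting intermediate status while a process is still running would need a streaming read, for example with `subprocess.Popen` and a loop over its stdout.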
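As a rough illustration of such a network-capable control interface, the sketch below exposes two HTTP endpoints using only the Python standard library. The paths (`/queue`, `/status/<id>`), the JSON request and response shapes, and the in-memory `run_queue` and `statuses` stores are all assumptions made for the example, not part of the specification.

```python
import json
from collections import deque
from http.server import BaseHTTPRequestHandler, HTTPServer

run_queue = deque()  # object ids waiting to be executed (hypothetical store)
statuses = {}        # object id -> last known status (hypothetical store)


class ControlHandler(BaseHTTPRequestHandler):
    def _send_json(self, code, body):
        data = json.dumps(body).encode("utf-8")
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self):
        # GET /status/<object id> returns the last known status.
        if self.path.startswith("/status/"):
            object_id = self.path[len("/status/"):]
            if object_id in statuses:
                self._send_json(200, {"id": object_id, "status": statuses[object_id]})
                return
        self._send_json(404, {"error": "not found"})

    def do_POST(self):
        # POST /queue with body {"id": "<object id>"} queues an object.
        if self.path == "/queue":
            length = int(self.headers.get("Content-Length", 0))
            request = json.loads(self.rfile.read(length) or b"{}")
            object_id = request.get("id")
            if object_id:
                run_queue.append(object_id)
                statuses[object_id] = "queued"
                self._send_json(202, {"id": object_id, "status": "queued"})
                return
        self._send_json(400, {"error": "bad request"})

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(host="127.0.0.1", port=0):
    """Create the control server; port 0 lets the OS pick a free port."""
    return HTTPServer((host, port), ControlHandler)
```

Calling `make_server("0.0.0.0", 8080).serve_forever()` would make the interface reachable from a separate user interface server. A real deployment would also need authentication and endpoints for templates, workflows and termination, which are omitted here.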
The scheduler shall maintain several process queues for different types of objects as specified in the object template definition. These may be implemented as a single queue as long as the following behaviours are supported.
The first queue will contain a copy of all objects which require to be run at regular intervals. These objects will include such things as timers, processes waiting for files to arrive in a directory etc. Every object in this queue shall be executed in turn and replaced at the end of the queue after e