Condor Workflow Job Type

Last Updated: January 2022

Important

This feature requires the condorpy library to be installed. Starting with Tethys 5.0, or if you are using micro-tethys-platform, you will need to install condorpy yourself using conda or pip as follows:

# conda: conda-forge channel strongly recommended
conda install -c conda-forge condorpy

# pip
pip install condorpy

A Condor Workflow provides a way to run a group of jobs (which can have hierarchical relationships) as a single (Tethys) job. The hierarchical relationships are defined as parent-child relationships. For example, suppose a workflow is defined with three jobs: JobA, JobB, and JobC, which must be run in that order. These jobs would be defined with the following relationships: JobA is the parent of JobB, and JobB is the parent of JobC.
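To make the ordering concrete, the parent-child relationships from this example can be modeled as a small dependency graph. The sketch below uses only Python's standard-library graphlib module (Python 3.9+), not condorpy or Tethys, so it illustrates the run order that the relationships imply rather than how HTCondor actually schedules the jobs.

```python
from graphlib import TopologicalSorter

# Map each job to the set of its parents (jobs that must finish first).
parents = {
    'JobA': set(),
    'JobB': {'JobA'},
    'JobC': {'JobB'},
}

# static_order() yields a job only after all of its parents have been yielded.
run_order = list(TopologicalSorter(parents).static_order())
print(run_order)  # ['JobA', 'JobB', 'JobC']
```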

Creating a Condor Workflow

Creating a Condor Workflow job involves three steps:

1. Create an empty workflow job from the job manager.

2. Create the jobs that will make up the workflow with CondorWorkflowJobNode.

3. Define the relationships among the nodes.

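import os

from tethysapp.my_first_app.app import MyFirstApp as app
from tethys_sdk.jobs import CondorWorkflowJobNode
from tethys_sdk.workspaces import app_workspace

# Job manager for this app, used below to create the workflow
job_manager = app.get_job_manager()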


@app_workspace
def some_controller(request, app_workspace):
    workflow = job_manager.create_job(
        name='MyWorkflowABC',
        user=request.user,
        job_type='CONDORWORKFLOW',
        scheduler=app.get_scheduler('condor_primary'),
    )
    workflow.save()

    job_a = CondorWorkflowJobNode(
        name='JobA',
        workflow=workflow,
        condorpy_template_name='vanilla_transfer_files',
        remote_input_files=(
            os.path.join(app_workspace.path, 'my_script.py'),
            os.path.join(app_workspace.path, 'input_1'),
            os.path.join(app_workspace.path, 'input_2')
        ),
        attributes=dict(
            executable='my_script.py',
            transfer_input_files=('../input_1', '../input_2'),
            transfer_output_files=('example_output1', 'example_output2'),
        )
    )
    job_a.save()

    job_b = CondorWorkflowJobNode(
        name='JobB',
        workflow=workflow,
        condorpy_template_name='vanilla_transfer_files',
        remote_input_files=(
            os.path.join(app_workspace.path, 'my_script.py'),
            os.path.join(app_workspace.path, 'input_1'),
            os.path.join(app_workspace.path, 'input_2')
        ),
        attributes=dict(
            executable='my_script.py',
            transfer_input_files=('../input_1', '../input_2'),
            transfer_output_files=('example_output1', 'example_output2'),
        ),
    )
    job_b.save()

    job_c = CondorWorkflowJobNode(
        name='JobC',
        workflow=workflow,
        condorpy_template_name='vanilla_transfer_files',
        remote_input_files=(
            os.path.join(app_workspace.path, 'my_script.py'),
            os.path.join(app_workspace.path, 'input_1'),
            os.path.join(app_workspace.path, 'input_2')
        ),
        attributes=dict(
            executable='my_script.py',
            transfer_input_files=('../input_1', '../input_2'),
            transfer_output_files=('example_output1', 'example_output2'),
        ),
    )
    job_c.save()

    job_b.add_parent(job_a)
    job_c.add_parent(job_b)

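    # Save the workflow to persist the relationships without submitting it
    workflow.save()
    # or execute it, which submits the workflow and saves it automatically
    workflow.execute()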
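For a strictly linear workflow like the one above, the relationships can also be defined in a loop instead of one add_parent call per pair. The following is a minimal sketch of that idea: chain_nodes is a hypothetical helper (not part of Tethys or condorpy), and StubNode is a stand-in so the example runs without a Tethys installation. With real nodes, each CondorWorkflowJobNode would already be saved before being passed in.

```python
def chain_nodes(nodes):
    """Make each node the parent of the node that follows it in the list."""
    for parent, child in zip(nodes, nodes[1:]):
        child.add_parent(parent)


class StubNode:
    """Hypothetical stand-in for a saved CondorWorkflowJobNode."""

    def __init__(self, name):
        self.name = name
        self.parents = []

    def add_parent(self, parent):
        # Record the parent; the real node stores this relationship itself.
        self.parents.append(parent)


job_a, job_b, job_c = StubNode('JobA'), StubNode('JobB'), StubNode('JobC')
chain_nodes([job_a, job_b, job_c])
print([p.name for p in job_c.parents])  # ['JobB']
```

This is equivalent to calling job_b.add_parent(job_a) followed by job_c.add_parent(job_b).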

Note

The CondorWorkflow object must be saved before the CondorWorkflowJobNode objects can be instantiated, and the CondorWorkflowJobNode objects must be saved before you can define the relationships.

The job must be saved before the controller returns a response; otherwise, the changes made to the job will be lost (executing the job automatically saves it). If submitting the job takes a long time (e.g., when a large amount of data has to be uploaded to a remote scheduler), it may be best to execute the job from an AJAX request so the page does not block while the job is submitted.
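One way to structure the AJAX approach is to keep the execution logic in a small function that returns a JSON-serializable result, which a dedicated controller can then wrap in Django's JsonResponse. The sketch below makes several assumptions: execute_job_payload and the payload keys are hypothetical names chosen for illustration, and StubJob stands in for a Tethys job so the example runs on its own.

```python
def execute_job_payload(job):
    """Execute a job and return a JSON-serializable summary of the outcome."""
    try:
        job.execute()  # executing the job also saves it
    except Exception as exc:
        return {'success': False, 'error': str(exc)}
    return {'success': True, 'job_name': job.name}


class StubJob:
    """Hypothetical stand-in for a Tethys job object."""

    def __init__(self, name, fail=False):
        self.name = name
        self.fail = fail
        self.executed = False

    def execute(self):
        if self.fail:
            raise RuntimeError('scheduler unreachable')
        self.executed = True


ok_payload = execute_job_payload(StubJob('MyWorkflowABC'))
bad_payload = execute_job_payload(StubJob('MyWorkflowABC', fail=True))
print(ok_payload)
print(bad_payload)
```

In an app, a controller that handles the AJAX request would look up the real workflow and return JsonResponse(execute_job_payload(workflow)), leaving the client-side script to report success or failure to the user.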

API Documentation

class tethys_compute.models.CondorWorkflow(*args, **kwargs): CondorPy Workflow job type

class tethys_compute.models.CondorWorkflowNode(*args, **kwargs): Base class for CondorWorkflow Nodes

class tethys_compute.models.CondorWorkflowJobNode(*args, **kwargs): CondorWorkflow JOB type node
