Jobs API

Last Updated: September 2024

The Jobs API provides a way for your app to run asynchronous tasks (meaning that after starting a task you don't have to wait for it to finish before moving on). As an example, you may need to run a simulation that takes a long time (potentially hours or days) to complete. Using the Jobs API you can create a job that submits the simulation run and then leave it running while your app moves on to other work. You can check the job's status at any time, and when the job is done the Jobs API will help retrieve the results.

Job Management Software

The Jobs API leverages third-party job management systems to perform the actual execution of the jobs, including HTCondor and Dask. HTCondor is well suited for executing long-running executable-like jobs, such as running a simulation. Dask is ideal for converting existing Python logic into efficient, parallel running code, especially logic that uses the SciPy stack to perform the analysis.

To learn more about HTCondor and Dask, review the following resources:

Schedulers

A scheduler is the part of a job management system that is responsible for accepting job requests and assigning them to the appropriate computing resources to be executed. The Jobs API needs to know how to connect to the scheduler to be able to submit jobs. This is done by adding Scheduler entries in the Tethys Portal admin pages (see: Tethys Compute Admin Pages: Schedulers).

Note

Schedulers can also be created and accessed programmatically using the lower-level compute API. However, this approach is only recommended for experienced developers (see: Low-Level Scheduler API).

Scheduler App Settings

Apps that use either HTCondor or Dask should define one or more app Scheduler Settings in their app class. Tethys Portal administrators use this setting to assign an appropriate Scheduler to the app when it is being configured. To access Schedulers assigned to app Scheduler Settings, use the get_scheduler() method of the app class. See Scheduler Settings for more details.

Job Manager

To facilitate interacting with jobs asynchronously, the metadata of the jobs are stored in a database. The Jobs API provides a Job Manager to handle the details of working with the database, and provides a simple interface for creating and retrieving jobs. The Jobs API supports various types of jobs (see Job Types).

Using the Job Manager in your App

To use the Job Manager in your app you first need to import the TethysAppBase subclass from the app.py module:

from .app import App

You can then get the job manager by calling the method get_job_manager on the app.

job_manager = App.get_job_manager()

You can now use the job manager to create a new job, or retrieve an existing job or jobs.

Creating and Executing a Job

To create a new job call the create_job method on the job manager. The required arguments are:
  • name: A unique string identifying the job

  • user: A user object, usually from the request argument: request.user

  • job_type: A string specifying one of the supported job types (see Job Types)
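Because name is expected to be unique, one option is to derive it from a readable prefix plus a random suffix. The following is a minimal sketch using only the standard library. The helper name make_job_name and the 8-character suffix length are illustrative choices, not part of the Jobs API, and a short random suffix makes collisions unlikely rather than impossible.

```python
import re
import uuid


def make_job_name(prefix: str) -> str:
    """Return a job name made of a sanitized prefix and a random suffix."""
    # Keep letters, digits, underscores and hyphens; replace anything else.
    safe_prefix = re.sub(r"[^A-Za-z0-9_-]+", "_", prefix).strip("_") or "job"
    # uuid4().hex has 32 hex characters; only 8 are kept to keep names short.
    return f"{safe_prefix}_{uuid.uuid4().hex[:8]}"


name = make_job_name("my simulation")
print(name)  # the suffix differs on every call
```

The resulting string could then be passed as the name argument of create_job.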

Any other job attributes can also be passed in as kwargs.

import os

# create a new job from the job manager
# (request and app_workspace are assumed to come from the calling controller)
job = job_manager.create_job(
    name='myjob_{id}',  # required
    user=request.user,  # required
    job_type='CONDOR',  # required

    # any other properties can be passed in as kwargs
    attributes=dict(attribute1='attr1'),
    condorpy_template_name='vanilla_transfer_files',
    remote_input_files=(
        os.path.join(app_workspace, 'my_script.py'),
        os.path.join(app_workspace, 'input_1'),
        os.path.join(app_workspace, 'input_2')
    )
)

# properties can also be added after the job is created
job.extended_properties = {'one': 1, 'two': 2}

# each job type may provide methods to further specify the job
job.set_attribute('executable', 'my_script.py')

# save or execute the job
job.save()
# or
job.execute()

Before a controller returns a response the job must be saved or else all of the changes made to the job will be lost (executing the job automatically saves it). If submitting the job takes a long time (e.g. if a large amount of data has to be uploaded to a remote scheduler) then it may be best to use AJAX to call the controller that executes the job in the background.

Tip

The Jobs Table Gizmo has a built-in mechanism for submitting jobs with AJAX. If the Jobs Table Gizmo is used to submit the jobs then be sure to save the job after it is created.

Common Attributes

Job attributes can be passed into the create_job method of the job manager or they can be specified after the job is instantiated. All jobs have a common set of attributes. Each job type may have additional attributes that are specific to that job type.

The following attributes can be defined for all job types:
  • name (string, required): a unique identifier for the job. This should not be confused with the job template name. The template name identifies a template from which jobs can be created and is set when the template is created. The job name attribute is defined when the job is created (see Creating and Executing a Job).

  • description (string): a short description of the job.

  • workspace (string): a path to a directory that will act as the workspace for the job. Each job type may interact with the workspace differently. By default the workspace is set to the user's workspace in the app that is creating the job.

  • extended_properties (dict): a dictionary of additional properties that can be used to create custom job attributes.

  • status (string): a string representing the state of the job. When accessed the status will be updated if necessary. Possible statuses are:
    • Pending

    • Submitted

    • Running

    • Paused

    • Results-Ready

    • Complete

    • Error

    • Aborted

    • Various*

    • Various-Complete*

    • Other**

    *Used for job types with multiple sub-jobs (e.g. CondorWorkflow).

    **When a custom job status is set the official status is 'Other', but the custom status is stored as an extended property of the job. See Custom Statuses.

  • cached_status (string): Same as the status attribute, except that the status is not actively updated. Rather the last known status is returned.

All job types also have the following read-only attributes:
  • user (User): the user who created the job.

  • label (string): the package name of the Tethys App used to create the job.

  • creation_time (datetime): the time the job was created.

  • execute_time (datetime): the time that job execution was started.

  • start_time (datetime):

  • completion_time (datetime): the time that the job status changed to 'Complete'.

Job Types

The Jobs API is designed to support multiple job types. Each job type provides a different framework and environment for executing jobs. When creating a new job you must specify its type by passing in the job_type argument. Supported values for job_type are:
  • "BASIC"

  • "CONDOR" or "CONDORJOB"

  • "CONDORWORKFLOW"

  • "DASK"

For detailed documentation on each of the job types see:

Retrieving Jobs

Two methods are provided to retrieve jobs: list_jobs and get_job. Jobs are automatically filtered by app. An optional user parameter can be passed in to these methods to further filter jobs by the user.

# get list of all jobs created in your app
job_manager.list_jobs()

# get list of all jobs created by current user in your app
job_manager.list_jobs(user=request.user)

# get job with id of 27
job_manager.get_job(job_id=27)

# get job with id of 27 only if it was created by current user
job_manager.get_job(job_id=27, user=request.user)

Caution

Be thoughtful about how you retrieve jobs. The user filter is provided to prevent unauthorized users from accessing jobs that don't belong to them.
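One way to apply the user filter consistently is to route every lookup through a small helper. The sketch below calls get_job with the job_id and user arguments shown above. The helper name get_job_for_user is hypothetical, FakeJobManager is an in-memory stand-in used only so the example runs without a portal, and the assumption that get_job returns None when nothing matches should be checked against your Tethys version.

```python
def get_job_for_user(job_manager, user, job_id):
    """Return the job only if it belongs to `user`; raise LookupError otherwise."""
    # Passing `user` lets the job manager apply the ownership filter.
    job = job_manager.get_job(job_id=job_id, user=user)
    if job is None:
        # Assumption: get_job returns None when no job matches the filters.
        raise LookupError(f"No job {job_id} found for this user.")
    return job


class FakeJobManager:
    """In-memory stand-in used only to exercise the helper; not part of Tethys."""

    def __init__(self, owners):
        self._owners = owners  # maps job id -> owner name

    def get_job(self, job_id, user=None, filters=None):
        owner = self._owners.get(job_id)
        if owner is None or (user is not None and owner != user):
            return None
        return {"id": job_id, "owner": owner}


manager = FakeJobManager({27: "alice"})
print(get_job_for_user(manager, "alice", 27))
```

In a controller, the real job manager and request.user would take the place of the stand-ins, and the LookupError could be translated into a "not found" response.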

Jobs Table Gizmo

The Jobs Table Gizmo facilitates job management through the web interface and is designed to be used in conjunction with the Job Manager. It can be configured to list any of the properties of the jobs, and will automatically update the job status. It can also provide a list of actions that can be performed on a job. In addition to several built-in actions (including run, delete, viewing job results, etc.), developers can also create custom actions to include in the actions dropdown list. Note that while none of the built-in actions are asynchronous on any of the built-in Job Types, the Jobs Table supports both synchronous and asynchronous actions. Custom actions or the built-in actions of custom job types may be asynchronous. The following code sample shows how to use the job manager to populate the jobs table:

job_manager = App.get_job_manager()

jobs = job_manager.list_jobs(request.user)

jobs_table_options = JobsTable(
    jobs=jobs,
    column_fields=('id', 'name', 'description', 'creation_time', 'execute_time'),
    actions=['run', 'resubmit', '|', 'logs', '|', 'terminate', 'delete'],
    hover=True,
    striped=False,
    bordered=False,
    condensed=False,
    results_url=f'{App.package}:results_controller',
)

See also

Jobs Table

Job Status Callback

Each job has a callback URL that will update the job's status. The URL is of the form:

http://<host>/update-job-status/<job_id>/

For example, a URL may look something like this:

http://example.com/update-job-status/27/

The response would look something like this:

{"success": true}

This URL can be retrieved from the job manager with the get_job_status_callback_url method, which requires a request object and the id of the job.

job_manager = App.get_job_manager()
callback_url = job_manager.get_job_status_callback_url(request, job_id)

The callback URL can be used to update the job's status after a specified delay by passing the delay query parameter:

http://<host>/update-job-status/<job_id>/?delay=<delay_in_seconds>

For example, to schedule a job update in 30 seconds:

http://<host>/update-job-status/27/?delay=30

In this case the response would look like this:

{"success": "scheduled"}

This delay can be useful so the job itself can hit the endpoint just before completing to trigger the Tethys Portal to check its status after it has time to complete and exit. This will allow the portal to register that the job has completed and start any data transfer that is triggered upon job completion.
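The delayed form of the URL can be assembled from the value returned by get_job_status_callback_url. The following is a minimal sketch using only the standard library. The helper names with_delay and update_was_scheduled are hypothetical, and the response check relies only on the two response bodies shown above.

```python
import json
from urllib.parse import urlencode, urlsplit, urlunsplit


def with_delay(callback_url: str, delay_seconds: int) -> str:
    """Return callback_url with a delay query parameter added."""
    if delay_seconds < 0:
        raise ValueError("delay_seconds must not be negative")
    parts = urlsplit(callback_url)
    query = urlencode({"delay": delay_seconds})
    # Preserve any query string that is already present on the URL.
    if parts.query:
        query = f"{parts.query}&{query}"
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


def update_was_scheduled(response_body: str) -> bool:
    """True if the response body matches the delayed form shown above."""
    return json.loads(response_body).get("success") == "scheduled"


print(with_delay("http://example.com/update-job-status/27/", 30))
```

A job script could request the resulting URL, for example with urllib.request.urlopen, as its last step before exiting.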

Custom Statuses

Custom statuses can be given to jobs simply by assigning the status attribute:

my_job.status = "Custom Status"

However, note that the TethysJob.update_status method will only check for updated statuses of jobs where the current status is one of the TethysJob.NON_TERMINAL_STATUSES. The default TethysJob.NON_TERMINAL_STATUSES are:
  • Pending

  • Submitted

  • Running

  • Various

  • Paused

Also note that the Jobs Table Gizmo will only actively poll the status of jobs that have one of the TethysJob.ACTIVE_STATUSES. The default TethysJob.ACTIVE_STATUSES are:
  • Submitted

  • Running

  • Various

If you would like to classify a custom status to take advantage of these features then there are several methods on the TethysJob class to add custom statuses to various categories. For example:

TethysJob.add_custom_active_status("Custom Status")

This will ensure that the jobs table will continue to poll the server to update the status and that the TethysJob.update_status method will check if the status has changed. See the details of these methods below in the API documentation for Tethys Job:
  • add_custom_pre_running_status

  • add_custom_running_status

  • add_custom_active_status

  • add_custom_terminal_status

Note

When a custom status is added to the TethysJob.NON_TERMINAL_STATUSES, the status will be updated when the TethysJob.update_status method is called. This is the intended behavior; however, it may be necessary to modify the TethysJob.update_status method to add logic that preserves the desired custom status. This is most easily done by subclassing the TethysJob class (or one of its subclasses).
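To make the effect of classifying a status concrete, the following stand-alone sketch models the two default lists given in this section with plain Python lists. It is not the TethysJob implementation. The function add_custom_active_status here only imitates the documented behavior (adding the status to both ACTIVE_STATUSES and NON_TERMINAL_STATUSES), and the two predicate helpers are hypothetical names.

```python
# Defaults copied from the lists in this section.
NON_TERMINAL_STATUSES = ["Pending", "Submitted", "Running", "Various", "Paused"]
ACTIVE_STATUSES = ["Submitted", "Running", "Various"]


def add_custom_active_status(status: str) -> None:
    """Imitate the documented effect: add to ACTIVE and NON_TERMINAL lists."""
    for category in (ACTIVE_STATUSES, NON_TERMINAL_STATUSES):
        if status not in category:
            category.append(status)


def is_polled_by_jobs_table(status: str) -> bool:
    """The Jobs Table Gizmo polls only jobs whose status is active."""
    return status in ACTIVE_STATUSES


def is_checked_by_update_status(status: str) -> bool:
    """update_status refreshes only jobs whose status is non-terminal."""
    return status in NON_TERMINAL_STATUSES


print(is_polled_by_jobs_table("Custom Status"))  # False: not classified yet
add_custom_active_status("Custom Status")
print(is_polled_by_jobs_table("Custom Status"))  # True
print(is_polled_by_jobs_table("Paused"))         # False: non-terminal but not active
```

An unclassified custom status falls in neither list, so it is neither polled by the table nor refreshed by update_status until it is classified.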

API Documentation

Job Manager

class tethys_compute.job_manager.JobManager(app)

A manager for interacting with the Jobs database, providing a simple interface for creating and retrieving jobs.

Note

Each app creates its own instance of the JobManager. The get_job_manager method of the app class returns that instance.

from .app import App

job_manager = App.get_job_manager()

create_job(name, user, groups=None, job_type=None, **kwargs)

Creates a new job of the given type.

Parameters:
  • name (str) -- The name of the job.

  • user (django.contrib.auth.User) -- A User object for the user who creates the job.

  • groups (django.contrib.auth.Group, optional) -- A list of Group objects assigned to the job. The job will be saved automatically if groups are passed in. Default is None.

  • job_type (TethysJob) -- A subclass of TethysJob.

  • **kwargs

Returns:

A new job object of the type specified by job_type.

get_job(job_id, user=None, filters=None)

Gets a job by id.

Parameters:
  • job_id (int) -- The id of the job to get.

  • user (django.contrib.auth.User, optional) -- The user to filter the jobs by.

Returns:

An instance of a subclass of TethysJob if a job with job_id exists (and was created by user if the user argument is passed in).

get_job_status_callback_url(request, job_id)

Gets the absolute URL to call to update the job status.

list_jobs(user=None, groups=None, order_by='id', filters=None)

Lists all the jobs from current app for current user.

Parameters:
  • user (django.contrib.auth.User, optional) -- The user to filter the jobs by. Default is None. This parameter cannot be passed together with the groups parameter. Choose one or the other.

  • groups (django.contrib.auth.Group, optional) -- One or more Group objects to filter the jobs by. Default is None. This parameter cannot be passed together with the user parameter. Choose one or the other.

  • order_by (str, optional) -- An expression to order jobs. Default is 'id'.

  • filters (dict, optional) -- A dictionary of key-value pairs to filter the jobs by. Default is None.

Returns:

A list of jobs created in the app (and by the user if the user argument is passed in).
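Since user and groups are not meant to be combined, a thin wrapper can reject the combination before the call is made. The sketch below forwards to list_jobs using the parameter names from the signature above. The wrapper name list_jobs_scoped is hypothetical, RecordingJobManager is a stand-in used only so the example runs on its own, and raising ValueError is a choice made for this sketch rather than documented behavior.

```python
def list_jobs_scoped(job_manager, user=None, groups=None, order_by="id"):
    """Call list_jobs with either a user or groups, never both."""
    if user is not None and groups is not None:
        raise ValueError("Pass either user or groups, not both.")
    return job_manager.list_jobs(user=user, groups=groups, order_by=order_by)


class RecordingJobManager:
    """Stand-in that records how list_jobs was called; not part of Tethys."""

    def __init__(self):
        self.calls = []

    def list_jobs(self, user=None, groups=None, order_by="id", filters=None):
        self.calls.append({"user": user, "groups": groups, "order_by": order_by})
        return []


recorder = RecordingJobManager()
list_jobs_scoped(recorder, user="alice")
print(recorder.calls)
```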

Tethys Job

class tethys_compute.models.TethysJob(*args, **kwargs)

Base class for all job types. This is intended to be an abstract class that is not directly instantiated.

exception DoesNotExist
exception MultipleObjectsReturned
classmethod add_custom_active_status(status)

Classify a custom status as an "Active" Status. The status will be added to ACTIVE_STATUSES and NON_TERMINAL_STATUSES.

Parameters:

status (str) -- The name of the status to classify

classmethod add_custom_pre_running_status(status)

Classify a custom status as a "Pre-Running" Status. The status will be added to PRE_RUNNING_STATUSES and NON_TERMINAL_STATUSES.

Parameters:

status (str) -- The name of the status to classify

classmethod add_custom_running_status(status)

Classify a custom status as a "Running" Status. The status will be added to RUNNING_STATUSES, ACTIVE_STATUSES, and NON_TERMINAL_STATUSES.

Parameters:

status (str) -- The name of the status to classify

classmethod add_custom_terminal_status(status)

Classify a custom status as a "Terminal" Status. The status will be added to TERMINAL_STATUSES.

Parameters:

status (str) -- The name of the status to classify

property cached_status

The cached status of the job (i.e. the status from the last time it was updated).

Returns: A string of the display name for the cached job status.

execute(*args, **kwargs)

Executes the job.

is_time_to_update()

Check if it is time to update again.

Returns:

True if update_status_interval or longer has elapsed since our last update, else False.

Return type:

bool
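The check described above amounts to comparing elapsed time against a minimum interval. The following stand-alone sketch illustrates that comparison with datetime and timedelta. It is not the TethysJob implementation, and the parameter names last_update, interval, and now are assumptions introduced for the example.

```python
from datetime import datetime, timedelta, timezone


def is_time_to_update(last_update, interval, now=None):
    """True if `interval` or longer has elapsed since `last_update`."""
    # Default to the current UTC time when no reference time is supplied.
    now = now or datetime.now(timezone.utc)
    return now - last_update >= interval


last = datetime(2024, 9, 1, 12, 0, 0, tzinfo=timezone.utc)
later = last + timedelta(seconds=10)
print(is_time_to_update(last, timedelta(seconds=10), now=later))  # True
```

Throttling of this kind keeps repeated reads of the status attribute from querying the scheduler more often than the interval allows.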

abstract pause()

Pauses job during execution

process_results(*args, **kwargs)

Process the results.

property process_results_function

Returns: A function handle or None if function cannot be resolved.

abstract resume()

Resumes a job that has been paused

async safe_close()

Override to close any asynchronous connections before object destruction

property status

The current status of the job. update_status is called to ensure status is current.

Returns: A string of the display name for the current job status.

It may be set as an attribute in which case update_status is called.

abstract stop()

Stops job from executing

property type

Returns the name of Tethys Job type.

update_status(status=None, *args, **kwargs)

Updates the status of a job. If status is passed then it will manually update the status. Otherwise, it will determine if _update_status should be called.

Parameters:
  • status (str, optional) -- The value to manually set the status to. It may be either the display name or the three letter database code for defined statuses. If it is not one of the defined statuses, then the status will be set to OTH and the status value will be saved in extended_properties using the OTHER_STATUS_KEY.

  • *args -- positional arguments that are passed through to _update_status.

  • **kwargs -- keyword arguments that are passed through to _update_status.
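The handling of the status parameter can be pictured as a lookup with a fallback. The sketch below is a stand-alone illustration, not the TethysJob code. Only the OTH code is taken from the description above, while the other three-letter codes in STATUS_CODES and the value given to OTHER_STATUS_KEY are assumptions chosen for the example.

```python
# Assumption: apart from "OTH", these codes are illustrative only.
STATUS_CODES = {"PEN": "Pending", "RUN": "Running", "COM": "Complete", "OTH": "Other"}
OTHER_STATUS_KEY = "other_status"  # hypothetical value for the key


def resolve_status(status, extended_properties):
    """Return the three-letter code to store for `status`."""
    # A three-letter code is accepted as-is.
    if status in STATUS_CODES:
        return status
    # A display name is mapped back to its code.
    for code, display_name in STATUS_CODES.items():
        if status == display_name:
            return code
    # Anything else is a custom status: keep its text and store the "Other" code.
    extended_properties[OTHER_STATUS_KEY] = status
    return "OTH"


props = {}
print(resolve_status("Running", props))        # RUN
print(resolve_status("Custom Status", props))  # OTH
print(props)
```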

property update_status_interval

Returns a datetime.timedelta of the minimum time between updating the status of a job.

Low-Level Scheduler API

tethys_sdk.compute.list_schedulers()

Gets a list of all scheduler objects registered in the Tethys Portal

Returns:

List of Schedulers

tethys_sdk.compute.get_scheduler(name)

Gets the scheduler associated with the given name

Parameters:

name (str) -- The name of the scheduler to return

Returns:

The scheduler with the given name or None if no scheduler has the name given.

tethys_sdk.compute.create_scheduler(name, host, scheduler_type='condor', **kwargs)

Creates a new scheduler of the type given.

Parameters:
  • name (str) -- The name of the scheduler

  • host (str) -- The hostname or IP address of the scheduler

  • scheduler_type (str) -- Type of scheduler to create. Either 'dask' or 'condor'. Defaults to 'condor'.

  • kwargs -- Keyword arguments of scheduler-specific options. See: create_dask_scheduler and create_condor_scheduler.

Returns:

The newly created scheduler

Note

The newly created scheduler object is not committed to the database.
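Because get_scheduler returns None for an unknown name and create_scheduler does not commit its result, the two can be combined into a get-or-create helper. The sketch below takes the compute module as a parameter so it can run without a portal. The helper name get_or_create_scheduler is hypothetical, FakeCompute and FakeScheduler are stand-ins, and calling save() on the new scheduler assumes that scheduler objects are Django model instances.

```python
def get_or_create_scheduler(compute, name, host, scheduler_type="condor", **kwargs):
    """Return the scheduler called `name`, creating and saving it if missing."""
    scheduler = compute.get_scheduler(name)
    if scheduler is None:
        scheduler = compute.create_scheduler(
            name, host, scheduler_type=scheduler_type, **kwargs
        )
        # Assumption: save() commits the new scheduler to the database.
        scheduler.save()
    return scheduler


class FakeScheduler:
    """Stand-in scheduler whose save() registers it; not part of Tethys."""

    def __init__(self, registry, name, host, scheduler_type):
        self._registry = registry
        self.name, self.host, self.scheduler_type = name, host, scheduler_type

    def save(self):
        self._registry[self.name] = self


class FakeCompute:
    """Stand-in for the compute module used only to exercise the helper."""

    def __init__(self):
        self._registry = {}

    def get_scheduler(self, name):
        return self._registry.get(name)

    def create_scheduler(self, name, host, scheduler_type="condor", **kwargs):
        return FakeScheduler(self._registry, name, host, scheduler_type)


compute = FakeCompute()
first = get_or_create_scheduler(compute, "primary", "condor.example.com")
second = get_or_create_scheduler(compute, "primary", "condor.example.com")
print(first is second)  # True: the second call finds the saved scheduler
```

In an app, the tethys_sdk.compute module itself would be passed as the compute argument.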

tethys_sdk.compute.create_condor_scheduler(name, host, username=None, password=None, private_key_path=None, private_key_pass=None)

Creates a new condor scheduler

Parameters:
  • name (str) -- The name of the scheduler

  • host (str) -- The hostname or IP address of the scheduler

  • username (str, optional) -- The username to use when connecting to the scheduler

  • password (str, optional) -- The password for the username

  • private_key_path (str, optional) -- The path to the location of the SSH private key file

  • private_key_pass (str, optional) -- The passphrase for the private key

Returns:

The newly created condor scheduler

Note

The newly created condor scheduler object is not committed to the database.

tethys_sdk.compute.create_dask_scheduler(name, host, timeout=None, heartbeat_interval=None, dashboard=None)

Creates a new dask scheduler

Parameters:
  • name (str) -- The name of the scheduler

  • host (str) -- The hostname or IP address of the scheduler

  • timeout (str, optional)

  • heartbeat_interval (str, optional)

  • dashboard (str, optional)

Returns:

The newly created dask scheduler

Note

The newly created dask scheduler object is not committed to the database.