# Jobs

The API offers methods to retrieve the list of jobs and their status, so that they can be monitored. Additionally, new jobs can be created to build datasets.

## Reading the jobs’ status

The list of all jobs, finished or not, can be fetched with the `dataikuapi.dss.project.DSSProject.list_jobs()` method. For example, to retrieve the jobs that failed on or after a given date:

```python
import datetime

# `client` is an existing, already connected DSS API client
date = '2015/09/24'
# epoch milliseconds (local time zone), the unit used by 'initiationTimestamp'
date_as_timestamp = int(datetime.datetime.strptime(date, "%Y/%m/%d").timestamp() * 1000)
project = client.get_project('TEST_PROJECT')
jobs = project.list_jobs()
failed_jobs = [job for job in jobs if job['state'] == 'FAILED' and job['def']['initiationTimestamp'] >= date_as_timestamp]
```
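The same filter can be wrapped in a reusable function. The following is a minimal sketch, not part of the API: `to_epoch_millis` and `failed_jobs_since` are hypothetical helper names, the date is assumed to be meant in UTC (rather than the local time zone), and the sample records are hand-written to mimic only the fields the filter reads.

```python
import datetime


def to_epoch_millis(date_str, fmt="%Y/%m/%d"):
    """Convert a date string to epoch milliseconds, interpreting it as UTC (assumption)."""
    parsed = datetime.datetime.strptime(date_str, fmt)
    parsed = parsed.replace(tzinfo=datetime.timezone.utc)
    return int(parsed.timestamp() * 1000)


def failed_jobs_since(jobs, date_str):
    """Keep the FAILED jobs submitted on or after date_str (hypothetical helper)."""
    threshold = to_epoch_millis(date_str)
    return [
        job for job in jobs
        if job['state'] == 'FAILED'
        and job['def']['initiationTimestamp'] >= threshold
    ]


# Hand-written records shaped like the items returned by list_jobs()
sample_jobs = [
    {'state': 'FAILED', 'def': {'id': 'a', 'initiationTimestamp': 1443431857455}},
    {'state': 'DONE', 'def': {'id': 'b', 'initiationTimestamp': 1443431857455}},
    {'state': 'FAILED', 'def': {'id': 'c', 'initiationTimestamp': 1400000000000}},
]
recent_failures = failed_jobs_since(sample_jobs, '2015/09/24')
print([job['def']['id'] for job in recent_failures])
```

With a live instance, the list passed in would be `project.list_jobs()` instead of `sample_jobs`.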

The method `dataikuapi.dss.project.DSSProject.list_jobs()` returns one entry per job, holding all of that job's information as a JSON object (a Python dict). Important fields are:

```python
{
    'def': {
        'id': 'build_cat_train_hdfs_NP_2015-09-28T09-17-37.455',  # the identifier for the job
        'initiationTimestamp': 1443431857455,  # timestamp of when the job was submitted
        'initiator': 'API (aa)',
        'mailNotification': False,
        'name': 'build_cat_train_hdfs_NP',
        'outputs': [
            {
                'targetDataset': 'cat_train_hdfs',  # the dataset(s) built by the job
                'targetDatasetProjectKey': 'IMPALA',
                'targetPartition': 'NP',
                'type': 'DATASET'
            }
        ],
        'projectKey': 'IMPALA',
        'refreshHiveMetastore': False,
        'refreshIntermediateMirrors': True,
        'refreshTargetMirrors': True,
        'triggeredFrom': 'API',
        'type': 'NON_RECURSIVE_FORCED_BUILD'
    },
    'endTime': 0,
    'stableState': True,
    'startTime': 0,
    'state': 'ABORTED',  # the stable state of the job
    'warningsCount': 0
}
```
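For monitoring, these records are often reduced to a few fields. The sketch below relies only on the field names shown in the record above; `summarize_job` and `count_by_state` are hypothetical helpers, and the sample record is a trimmed, hand-written copy of that record.

```python
from collections import Counter


def summarize_job(job):
    """Return the id, state, and built dataset names of one job record."""
    outputs = job['def'].get('outputs', [])
    return {
        'id': job['def']['id'],
        'state': job['state'],
        'datasets': [o['targetDataset'] for o in outputs if o.get('type') == 'DATASET'],
    }


def count_by_state(jobs):
    """Count job records per value of their 'state' field."""
    return Counter(job['state'] for job in jobs)


# Trimmed copy of the record shown above
sample_job = {
    'def': {
        'id': 'build_cat_train_hdfs_NP_2015-09-28T09-17-37.455',
        'outputs': [{'targetDataset': 'cat_train_hdfs', 'type': 'DATASET'}],
    },
    'state': 'ABORTED',
}
print(summarize_job(sample_job))
print(count_by_state([sample_job]))
```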

The `id` field is needed to get a handle on the job, with `project.get_job()`, and to call `abort()` or `get_log()` on it.
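As an illustration of going from a job record to a handle, the following sketch gathers the logs of all jobs in a given state. `collect_job_logs` is a hypothetical helper; it assumes only that the project object offers `get_job()` and that the job handle offers `get_log()`, as described above.

```python
def collect_job_logs(project, jobs, state='FAILED'):
    """Map job id -> log text for every job record in the given state.

    `project` is expected to behave like a DSSProject handle and `jobs`
    like the list returned by its list_jobs() method.
    """
    logs = {}
    for job in jobs:
        if job['state'] != state:
            continue
        job_id = job['def']['id']
        # The id from the record is what turns a listing entry into a handle
        logs[job_id] = project.get_job(job_id).get_log()
    return logs
```

A typical call would be `collect_job_logs(project, project.list_jobs())`.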

## Starting new jobs

A dataset is built by running a job that has the dataset as its output. A job is created by building a job definition and starting it. For a simple non-partitioned dataset, this is done with:

```python
import time

project = client.get_project('TEST_PROJECT')
definition = {
    "type": "NON_RECURSIVE_FORCED_BUILD",
    "outputs": [{
        "id": "dataset_to_build",
        "type": "DATASET",
        "partition": "NP"
    }]
}
job = project.start_job(definition)
# poll the job state every second until it reaches a final state
state = ''
while state not in ('DONE', 'FAILED', 'ABORTED'):
    time.sleep(1)
    state = job.get_status()['baseStatus']['state']
# done!
```

The example above uses `dataikuapi.dss.project.DSSProject.start_job()` to start a job, and then checks the job state every second until it is complete. Alternatively, the method `dataikuapi.dss.project.DSSProject.start_job_and_wait()` can be used to start a job and return only after job completion.
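A hand-written polling loop never returns if the job never reaches a final state. The sketch below adds a timeout to the same polling pattern. `wait_for_job` and its parameters are hypothetical and not part of the API; the function assumes only that the job handle's `get_status()` returns the `['baseStatus']['state']` structure used in the polling example.

```python
import time

FINAL_STATES = ('DONE', 'FAILED', 'ABORTED')


def wait_for_job(job, poll_interval=1.0, timeout=3600.0):
    """Poll job.get_status() until a final state is reached.

    Returns the final state, or raises TimeoutError if the job is still
    running once `timeout` seconds have elapsed.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = job.get_status()['baseStatus']['state']
        if state in FINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still in state {state!r} after {timeout} seconds")
        time.sleep(poll_interval)
```

It would be called as `wait_for_job(project.start_job(definition), timeout=600)`. The job itself keeps running after a timeout, so the caller decides whether to abort it.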

The `start_job()` method returns a job handle that can be used to later abort the job. Other jobs can be aborted once their id is known. For example, to abort all jobs currently being processed:

```python
project = client.get_project('TEST_PROJECT')
for job in project.list_jobs():
    # a job whose state is not stable is still being processed
    if not job['stableState']:
        project.get_job(job['def']['id']).abort()
```

Here is another example, which uses `DSSProject.new_job()` to build a managed folder, with the `with_output()` method as an alternative to writing the job definition as a dictionary:

```python
project = client.get_project('TEST_PROJECT')
# where O2ue6CX3 is the managed folder id
job = project.new_job('RECURSIVE_FORCED_BUILD').with_output('O2ue6CX3', object_type='MANAGED_FOLDER')
res = job.start_and_wait()
print(res.get_status())
```
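When the dictionary form of a job definition is preferred, it can be generated rather than written by hand. The sketch below produces a dictionary with the same keys as the dataset definition shown earlier in this section, for one or more dataset names. `make_job_definition` is a hypothetical helper, and listing several datasets in `outputs` is an assumption based on that field being a list.

```python
def make_job_definition(dataset_names, job_type="NON_RECURSIVE_FORCED_BUILD", partition="NP"):
    """Build a dictionary job definition with one DATASET output per name."""
    return {
        "type": job_type,
        "outputs": [
            {"id": name, "type": "DATASET", "partition": partition}
            for name in dataset_names
        ],
    }


definition = make_job_definition(["dataset_to_build"])
print(definition)
```

The result can then be passed to `project.start_job(definition)`.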

## Aborting jobs

Jobs can be aborted individually using the `dataikuapi.dss.job.DSSJob.abort()` method. The following example extends this to abort all running jobs of a given project.

```python
project = client.get_project('TEST_PROJECT')
aborted_jobs = []
for job in project.list_jobs():
    # a job whose state is not stable is still running
    if not job["stableState"]:
        job_id = job["def"]["id"]
        aborted_jobs.append(job_id)
        project.get_job(job_id).abort()
print(f"Aborted {len(aborted_jobs)} running jobs")
```
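Aborting every running job of a project is rarely what is wanted on a shared instance. The sketch below narrows the same loop with a caller-supplied predicate. `abort_unfinished_jobs` and `should_abort` are hypothetical names; the function uses only `list_jobs()`, `get_job()`, and `abort()`.

```python
def abort_unfinished_jobs(project, should_abort=lambda job: True):
    """Abort every unfinished job accepted by should_abort; return the aborted ids."""
    aborted_ids = []
    for job in project.list_jobs():
        # Skip jobs already in a stable state and jobs the caller wants to keep
        if job['stableState'] or not should_abort(job):
            continue
        job_id = job['def']['id']
        project.get_job(job_id).abort()
        aborted_ids.append(job_id)
    return aborted_ids
```

For instance, `abort_unfinished_jobs(project, lambda job: job['def']['triggeredFrom'] == 'API')` would abort only the running jobs whose record says they were triggered from the API.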

## Reference documentation

| Class | Description |
| --- | --- |
| `dataikuapi.dss.project.JobDefinitionBuilder`(project) | Helper to run a job. |
| `dataikuapi.dss.job.DSSJob`(client, ...) | A job on the DSS instance. |
| `dataikuapi.dss.job.DSSJobWaiter`(job) | Helper to wait for a job's completion. |
