Databricks Pipeline Job

The Databricks Pipeline job type triggers a Delta Live Tables (DLT) pipeline update via POST pipelines/{pipeline_id}/updates (Pipelines API 2.0). The DLT pipeline is identified by its Pipeline Id — Polysync stores that id in the Job's External Id.

This job type is supported on the Azure Databricks platform.

Required job fields

  • External Id — the DLT Pipeline Id (e.g., a1b2c3d4-…).
  • Job Type — Databricks Pipeline (set automatically on import).

Job discovery

Polysync calls GET pipelines (Pipelines API 2.0) to list the pipelines and their statuses, then fetches GET pipelines/{pipeline_id} for each one to read the full configuration. Parameters are extracted from spec.configuration, filtered to runnable settings such as continuous, development, photon, spark.*, and custom.*.
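The filtering step can be pictured with a minimal Python sketch. It is not Polysync's actual implementation: the function name extract_parameters and the exact key and prefix lists are assumptions based only on the examples named above, and the real filter may accept more settings.

```python
from typing import Any, Dict

# Assumed filter lists, taken from the examples in this section only.
RUNNABLE_KEYS = {"continuous", "development", "photon"}
RUNNABLE_PREFIXES = ("spark.", "custom.")


def extract_parameters(pipeline: Dict[str, Any]) -> Dict[str, Any]:
    """Return the spec.configuration entries that look like runnable settings.

    `pipeline` is the parsed JSON of a pipelines/{pipeline_id} response.
    """
    config = (pipeline.get("spec") or {}).get("configuration") or {}
    return {
        key: value
        for key, value in config.items()
        if key in RUNNABLE_KEYS or key.startswith(RUNNABLE_PREFIXES)
    }


# Hypothetical response fragment used only for illustration.
sample = {
    "spec": {
        "configuration": {
            "development": "true",
            "spark.sql.shuffle.partitions": "8",
            "custom.region": "eu",
            "pipelines.internal": "x",
        }
    }
}
filtered = extract_parameters(sample)
```

In this sketch a pipeline with no spec or no configuration simply yields an empty parameter list rather than an error.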

Parameter handling

The update API accepts very few request-time parameters — Polysync only sends the trigger flags. Configuration values are baked into the pipeline definition itself and cannot be changed per update through the update API.

Request body:

{
  "pipeline_id": "<external-id>",
  "full_refresh": <bool>
}

full_refresh is read from the Polysync parameter of the same name when present (Data Type Bool); defaults to false.
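As a rough illustration, the request body could be assembled as in the Python sketch below. The helper names build_trigger_body and to_bool are hypothetical, and accepting string values such as "true" is an assumption about how a Bool parameter might be stored, not documented Polysync behavior.

```python
from typing import Any, Dict, Mapping


def to_bool(value: Any) -> bool:
    """Coerce a parameter value to bool (string handling is an assumption)."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        return value.strip().lower() in {"true", "1", "yes"}
    return bool(value)


def build_trigger_body(external_id: str, params: Mapping[str, Any]) -> Dict[str, Any]:
    """Build the update request body; every parameter except full_refresh is ignored."""
    return {
        "pipeline_id": external_id,
        "full_refresh": to_bool(params.get("full_refresh", False)),
    }


# "abc" stands in for a real Pipeline Id.
default_body = build_trigger_body("abc", {})
refresh_body = build_trigger_body("abc", {"full_refresh": "true", "spark.foo": "1"})
```

Here default_body carries full_refresh set to false because the parameter is absent, and refresh_body drops spark.foo because only the trigger flag is sent.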

Direction       Sent on update         Updated from response
Input           ⚠ Only full_refresh    n/a
Output          n/a                    (not supported)
Input&Output    ⚠ Only full_refresh    (not supported)

Output parameters are not supported. DLT update tracking returns only state and timing, not table-level metrics.

All other config keys imported into the Polysync parameter list are displayed for reference but are not forwarded to the update API. To change them, edit the pipeline in Databricks and click Sync Parameters on the Polysync Job.

Execution flow

  1. Polysync posts to pipelines/{pipeline_id}/updates with the trigger body above.

  2. The response update_id is combined with the pipeline id into a composite RunId: "{pipeline_id}:{update_id}".

  3. Status is polled via GET pipelines/{pipeline_id}/updates/{update_id} and update.state is decoded:

    DLT update state      Polysync status
    RUNNING / STOPPING    Running
    COMPLETED             Success
    FAILED                Failed
    CANCELED              Cancelled
    (other / missing)     Unknown

  4. Cancel is supported via POST pipelines/{pipeline_id}/updates/{update_id}/stop.
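
Steps 2 and 3 can be sketched in Python as follows. The state mapping comes from the table above; everything else is an assumption made for illustration: the function names, the injected fetch_update callable (standing in for the GET status call), and the choice to keep polling while the status is Unknown.

```python
import time
from typing import Any, Callable, Dict, Tuple

# Mapping from the state table; any other or missing state is "Unknown".
STATE_TO_STATUS = {
    "RUNNING": "Running",
    "STOPPING": "Running",
    "COMPLETED": "Success",
    "FAILED": "Failed",
    "CANCELED": "Cancelled",
}
TERMINAL_STATUSES = {"Success", "Failed", "Cancelled"}


def make_run_id(pipeline_id: str, update_id: str) -> str:
    """Combine both ids into the composite RunId."""
    return f"{pipeline_id}:{update_id}"


def split_run_id(run_id: str) -> Tuple[str, str]:
    """Split a composite RunId back into (pipeline_id, update_id)."""
    pipeline_id, sep, update_id = run_id.partition(":")
    if not sep or not pipeline_id or not update_id:
        raise ValueError(f"not a composite RunId: {run_id!r}")
    return pipeline_id, update_id


def decode_status(response: Dict[str, Any]) -> str:
    """Read update.state from a status response and map it to a status."""
    state = (response.get("update") or {}).get("state")
    return STATE_TO_STATUS.get(state, "Unknown")


def poll_until_done(
    run_id: str,
    fetch_update: Callable[[str, str], Dict[str, Any]],
    interval_s: float = 0.0,
    max_polls: int = 100,
) -> str:
    """Poll until a terminal status is seen or max_polls is exhausted."""
    pipeline_id, update_id = split_run_id(run_id)
    status = "Unknown"
    for _ in range(max_polls):
        status = decode_status(fetch_update(pipeline_id, update_id))
        if status in TERMINAL_STATUSES:
            break
        time.sleep(interval_s)
    return status


# Simulated sequence of states, in place of real API responses.
_states = iter(["RUNNING", "STOPPING", "CANCELED"])
final_status = poll_until_done(
    make_run_id("p1", "u1"),
    lambda pipeline_id, update_id: {"update": {"state": next(_states)}},
)
```

Passing the fetch function in as an argument keeps the sketch independent of any particular HTTP client and lets the mapping be exercised without a workspace.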

Monitor URL

{workspace_url}/#joblist/pipelines/{pipeline_id}
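
A small Python sketch of how this template might be filled in. The function name monitor_url and the example hostname are hypothetical, and stripping a trailing slash from the workspace URL is an assumption added to avoid a doubled separator.

```python
def monitor_url(workspace_url: str, pipeline_id: str) -> str:
    """Fill the Monitor URL template for one pipeline."""
    return f"{workspace_url.rstrip('/')}/#joblist/pipelines/{pipeline_id}"


# Hypothetical workspace host and pipeline id.
example_url = monitor_url("https://adb-123.azuredatabricks.net/", "p1")
```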

Best practices

  • Set full_refresh = true only when you need to recompute the entire pipeline (rebuilds all tables from source).
  • Make pipeline configuration changes in Databricks and click Sync Parameters in Polysync rather than editing values on the Polysync Parameters tab — only the full_refresh flag actually flows through.

Troubleshooting

  • Update returns 400 immediately — the pipeline is IDLE and a required Databricks resource (target storage, cluster policy) is missing. Investigate in the Databricks Delta Live Tables UI.
  • State stays STOPPING for a long time after Cancel — DLT drains in-flight micro-batches before stopping; that's normal for continuous pipelines.