AWS Glue is the AWS managed ETL service. Polysync uses the official AWS SDK for .NET v4 (AWSSDK.Glue and AWSSDK.SecurityToken) to discover Glue jobs, run them with parameter values, poll their status, and surface a direct link to the AWS Glue Studio console for monitoring.
Region (e.g., us-east-1, ap-southeast-2). All Glue API calls are region-scoped.

sts:AssumeRoleWithWebIdentity. No long-lived secrets are stored. Required attributes: Role ARN. Optional: External Id, Session Name.
- An IAM Identity Provider (https://login.microsoftonline.com/<polysync-tenant-id>/v2.0) with audience sts.amazonaws.com.
- A role trust policy that allows sts:AssumeRoleWithWebIdentity from that provider with a condition on the Polysync workload identity's sub/oid claim.

sts:AssumeRole on the target role; the assumed role holds the Glue permissions.

The role / user used to call Glue must hold (at minimum):
- glue:ListJobs to discover available jobs.
- glue:GetJobs and glue:GetJob to read job definitions, including parameter defaults.
- glue:StartJobRun to submit a run with arguments.
- glue:GetJobRun to poll status.
- glue:BatchStopJobRun to cancel a run.

Plus any IAM permissions Glue itself needs to access the script's S3 location and the job's data sources (typically defined on the Glue job's IAM role, not the Polysync caller).
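As an illustration of the minimum permission set, the sketch below assembles an IAM policy document as a Python dictionary. The function name, the statement IDs, and the choice to grant the discovery actions on `*` while scoping the run actions to specific job ARNs are assumptions made for this example, not a policy shipped by Polysync. The account ID and job name are placeholders.

```python
import json

# Actions named in the permission list, grouped by how this sketch scopes them.
DISCOVERY_ACTIONS = ["glue:ListJobs", "glue:GetJobs", "glue:GetJob"]
RUN_ACTIONS = ["glue:StartJobRun", "glue:GetJobRun", "glue:BatchStopJobRun"]


def build_glue_caller_policy(region, account, job_names):
    """Return an IAM policy document for the principal that calls Glue.

    Hypothetical helper: the statement layout is an assumption of this sketch.
    """
    job_arns = [f"arn:aws:glue:{region}:{account}:job/{name}" for name in job_names]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Discovery is granted on "*" so newly created jobs can be found.
                "Sid": "DiscoverJobs",
                "Effect": "Allow",
                "Action": DISCOVERY_ACTIONS,
                "Resource": "*",
            },
            {
                # Run control is limited to the listed job ARNs.
                "Sid": "RunJobs",
                "Effect": "Allow",
                "Action": RUN_ACTIONS,
                "Resource": job_arns,
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder account ID and job name.
    policy = build_glue_caller_policy("us-east-1", "123456789012", ["nightly-load"])
    print(json.dumps(policy, indent=2))
```

Scoping the run actions to job ARNs keeps the caller from starting or stopping jobs that Polysync does not manage; widen the resource list if jobs are created frequently.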
Polysync calls ListJobs + GetJobs (paginated via NextToken) and imports each Glue job. Parameter defaults declared on the Glue job's DefaultArguments are imported as Polysync input parameters. AWS Glue arguments are always strings — datatypes are inferred from the value where possible. Glue does not expose output parameters; status and metrics come from CloudWatch.
AWS Glue job arguments must be prefixed with -- (e.g., --source_path, --target_table). Polysync automatically normalizes parameter keys: if a parameter name does not start with --, the provider prepends it before calling StartJobRun. Polysync-reserved internal parameters (those starting with _) are excluded.
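A minimal sketch of that normalization rule, assuming parameters arrive as a plain dictionary. The function name `normalize_arguments` is hypothetical, and converting values to strings is this example's reading of "Glue arguments are always strings".

```python
def normalize_arguments(parameters):
    """Build the Arguments map for StartJobRun from Polysync-style parameters."""
    arguments = {}
    for key, value in parameters.items():
        # Reserved internal parameters (leading underscore) are never sent to Glue.
        if key.startswith("_"):
            continue
        # Prepend "--" only when the key does not already carry it.
        name = key if key.startswith("--") else f"--{key}"
        # Glue arguments are strings, so every value is stringified.
        arguments[name] = str(value)
    return arguments


if __name__ == "__main__":
    print(normalize_arguments({
        "source_path": "s3://bucket/in",
        "--target_table": "sales",
        "_trace_id": "abc",
        "batch_size": 500,
    }))
```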
Glue compute settings can be set in three places. Higher items win:
Worker Type, Number Of Workers, Version, Timeout (minutes), Max Retries). Surfaced in the Job editor as optional overrides.

Notes:
- Worker Type accepts standard Glue values (G.1X, G.2X, G.4X, G.8X, G.025X, Z.2X, Standard).
- Timeout (minutes) and Number Of Workers must be positive integers; invalid values fall through to the next tier.
- Version is forwarded to the run as the --glue-version argument so downstream tooling can record it; AWS itself binds the version on the job definition.
- Max Retries is exposed for visibility/intent but AWS only honors it on the job definition, not on StartJobRun.

ExecutePipelineAsync → StartJobRunAsync(JobName, Arguments).

The run identifier has the form {jobName}/{runId} so subsequent status calls can resolve both pieces.

GetPipelineRunStatusAsync calls GetJobRunAsync(JobName, RunId) and maps Glue's JobRunState to Polysync's PipelineRunStatus:
- STARTING → Starting
- RUNNING, STOPPING → Running
- SUCCEEDED → Success
- FAILED, ERROR, TIMEOUT → Failed
- STOPPED → Cancelled

https://{region}.console.aws.amazon.com/gluestudio/home?region={region}#/job/{jobName}/run/{runId}
This deep-links into AWS Glue Studio for the specific job run, showing logs, CloudWatch metrics, and the DAG view.
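The composite run identifier, the state mapping, and the console link can be sketched together as below. All function names are hypothetical, and three behaviors are assumptions of this example rather than documented Polysync behavior: the "Unknown" fallback for states not listed above, percent-encoding of the job name in the URL, and splitting the identifier on its last slash (which assumes run IDs contain no slash).

```python
from urllib.parse import quote

# Mapping from Glue JobRunState to the Polysync status names listed above.
GLUE_STATE_TO_STATUS = {
    "STARTING": "Starting",
    "RUNNING": "Running",
    "STOPPING": "Running",
    "SUCCEEDED": "Success",
    "FAILED": "Failed",
    "ERROR": "Failed",
    "TIMEOUT": "Failed",
    "STOPPED": "Cancelled",
}


def make_run_key(job_name, run_id):
    """Combine job name and run ID into the {jobName}/{runId} form."""
    return f"{job_name}/{run_id}"


def split_run_key(run_key):
    """Recover (job_name, run_id); splits on the last slash (assumption)."""
    job_name, separator, run_id = run_key.rpartition("/")
    if not separator or not job_name or not run_id:
        raise ValueError(f"expected '<jobName>/<runId>', got {run_key!r}")
    return job_name, run_id


def map_state(glue_state):
    """Translate a Glue state; unlisted states map to 'Unknown' (assumption)."""
    return GLUE_STATE_TO_STATUS.get(glue_state, "Unknown")


def console_url(region, job_name, run_id):
    """Build the Glue Studio deep link for one job run."""
    # Percent-encoding is a precaution added by this sketch.
    job = quote(job_name, safe="")
    run = quote(run_id, safe="")
    return (
        f"https://{region}.console.aws.amazon.com/gluestudio/home"
        f"?region={region}#/job/{job}/run/{run}"
    )


if __name__ == "__main__":
    key = make_run_key("nightly-load", "jr_abc123")
    name, run = split_run_key(key)
    print(map_state("STOPPING"), console_url("us-east-1", name, run))
```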
- AccessDenied on StartJobRun: the caller's IAM principal is missing glue:StartJobRun on arn:aws:glue:{region}:{account}:job/{jobName}.
- InvalidInputException about arguments: confirm parameter keys start with --. Polysync normalizes them, but a literal value of key=value will not be split.
- InvalidIdentityToken: check that the IAM Identity Provider's thumbprint matches login.microsoftonline.com, the audience is sts.amazonaws.com, and the role trust policy allows the Polysync workload identity's sub / oid.
- TIMEOUT: increase the Timeout (minutes) on the Glue job, or check the worker type and DPU allocation.