Amazon SageMaker

Amazon SageMaker is the AWS managed machine learning platform. Polysync integrates with SageMaker Pipelines (the SageMaker Model Building Pipelines / SDK v2 workflow service) using the official AWS SDK for .NET v4 (AWSSDK.SageMaker and AWSSDK.SecurityToken). Polysync can list pipelines, start executions with declared parameters, poll execution status, cancel running executions, and surface a direct link to the SageMaker Studio / classic console for monitoring.

Required attributes

  • Region — the AWS region in which the SageMaker domain and pipelines are deployed (e.g., us-east-1, eu-west-1). All SageMaker API calls are region-scoped.

Optional platform-level defaults

  • Pipeline Execution Role ARN — a default execution role ARN passed when the per-job override is not set. Most pipelines specify their own role at creation time, in which case this can be left blank.

Authentication methods

  • Web Identity Federation (recommended for Polysync SaaS) — Polysync exchanges its Microsoft Entra ID workload identity token for short-lived AWS credentials via sts:AssumeRoleWithWebIdentity. No long-lived secrets are stored. Required attributes: Role ARN.
    • In AWS, create an IAM Identity Provider (OIDC) trusting Polysync's Entra ID issuer (https://login.microsoftonline.com/<polysync-tenant-id>/v2.0) with audience sts.amazonaws.com.
    • Create an IAM role whose trust policy allows sts:AssumeRoleWithWebIdentity from that provider with a condition on the Polysync workload identity's sub / oid claim.
  • Access Key — Provide Access Key Id, Secret Access Key, and optionally Session Token. Simplest, but the secret must be rotated and stored in a Secret Vault.
  • Assume Role — Provide a bootstrap Access Key Id and Secret Access Key, plus the Role ARN to assume. The bootstrap user only needs sts:AssumeRole on the target role; the assumed role holds the SageMaker permissions.
  • Instance Profile — Uses the host EC2/ECS instance profile. Only viable when Polysync is deployed inside AWS.
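
As an illustration of the Web Identity Federation setup above, the following Python sketch builds a trust policy document for the IAM role. The function name and arguments are hypothetical, and the condition keys assume AWS exposes the token's aud and sub claims under the provider URL prefix; conditioning on oid may not be possible this way, so treat the policy as a starting point to verify in your own account.

```python
import json


def build_trust_policy(account_id: str, tenant_id: str, subject: str) -> dict:
    """Build a trust policy allowing sts:AssumeRoleWithWebIdentity.

    All three arguments are placeholders supplied by the reader; nothing
    here is taken from a real Polysync tenant.
    """
    # Provider URL without the https:// scheme, as used in OIDC provider ARNs.
    provider = f"login.microsoftonline.com/{tenant_id}/v2.0"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Assumption: claims are exposed as "<provider>:<claim>".
                        f"{provider}:aud": "sts.amazonaws.com",
                        f"{provider}:sub": subject,
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    policy = build_trust_policy("123456789012", "<polysync-tenant-id>", "<workload-sub>")
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the role's trust relationship once the placeholders are replaced with real values.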

IAM permissions checklist

The role / user used by Polysync needs (at minimum):

  • sagemaker:ListPipelines — discover available pipelines.
  • sagemaker:DescribePipeline — read pipeline metadata, ARN, and pipeline role.
  • sagemaker:StartPipelineExecution — start executions (scope to specific pipeline ARNs in production).
  • sagemaker:DescribePipelineExecution — poll execution status.
  • sagemaker:StopPipelineExecution — cancel running executions.
  • iam:PassRole (on the pipeline execution role) — only required when supplying an execution role at start time.

The pipeline's own execution role needs whatever permissions its steps require (S3, ECR, CloudWatch Logs, Model Registry, etc.) — these are defined on the role attached to the pipeline, not on the Polysync caller.
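
The checklist above could be expressed as an IAM policy along the following lines. This is a sketch, not a policy shipped by Polysync: the function name is hypothetical, and the resource scoping (a wildcard for ListPipelines, the pipeline ARN plus its execution ARNs for the rest) is an assumption to check against the ARNs in your account.

```python
def build_caller_policy(region, account_id, pipeline_name, execution_role_arn=None):
    """Return a least-privilege style policy dict for the Polysync caller."""
    pipeline_arn = f"arn:aws:sagemaker:{region}:{account_id}:pipeline/{pipeline_name}"
    statements = [
        {
            "Sid": "DiscoverPipelines",
            "Effect": "Allow",
            "Action": ["sagemaker:ListPipelines"],
            # Assumption: list calls are not scoped to a single pipeline.
            "Resource": "*",
        },
        {
            "Sid": "RunPipeline",
            "Effect": "Allow",
            "Action": [
                "sagemaker:DescribePipeline",
                "sagemaker:StartPipelineExecution",
                "sagemaker:DescribePipelineExecution",
                "sagemaker:StopPipelineExecution",
            ],
            # Pipeline itself plus its executions (assumed ARN layout).
            "Resource": [pipeline_arn, f"{pipeline_arn}/execution/*"],
        },
    ]
    if execution_role_arn:
        # Only needed when an execution role is supplied at start time.
        statements.append(
            {
                "Sid": "PassExecutionRole",
                "Effect": "Allow",
                "Action": ["iam:PassRole"],
                "Resource": execution_role_arn,
            }
        )
    return {"Version": "2012-10-17", "Statement": statements}
```

Calling build_caller_policy("us-east-1", "123456789012", "demo") yields two statements; passing execution_role_arn adds the iam:PassRole statement.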

Job discovery

Polysync calls ListPipelines (paginated via NextToken) and for each entry calls DescribePipeline to capture metadata: the Pipeline ARN and the Pipeline Role ARN. Pipelines are imported as Polysync Jobs identified by their Pipeline Name (SageMaker pipeline APIs are name-based, not ARN-based).
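
The discovery loop described above can be sketched in Python as follows. Polysync itself uses the .NET SDK; this version assumes a boto3 SageMaker client (or any object exposing list_pipelines and describe_pipeline with the same response shapes), and the returned dictionary keys are illustrative names only.

```python
def discover_pipelines(client) -> list:
    """List every pipeline and collect its name, ARN, and role ARN.

    `client` is assumed to behave like boto3.client("sagemaker").
    """
    jobs = []
    kwargs = {}
    while True:
        page = client.list_pipelines(**kwargs)
        for summary in page.get("PipelineSummaries", []):
            name = summary["PipelineName"]
            detail = client.describe_pipeline(PipelineName=name)
            jobs.append(
                {
                    "name": name,  # used as the job identifier
                    "pipeline_arn": detail["PipelineArn"],
                    "role_arn": detail.get("RoleArn"),
                }
            )
        token = page.get("NextToken")
        if not token:
            return jobs
        # Follow the pagination token until the service stops returning one.
        kwargs = {"NextToken": token}


# Usage with a real client (requires boto3 and AWS credentials):
#   import boto3
#   jobs = discover_pipelines(boto3.client("sagemaker", region_name="us-east-1"))
```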

SageMaker pipeline parameters are declared inside the pipeline definition JSON (under Parameters with Type and DefaultValue). Polysync does not parse the definition automatically — users declare input parameters manually on the Job, matching the parameter names declared in the pipeline.
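
Because parameters must be declared by hand, it can help to read the declared names out of the definition JSON first. The Python sketch below is a user-side helper, not something Polysync runs; it assumes each entry under Parameters carries Name, Type, and an optional DefaultValue, and the sample definition is invented for illustration.

```python
import json


def declared_parameters(definition_json: str) -> dict:
    """Map each declared parameter name to (type, default value)."""
    definition = json.loads(definition_json)
    return {
        p["Name"]: (p["Type"], p.get("DefaultValue"))
        for p in definition.get("Parameters", [])
    }


# Hypothetical, heavily trimmed pipeline definition.
SAMPLE_DEFINITION = json.dumps(
    {
        "Parameters": [
            {"Name": "InstanceType", "Type": "String", "DefaultValue": "ml.m5.xlarge"},
            {"Name": "Epochs", "Type": "Integer", "DefaultValue": 5},
            {"Name": "InputPath", "Type": "String"},
        ],
        "Steps": [],
    }
)

if __name__ == "__main__":
    for name, (type_name, default) in declared_parameters(SAMPLE_DEFINITION).items():
        print(name, type_name, default)
```

With a real client, the definition string would typically come from the PipelineDefinition field of a DescribePipeline response.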

Parameter conventions

  • Input parameters are passed as a list of (Name, Value) pairs to StartPipelineExecution. All values are sent as strings; SageMaker resolves the declared parameter type (String / Integer / Float / Boolean) server-side based on the pipeline definition. Mismatches between the supplied value and the declared type are reported as a ValidationException at start time.
  • Output parameters are not supported by the SageMaker Pipelines execution API — DescribePipelineExecution returns status and timing only. Per-step outputs (processing artifacts, training model files, evaluation metrics) live in S3 and the SageMaker Model Registry, outside the SDK's execution response.
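
A minimal Python sketch of the input-parameter convention above: values are stringified into (Name, Value) pairs, with an optional local pre-check against the declared type. The helper names are hypothetical, and the pre-check only approximates what SageMaker validates server-side, so the service's ValidationException remains the authority.

```python
def to_pipeline_parameters(inputs: dict) -> list:
    """Convert {name: value} into the Name/Value string pairs sent at start."""
    return [{"Name": name, "Value": str(value)} for name, value in inputs.items()]


def matches_declared_type(declared_type: str, value: str) -> bool:
    """Best-effort local check; an assumption, not SageMaker's own rules."""
    if declared_type == "String":
        return True
    if declared_type == "Integer":
        try:
            int(value)
            return True
        except ValueError:
            return False
    if declared_type == "Float":
        try:
            float(value)
            return True
        except ValueError:
            return False
    if declared_type == "Boolean":
        return value.lower() in ("true", "false")
    return False  # unknown declared type


if __name__ == "__main__":
    print(to_pipeline_parameters({"Epochs": 10, "LearningRate": 0.5}))
    print(matches_declared_type("Integer", "ten"))
```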

Execution flow

  1. ExecutePipelineAsync builds the list of pipeline parameters from input parameters, generates a unique ClientRequestToken (polysync-<guid>), optionally applies the per-job Execution Display Name, and calls StartPipelineExecution(PipelineName, PipelineParameters, ClientRequestToken).
  2. The provider returns a PipelineRun with the execution's PipelineExecutionArn as the RunId — SageMaker accepts the ARN directly on follow-up calls.
  3. GetPipelineRunStatusAsync calls DescribePipelineExecution(PipelineExecutionArn) and maps the status:
    • Executing / Stopping → Running
    • Succeeded → Success
    • Failed → Failed (FailureReason surfaced on the run message)
    • Stopped → Cancelled
  4. CancelPipelineRunAsync calls StopPipelineExecution(PipelineExecutionArn, ClientRequestToken).
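
The four steps above can be sketched in Python as follows. This is not Polysync's .NET implementation: the function names are hypothetical, the client is assumed to behave like a boto3 SageMaker client, and the "Unknown" fallback for unmapped statuses is an assumption added for the example.

```python
import uuid

# Status mapping as listed in step 3.
STATUS_MAP = {
    "Executing": "Running",
    "Stopping": "Running",
    "Succeeded": "Success",
    "Failed": "Failed",
    "Stopped": "Cancelled",
}


def new_token() -> str:
    """Unique ClientRequestToken in the polysync-<guid> form."""
    return f"polysync-{uuid.uuid4()}"


def start_execution(client, pipeline_name, parameters, display_name=None) -> str:
    """Start a pipeline execution and return its ARN (used as the run id)."""
    request = {
        "PipelineName": pipeline_name,
        "PipelineParameters": parameters,
        "ClientRequestToken": new_token(),
    }
    if display_name:
        request["PipelineExecutionDisplayName"] = display_name
    return client.start_pipeline_execution(**request)["PipelineExecutionArn"]


def get_status(client, execution_arn):
    """Return (mapped status, failure reason or None)."""
    response = client.describe_pipeline_execution(PipelineExecutionArn=execution_arn)
    status = STATUS_MAP.get(response["PipelineExecutionStatus"], "Unknown")
    return status, response.get("FailureReason")


def cancel_execution(client, execution_arn) -> None:
    """Request a stop for a running execution."""
    client.stop_pipeline_execution(
        PipelineExecutionArn=execution_arn,
        ClientRequestToken=new_token(),
    )
```

A caller would typically poll get_status on an interval until the mapped status is no longer Running.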

Monitor URL

https://{region}.console.aws.amazon.com/sagemaker/home?region={region}#/pipelines/{pipelineName}/executions/{executionArn}

This deep-links into the SageMaker classic console for the specific pipeline execution, where you can see the step graph, per-step CloudWatch Logs, generated artifacts, and the execution history.
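
For reference, the URL template above can be filled in with a small Python helper. The function name is hypothetical, and the values are substituted verbatim; whether the console expects the execution ARN to be URL-encoded is not stated here, so this sketch leaves it unencoded.

```python
def monitor_url(region: str, pipeline_name: str, execution_arn: str) -> str:
    """Fill the documented console URL template with the given values."""
    return (
        f"https://{region}.console.aws.amazon.com/sagemaker/home"
        f"?region={region}#/pipelines/{pipeline_name}/executions/{execution_arn}"
    )


if __name__ == "__main__":
    print(
        monitor_url(
            "us-east-1",
            "demo",
            "arn:aws:sagemaker:us-east-1:123456789012:pipeline/demo/execution/abc",
        )
    )
```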

Idempotency

SageMaker StartPipelineExecution and StopPipelineExecution require a ClientRequestToken (32–128 characters). Polysync generates a fresh GUID-derived token (polysync-<guid>) for each logical start or stop operation. If Polysync retries that operation after a transient failure, it reuses the same token, so SageMaker treats the retry as the same request rather than a duplicate.
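
The token handling can be illustrated with a short Python sketch: one token is created per logical operation and passed unchanged to every retry. The helper names, the retry count, and the choice of which exceptions count as transient are all assumptions made for the example, not Polysync's actual retry policy.

```python
import time
import uuid


def new_client_request_token() -> str:
    """Build a polysync-<guid> token and check the 32-128 character bound."""
    token = f"polysync-{uuid.uuid4()}"
    if not 32 <= len(token) <= 128:
        raise ValueError("token length outside the 32-128 character range")
    return token


def call_with_retries(operation, attempts=3, delay_seconds=0.0,
                      retry_on=(ConnectionError, TimeoutError)):
    """Run operation(token), reusing one token across all retries."""
    token = new_client_request_token()  # created once, outside the loop
    for attempt in range(1, attempts + 1):
        try:
            return operation(token)
        except retry_on:
            if attempt == attempts:
                raise  # give up after the final attempt
            time.sleep(delay_seconds)


# Usage with a real client (requires boto3 and AWS credentials):
#   call_with_retries(lambda token: client.start_pipeline_execution(
#       PipelineName="demo", ClientRequestToken=token))
```

Generating the token inside the loop instead would give each retry a new identity and defeat the idempotency check.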

Troubleshooting

  • AccessDeniedException on StartPipelineExecution — the caller is missing sagemaker:StartPipelineExecution on arn:aws:sagemaker:{region}:{account}:pipeline/{name}, or is missing iam:PassRole on the supplied execution role.
  • ResourceNotFoundException — confirm the pipeline name exists in the target region and that the caller can DescribePipeline.
  • ValidationException on parameter type — the value supplied does not match the declared parameter type in the pipeline definition (e.g., a non-numeric string for an Integer parameter).
  • ResourceLimitExceeded — the account/region has hit a SageMaker quota (concurrent executions, training jobs, endpoints). Request a quota increase from AWS Service Quotas.
  • Web Identity Federation InvalidIdentityToken — check that the IAM Identity Provider's thumbprint matches login.microsoftonline.com, the audience is sts.amazonaws.com, and the role trust policy allows the Polysync workload identity's sub / oid.
  • No structured output parameters — this is by design. Use the SageMaker Studio UI, S3, or the Model Registry to retrieve per-step outputs.

Scope notes

This initial provider integrates SageMaker Pipelines only — the most natural fit for Polysync's pipeline-orchestration model. Other SageMaker resources (training jobs, processing jobs, transform jobs, hyperparameter tuning jobs, real-time endpoints) are not yet exposed as Polysync Job types. They can be invoked indirectly by wrapping them in a SageMaker Pipeline step (TrainingStep, ProcessingStep, TransformStep, TuningStep, LambdaStep), which is the AWS-recommended pattern for orchestrated ML workflows.