Understand the identity used during component execution in Synapse
Published Aug 04 2021 03:36 PM
Microsoft

Azure Synapse Analytics lets data engineers and data analysts build pipelines that implement ETL scenarios. A Synapse pipeline can include Spark notebooks, SQL queries, and components such as Dataflows, and it can be configured to execute using the workspace identity. Debugging a pipeline becomes complicated when the workspace identity and the user executing the pipeline (or one of its internal components) have different levels of access to data in storage accounts such as ADLS Gen2, or in dedicated SQL pools. It therefore helps to understand which identity Synapse uses in user-driven scenarios such as pipeline execution, SQL script execution, and notebook execution.

 

The table below lists which identity is used in each component execution scenario in Synapse. Synapse currently supports the workspace managed identity; for more details, see "Azure Synapse workspace managed identity" in the References. Alternate identities, such as service principals or SQL logins, can also be used in certain scenarios.
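As an illustration of what an alternate identity can look like for Spark workloads, the sketch below builds the storage settings a notebook might apply to reach ADLS Gen2 with an account key or a service principal instead of the signed-in user. The setting names are the standard Hadoop ABFS driver keys; the helper function names, the sample account name, and the idea of applying these settings directly in a Synapse notebook (rather than through a linked service) are assumptions made for this example.

```python
from typing import Dict


def account_key_conf(account: str, account_key: str) -> Dict[str, str]:
    """Hadoop ABFS setting for shared-key (account key) access to one storage account."""
    return {f"fs.azure.account.key.{account}.dfs.core.windows.net": account_key}


def service_principal_conf(
    account: str, tenant_id: str, client_id: str, client_secret: str
) -> Dict[str, str]:
    """Hadoop ABFS OAuth settings for service principal access to one storage account."""
    host = f"{account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{host}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{host}":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        f"fs.azure.account.oauth2.client.id.{host}": client_id,
        f"fs.azure.account.oauth2.client.secret.{host}": client_secret,
        f"fs.azure.account.oauth2.client.endpoint.{host}":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }


if __name__ == "__main__":
    # Placeholder values; a real secret would normally come from a key vault.
    conf = service_principal_conf(
        "mystorageacct", "<tenant-id>", "<client-id>", "<client-secret>"
    )
    for key in conf:
        print(key)  # print setting names only, never the secret value
    # Inside a notebook session, each pair could then be applied with:
    #     for key, value in conf.items():
    #         spark.conf.set(key, value)
```

Keeping the settings in a plain dictionary makes it easy to check which identity a notebook session is configured to use before any data is read.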

 

| Scenario | Workspace System Assigned Managed Identity | User's Azure Active Directory (AAD) Identity | Alternate Identity |
|---|---|---|---|
| Pipeline debug | x | | x (account keys and service principals can be used) |
| Pipeline execution (manual) | x | | x (account keys and service principals can be used) |
| Trigger execution | x | | x (account keys and service principals can be used) |
| Execute Dataflow activity | x | | x (account keys and service principals can be used) |
| Dataset browse | x | | x (account keys and service principals can be used) |
| Dataflow data preview | x | | x (account keys and service principals can be used) |
| View pipeline output | | x | |
| Notebook execution | | x | x (account keys, SAS tokens, AAD tokens, secrets, service principals, and managed identity can be used) |
| Notebook execution via pipeline | x | | x (account keys, SAS tokens, AAD tokens, secrets, service principals, and managed identity can be used) |
| View notebook output | | x | |
| Spark job execution | | x | x (account keys, SAS tokens, AAD tokens, secrets, service principals, and managed identity can be used) |
| Spark job execution via pipeline | x | | x (account keys, SAS tokens, AAD tokens, secrets, service principals, and managed identity can be used) |
| View Spark job input/output | | x | |
| External tables with PolyBase/Copy | | x | x (SQL authentication, SAS tokens, managed identity, and SPN can be used) |
| SQL script execution | | x | x (SQL authentication, SAS tokens, managed identity, and SPN can be used) |
| SQL script execution via pipeline | x | | x (SQL authentication, SAS tokens, managed identity, and SPN can be used) |
| OPENROWSET | | x | x (SQL authentication, SAS tokens, managed identity, and SPN can be used) |
| View SQL script output | | x | |
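When debugging an access failure, it can help to look up which identity needs permissions for the failing scenario. The sketch below encodes the table as a small Python lookup; the constant and function names are hypothetical and not part of any Synapse API, and the data is copied from the table as published.

```python
from typing import Dict, Optional, Tuple

WORKSPACE_MI = "workspace managed identity"
USER_AAD = "user AAD identity"

PIPELINE_ALT = "account keys and service principals"
SPARK_ALT = ("account keys, SAS tokens, AAD tokens, secrets, "
             "service principals, and managed identity")
SQL_ALT = "SQL authentication, SAS tokens, managed identity, and SPN"

# scenario -> (default identity, alternate identities or None if not applicable)
IDENTITY_TABLE: Dict[str, Tuple[str, Optional[str]]] = {
    "Pipeline debug": (WORKSPACE_MI, PIPELINE_ALT),
    "Pipeline execution (manual)": (WORKSPACE_MI, PIPELINE_ALT),
    "Trigger execution": (WORKSPACE_MI, PIPELINE_ALT),
    "Execute Dataflow activity": (WORKSPACE_MI, PIPELINE_ALT),
    "Dataset browse": (WORKSPACE_MI, PIPELINE_ALT),
    "Dataflow data preview": (WORKSPACE_MI, PIPELINE_ALT),
    "View pipeline output": (USER_AAD, None),
    "Notebook execution": (USER_AAD, SPARK_ALT),
    "Notebook execution via pipeline": (WORKSPACE_MI, SPARK_ALT),
    "View notebook output": (USER_AAD, None),
    "Spark job execution": (USER_AAD, SPARK_ALT),
    "Spark job execution via pipeline": (WORKSPACE_MI, SPARK_ALT),
    "View Spark job input/output": (USER_AAD, None),
    "External tables with PolyBase/Copy": (USER_AAD, SQL_ALT),
    "SQL script execution": (USER_AAD, SQL_ALT),
    "SQL script execution via pipeline": (WORKSPACE_MI, SQL_ALT),
    "OPENROWSET": (USER_AAD, SQL_ALT),
    "View SQL script output": (USER_AAD, None),
}


def default_identity(scenario: str) -> str:
    """Return the identity the table lists for the scenario (KeyError if unknown)."""
    return IDENTITY_TABLE[scenario][0]


def alternate_identities(scenario: str) -> Optional[str]:
    """Return the alternate identities the table lists, or None if there are none."""
    return IDENTITY_TABLE[scenario][1]


if __name__ == "__main__":
    for name in ("Notebook execution", "Notebook execution via pipeline"):
        print(f"{name}: {default_identity(name)}")
```

The two notebook rows show the pattern that most often causes confusion: the same notebook runs as the user when executed interactively but as the workspace managed identity when executed from a pipeline, so both identities need access to the underlying data.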

 

 References

  1. Azure Synapse workspace managed identity
  2. Linked Services in Azure Data Factory
  3. Secure credentials with linked services using the TokenLibrary
  4. SQL Authentication in Synapse
  5. Control storage account access for serverless SQL pool in Azure Synapse Analytics
Last update: Sep 15 2021 12:23 PM