<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="https://skynorthsoftware-com.azurewebsites.net/blog/rss/xslt"?>
<rss xmlns:a10="http://www.w3.org/2005/Atom" version="2.0">
  <channel>
    <title>SkyNorth</title>
    <link>https://skynorthsoftware-com.azurewebsites.net/blog/</link>
    <description>Blog</description>
    <generator>Articulate, blogging built on Umbraco</generator>
    <item>
      <guid isPermaLink="false">1220</guid>
      <link>https://skynorthsoftware-com.azurewebsites.net/blog/posts/copy-blob-event-trigger/</link>
      <category>Azure</category>
      <title>Copy Azure Blobs with Data Factory Event Trigger</title>
      <description>
&lt;p&gt;&lt;strong&gt;NOTE:&lt;/strong&gt;  This article applies to version 2 of Data Factory. The integration described in this article depends on &lt;a href="https://azure.microsoft.com/services/event-grid/"&gt;Azure Event Grid&lt;/a&gt;. Make sure that your subscription is registered with the Event Grid resource provider. For more info, see &lt;a href="https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-manager-supported-services#portal"&gt;Resource providers and types&lt;/a&gt;.  
&lt;/p&gt;
&lt;h3&gt;Overview&lt;/h3&gt;
&lt;p&gt;This article builds on the concepts described in &lt;a href="https://docs.microsoft.com/en-us/azure/data-factory/copy-activity-overview"&gt;Copy Activity in Azure Data Factory&lt;/a&gt;. We will be using an &lt;a href="https://docs.microsoft.com/en-us/azure/data-factory/how-to-create-event-trigger"&gt;event trigger&lt;/a&gt; to copy blobs between Azure Storage accounts. &lt;/p&gt;
&lt;p&gt;There are scenarios where you need to copy blobs between storage accounts. It could be for another layer of redundancy, or simply to move data to a lower-tiered storage account for cost optimization. In this article, we will show you how to use Data Factory event triggers to copy blobs immediately after they are created.&lt;/p&gt;
&lt;h3&gt;Prerequisites&lt;/h3&gt;
&lt;p&gt;1.) An Azure Data Factory (version 2) instance&lt;br /&gt;
2.) Azure storage accounts for source and destination. Each storage account has a container called &lt;code&gt;backups&lt;/code&gt;&lt;br /&gt;
3.) Event Grid enabled as a resource provider on your Azure subscription  
&lt;/p&gt;
&lt;p&gt;This is what our resource group looks like:&lt;br /&gt;
&lt;strong&gt;NOTE:&lt;/strong&gt;
The storage accounts are in the same location to prevent egress charges.
&lt;/p&gt;
&lt;p&gt;&lt;img src="https://skynorthsoftwarecomstor.blob.core.windows.net/media/1025/resource-group-setup.png" alt="Resource Group" /&gt;&lt;/p&gt;
&lt;h3&gt;Create Linked Storage Accounts&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;NOTE:&lt;/strong&gt; Datasets and linked services are described in depth at &lt;a href="https://docs.microsoft.com/en-us/azure/data-factory/concepts-datasets-linked-services"&gt;Datasets and linked services in Azure Data Factory&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In this demo we will use two storage accounts named &lt;code&gt;dfcopyeventsource&lt;/code&gt; and &lt;code&gt;dfcopyeventdestination&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Create linked services for both storage accounts in Data Factory under the Connections tab. For simplicity, we will use the storage account key for authentication. We will add a &lt;code&gt;_ls&lt;/code&gt; suffix to the names so we can recognize them as linked services later.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://skynorthsoftwarecomstor.blob.core.windows.net/media/1016/create-linked-service-storage.png" alt="Create linked service menu" /&gt;&lt;/p&gt;
&lt;p&gt;Select &amp;quot;Use account key&amp;quot; as the authentication method.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://skynorthsoftwarecomstor.blob.core.windows.net/media/1017/create-linked-service-storage-source.png" alt="Create linked storage" /&gt;&lt;/p&gt;
&lt;p&gt;Both storage accounts are now linked.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://skynorthsoftwarecomstor.blob.core.windows.net/media/1015/create-linked-service-finished.png" alt="Create linked storage" /&gt;&lt;/p&gt;
&lt;h3&gt;Create Datasets&lt;/h3&gt;
&lt;p&gt;Now we will create a Dataset for each storage account, adding a &lt;code&gt;_ds&lt;/code&gt; suffix to the names. For simplicity, use binary copy. Be sure to set the path to the &lt;code&gt;backups&lt;/code&gt; container.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://skynorthsoftwarecomstor.blob.core.windows.net/media/1013/create-dataset-storage.png" alt="Create Dataset Storage" /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://skynorthsoftwarecomstor.blob.core.windows.net/media/1014/create-dataset-storage-name.png" alt="Create Dataset Storage Name" /&gt;&lt;/p&gt;
&lt;p&gt;Publish the Datasets.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://skynorthsoftwarecomstor.blob.core.windows.net/media/1024/publish-resources.png" alt="Publish Datasets" /&gt;&lt;/p&gt;
&lt;h3&gt;Create Pipeline&lt;/h3&gt;
&lt;p&gt;Now we will create a Pipeline and add the Copy Data activity. Our pipeline is called &lt;code&gt;copy_blob_pipeline&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://skynorthsoftwarecomstor.blob.core.windows.net/media/1018/create-pipeline.png" alt="Create Pipeline" /&gt;&lt;/p&gt;
&lt;p&gt;Drag the &lt;code&gt;Copy Data&lt;/code&gt; activity onto the canvas.&lt;br /&gt;
1.) Name it &lt;code&gt;Copy Data Activity&lt;/code&gt;&lt;br /&gt;
2.) Select the source Dataset&lt;br /&gt;
3.) Select the destination Dataset (sink)
&lt;/p&gt;
&lt;p&gt;&lt;img src="https://skynorthsoftwarecomstor.blob.core.windows.net/media/1019/create-pipeline-add-copy-data-activity.png" alt="Copy Data" /&gt;
&lt;img src="https://skynorthsoftwarecomstor.blob.core.windows.net/media/1012/copy-data-activity-set-sink.png" alt="Copy Data Sink" /&gt;&lt;/p&gt;
&lt;h3&gt;Testing Pipeline&lt;/h3&gt;
&lt;p&gt;Publish all resources and your Pipeline should be ready to test. We will add event triggers after confirming the copy data activity works as expected.&lt;/p&gt;
&lt;p&gt;Take the following steps to test the Pipeline:&lt;/p&gt;
&lt;p&gt;1.) Upload test data to the source storage account&lt;br /&gt;
2.) In Data Factory, click the &lt;code&gt;Debug&lt;/code&gt; button&lt;br /&gt;
3.) Verify the pipeline finishes without error and the data was copied to the destination
&lt;/p&gt;
&lt;p&gt;&lt;img src="https://skynorthsoftwarecomstor.blob.core.windows.net/media/1021/pipeline-test-before.png" alt="Before" /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://skynorthsoftwarecomstor.blob.core.windows.net/media/1022/pipeline-test-debug.png" alt="Debug" /&gt;
&lt;img src="https://skynorthsoftwarecomstor.blob.core.windows.net/media/1023/pipeline-test-results.png" alt="Results" /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://skynorthsoftwarecomstor.blob.core.windows.net/media/1020/pipeline-test-after.png" alt="After" /&gt;&lt;/p&gt;
&lt;h3&gt;Adding Event Trigger&lt;/h3&gt;
&lt;p&gt;We will be adding a &lt;code&gt;Blob Created&lt;/code&gt; event trigger to the &lt;code&gt;copy_blob_pipeline&lt;/code&gt;. It will watch the source storage account under the &lt;code&gt;backups&lt;/code&gt; container for new blobs, and then copy only the &lt;em&gt;new&lt;/em&gt; blobs to the destination storage account. To accomplish this, we need to parameterize our Pipeline and source Dataset. The event trigger will inject information about the blob into our parameters at runtime.&lt;/p&gt;
&lt;h4&gt;Parameterize Pipeline&lt;/h4&gt;
&lt;p&gt;Edit the &lt;code&gt;copy_blob_pipeline&lt;/code&gt; and add two parameters:&lt;br /&gt;
1.) &lt;code&gt;sourceFolder&lt;/code&gt;&lt;br /&gt;
2.) &lt;code&gt;sourceFile&lt;/code&gt;  
&lt;/p&gt;
&lt;p&gt;&lt;img src="https://skynorthsoftwarecomstor.blob.core.windows.net/media/1009/blob-event-trigger-pipeline-parameters.png" alt="Pipeline Parameters" /&gt;&lt;/p&gt;
&lt;h4&gt;Add Event Trigger with Parameters&lt;/h4&gt;
&lt;p&gt;We'll create the event trigger and configure it to inject the new blob's folder path and file name into the &lt;code&gt;sourceFolder&lt;/code&gt; and &lt;code&gt;sourceFile&lt;/code&gt; pipeline parameters we just created.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://skynorthsoftwarecomstor.blob.core.windows.net/media/1026/blob-event-trigger.png" alt="Trigger" /&gt;
&lt;img src="https://skynorthsoftwarecomstor.blob.core.windows.net/media/1008/blob-event-trigger-parameters.png" alt="Trigger Pamaeters" /&gt;&lt;/p&gt;
&lt;h4&gt;Parameterize Source Dataset&lt;/h4&gt;
&lt;p&gt;The final step is to parameterize the source Dataset &lt;code&gt;dfcopyeventsource_ds&lt;/code&gt;. On the Connection tab, add &lt;code&gt;@pipeline().parameters.sourceFolder&lt;/code&gt; and &lt;code&gt;@pipeline().parameters.sourceFile&lt;/code&gt; to their respective input boxes.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://skynorthsoftwarecomstor.blob.core.windows.net/media/1007/blob-event-trigger-dataset-parameters.png" alt="Pipeline Parameters" /&gt;&lt;/p&gt;
&lt;h4&gt;Testing Trigger&lt;/h4&gt;
&lt;p&gt;Publish All and the Event Trigger will be live and watching the source storage account. Upload a file to the source and then view the monitoring tab to validate the pipeline executed without issues. If all is configured correctly, the blob will be copied to the destination storage account.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://skynorthsoftwarecomstor.blob.core.windows.net/media/1011/blob-event-trigger-test-monitor-tab.png" alt="Test Trigger" /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://skynorthsoftwarecomstor.blob.core.windows.net/media/1010/blob-event-trigger-test.png" alt="Test Trigger" /&gt;&lt;/p&gt;
</description>
      <pubDate>Fri, 05 Oct 2018 16:11:11 Z</pubDate>
      <a10:updated>2018-10-05T16:11:11Z</a10:updated>
    </item>
  </channel>
</rss>