SELECT SIFISO | https://selectsifiso.com

How to Load Slowly Changing Dimension in Azure Using MERGE
Mon, 20 Nov 2023 | https://selectsifiso.com/blog/how-to-load-slowly-changing-dimension-in-azure-using-merge/

The post How to Load Slowly Changing Dimension in Azure Using MERGE appeared first on SELECT SIFISO.

A data warehouse is typically made up of fact and dimension tables that are loaded using different load patterns, such as incremental vs. full, and batch vs. real-time. The different types of fact and dimension tables can further influence how the data warehouse gets loaded. A Slowly Changing Dimension (SCD) is one type of dimension table found in a data warehouse. It works by storing, tracking and managing current and historical data over a period of time within the data warehouse.

In this article, we look at one example of a data warehouse load, specifically an incremental load of a type 2 Slowly Changing Dimension using a SQL Merge statement. SQL Merge is an ANSI standard (SQL:2003) supported by database systems such as Oracle, PostgreSQL and SQL Server. To demonstrate loading a Slowly Changing Dimension using a SQL Merge, we will use the following:

  1. SQL Server – specifically T-SQL within Microsoft’s Azure SQL DB
  2. Premier League dataset
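Before diving into the Premier League example, it helps to recall the general shape of an ANSI MERGE statement. The sketch below is illustrative only; the table and column names (dim_customer, stg_customer) are not part of this article’s dataset:

```sql
-- Generic MERGE skeleton: compare incoming rows against a dimension
MERGE dim_customer AS TARGET
USING stg_customer AS SOURCE
    ON TARGET.customer_key = SOURCE.customer_key
WHEN MATCHED THEN
    UPDATE SET TARGET.city = SOURCE.city          -- change an existing row
WHEN NOT MATCHED BY TARGET THEN
    INSERT (customer_key, city)                   -- add a brand-new row
    VALUES (SOURCE.customer_key, SOURCE.city);
```

Each WHEN branch is optional, which is what lets a single statement cover inserts, updates and deletes in one pass.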

Premier League Use Case

The Premier League (also known as the EPL) is England’s top football league and one of the most watched football leagues around the globe. It is organized by seasons, which run from August until May of the following year. A total of 38 games are played by each of the 20 teams participating in a season, with the Premier League trophy awarded to the team that accumulates the most points.

Historical and current season data is available for download from various sports websites. One such sports website is Eurosport.com which keeps track of league standings per season per team along with goals scored, games won, draws, losses, goals conceded etc. An example of Premier League standings data from Eurosport.com is shown in Figure 1.

Figure 1: Premier League 2013-2014 Table Standings

The Premier League use case is a perfect candidate to simulate data warehouse load as it has various data sets that could be used to apply different data warehouse load patterns. Teams in the Premier League refer to the different football clubs that play in the Premier League. Teams is therefore one data set that we could store in a dimension to simulate a data warehouse load. Possible attributes of a Team dimension are:

  1. Name of team
  2. Team description
  3. The season that the team plays in (although season can also be a standalone dimension). 

Stage Data into Azure SQL DB

To get started, we extract the Premier League data shown in Figure 1 into a table in Azure SQL DB. The definition of our staging table is shown in Script 1.

CREATE TABLE [dbo].[PL_Season_2010_2011](
	[NO] [tinyint] NOT NULL,
	[TEAMS] [nvarchar](50) NOT NULL,
	[P] [tinyint] NOT NULL,
	[W] [tinyint] NOT NULL,
	[D] [tinyint] NOT NULL,
	[L] [tinyint] NOT NULL,
	[GF] [tinyint] NOT NULL,
	[GA] [tinyint] NOT NULL,
	[GD] [smallint] NOT NULL,
	[PTS] [tinyint] NOT NULL
) ON [PRIMARY]
GO
Script 1: CREATE TABLE Statement for the staging table
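If you want to test the staging table without going through the import wizard, a row can be inserted manually. The statement below is a sketch only; the statistics shown are for demonstration purposes:

```sql
-- Illustrative manual staging of one row (values for demonstration only)
INSERT INTO [dbo].[PL_Season_2010_2011]
    ([NO], [TEAMS], [P], [W], [D], [L], [GF], [GA], [GD], [PTS])
VALUES
    (1, N'Manchester United', 38, 23, 11, 4, 78, 37, 41, 80);
```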

The simplest way to stage the data is to copy it from Eurosport.com into an Excel spreadsheet and then import it using SSMS, as shown in Figure 2.

Figure 2: Stage season data using SSMS

Figure 3 shows Premier League 2010-2011 season data that has been imported and staged into the SQL table dbo.PL_Season_2010_2011.

Figure 3: Premier League 2010-2011 season data

The rest of the different Premier League seasons’ data was successfully staged into separate tables as shown in Figure 4.

Figure 4: Successfully Staged Premier League Season

Create Dimension Table

Data staging helps prepare the definition of the dimension table that will be used to store the final data. Script 2 provides a CREATE TABLE statement for our Team dimension. You will notice in line 1 of Script 2 that the Team dimension is created under the DIM schema; this makes it easier to identify data warehouse dimension tables.

CREATE TABLE [DIM].[Team](
	[TeamID] [int] IDENTITY(1,1) NOT NULL,
	[Team] [varchar](255) NOT NULL,
	[Season] [varchar](25) NOT NULL,
	[IsActiveRecord] [bit] NULL,
	[IsDeletedRecord] [bit] NULL  DEFAULT ((0)) ,
	[ActiveRecord_Start] [datetime2](7) NOT NULL  DEFAULT (GETDATE()),
	[ActiveRecord_End] [datetime2](7) NOT NULL  DEFAULT ('2099-12-31')
) ON [PRIMARY]
GO
Script 2: CREATE TABLE Statement for Team dimension

The definition of our Team dimension consists of business-related fields such as Team (for team name) and Season. However, data warehouse designs typically recommend the creation of metadata columns in addition to business key columns. Thus, lines 2 and 5-8 represent such metadata columns in our dimension. The complete names and descriptions of the columns used in Script 2 are provided in Table 1.

Column Name | Column Description

TeamID | a surrogate key used to uniquely identify a record in the dimension
Team | stores the names of the teams participating in Premier League seasons; Team is also the unique business key identifier
Season | stores the Premier League season, i.e. 2010-2011, 2011-2012
IsActiveRecord | a Boolean value used to indicate current and historical values. Current values are indicated by one (1), whereas a zero (0) indicates a historical record
IsDeletedRecord | indicates a record that no longer exists at source
ActiveRecord_Start | tracks the effective start date from which the record is current/active. All records are assigned a default value of the current date until they are closed off and replaced by the latest versions of the records
ActiveRecord_End | tracks the effective end date up to which the record is current/active. If the record remains active, the default value is set to a future date of 2099-12-31
Table 1: Definition of Team dimension
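With the metadata columns described above, the current state of the dimension can be read with a simple filter, for example:

```sql
-- Current (active) rows only; historical versions carry IsActiveRecord = 0
SELECT TeamID, Team, Season, ActiveRecord_Start, ActiveRecord_End
FROM [DIM].[Team]
WHERE IsActiveRecord = 1
  AND IsDeletedRecord = 0;
```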

Build SQL Merge Statement

Now that we have prepared our staging dataset and created our Team dimension, we move next to build a SQL Merge statement that will be used to load the Team dimension. Generally, SQL Merge statements work by comparing source data (incoming records) against target data (existing records in a dimension table). If data at the source doesn’t exist at the target, then you can choose to insert the new data into the dimension.

Script 3 is a statement for preparing our source data using one of the Premier League season’s data that has previously been staged. The script simply returns a list of unique Teams from the 2010-2011 season’s data.

SELECT DISTINCT
	[TEAMS] as [Team]
	,'2010-2011' as Season
	, GETDATE() AS ActiveRecord_Start
FROM [dbo].[PL_Season_2010_2011]
Script 3: Source query for the Team dimension

The complete SQL Merge statement for our Team dimension is shown in Script 4. Some highlights of the SQL Merge statement are as follows:

  1. The join between source and target is based on the Team name. This comparison is only done against records that are currently active in the dimension.
  2. For WHEN MATCHED, we run an update of the target provided that the team or season is not the same.
  3. The load into the data warehouse is from lines 29-46.
MERGE [DIM].[Team] AS TARGET
USING (
		SELECT DISTINCT
			 [TEAMS] as [Team]
			,'2010-2011' as Season
			, GETDATE() AS ActiveRecord_Start
		FROM [dbo].[PL_Season_2010_2011]
) AS SOURCE
ON (
		TARGET.[Team] = SOURCE.[Team]
	AND TARGET.IsActiveRecord = 1
)
WHEN MATCHED 
	AND (
		   TARGET.[Team] <> SOURCE.[Team]
		OR TARGET.[Season] <> SOURCE.[Season]
		)
	THEN
    UPDATE SET
		
		 TARGET.[Season] = SOURCE.[Season]
		,TARGET.ActiveRecord_Start = SOURCE.ActiveRecord_Start

WHEN NOT MATCHED BY SOURCE 
		AND TARGET.IsActiveRecord = 1

	THEN DELETE

WHEN NOT MATCHED BY TARGET THEN
  --Add new records into the dimension
  INSERT (
       [Team]
      ,[Season]
      ,[IsActiveRecord]
	  ,[IsDeletedRecord]	
      ,[ActiveRecord_Start]
      ,[ActiveRecord_End]
)
  VALUES(
	SOURCE.[Team]
	,SOURCE.[Season]
	,1
	,0
	,SOURCE.ActiveRecord_Start
	,'2099-12-31 23:59'
)
;
Script 4: SQL Merge statement for loading the Team dimension

Figure 5 shows the data loaded into the Team dimension following the execution of Script 4. By default, all the new records have an IsActiveRecord flag of one (1), an IsDeletedRecord flag of zero (0), and ActiveRecord_Start set to the current date.

Figure 5: Team Dimension load with 2010-2011 Season Data
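A quick sanity check after the load is to count the active rows per season; a full Premier League season should yield 20 teams:

```sql
-- Active team count per season after the first MERGE run
SELECT Season, COUNT(*) AS TeamCount
FROM [DIM].[Team]
WHERE IsActiveRecord = 1
GROUP BY Season;
```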

Implement Slowly Changing Dimension

At this point, we have successfully demonstrated the following:

  1. Prepare and Stage data into the data warehouse (hosted in SQL Azure DB)
  2. Create a target Team dimension under a separate schema (DIM)
  3. Build source query to identify Premier League team records
  4. Build a SQL Merge statement to load Team dimension

What would happen if we were to load the next season’s data from our staging tables, such as the 2011-2012 season?

Well, the SQL Merge statement will do the following:

  1. Update the existing records
  2. Insert new Teams
  3. Delete teams that no longer exist at the source

However, this will not version-control our existing records; it would simply implement a type 1 SCD, which replaces old records with new ones. As mentioned earlier, we aim to implement a type 2 SCD, which means we need a mechanism to close out existing records and flag new versions of the records as active.

The mechanism to do this in the SQL Merge is the OUTPUT clause. The OUTPUT clause is used to capture the versions of records that have been deleted, updated or inserted. We can extend our current type 1 SCD functionality by tracking all records that are deleted or updated, storing them in a table variable, and then writing those records back into the Team dimension.
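In isolation, the OUTPUT clause works as in the simplified sketch below; this is not the full Team dimension logic, only an illustration of how the pre-change column values are captured:

```sql
-- Capture the pre-update image of changed rows into a table variable
DECLARE @Changes TABLE (ActionType nvarchar(10), OldSeason varchar(25));

MERGE [DIM].[Team] AS TARGET
USING (SELECT 'Arsenal' AS Team, '2011-2012' AS Season) AS SOURCE
    ON TARGET.Team = SOURCE.Team
WHEN MATCHED THEN
    UPDATE SET TARGET.Season = SOURCE.Season
OUTPUT $action, DELETED.Season INTO @Changes (ActionType, OldSeason);
```

Here DELETED exposes the old column values of updated (or deleted) rows, which is exactly what we need in order to re-insert them as closed-off historical records.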

Script 5 shows the complete SQL Merge stored procedure that implements type 2 SCD.

  1. Line 3 declares a table variable that will be used to store records captured by the OUTPUT clause.
  2. Lines 61-70 write any updated/deleted/inserted records into the table variable.
  3. Lines 73-110 then write the updated and deleted records back into our Team dimension.
CREATE PROC [DIM].[prcTeam] as

DECLARE @DIM_Team TABLE
(
   ActionType [varchar](25) NULL,
    [Team] [varchar](255)  NULL,
	[Season] [varchar](25)  NULL,
	[IsActiveRecord] [bit] NULL,
	[IsDeletedRecord] [bit] NULL DEFAULT (0),
	[ActiveRecord_Start] [datetime2](7)  NULL,
	[ActiveRecord_End] [datetime2](7)  NULL
);

MERGE [DIM].[Team] AS TARGET
USING (
		SELECT DISTINCT
			 [TEAMS] as [Team]
			,'2011-2012' as Season
			, GETDATE() AS ActiveRecord_Start
		FROM [dbo].[PL_Season_2011_2012]
) AS SOURCE
ON (
		TARGET.[Team] = SOURCE.[Team]
	AND TARGET.IsActiveRecord = 1
)
WHEN MATCHED 
	AND (
		   TARGET.[Team] <> SOURCE.[Team]
		OR TARGET.[Season] <> SOURCE.[Season]
		)
	THEN
    UPDATE SET
		
		 TARGET.[Season] = SOURCE.[Season]
		,TARGET.ActiveRecord_Start = SOURCE.ActiveRecord_Start

WHEN NOT MATCHED BY SOURCE 
		AND TARGET.IsActiveRecord = 1

	THEN DELETE

WHEN NOT MATCHED BY TARGET THEN
  --Add new records into the dimension
  INSERT (
       [Team]
      ,[Season]
      ,[IsActiveRecord]
	  ,[IsDeletedRecord]	
      ,[ActiveRecord_Start]
      ,[ActiveRecord_End]
)
  VALUES(
	SOURCE.[Team]
	,SOURCE.[Season]
	,1
	,0
	,SOURCE.ActiveRecord_Start
	,'2099-12-31 23:59'
)

OUTPUT
	$action AS ActionType
	,DELETED.[Team]
	,DELETED.[Season]
	,DELETED.IsActiveRecord
	,DELETED.[IsDeletedRecord]
	,DELETED.[ActiveRecord_Start]
	,GETDATE()
	
INTO @DIM_Team;

----expire previously active record
INSERT INTO [DIM].[Team]
(
       [Team]
      ,[Season]
      ,[IsActiveRecord]
	  ,IsDeletedRecord
      ,[ActiveRecord_Start]
      ,[ActiveRecord_End]
)
SELECT  
	   [Team]
      ,[Season]
      ,0 as IsActiveRecord
	  ,0 IsDeletedRecord
      ,[ActiveRecord_Start]
	  ,[ActiveRecord_End]	
FROM  @DIM_Team
WHERE ActionType in ('UPDATE')

----expire previously active record
INSERT INTO [DIM].[Team]
(
       [Team]
      ,[Season]
      ,[IsActiveRecord]
	  ,IsDeletedRecord
      ,[ActiveRecord_Start]
      ,[ActiveRecord_End]
)
SELECT  
	   [Team]
      ,[Season]
      ,0 as IsActiveRecord
	  ,1 IsDeletedRecord
      ,[ActiveRecord_Start]
	  ,[ActiveRecord_End]	
FROM  @DIM_Team
WHERE ActionType in ('DELETE')
;
GO
Script 5: SQL Merge Proc with OUTPUT Clause

Figure 6 shows a preview of our type 2 SCD Team dimension with two seasons’ worth of data (the 2010-2011 and 2011-2012 seasons). You will notice, for instance, that the Arsenal team appears in both seasons; as a result, the latest season’s record (2011-2012) carries the active record flag. The Birmingham City team only appeared in the 2010-2011 season, so it was flagged as deleted during the processing of the 2011-2012 season data; the same is true of the Blackpool team at line 8. These teams were relegated at the end of the 2010-2011 season.

Figure 6: Results of the Team Dimension into a Type 2 SCD using a Merge
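To inspect the type 2 history of a single team, a query such as the following can be used (the team name is just an example):

```sql
-- Full version history for one team, oldest first
SELECT Team, Season, IsActiveRecord, IsDeletedRecord,
       ActiveRecord_Start, ActiveRecord_End
FROM [DIM].[Team]
WHERE Team = 'Arsenal'
ORDER BY ActiveRecord_Start;
```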

Summary

This article has demonstrated a data warehouse load using a SQL Merge statement. We prepared and staged the Premier League seasons’ data from the Eurosport.com website into an Azure SQL DB. We then created a target Team dimension and populated it using the SQL Merge statement.

SELECT SIFISO Celebrates a successful 2022 as a team
Mon, 06 Feb 2023 | https://selectsifiso.com/blog/select-sifiso-celebrates-a-successful-2022-as-a-team/


SELECT SIFISO celebrated the end of a successful and productive 2022 at Cradle Moon Lakeside Game Lodge in Muldersdrift, Gauteng, from 2 December to 4 December 2022. During the getaway, the team took some time to relax, bond as a team, and reflect on a fruitful year at SELECT SIFISO.

We spent two days at the lodge taking in the serenity that comes with a bushveld experience. The itinerary during our stay included game drives, a boat cruise, and a competitive (but fun!) games night. The team also had a tour of the Cradle of Humankind, which was declared a World Heritage Site by UNESCO in 1999. We also made time to enjoy a relaxing day of fun in the sun at Happy Island Waterworld, which features some of the world’s best water park slides and rides.

The much-needed and fulfilling time together as a team came to an end with an awards ceremony, where the founder and director of SELECT SIFISO, Sifiso Ndlovu, recognised and celebrated the hardworking members of his team.

SELECT SIFISO award winners

CATEGORY | NAME
Most recognized employee | Emmanuel Patrick
Team Player | Mbali Gumbi
Late comer | Bafana Makhubela
Always meets deadline | Danny Masonto
Excellent listener | Neide Cumbane
Most organized | Thembelihle Ndlovu
Showed more Growth | Mandisa Vilakazi
Golden Oldie | Rachel Ndawo
Most Punctual | Nhlanhla Mape


The post SELECT SIFISO Celebrates a successful 2022 as a team appeared first on SELECT SIFISO.

Export data from a Power BI dashboard
Thu, 26 Jan 2023 | https://selectsifiso.com/blog/export-data-from-a-power-bi-dashboard/

The post Export data from a Power BI dashboard appeared first on SELECT SIFISO.

Step 1: Open a dashboard in the Power BI service and select a tile with a visual.

Step 2: From the upper right corner of the tile, open the More options (…) dropdown and select Export data.

Step 3: A pop-up screen will appear; click Export.

The data will be exported as an Excel file.

Team Announcement
Mon, 23 Jan 2023 | https://selectsifiso.com/blog/team-announcement-2/

The post Team Announcement appeared first on SELECT SIFISO.

We are pleased to announce that two of our staff members have been offered permanent positions within Select Sifiso. We offer permanent positions on our team once a staff member has been with us for a year.

Rachel Ndawo has filled the role of Executive PA and Project Manager since January 2022. We are delighted to announce that she will be fulfilling this role in a permanent position.

Mandisa Vilakazi has filled the role of Office Administrator since January 2022. We are excited to announce that she has been appointed to this role with a permanent position.

Thank you for being a fundamental part of our team. Congratulations Rachel and Mandisa!

Giving Back to Deserving Communities
Wed, 11 Jan 2023 | https://selectsifiso.com/blog/giving-back-to-deserving-communities/

The post Giving Back to Deserving Communities appeared first on SELECT SIFISO.

SELECT SIFISO takes its commitment to CSI seriously and we make time to contribute to worthy causes in our own unique ways.

As an IT company, we applaud and support the work done by NGOs in the ICT sector. One of these NGOs is Siyakhula Computer School, based at the Emthonjeni Community Centre in the township of Zandspruit, Johannesburg, which provides affordable, accessible, and quality computer literacy training.

To end off a successful 2022 and as part of our CSI activities that aim to support organizations that provide access to ICT training to disadvantaged communities, SELECT SIFISO donated a laptop, keyboard, and an onboard computer camera to Siyakhula Computer School.

The school operates as a locally owned and managed social enterprise that creates sustainable employment using traditional business methodologies to meet the social need for low-cost, high-quality ICT and end-user computer literacy skills training. Some of the ICT training provided by the school is aimed at adults in the community, which opens further opportunities for learning and employment for them. All the students attending Siyakhula Computer School graduate with a certificate after completing a course at the school.

The People’s Race
Fri, 11 Nov 2022 | https://selectsifiso.com/blog/the-peoples-race/

The post The People’s Race appeared first on SELECT SIFISO.

On Sunday, 6 November 2022, our team participated in the 27th “The People’s Race”. The race took place at the FNB Stadium in Nasrec, Johannesburg.

Three races took place during the course of the event: the Standard Marathon, the Half Marathon, and the 10 Kilometer Race.

Completing a race within the allotted time awarded participants a medal.

Four members of our team not only participated in the 10 Kilometer Race but completed it within the allotted time.

Nhlanhla Mape | 49m 03s

Mbali Gumbi | 1h 22m 00s

Rachel Ndawo | 1h 34m 59s

Neide Cumbane | 1h 39m 27s

I want to congratulate the team and all those who participated. You did Select Sifiso proud.

Latest Blog Post – Importing XML in SSIS
Fri, 14 Oct 2022 | https://selectsifiso.com/blog/latest-blog-post-sourcing-xml-in-ssis/

The post Latest Blog Post – Importing XML in SSIS appeared first on SELECT SIFISO.

Using SSIS to Import HTTP-Based XML Document

http://www.blogs.selectsifiso.com/?p=1009

Programmatically Create Data Flow Task inside a Sequence Container Using C#
Fri, 14 Oct 2022 | https://selectsifiso.com/blog/programmatically-create-data-flow-task-inside-a-sequence-container-using-c/

The post Programmatically Create Data Flow Task inside a Sequence Container Using C# appeared first on SELECT SIFISO.

Abstract
This article explains how to create an SSIS package with a Data Flow Task inside a Sequence Container using C# programming language.
Requirements

  1. Microsoft Visual Studio 2008
  2. SQL Server 2008
  3. AdventureWorks2008R2 database

Article
If the above requirements are all met, we will begin by launching Microsoft Visual Studio 2008.
Create a new Integration Services Project, which is located under Business Intelligence Projects.
After you have named the new project, proceed to click and drag a Script Task into the Control Flow pane of the new package.
Right-click the Script Task and click "Edit".
In the Script Task Editor, change the "ScriptLanguage" to "Microsoft Visual C# 2008".
In Project Explorer import relevant references and ensure that you have declared namespaces as below:
using System;
using System.Data;
using System.Data.Sql;
using System.Data.SqlTypes;
using System.Data.SqlClient;
using Microsoft.SqlServer.Server;
using Microsoft.SqlServer.Dts.Tasks;
using Microsoft.SqlServer.Dts.Tasks.ExecuteSQLTask;
using Microsoft.SqlServer.Dts.Runtime;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using RuntimeWrapper = Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.IO;
After declarations, create an instance of the application and package:
Application SIFISO_app = new Application();
Package dyna_pkg = new Package();

Create a connection to the AdventureWorks2008R2 database.
ConnectionManager ConMgr = dyna_pkg.Connections.Add("OLEDB");
ConMgr.ConnectionString = "Provider=SQLOLEDB.1;" +
"Integrated Security=SSPI;Initial Catalog=AdventureWorks2008R2;" +
"Data Source=(local);";
ConMgr.Name = "ConMgr_OLEDB";
ConMgr.Description = "OLE DB connection to the AdventureWorks2008R2 database.";

Insert a Sequence container:
Sequence exec_SEQ = (Sequence)dyna_pkg.Executables.Add("STOCK:SEQUENCE");
exec_SEQ.FailPackageOnFailure = true;
exec_SEQ.FailParentOnFailure = true;
exec_SEQ.Name = @"select SIFISO Sequence Container";
exec_SEQ.Description = @"select SIFISO Sequence Container";

Add the following Execute SQL Tasks:
Executable exec = exec_SEQ.Executables.Add("STOCK:SQLTask");
TaskHost th = exec as TaskHost;
th.Properties["Name"].SetValue(th, "Create View");
th.Properties["Description"].SetValue(th, "Drops and creates a SQL view based on the AdventureWorks database");
th.Properties["Connection"].SetValue(th, "ConMgr_OLEDB");
// Note: T-SQL has no CREATE OR REPLACE VIEW; drop the view first, then create it
th.Properties["SqlStatementSource"].SetValue(th, "IF OBJECT_ID('v_Sales','V') IS NOT NULL DROP VIEW v_Sales; EXEC('CREATE VIEW v_Sales AS SELECT * FROM HumanResources.Employee')");

Executable exec2 = exec_SEQ.Executables.Add("STOCK:SQLTask");
TaskHost th2 = exec2 as TaskHost;
th2.Properties["Name"].SetValue(th2, "select from view");
th2.Properties["Description"].SetValue(th2, "select from view");
th2.Properties["Connection"].SetValue(th2, "ConMgr_OLEDB");
th2.Properties["SqlStatementSource"].SetValue(th2, "SELECT * FROM v_Sales");

Executable exec3 = dyna_pkg.Executables.Add("STOCK:SQLTask");
TaskHost th3 = exec3 as TaskHost;
th3.Properties["Name"].SetValue(th3, "delete View");
th3.Properties["Description"].SetValue(th3, "delete View");
th3.Properties["Connection"].SetValue(th3, "ConMgr_OLEDB");
th3.Properties["SqlStatementSource"].SetValue(th3, "DROP VIEW v_Sales");

Insert a data flow task:
Executable e = exec_SEQ.Executables.Add("STOCK:PipelineTask");
TaskHost thMainPipe = e as TaskHost;
MainPipe dataFlowTask = thMainPipe.InnerObject as MainPipe;
thMainPipe.Name = "selectSIFISO data Flow";

Add a source OLEDB connection manager to the package:

ConnectionManager cm = dyna_pkg.Connections.Add("OLEDB");
cm.Name = "OLEDB ConnectionManager";
cm.ConnectionString = string.Format(
"Provider=SQLOLEDB.1;Data Source={0};Initial Catalog={1};Integrated Security=SSPI;", "your_source_server", "your_db");

Add a destination OLEDB connection manager to the package:

// Add an OLEDB connection manager to the package.
ConnectionManager cm_DES = dyna_pkg.Connections.Add("OLEDB");
cm_DES.Name = "DES OLEDB";
cm_DES.ConnectionString = string.Format(
"Provider=SQLOLEDB.1;Data Source={0};Initial Catalog={1};Integrated Security=SSPI;", "your_destination_server", "your_db");

Add an OLE DB source to the data flow:

IDTSComponentMetaData100 component =
dataFlowTask.ComponentMetaDataCollection.New();
component.Name = "OLEDBSource";
component.ComponentClassID = "DTSAdapter.OleDbSource.2";

Get the design time instance of the component and initialize the component:

CManagedComponentWrapper instance = component.Instantiate();
instance.ProvideComponentProperties();

Specify the connection manager:

if (component.RuntimeConnectionCollection.Count > 0)
{
component.RuntimeConnectionCollection[0].ConnectionManager =
DtsConvert.GetExtendedInterface(dyna_pkg.Connections[0]);
component.RuntimeConnectionCollection[0].ConnectionManagerID =
dyna_pkg.Connections[0].ID;
}

Set the custom properties:

instance.SetComponentProperty("AccessMode", 2);
instance.SetComponentProperty("SqlCommand",
"SELECT * FROM your_source_table");

Reinitialize the source metadata:

instance.AcquireConnections(null);
instance.ReinitializeMetaData();
instance.ReleaseConnections();

Add the destination component:

IDTSComponentMetaData100 destination =
dataFlowTask.ComponentMetaDataCollection.New();
destination.ComponentClassID = "DTSAdapter.OleDbDestination";
destination.Name = "OLEDBDestination";

Get destination design-time instance and initialise component:

CManagedComponentWrapper destDesignTime = destination.Instantiate();
destDesignTime.ProvideComponentProperties();

Set destination connection:

destination.RuntimeConnectionCollection[0].ConnectionManagerID = cm_DES.ID;
destination.RuntimeConnectionCollection[0].ConnectionManager = DtsConvert.GetExtendedInterface(cm_DES);

Set destination table name:

destDesignTime.SetComponentProperty("OpenRowset", "your_destination_table");
destDesignTime.SetComponentProperty("AccessMode", 3);
destDesignTime.SetComponentProperty("FastLoadOptions", "TABLOCK,CHECK_CONSTRAINTS");

Connect the source to the destination component:

dataFlowTask.PathCollection.New().AttachPathAndPropagateNotifications(component.OutputCollection[0],
destination.InputCollection[0]);

Get input and virtual input for the destination to select and map columns:

IDTSInput100 destinationInputerr = destination.InputCollection[0];
IDTSVirtualInput100 destinationVirtualInputerr = destinationInputerr.GetVirtualInput();
IDTSVirtualInputColumnCollection100 destinationVirtualInputColumnserr =
destinationVirtualInputerr.VirtualInputColumnCollection;

Reinitialize the destination metadata:

destDesignTime.AcquireConnections(null);
destDesignTime.ReinitializeMetaData();
destDesignTime.ReleaseConnections();

Get the destination’s default input and virtual input:

IDTSInput100 input = destination.InputCollection[0];
IDTSVirtualInput100 vInput = input.GetVirtualInput();

Iterate through the virtual input column collection:

foreach (IDTSVirtualInputColumn100 vColumn in vInput.VirtualInputColumnCollection)
{
// Select column, and retain new input column
IDTSInputColumn100 inputColumn = destDesignTime.SetUsageType(input.ID,
vInput, vColumn.LineageID, DTSUsageType.UT_READONLY);
// Find external column by name
IDTSExternalMetadataColumn100 externalColumn =
input.ExternalMetadataColumnCollection[inputColumn.Name];
// Map input column to external column
destDesignTime.MapInputColumn(input.ID, inputColumn.ID, externalColumn.ID);
}

Join the Execute SQL Tasks:
PrecedenceConstraint pcFileTasks3 =
exec_SEQ.PrecedenceConstraints.Add((Executable)exec, (Executable)e);
pcFileTasks3.Value = DTSExecResult.Success;

PrecedenceConstraint pcFileTasks =
exec_SEQ.PrecedenceConstraints.Add((Executable)e, (Executable)exec2);
pcFileTasks.Value = DTSExecResult.Success;

PrecedenceConstraint pcFileTasks2 =
dyna_pkg.PrecedenceConstraints.Add((Executable)exec_SEQ, (Executable)exec3);
pcFileTasks2.Value = DTSExecResult.Success;

We then save the package to the file system and report success:

SIFISO_app.SaveToXml("C:\\TEMP\\pkg_create_DTS_Seq_Sql_Tasks.dtsx", dyna_pkg, null);
Dts.TaskResult = (int)ScriptResults.Success;
Conclusion
It's that simple!
You can now execute your script task and the package will be created in the location you specified.
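If you want to sanity-check the result, the saved package can be loaded back and executed from the same script. This is a sketch, not part of the original walkthrough: it reuses the file path from the code above, and the variable names verifyApp, generated and result are new.

```csharp
// Optional check: load the generated package back and execute it
// (uses the same SSIS runtime assemblies referenced earlier)
Application verifyApp = new Application();
Package generated = verifyApp.LoadPackage(
    "C:\\TEMP\\pkg_create_DTS_Seq_Sql_Tasks.dtsx", null);
DTSExecResult result = generated.Execute();
// DTSExecResult.Success means every task in the package ran successfully
```

This also gives you an early warning if a connection manager or column mapping was wired incorrectly, since Execute() surfaces those as task failures.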

The post Programmatically Create Data Flow Task inside a Sequence Container Using C# appeared first on SELECT SIFISO.

Extract & Save SQL Statements of Execute SQL Tasks in a ForEach Loop Container using C# https://selectsifiso.com/blog/extract-save-sql-statements-of-execute-sql-tasks-in-a-foreach-loop-container-using-c/ Fri, 14 Oct 2022 00:00:00 +0000

Abstract
This article explains how to extract and save SQL Statements found in Execute SQL Tasks of an SSIS package that contains ForEach Loop container using C# programming language.

Requirements

Microsoft Visual Studio 2008
SQL Server 2008
Article
This article continues from another article which can be found here.
Launch Visual Studio 2008 and create an Integration Services Project. After the default (new) package has launched, drag a script task to the control flow pane.
Right-click to edit the script task. In your Script Task Editor, ensure that you have selected Microsoft Visual C# as your programming language.
At the bottom of your Script Task Editor, click "Edit Script".
Add the following references:
using System;
using System.Data;
using System.Data.Sql;
using System.Data.SqlTypes;
using System.Data.SqlClient;
using Microsoft.SqlServer.Server;
using Microsoft.SqlServer.Dts.Tasks;
using Microsoft.SqlServer.Dts.Tasks.ExecuteSQLTask;
using Microsoft.SqlServer.Dts.Runtime;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using RuntimeWrapper = Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.IO;
Proceed to load the package (that contains the Execute SQL Tasks you would like to extract and save).
Application app = new Application();

Load the package:
Package importPackage = app.LoadPackage("C:\\TEMP\\pkg_Execute_Sql_Tasks.dtsx", null);
Establish a connection to the SQL database where you will store the extracted SQL statements:
SqlConnection connectiont = new SqlConnection(
    string.Format("Data Source={0};Initial Catalog={1};Integrated Security=SSPI;", "(local)", "your_db"));
SqlCommand commandt;
connectiont.Open();

Declare variables that will be used to store extracted queries:
string src_query2 = “”;
string src_query3 = “”;
string src_query = “”;
string sql_task_name = “”;

The rest of the code is as follows:
foreach (Executable executable in importPackage.Executables)
{
    //This part is for Execute SQL Tasks found within the ForEach Loop Container
    DtsContainer container = (DtsContainer)executable;
    if (executable.GetType().Name == "ForEachLoop")
    {
        ForEachLoop seq = (ForEachLoop)executable;
        foreach (Executable ForLp_exec in seq.Executables)
        {
            DtsContainer ForLp_container = (DtsContainer)ForLp_exec;
            if (ForLp_exec.GetType().Name == "TaskHost")
            {
                TaskHost loop = (TaskHost)ForLp_exec;
                ExecuteSQLTask sqlTask = (ExecuteSQLTask)loop.InnerObject;
                src_query = sqlTask.SqlStatementSource.ToUpper();
                sql_task_name = ForLp_container.Name;

                //Split the extracted query on the keyword GO
                string[] stringSeparators3 = new string[] { "GO" };
                string[] result3 = src_query.Split(stringSeparators3, StringSplitOptions.None);
                foreach (string s in result3)
                {
                    //NB: You can replace table name "SQL_Task" with your own table
                    commandt = new SqlCommand("INSERT INTO SQL_Task VALUES(@SRC_Q,@SQL_NAME)", connectiont);
                    commandt.Parameters.Add(new SqlParameter("@SRC_Q", s));
                    commandt.Parameters.Add(new SqlParameter("@SQL_NAME", sql_task_name));
                    commandt.ExecuteNonQuery();
                }
            }
        }
    }
    //This part is for Execute SQL Tasks found outside of the ForEach Loop Container
    if (executable.GetType().Name == "TaskHost")
    {
        TaskHost loop = (TaskHost)executable;
        ExecuteSQLTask sqlTask = (ExecuteSQLTask)loop.InnerObject;
        src_query = sqlTask.SqlStatementSource;
        sql_task_name = container.Name;

        string[] stringSeparators = new string[] { "GO" };
        string[] result = src_query.Split(stringSeparators, StringSplitOptions.None);
        foreach (string s in result)
        {
            commandt = new SqlCommand("INSERT INTO SQL_Task VALUES(@SRC_Q,@SQL_NAME)", connectiont);
            commandt.Parameters.Add(new SqlParameter("@SRC_Q", s));
            commandt.Parameters.Add(new SqlParameter("@SQL_NAME", sql_task_name));
            commandt.ExecuteNonQuery();
        }
    }
}
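Note that the INSERT statements above assume the SQL_Task table already exists in your_db. A minimal sketch for creating it on the same connection follows; the column names match the parameters above, but the data types and sizes are assumptions, not from the original article.

```csharp
// Assumed staging table for the extracted statements (types/sizes are a guess)
SqlCommand createTable = new SqlCommand(
    @"IF OBJECT_ID('dbo.SQL_Task') IS NULL
          CREATE TABLE dbo.SQL_Task (
              SRC_Q    NVARCHAR(MAX),  -- extracted SQL statement
              SQL_NAME NVARCHAR(256)   -- name of the Execute SQL Task
          );", connectiont);
createTable.ExecuteNonQuery();
```

NVARCHAR(MAX) is used for SRC_Q because Execute SQL Task statements can easily exceed a fixed-length column.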

Dts.TaskResult = (int)ScriptResults.Success;
Conclusion
It's that simple!
You can now execute your script task and the SQL statements will be extracted and saved into your table.

The post Extract & Save SQL Statements of Execute SQL Tasks in a ForEach Loop Container using C# appeared first on SELECT SIFISO.

Extract & Save SQL Statements in Execute SQL Tasks using C# https://selectsifiso.com/blog/programmatically-extract-save-sql-statements-in-execute-sql-tasks-using-c/ Fri, 14 Oct 2022 00:00:00 +0000

Abstract
This article explains how to extract and save SQL Statements in Execute SQL Tasks of an SSIS package using C# programming language.

Requirements

Microsoft Visual Studio 2008
SQL Server 2008
Article
I was recently tasked by my boss at work with extracting the SQL statements from an SSIS package that contained over 1400 Execute SQL Task steps. Consequently, I programmed a script task using C# that reads each Execute SQL Task and saves its SQL statement to a SQL Server 2008 table. In this article, I have created a similar script using the example I discussed in the article which can be found here.
Okay, let’s begin.
Launch Visual Studio 2008 and create an Integration Services Project. After the default (new) package has launched, drag a script task to the control flow pane.
Right-click to edit the script task. In your Script Task Editor, ensure that you have selected Microsoft Visual C# as your programming language.
At the bottom of your Script Task Editor, click "Edit Script".
Add the following references:
using System;
using System.Data;
using System.Data.Sql;
using System.Data.SqlTypes;
using System.Data.SqlClient;
using Microsoft.SqlServer.Server;
using Microsoft.SqlServer.Dts.Tasks;
using Microsoft.SqlServer.Dts.Tasks.ExecuteSQLTask;
using Microsoft.SqlServer.Dts.Runtime;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using RuntimeWrapper = Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.IO;
Proceed to load the package (that contains the Execute SQL Tasks you would like to extract and save).
Application app = new Application();

Load the package from SQL Server:
Package importPackage = app.LoadFromSqlServer(@"\\your_pkg_name", "(local)", null, null, null);

or load it from the Integration Services server:
Package importPackage = app.LoadFromDtsServer(@"File System\your_pkg_name", "yourserver", null);

or load it from a hard disk:
Package importPackage = app.LoadPackage("C:\\TEMP\\your_pkg_name.dtsx", null);
Establish a connection to the SQL database where you will store the extracted SQL statements:
SqlConnection connectiont = new SqlConnection(
    string.Format("Data Source={0};Initial Catalog={1};Integrated Security=SSPI;", "(local)", "your_db"));
SqlCommand commandt;
connectiont.Open();

Declare variables that will be used to store the extracted queries:

string src_query2 = “”;
string src_query3 = “”;
string src_query = “”;
string sql_task_name = “”;

Loop through the Execute SQL Tasks in the package:
foreach (Executable executable in importPackage.Executables)
{
    DtsContainer Seq_container = (DtsContainer)executable;
    if (executable.GetType().Name == "TaskHost")
    {
        TaskHost loop = (TaskHost)executable;
        ExecuteSQLTask sqlTask = (ExecuteSQLTask)loop.InnerObject;
        src_query = sqlTask.SqlStatementSource.ToUpper();
        sql_task_name = Seq_container.Name;

        //split the SQL query on the keyword "GO"
        string[] stringSeparators = new string[] { "GO" };
        string[] result = src_query.Split(stringSeparators, StringSplitOptions.None);

        //for each extracted and split statement, insert it into a table
        foreach (string s in result)
        {
            commandt = new SqlCommand("INSERT INTO Your_tb VALUES(@SRC_Q,@SQL_NAME)",
                connectiont);
            commandt.Parameters.Add(new SqlParameter("@SRC_Q", s));
            commandt.Parameters.Add(new SqlParameter("@SQL_NAME", sql_task_name));
            commandt.ExecuteNonQuery();
        }
    }
}
Dts.TaskResult = (int)ScriptResults.Success;
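One caveat about splitting on "GO": String.Split matches the literal substring anywhere, so a word such as CATEGORY would also be split. If your packages contain identifiers like that, splitting only on a GO that sits alone on a line is safer. A small self-contained sketch of that alternative (plain .NET, no SSIS references needed; the class and variable names are illustrative only):

```csharp
using System;
using System.Text.RegularExpressions;

class GoSplitDemo
{
    static void Main()
    {
        string script = "SELECT 1\nGO\nSELECT Category FROM t\nGO";
        // Treat GO as a batch separator only when it appears alone on a line
        string[] batches = Regex.Split(
            script, @"^\s*GO\s*$",
            RegexOptions.Multiline | RegexOptions.IgnoreCase);
        foreach (string b in batches)
        {
            string batch = b.Trim();
            if (batch.Length > 0)
                Console.WriteLine(batch);
        }
    }
}
```

This mirrors how SQL Server Management Console and sqlcmd interpret GO: as a batch terminator on its own line, not as a keyword inside a statement.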

Conclusion
It's that simple!
You can now execute your script task and the Execute SQL Task statements will be extracted and saved into your table.

The post Extract & Save SQL Statements in Execute SQL Tasks using C# appeared first on SELECT SIFISO.
