Monthly Archives: July 2012

Synchronous and Asynchronous behaviour in SSIS

Most of us have heard the terms ‘Synchronous’ and ‘Asynchronous’ in SSIS but have always been confused between the two. Let’s brush up the concepts.

All the data flow components can be categorized as synchronous and asynchronous. Below table will list out the clear differences between the two:

Synchronous

Asynchronous

  1. The number of rows input to the synchronous components is equal to the number of rows output from synchronous components
  2. Components use the same buffer
  3. All destination components are synchronous
  4. Non-blocking transformations are synchronized
  1. The number of records output from Asynchronous components can be less\ more than the number of records input to the components
  2. Components use different buffer
  3. All source components are asynchronous
  4. Partial\ Full blocking transformations are asynchronous in nature

*If anyone does not have idea about non-blocking\ partial blocking\ full blocking transformations then don’t worry, just keep following my blog. Post comments if you have any queries\ like my blog.

Hope you have a good time from now.

Enjoy guys!!

Advertisements

SSIS Package Design Concepts – Control Flow and Data Flow

Let’s explore the two fundamental concepts related with package design

  • Control Flow
  • Data Flow

Control flow:

As the name suggest, ‘Control Flow’ means controlling the flow of execution of SSIS package. While designing the ETL package, we need to think how one container can be connected to another, how related tasks can be connected. To connect different components of SSIS, we need to identify whether we need to make tasks sequential or parallel. This whole design depicts the flow of execution and is implemented in ‘Control Flow’ component.

Control flow can consist of multiple tasks such as DFTs, Sequence Container, Execute SQL task and so on. These tasks are linked with the help of ‘Precedence Constraints’ (explained later). Following snapshot shows the location of ‘Control Flow’ in SSIS designer

Control Flow in SSIS Designer

The below screenshot shows an example of how the flow looks like:

Flow of Execution

Note:

  1. The two sequence containers will run in parallel with each other
  2. Within the sequence container, the tasks mentioned will run in sequence. Ex: First sequence container contains 3 tasks – ‘Execute SQL Task’, ‘DFT task’ and ‘For Each Loop Container’. Firstly the ‘Execute SQL Task’ will execute then ‘DFT task’ and then ‘For Each task’
  3. ‘Send Mail Task’ will execute when both the containers execute successfully

Data Flow Task (DFT):

DFT is the heart of SSIS. Most of ETL operations are performed in DFT. As the name suggest, this task is used to transfer the data from source to destination. While transferring the data, operations like merge, merge join, data conversion etc. can be performed. This whole process involves Extracting the data from source, Transformation of data in the form expected in destination and then Load the data into destination which is nothing but ETL. However, the ‘Transformation’ part is optional.

DFT has to be a part of Control flow. To add DFT to control flow, drag the DFT task from SSIS toolbox to designer. Below screenshot shows a DFT in Control flow:

DFT

On double clicking the DFT, Data Flow Tab will open where you can add Source, Destination and any transformation if required. Below screenshot adds one source, transformation and one destination:

DFT

Precedence Constraint:

Now we have different tasks in control flow or DFTs, but the question is how will we create flow between these tasks? For this purpose, Precedence Constraint is used. They connect different tasks\ containers and direct the flow of execution.

Precedence Constraint can be of 3 types:

  • Success – indicated by Green arrow
  • Failure – indicated by Red arrow
  • Completion – indicated by Blue arrow

Below screenshot will show different Precedence Constraint:

Precedence Constraint

How to create Precedence Constraint?

Whenever you drag any task from SSIS toolbox to the designer, you must have noticed a green arrow coming from task. To create a Precedence Constraint, you need to connect this arrow with another task\ container.

On double clicking the constraint following dialogue box will open:

Precedence Constraint

One can perform following things on Precedence Constraint:

  • Write Constraints
  • Write Expressions
  • For multiple constraints you can choose Logical And/ Logical Or

SSIS v/s DTS

Functionality differences:

1. Purpose:

SSIS is built to perform data integration, assist with database administration, and facilitate business intelligence whereas DTS was not built for these purposes.

2. Separation of duties:

SQL Server Integration Services separates workflow and data flow operations whereas DTS does not provide this separation

3. In DTS we were allowed to create only independent packages whereas in SSIS we can create multiple related packages

Design Differences:

1. In DTS, the designer consists of a single pane.

In SSIS, the designer is split into 5 design panes:

  •   Control Flow,
  •   Data Flow,
  •   Event Handlers
  •   Package Explorer, and
  •   Parameter

2. Solution Explorer

In DTS, related packages of the module cannot be stored in a single place. In SSIS, we can have multiple packages relation to one module in a single Solution.This is achieved using Solution Explorer

SSIS Oh! What’s that..

SSIS stands for SQL Server Integration Services. It is:

  • a platform for building enterprise-level data integration and data transformations solutions.
  • built to assist with database administration, and facilitate business intelligence.

Data integration\ transformations -Integration Services can extract and transform data from a wide variety of sources such as XML data files, flat files, and relational data sources, and then load the data into one or more destinations.

Database Administration

  • Data can be migrated between SQL Server tables, Excel files, flat files and many more
  • Data can be transferred at table or query level. This gives DBA the ability to either aggregate the data or transform the data as it is migrated.

Business Intelligence

Data Integration tools like SSIS are mainly used to perform Extract, Transform, and Load (ETL) operations. From a high level, BI projects usually involve an end point of reporting. Often times as developers we are used to transactional report writing such as the details of a PO but BI can get into very broad reports that cover product sales trends over decades and deal with hundreds of millions of records. The way we design databases for applications is not ideal for this kind of reporting so other tools and technologies were invented and are used in the BI space. These are things like Cubes which you often hear called OLAP cubes. OLAP cubes usually originate from a data warehouse which is nothing more than another database – but typical warehouses contain data that came from more than one, and often dozens of other application databases. Your inventory app, purchasing app, HR app and a whole bunch of others all contain bits and pieces of data that create a complete picture of the business, a BI architect will use something like SSIS to pull the data from all these systems, massage it and store it in the data warehouse which is designed with a different kind of design better for reporting. Once it is in the warehouse he will use Analysis services to create cubes on that data and something like Reporting Services to show you reports over that data.

SSIS – Execute Process Task (Getting rid of xp_cmdshell)

Running a batch file through SSIS??

Most of us have been running commands on command prompt using SQL Server. And we know for these commands to run from SQL Server, we need to enable xp_cmdshell option of SQL Server.
Did we ever wonder why xp_cmdshell is disabled by default? I guess majority will say no.
There is a security issue attached with xp_cmdshell.
May be in your production environment, you cannot enable this option. Then what will you do? How will you run such commands?
Well, through SSIS you can achieve this task. You can write all your commands in a batch file and execute this batch file from ‘Execute Process Task’.
Now you don’t need to enable xp_cmdshell

Enjoy!!