data integration

In this episode of our Happy Hour series, Allen Ellison, Managing Consultant for our Data & AI practice, covers SSIS and Azure Data Factory. He talks about the differences and similarities between the two, and when you should be using them. Enjoy!

“Hi everyone, thanks for joining. Want to take a little bit of time today to talk about SSIS and Azure Data Factory. What makes them alike? What makes them different and when we should be using one versus the other? SSIS and Azure Data Factory are both tools used for moving small or even large amounts of data from source to destination, and having the ability to transform that data whether it’s for aggregating a rolling up the data, remodeling that data completely for reporting and analytics, or even importing new or archiving old data. Way too many different options, almost unlimited to be able to list out at this point. So, SSIS has been an old reliable tool since 2005, typically resides on-prem, is included with runs from SQL Server tons and tons of different components in there to be able to do those transforms just as we need to. As far as Data Factory, on the other hand, resides solely in the cloud. That is the new cool kid on the block, and everybody is starting to look at it because of all the cloud based Azure data platform technologies that we have out there. Both of these tools have the main purpose for moving data from many, many different data sources to many different types of data destinations, and as they did transform that data structure. Why is this so important? Well, as an example, you’ve got a whole bunch of different applications you need to match that data up together. These are the kind of tools that you use. Maybe you want to create a report, for example for comparing product sales and employee wages to applications. Let’s get that data together so that you can analyze that in any way that you might see fit. We have a few different scenarios where we need to look at before choosing which tool might likely be suited for this particular job at hand or the data currently resides where the data is going to end up, like the landing spot, so typically speaking, destination is going to tell us a lot about which tools we want to use as a best practice, but it’s not 100%. It’s basically just a rule of thumb. If we’re moving data from on-prem to on-prem, the SSIS tool, probably is the tool of choice, since it already resides on-prem. Their reason to push it up the cloud just to pull it back down. Same kind of rule applies for Azure Data Factory. If all the data is sitting in the cloud and we wanted to land in the cloud, there is no reason to push down on-prem just to push it back up to the cloud. Just a better choice to leave it where it starts, or leave it where it ends, excuse me.

In the spirit of destination drives that technology moving data from on-prem to cloud, Azure Data Factory is probably going to be our best choice. If the data is going from cloud to on-prem, SSIS might be our best tool of choice. Let’s remember that even outside of your network that has to travel over the internet in order to consume is typically considered cloud based like Web APIs, Internet, RSS feeds, cloud drives, anything of the like. So is easy enough? Well, maybe not. We have one more question, it comes up a lot, a whole lot, and that is: “I have hundreds of SSIS packages already, but I want to migrate to Azure and move all of my data out to the cloud. What am I supposed to do in this kind of situation?” Well, you’re in luck. Microsoft, infinite wisdom. They have given us the ability to migrate SSIS packages to Azure Databases using something called Integration Runtimes or IRs. This allows us to migrate the data, change the connection strings that SSIS had and migrate the SSIS packages to Azure. As time goes on, kind of have a choice there. You can either rebuild the packages as as-needed in Azure Data Factory or you can just say, “You know what? This package is running perfectly fine, just like they are. Let’s just start moving forward from here and creating Azure Data Factory packages.” Either way, there’s a product that’s going to do the best, but if you need a little bit of help with deciding what technologies to use or how to be able to move it, feel free to give us a call. Will be happy to help you anytime. Thanks everyone!”