Hello everyone, by way of introduction my name is Brendan Peterson, Team Lead for Scribe's Services department. Now, in my role here at Scribe I've encountered many types of questions and comments regarding Insight, but the most common question is how to make my integrations faster. As you can imagine, this is not a two-sentence answer, but I'd like to share with all of you some of the common ways we've found to boost performance.
The first, and most obvious, is to reduce the amount of work Scribe has to do. That may come across as a cop-out, but I assure you I've got more details to back it up! When working with Insight try to imagine your DTS not in context of the number of steps and rows, but rather the number of database operations needed per record. Let's take a common setup; someone trying to integrate new and changed account data between two systems. This integration methodology requires us to first check to see if the account record we are working with exists in the target system. If it does we then want to update that record with the appropriate data. So this design would have 1 seek, 1 update, and 1 insert.
Now when seeing this design you would imagine that, worst case scenario, you have a maximum of 2 operations per record (seek + insert, or seek + update). But you would be wrong; inherently an update does a "pre-seek" and then attempts the update. Therefore a successful seek step has 2 database operations it commits, not just 1. The only way to avoid a situation like this would be to make better use of pre-operation formulas and eliminate the seek step being used (IE have a success of 0 from the update indicate the record does not exist in the system).
Additionally if you find yourself re-using DBLOOKUP or SQLQUERY functions within the formulas of a DTS, you should store these in a calculated variable and utilize that instead. Every time a DBLOOKUP or SQLQUERY function is used, it's an additional database operation being performed, thus if you store it in a variable, it's only committed 1 time to bring the results back to the DTS.
Looking outside the DTS itself, you can increase performance exponentially by making use of Queue based integrations within Scribe. Our queue based technology uses multi-threading, allowing us to increase the amount of data being passed to the target system. We also make use of our Message Processors in a powerful fashion. The message processors have functionality built in to remember the context of a DTS. So, if we are passing Accounts, Contacts, and Sales Orders from, let's say Dynamics GP and Dynamics CRM, our message processors will pick up a message from the queue, make the appropriate connections to GP or CRM, and pass that record through. Before closing the connections, they will attempt to find another XML message of the same type, and pass that through the open connections from the first message.
The reason this is such a good feature is that we do not need to re-connect to our data sources time and time again. This process can be slow, especially with applications such as CRM. By maintaining our connections we can increase the speed of jobs. When using Time, File, or Query based integrations we do not have this same functionality.
There are also some "app-specific" changes we can make with our adapters. Using the Salesforce.com adapter you should make sure Compression has been enabled, and change the Query Batching settings when moving larger set of data. Make sure your machine running Scribe has enough memory to store the data (this feature will hold more data in memory, but will make less hops over the internet passing it to Salesforce.com) and adjust this as needed. Lastly, use Upsert whenever possible for more simple data loads, as the Salesforce.com API will handle the relationships instead of forcing Scribe to create the connections.
For Dynamics CRM, you can adjust some particular settings with the adapter as well. We'll see the same change to the query batching as in Salesforce.com, but also removing Pick List Validation for CRM. These and more suggestions can be found in this blog posting from Customer Effective as well.
There are plenty of other ways to boost performance of Scribe, keep checking back here as my focus will solely be the technical features and functions of Insight, and making the best use of them.
So in this case, what about using an Update/Insert step instead? How is performance with one step compared to the two options you discussed?
Posted by: Michael Cross | 08/28/2009 at 02:04 PM
Your performance would be the same in the fact that you would either do an update, or an insert. But the benefit of using multiple steps is that you can use Pre-op formulas to direct data. So in my example I'm checking for NULL data in the Key Cross lookup, and also doing a DBLOOKUP to see if the account exists in the table. If those both come back #NULL! I move on to my insert, which saves me from having to perform the update and wait for it to return a success of zero. I also do have the success of zero on an update go to my insert as a "just in case", that should not happen from my design...
Posted by: Brendan Peterson | 08/28/2009 at 02:43 PM