Three Principles of Boomi Data Integration Design: Empty Pipe
I recently had the chance to speak with many Boomi customers. Boomi World 17 was an opportunity for us at Kitepipe to connect with our customers and the larger Boomi community. These discussions often turned quickly to best practices and design approaches. Here are three principles that I apply in my integration projects:
1. Empty Pipe
2. No Black Hole
3. Part of the Business Process
This post will consider Empty Pipe.
What Does "Empty Pipe" Mean for Data Integration Design?
Empty Pipe in a Boomi integration simply means that the integration process should not store, or persist, any data or state information from one execution to another. Another way to say this is that the integration should be stateless: every time you run the integration, it starts from the same initial, empty state.
Four questions immediately come to mind:
– Why is Empty Pipe important?
– Aren’t all Boomi integrations designed this way (in that values and caches don’t persist from one run to another)?
– When would you violate this?
– How do you make an integration stateless?
Why Is Empty Pipe Important for Data Integration Design?
Several reasons, including failure management and re-runnability. If the integration generates intermediate data or information, then that is an additional data store that must be managed. That data, or state, must be stored somewhere, and that somewhere must be managed.
The new data store can run out of memory, have its media fail, be offline, or be out of sync. If something fails in the integration, you have to deal with three possible recovery states: the intermediate storage was updated, the storage was not updated, or the storage was partially updated. Contrast this with an empty pipe design, where in case of failure you simply re-run the integration.
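To make the contrast concrete, here is a minimal sketch in plain Python (not Boomi code) of an empty pipe run that recovers from a failure by simply running again. The FlakyTarget class, the run_integration function, and the sample rows are all hypothetical stand-ins, and the sketch assumes the target supports an upsert keyed by record id, so that repeating a write is harmless.

```python
class FlakyTarget:
    """Hypothetical target endpoint that fails once, partway through a run."""

    def __init__(self, fail_on_id=None):
        self.records = {}
        self.fail_on_id = fail_on_id

    def upsert(self, record):
        if record["id"] == self.fail_on_id:
            self.fail_on_id = None  # fail only the first time
            raise ConnectionError("target unavailable")
        # Keyed by id, so writing the same record twice leaves one copy.
        self.records[record["id"]] = record


def run_integration(source_rows, target):
    """One stateless execution: read, transform, write. Nothing is kept between runs."""
    for row in source_rows:
        target.upsert({"id": row["id"], "name": row["name"].upper()})


source = [
    {"id": 1, "name": "acme"},
    {"id": 2, "name": "globex"},
    {"id": 3, "name": "initech"},
]
target = FlakyTarget(fail_on_id=2)

try:
    run_integration(source, target)
except ConnectionError:
    # Recovery is just a re-run from the source; there is no
    # intermediate store to inspect or repair first.
    run_integration(source, target)

print(sorted(target.records))  # prints [1, 2, 3]
```

The first run writes record 1 and then fails on record 2. The re-run starts from the same empty state, rewrites record 1 without creating a duplicate, and completes the rest.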
Aren't All Boomi Integrations Designed To Be Stateless?
Well, that depends. Yes, an individual integration process is “stateless” in that, other than a persisted property, nothing is saved in the process. But processes can be designed to use intermediate storage that captures partially processed transactions. This can cause recovery problems.
You would violate Empty Pipe in a few situations – notably if you can’t rerun from a source system, such as when you receive transactions from a business partner or a service call that you cannot rerun or re-query. In that case you often want to immediately store the transaction so that you can re-run the transaction if needed.
How Do You Make A Boomi Integration Stateless?
For transactions that cannot be re-queried from the source, a more elegant solution than ad hoc storage is to use a Queue as the initial repository, and to remove each transaction at the end, once it has been successfully processed. Then the downstream process is stateless: it just pulls from the queue and processes, not caring whether the transaction is new or a re-run.
Another case is logging. You will often want to write logs that capture transaction state and status, but these are passive and should not be used to drive subsequent processing logic, because the log write itself may have failed.
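The queue pattern can be sketched as follows. This is plain Python using an in-memory deque as a stand-in for a real message queue, not a Boomi API; the drain function, the handler, and the transaction ids are hypothetical names chosen for illustration. The point it shows is the ordering: a transaction is removed only after it has been processed successfully.

```python
from collections import deque


def drain(queue, handler):
    """Process queued transactions, removing each one only after it succeeds."""
    processed = []
    while queue:
        txn = queue[0]      # peek at the head without removing it
        handler(txn)        # may raise; the transaction then stays queued
        queue.popleft()     # remove last, once processing has succeeded
        processed.append(txn["id"])
    return processed


attempts = {}


def handler(txn):
    """Hypothetical downstream step that fails the first time it sees 'B'."""
    attempts[txn["id"]] = attempts.get(txn["id"], 0) + 1
    if txn["id"] == "B" and attempts["B"] == 1:
        raise RuntimeError("downstream error")


queue = deque([{"id": "A"}, {"id": "B"}, {"id": "C"}])

try:
    drain(queue, handler)
except RuntimeError:
    pass  # 'A' was removed; 'B' and 'C' are still waiting on the queue

remaining_after_failure = [t["id"] for t in queue]
second_run = drain(queue, handler)  # the same code handles the re-run

print(remaining_after_failure)  # prints ['B', 'C']
print(second_run)               # prints ['B', 'C']
```

The downstream code has no notion of "new" versus "retry": whatever is on the queue gets processed, which is what keeps it stateless.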
The primary way to make integrations stateless is to store state in the application endpoints. For example, a status of “Not Integrated”, “Error”, or “Complete” is important in deciding how to handle a transaction in an integration. You could write this status to a status table in a database that is updated by the process and used in subsequent steps. But careful thought is needed to manage all possible failure modes to ensure that transactions don’t get “stuck” or double-processed.
Better is to post the status back to the source application, and design the integration processes to handle each status separately. You want to do the update last, after the processing. This approach works because applications (like Salesforce) are designed to store transaction “State”. In addition, an application has change logging, views, and edit capabilities that allow the user to see the status and edit it when needed.
When an integration process uses the application endpoints to track and manage transaction status, and exchange keys, we call this a Handshake.
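A Handshake of this kind might look like the sketch below. It is plain Python with in-memory dictionaries standing in for source application records, not Salesforce or Boomi API calls; the field names (status, target_key, error), the run_handshake function, and the send_to_target callable are assumptions made for illustration. The status values are the ones named above, and the status update is deliberately the last step.

```python
def run_handshake(source_records, send_to_target):
    """Process records the source marks 'Not Integrated'; write status back last."""
    for rec in source_records:
        if rec["status"] != "Not Integrated":
            continue  # 'Complete' and 'Error' records are handled elsewhere
        try:
            target_key = send_to_target(rec)
        except Exception as exc:
            # Record the failure on the source record, where a user can see and fix it.
            rec["status"] = "Error"
            rec["error"] = str(exc)
            continue
        # Update last, after the processing: exchange the key, then mark complete.
        rec["target_key"] = target_key
        rec["status"] = "Complete"


target_calls = []


def send_to_target(rec):
    """Hypothetical target call that rejects records with no amount."""
    if rec["amount"] is None:
        raise ValueError("amount is required")
    target_calls.append(rec["id"])
    return "T-" + rec["id"]


source = [
    {"id": "S-1", "amount": 100, "status": "Not Integrated"},
    {"id": "S-2", "amount": None, "status": "Not Integrated"},
    {"id": "S-3", "amount": 50, "status": "Complete", "target_key": "T-S-3"},
]

run_handshake(source, send_to_target)
run_handshake(source, send_to_target)  # a second run finds nothing left to send

print([(r["id"], r["status"]) for r in source])
# prints [('S-1', 'Complete'), ('S-2', 'Error'), ('S-3', 'Complete')]
print(target_calls)  # prints ['S-1']
```

The integration itself holds no state between the two runs. Everything it needs to decide what to do is read from the source records, so a user who corrects S-2 and resets its status to "Not Integrated" makes it eligible for the next run.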
So, best practice is to try to architect your processes as empty pipes, to improve re-runnability and reliability.
Read Our Other Articles on No Black Hole & Process Integration Design