What if we replace the data integration teams with a process?
1) Search for existing data sources of the same or similar use or type.
1a) If an ingest already exists from the intended source, consider whether the current pipeline can be reused or extended.
1b) Establish a contract for multiple data products that use the same pipeline: define dependencies, refresh requirements, quality requirements, and classification.
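To make 1b concrete, here is a minimal sketch of what such a contract could look like in code. All class names, field names, and example values (`PipelineContract`, `ConsumerRequirement`, `erp_orders_ingest`, and so on) are hypothetical and only for illustration; a real contract would likely live in a catalog or a config file.

```python
from dataclasses import dataclass, field
from datetime import timedelta
from typing import List


@dataclass(frozen=True)
class ConsumerRequirement:
    """What one data product needs from the shared pipeline (hypothetical fields)."""
    data_product: str
    max_staleness: timedelta   # refresh requirement
    min_completeness: float    # quality requirement, 0.0 to 1.0


@dataclass
class PipelineContract:
    """Contract for one ingest pipeline shared by several data products."""
    pipeline: str
    source_system: str
    classification: str                                  # e.g. "internal", "confidential"
    depends_on: List[str] = field(default_factory=list)  # upstream dependencies
    consumers: List[ConsumerRequirement] = field(default_factory=list)

    def required_refresh_interval(self) -> timedelta:
        """The pipeline must refresh as often as its strictest consumer needs."""
        return min(c.max_staleness for c in self.consumers)

    def required_completeness(self) -> float:
        """The pipeline must meet the highest quality bar among its consumers."""
        return max(c.min_completeness for c in self.consumers)


# Example values are made up for illustration.
contract = PipelineContract(
    pipeline="erp_orders_ingest",
    source_system="erp",
    classification="confidential",
    consumers=[
        ConsumerRequirement("sales_dashboard", timedelta(hours=24), 0.95),
        ConsumerRequirement("fraud_alerts", timedelta(hours=1), 0.99),
    ],
)
```

The point of the sketch is that a shared pipeline inherits the strictest requirement of any data product that depends on it, so adding a consumer to an existing pipeline is a contract change and not just a new reader.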
2) If no pipeline exists, or the required one is substantially different from existing pipelines, the cross-functional team creates a new data integration product.
2a) A critical consideration when creating new connections to source systems is the load on those systems: the additional load from data egress must not degrade their performance.
2b) A further design consideration is that multiple pipelines increase the complexity of the overall data extraction from a source system.
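One way to handle the load concern in 2a is to pace the extraction so the source system never sees more than one batch request per interval. The sketch below is an assumption about how that could be done, not a description of any existing pipeline; `paced_batches` and `fetch_batch` are hypothetical names, and `fetch_batch(offset, limit)` stands in for whatever query the source system supports.

```python
import time
from typing import Callable, Iterator, List


def paced_batches(
    fetch_batch: Callable[[int, int], List[dict]],
    batch_size: int,
    min_interval_s: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[List[dict]]:
    """Yield batches from the source, issuing at most one fetch per min_interval_s."""
    offset = 0
    while True:
        started = clock()
        batch = fetch_batch(offset, batch_size)
        if not batch:
            return  # source exhausted
        yield batch
        offset += len(batch)
        # Wait out the rest of the interval before hitting the source again.
        elapsed = clock() - started
        if elapsed < min_interval_s:
            sleep(min_interval_s - elapsed)
```

The `sleep` and `clock` parameters exist only so the pacing can be tested without real delays. If several pipelines read from the same source (2b), the interval would need to be agreed per source system and not per pipeline, which is part of the added complexity.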
The pipelines must be monitored and the data must be classified. Data policies can then detect and act on data based on its classification. Corporate regulations can be applied automatically or offered as advice, depending on criticality.
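As a rough illustration of policies keyed on classification, the sketch below splits matching policies into those applied automatically and those only advised. The policy names, classification labels, and the `evaluate` function are all made up for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, FrozenSet, List, Sequence


class Mode(Enum):
    ENFORCE = "enforce"  # applied automatically
    ADVISE = "advise"    # surfaced to the team as a recommendation


@dataclass(frozen=True)
class Policy:
    name: str
    applies_to: FrozenSet[str]  # classifications the policy covers
    mode: Mode


# Hypothetical policies; real ones would come from corporate regulations.
POLICIES = [
    Policy("mask_personal_data", frozenset({"personal"}), Mode.ENFORCE),
    Policy("restrict_export", frozenset({"personal", "confidential"}), Mode.ADVISE),
]


def evaluate(classification: str, policies: Sequence[Policy] = POLICIES) -> Dict[str, List[str]]:
    """Return the policy names to enforce and to advise for one classification."""
    matching = [p for p in policies if classification in p.applies_to]
    return {
        "enforce": [p.name for p in matching if p.mode is Mode.ENFORCE],
        "advise": [p.name for p in matching if p.mode is Mode.ADVISE],
    }
```

With these example policies, data classified as `personal` gets masking enforced and an export restriction advised, while unclassified or `public` data matches nothing.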
Without the data integration teams, you remove a dependency on external teams and allow your cross-functional team to be more self-sufficient. The cross-functional teams have the SMEs for the source systems, which simplifies enabling access and finding the right source data to deliver the use case.
A potential breaking point is the need for teams to honor the source system's limitations. This is only a problem if the cross-functional team does not include representatives of the source system's owners.
Originally posted by @esbran in #204 (comment)