This project revisits the Cloud Scale Analytics data platform reference architecture for Microsoft Azure. While the core principles of the architecture design have not changed, the next generation of the design will enhance existing capabilities and introduce many new ones that simplify overall management and onboarding and significantly reduce time to market.
Over the last couple of years, numerous data platforms have been built on the basis of Cloud Scale Analytics, which has produced many learnings and insights. In addition, new services and features have been introduced or have reached general availability (GA), and common requirements have shifted. All of these data points have been used to build this next iteration of the reference architecture for scalable data platforms on Azure.
The Cloud Scale Analytics reference architecture consists of the following core building blocks:
- The Data Management Zone is the core data governance entity of an organization. In this Azure subscription, an organization places all data management solutions, including its data catalog, the data lineage solution, the master data management tool and other data governance capabilities. Placing these tools inside a single subscription ensures a reusable data management framework that can be applied to all Data Landing Zones and other data sources across an organization.
- The Data Landing Zone is used for data retention and processing. A Data Landing Zone maps to a single Azure subscription, but organizations are encouraged to have several of them for scaling purposes. Within a Data Landing Zone, an organization may implement one or multiple Data Applications.
- A Data Application environment is a bounded context within a Data Landing Zone. A Data Application is concerned with consuming, processing and producing data as an output. These outputs should no longer be treated as byproducts but should instead be managed as a full product with a defined service-level agreement (SLA).
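
The relationship between the three building blocks can be sketched as a small data model. This is only an illustration and not part of the Terraform modules: the class names, the `sla` field and the `add_landing_zone` helper are hypothetical, and the rule that every zone uses its own subscription is an assumption drawn from the descriptions above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataApplication:
    """Bounded context that consumes, processes and produces data."""
    name: str
    sla: str  # hypothetical free-text label for the data product's SLA


@dataclass
class DataLandingZone:
    """Maps to one Azure subscription and hosts one or more Data Applications."""
    subscription_id: str
    applications: List[DataApplication] = field(default_factory=list)


@dataclass
class DataManagementZone:
    """Governance subscription shared by all Data Landing Zones."""
    subscription_id: str
    landing_zones: List[DataLandingZone] = field(default_factory=list)

    def add_landing_zone(self, zone: DataLandingZone) -> None:
        # Assumption: no two zones share a subscription, and a landing zone
        # never reuses the management zone's subscription.
        used = {self.subscription_id}
        used.update(z.subscription_id for z in self.landing_zones)
        if zone.subscription_id in used:
            raise ValueError(
                f"subscription {zone.subscription_id!r} is already in use"
            )
        self.landing_zones.append(zone)
```

A platform would then consist of one `DataManagementZone` to which landing zones are attached as the organization scales, each carrying its own list of Data Applications.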
The Terraform Modules can be found by following these links:
- Data Management Zone
- Data Landing Zone
We maintain a public GitHub project that gives you visibility into the backlog and the status of this project's issues.
This project accepts public contributions. Please use issues, pull requests and the discussions feature if you have any questions or concerns.