A Software Reference Architecture for Modern Big Data Systems
Big Data, Software architecture, Reference architecture
Big Data is an umbrella term usually referring to data sets whose size grows beyond the ability of traditional methods and tools to gather, store, process, and analyze the available data within a tolerable time and with reasonable computational resources. Big Data systems (BDS) can be found in many fields, providing valuable insights and information to organizations and users. The intrinsic complexity and characteristics of these systems call for software architectures that meet their functional and quality requirements. Reference architectures (RAs) are acknowledged as an important asset in building software architectures, as they promote knowledge reuse and guide the development, standardization, and evolution of those architectures. However, many RAs for BDS are still produced ad hoc, without following a systematized process for their design and evaluation. This work proposes the Modern Data Reference Architecture (MoDaRA), an RA for BDS founded on a systematic process that gathers industry practice and academic knowledge in this domain. The design of MoDaRA followed ProSA-RA, a well-defined process to guide the definition of RAs, comprising phases such as architectural analysis, synthesis, and evaluation grounded on curated information sources. MoDaRA has been evaluated using two use cases from industry and an RA assessment checklist adapted to BDS.