Question

1. Discuss the high-level conceptual architecture for major elements and relationships in Big Data Solutions.


2. What are the similarities and differences between high-level conceptual architecture and architecture for data warehousing and data mining?


3. What information technologies can fulfil the computational demand of Big Data Analytics?

Solutions

Expert Solution

ANSWER(1)

In the new era of Big Data and Data Science, it is vitally important for an enterprise to have a centralized data architecture that is aligned with its business processes, scales with business growth, and evolves with technological advancements. A successful data architecture provides clarity about every aspect of the data, which enables data scientists to work efficiently with trustworthy data and to solve complex business problems. It also prepares an organization to quickly take advantage of new business opportunities by leveraging emerging technologies, and it improves operational efficiency by managing complex data and information delivery throughout the enterprise.

When compared with information architecture, system architecture, and software architecture, data architecture is relatively new. The role of Data Architects has also been nebulous and has fallen on the shoulders of senior business analysts, ETL developers, and data scientists. Nonetheless, I will use Data Architect to refer to those data management professionals who design data architecture for an organization.

When talking about architecture, we often think about the analogy with building architecture. A conventional building architect plans, designs, and reviews the construction of a building. The design process involves working with the clients to fully gather the requirements, understanding the legal and environmental constraints of the location, and working with engineers, surveyors and other specialists to ensure the design is realistic and within the budget. The complexity of the job is indeed very similar to the role of a data architect. However, there are a few fundamental differences between the two architect roles:

  • The building architecture is designed top-down, while data architecture is often an integration process of the components or systems that likely already exist.
  • A building architect has to know the full requirements and define the entire scope before he or she builds the building. The scope for a data architecture can be broad and easily changed. A successful data architecture, therefore, should be designed to be flexible and to anticipate changes in the future.
  • A building architect has precise educational and professional requirements and should possess in-depth knowledge in business, art, structural physics, and building materials. On the other hand, most data architects come from an IT background with professional experience in a few companies or industries and limited exposure to the business. They, therefore, should be aware that their design could be biased and that they need to adjust it based on feedback from both business and technical expertise in the organization.
  • The building design is almost always for a new building being built from scratch. A building architect, therefore, could plan and design entirely based on the new requirements and new materials. A data architect does not have this luxury. They can seldom start from scratch, but need to understand the existing platforms and databases while designing for the future.

ANSWER(2)

Data Warehouse Architecture

The basic concept of a data warehouse is to provide a single version of the truth that a company can use for decision making and forecasting. A data warehouse is an information system that contains historical and cumulative data from a single source or multiple sources. The data warehouse concept simplifies the organization's reporting and analysis processes.

Data Mining Architecture

Data mining is a very important process where potentially useful and previously unknown information is extracted from large volumes of data. There are a number of components involved in the data mining process. These components constitute the architecture of a data mining system.

The major components of any data mining system are data source, data warehouse server, data mining engine, pattern evaluation module, graphical user interface and knowledge base.

Difference between Data warehousing and Data Mining

  • A data warehouse is a database system designed for analysis rather than transactional work. Data mining is the process of analyzing data to discover patterns.
  • In data warehousing, data is stored periodically. In data mining, data is analyzed regularly.
  • Data warehousing is the process of extracting and storing data to allow easier reporting. Data mining is the use of pattern recognition logic to identify patterns.
  • Data warehousing is carried out solely by engineers. Data mining is carried out by business users with the help of engineers.
  • Data warehousing is the process of pooling all relevant data together. Data mining is the process of extracting useful information from large data sets.

ANSWER(3)

1. Conceptual Level Data Architecture Design based on Business Process and Operations

In modern IT, business processes are supported and driven by data entities, data flows, and the business rules applied to the data. A data architect, therefore, needs in-depth business knowledge, including finance, marketing, and products, as well as industry-specific expertise in the business processes of sectors such as health, insurance, manufacturing, and retail. He or she can then properly build a data blueprint at the enterprise level by designing the data entities and taxonomies that represent each business domain, as well as the data flow underneath the business process. In particular, the following areas need to be considered and planned at this conceptual stage:

  • The core data entities and data elements, such as those about customers, products, and sales.
  • The output data needed by the clients and customers.
  • The source data to be gathered and transformed or referenced to produce the output data.
  • Ownership of each data entity and how it should be consumed and distributed based on business use cases.
  • Security policies to be applied to each data entity.
  • The relationships between the data entities, such as referential integrity, business rules, and execution sequence.
  • Standard data classification and taxonomy.
  • Standards of data quality, operations, and Service Level Agreements (SLAs).
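
The conceptual areas listed above can be captured in a small entity catalog. The following is only a minimal sketch under stated assumptions: the `DataEntity` class, the `Sensitivity` levels, and the sample entities and owners are hypothetical and are not part of the original answer. It records ownership, a security classification, and references for each entity, and then reports references that point to an entity missing from the catalog.

```python
from dataclasses import dataclass, field
from enum import Enum


class Sensitivity(Enum):
    """Hypothetical security classification applied to each entity."""
    PUBLIC = "public"
    INTERNAL = "internal"
    CONFIDENTIAL = "confidential"


@dataclass
class DataEntity:
    name: str
    owner: str                  # accountable business owner (assumed field)
    sensitivity: Sensitivity    # drives the security policy for the entity
    elements: list[str] = field(default_factory=list)
    references: list[str] = field(default_factory=list)  # entities this one depends on


def dangling_references(catalog: dict[str, DataEntity]) -> list[tuple[str, str]]:
    """Return (entity, missing_reference) pairs that would break referential integrity."""
    return [
        (entity.name, ref)
        for entity in catalog.values()
        for ref in entity.references
        if ref not in catalog
    ]


# Illustrative catalog; "store" is deliberately left undefined.
catalog = {
    e.name: e
    for e in [
        DataEntity("customer", "Marketing", Sensitivity.CONFIDENTIAL, ["customer_id", "email"]),
        DataEntity("product", "Products", Sensitivity.INTERNAL, ["product_id", "price"]),
        DataEntity("sale", "Finance", Sensitivity.INTERNAL, ["sale_id", "amount"],
                   references=["customer", "product", "store"]),
    ]
}

print(dangling_references(catalog))  # [('sale', 'store')]
```

Keeping the catalog as plain data makes it straightforward to review with business owners before any database is chosen.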

This conceptual level of design consists of the underlying data entities that support each business function. The blueprint is crucial for the successful design and implementation of Enterprise and System architectures and their future expansions or upgrades. In many organizations, this conceptual design is usually embedded in the business analysis driven by the individual project without guidance from the perspective of enterprise end-to-end solutions and standards.

2. Logical Level Data Architecture Design

This level of design is sometimes called data modeling, and it considers which type of database or data format to use. It connects the business requirements to the underlying technology platforms and systems. However, in most organizations data modeling is done only within a particular database or system, given the siloed role of the data modeler. A successful data architecture should be developed with an integrated approach that considers the standards applicable to each database or system and the data flows between these data systems. In particular, the following five areas need to be designed in a synergistic way:

The naming conventions and data integrity

The naming conventions for data entities and elements should be applied consistently to each database. Also, the integrity between the data source and its references should be enforced if the same data have to reside in multiple databases. Ultimately, these data elements should belong to a data entity in the conceptual design in the data architecture, which can then be updated or modified synergistically and accurately based on business requirements.
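
As an illustration, a consistent naming convention can be checked automatically across databases. This is a minimal sketch that assumes a snake_case convention; the convention itself, the database names, and the column names are hypothetical examples rather than anything prescribed in the answer.

```python
import re

# Assumed convention: lowercase snake_case, starting with a letter.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def naming_violations(schemas: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each database to the columns that break the naming convention."""
    violations = {}
    for database, columns in schemas.items():
        bad = [column for column in columns if not SNAKE_CASE.match(column)]
        if bad:
            violations[database] = bad
    return violations


# Hypothetical column lists pulled from three databases.
schemas = {
    "crm": ["customer_id", "email", "CustName"],
    "billing": ["customer_id", "invoice_total"],
    "warehouse": ["customerId", "sale_amount"],
}

print(naming_violations(schemas))
# {'crm': ['CustName'], 'warehouse': ['customerId']}
```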

Data archival/retention policies

Data archival and retention policies are often not considered or established until a very late stage in production, which results in wasted resources, inconsistent data states across different databases, and poor performance of data queries and updates. To enforce data integrity, data architects should define the data archival and retention policy in the data architecture based on operational standards.
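
A retention policy defined up front can be expressed as data and applied uniformly. The sketch below is illustrative only: the `RETENTION_DAYS` table, the entity names, the retention periods, and the record layout are all assumptions made for the example.

```python
from datetime import date, timedelta

# Hypothetical retention periods per data entity, in days.
RETENTION_DAYS = {
    "web_clickstream": 90,
    "sales_order": 365 * 7,
}


def split_by_retention(records, entity, today):
    """Split records into (keep, archive) using the entity's retention period."""
    cutoff = today - timedelta(days=RETENTION_DAYS[entity])
    keep = [r for r in records if r["created"] >= cutoff]
    archive = [r for r in records if r["created"] < cutoff]
    return keep, archive


records = [
    {"id": 1, "created": date(2023, 12, 15)},
    {"id": 2, "created": date(2023, 6, 1)},
]

keep, archive = split_by_retention(records, "web_clickstream", today=date(2024, 1, 1))
print([r["id"] for r in keep], [r["id"] for r in archive])  # [1] [2]
```

Because the same table drives every database, the archived state stays consistent wherever the entity is stored.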

Privacy and security information

Privacy and security have become an essential aspect of logical database design. While the conceptual design defines which data components are sensitive, the logical design should keep confidential information in a database with limited access, restricted data replication, appropriate data types, and secured data flows.

Data Replications

Data replication is a critical aspect to consider for three objectives: 1) high availability; 2) performance, by avoiding data transfers over the network; 3) decoupling, to minimize downstream impact. Excessive data replication, however, can lead to confusion, poor data quality, and poor performance. Any data replication should be examined by a data architect and applied with clear principles and discipline.

Data Flows and Pipelines

How data flows between different database systems and applications should be clearly defined at this level. Again, this flow should be consistent with the flow illustrated in the business process and in the conceptual level of the data architecture. In addition, the frequency of data ingestion, the data transformations in the pipelines, and the data access patterns against the output data should be considered in an integrated view in the logical design. For example, if an upstream data source arrives in real time while a downstream system is mainly used to access aggregated information with heavy indexes (which makes frequent updates and inserts expensive), a data pipeline needs to be designed in between to optimize performance.
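
One possible shape for such an intermediate pipeline is micro-batching: events are buffered and pre-aggregated so that the heavily indexed store receives fewer writes. This is a sketch under stated assumptions, not a prescribed design. The `MicroBatchAggregator` class is hypothetical, and an in-memory dictionary stands in for the downstream aggregate table.

```python
from collections import defaultdict


class MicroBatchAggregator:
    """Buffer real-time events and write pre-aggregated totals downstream in batches."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.buffer = []
        self.totals = defaultdict(float)  # stand-in for the indexed aggregate table
        self.writes = 0                   # number of downstream upserts performed

    def ingest(self, key, amount):
        self.buffer.append((key, amount))
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        # Combine buffered events per key, then issue one upsert per key.
        partial = defaultdict(float)
        for key, amount in self.buffer:
            partial[key] += amount
        for key, subtotal in partial.items():
            self.totals[key] += subtotal
            self.writes += 1
        self.buffer.clear()


events = [("north", 10), ("south", 5), ("north", 2.5),
          ("north", 1), ("south", 4), ("north", 3)]

agg = MicroBatchAggregator(batch_size=4)
for key, amount in events:
    agg.ingest(key, amount)
agg.flush()  # push any remaining buffered events

print(dict(agg.totals), agg.writes)  # {'north': 16.5, 'south': 9.0} 4
```

Here six incoming events result in four downstream writes. A larger batch size reduces writes further at the cost of staler aggregates, which is the trade-off the logical design has to settle.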

3. Data Governance is the Key to the Continuous Success of Data Architecture

Because data architecture reflects and supports the business processes and flows, it is subject to change whenever a business process changes. It also needs to be adjusted when an underlying database system changes. The data architecture is therefore not static; it needs to be continuously managed, enhanced, and audited. Data governance should be adopted to ensure that the enterprise data architecture is designed and implemented correctly as each new project is kicked off.

Conclusion

Within a successful data architecture, a conceptual design based on the business process is the most crucial ingredient, followed by a logical design that emphasizes consistency, integrity, and efficiency across all the databases and data pipelines. Once the data architecture is established, the organization can see what data resides where and ensure that the data is secured, stored efficiently, and processed accurately. Also, when a database or component is changed, the data architecture allows the organization to assess the impact quickly and guides all relevant teams on designs and implementations. Lastly, the data architecture is a living document of the enterprise systems, which should be kept up to date so that it gives a clear end-to-end picture. In summary, a holistic data architecture that reflects the end-to-end business process and operations is essential for a company to advance quickly and efficiently while undergoing significant changes such as acquisitions, digital transformation, or migration to a next-generation platform.

