Question

What is data warehouse design? List some pros and cons.

How do we implement dimensional data models in a database, covering relational implementation and multidimensional implementation?

What is the process of designing the data structure of a warehouse like?

Solutions

Expert Solution

Data Warehousing
A data warehouse can be defined as a subject-oriented, integrated, time-variant, non-volatile collection of data that supports management's decision-making process. It brings together selected data from multiple data sources and is refreshed from those sources as their transactions accumulate.


Data Warehousing Characteristics

A data warehouse works best when people in the organization share a common way of describing each subject. Here are some of the key characteristics of data warehousing.

Subject-oriented:
A data warehouse is organized around a particular subject area, such as sales, rather than around the applications that produce the data.
Concentrating on a well-defined subject makes it possible to analyse that subject in depth, for example when developing sales procedures.
Time-variant:
Data in the warehouse is tied to a time period, so it covers a much longer span than an online transaction processing (OLTP) system usually holds.

The data typically arrives through staging files and ends up in large tables to which updated facts are added over time.
Non-volatile:
Once data has entered the warehouse it is not normally changed or deleted; new data is added alongside it.

Non-volatility lets people understand what has occurred and makes the analysis that was done reproducible.

Integrated:
Like subject orientation, integration is about consistency: data from disparate sources is brought into one consistent format. This means resolving issues such as naming conventions, conflicts, units of measure, and inconsistent values, so that the warehouse can relate information about different subjects.
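
As a minimal sketch of what "integrated" can mean in practice, the Python snippet below maps records from two source systems onto one naming convention and one unit of measure. The two sources and all field names (cust_id, customerNumber, weight_lb, weight_kg) are assumptions made up for illustration.

```python
# Hypothetical records from two source systems that describe the same kind of
# entity with different field names and units (illustrative assumptions).
crm_rows = [{"cust_id": 1, "weight_lb": 10.0}]
erp_rows = [{"customerNumber": 2, "weight_kg": 3.0}]

LB_TO_KG = 0.45359237  # one pound expressed in kilograms


def from_crm(row):
    """Rename fields to the warehouse convention and convert pounds to kilograms."""
    return {"customer_id": row["cust_id"],
            "weight_kg": round(row["weight_lb"] * LB_TO_KG, 3)}


def from_erp(row):
    """Rename fields only; this source already uses kilograms."""
    return {"customer_id": row["customerNumber"],
            "weight_kg": row["weight_kg"]}


# After integration every record has the same keys and the same unit.
integrated = [from_crm(r) for r in crm_rows] + [from_erp(r) for r in erp_rows]
print(integrated)
```

Keeping one small mapping function per source makes each naming or unit decision explicit and easy to review.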

Functions
A data warehouse works as a repository for the data held by an organization and can also support that organization's data backup functions.

It can reduce the cost of storage and of backing up data at the organizational level.

It stores fact tables at a highly granular transaction level. The functions involved are:

Data consolidations
Data cleaning
Data integration
Data Extraction
Data Transformation
Data Loading
Refreshing
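
The extraction, cleaning, transformation, and loading functions listed above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the CSV text, the orders table, and its column names are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw extract; in practice this would come from a source system.
RAW_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,
3,carol,7.25
"""


def extract(text):
    """Extraction: read rows from the raw source."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Cleaning: drop rows whose amount is missing."""
    return [r for r in rows if r["amount"].strip()]


def transform(rows):
    """Transformation: normalise customer names and convert types."""
    return [(int(r["order_id"]), r["customer"].strip().title(), float(r["amount"]))
            for r in rows]


def load(conn, rows):
    """Loading: write the prepared rows into the warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS orders "
                 "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(clean(extract(RAW_CSV))))
loaded = conn.execute("SELECT * FROM orders ORDER BY order_id").fetchall()
print(loaded)
```

A refresh would rerun the same steps on a newer extract; how incremental loads are handled is left out of this sketch.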
Alternative Names for a Data Warehouse System:
A data warehouse system is also known by the following names:

Decision Support System (DSS)
Executive Information System
Management Information System
Business Intelligence Solution
Analytic Application
Data Warehouse

The data flowing in may be in the following formats:

Structured
Semi-structured
Unstructured data
Types of Data Warehouse:
There are mainly 3 types of Data Warehouses, and they are

Enterprise Data Warehouse
Operational Data Store
Data Mart
Data Warehouse Stages:
Early uses of data warehousing were simple, but the procedures for assessing the data have changed a lot over time. The following stages are involved in the use of data warehousing:

Offline Operational Database
Offline Data Warehouse
Real-time Data Warehouse
Integrated Data Warehouse
Applications of Data Warehouse:
A data warehouse helps business executives organize and analyze detailed data, and the results can be fed back into the business as a closed loop. Data warehousing is mainly used in the following fields:

Airline
Banking services
Healthcare
Public sector
Investment and Insurance sector
Telecommunication
Hospitality Industry
Financial services
Retail sectors
Consumer goods
Controlled manufacturing
Steps to Implement Data Warehouse:
The risk connected to a data warehouse implementation is large and needs to be considered as early as possible. A good way to manage it is a three-level strategy:

Enterprise strategy
Phased delivery
Iterative Prototyping
Here are the steps in the implementation of a data warehouse, along with their deliverables.

Data Warehouse Implementation Table
Step | Task | Output
1 | Specifying project scope | Scope definition
2 | Ascertain business needs | Logical data model
3 | Defining Operational Datastore requirements | Operational Data Store Model
4 | Develop or Obtain Extraction tools | Extract software and tools
5 | Specifying Data Warehouse Data Needs | Transition Data Model
6 | Document missing information | To Do Project List
7 | Mapping Operational Data Store to Data Warehouse | D/W Data Integration Map
8 | Improve Data Warehouse Database design | D/W Database Design
9 | Pull Out Data from Operational Data Store | Integrated D/W Data Extracts
10 | Load Data Warehouse | Initial Data Load
11 | Manage Data Warehouse | Continuous Data Access and Subsequent Loads
Data Warehouse Tools:
Many data warehouse tools are available; here are a few well-known ones:

Oracle
MarkLogic
Amazon Redshift
Pros or Advantages of Data Warehousing:
It is a common process for the new implementations in a business that is based on variou

a. Cleans data:
Loading a warehouse involves data cleansing: removing errors and inconsistencies to improve the quality of the data. Because the data comes from many files and a variety of sources, cleansing is a substantial part of the process.

Metadata should be kept in sufficient detail to describe the constraints on the data and the translations applied between systems.

b. Indexes multiple types:
Indexes of several types are created on the warehouse tables to speed up access to information.

This lets the warehouse handle large quantities of data and repeated analytical queries that would be impractical to run against OLTP applications.
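
To illustrate the indexing point, here is a small sketch that uses SQLite as a stand-in for the warehouse database. The sales table, its columns, and the index name idx_sales_product are hypothetical. The query plan is inspected only to show that the lookup can use the index; the exact plan wording differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, product TEXT, amount REAL)")
conn.executemany("INSERT INTO sales (product, amount) VALUES (?, ?)",
                 [("widget", 5.0), ("gadget", 7.5), ("widget", 2.5)])

# Index the column that analytical queries filter on.
conn.execute("CREATE INDEX idx_sales_product ON sales (product)")

# Ask SQLite how it would run the query; the last column holds the description.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT amount FROM sales WHERE product = ?", ("widget",)
).fetchall()
plan_text = " ".join(row[-1] for row in plan)

total = conn.execute(
    "SELECT SUM(amount) FROM sales WHERE product = ?", ("widget",)
).fetchone()[0]
print(plan_text)
print(total)
```

On a three-row table the index makes no measurable difference; the benefit shows up on large fact tables.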

c. Secured data and its access:
Security is needed to mitigate the risk of breaches, and it has to be applied to every aspect of the warehouse, with tradeoffs against ease of access.

Because the data is consolidated in one layer, access rules can be enforced by the database itself.

The same consolidation means that unauthorized access would compromise a large amount of critical data.

d. Query processing with multiple options:
Queries can be carried out in parallel. Query tools are designed to process the data and load it into various modules.

The data can be accessed using simple logic. A large number of query tools are available; they can manage heterogeneous sources and handle requests from online tools.

e. Enhanced business intelligence:
Better access to information produces better insight for decision making. Decisions rely less on gut feeling and limited data, and strategy can be backed by credible facts and evidence.

Decision makers with differing needs can base business tactics on informed facts, for example in financial management and inventory management.

f. Increased system and query performance:
A data warehouse is built mainly for fast retrieval and analysis of data. It is designed to store large volumes of data and to query them quickly.

It supplies the information that business intelligence needs in a form matched to users' requirements.

Because analytical queries run against the warehouse, they take load off the multiple operational subsystems and reduce the effort spent on extracting information from them.

g. Business Intelligence:
Many enterprises run multiple subsystems, each built on a different platform with its own data sources.

A data warehouse consolidates these multiple sources so that the enterprise can be viewed as a whole.

It provides a single data repository for each subject, which helps to ensure that data is not duplicated.

h. Timely access to data:
Users can reach data from different sources in one place, so they spend less time retrieving data and more time analysing it. Loads can be scheduled as routines.

Business users can run their own queries and generate standard reports with less support from information technology staff.

i. Enhanced data consistency and quality:
Data from each source system is converted to a standard format and kept in one repository.

Because every department, sales included, works from the same consistent units and definitions, results can be compared across the business.

j. Return on investment is high:
Here ROI comes from increased revenue and decreased expenses. A data warehouse lets a business recover the capital spent on the project through the revenue it generates and the cost savings it enables.

The impact of analytics on the financial position can be examined in separate business studies.

k. Increase revenues:
A warehouse links data that would otherwise sit in isolated departmental databases, giving a central point for cross-checks.

It also supports a proactive approach: linked data and summarized reports can help to detect and prevent problems before they cost revenue.

l. Standardizes data across the organization:
Data standards are followed wherever data is shared. Standardized data serves numerous applications and data management systems, and it avoids conflicts when data is shared between them.

m. Database normalization:
Data can be stored in the warehouse and extracted in various forms for reports. Normalization is the process of organizing the data in a relational database to minimize redundancy, which helps in organizing the data.

Cons or Disadvantages of Data Warehousing:
Even though there are a lot of advantages, implementing a data warehouse costs time and money: it involves data translation, long implementation times, and a lack of flexibility in data transfer. Here are some of the disadvantages of data warehousing explained:

a. Raising ownership:
Most of the data in a warehouse comes from other data sources, so it is not always clear who owns it. Implementing the schema is a long-term effort.

This raises issues of ownership, privacy, and security, and long-term ownership is associated with high costs.

b. Extra reporting:
Depending on the organization, a data warehouse can create extra reporting work. Data may be duplicated between the warehouse and the long-term operational databases, and the extra reporting consumes more time.

c. Data flexibility:
Imported data tends to be static and mapped to a fixed schema, which limits ad hoc analysis. There is also a risk of leaking information about an organization's customers.

Analysis reports can touch on customer privacy, so the ability to produce them is often deliberately limited.

d. Compatibility with the existing system:
A data warehouse is fed by regular extracts of data that are loaded into the system. Adopting the technology may require modifying existing data, which is a major concern, and integrating with all the existing system functionality can be complex.

e. Keeping data online:
Software may not allow the entire repository to be kept online after a certain duration. Data that is kept online keeps growing, because the warehouse records data for future analysis and reference.

f. Dimensional technique:
This technique stores information about specific events. It holds a limited amount of information about each event, and in many practical applications the resulting data is redundant.

That redundancy makes updating, deletion, and insertion more awkward, which is one of the undesirable characteristics of data warehousing.

g. Costs:
Nowadays most businesses have started using data warehouse techniques, so prices have fallen into a range that most of them can afford.

Costs remain complicated to estimate, even for a small business, because they depend on how the data is designed and provided.

Most tools start from transactions: the warehouse groups all the transactions and reports each operation in detail.

It gives access to a large amount of information. Users need to be trained before using warehouse techniques.

Dimensional Modeling
Dimensional modeling (DM) is a data structure technique optimized for data storage in a data warehouse. The purpose of a dimensional model is to optimize the database for fast retrieval of data. The concept of dimensional modeling was developed by Ralph Kimball and consists of "fact" and "dimension" tables.

A dimensional model is designed to read, summarize, and analyze numeric information such as values, balances, counts, and weights in a data warehouse. In contrast, relational models are optimized for the addition, updating, and deletion of data in a real-time online transaction processing (OLTP) system.

Dimensional and relational models each store data in their own way, and each has specific advantages.

For instance, in the relational model, normalization and ER models reduce redundancy in data. A dimensional model, by contrast, arranges data so that it is easier to retrieve information and generate reports.

Hence, dimensional models are used in data warehouse systems and are not a good fit for transactional systems.

Elements of Dimensional Data Model
Fact
Facts are the measurements or metrics from your business process. For a Sales business process, a measurement would be the quarterly sales number.

Dimension
A dimension provides the context surrounding a business process event. In simple terms, dimensions give the who, what, and where of a fact. In the Sales business process, for the fact quarterly sales number, the dimensions would be:

Who – Customer Names
Where – Location
What – Product Name
In other words, a dimension is a window to view information in the facts.
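
The who/where/what example above can also be pictured as a small multidimensional array, or cube, in which each cell is addressed by one member from every dimension. The sketch below is a toy illustration of that idea in plain Python, not a description of any particular OLAP product; the member names and amounts are made up.

```python
from collections import defaultdict

# Order of the coordinates in every cell key.
DIMENSIONS = ("customer", "location", "product")

# Cells of a tiny cube: (customer, location, product) -> sales amount.
cube = {
    ("Alice", "Texas", "Widget"): 100.0,
    ("Alice", "Texas", "Gadget"): 40.0,
    ("Bob", "Ohio", "Widget"): 60.0,
}


def roll_up(cells, keep):
    """Sum the measure over every dimension except `keep`."""
    axis = DIMENSIONS.index(keep)
    totals = defaultdict(float)
    for coords, amount in cells.items():
        totals[coords[axis]] += amount
    return dict(totals)


def slice_cube(cells, dimension, value):
    """Keep only the cells whose coordinate on `dimension` equals `value`."""
    axis = DIMENSIONS.index(dimension)
    return {c: v for c, v in cells.items() if c[axis] == value}


by_product = roll_up(cube, "product")
texas_only = slice_cube(cube, "location", "Texas")
print(by_product)
print(texas_only)
```

Here each dimension really does act as a window onto the facts: rolling up by product or slicing by location reads the same cells from a different angle.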

Attributes
The Attributes are the various characteristics of the dimension.

In the Location dimension, the attributes can be

State
Country
Zipcode etc.
Attributes are used to search, filter, or classify facts. Dimension tables contain attributes.

Fact Table
A fact table is the primary table in a dimensional model.

A fact table contains:

Measurements/facts
Foreign keys to the dimension tables
Dimension table
A dimension table contains the dimensions of a fact.
Dimension tables are joined to the fact table via a foreign key.
Dimension tables are de-normalized tables.
The dimension attributes are the various columns in a dimension table.
Dimensions offer descriptive characteristics of the facts with the help of their attributes.
There is no set limit on the number of dimensions.
A dimension can also contain one or more hierarchical relationships.
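
One relational way to implement the fact and dimension tables described above is a star schema: one fact table whose foreign keys point at the surrounding dimension tables. The sketch below uses SQLite through Python's sqlite3 module; every table name, column name, and row is a hypothetical example based on the Sales scenario.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes, one row per member.
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE dim_location (location_key INTEGER PRIMARY KEY,
                           state TEXT, country TEXT, zipcode TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT);

-- Fact table: the measurement plus one foreign key per dimension.
CREATE TABLE fact_sales (
    customer_key INTEGER REFERENCES dim_customer (customer_key),
    location_key INTEGER REFERENCES dim_location (location_key),
    product_key INTEGER REFERENCES dim_product (product_key),
    sales_amount REAL
);

INSERT INTO dim_customer VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO dim_location VALUES (1, 'Texas', 'USA', '73301'), (2, 'Ohio', 'USA', '43004');
INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO fact_sales VALUES (1, 1, 1, 100.0), (1, 1, 2, 40.0), (2, 2, 1, 60.0);
""")

# Classify the facts by a dimension attribute (state) with a join and GROUP BY.
sales_by_state = conn.execute("""
    SELECT l.state, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_location AS l ON l.location_key = f.location_key
    GROUP BY l.state
    ORDER BY l.state
""").fetchall()
print(sales_by_state)
```

The location dimension keeps state, country, and zipcode together in one de-normalized table, so the report needs only a single join.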
Steps of Dimensional Modelling
The accuracy of your dimensional model determines the success of your data warehouse implementation. Here are the steps to create a dimensional model:

Identify Business Process
Identify Grain (level of detail)
Identify Dimensions
Identify Facts
Build Schema
The model should describe the Why, How much, When/Where/Who, and What of your business process.

Step 1) Identify the business process
Identify the actual business process that the data warehouse should cover. This could be Marketing, Sales, HR, etc., as per the data analysis needs of the organization. The selection of the business process also depends on the quality of data available for that process. It is the most important step of the data modelling process, and a failure here would have cascading and irreparable defects.

To describe the business process, you can use plain text or use basic Business Process Modelling Notation (BPMN) or Unified Modelling Language (UML).

Step 2) Identify the grain
The grain describes the level of detail for the business problem/solution. Identifying the grain means identifying the lowest level of information for any table in your data warehouse. If a table contains sales data for every day, then it has daily granularity. If a table contains total sales data for each month, then it has monthly granularity.
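
The difference between daily and monthly grain can be shown with a few rows. In this sketch the dates and amounts are made-up sample data, and to_monthly is a hypothetical helper that rolls daily rows up to one row per month.

```python
from collections import defaultdict
from datetime import date

# Daily grain: one row per day (sample data for illustration).
daily_sales = [
    (date(2024, 1, 5), 100.0),
    (date(2024, 1, 20), 50.0),
    (date(2024, 2, 3), 75.0),
]


def to_monthly(rows):
    """Roll daily rows up to monthly grain, keyed by (year, month)."""
    totals = defaultdict(float)
    for day, amount in rows:
        totals[(day.year, day.month)] += amount
    return dict(totals)


monthly_sales = to_monthly(daily_sales)
print(monthly_sales)
```

The roll-up only works in one direction: monthly totals can be derived from daily rows, but daily detail cannot be recovered from a monthly table, which is why the grain has to be chosen before the tables are built.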

