What do you mean by a data warehouse? What are the major concepts and terminology used in the study of data warehousing?
A data warehouse is a central repository of data, created by integrating data from one or more disparate sources, that is used to produce trend reports for management reporting and decision making.
A data warehouse is one part of a data warehousing system. It provides a consolidated, accessible and flexible collection of data for end-user analysis and reporting. This may include sales figures, market performance, accounts payable, and employee leave details.
Major concepts of data warehousing:
Subject-Oriented: A data warehouse is subject-oriented because its data gives information about a particular subject rather than about a company’s ongoing operations.
Integrated: A data warehouse is integrated because data is gathered from a variety of sources and merged into a coherent whole.
Time-Variant: A data warehouse is time-variant because all the data in it is identified with a particular time period.
Non-Volatile: Data in a data warehouse is stable. More data is added, but data is never removed, so management can gain a consistent picture of the business. Hence the data warehouse is non-volatile (long-term storage).
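The time-variant and non-volatile properties can be illustrated with a small sketch. It assumes a hypothetical in-memory store (the class name AppendOnlyStore and its methods are invented for illustration, not part of any real product): rows are only ever appended, and each row carries the time period it describes.

```python
from datetime import date


class AppendOnlyStore:
    """Hypothetical in-memory stand-in for a warehouse table."""

    def __init__(self):
        self._rows = []

    def load(self, subject, value, period):
        # Non-volatile: new facts are appended; nothing is updated or deleted.
        # Time-variant: every row is tagged with the period it belongs to.
        self._rows.append({"subject": subject, "value": value, "period": period})

    def as_of(self, subject, period):
        # Return the latest value recorded on or before the given period.
        matches = [r for r in self._rows
                   if r["subject"] == subject and r["period"] <= period]
        if not matches:
            return None
        return max(matches, key=lambda r: r["period"])["value"]

    def history(self, subject):
        # The full history stays available because rows are never removed.
        return sorted((r["period"], r["value"])
                      for r in self._rows if r["subject"] == subject)


store = AppendOnlyStore()
store.load("sales", 100, date(2023, 1, 31))
store.load("sales", 120, date(2023, 2, 28))
```

Because the January row is kept after the February load, a question such as "what were sales as of mid-February?" can still be answered with store.as_of("sales", date(2023, 2, 15)).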
Terminology used in the study of data warehousing:
Data Warehouse:
A data structure that is optimized for distribution. It collects and stores integrated sets of historical data from multiple operational systems and feeds them to one or more data marts. It may also provide end-user access to support enterprise views of data.
Data Mart:
A data structure that is optimized for access. It is designed to facilitate end-user analysis of data. It typically supports a single, analytic application used by a distinct set of workers.
Staging Area:
Any data store that is designed primarily to receive data into a warehousing environment.
Operational Data Store:
A collection of data that addresses operational needs of various operational units. It is not a component of a data warehousing architecture, but a solution to operational needs.
OLAP (On-Line Analytical Processing):
A method by which multidimensional analysis occurs.
Multidimensional Analysis:
The ability to manipulate information by a variety of relevant categories or “dimensions” to facilitate analysis and understanding of the underlying data. It is also sometimes referred to as “drilling-down”, “drilling-across” and “slicing and dicing”.
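As a rough sketch of what slicing and drilling down can look like, the following uses plain Python over a handful of made-up fact rows. The dimension names, the data and the helper functions aggregate and slice_ are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, quarter, sales amount).
facts = [
    ("East", "Widget", "Q1", 100),
    ("East", "Gadget", "Q1", 150),
    ("West", "Widget", "Q1", 200),
    ("East", "Widget", "Q2", 120),
    ("West", "Gadget", "Q2", 80),
]
DIMS = ("region", "product", "quarter")


def aggregate(rows, by):
    """Sum sales grouped by the named dimensions."""
    idx = [DIMS.index(d) for d in by]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)


def slice_(rows, dim, value):
    """Slice: keep only rows where one dimension has a fixed value."""
    i = DIMS.index(dim)
    return [r for r in rows if r[i] == value]


# Summary view by region.
by_region = aggregate(facts, ["region"])
# Drill down: add the product dimension for more detail.
by_region_product = aggregate(facts, ["region", "product"])
# Slice on quarter = Q1, then summarize by region.
q1_by_region = aggregate(slice_(facts, "quarter", "Q1"), ["region"])
```

Drilling down is simply grouping by more dimensions, while slicing fixes one dimension to a single value before aggregating.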
Hypercube:
A means of visually representing multidimensional data.
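One way to picture a hypercube in code, offered only as an illustrative sketch with made-up dimensions and values, is a mapping in which every cell is addressed by one member from each dimension:

```python
from itertools import product

# Hypothetical dimensions of a three-dimensional cube.
regions = ["East", "West"]
products = ["Widget", "Gadget"]
quarters = ["Q1", "Q2"]

# Each cell is keyed by a (region, product, quarter) coordinate.
cube = {cell: 0 for cell in product(regions, products, quarters)}
cube[("East", "Widget", "Q1")] = 100
cube[("West", "Gadget", "Q2")] = 80

# The cube has one cell per combination of dimension members.
cell_count = len(cube)

# Summing along the product and quarter axes gives a total per region.
east_total = sum(v for (region, _, _), v in cube.items() if region == "East")
```

Adding a fourth dimension only lengthens the coordinate tuple, which is why the term "hypercube" is used rather than "cube".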
Star Schema:
A means of aggregating data based on a set of known dimensions. It stores data multidimensionally in a two-dimensional relational database management system (RDBMS), such as Oracle.
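A minimal sketch of a star schema, using Python's built-in sqlite3 module in place of a production RDBMS: one central fact table references two dimension tables. The table names, column names and rows are assumptions made up for this example.

```python
import sqlite3

star_conn = sqlite3.connect(":memory:")
star_conn.executescript("""
-- Dimension tables describe the "who/what/when" of each fact.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);

-- The fact table holds the measures plus one key per dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    amount REAL
);

INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Gadget', 'Hardware');
INSERT INTO dim_date VALUES (1, 2023, 'Q1'), (2, 2023, 'Q2');
INSERT INTO fact_sales VALUES (1, 1, 100.0), (2, 1, 150.0), (1, 2, 120.0);
""")

# Aggregate the sales measure along the date dimension.
sales_by_quarter = star_conn.execute("""
    SELECT d.quarter, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_date AS d ON d.date_id = f.date_id
    GROUP BY d.quarter
    ORDER BY d.quarter
""").fetchall()
```

Each analytical query joins the fact table to only the dimensions it needs, which is what gives the schema its star shape.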
Snowflake Schema:
An extension of the star schema by means of applying additional dimensions to the dimensions of a star schema in a relational environment.
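As a sketch of the same idea, again using sqlite3 with made-up table names and data, a product dimension can be given its own category dimension, so that a dimension table points at a further dimension table:

```python
import sqlite3

snow_conn = sqlite3.connect(":memory:")
snow_conn.executescript("""
-- The category attribute is moved out into its own dimension table.
CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    name TEXT,
    category_id INTEGER REFERENCES dim_category(category_id)
);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    amount REAL
);

INSERT INTO dim_category VALUES (1, 'Hardware'), (2, 'Software');
INSERT INTO dim_product VALUES (1, 'Widget', 1), (2, 'Gadget', 1), (3, 'App', 2);
INSERT INTO fact_sales VALUES (1, 100.0), (2, 150.0), (3, 40.0);
""")

# Reaching the category now takes two joins: fact -> product -> category.
sales_by_category = snow_conn.execute("""
    SELECT c.name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_category AS c ON c.category_id = p.category_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
```

The extra join is the usual trade-off of this layout: less repeated data in the dimension tables in exchange for longer join paths.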
Multidimensional Database:
Also known as MDDB or MDDBS. A class of proprietary, non-relational database management tools that store and manage data in a multidimensional manner, as opposed to the two dimensions associated with traditional relational database management systems.
OLAP Tools:
A set of software products that attempt to facilitate multidimensional analysis. It can incorporate data acquisition, data access, data manipulation, or any combination thereof.