Can't find what you are looking for ?
Google
 



Tuesday, August 11, 2009

Functionality of Data Warehouses and Building a Data Warehouse

Data warehouses exist to facilitate complex, data-intensive, and frequent adhoc queries. The data warehouse access component supports enhanced spreadsheet functionality, efficient query processing, structured and adhoc queries, data mining, and materialized views. These offer pre-programmed functionalities such as :
- Roll-up : Data is summarized with increasing generalization.
- Drill-down : Increasing levels of details are revealed.
- Pivot : Cross tabulation is performed.
- Slice and dice : Performing projection operations on the dimensions.
- Sorting : Data is sorted by ordinal value.
- Selection : Data is available by value or range.
- Derived attributes : Attributes are computed by operations on stored and derived values.

BUILDING A DATA WAREHOUSE :
In constructing a data warehouse, builders should take a broad view of the anticipated use of the data warehouse. Acquisition of data for the warehouse involves the following steps :
- The data must be extracted from different sources.
- Data must be formatted for consistency within the data warehouse. Names, meanings, and domains of data from unrelated sources must be reconciled.
- Data must be cleaned to ensure validity.
- The data must be fitted into the data model of the data warehouse.
- The data must be loaded into the warehouse.
- How up-to-date must be data be ?
* Can the warehouse go off-line, and for how long.
* What are the data interdependencies ?
* What is the storage availability ?
* What are the distribution requirements ?
* What is the loading time ?
Data warehouses must also be designed with full consideration of the environment in which they will reside. Important design considerations include the following :
- Usage projections.
- The fit of the data model.
- Characteristics of available sources.
- Design of the metadata component.
- Modular component design.
- Design for manageability and change.
- Consideration of distributed and parallel architecture.

No comments: