In: Computer Science
Give an example of a RANGE partition for a fact table of a data warehouse and its major advantage.
Data warehouses often contain large tables and require techniques both for managing these large tables and for providing good query performance across these large tables.
Why Partition a Fact Table?
Large tables (i.e., tables with hundreds of millions of rows) can be difficult to manage because of their size and the amount of time it takes to do anything with them (e.g., rebuild an index). In a transactional database, the associative tables (i.e., those tables that involve the many to many—M:N—relationship) are often the tables with the most rows in the database. In dimensional modeling, a fact table is the equivalent of an associative table. Like an associative table in a transactional database, a fact table often has many more rows than its related dimensions, perhaps even as many as (# of rows in dimension 1) x (# of rows in dimension 2) … x (# of rows in dimension n) rows.
Range Partition is one of the technique in partitioning technique.
Range Partitioning (or table partitioning) allows you to create tables where the data is placed in multiple partitions according to data ranges that you specify. The range partitions can be placed into the same tablespaces or different tablespaces. The most common way to range partition is by date, but any data type or calculated columns can be used. An example is having rows with a date column and putting rows of different months in different partitions
Range partitioning maps data to partitions based on ranges of partition key values that you establish for each partition. It is the most common type of partitioning and is often used with dates. For example, you might want to partition sales data into monthly partitions.
Range partitioning maps rows to partitions based on ranges of column values. Range partitioning is defined by the partitioning specification for a table or index in PARTITION BY RANGE(column_list) and by the partitioning specifications for each individual partition in VALUES LESS THAN(value_list), where column_list is an ordered list of columns that determines the partition to which a row or an index entry belongs. These columns are called the partitioning columns. The values in the partitioning columns of a particular row constitute that row's partitioning key.
An ordered list of values for the columns in the column list is called a value_list. Each value must be either a literal or a TO_DATE or RPAD function with constant arguments. Only the VALUES LESS THAN clause is allowed. This clause specifies a non-inclusive upper bound for the partitions. All partitions, except the first, have an implicit low value specified by the VALUES LESS THAN literal on the previous partition. Any binary values of the partition key equal to or higher than this literal are added to the next higher partition. Highest partition being where MAXVALUE literal is defined. Keyword, MAXVALUE, represents a virtual infinite value that sorts higher than any other value for the data type, including the null value.
The following statement creates a table sales_range that is range partitioned on the sales_date field:
CREATE TABLE sales_range (salesman_id NUMBER(5), salesman_name VARCHAR2(30), sales_amount NUMBER(10), sales_date DATE) COMPRESS PARTITION BY RANGE(sales_date) (PARTITION sales_jan2000 VALUES LESS THAN(TO_DATE('02/01/2000','DD/MM/YYYY')), PARTITION sales_feb2000 VALUES LESS THAN(TO_DATE('03/01/2000','DD/MM/YYYY')), PARTITION sales_mar2000 VALUES LESS THAN(TO_DATE('04/01/2000','DD/MM/YYYY')), PARTITION sales_apr2000 VALUES LESS THAN(TO_DATE('05/01/2000','DD/MM/YYYY')));
The Advantages of Range partitioning: