In big data, there are two main types of processing
workloads:
- History load - A data warehouse stores historical data that can
date back over long periods of time. The data is stored according
to the time period over which the end user wants to analyse it. For
example, if an organization wants to compare monthly sales, then
monthly history data is stored.
- Incremental load - To keep the data warehouse up to date, the most
recent data is regularly loaded into it. This is a periodic load
whose frequency depends on the availability of the data source. It
continues for the lifetime of the warehouse. The increments can be
daily, weekly, monthly or yearly, or a mixture of these.
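The difference between the two loads can be illustrated with a minimal
sketch. It assumes a hypothetical `sales` table in an in-memory SQLite
database and uses the latest loaded `sale_date` as a watermark; the
table, column and function names are illustrative only, and a real
warehouse may detect new data in other ways.

```python
import sqlite3

def history_load(source_rows, conn):
    """History load: replace the table with the full set of historical rows."""
    conn.execute("DELETE FROM sales")
    conn.executemany(
        "INSERT INTO sales (sale_date, amount) VALUES (?, ?)", source_rows
    )
    conn.commit()

def incremental_load(source_rows, conn):
    """Incremental load: append only rows newer than the latest loaded date."""
    # The watermark is the most recent date already in the warehouse.
    # ISO-formatted dates (YYYY-MM-DD) compare correctly as strings.
    watermark = conn.execute("SELECT MAX(sale_date) FROM sales").fetchone()[0] or ""
    new_rows = [row for row in source_rows if row[0] > watermark]
    conn.executemany(
        "INSERT INTO sales (sale_date, amount) VALUES (?, ?)", new_rows
    )
    conn.commit()
    return len(new_rows)

# Hypothetical monthly sales data, made up for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, amount REAL)")
source = [("2023-01-31", 100.0), ("2023-02-28", 120.0)]

history_load(source, conn)               # one-off load of the history
source.append(("2023-03-31", 90.0))      # a new month arrives at the source
added = incremental_load(source, conn)   # only the new month is loaded
total = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
```

Running the incremental load again without new source data adds
nothing, which is what makes it safe to schedule periodically.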
In general, the loading is carried out in two stages.
1. Loading transformed data to the staging database.
2. Loading data from the staging database to the warehouse.
The processing workloads discussed above can be applied to either
or both of the two stages.
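As a rough sketch of the two stages, the example below assumes
hypothetical `staging_sales` and `warehouse_sales` tables in a single
in-memory SQLite database. Here the staging table is fully refreshed
with each batch, while the warehouse is loaded incrementally; this
pairing is only one possible choice, and all names are illustrative.

```python
import sqlite3

def load_to_staging(conn, transformed_rows):
    """Stage 1: replace the staging table contents with the transformed batch."""
    conn.execute("DELETE FROM staging_sales")
    conn.executemany(
        "INSERT INTO staging_sales (sale_month, amount) VALUES (?, ?)",
        transformed_rows,
    )
    conn.commit()

def load_to_warehouse(conn):
    """Stage 2: copy staged rows whose month is not yet in the warehouse."""
    existing = {
        row[0] for row in conn.execute("SELECT sale_month FROM warehouse_sales")
    }
    staged = conn.execute(
        "SELECT sale_month, amount FROM staging_sales"
    ).fetchall()
    new_rows = [row for row in staged if row[0] not in existing]
    conn.executemany(
        "INSERT INTO warehouse_sales (sale_month, amount) VALUES (?, ?)",
        new_rows,
    )
    conn.commit()
    return len(new_rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging_sales (sale_month TEXT, amount REAL)")
conn.execute("CREATE TABLE warehouse_sales (sale_month TEXT, amount REAL)")

# First batch: the history (made-up figures).
load_to_staging(conn, [("2023-01", 100.0), ("2023-02", 120.0)])
first = load_to_warehouse(conn)

# Later batch: overlaps with the history by one month.
load_to_staging(conn, [("2023-02", 120.0), ("2023-03", 90.0)])
second = load_to_warehouse(conn)

warehouse_count = conn.execute(
    "SELECT COUNT(*) FROM warehouse_sales"
).fetchone()[0]
```

Keeping the staging step separate means a batch can be inspected or
rejected before it affects the warehouse.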