Consider a generalization of the inventory model of Sec. 3.2 in which unfilled orders may be backlogged indefinitely with a cost of b(u) if u units are backlogged for one period. Assume revenue is received at the end of the period in which orders are placed and that backlogging costs are charged only if a unit is backlogged for an entire month, in which case the backlogging cost is incurred at the beginning of that month. a. Identify the state space and derive transition probabilities and expected rewards.
The modified model can be formulated as an MDP as follows:
Decision epochs: T = {1, 2, · · · , N}.
States: S = {· · · , −1, 0, 1, · · · , M}, where st < 0 if there are unfilled orders at the beginning of the t’th period and st > 0 if there is stock left over from the preceding period.
Actions: As = {0, 1, · · · , M − s}, assuming that there is no constraint on the amount of product that can be ordered to fill backlogged orders, but that the warehouse capacity limits the amount of stock that can be held until the end of the month.
Transition probabilities: If p_j, j ≥ 0, is the probability that j new orders are received during a period, then the next state is j = s + a − d when d orders arrive, so the transition probabilities are:
p_t(j | s, a) = p_(s+a−j) if j ≤ s + a,
p_t(j | s, a) = 0 if j > s + a.
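The transition law can be checked numerically with a minimal Python sketch. It assumes a demand distribution with finite support, passed as a list p where p[d] is the probability of d new orders; the function name transition_prob and the numbers are illustrative and do not come from the problem.

```python
def transition_prob(j, s, a, p):
    """Return P(j | s, a), where the next state is j = s + a - d for demand d.

    p is an assumed finite-support demand pmf with p[d] = P(demand = d).
    """
    d = s + a - j          # demand that moves the system from s + a down to j
    if d < 0:              # j > s + a would require negative demand
        return 0.0
    return p[d] if d < len(p) else 0.0


# Illustrative pmf on {0, 1, 2, 3} (an assumption, not from the problem)
p = [0.1, 0.4, 0.3, 0.2]
s, a = -1, 2               # one unit backlogged, two units ordered

# Reachable next states run from s + a - 3 up to s + a
row = {j: transition_prob(j, s, a, p)
       for j in range(s + a - len(p) + 1, s + a + 1)}
```

Because backlogging is allowed, the next state may be negative, and each row of the transition matrix still sums to one over the reachable states.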
Rewards: If we assume that newly ordered inventory is used to fill any backlogged orders as soon as it arrives at the beginning of each period, then
r_t(s, a) = f_bar − O(a) − h([s + a]+) − b([s + a]−)
where [x]+ = x ∨ 0 = max(x, 0) is the positive part of x, [x]− = (−x) ∨ 0 = max(−x, 0) is the negative part of x, and
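f_bar = Σ_(j ≥ 0) f(j) p_j is the expected revenue from the orders placed during the period, with f(j) the revenue from j orders as in Sec. 3.2. Because every order is eventually filled, f_bar does not depend on s or a.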
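The reward formula can be evaluated with a short Python sketch. The expected revenue f_bar is passed in as a number, and the cost functions order_cost, holding_cost and backlog_cost (standing in for O, h and b) are hypothetical linear examples chosen only for illustration; the problem leaves them general.

```python
def pos_part(x):
    """Positive part [x]+ = max(x, 0)."""
    return max(x, 0)


def neg_part(x):
    """Negative part [x]- = max(-x, 0)."""
    return max(-x, 0)


def expected_reward(s, a, f_bar, O, h, b):
    """r(s, a) = f_bar - O(a) - h([s + a]+) - b([s + a]-)."""
    y = s + a  # position after the order arrives and backlogged orders are filled
    return f_bar - O(a) - h(pos_part(y)) - b(neg_part(y))


# Hypothetical cost functions (assumptions, not from the problem)
def order_cost(a):
    return 4 + 2 * a if a > 0 else 0   # fixed cost 4 plus 2 per unit ordered


def holding_cost(u):
    return 1 * u                       # 1 per unit held


def backlog_cost(u):
    return 3 * u                       # 3 per unit backlogged


# Three units backlogged, two ordered: one unit stays backlogged
r = expected_reward(s=-3, a=2, f_bar=10.0,
                    O=order_cost, h=holding_cost, b=backlog_cost)
```

Here s + a = −1, so the holding term vanishes and the backlog term charges for one unit: r = 10 − 8 − 0 − 3 = −1.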