98) Why LP's Can't Schedule

The difference between planning and scheduling is fairly obvious to those who have worked in the commercial/planning departments of refineries, but from the outside the distinction is less clear and I am sometimes asked to explain. I usually say something like, planning is about what you want to do while scheduling is deciding when you want to do it: I am buying crude X because it offers the best margin (planning decision) and it is going into tank 5 on Monday so that it is available to feed Tower 2 on Tuesday before the shift change (scheduling decision).

We make our plans using optimization models, in hopes that we can make them as good as possible (making the most money that we can out of our equipment while respecting all the rules). Scheduling needs to be done in order to implement those plans and to deal with any deviations from expectations (planning is forward looking, there are always assumptions in the model), while keeping as on target as possible. If we want the best possible schedule, why don’t we apply SLP and MIP to scheduling and optimize it too? The big answer is that the problems become too large and unwieldy if you try to put in enough detail to generate a precise enough schedule.

In setting a schedule we are interested in the sequence of activities in time. Actions have starts, durations and ends. This is something that can be simulated fairly easily, but becomes much more complex when we try to map it onto the simultaneous equations that make up an LP model. To the scheduler the question of which tank to feed to a unit is a single choice, one that normally doesn't need to be made again until the tank is empty or some production target has been met. Having more tanks makes the decision tree more complicated, but it is still one variable to set from time to time. LP works by having all the possible activities laid out in advance so that it can choose which ones to do. Choosing amongst 5 crude tanks to feed the main tower requires 5 variables. Allocating all the flows and tanks in the refinery will require many more. These will mostly have to be replicated in every period where this activity could start, continue or end – and you are probably going to need a lot of those.

In LPs, time is quantum – it comes in discrete packets that we normally call periods. Within any period everything is optimized as if it happens at once. LP models can have elements of time in them – you can use blocked operation (time slicing) or variable duration time periods to optimize how long you do something and to indicate that multiple activities will occur in the same period – but that doesn’t imply a sequence. You normally need to use multiple periods to separate activities into a time order. The granularity of the time divisions determines the precision of the schedule.

Start with a typical single period monthly planning model. The solution gives you targets for amounts (buys, sells, movements) and actions (processing, blending) that should be achieved by the end of that 30-day planning horizon.

Days 1 to 30

Crude A, Crude B, Crude C, Crude D, Crude E, Crude F

For example, it might indicate that 6 crudes should be processed. Such models don’t usually have representations for the 5 individual tanks that actually feed the tower. Nor do we know when the crudes are arriving. Probably you will need to do different things at different times as the month progresses in order to process that crude as it becomes available and make the various products – but there is little indication in the solution of this sort of model what that schedule should be.

Split it into 4 periods and you can separate out each activity to its value in a particular week. Perhaps we can model the crude supplies as tanks of feed now that we are narrowing the focus. (Assuming we know, or will sort out in the optimization, which crudes are in which tank).

Days 1 to 7	Days 8 to 15	Days 16 to 21	Days 22 to 30
T1, T3	T3, T4	T1, T5	T1, T2, T5

This should give us a better idea of what is optimal for each week, but “the feed tank needs to change over sometime in the next few days” is not much of a schedule.

Let’s split the month into 30 days so the schedule can be more precise. A bit of MIP can be used to limit the number of tanks per day to something sensible.

1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23	24	25	26	27	28	29	30
T1	T1 & T3	T3	T1 & T3	T3	T1	T3	T4	T3	T4 & T3	T4	T3	T4	T4	T3	T1	T5	T1 & T5	T1	T5	T5	T2	T2	T1	T2	T5	T5	T1 & T5	T2	T2

Good enough yet? Probably not. Which tank should be run first when there are two in the day? Is the tank going to need to be changed in the middle of the night or in the afternoon? To be more precise requires even more periods. 4-hour slots would need 180 periods. You want an answer to the nearest hour and you need 720 periods to cover the same time horizon. Is that starting to sound rather large?
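The arithmetic behind those period counts can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the assumption of one choice variable per tank per period (5 tanks) is a simplification of what a real model would carry.

```python
def period_count(horizon_days: int, slot_hours: int) -> int:
    """Number of equal-length periods needed to cover the horizon."""
    hours = horizon_days * 24
    if hours % slot_hours != 0:
        raise ValueError("slot length must divide the horizon exactly")
    return hours // slot_hours


def tank_choice_variables(horizon_days: int, slot_hours: int, tanks: int = 5) -> int:
    """Assumed count: one 'feed the tower from this tank' variable per tank per period."""
    return tanks * period_count(horizon_days, slot_hours)


# Daily, 4-hour and hourly slots over the same 30-day horizon.
for slot in (24, 4, 1):
    print(slot, period_count(30, slot), tank_choice_variables(30, slot))
```

Even this count covers only the choice of crude feed tank. Every other flow, inventory and quality in the model would be replicated across the same periods.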

Continuity requirements are another challenge to doing scheduling by LP. The need for precision pushes for higher granularity and more periods, but we don’t want to switch feed tanks repeatedly. If the period is shorter than the minimum run length, then we need to add constraints that make the next period(s) use the same resource until it has reached an acceptable duration. Run length requirements might be a question of time, or quantity, or be related to what will be left in the tank (run it to empty or leave at least enough to do another minimum run, for example). Consider just the simple rule that any tank has to be used for at least three periods. (If there isn’t enough in it for that long, then it simply can’t be selected until more has been added.) If you start feeding the tower from tank 1 in period 1, then you have to also feed it from tank 1 in periods 2 and 3. If you feed from tank 2 in period 1, then you have to use tank 2 in periods 2 and 3, etc. A constraint is needed for each decision, linking the decisions for each tank in sets of contiguous periods. This requires some if… then MIP. But if this is only the minimum run length, not a fixed one, and tanks may have enough material to continue to be used in the 4th period, the constraints have to be written in such a way that this does not require it to be used for periods 5 and 6 too. It also means that the next tank might start in period 5, so it would also need to run in periods 6 and 7. So our rule has to be set up to work from any starting period, not just multiples of our minimum length. Tricky!
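One common way to write such a rule (a sketch, not necessarily how any particular refinery model does it) uses a 0/1 variable x[t] for "this tank feeds the tower in period t" and a start indicator s[t] that is 1 only when the tank comes on in period t. The rule "if the tank started within the last three periods, it must still be on now" then becomes one linear constraint per period: the sum of s over the last three periods must not exceed x[t]. The Python below does not solve a MIP; it simply evaluates that constraint on candidate on/off sequences and compares it with a direct check of run lengths. All names are hypothetical, and a run still open at the end of the horizon is assumed to be acceptable.

```python
from itertools import product


def starts(x):
    """s[t] = 1 when the tank is on in period t and was off in the period before."""
    out, prev = [], 0
    for v in x:
        out.append(1 if v == 1 and prev == 0 else 0)
        prev = v
    return out


def satisfies_linear_rule(x, min_run):
    """Check sum(s[t-min_run+1 .. t]) <= x[t] for every period t."""
    s = starts(x)
    for t in range(len(x)):
        window = s[max(0, t - min_run + 1): t + 1]
        if sum(window) > x[t]:
            return False
    return True


def satisfies_run_rule(x, min_run):
    """Direct check: every run that ends inside the horizon lasts at least min_run periods."""
    run = 0
    for v in x:
        if v:
            run += 1
        else:
            if 0 < run < min_run:
                return False
            run = 0
    return True  # a run still open at the end of the horizon is not penalised


# The two checks agree on every possible 8-period on/off pattern.
assert all(
    satisfies_linear_rule(x, 3) == satisfies_run_rule(x, 3)
    for x in product((0, 1), repeat=8)
)
```

Because the window slides one period at a time, the same constraint copes with a run of four periods, or a tank that starts in period 5, without special cases. The price is one such constraint per tank per period, on top of the constraints that define the start indicators.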

You also have the problem of mid-period change overs. If you allow multiple choices to be active in the same period the continuity rules become even more complex, but if you limit each period to one choice, then there is a big risk of sub-optimality. If the quantity in the tank is not enough to finish the period the tank will be avoided because it could make the model infeasible. Or suppose that you have enough to run only if you drop back to the minimum rate, whereas in reality you could bring another tank on line to work at the maximum rate if mid-period switching is allowed. You could reduce the period durations even further to make this less of an issue - or allow mid-period switching and work out the implications for the run duration rules. A single tank active in each period is a simple SOS. Multiple tanks active in a period require blocked operation.

Quality tracking is also a big challenge for massively multi-period LP models. Every tank needs to be modelled with inventory. If the stream properties matter, then you have to pass them forward from one period to the next. For a scheduling tool which is simulating the movement of material from one period to the next, you take the tank qualities at the last point at which they changed and use them until something happens which means that they will be changing again. Things get tricky if a tank is filling and emptying at the same time, but there are methods. An LP model, however, has to deal with the pooling problem. The quality calculations that pass the properties from one period to the next don’t linearize so it has to be done with distributed recursion. Quality values have to be estimated in advance for every period and errors in those assumptions carried forward in the linear model via assumed distribution factors. Doing this over the many pool-to-pool blends that have to be done as the inventoried material is passed from one period to the next will almost certainly be a breeding ground for weird values (negative sulphur anyone?) and difficult to converge.
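To see why the quality calculation does not linearize, consider a minimal sketch of one perfectly mixed tank carried from period t-1 to period t. The symbols are assumptions introduced here for illustration: V_t is the closing inventory, F^in_t and F^out_t are the quantities received and drawn in the period, q_t is a quality that blends linearly by quantity, and everything leaving or remaining in the tank is assumed to be at the blended quality q_t.

```latex
\begin{align}
V_t &= V_{t-1} + F^{\mathrm{in}}_t - F^{\mathrm{out}}_t \\
\left(V_t + F^{\mathrm{out}}_t\right) q_t &= V_{t-1}\, q_{t-1} + F^{\mathrm{in}}_t\, q^{\mathrm{in}}_t
\end{align}
```

The first equation is linear. The second multiplies quantity variables by quality variables (V_t q_t, F^out_t q_t, V_{t-1} q_{t-1}), and q_{t-1} is itself the result of the same calculation one period earlier. Each tank in each period is therefore a pool whose quality has to be estimated and then corrected by recursion.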

Another challenge to getting a stable model is the lack of economic drivers. How do you price the choice of one tank over another for feeding the tower? The immediate implications of the choice at any one point are limited, and it may be many periods ahead, when the material is used in blending or the tank is refilled, before the effects on profit are felt. The overall impact of using tank 1 then tank 2 then tank 3 might be very similar to tank 3 then tank 2 then tank 1 if all the material is ultimately used, particularly if the tanks contain similar materials. The model is quite likely to make a lot of almost random variations between recursion passes because there are so many choices that don’t have a big impact on the overall costs. That is, in LP terms, the problem is likely to be highly degenerate.

So it’s not surprising that SLP for refinery scheduling is not any sort of oil industry standard. Scheduling is done with tools that simulate the consequences of decisions so that they can predict when action needs to be taken to avoid problems (tank overflows, off-spec blends). Decision making beyond operability is driven mostly by avoidance of deviation from the targets set in the plan. Many systems use rules to automate some of the decision making so it is not an entirely manual process. Small parts of optimization are often fitted in – such as recipes for product blends given at-that-time component availabilities. Haverly’s H/SCHED utilizes “SimSmarts”, comprised of heuristics, LP models and other proprietary algorithms to help manage all of the event decisions that must be maintained by the schedulers. As the scheduling time horizon changes from day to day, so does the need to add, replace, update and delete many of these event decisions. The “SimSmarts” feature relieves the scheduler of most of that task.

But, you may say, your arguments are that the refinery scheduling problem translates to a large and difficult SLP model, but not an impossible one. What if we narrow the focus to a smaller set of resources during a shorter time frame? Yes, I would say, some people have been successful with blurring the planning / scheduling difference by doing just that. For example, I have seen models that address the problem of crude feed, from allocating arrivals to tanks through to selecting unit feeds, but don’t include any upstream processing. The planning model provides targets for quantities and qualities. The allocation model breaks down the time frame into smaller segments and tries to meet these. The number of periods is kept down by balancing time horizon against granularity. That is, the more periods you need to have an acceptable precision in timing, the shorter the total planning horizon that you apply the model to. The end result, however, is only a rough draft of a schedule. The results are passed to the scheduler for further refinement. I have also during my career written a couple of non-recursed, scheduling-like LPs for petrochemical applications, for sequencing batch production on a single unit. Again, there were targets from a broader plan to say how much feed is available and how much of each product must be made. The sub-model applied sequencing and run-length rules to ensure that these considerations did not prevent the plan from being achieved, and to give at least a rough prediction of when materials would be ready for shipping. Working out the MIP rules for such problems is quite entertaining.

I continue to wonder if there are other approaches that might be useful. In refinery models we are usually focussed on variables that represent quantities. What if we made variables to count time elapsed for each activity? One variable X to indicate the start point and Y for the duration, which is a variable that depends on the rate at which the activity occurs. Then X+Y = stop time, but it becomes difficult to determine when a resource is not available. Another activity on the same resource would have to finish before X or start after X+Y, so definitely some MIP logic required. Alas, I haven’t had the time to mock it up in Excel to see how far it will take me. Or maybe I should stop trying to shoe-horn scheduling into LP simply because that is the technique I know best (if your best tool is a hammer, every problem looks like a nail, as they say). Perhaps, given the difficulty in applying traditional optimization algorithms, this is an area where AI could be usefully deployed, with a coordinated effort to post thousands of refinery schedules online so that ChatGPT and its rivals can scrape them and learn what to say next when asked, “Which tank should I put that crude in?” More realistically, it's going to take one of the oil majors to put resources into developing a purpose-built AI tool that understands about mass balance and quality blending. And then the rest of us will have to wait a bit longer until they are willing to share.
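For what it is worth, the "before or after" logic for two activities on the same resource is usually written as a big-M disjunction. The sketch below is a textbook form, not something tested on a refinery model. The subscripts i and j for the two activities, the binary z_ij (1 when i goes first) and the constant M (any number at least as long as the horizon) are all assumptions added to the X and Y described above.

```latex
\begin{align}
X_i + Y_i &\le X_j + M\,(1 - z_{ij}) \\
X_j + Y_j &\le X_i + M\, z_{ij} \\
z_{ij} &\in \{0, 1\}
\end{align}
```

When z_ij = 1 the first constraint forces activity i to finish before j starts and the second is relaxed by M. When z_ij = 0 the roles swap. One binary is needed for every pair of activities that could clash on a resource, so the MIP burden grows quickly with the number of activities.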

From Kathy's Desk, 20th August 2024.

Comments and suggestions gratefully received via the usual e-mail addresses or here.
You may also use this form to ask to be added to the distribution list so that you are notified via e-mail when new articles are posted.