Great reply with interesting details. However, to be fair to the author, I don't think he was passing off his mathematics as a realistic business decision matrix. In The College Mathematics Journal[1], a lot of presentations are of the "assume a spherical cow" type. They present scenarios for educational exploration that are more complicated/interesting than the high-school algebra of "two trains leave the station, when do they meet?". That's why the paper's scope is limited to a Euclidean geometry exercise.
To model the extra factors you listed, it becomes a complex multi-constraint optimization NP problem, and I suspect the author probably knows that. He just didn't state that disclaimer explicitly because he didn't anticipate that a non-journal audience (e.g., savvy HN readers) would pick it apart.
In the same spirit, the Levenshtein distance can be studied for simple spell checking algorithms. Even though we know the more sophisticated ones can utilize machine learning with massive datasets, we can still explore Levenshtein for teaching purposes.
There is a lot of fun to be had considering the cases for passenger airlines in the US, since the large traditional hub-and-spoke airlines all maintain multiple hubs (in fact, following the most recent round of merger consolidation, each of the big three US airlines has at least seven hubs in the country).
That's a multi-way tradeoff among local population and desirability as a destination (to ensure plenty of profit-making opportunities on direct flights); proximity to other important destinations (to offer short connections to non-hub cities); coastal or northern locations (good for trans-oceanic or long great-circle routes); climate (to ensure your hub doesn't get frequently shut down by adverse weather); congestion (to ensure your hub isn't frequently delayed by the sheer amount of traffic to coordinate); airport layout (to give you maximum efficiency for lots of flights); etc., etc.
When people wonder, for example, why Atlanta is consistently the busiest or second-busiest airport despite not being an obvious first-tier American city (those being New York, Chicago and Los Angeles), the weather, the congestion and the airport layout are huge factors. Atlanta doesn't get a ton of snow or severe thunderstorms like the northeast or upper midwest, it isn't in an overloaded air corridor like the northeast, and the airport's runway and taxiway configuration allows multiple simultaneous takeoffs and landings with much more ease than someplace like O'Hare (which is basically beyond capacity at this point with its configuration).
> a complex multi-constraint optimization NP problem
There was a day when I thought that! So, I was thinking non-linear optimization with, say, some gradients, steepest descent, conjugate gradients, quasi-Newton, Kuhn-Tucker conditions, maybe some Lagrangian relaxation, etc., and saw no hope.
And I still remember the day when I got an explanation and learned better! The problem can be made 0-1 integer linear programming but otherwise linear. The non-linearities (really complicated costs, say, a plane's step climb when it has a heavy cargo load) and the really goofy constraints (e.g., don't fly over any US military bases or the White House, don't fly over residential areas at dinner time, etc.) can be handled in a simple way before the optimization itself.
The technique has some generality and, thus, is a good tool to have in a 'data science' tool kit, especially with current computer hardware, library software, and practical data availability.
Here's an outline of the technique: For one period from, say, 5 PM to midnight (and again from 1 AM to, say, 8 AM), set up a table with one row for each city to be served (90 cities) and one column for each candidate airplane trip (tour) from Memphis and back. In this table, for each city and column, put a 1 if the tour for that column serves that city, and a 0 otherwise. Generate all such columns; there may be some thousands of them.
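That city-by-tour table is easy to sketch in code. Here is a toy version with invented city codes and a handful of made-up tours standing in for the thousands of real candidate columns:

```python
# Toy sketch of the 0/1 city-by-tour table described above.
# Cities and tours are invented for illustration; a real instance
# would enumerate thousands of candidate tours out of Memphis.
cities = ["ATL", "BOS", "DFW", "LAX", "ORD"]

# Each candidate tour is the set of cities it serves (Memphis implied).
tours = [
    {"ATL", "BOS"},
    {"DFW", "LAX"},
    {"ORD"},
    {"ATL", "DFW", "ORD"},
    {"BOS", "LAX"},
]

# table[i][j] = 1 if tour j serves city i, else 0.
table = [[1 if city in tour else 0 for tour in tours] for city in cities]

for city, row in zip(cities, table):
    print(city, row)
```

Each row of `table` is one city's coverage pattern across the candidate tours; each column is one candidate schedule entry.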
Considering all the fine cost details we want to account for, and taking expectations for random costs, say, from weather and air traffic control, find the cost of each candidate tour (as a first cut, assume that if a plane goes to a city, then it completely serves that city). Then, for the schedule, we want to pick no more than 33 columns (there were 33 planes in the fleet) from the thousands so that all 90 cities are served and total cost is minimized.
So each column serves, that is, covers, some cities (each city where the column has a 1), and we want to pick columns that cover all 90 cities; so we have a set-covering problem. Right, it's NP-complete.
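On a toy instance, the covering selection can even be brute-forced over subsets of columns, just to make the formulation concrete. The cities, tours, costs, and fleet limit below are all invented:

```python
from itertools import combinations

# Tiny exact set-covering search: pick at most max_planes tours so that
# every city is covered and total cost is minimized. All data invented.
cities = frozenset(["ATL", "BOS", "DFW", "LAX", "ORD"])
tours = [frozenset(s) for s in
         [{"ATL", "BOS"}, {"DFW", "LAX"}, {"ORD"},
          {"ATL", "DFW", "ORD"}, {"BOS", "LAX"}]]
costs = [5.0, 6.0, 2.0, 9.0, 7.0]
max_planes = 3  # fleet limit: at most this many columns may be picked

best_cost, best_pick = float("inf"), None
for k in range(1, max_planes + 1):
    for pick in combinations(range(len(tours)), k):
        covered = frozenset().union(*(tours[j] for j in pick))
        if covered == cities:                      # every city served
            cost = sum(costs[j] for j in pick)
            if cost < best_cost:
                best_cost, best_pick = cost, pick

print(best_pick, best_cost)
```

Exhaustion like this is only viable for a handful of columns; with thousands of columns one needs the integer-programming machinery described below.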
So, in particular, we have a linear programming problem: We have one variable for each column. The costs and those variables form the linear function of the variables to be minimized. The constraint matrix is just the table we described. The constraints are that each city is served by exactly one column.
Or, in matrix algebra: from the table described above we have a matrix A with 90 rows and some thousands, say m, columns. We have the variables x, m x 1. We have the m costs c, 1 x m. And we have the right-hand side b, 90 x 1, where each component of b is 1. Then we want to find x to minimize cx subject to Ax = b. So it's linear programming.
Then, in addition, we ask that the variables take on only the values 0 or 1, so we have a 0-1 integer linear programming problem.
If we ignore the 0-1 constraints, we have a fairly routine linear programming problem but will likely end up with some fractional tours allocated.
For honoring the 0-1 constraints, on this problem we could likely do well enough with branch and bound.
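Here is a minimal branch-and-bound sketch on a toy set-covering instance (data invented). The bound used is crude, just the accumulated cost, which is valid since costs are nonnegative; a real solver would bound with the LP relaxation instead:

```python
# Minimal branch and bound for 0-1 set covering. At each column j we
# branch on x_j = 1 then x_j = 0, and prune any branch whose cost
# already matches or exceeds the incumbent. All data invented.
cities = frozenset(["ATL", "BOS", "DFW", "LAX", "ORD"])
tours = [frozenset(s) for s in
         [{"ATL", "BOS"}, {"DFW", "LAX"}, {"ORD"},
          {"ATL", "DFW", "ORD"}, {"BOS", "LAX"}]]
costs = [5.0, 6.0, 2.0, 9.0, 7.0]

best = {"cost": float("inf"), "pick": None}

def branch(j, covered, cost, pick):
    if cost >= best["cost"]:          # bound: can't beat the incumbent
        return
    if covered == cities:             # feasible: all cities covered
        best["cost"], best["pick"] = cost, pick
        return
    if j == len(tours):               # ran out of columns, infeasible
        return
    branch(j + 1, covered | tours[j], cost + costs[j], pick + [j])
    branch(j + 1, covered, cost, pick)

branch(0, frozenset(), 0.0, [])
print(best["pick"], best["cost"])
```

With a good lower bound the tree stays small; with this crude one the sketch still finds the optimum but prunes much less.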
As FedEx grew, the nature of the scheduling problem changed. The outline above should have helped save significant bucks, at least in the early years.
Then, of course, that solution technique could be used to evaluate the few candidate hub locations. With some generalization, the solution technique could also be used to evaluate airplane types to add to the fleet.
So, here we have 0-1 integer linear programming set covering; keep it in mind and see if you can find another good application.
Thanks for mentioning Gurobi. I just looked them up! So, they are headed up by Bixby! Terrific! After IBM bought ILOG and CPLEX, I assumed that Bixby would retire. Apparently, nope!
Also, Bixby has two students from Georgia Tech, so likely from George Nemhauser and Ellis Johnson.
George is the person who explained 0-1 integer linear set covering to me. That same day he gave me three words of advice that became the direction for my Ph.D. dissertation (in stochastic optimal control) at Johns Hopkins. I knew Ellis when we were both at IBM's Watson lab.
What you mentioned looks like something quite common for linear programming software. E.g., think of a linear programming tableau and appending a new constraint with its own new artificial variable. Then do elementary row operations to zero out the row of the new constraint in the columns of the old basic variables. Then do elementary row operations to zero out the new artificial column in all rows except the row of the new constraint. Then do simplex-algorithm pivots to get the new artificial variable out of the basis.
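Those row operations are easy to sketch. The tableau entries and the appended constraint below are invented, and in this little instance the new variable's row happens to come out with a nonnegative right-hand side, so no further pivots are needed:

```python
# Sketch of appending a constraint row to a simplex tableau and
# eliminating the old basic variables from it by elementary row
# operations. Numbers invented; this is not a full simplex code.

# Current tableau rows, one per old basic variable; last entry is RHS.
# Columns: x1, x2, s1, s2 | rhs. x1 is basic in row 0 (unit column in
# position 0) and x2 is basic in row 1 (unit column in position 1).
tableau = [
    [1.0, 0.0, 2.0, -1.0, 4.0],   # x1 basic
    [0.0, 1.0, -1.0, 3.0, 2.0],   # x2 basic
]
basic = [0, 1]  # column index of the basic variable in each row

# New constraint x1 + x2 + s3 = 7, with its own new column s3 appended
# (old rows get a 0 in the new column, so it is a unit column).
for row in tableau:
    row.insert(-1, 0.0)
new_row = [1.0, 1.0, 0.0, 0.0, 1.0, 7.0]

# Elementary row operations: zero out the new row in the columns of
# the old basic variables by subtracting multiples of their rows.
for i, col in enumerate(basic):
    factor = new_row[col]
    new_row = [a - factor * b for a, b in zip(new_row, tableau[i])]

tableau.append(new_row)
print(new_row)
```

After the elimination, s3 is basic in the new row; if its right-hand side had come out negative, simplex (or dual simplex) pivots would be needed to restore feasibility.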
Changing costs is also easy and standard, in convenient little operations called post-optimality analysis. There are more details, but covering all of those would need much of a course in linear programming.
E.g., the IBM Optimization Subroutine Library (OSL) is quite nice work, likely heavily from Ellis Johnson when he was still at IBM, before he went to Georgia Tech, where George Nemhauser has long been and does what you mentioned.
Once I wrote some 0-1 integer linear programming software with some Lagrangian relaxation based on the OSL -- it was nice.
From Bixby, the guy who did CPLEX, and two Georgia Tech students, etc., I would expect more than the usual or just the OSL! I'd guess that they have some good results in integer programming, numerical stability, much larger problems, exploitation of multi-core processors, and more.
[1] http://www.maa.org/publications/periodicals/college-mathemat...