> “In Laci Babai, you have one of the most legendary and fearsome theoretical computer scientists there ever was, and in graph isomorphism, one of the most legendary and fearsome problems,” wrote Scott Aaronson, a theoretical computer scientist at the University of Texas, Austin, in an email. “A year ago, Laci threw an unbelievable knockout punch at [graph isomorphism], and now the problem itself seems to have gotten off the mat and thrown a counterpunch.”
Scott Aaronson is just such a fantastic writer. So impressive.
Agreed, this is a fantastic description of one of the true legends of CS. I had the great fortune of having Laci as my algorithms professor. Many great researchers are lousy lecturers; Laci is anything but. He's a captivating lecturer. Listening to him, it feels like your brain is moving twice as fast as normal. That's how clearly and quickly he explains concepts.
I was watching a talk of his the other day and thinking "he's old enough to be my dad, but damn, he's cute..." ¬_¬
Throughout my life, I've found that the best lecturers aren't necessarily the smartest or the most credentialed; they're the most visibly passionate and enthusiastic, so much so that their enthusiasm tends to "rub off" on their audiences. Enthusiasm leads to engagement, which leads to attentiveness and self-motivation for learning. Even a mediocre teacher can be a great lecturer, and I've had a few in my lifetime (though unfortunately I can count their number on one hand).
I absolutely agree. Maths is a blind spot for me; I never managed to advance beyond basic algebra, though I appreciate some of the beauty involved. But this explanation managed to convey at least some of the issues in a highly accessible way.
I wonder whether someone can improve the algorithm back to quasipolynomial. Hopefully Babai's approach is not a dead end, the improvement from his algorithm is so huge that there might still be hope to salvage the ideas.
It's already the fastest algorithm available. Not sure why anyone would need to salvage his ideas; anyone needing a state-of-the-art graph isomorphism algorithm will implement it.
Nope, it's theoretically very interesting, but anyone who really wants to solve graph isomorphism would use Brendan McKay's Nauty, or one of the up-to-date implementations (Saucy, Bliss, Hungry, Ferret).
Personal interest: I wrote Ferret, my PhD student wrote Hungry.
VF2 is faster than Nauty for EXTREMELY trivial problems (and many problems are extremely trivial), but I've found many real-world problems where it falls off a cliff.
VF2 is basically the most stupid search imaginable -- if the problem is easy it stumbles into the solution very quickly, but if any work is required it degrades into a horrible exponential search.
On the other hand, Nauty is very difficult to confuse -- as it reasons about the whole graph (whereas VF2 just does local reasoning), your graph has to have global properties which make it hard, which are rare (and basically never occur in real-world graphs, unless they come from some mathematical problem).
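To make the "local reasoning" point concrete, here's a minimal Python sketch in the same flavor as VF2 (not VF2 itself, which adds feasibility pruning): extend a partial vertex mapping one vertex at a time, checking each candidate only against the vertices mapped so far. On easy instances it stumbles straight into a solution; on adversarial ones the backtracking blows up exponentially.

```python
def find_isomorphism(adj1, adj2):
    """Backtracking search for a vertex bijection adj1 -> adj2 that
    preserves adjacency. Purely local reasoning: each extension is
    checked only against already-mapped vertices, so hard instances
    degrade into an exponential search. Graphs are adjacency lists
    of sets, vertices numbered 0..n-1."""
    n = len(adj1)
    if n != len(adj2):
        return None

    def extend(mapping, used):
        v = len(mapping)               # next vertex of graph 1 to map
        if v == n:
            return list(mapping)       # complete bijection found
        for w in range(n):             # try every unused target vertex
            if w in used:
                continue
            # local consistency: edges to already-mapped vertices must match
            if all((u in adj1[v]) == (mapping[u] in adj2[w]) for u in range(v)):
                mapping.append(w)
                used.add(w)
                result = extend(mapping, used)
                if result is not None:
                    return result
                mapping.pop()
                used.remove(w)
        return None

    return extend([], set())

triangle = [{1, 2}, {0, 2}, {0, 1}]    # 3-cycle
path = [{1}, {0, 2}, {1}]              # 3-vertex path
print(find_isomorphism(triangle, [{1, 2}, {0, 2}, {0, 1}]))  # [0, 1, 2]
print(find_isomorphism(path, triangle))                      # None
```

Nauty-style tools avoid this trap by refining a coloring of the whole graph (global reasoning) before and during the search, which is why it takes rare global structure to confuse them.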
I recall him saying in a talk that there is no need to implement his algorithm.
In a way, the algorithm proves that there are no extremely hard isomorphism instances, which in turn guarantees that our practical heuristic algorithms can't do as badly as we might have feared.
Asymptotically fastest is unfortunately very different from fastest in practice. By "salvage" I mean developing his ideas towards a quasipolynomial (or even polynomial!) algorithm for graph isomorphism. I don't really expect a practical algorithm for the problem in the near future.
Graph isomorphism is interesting, in that practical algorithms already exist. It's actually very, very hard to produce difficult instances of the graph isomorphism problem (to be exact, instances where existing algorithms take non-polynomial time).
Much of the progress in recent years has come from first finding instances where existing algorithms (in particular, partitioning from Brendan McKay's Nauty) take exponential time, and understanding why.
I would love to know whether there are some practical (non-math/cs-user facing) problems which can be solved with graph (iso,sub-iso,mono,epi,...) morphism algorithms.
Sure, there's a lot you can do with it in the math and theoretical cs world, probably in chemistry (derivation of chemical reactions?) and maybe bioinformatics as well.
But despite the power of the approach - general comparison of structures and/or combinatoric enumeration of all "instances" of a pattern in a data graph - are there any success stories of companies with products which are killing it because of using graph morphisms under the hood? (or maybe even directly exposing pattern/data-graph relations to the user)
The major use I know of is finding symmetries in other search problems.
Take your problem in Constraint Programming / Mixed Integer Programming / SAT / SMT, generate a parse tree, turn that parse tree into a graph, and find symmetries of that graph.
If done in the right way, symmetries of the graph are symmetries of the original problem, and knowledge of these symmetries can be used to avoid redundant symmetric search. This is done all the time in combinatorial search systems.
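A toy sketch of why symmetries matter for search (not a real CP/SAT system; the problem and the lexicographic constraint are made up for illustration). When every permutation of the variables is a symmetry of the problem, a symmetry-breaking constraint lets the solver keep one representative per symmetry class instead of exploring all the redundant copies:

```python
from itertools import product

# Toy CSP: choose 0/1 values for four fully interchangeable variables
# so that exactly two of them are 1. Every permutation of the variables
# is a symmetry, so most solutions are redundant copies of one another.
n, k = 4, 2
solutions = [s for s in product((0, 1), repeat=n) if sum(s) == k]
print(len(solutions))  # 6 solutions, all permutations of each other

# Knowing the symmetry group, add a lexicographic symmetry-breaking
# constraint (x1 >= x2 >= ... >= xn): exactly one assignment per
# symmetry class survives, so the search space shrinks sixfold.
canonical = [s for s in solutions
             if all(s[i] >= s[i + 1] for i in range(n - 1))]
print(canonical)  # [(1, 1, 0, 0)]
```

In real systems the symmetries aren't known in advance; that's where the graph-automorphism step comes in, run on the graph built from the problem's parse tree.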
One problem for many other real-world applications is that it is much more common to want "almost symmetries" (for various definitions of almost symmetry). It turns out this is a much harder problem, and algorithms for "pure" graph isomorphism can't easily be modified to solve the almost-symmetry problem.
not disagreeing with you at all, more like expanding your comment.
for some definitions of 'almost', especially the vague human ones, there are very surprising results (well, not for people with a CS education, but for everyone else). e.g. 2-SAT vs 3-SAT, Hamiltonian path (visit all nodes of a graph) vs Eulerian path (visit all edges), shortest path vs longest path, etc. like i said, the 'almost' is debatable, but to the untrained eye the problems are very similar.
Could you please state some examples of these real-world problems where "almost symmetry" would help?
Face/shape recognition comes to mind, where inexact graph matching would be useful, but are there any others?
I'm not trolling, just interested in graph matching in general as an amateur. I have this feeling that there should be many more applications of these techniques, but amusingly, in practice there's always a more specialized, non-matching algorithm used (probably because the matching algorithms' complexity doesn't scale for larger problems...).
Anytime knowing that some permutation of thing1 equals thing2 allows you to treat both thing1 and thing2 in the same fashion. A related problem is canonization: finding a normal form you can put both thing1 and thing2 into, knowing that if the normal forms are the same, they are isomorphic.
Compiler optimization, generating hardware layouts of circuits, and helping neural nets scale. Brute-force isomorphism checking (where you check permutations but don't care whether the object under study is a graph) can be done in a bit fewer than the naive n! steps Babai uses as a subroutine in the paper: https://oeis.org/A186202
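To illustrate canonization by brute force (the naive factorial-time approach, nothing like what Nauty or Babai actually do): relabel the graph under every vertex permutation and keep the lexicographically smallest edge set. Two graphs are isomorphic exactly when their canonical forms coincide.

```python
from itertools import permutations

def canonical_form(adj):
    """Brute-force canonization: try every relabeling and keep the
    lexicographically smallest sorted edge list. O(n!) in the number
    of vertices; for illustration only. Graphs are adjacency lists
    of sets, vertices numbered 0..n-1."""
    n = len(adj)
    best = None
    for perm in permutations(range(n)):
        edges = tuple(sorted(
            tuple(sorted((perm[u], perm[v])))
            for u in range(n) for v in adj[u] if u < v))
        if best is None or edges < best:
            best = edges
    return best

# Two different labelings of a 3-vertex path get the same normal form,
# so comparing canonical forms decides isomorphism.
path_a = [{1}, {0, 2}, {1}]      # edges 0-1, 1-2 (center is vertex 1)
path_b = [{1, 2}, {0}, {0}]      # edges 0-1, 0-2 (center is vertex 0)
print(canonical_form(path_a) == canonical_form(path_b))  # True
```

The practical tools mentioned above compute the same kind of canonical labeling, but prune the permutation search down from n! using vertex invariants and automorphisms.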
There are a couple of problems in cheminformatics that can be/are solved with graph isomorphism algorithms: maximum common substructure, graph edit distance, etc. The latter is quite useful for determining molecule similarity, but very computationally expensive, especially compared to topological hashing algorithms.
It's interesting how nearly all algorithmic asymptotic running times can be specified just in terms of exp, log and powering. exp(exp(sqrt(log n)))=exp(exp(exp(log(log n)/2))) is a beautiful thing.
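A quick numeric sanity check of that identity: since sqrt(x) = exp(log(x)/2), the inner expressions sqrt(log n) and exp(log(log n)/2) agree, and wrapping both in exp(exp(...)) gives the equality above. (We only compare the inner expressions; the outer double exponential overflows floats for interesting n.)

```python
import math

# sqrt(log n) = exp(log(log n) / 2), checked numerically for a few n.
for n in (100, 10**6, 10**100):
    lhs = math.sqrt(math.log(n))
    rhs = math.exp(math.log(math.log(n)) / 2)
    assert math.isclose(lhs, rhs), (n, lhs, rhs)
print("sqrt(log n) == exp(log(log n)/2) checks out")
```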
But how come we so rarely encounter other functions? Clearly this class doesn't cover everything we might need. Where are the exp(exp(a^(-1)(n))) algorithms, where a^(-1) is the inverse Ackermann function?
The iterated logarithm, a very slow-growing function, shows up naturally in algorithm analysis: https://en.wikipedia.org/wiki/Iterated_logarithm. And the inverse Ackermann of course shows up in union-find.
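A minimal sketch of the iterated logarithm, to make "very slow-growing" concrete: count how many times you must take a log before the value drops to 1.

```python
import math

def log_star(n, base=2):
    """Iterated logarithm log*(n): the number of times log must be
    applied before the result is <= 1. Grows absurdly slowly."""
    count = 0
    while n > 1:
        n = math.log(n, base)
        count += 1
    return count

print(log_star(2))         # 1
print(log_star(65536))     # 4:  65536 -> 16 -> 4 -> 2 -> 1
print(log_star(2**65536))  # 5: one more step than 65536
```

So log*(n) is at most 5 for any n you'd ever feed a real computer, which is why bounds involving it (or the even slower inverse Ackermann, as in union-find) are "constant for all practical purposes".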
It's also worth pointing out that these functions sometimes turn up just in terms of the bounds that have been proven about algorithms -- the functions we see can reflect the kinds of analysis we can do of algorithms as much as the performance of the algorithms themselves.
And we also see some very weird exponents -- the best asymptotic bound we have on a matrix multiplication algorithm is n^2.3728639.
There are some such algorithms. You don't encounter them as often because we are biased towards encountering algorithms that can be implemented efficiently, or at least approximated efficiently (i.e., a great many NP-complete problems admit usable approximation algorithms, with varying degrees of usability).
Plus, in perhaps a deeper philosophical sense, we are ourselves biased towards expressing problems that have relatively simple complexities as their solution. While there are certainly some simple problems that have complicated optimal solutions, I'd say that in general the vast bulk of problems whose optimal solution has Ackermann-like complexity are problems that we can't really express or manipulate either. We focus a lot on our ability to create and express solutions, because problems in the real world have forced themselves upon us since long before computer science was even a thing, but our ability to express problems is limited too!
> There are some such algorithms. You don't encounter them as often because we are biased towards encountering algorithms that can be implemented efficiently
Oh, but inverse Ackermann is very slow-growing, and exp(a^(-1)(n)) is very slow-growing as well. Same with the iterated logarithm.
The latter pops up in recursive algorithms that partition into parts of size log n. I don't think that's a very weird thing to do.
NP-Hardness proofs don't say "you would have to solve xyz NP-Hard problem to solve this", they say "if you could solve the problem you're interested in, you could use the solution to solve all other NP-complete problems". It's not that they incorporate a hard problem -- rather, you show that your problem is hard enough that it can be used to solve other hard problems.
Those functions all map quite well to programming or problem-space constructs. For example, polynomial time comes naturally from nested loops over the input. Exponential and factorial complexities come naturally from exhaustive searches over a combinatorial explosion, and exponentials also crop up in the context of naive recursion. Logs come from anything that looks like "recursively eliminate a fraction of the possibilities".
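Two tiny illustrations of that mapping (toy functions, just to show where the shapes come from):

```python
def count_pairs(items):
    """Nested loops over the input -> O(n^2): one iteration per
    unordered pair of elements."""
    items = list(items)
    pairs = 0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            pairs += 1
    return pairs

def halving_steps(n):
    """'Recursively eliminate a fraction of the possibilities'
    -> O(log n): halve the search space until one candidate is left."""
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps

print(count_pairs(range(10)))  # 45 = 10 * 9 / 2
print(halving_steps(1024))     # 10 = log2(1024)
```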