And, guess what? Since the Galerkin approximation requires one to choose a basis appropriate to the problem at hand, we now have a deep learning solution too (since neural network training is essentially equivalent to learning an adaptive basis).
It is called the Deep Galerkin Method [1]. In a nutshell, the method directly minimizes the L2 error over the PDE, boundary conditions and initial conditions. The integral is tricky though, and computed via a Monte Carlo approximation.
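A toy sketch of that Monte Carlo residual estimate (my own example, not from the paper): fix a trial function, evaluate the squared PDE residual at uniform random points, and average. For u'' + pi^2 sin(pi x) = 0 on [0,1], the (deliberately wrong) trial function u(x) = x(1-x) has residual r(x) = -2 + pi^2 sin(pi x), and the integral of r^2 has a closed form to check against.

```python
# Monte Carlo estimate of the L2 PDE residual, as minimized by
# Deep Galerkin-style methods. Trial function and PDE are my toy choices.
import math, random

random.seed(0)

def residual(x):
    # r(x) = u''(x) + pi^2 sin(pi x) with trial u(x) = x(1-x), so u'' = -2
    return -2.0 + math.pi**2 * math.sin(math.pi * x)

n = 100_000
# Monte Carlo estimate of the integral of r(x)^2 over [0, 1]
mc = sum(residual(random.random())**2 for _ in range(n)) / n

exact = math.pi**4 / 2 + 4 - 8 * math.pi  # closed form for this toy residual
print(mc, exact)  # the two should agree to a few decimal places
```

In the actual method the trial function is a neural network and the residual is computed with automatic differentiation, but the integral is estimated the same way: by averaging over randomly sampled collocation points.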
Why would you use the Monte Carlo method when the quasi-Monte Carlo method converges so much more quickly? I admit, I am a little biased, because I worked on some QMC stuff in grad school, but it works really, really well in practice.
If the number of dimensions is high enough then pretty much all points are far away from each other so there's no real need to worry about points clustering. You're at a much greater risk of simply missing interesting parts of your space than you are of oversampling any part of it.
This can be especially bad if one corner hides a catastrophe. Any automatic method is likely to have an exponential struggle to sample appropriately. At a certain point, you just need to apply your own understanding of the problem you are trying to solve, and resort to an ad hoc method to do the “right” thing.
That's exactly the problem QMC methods solve. By choosing a sequence of sample points of low discrepancy, you make sure to sample the entire space as evenly as possible.
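A self-contained illustration of "low discrepancy" (pure-Python Halton sequence via the radical inverse, not a library call; the integrand is my own toy):

```python
# Quasi-Monte Carlo with a 2D Halton sequence vs. plain Monte Carlo.
import math, random

def radical_inverse(i, base):
    """Reflect the base-`base` digits of i about the radix point."""
    f, r = 1.0, 0.0
    while i > 0:
        f /= base
        r += f * (i % base)
        i //= base
    return r

def halton_2d(n):
    # Coprime bases (2, 3) give a low-discrepancy point set in the unit square
    return [(radical_inverse(i, 2), radical_inverse(i, 3)) for i in range(1, n + 1)]

f = lambda x, y: x * y            # integral over the unit square is exactly 1/4
n = 4096
qmc = sum(f(x, y) for x, y in halton_2d(n)) / n

random.seed(1)
mc = sum(f(random.random(), random.random()) for _ in range(n)) / n
print(abs(qmc - 0.25), abs(mc - 0.25))  # QMC error is typically far smaller
```

For smooth integrands in modest dimension, the QMC error shrinks close to O(1/n) rather than the O(1/sqrt(n)) of plain Monte Carlo, which is the "converges so much more quickly" claim above.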
It does, but if you've got hundreds of dimensions then you have no hope of ever sampling anything close to the entire space. Even if you pick just 2 points per dimension, you've got no way of trying every combination.
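Two quick sanity checks on both claims (toy numbers, mine): the combinatorial blow-up of even a 2-points-per-dimension grid, and the fact that random points in a high-dimensional cube are all far apart relative to the cube's side.

```python
# 1) "2 points per dimension" needs 2**d evaluations.
# 2) Pairwise distances in a high-dimensional unit cube concentrate
#    around sqrt(d/6), far larger than the cube side of 1.
import random

print(2**100)  # 1267650600228229401496703205376 grid corners for d = 100

random.seed(0)
d = 100

def dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

pairs = [([random.random() for _ in range(d)],
          [random.random() for _ in range(d)]) for _ in range(50)]
mean_dist = sum(dist(p, q) for p, q in pairs) / len(pairs)
print(mean_dist)  # concentrates near sqrt(100/6) ~ 4.08
```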
Just a guess, but if you think about Sobol sequences, maybe the available dimensions get exhausted in this use case and then QMC doesn't perform well anymore.
Check out the rest of the guy's blog. I desperately wish there were more resources where complex mathematical concepts were first introduced without all of the proofs.
A tangent, but I was exposed to the Galerkin approximation when learning about the Finite Element Method, well over 10 years ago.
As part of the course I got introduced to the FEniCS project[1].
They had Python code that looked very much like the math equations, which generated C++ code at runtime, compiled it into a Python module, and dynamically loaded and executed it.
This way they got speeds that rivaled or surpassed handwritten C++, since the generated code could be optimized around the specific problem, but with the superior ergonomics of writing the equations almost directly.
It really blew my mind. I had heard about Java doing JIT but this was on another level for me. Not terribly fancy these days but at the time it really helped me expand my thinking about how to solve problems.
Another project that works like this is devito https://www.devitoproject.org/ - the Python code generates C code, calls gcc to compile it, dynamically links the object code with dlopen(), then calls the compiled code. That way, the hot code loop doesn't run in Python.
You really should check out ApproxFun.jl and DifferentialEquations.jl. The Julia ecosystem for this type of thing is incredibly well developed. You get autodiff, auto-sparsity detection, auto-vectorization and more for free. These tools also leverage the fact that Julia is fast and has a good macro system, so the code you generate is just normal Julia code, which can be used with any of the other tools of the language. And multiple dispatch means you don't have to use anything complicated to write this: just write your equations "normally" and it all just works.
The constant shilling for Julia gets old after a while. When it comes to numerical mathematics, the implementation language is just a way to express ideas that are more or less independent of whatever language you are using. Last time I checked, Julia had close to no professional-grade libraries for PDE solving, while there are multiple high-quality libraries for C++: deal.II, FEniCS, PETSc. Don't get me wrong, Julia is great, but what really matters is the quality of the available libraries. The differential equations work in Julia is nice and obviously the resulting code is performant, but a ton of work in numerical mathematics is spent on adapting methods to specific problems, and at that point C++ feels like a much nicer choice to me in many cases.
Julia has some young projects that look promising as future broad FEM/PDE discretization libraries, like Gridap.jl, but they are certainly not at the level of the more mature C++-based libraries you mention.
For my own work I would never say C++ is a nicer choice; I hate writing and dealing with C++ code. It is, however, a necessary choice, since that is where many of the best libraries live.
These packages you mention are not suitable for the typical use cases of the finite element method. There are some packages in the Julia ecosystem that may be suitable. I have no recent experience on using them though.
I find the general idea of treating differential equations as infinite-dimensional linear systems quite powerful. I was first introduced to the concept while studying quantum mechanics, but applications are everywhere.
I first encountered it for deriving finite element methods for structural analysis, and later as a general method for PDEs. Sadly this treatment is about as inscrutable as my grad level PDE course.
This made as little sense to me as it did when I was taking the Finite Element Methods class during graduate school.
Still don't quite understand why you can't just use Runge Kutta methods to numerically solve these problems. I became quite good at manipulating the symbols to derive variational solutions while having absolutely no idea what any of it meant.
Runge-Kutta works when you're given an initial condition and the derivative is with respect to just one variable (so you're given f(t) just at t = 0). What do you do when you're given a boundary condition and the derivative is with respect to many variables? This.
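To make that concrete: a boundary-value problem can't be marched forward in time the way Runge-Kutta marches an initial-value problem, but discretizing it turns it into a linear system you solve all at once. A minimal finite-difference sketch (the toy problem is mine): u'' = -pi^2 sin(pi x) on [0,1] with u(0) = u(1) = 0, whose exact solution is sin(pi x).

```python
# Central differences turn the BVP into a tridiagonal linear system,
# solved here with the Thomas algorithm.
import math

n = 50                      # interior grid points
h = 1.0 / (n + 1)
x = [(i + 1) * h for i in range(n)]
# Discrete equations: u_{i-1} - 2 u_i + u_{i+1} = h^2 f(x_i)
rhs = [-math.pi**2 * math.sin(math.pi * xi) * h * h for xi in x]

a = [1.0] * n   # sub-diagonal
b = [-2.0] * n  # diagonal
c = [1.0] * n   # super-diagonal
d = rhs[:]
for i in range(1, n):       # forward elimination
    m = a[i] / b[i - 1]
    b[i] -= m * c[i - 1]
    d[i] -= m * d[i - 1]
u = [0.0] * n
u[-1] = d[-1] / b[-1]
for i in range(n - 2, -1, -1):  # back substitution
    u[i] = (d[i] - c[i] * u[i + 1]) / b[i]

err = max(abs(ui - math.sin(math.pi * xi)) for ui, xi in zip(u, x))
print(err)  # O(h^2) error against the exact solution sin(pi x)
```

Both boundary conditions enter the system simultaneously, which is exactly what a one-directional marching scheme cannot do.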
I think variational problems are more naturally understood through energy minimization. You start with the energy of the system and try to minimize it via derivatives. Then you arrive at a variational problem. The differential equation is then more of an afterthought.
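A sketch of that energy view (toy problem and discretization are my own): for -u'' = 1 with u(0) = u(1) = 0, minimize the discretized Dirichlet energy E(u) = sum (u_{i+1} - u_i)^2 / (2h) - sum h u_i by plain gradient descent. The minimizer satisfies the difference equation, so the differential equation really does show up "as an afterthought".

```python
# Gradient descent on a discretized energy functional; the stationarity
# condition dE/du_i = 0 is exactly the finite-difference form of -u'' = 1.
n = 19
h = 1.0 / (n + 1)            # grid spacing, x_i = i * h
u = [0.0] * (n + 2)          # interior unknowns plus fixed boundary zeros

def grad(u):
    # dE/du_i = (2*u_i - u_{i-1} - u_{i+1}) / h - h
    return [(2 * u[i] - u[i - 1] - u[i + 1]) / h - h for i in range(1, n + 1)]

for _ in range(20000):
    g = grad(u)
    for i in range(1, n + 1):
        u[i] -= 0.1 * h * g[i - 1]   # small step, stable for this Hessian

print(u[(n + 1) // 2])  # midpoint value; exact solution x(1-x)/2 gives 0.125
```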
Finite differences is the PDE analogue of Runge-Kutta, and it is certainly used (in CFD for example). However, finite element methods have several advantages:
* They can handle PDEs on domains with complicated geometries, while finite differences really prefer rectangular domains. This consideration doesn't apply to ODEs, which are always solved on one-dimensional intervals.
* For any numerical approximation it is important to have convergence guarantees, and as the blog post mentions, the analysis is much more well understood for finite elements, particularly on irregular geometries. Strang and Fix's 1973 book is the classic reference here.
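The Galerkin idea behind finite elements fits in a few lines in one dimension if you use a global sine basis instead of the usual hat functions (my toy, not from the post): for -u'' = 1 on [0,1] with u(0) = u(1) = 0, the basis sin(k pi x) is orthogonal in the energy inner product, so each Galerkin coefficient decouples.

```python
# Spectral Galerkin for -u'' = 1: project onto sin(k pi x) and solve
# each 1x1 "stiffness" equation, u_k = f_k / (k pi)^2.
import math

def galerkin_solution(n_modes):
    coeffs = []
    for k in range(1, n_modes + 1):
        f_k = 2.0 * (1.0 - math.cos(k * math.pi)) / (k * math.pi)  # load term
        coeffs.append(f_k / (k * math.pi) ** 2)
    return lambda x: sum(c * math.sin((k + 1) * math.pi * x)
                         for k, c in enumerate(coeffs))

u = galerkin_solution(51)
exact = lambda x: 0.5 * x * (1.0 - x)   # closed-form solution
print(abs(u(0.5) - exact(0.5)))         # error shrinks as modes are added
```

Finite elements replace the global sines with local hat functions, which couples neighboring coefficients into a sparse system, but the projection step is the same Galerkin recipe.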
[1]: https://arxiv.org/abs/1708.07469