
It's worth noting that unlike KF and particle filters, none of the deep learning stuff that's making headlines these days is actually capable of properly modeling / simulating / approximating a dynamic system with a continuous time variable.

Most striking in deep learning is that, when all is said and done (i.e. after unrolling the so-called "recurrent" part of a deep net), all you're left with is a dumb function from Rn to Rp.

In other words, while the term "recurrent" might fool you into believing it, there isn't any actual feedback loop in deep nets, and they therefore can't properly model something whose output evolves over time.

I think there's something to be explored in adding actual feedback into deep nets.

The price is of course that the output is not a vector anymore: you must compute feed forward passes until the net "settles" to a fixed point or rather until the desired behavior is observed at the outputs.

Backprop also becomes a whole new game in that setting.

[EDIT]: but the prize is: your deep net, instead of being an approximation of a function from Rn to Rp, becomes an approximation of something that can transform a function into another function.
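
To make the "settling" idea concrete, here's a minimal sketch in Python/NumPy of a net with a genuine feedback loop, iterated to a fixed point. The layer, its weights, and the tolerance are all made up for illustration (the small scale on W is just to make the iteration contract):

    import numpy as np

    def layer(z, x, W, U, b):
        # One "feedback" pass: the candidate output z is fed back in
        # alongside the external input x.
        return np.tanh(W @ z + U @ x + b)

    def settle(x, W, U, b, tol=1e-6, max_iter=500):
        # Iterate the layer on its own output until it reaches a fixed
        # point z* = layer(z*, x), i.e. until the net "settles".
        z = np.zeros(W.shape[0])
        for _ in range(max_iter):
            z_next = layer(z, x, W, U, b)
            if np.linalg.norm(z_next - z) < tol:
                break
            z = z_next
        return z

    # Toy usage with random weights (illustration only).
    rng = np.random.default_rng(0)
    W = 0.1 * rng.standard_normal((8, 8))   # kept small so the map contracts
    U = rng.standard_normal((8, 3))
    b = rng.standard_normal(8)
    x = rng.standard_normal(3)
    z_star = settle(x, W, U, b)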



"therefore can't properly model something whose output evolves over time."

I dunno, WaveNet seems to work pretty well... It's an autoregressive model, and absolutely responds to its own output.

https://deepmind.com/blog/article/wavenet-generative-model-r...
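
Rough sketch of what "responds to its own output" means for an autoregressive model; predict_next is a toy stand-in, not WaveNet's actual causal-convolution stack:

    def predict_next(history):
        # Stand-in for the real model: here just a decaying echo of the
        # last two samples.  WaveNet would be a deep causal-conv net.
        return 0.6 * history[-1] - 0.3 * history[-2]

    # Autoregressive generation: each new sample is appended to the
    # history and fed back in as input for the next step.
    samples = [0.0, 1.0]
    for _ in range(100):
        samples.append(predict_next(samples))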


You are confusing that recurrent networks are trained unrolled, i.e. with finite history, with them being evaluated with finite history, which they generally aren't.
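
A sketch of that distinction, assuming a generic RNN cell with placeholder weights: training unrolls the cell for a fixed number of steps so backprop-through-time has a finite graph, but evaluation just keeps carrying the hidden state forward.

    import numpy as np

    def rnn_cell(h, x, Wh, Wx, b):
        # A single recurrent step; any RNN/GRU/LSTM cell fits here.
        return np.tanh(Wh @ h + Wx @ x + b)

    def evaluate(stream, Wh, Wx, b):
        # No unrolling limit at evaluation time: the hidden state is
        # carried forward for as long as the input stream lasts.
        h = np.zeros(Wh.shape[0])
        for x in stream:          # stream can be arbitrarily long
            h = rnn_cell(h, x, Wh, Wx, b)
            yield h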

The key advantage a Kalman filter has is that its long-term properties are predictable: things like observability, controllability, and BIBO stability.

The key downside of Kalman filters is that their nice properties only apply to linear systems and linear observations.


Check out the Extended Kalman filter for non-linear systems.


Maybe better to check the unscented Kalman filter instead, because the extended one is a mathematical horror.


The EKF works by linearizing the process, and suffers on even moderately nonlinear models as a result. It also suffers from the Gaussian assumption. So people switched to the unscented filter (tries to model the underlying probability better) and particle filters (same idea), each giving better accuracy on almost all problems. Now plenty of applications are moving to full Bayesian models as the ultimate in accuracy.

So if you like KF, EKF, UKF, be sure to look into this entire chain of accuracy vs computation tradeoff algorithms.
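
For a flavor of where the EKF's weakness comes from, here's a hedged sketch of its predict step in Python/NumPy on a toy nonlinear model (both the model and its Jacobian are made up for illustration): the mean is pushed through the full nonlinearity, but the covariance only ever sees the local linearization.

    import numpy as np

    def f(x, dt=0.1):
        # Toy nonlinear process model.
        return np.array([x[0] + dt * np.cos(x[1]), x[1] + dt * x[2], x[2]])

    def jacobian_f(x, dt=0.1):
        # Jacobian of f at x; the EKF linearizes around the current
        # estimate, which is exactly where it loses accuracy on
        # strongly nonlinear models.
        return np.array([
            [1.0, -dt * np.sin(x[1]), 0.0],
            [0.0,  1.0,               dt ],
            [0.0,  0.0,               1.0],
        ])

    def ekf_predict(x, P, Q):
        F = jacobian_f(x)
        x_pred = f(x)                 # nonlinear mean propagation
        P_pred = F @ P @ F.T + Q      # *linearized* covariance propagation
        return x_pred, P_pred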


Regarding dynamical systems, Neural ODEs hold promise as they are more analogous to numerical solvers. I think with any dynamical system, a purely data-driven approach can be problematic because, as you say, the standard NN architectures are not great for it. You can, however, add physical constraints, e.g. include the Lagrangian in the objective function, to bring stability to these NN emulators.
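
A minimal sketch of the Neural-ODE idea, assuming a toy learned vector field and plain fixed-step Euler integration (real implementations use adaptive solvers and differentiate through, or adjoint around, the solve); a physics-based term such as a Lagrangian penalty would be added to the training loss rather than to this loop.

    import numpy as np

    def f_theta(t, z, W1, b1, W2, b2):
        # Small MLP playing the role of the learned vector field dz/dt.
        h = np.tanh(W1 @ np.concatenate(([t], z)) + b1)
        return W2 @ h + b2

    def integrate(z0, t0, t1, params, n_steps=100):
        # Euler integration of the learned dynamics from t0 to t1.
        z, t = z0.copy(), t0
        dt = (t1 - t0) / n_steps
        for _ in range(n_steps):
            z = z + dt * f_theta(t, z, *params)
            t += dt
        return z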


The KF algorithm, when laid out as:

1. Figure out the initial state

2. use process model to predict state at next time step

3. adjust the belief about the true state by taking into account the prediction uncertainty

4. get a measurement and a belief in it according to its uncertainty/accuracy

5. compute residual

6. compute scaling factor (what's more important, measurement or prediction)

7. set state with scaling factor

8. update belief in state given uncertainty of measurement

pretty much allows plugging in a black box anywhere, from the process model to the belief updates (see the sketch below).
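
Here's a minimal linear-KF step in Python/NumPy that maps roughly onto steps 2-8; the model matrices F, H, Q, R are whatever the problem supplies, and any of these lines is a place where a black box could be swapped in.

    import numpy as np

    def kf_step(x, P, z, F, H, Q, R):
        # 2-3: predict the state and grow the uncertainty via the process model
        x_pred = F @ x
        P_pred = F @ P @ F.T + Q
        # 4-5: a measurement z arrives; compute the residual
        y = z - H @ x_pred
        # 6: scaling factor (Kalman gain) weighing measurement vs prediction
        S = H @ P_pred @ H.T + R
        K = P_pred @ H.T @ np.linalg.inv(S)
        # 7-8: blend prediction and measurement, then shrink the uncertainty
        x_new = x_pred + K @ y
        P_new = (np.eye(len(x)) - K @ H) @ P_pred
        return x_new, P_new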


The Missile Knows Where It Is... https://www.youtube.com/watch?v=bZe5J8SVCYQ


The video feels like a weird "Monty Python explains Kalman filters" sketch.


Pretty funny video! I like the first comment on YouTube "It all gets worse when the missile knows where you are".


> you must compute feed forward passes until the net "settles" to a fixed point

Deep Equilibrium Models http://implicit-layers-tutorial.org/deep_equilibrium_models/


ODEs don't have feedback either. Their evolution is a function of the point-in-time state, just like recurrent nets.

But you can build great priors with a well-modeled ODE.


> but the prize is: your deep net, instead of being an approximation of a function from Rn to Rp, becomes an approximation of something that can transform a function into another function.

Won't it be more like an approximation of a process rather than an approximation of a function? (I mean, better described as that.)


That's not true if you formulate your DL system in the same "two-step" way as the Kalman filter. Stuff like DynaNet or "backprop Kalman" are really good deep-learning extensions of the Kalman filter for nonlinear / perceptual inputs.
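
For the general shape of that idea (not the actual DynaNet or backprop-KF architecture), a hedged sketch: a learned encoder turns a perceptual input into a pseudo-measurement plus a noise estimate, and a standard KF update sits on top. In practice this would be written in an autodiff framework so the training loss backpropagates through the filter into the encoder; the NumPy below only shows the structure.

    import numpy as np

    def encoder(image, params):
        # Stand-in for a learned network that turns a raw perceptual input
        # into a low-dimensional pseudo-measurement z and its noise R.
        W, b = params
        feat = np.tanh(W @ image.ravel() + b)
        z, log_r = feat[:2], feat[2:4]
        return z, np.diag(np.exp(log_r))

    def filter_step(x, P, image, enc_params, F, H, Q):
        # Standard KF update, but the "sensor" is a neural net.  Because
        # every operation is differentiable, a loss on the filtered state
        # can be pushed back through the gain into the encoder weights.
        z, R = encoder(image, enc_params)
        x_pred, P_pred = F @ x, F @ P @ F.T + Q
        S = H @ P_pred @ H.T + R
        K = P_pred @ H.T @ np.linalg.inv(S)
        x_new = x_pred + K @ (z - H @ x_pred)
        P_new = (np.eye(len(x)) - K @ H) @ P_pred
        return x_new, P_new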


How do you explain results like https://arxiv.org/abs/1708.01885, which uses LSTMs to do exactly what a Kalman filter would do except with less manual work?


Neural macros? :)



