Although I have only skimmed the paper, I think it's kinda trying to say (although someone with a better mathematical background than me might poke me for this) that the reward hypothesis (http://www.incompleteideas.net/rlai.cs.ualberta.ca/RLAI/rewa...) - the notion that every goal or purpose can be framed as the maximization of a real-valued function - doesn't really apply most of the time. This is intuitively agreeable even without the math - do we really think that the many things we do in our lives are performed to optimize an "oracle" loss function? The human mind is composed of ridiculously complex systems of neurons and cells that generate a variety of emergent behaviors, and saying that those emergent behaviors are actually the solution to a very complex optimization problem is very, very bold. Often reward functions are just abstractions of what we perceive (although they aren't entirely useless - keep in mind that all models are wrong but some are useful).
Although the paper is trying to say that the real number system isn't robust enough to express the goal/purpose of more complicated, "abstract" tasks, it speculates that a higher-order number system (such as the hyperreal or surreal numbers) would be able to achieve this. I currently disagree with this view - I view "intelligence" as we know it today more as an emergent phenomenon of complex systems of autonomous agents (in the case of human intelligence, an emergent phenomenon of neurons and other cells interacting with the external world), but that's a topic for another day.
I think you understood the basic gist of the paper quite well. That's a good way of describing it, and thanks for the link.
>it speculates that a higher-order number system (such as the hyperreal or surreal numbers) would be able to achieve this
I didn't mean to give that impression, sorry if it came off that way. Rather, what I say is that those number systems don't suffer the particular flaw that the real numbers suffer. There might still be other flaws. That's why in the beginning of Section 4 I wrote: "There are at least two potential ways to change RL so as to make it applicable to such tasks and, thus, at least potentially capable of leading to AGI. Of course, there is no guarantee that removing the roadblock in this paper will cause RL to lead to AGI. There might be other roadblocks besides the inadequate reward number system"