Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Yes, I mean in the article. Perhaps I was too brief, but the TL;DR is to use "finite automata based regex engines" and "backtracking based regex engines."

Basically, there's a long tradition of calling the latter "NFA engines" and the former "DFA engines." But doing so is inaccurate and confusing. Especially since "DFA engines" also include things called "NFA engines" that are different from the former "NFA engines" as used when describing backtracking engines.

If that doesn't make sense, then just remember this: DFAs and NFAs have equivalent expressive power. So if you're describing a regex engine that gives you thinks like backreferences, recursion and all sorts of other bells and whistles, then it can't be an NFA. It is almost certainly implemented with backtracking.

> Do you have any tips for learning as much as you have aside from mulling over papers and existing implementations?

I started with Russ Cox's article series and read the RE2 source code. Then I built my own.

I'd also check out anything related to Hyperscan. It's a deep rabbit hole to tumble down though.



Understood, this helped me a lot, thank you. I definitely had confusion around DFA/NFA being a _type of regex engine implementation_ and see now why that is incorrect.

I updated the article[0] so as to hopefully not confuse others, really appreciate you taking the time to respond to me again, and sorry for not understanding your earlier comment :)

[0] https://github.com/hexops/devlog/commit/0159b510556c2943cbca...


No worries. Feel free to email me if you want to geek out about regexes. :-) My email is on my web site, linked in my profile.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: