So as you're probably aware, languages arise from the composition of a few simpler steps:

1. Strings to ASTs (parsing and, for efficiency and ergonomics, lexing)

2. AST to "outputs" (interpretation)

3. AST to "better" AST (optimization)

4. AST to "analysis outputs" (verification, maybe type checking)

5. AST to some other language, maybe assembly (compilation/transpilation, sometimes called codegen, depending on how you look at it)

Pick whichever of these interest you most! To be honest, that most likely won't be parsing or lexing, but if it is, you're in luck, because there's a lot of research here.

Otherwise, get just enough parsing working that you can stop worrying about it. Parse a Lisp-like language (very easy) or use an embedded DSL, i.e. build the AST directly in a host language (harder, more flexible).
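To give a sense of how little parsing a Lisp-like syntax needs, here's a rough sketch in Python. The names (tokenize, atom, read, parse) are just ones I picked for illustration, and the AST is nothing fancier than nested Python lists with symbols as strings.

```python
def tokenize(src):
    # Pad parentheses with spaces so str.split() separates them into tokens.
    return src.replace("(", " ( ").replace(")", " ) ").split()

def atom(tok):
    # Anything int() or float() accepts becomes a number;
    # everything else stays a symbol (a plain str).
    for cast in (int, float):
        try:
            return cast(tok)
        except ValueError:
            pass
    return tok

def read(tokens):
    # Consume exactly one expression from the front of the token list.
    if not tokens:
        raise SyntaxError("unexpected end of input")
    tok = tokens.pop(0)
    if tok == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(read(tokens))
        if not tokens:
            raise SyntaxError("missing ')'")
        tokens.pop(0)  # drop the closing ')'
        return items
    if tok == ")":
        raise SyntaxError("unexpected ')'")
    return atom(tok)

def parse(src):
    tokens = tokenize(src)
    ast = read(tokens)
    if tokens:
        raise SyntaxError("trailing input after expression")
    return ast

print(parse("(+ 1 (* 2 3))"))  # ['+', 1, ['*', 2, 3]]
```

That's the whole of step (1) for a toy language: no grammar, no parser generator, and you can move on to the interesting parts.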

From here, focus on step (2), as it's a significant design challenge and will get your feet very wet in what PLs are. You may never have to focus on the others, depending on your proclivities and the styles of languages you want to build.
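As a rough idea of what step (2) looks like, here's a small tree-walking interpreter sketch in Python over a nested-list AST (symbols as strings, numbers as literals). The special forms (if, let, lambda) and the BUILTINS table are my own choices for the example, not anything canonical; deciding what goes there is exactly the design challenge.

```python
import operator

# Assumed starting environment: a handful of primitive operations.
BUILTINS = {"+": operator.add, "-": operator.sub,
            "*": operator.mul, "<": operator.lt}

def evaluate(ast, env):
    if isinstance(ast, str):       # symbol: look the variable up
        return env[ast]
    if not isinstance(ast, list):  # literal number: evaluates to itself
        return ast
    head, *args = ast
    if head == "if":               # (if cond then else)
        cond, then, other = args
        return evaluate(then if evaluate(cond, env) else other, env)
    if head == "let":              # (let name value body)
        name, value, body = args
        return evaluate(body, {**env, name: evaluate(value, env)})
    if head == "lambda":           # (lambda (params...) body)
        params, body = args
        return lambda *vals: evaluate(body, {**env, **dict(zip(params, vals))})
    # Anything else is a function call: evaluate the head, then the arguments.
    fn = evaluate(head, env)
    return fn(*[evaluate(a, env) for a in args])

print(evaluate(["+", 1, ["*", 2, 3]], BUILTINS))                   # 7
print(evaluate([["lambda", ["n"], ["+", "n", 1]], 41], BUILTINS))  # 42
```

Even at this size you're already making real language decisions: strict vs. lazy arguments, lexical vs. dynamic scope, what counts as a special form.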

Steps (3), (4), and (5) are all deep rabbit holes in themselves. (3) and (5) are mostly needed if you want to know how to make languages fast, although (5) is also necessary if you want to target a particular platform, e.g. to write something that runs on the Erlang VM. (4) is REALLY interesting, in my opinion, but you can treat it as optional.
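For a taste of each of these, here's a toy sketch in Python over a nested-list AST of binary arithmetic expressions. Everything in it is an assumption made for illustration: fold is a constant-folding pass for (3), free_vars is a deliberately trivial analysis standing in for (4), and compile_expr targets a made-up stack machine (PUSH/LOAD/OP) for (5), with run as its interpreter. Real optimizers, checkers, and backends go far deeper.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(ast):
    # Step 3: replace operator calls on two literal numbers with their value.
    if not isinstance(ast, list):
        return ast
    head, *args = ast
    args = [fold(a) for a in args]
    if (isinstance(head, str) and head in OPS and len(args) == 2
            and all(isinstance(a, (int, float)) for a in args)):
        return OPS[head](*args)
    return [head, *args]

def free_vars(ast):
    # Step 4 (a very small analysis): which variables does the expression need?
    if isinstance(ast, str):
        return {ast}
    if isinstance(ast, list):
        return set().union(*(free_vars(a) for a in ast[1:]))
    return set()

def compile_expr(ast, code=None):
    # Step 5: emit postfix instructions for the hypothetical stack machine.
    code = [] if code is None else code
    if isinstance(ast, list):
        head, lhs, rhs = ast
        compile_expr(lhs, code)
        compile_expr(rhs, code)
        code.append(("OP", head))
    elif isinstance(ast, str):
        code.append(("LOAD", ast))
    else:
        code.append(("PUSH", ast))
    return code

def run(code, env):
    # Execute the instruction list; the result is whatever is left on the stack.
    stack = []
    for instr, arg in code:
        if instr == "PUSH":
            stack.append(arg)
        elif instr == "LOAD":
            stack.append(env[arg])
        else:
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(OPS[arg](lhs, rhs))
    return stack.pop()

expr = fold(["+", "x", ["*", 2, 3]])
print(expr)                # ['+', 'x', 6]
print(free_vars(expr))     # {'x'}
print(compile_expr(expr))  # [('LOAD', 'x'), ('PUSH', 6), ('OP', '+')]
print(run(compile_expr(expr), {"x": 1}))  # 7
```

Note that all three are just functions from an AST to something else, which is why you can pick them up independently once (1) and (2) are out of the way.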