For most people who just want to play around and are using MacOS or Windows, I'd just recommend lmstudio.ai. Nice interface, with super easy searching and downloading of new models.
on windows it's an unsigned binary , backed by a website that only indicates a twitter/discord/github as far as explaining their organization, and the github doesn't include source on the client itself, only models.
If by lower-end Macbook air, you mean with 8GB of memory, try the smaller models (Such as Orca Mini 3B). You can do this via LM Studio, Oogabooga/text-generation-webui, KoboldCPP, GPT4all, ctransformers, and more.
I'm biased since I work on Ollama, and if you want to try it out:
32GB should be fine. I went a little overboard and got a new MBP with M2 MAX and 96GB, but the hardware is really best suited at this point to a 30B model. I can and do play around with 65B models, but at that point you're making a fairly big tradeoff in generation speed for an incremental increase in quality.
As a datapoint, I have a 30B model [0] loaded right now and it's using 23.44GB of RAM. Getting around 9 tokens/sec, which is very usable. I also have the 65B version of the same model [1] and it's good for around 3.6 tokens/second, but it uses 44GB of RAM. Not unusably slow, but more often than not I opt for the 30B because it's good enough and a lot faster.
Local memory management will definitely get better in the future.
For now:
You should have at least 8 GB of RAM to run the 3B models, 16 GB to run the 7B models, and 32 GB to run the 13B models.
My personal recommendation is to get as much memory as you can if you want to work with local models [including VRAM if you are planning to be executing on GPU]
By lower-end I meant that the Airs are quite low-end in general (compared to Pro/Studio). I have the maxed-out 24gb, but 16gb may be more common among people who might use an Air for this kind of thing.