I tried something not dissimilar, but without AI models: having webpages rendered "headless", yet with no real way for the page to detect it was headless, because it was a real browser running in an actual X session under Xephyr. I just never showed that Xephyr X server, so I never saw what the browser was rendering.
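Roughly the shape of it, as a minimal sketch (the display number, geometry, browser and URL are placeholders, not my exact setup):

    import os
    import subprocess
    import time

    # Sketch: run a real browser on a nested Xephyr display that is never
    # looked at. Display number, geometry, browser and URL are placeholders.
    DISPLAY = ":2"
    xephyr = subprocess.Popen(["Xephyr", DISPLAY, "-screen", "1280x800"])
    time.sleep(1)  # crude wait for the nested X server to come up

    env = dict(os.environ, DISPLAY=DISPLAY)
    browser = subprocess.Popen(["firefox", "https://example.com"], env=env)

    browser.wait()
    xephyr.terminate()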

I'd then detect what were ads, cover them with gray rectangles, and only show the page visually once the ads were covered.
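The covering step is essentially just drawing over ad bounding boxes before showing anything; here's a sketch with Pillow, where the screenshot path and box coordinates are made up (detecting the boxes is the actual hard part and isn't shown):

    from PIL import Image, ImageDraw

    # Sketch: gray out ad regions in a rendered screenshot before displaying it.
    # The file name and bounding boxes below are placeholders.
    page = Image.open("rendered_page.png")
    draw = ImageDraw.Draw(page)

    ad_boxes = [(100, 80, 828, 170), (900, 200, 1060, 800)]  # (x0, y0, x1, y1)
    for box in ad_boxes:
        draw.rectangle(box, fill=(128, 128, 128))

    page.save("rendered_page_no_ads.png")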

I just did it as a quick proof-of-concept: there are plenty of different ways to do this, but I liked that one. It's not dissimilar to services that render a webpage on x different devices without you needing to open that webpage on all of them.

But the issue is that while it's relatively easy to get rid of ads, it's near impossible to get rid of submarine articles/blogs, and it's getting harder by the day to get rid of all the pointless webpages generated by ChatGPT or other LLMs that are flooding the web.

Meanwhile, sticking to the sites I know (Wikipedia / HN / used car sales websites / a few forums I frequent, etc.) and running a local DNS resolver that blocks hundreds of thousands of domains (and entire countries) is quite effective. I run unbound, which I like a lot because it's got many features and can block domains/subdomains using wildcards.
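For what it's worth, the blocking side in unbound is just local-zone entries; a zone covers everything under it, so it behaves like a wildcard. A sketch (the file path and domains are only examples):

    # /etc/unbound/unbound.conf.d/blocklist.conf  (example path)
    server:
        # always_nxdomain makes the zone and every subdomain under it
        # resolve to NXDOMAIN; the domains below are just examples.
        local-zone: "doubleclick.net." always_nxdomain
        local-zone: "googlesyndication.com." always_nxdomain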

I'm pretty sure detecting and covering ads before displaying a webpage can be done, but I'd say the bigger problem is submarines and terribly poor-quality LLM-generated webpages overall.

So basically: is it even worth detecting ads when the web now has a much bigger problem than ads?



Absolutely, it's still worth blocking ads. They abuse my compute rendering client-side. Off with their heads. Presumably you could use AI to sniff out putative AI-generated content in text and flag or block that too.



