Not really. Once the HTTP upgrade handshake is done, a websocket is "just" a long-lived TCP connection with some light message framing on top.
If the API provider is using a language+framework that can't easily handle lots of concurrent TCP connections, then yes, that could cause scaling issues, but... that's a choice.
There are plenty of ways to build scalable socket servers, and it's especially feasible in languages like Go and Elixir.
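To make the Go point concrete, here's a rough sketch of the goroutine-per-connection pattern. It uses plain TCP with a line-based echo instead of a real websocket library, since the standard library doesn't ship one; the function names (`serve`, `handle`, `echoMany`, `runDemo`) are made up for the example, and it's an illustration, not a production server.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"sync"
)

// serve accepts connections until the listener is closed and hands each
// one to its own goroutine. Goroutines are cheap, so one per connection
// is the usual starting point in Go.
func serve(ln net.Listener) {
	for {
		conn, err := ln.Accept()
		if err != nil {
			return // listener was closed
		}
		go handle(conn)
	}
}

// handle echoes every newline-terminated line back to the client.
func handle(conn net.Conn) {
	defer conn.Close()
	scanner := bufio.NewScanner(conn)
	for scanner.Scan() {
		fmt.Fprintf(conn, "%s\n", scanner.Text())
	}
}

// echoMany opens n concurrent client connections to addr, sends one line
// on each, and returns how many clients got the same line back.
func echoMany(addr string, n int) int {
	var wg sync.WaitGroup
	var mu sync.Mutex
	ok := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			conn, err := net.Dial("tcp", addr)
			if err != nil {
				return
			}
			defer conn.Close()
			msg := fmt.Sprintf("hello %d", id)
			fmt.Fprintf(conn, "%s\n", msg)
			reply, err := bufio.NewReader(conn).ReadString('\n')
			if err == nil && reply == msg+"\n" {
				mu.Lock()
				ok++
				mu.Unlock()
			}
		}(i)
	}
	wg.Wait()
	return ok
}

// runDemo starts the server on a free loopback port and runs n clients.
func runDemo(n int) (int, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer ln.Close()
	go serve(ln)
	return echoMany(ln.Addr().String(), n), nil
}

func main() {
	got, err := runDemo(100)
	if err != nil {
		panic(err)
	}
	fmt.Println(got, "of 100 concurrent clients were echoed")
}
```

A real websocket server would swap `handle` for a handler from whatever websocket package you pick, but the shape (accept loop, one lightweight task per connection) stays the same.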
Server-side state can take many forms. In general, you already have to track state per customer anyway, and that often goes in a database. You can keep per-client state in the database too, if it won’t fit in memory for some reason.
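One way to keep that option open is to put per-client state behind a small interface, so the in-memory version can be swapped for a database-backed one later. This is only a sketch: `ClientState`, `StateStore`, `memStore` and `onMessage` are hypothetical names, and the fields are placeholders for whatever your connections actually need to remember.

```go
package main

import (
	"fmt"
	"sync"
)

// ClientState is a hypothetical per-connection record.
type ClientState struct {
	CustomerID string
	LastSeq    int // number of messages seen from this client
}

// StateStore is the seam: an in-memory map satisfies it below, and a
// database-backed implementation could satisfy it too.
type StateStore interface {
	Load(clientID string) (ClientState, bool)
	Save(clientID string, s ClientState)
}

// memStore keeps everything in a mutex-guarded map.
type memStore struct {
	mu sync.RWMutex
	m  map[string]ClientState
}

func newMemStore() *memStore {
	return &memStore{m: make(map[string]ClientState)}
}

func (s *memStore) Load(clientID string) (ClientState, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	st, ok := s.m[clientID]
	return st, ok
}

func (s *memStore) Save(clientID string, st ClientState) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[clientID] = st
}

// onMessage bumps the client's message counter and returns the new value.
// The load-then-save pair is not atomic, so this assumes each client's
// messages are handled by a single goroutine.
func onMessage(store StateStore, clientID, customerID string) int {
	st, _ := store.Load(clientID)
	st.CustomerID = customerID
	st.LastSeq++
	store.Save(clientID, st)
	return st.LastSeq
}

func main() {
	store := newMemStore()
	onMessage(store, "conn-1", "cust-42")
	onMessage(store, "conn-1", "cust-42")
	st, _ := store.Load("conn-1")
	fmt.Printf("%+v\n", st)
}
```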
idk, it all depends on what you’re using the websockets for. If you have to send messages from one websocket client to another, things can get harder because those two websockets might not be connected to the same server, so the servers need some way to pass messages between themselves.
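For the client-to-client case, one common approach (my assumption here, not the only option) is to put a shared pub/sub layer between the servers: every server publishes outgoing messages to it, and whichever server holds the recipient's connection delivers them. The sketch below fakes that with an in-memory `Broker`; in practice it would be an external system. All the names are hypothetical, the inbox slices stand in for open websocket connections, and there's no locking because the demo is single-threaded.

```go
package main

import "fmt"

// Message is addressed to a client ID, not to a server.
type Message struct {
	To   string
	Body string
}

// Broker is an in-memory stand-in for a shared pub/sub system that every
// server is subscribed to.
type Broker struct {
	servers []*Server
}

// Publish offers the message to every server and returns how many of
// them had the recipient connected locally.
func (b *Broker) Publish(m Message) int {
	delivered := 0
	for _, s := range b.servers {
		if s.deliverLocal(m) {
			delivered++
		}
	}
	return delivered
}

// Server holds only the clients that happen to be connected to it.
type Server struct {
	name    string
	broker  *Broker
	inboxes map[string][]string // stands in for open connections
}

// NewServer creates a server and subscribes it to the broker.
func NewServer(name string, b *Broker) *Server {
	s := &Server{name: name, broker: b, inboxes: make(map[string][]string)}
	b.servers = append(b.servers, s)
	return s
}

// Connect registers a client as connected to this server.
func (s *Server) Connect(clientID string) {
	s.inboxes[clientID] = nil
}

// Inbox returns the messages delivered to a locally connected client.
func (s *Server) Inbox(clientID string) []string {
	return s.inboxes[clientID]
}

// deliverLocal delivers the message only if the recipient is connected here.
func (s *Server) deliverLocal(m Message) bool {
	if _, ok := s.inboxes[m.To]; !ok {
		return false
	}
	s.inboxes[m.To] = append(s.inboxes[m.To], m.Body)
	return true
}

// Send routes via the broker, so the sender's server does not need to
// know which server the recipient is on.
func (s *Server) Send(to, body string) bool {
	return s.broker.Publish(Message{To: to, Body: body}) > 0
}

func main() {
	b := &Broker{}
	s1 := NewServer("s1", b)
	s2 := NewServer("s2", b)
	s1.Connect("alice")
	s2.Connect("bob")
	s1.Send("bob", "hi from alice") // alice is on s1, bob is on s2
	fmt.Println(s2.name, "delivered to bob:", s2.Inbox("bob"))
}
```

The tradeoff is that every server sees every message; past a certain size you'd probably want per-client or per-room channels instead of one global fan-out.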