Do the homeservers act as signaling, STUN and TURN servers, or are additional components necessary? Will the SFU also be part of Synapse in the future, or will these things be split?
The future SFU will similarly be split from the homeserver, with the initial implementation based on either Signal-Calling-Service, ion-sfu or mediasoup (we're evaluating all three). Of course, the point of being standards-based is that you'll be able to mix & match SFUs and MCUs from other vendors.
As I understand it, signalling will happen over the Matrix protocol. Since Matrix/Element already has 1:1 voice chat, Synapse integrates well with coturn, so you typically run coturn alongside Synapse.
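
To give a rough idea of what that integration looks like: as far as I know, Synapse and coturn share a secret, and the homeserver hands clients short-lived TURN credentials derived from it (the scheme often called the TURN REST API). Below is a minimal Python sketch of that derivation, not Synapse's actual code. The function name `turn_credentials`, the user ID and the secret are made up for illustration.

```python
import base64
import hashlib
import hmac
import time
from typing import Optional


def turn_credentials(user_id: str, shared_secret: str,
                     lifetime_s: int = 3600,
                     now: Optional[float] = None) -> dict:
    """Derive time-limited TURN credentials from a shared secret.

    Hypothetical helper: the username embeds an expiry timestamp and the
    password is the base64-encoded HMAC-SHA1 of that username.
    """
    current = time.time() if now is None else now
    expiry = int(current + lifetime_s)
    # The username carries the expiry so the TURN server can reject stale logins.
    username = f"{expiry}:{user_id}"
    digest = hmac.new(shared_secret.encode(), username.encode(),
                      hashlib.sha1).digest()
    password = base64.b64encode(digest).decode()
    return {"username": username, "password": password, "ttl": lifetime_s}


if __name__ == "__main__":
    # Placeholder values, not real credentials.
    print(turn_credentials("@alice:example.org", "not-a-real-secret"))
```

The TURN server can recompute the same HMAC from the username and its copy of the secret, so no per-user accounts have to be stored on the coturn side.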