I think your first question is open for people to explore.
The answer to the second is yes: it's all vector embeddings, and they're aligned to each other by training on a dataset of matched pairs (e.g. images with captions).
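To make the "matched pairs" idea concrete, here's a rough sketch of one common way to do the alignment: a symmetric contrastive loss in the style of CLIP. I'm assuming that setup for illustration; it isn't necessarily what any particular model does. The function names, the temperature of 0.07, and the random vectors standing in for real encoder outputs are all made up for the example.

```python
import numpy as np

def l2_normalize(x):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def contrastive_alignment_loss(image_emb, text_emb, temperature=0.07):
    # Row i of image_emb and row i of text_emb are assumed to be a matched pair.
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_emb)
    logits = img @ txt.T / temperature  # pairwise similarity matrix
    idx = np.arange(len(img))
    # Each image should pick out its own caption, and each caption its own image.
    loss_i2t = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_t2i = -log_softmax(logits, axis=0)[idx, idx].mean()
    return (loss_i2t + loss_t2i) / 2

# Toy data (hypothetical): random vectors in place of real encoder outputs.
rng = np.random.default_rng(0)
image_emb = rng.normal(size=(8, 16))
aligned_text = image_emb + 0.05 * rng.normal(size=(8, 16))  # near-copies of the images
unrelated_text = rng.normal(size=(8, 16))                    # no relation to the images

aligned_loss = contrastive_alignment_loss(image_emb, aligned_text)
unrelated_loss = contrastive_alignment_loss(image_emb, unrelated_text)
print(aligned_loss < unrelated_loss)
```

In real training you'd backpropagate this loss into both encoders so that matched pairs end up near each other in the shared space; here it only scores how well two sets of vectors already line up.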
I think the real use for exotic embeddings will have to be in analyzing the embeddings themselves; otherwise it's easier to shove normal vectors downstream into other models.