If diffing datasets within the same physical database: generate SQL, execute it in that database, then analyze and render the results.
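
As a rough illustration of the in-database path, the sketch below generates one SQL statement that classifies keys as added, removed, or changed, and runs it where the data lives. It is not the tool's actual generated SQL. The helper names (`build_diff_sql`, `diff_in_database`), the `orders_old`/`orders_new` tables, and the use of SQLite as the database are all assumptions made for the example.

```python
import sqlite3


def build_diff_sql(table_a, table_b, key, columns):
    """Build one SQL statement listing the keys whose rows differ.

    Hypothetical helper. Identifiers are interpolated directly, so they
    must come from trusted input.
    """
    # `IS NOT` is SQLite's null-safe inequality; other dialects would use
    # `IS DISTINCT FROM` or an equivalent.
    changed = " OR ".join(f"a.{c} IS NOT b.{c}" for c in columns)
    return f"""
        SELECT a.{key} AS pk, 'removed' AS status
        FROM {table_a} a LEFT JOIN {table_b} b ON a.{key} = b.{key}
        WHERE b.{key} IS NULL
        UNION ALL
        SELECT b.{key}, 'added'
        FROM {table_b} b LEFT JOIN {table_a} a ON a.{key} = b.{key}
        WHERE a.{key} IS NULL
        UNION ALL
        SELECT a.{key}, 'changed'
        FROM {table_a} a JOIN {table_b} b ON a.{key} = b.{key}
        WHERE {changed}
        ORDER BY 1
    """


def diff_in_database(conn, table_a, table_b, key, columns):
    """Execute the generated SQL in the database and return (key, status) rows."""
    return conn.execute(build_diff_sql(table_a, table_b, key, columns)).fetchall()


# Made-up sample data, only to exercise the query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders_old (id INTEGER PRIMARY KEY, amount INTEGER);
    CREATE TABLE orders_new (id INTEGER PRIMARY KEY, amount INTEGER);
    INSERT INTO orders_old VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO orders_new VALUES (1, 10), (2, 25), (4, 40);
""")
result = diff_in_database(conn, "orders_old", "orders_new", "id", ["amount"])
print(result)  # [(2, 'changed'), (3, 'removed'), (4, 'added')]
```

Only the small result set leaves the database; the comparison itself runs where the data is stored.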
If diffing datasets across physically different databases (e.g. PostgreSQL <> Snowflake, or two distinct MySQL servers): pull the data from both sources into our engine, diff it there, and show the results.
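
The cross-database path can be sketched in the same spirit: fetch rows from each source over its own connection, then compare them in memory. This is a minimal illustration, not the engine's real implementation. Two separate in-memory SQLite connections stand in for the two servers, and `fetch_rows`, `diff_rows`, and the `users` table are hypothetical names.

```python
import sqlite3


def fetch_rows(conn, table, key, columns):
    """Pull rows from one source into a dict of key -> tuple of column values."""
    cols = ", ".join([key, *columns])
    return {row[0]: row[1:] for row in conn.execute(f"SELECT {cols} FROM {table}")}


def diff_rows(rows_a, rows_b):
    """Compare two key -> values mappings entirely in memory."""
    common = rows_a.keys() & rows_b.keys()
    return {
        "removed": sorted(rows_a.keys() - rows_b.keys()),
        "added": sorted(rows_b.keys() - rows_a.keys()),
        "changed": sorted(k for k in common if rows_a[k] != rows_b[k]),
    }


# Two independent connections stand in for two physically different databases.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
source.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
    INSERT INTO users VALUES (1, 'a@example.com'), (2, 'b@example.com');
""")
target.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
    INSERT INTO users VALUES (1, 'a@example.com'), (2, 'b@corp.example'),
                             (3, 'c@example.com');
""")

cross_diff = diff_rows(
    fetch_rows(source, "users", "id", ["email"]),
    fetch_rows(target, "users", "id", ["email"]),
)
print(cross_diff)  # {'removed': [], 'added': [3], 'changed': [2]}
```

Because neither database can see the other, every compared row has to travel to the engine, which is why this path tends to cost more than the in-database one.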
Sampling is optional, but it helps keep compute costs low for large datasets with millions, billions, or trillions of rows.
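
One possible way to sample for a diff is to hash the primary key, so that both sides select exactly the same keys and sampled rows stay comparable. The source does not say how sampling is actually done, so the sketch below is an assumption throughout; `in_sample` and the 1% rate are illustrative only.

```python
import hashlib


def in_sample(key, rate):
    """Return True if `key` is in the sample.

    The decision depends only on the key and the rate, so every dataset
    that contains the key makes the same choice.
    """
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to an integer in [0, 2**64).
    bucket = int.from_bytes(digest[:8], "big")
    return bucket < rate * 2**64


# Two overlapping, made-up key ranges standing in for two large tables.
keys_a = range(0, 100_000)
keys_b = range(50, 100_050)

sample_a = {k for k in keys_a if in_sample(k, 0.01)}
sample_b = {k for k in keys_b if in_sample(k, 0.01)}

# Keys present on both sides are sampled on both sides or on neither.
shared_a = {k for k in sample_a if k in keys_b}
shared_b = {k for k in sample_b if k in keys_a}
print(len(sample_a), shared_a == shared_b)
```

Random sampling done independently on each side would pick different rows and report spurious differences; keying the sample on a hash avoids that at the cost of one hash per row.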