I really like the idea of building TypeScript off of SQL types. But my ideal tool would be able to infer types from queries, not schemas. For example, if I did:
await db.query("SELECT user_id, COUNT(*) AS count FROM comments GROUP BY user_id");
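I'd want it to infer something like the following row type (written out by hand just to show the intent; this assumes user_id is an integer column, and COUNT(*) is a bigint in Postgres, so a driver may actually hand it back as a string):

type Row = {
  user_id: number
  count: number // or string, depending on how the driver maps bigint
}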
(If you mean infer solely off of queries, I don't think that's possible unless you jam a bunch of type casts into the field selection part of the query.)
import { Query } from "@codemix/ts-sql";

const db = {
  things: [
    { id: 1, name: "a", active: true },
    { id: 2, name: "b", active: false },
    { id: 3, name: "c", active: true },
  ],
} as const;

type ActiveThings = Query<
  "SELECT id, name AS nom FROM things WHERE active = true",
  typeof db
>;

// ActiveThings is now equal to the following type:
type Expected = [{ id: 1; nom: "a" }, { id: 3; nom: "c" }];
@jbaudanza Sorry, my response actually didn't reflect your suggestion at all!
Yeah, I would love that sort of tool! I think there's some magic you can do with ES6 template literal parsing, but that would be a bit complex to do as a supporting library.
No worries! I enjoyed checking out your implementation!
I think such a tool would be complicated to implement. I was thinking maybe of integrating with tagged template literals somehow. Maybe something like:
db.query(sql`SELECT COUNT(*) FROM users`)
And then some tool could parse the AST to find all the SQL and generate types.
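The generated output could be as simple as a .d.ts keyed off each query the tool finds; everything below is hypothetical and just sketches the shape of the idea:

// hypothetical generated file, produced by walking the AST for sql`...` tags
export interface CountUsersRow {
  count: string // COUNT(*) is a bigint, which node-postgres returns as a string
}
// db.query(sql`SELECT COUNT(*) FROM users`) would then resolve to Promise<CountUsersRow[]>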
How about keeping the query as a separate, importable .sql file that a tool can analyze and spit out typings for into a .sql.d.ts file? The editor support for mixing languages in a single file was not great, last time I checked.
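Something like: activeUsers.sql checked into the repo, and a generated activeUsers.sql.d.ts next to it (all names here are invented):

// activeUsers.sql.d.ts (generated, do not edit)
declare const activeUsersQuery: string
export default activeUsersQuery
export interface ActiveUsersRow {
  id: number
  email: string
}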
sqlx is able to do this, but using compile time macros. For a typescript project you'd possibly need a running process reading your source files for queries and probing the database for what will come back.
These barely work. Even their example shows it generating `type : string[]` but `weaknesses : Weakness[]`. You would need to start from a schema definition (JSON Schema, SQL CREATE TABLE, etc.) to make the result any good.
Yeah the good thing about using such a low level tool is that everything remains typed until you reach the query statement. Postgres and typescript enums work great together.
For one of my projects I check in the generated types and replace 'unknown' (from jsonb) with the specific object type manually. That way the concept of any or unknown gets pushed mostly out of the codebase, at least for simple CRUD; when joining, we use normal SQL queries with some helper tools that verify the fields we are picking exist. Hopefully I'll manage to figure out how to do this with Postgres comments at some point.
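Concretely, the hand edit in the checked-in file looks something like this (type and column names invented):

// generated interface; `settings` came out of the generator as `unknown` because it's jsonb
export interface UserSettings {
  theme: 'light' | 'dark'
  locale: string
}
export interface users {
  id: number
  settings: UserSettings // was: unknown
}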
I am making a competitor to this and I am really happy that the approach seems to be gaining traction. Having the database as the source of truth and migrating it with SQL feels very right for me.
It does mean that architectural decisions start in the database as well, which I am trying to govern with "schema linting" ([link redacted]). If you are using Postgres, I would love help with building this.
I'm also massively in favour of having the DB as the source of truth. When I first started looking into building my own SaaS I actually tried to go down the database-first route, where most of your functionality runs inside Postgres triggers. But it wasn't nearly as easy to get things done, and SQL logic is much harder to test.
I figured the next best thing was to tie the database extremely closely into the backend via TypeScript (I rarely define custom types, and instead try to use Pick/Omit/Partial combinations all the way out to the client SDKs).
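As a small illustration of that style (the table interface here is a stand-in for whatever the generator emits, with invented columns):

// Stand-in for a generated table interface:
interface User {
  id: number
  email: string
  name: string
  password_hash: string
}

// API/SDK shapes derived from it rather than re-declared by hand:
type PublicUser = Omit<User, 'password_hash'>
type CreateUser = Pick<User, 'email' | 'name'>
type UpdateUser = Pick<User, 'id'> & Partial<Pick<User, 'email' | 'name'>>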
I ended up generating forms based on the metadata in Postgres (select fields get their values from enums, multiselects from enum arrays), so once you define a table in Postgres you just specify which fields you want to render (out of the available columns) and in what order, and it pretty much gives you a UI form that is rendered directly in React. Works really well when you have a form component library at your disposal.
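The metadata lookup itself is cheap; here's a minimal sketch with node-postgres for pulling the allowed values of one enum to feed a select field:

import { Pool } from 'pg'

const pool = new Pool()

// Returns the labels of a Postgres enum type, in their declared order.
async function enumValues(enumName: string): Promise<string[]> {
  const { rows } = await pool.query<{ enumlabel: string }>(
    `SELECT e.enumlabel
       FROM pg_enum e
       JOIN pg_type t ON t.oid = e.enumtypid
      WHERE t.typname = $1
      ORDER BY e.enumsortorder`,
    [enumName]
  )
  return rows.map((r) => r.enumlabel)
}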
I am looking to release the setup I use for deployment, but tbh it isn't tested (via actual tests; I rely on TypeScript for most of my data shuttling) and there's a lot to it, so I'd need to write a lot of documentation, which I don't have time for.
But the gist is, we write APIs like this (this can be improved with less generic JSON schemas):
export const updateSamf: APIFunction<UpdateSamf, void> = async ({ database }, { samfId, ...data }) => {
  if (Object.keys(data).length === 0) {
    throw new InvalidParametersError('No fields to update')
  }
  if (data.title && data.title.length < 3) {
    throw new InvalidParametersError('Title needs to be at least 3 characters')
  }
  if (data.tags) {
    data.tags = data.tags.map((t) => t.toLowerCase())
  }
  await database.crudUpdate<Samf>('samf', data, { samfId }, new SamfNotFoundError())
  // `content` and `previousCover` come from surrounding code that's omitted in this excerpt
  if (data.cover && previousCover) {
    await content.delete(previousCover)
  }
}
export const routes: APIRoutes = [{
  type: 'patch',
  route: StudioRoutePath.SAMF_CRUD,
  func: updateSamf,
  schema: 'UpdateSamf',
  permissions: isSamfOwner
}]
The form building part I don't know about releasing; I built it for a client that has a few dozen forms with hundreds of fields/options, which is hard to keep up with. So we settled on using a CSV file that is essentially a product owner's way of saying: I want this field, on this entity, of this type.
Our CI then checks that each field actually exists in the TypeScript schema, and if not the build fails on our development environment (and suggests the basic SQL to fix it). The company is small enough for us to be able to copy that SQL into a migration script and let it run without skipping a beat.
It's not a magic pill: renaming, deleting, and custom migrations need to be dealt with manually. But for the most part 80% of our changes are additions, and this way it just works.
There's a similar project (https://github.com/erik/squabble) that uses a portable version of Postgres' SQL parser to spit out an AST, which lints can then traverse.
This approach alleviates the need for bringing up a Postgres cluster to lint the schema.
I'd be curious to know if there's something to be gained by depending on a running database instance (especially because it complicates CI).
I usually run a Postgres DB (as a Docker image on CircleCI) when I'm running the build. I'm not sure how scalable it is (currently we have a hundred migrations and it takes a few seconds, so not bad).
The flow is:
- run migrations on an empty DB
- create the interfaces and enums
- run typescript
- derive all the json schemas from the interfaces for API validation
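In script form that flow is roughly the following (tool flags shown from memory, and the npm script names are placeholders for whatever your project actually wires up):

import { execSync } from 'child_process'

const run = (cmd: string) => execSync(cmd, { stdio: 'inherit' })

run('npm run db:migrate')                                                 // 1. migrations on an empty DB
run('npx schemats generate -c $DATABASE_URL -s public -o src/schema.ts') // 2. interfaces and enums
run('npx tsc --noEmit')                                                   // 3. run typescript
run('npm run api:json-schemas')                                           // 4. JSON schemas from the interfaces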
This means we gitignore the actual DB types, which is great for reducing the number of sources of truth and the issues that can come with them.
This all runs as a pre-commit hook as well. I guess by doing so we sort of validate that the SQL is legit via Postgres itself. The only issue we face, once a commit gets past the setup stage and tries to run the migration itself, is if there's a data inconsistency we weren't aware of (trying an enum value we thought wasn't used but was). But that sort of thing requires special tasks anyway, and the migration fails, so nothing ever goes weird (which is why I <3 SQL migrations).
Keep in mind that this might become problematic for folks that are running Postgres inside Docker on macOS or Windows, as IO performance is quite poor.
I put together an up/down migration validation system a while ago (start up two databases, apply all but the last up migration on A, and all up migrations plus the final down migration on B) for a pretty sizable schema.
Folks that were using Docker had to wait upwards of two minutes, while natively installed Postgres would finish in under 10sec.
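The check itself can stay fairly dumb; a sketch of the core idea, where the migration commands are placeholders and the real work is diffing schema-only dumps of the two databases:

import { execSync } from 'child_process'

const sh = (cmd: string) => execSync(cmd, { encoding: 'utf8' })

// A: all up migrations except the newest; B: all up migrations, then the newest down migration.
sh('npm run migrate -- --database check_a --to latest-1')   // placeholder commands
sh('npm run migrate -- --database check_b --to latest')
sh('npm run migrate:down -- --database check_b --steps 1')

// If the down migration truly reverses the last up migration, the schemas should match
// (in practice you may need to normalize the dumps before comparing them).
const a = sh('pg_dump --schema-only postgres://localhost/check_a')
const b = sh('pg_dump --schema-only postgres://localhost/check_b')
if (a !== b) throw new Error('down migration does not restore the previous schema')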
As long as their database in Docker is being incrementally updated (using the same migration scripts as on production, which only apply new ones), is this hit only taken when spinning up a clean Docker image?
I'm considering exporting a single SQL dump whenever we run the migrations, so our dev tools / tests can use that one large file. But I haven't found it as easy as export -> import to try out yet.
> As long as their database in Docker is being incrementally updated (using the same migration scripts as on production, which only apply new ones), is this hit only taken when spinning up a clean Docker image?
Yes, but being able to quickly return to a known-good state by trashing the database is still useful because getting migrations right without testing them is hard.
Using named volumes helps because they don't need to be bridged through to the host filesystem.
The difference, I suppose, is that by inspecting the schema you can make assertions that take the whole architecture into account, rather than just the specific piece of SQL in one migration. I don't personally run the tool in CI; I run it before generating types, which is when I am writing a new migration.
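For a flavour of what a whole-schema assertion looks like, one lint you might write is "every table must have a primary key", which boils down to a catalog query (node-postgres sketch):

import { Pool } from 'pg'

const pool = new Pool()

// Lint: list ordinary tables in the public schema that have no primary key constraint.
async function tablesWithoutPrimaryKey(): Promise<string[]> {
  const { rows } = await pool.query<{ relname: string }>(`
    SELECT c.relname
      FROM pg_class c
      JOIN pg_namespace n ON n.oid = c.relnamespace
     WHERE c.relkind = 'r'
       AND n.nspname = 'public'
       AND NOT EXISTS (
         SELECT 1 FROM pg_constraint con
          WHERE con.conrelid = c.oid AND con.contype = 'p'
       )`)
  return rows.map((r) => r.relname)
}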
Prisma is not mature: they keep putting major breaking changes in patch (not even minor) semver updates. They're the only package I've had to lock to a patch version.
They just underwent an obvious round of developer restructuring/hiring after their recent funding, which has caused some of the chaos.
Migrations are only recently out of alpha, don't support some common operations (rename?), and DOWN migrations are not supported.
Prisma is innovative, and there may be some fair tradeoffs to their declarative approach and fast pace. But I've never used a less mature ORM.
It looks like Kanel is actually a lot more feature-rich, but I wouldn't see any advantage in switching.
I started using schemats because it's a single CLI interface and gives me all I need. I put together a few tools for a TypeScript-heavy backend development experience, and one of my requirements was that each one be powerful enough to achieve my goals but not so powerful that I'd trip over how to use it (or, in schemats' case, patch it).
Currently schemats and ts-json-schema-generator got me there; the only issue was that schemats was a few years out of date, so the types weren't as friendly.
I've been dreaming up a TS SQL library that would use the new template literal type features to implement a full Postgres SQL parser in the type system. I imagine this would allow you to define a series of DB migration statements (alter table, add column, etc.) and roll the net result of that series of migrations up into a schema definition as a type.
Then you could take this schema type and feed it into some function as a generic to generate a raw query executor that could again fully parse the SQL and perfectly infer the result type.
I believe all this is possible but I don't really know where to start.
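Even a cut-down toy shows the machinery is there (nowhere near a real parser, but the template literal types do compose):

// Toy: pull the selected column and table out of a single-column SELECT.
type ParseSelect<Q extends string> =
  Q extends `SELECT ${infer Col} FROM ${infer Table}`
    ? { column: Col; table: Table }
    : never

type T = ParseSelect<'SELECT id FROM things'>
// T is { column: "id"; table: "things" }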
Author of ts-sql here. Shin Ando has forked my proof of concept / toy implementation and made it much more practical; if you're thinking about spending time on this idea, that's probably a better starting point: https://github.com/andoshin11/sqlpture
Hasura / GraphQL / GraphQL-codegen do pretty much exactly what you describe wrt fully typed query results, except you write GraphQL queries rather than SQL ones.
If you like this idea, Zapatos[1] does something similar but provides more comprehensive types (differentiated into what you get from a SELECT, what you can INSERT, etc.), plus some help using these types to interact with the database.
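The SELECT vs INSERT distinction matters for columns with defaults; in the abstract it's the difference between shapes like these (hand-written illustration of the pattern, not Zapatos's actual generated output):

interface UserSelectable {
  id: number        // always present when you read
  email: string
  created_at: Date
}
interface UserInsertable {
  id?: number       // serial, usually left to the database
  email: string
  created_at?: Date // has a DEFAULT, so it can be omitted on insert
}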
I wrote a similar tool for Go and Postgres but it also generates the serialization code. [1]
- I prefer generating code from queries instead of generating active record style models based on tables.
- I think it's worth it to spin up Postgres to get type info instead of trying to parse queries. Reimplementing Postgres' type inference is a Herculean task even with an AST. sqlc, which directly inspired pggen, does this.
- It's hard to support customization of generated code in a structured way. The way I'm leaning is: if you need customization, you use the SDK, which provides plugin hooks. Using flags for customization is the road to madness.
- What I would like to do is serialize directly to protobuf to avoid the serialization dance of Go structs to Proto structs.
I've been using a fork of schemats for a while, it is a great tool.
We quickly figured out that using the table definitions as our app-layer models wouldn't cut it, notably for handling polymorphism and non-ideal (sic) table design.
We decided to "hand write" (mostly codegen) the types we use in the business logic and have a translation layer between those and the interfaces generated by schemats. For simple tables the mapping is 1:1, but for complicated tables we centralize type checks and defensive assertions there. For polymorphism it would look like this:
CREATE TYPE invoice_line_type AS ENUM ('product', 'shipment');

CREATE TABLE invoice_lines (
  id SERIAL PRIMARY KEY,
  type invoice_line_type NOT NULL,
  product_id INT REFERENCES products(id),
  shipment_id INT REFERENCES shipments(id),
  amount NUMERIC NOT NULL,
  CHECK (CASE WHEN type = 'product' THEN product_id IS NOT NULL ELSE TRUE END),
  CHECK (CASE WHEN type = 'shipment' THEN shipment_id IS NOT NULL ELSE TRUE END)
);
The translation layer:
import { invoice_lines, invoice_line_type } from 'src/schema'

export type ProductInvoiceLine = {
  id: number
  type: invoice_line_type.product
  product_id: number
  amount: number
}

export type ShipmentInvoiceLine = {
  id: number
  type: invoice_line_type.shipment
  shipment_id: number
  amount: number
}

// ensureNumber: local helper (not shown) that asserts the nullable column is actually set
export const parse = (row: invoice_lines): ProductInvoiceLine | ShipmentInvoiceLine => {
  const base = {
    id: row.id,
    amount: row.amount,
  }
  switch (row.type) {
    case invoice_line_type.product:
      return {
        ...base,
        type: invoice_line_type.product,
        product_id: ensureNumber(row.product_id),
      }
    case invoice_line_type.shipment:
      return {
        ...base,
        type: invoice_line_type.shipment,
        shipment_id: ensureNumber(row.shipment_id),
      }
  }
}
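Callers then get a plain discriminated union, so narrowing falls out of the type system:

const line = parse(row) // row: invoice_lines, straight from the query
if (line.type === invoice_line_type.product) {
  // narrowed to ProductInvoiceLine: product_id is a number here, no null checks needed
  console.log(line.product_id)
}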
How does this relate to Prisma? https://www.prisma.io/ Do any of you have some experience with it? Also, I think it would be even better if there was a library like JOOQ for Java: https://www.jooq.org/ It not only generates your entity classes, but also gives a full-blown type-safe query DSL!
It copies Postgres comments over to JSDoc/TSDoc comments, emits some data about foreign key relationships, and supports TS types for json/jsonb columns via @type comments. Feel free to copy any of those features if you think they're good ideas :)
So far I have addressed the issues with using it on Node 14, sorted out package dependency issues, and brought it in line with modern TypeScript (removing the concept of namespaces and favouring Pick/Omit helpers instead).
That being said, given how lightweight the actual package is, I would be more than happy to address those issues when they're raised.
Seems like it generates TS types, but doesn’t help with data access? i.e. not trying to compete with tools that generate full data access layers, like Prisma https://www.prisma.io/ ?
If so, out of curiosity, what’s the use case for generating just the types? How are you actually getting data to/from the DB?
I think the comment I made above regarding our DB CRUD updates might help answer that!
But the short answer is we just use node-postgres. If the get/update/delete/insert is a simple fields/table/where, we use a helper function. If it's more complex, we write custom SQL and use the following approach:
database.query<DB.User & DB.Pet>(`
  SELECT *
  FROM "pet"
  LEFT JOIN "user" ON "pet".owner = "user".id
`)
What are the tradeoffs of generating unique UUID ids in the database vs in the code?
Compared to ORMs, is one of the main differences that the DB objects define the types and constraints, and those propagate out to the code, whereas ORMs start with code and propagate out to the DB?
Honestly I don't know what the tradeoffs are. I usually rely on the DB to create the id (regardless of whether it's SERIAL or UUID). The only time I create a UUID outside the DB is if the client needs an id in its initial state; for example, when tracking a user's progress through a form, it lets you skip a useless creation roundtrip before performing the first update.
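That pattern is basically "the client mints the id, the server upserts"; a sketch with node-postgres (table and column names invented):

import { randomUUID } from 'crypto'
import { Pool } from 'pg'

const pool = new Pool()

// The client-side draft already has an id, so the very first save can be an upsert
// instead of an INSERT just to obtain an id, followed by an UPDATE.
const draftId = randomUUID()
await pool.query(
  `INSERT INTO form_drafts (id, data)
   VALUES ($1, $2)
   ON CONFLICT (id) DO UPDATE SET data = EXCLUDED.data`,
  [draftId, JSON.stringify({ step: 1 })]
)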
Yeah, that's pretty much it! Less code in the functions -> fewer inconsistencies with the DB. And, for example, removing an enum value in the DB would make every front-end component using it fail, so it really does run across the whole pipeline.
One alternative approach: you could write SQL queries that generate the interfaces for a set of tables (or views), and customize those to your codebase/needs.
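A minimal version of that: read column metadata out of information_schema and print one interface per table (the type mapping here is deliberately naive):

import { Pool } from 'pg'

const pool = new Pool()

const tsType = (pgType: string): string =>
  /^(integer|bigint|smallint|numeric|double precision|real)$/.test(pgType) ? 'number'
  : pgType === 'boolean' ? 'boolean'
  : /^(timestamp|date)/.test(pgType) ? 'Date'
  : 'string'

async function printInterfaces(): Promise<void> {
  const { rows } = await pool.query<{
    table_name: string
    column_name: string
    data_type: string
    is_nullable: 'YES' | 'NO'
  }>(`SELECT table_name, column_name, data_type, is_nullable
        FROM information_schema.columns
       WHERE table_schema = 'public'
       ORDER BY table_name, ordinal_position`)

  const byTable = new Map<string, string[]>()
  for (const r of rows) {
    const field = `  ${r.column_name}: ${tsType(r.data_type)}${r.is_nullable === 'YES' ? ' | null' : ''}`
    byTable.set(r.table_name, [...(byTable.get(r.table_name) ?? []), field])
  }
  for (const [table, fields] of byTable) {
    console.log(`export interface ${table} {\n${fields.join('\n')}\n}`)
  }
}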
I've been generating Java code from Postgres (created by Liquibase migrations) using jOOQ; that Java code is then transformed into an OpenAPI spec using Springfox, which in turn is turned into any language of choice (such as TypeScript types). It's really good.