I really like the idea of building TypeScript off of SQL types. But my ideal tool would be able to infer types from queries, not schemas. For example, if I did:
await db.query("SELECT user_id, COUNT(*) AS count FROM comments GROUP BY user_id");
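I'd want it to infer something like the following row type (written out by hand just to show the intent; this assumes user_id is an integer column, and COUNT(*) is a bigint in Postgres, so a driver may actually hand it back as a string):

type Row = {
  user_id: number
  count: number // or string, depending on how the driver maps bigint
}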
(If you mean infer solely off of queries, I don't think that's possible unless you jam a bunch of type casts into the field selection part of the query.)
import { Query } from "@codemix/ts-sql";

const db = {
  things: [
    { id: 1, name: "a", active: true },
    { id: 2, name: "b", active: false },
    { id: 3, name: "c", active: true },
  ],
} as const;

type ActiveThings = Query<
  "SELECT id, name AS nom FROM things WHERE active = true",
  typeof db
>;

// ActiveThings is now equal to the following type:
type Expected = [{ id: 1; nom: "a" }, { id: 3; nom: "c" }];
@jbaudanza Sorry, my response actually didn't reflect your suggestion at all!
Yeah, I would love that sort of tool! I think there's some magic you can do with ES6 template literal parsing, but that would be a bit complex to do as a supporting library.
No worries! I enjoyed checking out your implementation!
I think such a tool would be complicated to implement. I was thinking maybe of integrating with tagged template literals somehow. Maybe something like:
db.query(sql`SELECT COUNT(*) FROM users`)
And then some tool could parse the AST to find all the SQL and generate types.
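The generated output could be as simple as a .d.ts keyed off each query the tool finds; everything below is hypothetical and just sketches the shape of the idea:

// hypothetical generated file, produced by walking the AST for sql`...` tags
export interface CountUsersRow {
  count: string // COUNT(*) is a bigint, which node-postgres returns as a string
}
// db.query(sql`SELECT COUNT(*) FROM users`) would then resolve to Promise<CountUsersRow[]>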
How about keeping the query as a separate, importable .sql file that a tool can analyze and spit out typings for into a .sql.d.ts file? The editor support for mixing languages in a single file was not great, last time I checked.
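Something like: activeUsers.sql checked into the repo, and a generated activeUsers.sql.d.ts next to it (all names here are invented):

// activeUsers.sql.d.ts (generated, do not edit)
declare const activeUsersQuery: string
export default activeUsersQuery
export interface ActiveUsersRow {
  id: number
  email: string
}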
sqlx is able to do this, but using compile time macros. For a typescript project you'd possibly need a running process reading your source files for queries and probing the database for what will come back.
These barely work. Even their example shows it generating `type : string[]` but `weaknesses : Weakness[]`. You would need to start from a schema definition (JSON Schema, SQL CREATE TABLE, etc.) to make the result any good.
Yeah the good thing about using such a low level tool is that everything remains typed until you reach the query statement. Postgres and typescript enums work great together.
For one of my projects I check in the generated types and replace 'unknown' (from jsonb) with the specific object type manually. That way the concept of any or unknown gets pushed mostly out of the codebase, at least for simple CRUD; when joining, we use normal SQL queries with some helper tools that verify the fields we are picking exist. Hopefully I'll manage to figure out how to do this with Postgres comments at some point.
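Concretely, the hand edit in the checked-in file looks something like this (type and column names invented):

// generated interface; `settings` came out of the generator as `unknown` because it's jsonb
export interface UserSettings {
  theme: 'light' | 'dark'
  locale: string
}
export interface users {
  id: number
  settings: UserSettings // was: unknown
}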
I am making a competitor to this and I am really happy that the approach seems to be gaining traction. Having the database as the source of truth and migrating it with SQL feels very right for me.
It does mean that architectural decisions start in the database as well, which I am trying to govern with "schema linting" ([link redacted]). If you are using Postgres, I would love help with building this.
I'm also massively in favour of having the DB as the source of truth. When I first started looking into building my own SaaS I actually tried to go down the database-first route, where most of your functionality runs inside Postgres triggers. But it wasn't nearly as easy to get things done, and SQL logic is much harder to test.
I figured the next best thing was to tie the database extremely closely into the backend via TypeScript (I rarely define custom types, and instead try to use Pick/Omit/Partial combinations all the way out to the client SDKs).
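As a small illustration of that style (the table interface here is a stand-in for whatever the generator emits, with invented columns):

// Stand-in for a generated table interface:
interface User {
  id: number
  email: string
  name: string
  password_hash: string
}

// API/SDK shapes derived from it rather than re-declared by hand:
type PublicUser = Omit<User, 'password_hash'>
type CreateUser = Pick<User, 'email' | 'name'>
type UpdateUser = Pick<User, 'id'> & Partial<Pick<User, 'email' | 'name'>>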
I ended up generating forms based on the metadata in Postgres (select fields get their values from enums, multiselects from enum arrays), so once you define a table in Postgres you just specify which fields you want to render (out of the available columns) and in what order, and it pretty much gives you a UI form that is rendered directly in React. Works really well when you have a form component library at your disposal.
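The metadata lookup itself is cheap; here's a minimal sketch with node-postgres for pulling the allowed values of one enum to feed a select field:

import { Pool } from 'pg'

const pool = new Pool()

// Returns the labels of a Postgres enum type, in their declared order.
async function enumValues(enumName: string): Promise<string[]> {
  const { rows } = await pool.query<{ enumlabel: string }>(
    `SELECT e.enumlabel
       FROM pg_enum e
       JOIN pg_type t ON t.oid = e.enumtypid
      WHERE t.typname = $1
      ORDER BY e.enumsortorder`,
    [enumName]
  )
  return rows.map((r) => r.enumlabel)
}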
I am looking to release the setup I use for deployment, but tbh it isn't tested (via actual tests; I rely on TypeScript for most of my data shuttling) and there's a lot to it, so I'd need to write a lot of documentation, which I don't have time for.
But the gist is, we write APIs like this (this can be improved with less generic JSON schemas):
export const updateSamf: APIFunction<UpdateSamf, void> = async ({ database }, { samfId, ...data }) => {
  if (Object.keys(data).length === 0) {
    throw new InvalidParametersError('No fields to update')
  }
  if (data.title && data.title.length < 3) {
    throw new InvalidParametersError('Title needs to be at least 3 characters')
  }
  if (data.tags) {
    data.tags = data.tags.map((t) => t.toLowerCase())
  }
  await database.crudUpdate<Samf>('samf', data, { samfId }, new SamfNotFoundError())
  // `content` and `previousCover` come from surrounding code that's omitted in this excerpt
  if (data.cover && previousCover) {
    await content.delete(previousCover)
  }
}
export const routes: APIRoutes = [{
  type: 'patch',
  route: StudioRoutePath.SAMF_CRUD,
  func: updateSamf,
  schema: 'UpdateSamf',
  permissions: isSamfOwner
}]
The form building part I don't know about releasing; I built it for a client that has a few dozen forms with hundreds of fields/options, which is hard to keep up with. So we settled on using a CSV file that is essentially a product owner's way of saying: I want this field, on this entity, of this type.
Our CI then checks that each field actually exists in the TypeScript schema, and if not the build fails on our development environment (and suggests the basic SQL to fix it). The company is small enough for us to be able to copy that SQL into a migration script and let it run without skipping a beat.
It's not a magic pill: renaming, deleting, and custom migrations need to be dealt with manually. But for the most part 80% of our changes are additions, and this way it just works.
There's a similar project (https://github.com/erik/squabble) that uses a portable version of Postgres' SQL parser to spit out an AST, which lints can then traverse.
This approach alleviates the need for bringing up a Postgres cluster to lint the schema.
I'd be curious to know if there's something to be gained by depending on a running database instance (especially because it complicates CI).
I usually run a Postgres DB (as a Docker image on CircleCI) when I'm running the build. I'm not sure how scalable it is (currently we have a hundred migrations and it takes a few seconds, so not bad).
The flow is:
- run migrations on an empty DB
- create the interfaces and enums
- run typescript
- derive all the json schemas from the interfaces for API validation
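In script form that flow is roughly the following (tool flags shown from memory, and the npm script names are placeholders for whatever your project actually wires up):

import { execSync } from 'child_process'

const run = (cmd: string) => execSync(cmd, { stdio: 'inherit' })

run('npm run db:migrate')                                                 // 1. migrations on an empty DB
run('npx schemats generate -c $DATABASE_URL -s public -o src/schema.ts') // 2. interfaces and enums
run('npx tsc --noEmit')                                                   // 3. run typescript
run('npm run api:json-schemas')                                           // 4. JSON schemas from the interfaces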
This means we gitignore the actual DB types, which is great for reducing the number of sources of truth and the issues that can come with them.
This all runs as a pre-commit hook as well. I guess by doing so we sort of validate that the SQL is legit via Postgres itself. The only issue we face, once a commit gets past the setup stage and tries to run the migration itself, is if there's a data inconsistency we weren't aware of (trying an enum value we thought wasn't used but was). But that sort of thing requires special tasks anyway, and the migration fails, so nothing ever goes weird (which is why I <3 SQL migrations).
Keep in mind that this might become problematic for folks that are running Postgres inside Docker on macOS or Windows, as IO performance is quite poor.
I put together an up/down migration validation system a while ago (start up two databases, apply all but the last up migration on A, and all up migrations plus the final down migration on B) for a pretty sizable schema.
Folks that were using Docker had to wait upwards of two minutes, while natively installed Postgres would finish in under 10sec.
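The check itself can stay fairly dumb; a sketch of the core idea, where the migration commands are placeholders and the real work is diffing schema-only dumps of the two databases:

import { execSync } from 'child_process'

const sh = (cmd: string) => execSync(cmd, { encoding: 'utf8' })

// A: all up migrations except the newest; B: all up migrations, then the newest down migration.
sh('npm run migrate -- --database check_a --to latest-1')   // placeholder commands
sh('npm run migrate -- --database check_b --to latest')
sh('npm run migrate:down -- --database check_b --steps 1')

// If the down migration truly reverses the last up migration, the schemas should match
// (in practice you may need to normalize the dumps before comparing them).
const a = sh('pg_dump --schema-only postgres://localhost/check_a')
const b = sh('pg_dump --schema-only postgres://localhost/check_b')
if (a !== b) throw new Error('down migration does not restore the previous schema')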
As long as their database in Docker is being incrementally updated (using the same migration scripts as on production, which only apply new ones), is this hit only taken when spinning up a clean Docker image?
I'm considering exporting a single SQL dump whenever we run the migrations, so our dev tools / tests can use that one large file. But I haven't found it as easy as export -> import to try out yet.
> As long as their database in Docker is being incrementally updated (using the same migration scripts as on production, which only apply new ones), is this hit only taken when spinning up a clean Docker image?
Yes, but being able to quickly return to a known-good state by trashing the database is still useful because getting migrations right without testing them is hard.
Using named volumes helps because they don't need to be bridged through to the host filesystem.
The difference, I suppose, is that by inspecting the schema you can make assertions that take the whole architecture into account, rather than just the specific piece of SQL in one migration. I don't personally run the tool in CI; I run it before generating types, which is when I am writing a new migration.
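For a flavour of what a whole-schema assertion looks like, one lint you might write is "every table must have a primary key", which boils down to a catalog query (node-postgres sketch):

import { Pool } from 'pg'

const pool = new Pool()

// Lint: list ordinary tables in the public schema that have no primary key constraint.
async function tablesWithoutPrimaryKey(): Promise<string[]> {
  const { rows } = await pool.query<{ relname: string }>(`
    SELECT c.relname
      FROM pg_class c
      JOIN pg_namespace n ON n.oid = c.relnamespace
     WHERE c.relkind = 'r'
       AND n.nspname = 'public'
       AND NOT EXISTS (
         SELECT 1 FROM pg_constraint con
          WHERE con.conrelid = c.oid AND con.contype = 'p'
       )`)
  return rows.map((r) => r.relname)
}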
Prisma is not mature: they keep putting major breaking changes in patch (not even minor) semver updates. They're the only package I've had to lock to a patch version.
They just underwent an obvious round of developer restructuring/hiring after their recent funding, which has caused some of the chaos.
Migrations are only recently out of alpha, don't support some common operations (rename?), and DOWN migrations are not supported.
Prisma is innovative, and there may be some fair tradeoffs to their declarative approach and fast pace. But I've never used a less mature ORM.
It looks like Kanel is actually a lot more feature-rich, but I wouldn't see any advantage in switching.
I started using schemats because it's a single CLI interface and gives me all I need. I put together a few tools for a TypeScript-heavy backend development experience, and one of my requirements was that each one be powerful enough to achieve my goals but not so powerful that I'd trip over how to use it (or, in schemats' case, patch it).
Currently schemats and ts-json-schema-generator got me there; the only issue was that schemats was a few years out of date, so the types weren't as friendly.
I've been dreaming up a TS SQL library that would use the new template literal type features to implement a full Postgres SQL parser in the type system. I imagine this would allow you to define a series of DB migration statements (alter table, add column, etc.) and roll the net result of that series of migrations up into a schema definition as a type.
Then you could take this schema type and feed it into some function as a generic to generate a raw query executor that could again fully parse the SQL and perfectly infer the result type.
I believe all this is possible but I don't really know where to start.
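Even a cut-down toy shows the machinery is there (nowhere near a real parser, but the template literal types do compose):

// Toy: pull the selected column and table out of a single-column SELECT.
type ParseSelect<Q extends string> =
  Q extends `SELECT ${infer Col} FROM ${infer Table}`
    ? { column: Col; table: Table }
    : never

type T = ParseSelect<'SELECT id FROM things'>
// T is { column: "id"; table: "things" }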
Author of ts-sql here. Shin Ando has forked my proof of concept / toy implementation and made it much more practical; if you're thinking about spending time on this idea, that's probably a better starting point: https://github.com/andoshin11/sqlpture
Hasura / GraphQL / GraphQL-codegen do pretty much exactly what you describe wrt fully typed query results, except you write GraphQL queries rather than SQL ones.
If you like this idea, Zapatos[1] does something similar but provides more comprehensive types (differentiated into what you get from a SELECT, what you can INSERT, etc.), plus some help using these types to interact with the database.
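The SELECT vs INSERT distinction matters for columns with defaults; in the abstract it's the difference between shapes like these (hand-written illustration of the pattern, not Zapatos's actual generated output):

interface UserSelectable {
  id: number        // always present when you read
  email: string
  created_at: Date
}
interface UserInsertable {
  id?: number       // serial, usually left to the database
  email: string
  created_at?: Date // has a DEFAULT, so it can be omitted on insert
}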
I wrote a similar tool for Go and Postgres but it also generates the serialization code. [1]
- I prefer generating code from queries instead of generating active record style models based on tables.
- I think it's worth it to spin up Postgres to get type info instead of trying to parse queries. Reimplementing Postgres' type inference is a Herculean task even with an AST. sqlc, which directly inspired pggen, does this.
- It's hard to support customization of generated code in a structured way. The way I'm leaning is: if you need customization, you use the SDK, which provides plugin hooks. Using flags for customization is the road to madness.
- What I would like to do is serialize directly to protobuf to avoid the serialization dance of Go structs to Proto structs.
I've been using a fork of schemats for a while, it is a great tool.
We quickly figured out that using the table definitions as our app-layer models wouldn't cut it, notably for handling polymorphism and non-ideal (sic) table design.
We decided to "hand write" (mostly codegen) the types we use in the business logic and have a translation layer between those and the interfaces generated by schemats. For simple tables the mapping is 1:1, but for complicated tables we centralize type checks and defensive assertions there. For polymorphism it would look like this:
CREATE TYPE invoice_line_type AS ENUM ('product', 'shipment');

CREATE TABLE invoice_lines (
  id SERIAL PRIMARY KEY,
  type invoice_line_type NOT NULL,
  product_id INT REFERENCES products(id),
  shipment_id INT REFERENCES shipments(id),
  amount NUMERIC NOT NULL,
  CHECK (CASE WHEN type = 'product' THEN product_id IS NOT NULL ELSE TRUE END),
  CHECK (CASE WHEN type = 'shipment' THEN shipment_id IS NOT NULL ELSE TRUE END)
);
The translation layer:
import { invoice_lines, invoice_line_type } from 'src/schema'

export type ProductInvoiceLine = {
  id: number
  type: invoice_line_type.product
  product_id: number
  amount: number
}

export type ShipmentInvoiceLine = {
  id: number
  type: invoice_line_type.shipment
  shipment_id: number
  amount: number
}

// ensureNumber: local helper (not shown) that asserts the nullable column is actually set
export const parse = (row: invoice_lines): ProductInvoiceLine | ShipmentInvoiceLine => {
  const base = {
    id: row.id,
    amount: row.amount,
  }
  switch (row.type) {
    case invoice_line_type.product:
      return {
        ...base,
        type: invoice_line_type.product,
        product_id: ensureNumber(row.product_id),
      }
    case invoice_line_type.shipment:
      return {
        ...base,
        type: invoice_line_type.shipment,
        shipment_id: ensureNumber(row.shipment_id),
      }
  }
}
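Callers then get a plain discriminated union, so narrowing falls out of the type system:

const line = parse(row) // row: invoice_lines, straight from the query
if (line.type === invoice_line_type.product) {
  // narrowed to ProductInvoiceLine: product_id is a number here, no null checks needed
  console.log(line.product_id)
}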
How does this relate to Prisma? https://www.prisma.io/ Do any of you have some experience with it? Also, I think it would be even better if there was a library like JOOQ for Java: https://www.jooq.org/ It not only generates your entity classes, but also gives a full-blown type-safe query DSL!
It copies Postgres comments over to JSDoc/TSDoc comments, emits some data about foreign key relationships, and supports TS types for json/jsonb columns via @type comments. Feel free to copy any of those features if you think they're good ideas :)
So far I have addressed the issues with using it on Node 14, sorted out package dependency issues, and brought it in line with modern TypeScript (removing the concept of namespaces and favouring Pick/Omit helpers instead).
That being said, given how lightweight the actual package is, I would be more than happy to address those issues when they're raised.
Seems like it generates TS types, but doesn’t help with data access? i.e. not trying to compete with tools that generate full data access layers, like Prisma https://www.prisma.io/ ?
If so, out of curiosity, what’s the use case for generating just the types? How are you actually getting data to/from the DB?
I think the comment I made above regarding our DB CRUD updates might help answer that!
But the short answer is we just use node-postgres. If the get/update/delete/insert is a simple fields/table/where, we use a helper function. If it's more complex, we write custom SQL and use the following approach:
database.query<DB.User & DB.Pet>(`
  SELECT *
  FROM "pet"
  LEFT JOIN "user" ON "pet".owner = "user".id
`)
What are the tradeoffs of generating unique UUID ids in the database vs in the code?
Compared to ORMs, is one of the main differences that the DB objects define the types and constraints, and those propagate out to the code, whereas ORMs start with code and propagate out to the DB?
Honestly I don't know what the tradeoffs are. I usually rely on the DB to create the id (regardless of whether it's SERIAL or UUID). The only time I create a UUID outside the DB is if the client needs an id in its initial state; for example, when tracking a user's progress through a form, it lets you skip a useless creation roundtrip before performing the first update.
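That pattern is basically "the client mints the id, the server upserts"; a sketch with node-postgres (table and column names invented):

import { randomUUID } from 'crypto'
import { Pool } from 'pg'

const pool = new Pool()

// The client-side draft already has an id, so the very first save can be an upsert
// instead of an INSERT just to obtain an id, followed by an UPDATE.
const draftId = randomUUID()
await pool.query(
  `INSERT INTO form_drafts (id, data)
   VALUES ($1, $2)
   ON CONFLICT (id) DO UPDATE SET data = EXCLUDED.data`,
  [draftId, JSON.stringify({ step: 1 })]
)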
Yeah, that's pretty much it! Less code in the functions -> fewer inconsistencies with the DB. And, for example, removing an enum value in the DB would make every front-end component using it fail, so it really does run across the whole pipeline.
One alternative approach: you could write SQL queries that generate the interfaces for a set of tables (or views), and customize those to your codebase/needs.
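A minimal version of that: read column metadata out of information_schema and print one interface per table (the type mapping here is deliberately naive):

import { Pool } from 'pg'

const pool = new Pool()

const tsType = (pgType: string): string =>
  /^(integer|bigint|smallint|numeric|double precision|real)$/.test(pgType) ? 'number'
  : pgType === 'boolean' ? 'boolean'
  : /^(timestamp|date)/.test(pgType) ? 'Date'
  : 'string'

async function printInterfaces(): Promise<void> {
  const { rows } = await pool.query<{
    table_name: string
    column_name: string
    data_type: string
    is_nullable: 'YES' | 'NO'
  }>(`SELECT table_name, column_name, data_type, is_nullable
        FROM information_schema.columns
       WHERE table_schema = 'public'
       ORDER BY table_name, ordinal_position`)

  const byTable = new Map<string, string[]>()
  for (const r of rows) {
    const field = `  ${r.column_name}: ${tsType(r.data_type)}${r.is_nullable === 'YES' ? ' | null' : ''}`
    byTable.set(r.table_name, [...(byTable.get(r.table_name) ?? []), field])
  }
  for (const [table, fields] of byTable) {
    console.log(`export interface ${table} {\n${fields.join('\n')}\n}`)
  }
}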
I've been generating Java code from Postgres (created by Liquibase migrations) using jOOQ; that Java code is then transformed into an OpenAPI spec using Springfox, which in turn is turned into any language of choice (such as TypeScript types). It's really good.