I like most of these patterns and have used them all before, but a word of caution about UUID primary keys: performance will suffer for large tables unless you take extra care. Truly random values such as UUIDv4 result in very inefficient indexing, because values don't "cluster." For databases, the best solution is to combine a timestamp with a random value, and there are multiple implementations of UUID-like formats that do this. The one I'm familiar with is ULID. It's become a common enough pattern that UUIDv7 was created, which does exactly this. I don't know if it's possible to generate UUIDv7 in Postgres yet.
Yeah - in another place where this was shared, a few people mentioned that. It's slated to be part of Postgres 18, but as of right now, to get the "good UUIDs" you need to hand-roll something.
That's the only reason I didn't mention it; it seemed a bit of a rabbit hole.
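For the curious, hand-rolling isn't much code. Here's a minimal Python sketch of a UUIDv7-style value (field layout per RFC 9562: 48-bit unix-millisecond timestamp in the top bits, then version, then random bits); the `uuid7` name and `ts_ms` parameter are just for illustration:

```python
import os
import time
import uuid


def uuid7(ts_ms=None):
    """UUIDv7-style value: the top 48 bits are a unix-ms timestamp,
    so freshly generated IDs sort (and index) in creation order."""
    if ts_ms is None:
        ts_ms = time.time_ns() // 1_000_000
    rand_a = int.from_bytes(os.urandom(2), "big") & 0x0FFF            # 12 random bits
    rand_b = int.from_bytes(os.urandom(8), "big") & ((1 << 62) - 1)   # 62 random bits
    value = (ts_ms & ((1 << 48) - 1)) << 80  # 48-bit timestamp
    value |= 0x7 << 76                       # version 7
    value |= rand_a << 64
    value |= 0b10 << 62                      # RFC 4122 variant
    value |= rand_b
    return uuid.UUID(int=value)


# The timestamp dominates the random bits, so later IDs always sort after
# earlier ones, unlike UUIDv4.
earlier, later = uuid7(ts_ms=1), uuid7(ts_ms=2)
assert earlier < later
```

The same layout is straightforward to port to a plpgsql function if you want it generated server-side until Postgres ships it natively.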
> They are terrible in that removing obsolete columns requires a drop and recreation, which can become a nightmare when you build views on views.
That is a problem with the `alter` capabilities, but the real problem is not using `views` at all.
`views` are how you abstract the database. Combined with `instead of` triggers, they become amazing at simplifying the other triggers too.
Nothing is more awesome than flattening your db so external code barely ever does joins itself.
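For readers unfamiliar with the pattern, here's a self-contained sketch using SQLite via Python's stdlib (SQLite, like Postgres, supports `INSTEAD OF` triggers on views; the table and view names are made up for illustration). External code reads and writes one flat view, and the trigger reroutes writes to the base tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE emails (person_id INTEGER REFERENCES people(id), email TEXT);

-- The flat view external code talks to: no client-side joins needed.
CREATE VIEW person_flat AS
  SELECT p.id, p.name, e.email
  FROM people p LEFT JOIN emails e ON e.person_id = p.id;

-- INSTEAD OF trigger: an INSERT on the view is rerouted to the base tables.
CREATE TRIGGER person_flat_insert INSTEAD OF INSERT ON person_flat
BEGIN
  INSERT INTO people (name) VALUES (NEW.name);
  INSERT INTO emails (person_id, email)
    VALUES (last_insert_rowid(), NEW.email);
END;
""")

# External code inserts into the "table" it queries, never the real ones.
conn.execute("INSERT INTO person_flat (name, email) VALUES ('Ada', 'ada@example.org')")
row = conn.execute("SELECT name, email FROM person_flat").fetchone()
```

In Postgres the trigger body would live in a plpgsql function attached with `CREATE TRIGGER ... INSTEAD OF INSERT ON person_flat FOR EACH ROW`, but the shape is the same.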
My trick to solve the problem of migrations is that I put all the `views, functions, etc` in their own section of a `sql` file:

```sql
-- DEPS --
-- VIEWS
-- FUNCTIONS
-- TRIGGERS
```

And then I have a special `DROP ALL VIEWS, CREATE ALL VIEWS` comment in my migrations that does exactly that. It removes the problem greatly!
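The marker step can be mechanized with very little code. A sketch of the idea (SQLite for a runnable example; the section names follow the comment above, everything else is made up): drop every view the catalog knows about, then replay the `-- VIEWS` section of the schema file.

```python
import sqlite3

# Toy schema file with sectioned definitions, as described above.
SCHEMA_SQL = """
-- VIEWS
CREATE VIEW active_users AS
  SELECT id, name FROM users WHERE active = 1;
"""


def recreate_all_views(conn, schema_sql):
    """The 'DROP ALL VIEWS, CREATE ALL VIEWS' migration step."""
    # Drop every view currently in the catalog...
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'view'"
    ).fetchall():
        conn.execute(f'DROP VIEW "{name}"')
    # ...then replay the -- VIEWS section of the schema file.
    conn.executescript(schema_sql.split("-- VIEWS", 1)[1])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, active INTEGER)")
recreate_all_views(conn, SCHEMA_SQL)

# A later migration can now alter the base table and simply rebuild the views.
conn.execute("ALTER TABLE users ADD COLUMN email TEXT")
recreate_all_views(conn, SCHEMA_SQL)
```

In Postgres you'd enumerate views from `information_schema.views` (or `pg_views`) instead, and drop them in dependency order since views can build on views.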
So an intrinsic problem with flattening the database is that a lot of applications actually do want to work on a tree of information (that's what the json query part is all about)
Dropping a view can work, but not when the system is under load (found that out the hard way). I'm not saying it's the worst possible situation, just a greater burden than you would expect going into it.
Can you elaborate on `instead of` triggers? How do you make use of those?
Loving this list. One thing I would totally recommend though is to look at tools like DBT, as this supports a lot of these patterns out of the box (snapshots, materialized views, seeds for 'system tables' etc.). It also addresses the concerns with views and schema management.
So I looked up DBT - it seems to be a company where the first thing I saw on their website was "Gen AI."
That doesn't inspire confidence in the longevity of their offering - but I'm also unclear exactly what it is they are offering. Can you give the audience the pitch?
Sure, DBT (Data Build Tool) has been around for a while. It's a way to manage SQL (and also Python) data pipelines - the "T" in ELT. The home page for DBT Core is here: https://docs.getdbt.com/ (there is also a hosted DBT Cloud premium offering). There are some good books on data engineering with DBT - I particularly liked "Data Engineering with dbt: A practical guide to building a cloud-based, pragmatic, and dependable data platform with SQL" by Roberto Zagni.
Most of these patterns are usable in any SQL database, like MySQL, SQLite, et al. So read the article even if you're not using PostgreSQL.
> Truly random UUIDs doesn't sort well (and this has implications for indexes)
This has been solved already in more than one way (UUIDv7, ULIDs, Snowflake IDs to name a few).