
Don’t Compromise, embed postgres.

Jan 22, 2025 Sam Redai

We’ve all been there. You start building a simple tool, the next big thing of course, but you want to keep things light, fast, and nimble, especially for your prototype. But adding postgres feels like bringing a grand piano to a camping trip. So, you settle for something “simpler.” Maybe you store data in JSON files or sprinkle your state across a few in-memory structures. Maybe you even build a small abstraction over it all, promising yourself, “When this thing gets big, I’ll swap this logic out for a real database.”

Spoiler alert: you won’t.

Let me tell you what actually happens.

The Price of “Simple”

At first, it’s great. There’s no database server to manage, no pesky SQL migrations, and no annoying pg_hba.conf to decipher. You’re flying high! Then your users, probably just your friends at first, start pushing boundaries and suddenly your app feels less like a slick prototype and more like last place at the corporate hackathon.

Pretty soon, you realize you’ve coded yourself into a corner. Your JSON storage doesn’t scale. Debugging is impossible. That clever in-memory solution loses everything when the app crashes (and it will crash). And worst of all, you’ve built core (very core) features around these hacks. By the time you introduce a real database, it’s like trying to retrofit a Mazda with wings from a Piper Warrior: it looks ridiculous, it doesn’t work, and you’re questioning every decision that led you here.

Big Fat Postgres

You see, postgres doesn’t have to be that heavy. These days, you can embed it directly into your application in pretty much any programming language. That means all the power of a relational database, none of the “install this, configure that” drama. Your users don’t even need to know it’s there, and you can get right to the point and develop your application without cutting corners.

Take Go, for example. The github.com/fergusstrange/embedded-postgres library is great for embedding postgres directly into your app. No separate server, no setup headaches, just raw SQL power at your fingertips. Let’s see how this works in practice.

Setting Up Embedded postgres

First, you’ll need the library:

go get github.com/fergusstrange/embedded-postgres

Next, a basic setup to get postgres running inside your Go application looks like this:

package main

import (
	"log"

	embeddedpostgres "github.com/fergusstrange/embedded-postgres"
)

func main() {
	// Start an embedded postgres instance with the library's default settings.
	postgres := embeddedpostgres.NewDatabase()
	if err := postgres.Start(); err != nil {
		log.Fatalf("Failed to start embedded Postgres: %v", err)
	}
	// Make sure the instance shuts down cleanly when the app exits.
	defer func() {
		if err := postgres.Stop(); err != nil {
			log.Fatalf("Failed to stop embedded Postgres: %v", err)
		}
	}()

	log.Println("Postgres is running. Can you believe it?")
}

Congratulations, you’ve just embedded postgres into your app. It really is that simple!

Where Does the Data Go?

By default, postgres will store its data in a temporary directory—perfect for ephemeral use cases. But of course, data is the new gold, right? Most likely, your users will want to know they won’t lose their data, even if they’re just running your app locally. To persist the data to a specific directory, just configure the embedded postgres with a local path to use:

postgres := embeddedpostgres.NewDatabase(
	embeddedpostgres.DefaultConfig().
		DataPath("db/data").
		Port(5433),
)

Now, every time your app runs, it uses the same data directory. Users won’t even notice it’s there. They’ll just think your app is magic. The coolest part is that you can set the data path dynamically. For example, you can create a hidden directory inside whatever directory the application is launched from and use that as the postgres data directory. The application’s state is then tied to the directory the user launches the app from, which is the same pattern you see with other tools that want to stay “simple” and avoid postgres: they typically work by searching for a user-maintained config file stored in that directory.
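
As a rough sketch of that dynamic pattern, you could resolve the data path from the current working directory at startup. The “.myapp” directory name below is made up for illustration, and this assumes the "os" and "path/filepath" imports alongside the embedded-postgres setup shown earlier:

// Keep postgres data in a hidden directory under wherever the app is launched.
// ".myapp" is a placeholder name; pick whatever fits your tool.
cwd, err := os.Getwd()
if err != nil {
	log.Fatalf("Failed to determine working directory: %v", err)
}
dataDir := filepath.Join(cwd, ".myapp", "pgdata")
if err := os.MkdirAll(dataDir, 0o755); err != nil {
	log.Fatalf("Failed to create data directory: %v", err)
}

postgres := embeddedpostgres.NewDatabase(
	embeddedpostgres.DefaultConfig().
		DataPath(dataDir).
		Port(5433),
)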

How Are The Tables Created?

There are a ton of SQL migration tools out there. Different language communities usually gravitate towards certain ones: Java has Flyway, Python has Alembic, and so on. For Go applications I like to use the golang-migrate library. It lets you store SQL files in a migrations/ directory and embed that directory into your app’s binary. You can then wrap it in a RunMigrations function like the one below, which you can run on every startup of your application.


import (
	"embed"
	"log"

	"github.com/golang-migrate/migrate/v4"
	_ "github.com/golang-migrate/migrate/v4/database/pgx" // registers the pgx database driver
	"github.com/golang-migrate/migrate/v4/source/iofs"
)

//go:embed migrations/*.sql
var migrationsFS embed.FS

func RunMigrations(postgresURI string) {
	// Load the migration files that were compiled into the binary.
	source, err := iofs.New(migrationsFS, "migrations")
	if err != nil {
		log.Fatalf("Failed to read embedded SQL migrations: %v", err)
	}

	m, err := migrate.NewWithSourceInstance("iofs", source, postgresURI)
	if err != nil {
		log.Fatalf("Failed to initialize migrate instance: %v", err)
	}

	// Apply any pending migrations; an already up-to-date database is not an error.
	if err := m.Up(); err != nil && err != migrate.ErrNoChange {
		log.Fatalf("Migration failed: %v", err)
	}
}
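
To sketch how this slots in: the migrations/ directory holds golang-migrate’s versioned pairs of files (for example 000001_create_notes.up.sql and 000001_create_notes.down.sql, where “notes” is just a made-up table), and RunMigrations runs right after the embedded instance starts. The connection details below are assumptions based on embedded-postgres defaults and the 5433 port from earlier, so adjust them to your configuration.

// Bring the schema up to date before the app serves anything.
if err := postgres.Start(); err != nil {
	log.Fatalf("Failed to start embedded Postgres: %v", err)
}

// Default embedded-postgres credentials, the locally configured port, and the
// pgx:// scheme to match the driver registered above; adjust to your setup.
RunMigrations("pgx://postgres:postgres@localhost:5433/postgres?sslmode=disable")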

Keeping It Lightweight

“But Sam,” I hear you saying, “isn’t postgres overkill for local development??”

To that I say, not if you tweak it to be lightweight.

For example:

postgres := embeddedpostgres.NewDatabase(
	embeddedpostgres.DefaultConfig().
		AdditionalConfig(map[string]string{
			"shared_buffers": "16MB",
			"max_connections": "10",
		}),
)

At this point, your embedded database feels as light as SQLite but packs a punch like postgres. Your users won’t even know to thank you because it’s entirely seamless to them.

The Backup Story

I went back and forth a bit on whether a full example of letting users back up and restore their application state was really necessary for the case this post is trying to make. I decided it is, because after seeing that postgres can reasonably be embedded into a typical client-side-only application, the next apprehension is usually whether you lose the simplicity of state just being one or more config files that your users are already aware of since…well, since they had to create them to use your app. The short answer is, you’d be right to think this. Having your app’s state captured in a disk-persisted embedded postgres database does make it a little more work to help users transfer the current state of the app, whether to make backups, store snapshots in a source control system, or recreate it on another person’s machine.

However, with a reasonable amount of code, you can make all of these things just as simple by providing export/import functionality. The idea is that the user builds up state organically by using your app’s interface, whatever that may be (CLI, browser UI, etc.). Then your interface allows exporting the state of the app to a single file. That file carries all of the data contained in the postgres instance, and, as you can guess, the import functionality simply loads that data into a fresh instance of your app. Let’s stick with Go and see what this looks like using pgx.

For each table in your data model, you’ll want to query for all of the rows.

query := fmt.Sprintf("SELECT * FROM %s", table)
rows, err := conn.Query(ctx, query)  // conn is a *pgx.Conn

Then you just iterate over all the rows, pulling the column names from the descriptor and placing the data into an array of maps.

var tableData []map[string]interface{}
for rows.Next() {
	values, err := rows.Values()
	if err != nil {
		return nil, fmt.Errorf("failed to get row values for table %s: %w", table, err)
	}

	rowMap := make(map[string]interface{})
	for i, col := range rows.FieldDescriptions() {
		rowMap[string(col.Name)] = values[i]
	}

	tableData = append(tableData, rowMap)
}

Most likely your data model includes multiple tables, so you should do this for each table, building an outer map where the key is the table name and the value is the slice of rows for that table.

backupData := make(map[string][]map[string]interface{})
backupData[table] = tableData  // table is the name of the table in the postgres data model

Lastly, just write it all to a backup file.

var buffer bytes.Buffer
json.NewEncoder(&buffer).Encode(backupData)

All of this wrapped up into a function looks like this.

package db

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"

	"github.com/jackc/pgx/v5" // or pgx/v4, to match the pgx major version you use
)

func BackupDatabase(ctx context.Context, conn *pgx.Conn, tables []string) ([]byte, error) {
	backupData := make(map[string][]map[string]interface{})

	for _, table := range tables {
		query := fmt.Sprintf("SELECT * FROM %s", table)
		rows, err := conn.Query(ctx, query)
		if err != nil {
			return nil, fmt.Errorf("failed to query table %s: %w", table, err)
		}
		defer rows.Close()

		var tableData []map[string]interface{}
		for rows.Next() {
			values, err := rows.Values()
			if err != nil {
				return nil, fmt.Errorf("failed to get row values for table %s: %w", table, err)
			}

			rowMap := make(map[string]interface{})
			for i, col := range rows.FieldDescriptions() {
				rowMap[string(col.Name)] = values[i]
			}

			tableData = append(tableData, rowMap)
		}

		if err := rows.Err(); err != nil {
			return nil, fmt.Errorf("row iteration error for table %s: %w", table, err)
		}

		backupData[table] = tableData
	}

	var buffer bytes.Buffer
	if err := json.NewEncoder(&buffer).Encode(backupData); err != nil {
		return nil, fmt.Errorf("failed to encode backup data: %w", err)
	}

	return buffer.Bytes(), nil
}
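
Hooking that up to an actual “export” action can be as small as writing the bytes to a file. In this sketch the table names and output path are made up for illustration, ctx and conn are assumed to already exist, and the call assumes you’re outside the db package shown above:

// Export the app's state to a single JSON file the user can copy around.
backup, err := db.BackupDatabase(ctx, conn, []string{"notes", "tags"})
if err != nil {
	return fmt.Errorf("backup failed: %w", err)
}
if err := os.WriteFile("myapp-backup.json", backup, 0o644); err != nil {
	return fmt.Errorf("failed to write backup file: %w", err)
}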

Restoring, as you might have guessed, is just the reverse process.

Read in the backup file.

var data map[string][]map[string]interface{}
json.Unmarshal(backupData, &data)
rows := data[table]  // table is the table name being restored

Insert the rows into the database.

for _, row := range rows {
    columns := []string{}
    placeholders := []string{}
    values := []interface{}{}

    i := 1
    for col, val := range row {
        columns = append(columns, col)
        placeholders = append(placeholders, fmt.Sprintf("$%d", i))
        values = append(values, val)
        i++
    }

    query := fmt.Sprintf(
        "INSERT INTO %s (%s) VALUES (%s) ON CONFLICT DO NOTHING",
        table, strings.Join(columns, ", "), strings.Join(placeholders, ", "),
    )

    if _, err := conn.Exec(ctx, query, values...); err != nil {
        return fmt.Errorf("failed to insert into table %s: %w", table, err)
    }
}

All together, this is what your restore database function might look like.

func RestoreDatabase(ctx context.Context, conn *pgx.Conn, backupData []byte, tables []string) error {
	var data map[string][]map[string]interface{}
	if err := json.Unmarshal(backupData, &data); err != nil {
		return fmt.Errorf("failed to parse backup data: %w", err)
	}

	for _, table := range tables {
		rows, exists := data[table]
		if !exists {
			continue
		}

		for _, row := range rows {
			columns := []string{}
			placeholders := []string{}
			values := []interface{}{}

			i := 1
			for col, val := range row {
				columns = append(columns, col)
				placeholders = append(placeholders, fmt.Sprintf("$%d", i))
				values = append(values, val)
				i++
			}

			query := fmt.Sprintf(
				"INSERT INTO %s (%s) VALUES (%s) ON CONFLICT DO NOTHING",
				table, strings.Join(columns, ", "), strings.Join(placeholders, ", "),
			)

			if _, err := conn.Exec(ctx, query, values...); err != nil {
				return fmt.Errorf("failed to insert into table %s: %w", table, err)
			}
		}
	}

	return nil
}
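
And the matching “import” action just reads the file back and hands it to the restore function. Again, the file name and table list are made up, and this assumes a freshly started instance that has already had its migrations applied:

// Load a previously exported backup into a fresh instance of the app.
backup, err := os.ReadFile("myapp-backup.json")
if err != nil {
	return fmt.Errorf("failed to read backup file: %w", err)
}
if err := db.RestoreDatabase(ctx, conn, backup, []string{"notes", "tags"}); err != nil {
	return fmt.Errorf("restore failed: %w", err)
}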

This will mostly get the job done, but here are three easy improvements:

  • Store the version of the app or a hash of the app’s data model so you can tell whether a backup is compatible (a sketch of this follows below)
  • Check the current state of the database to make sure you’re restoring into a clean state of the app
  • Include additional metadata, such as when the backup was made and some summary stats on the content
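
One way to cover the first and third bullets, for instance, is to wrap the exported tables in a small envelope instead of encoding the raw map directly. Everything here, including the struct itself, the field names, and the schema-hash idea, is an illustrative assumption rather than part of any library (and it assumes the standard "time" package is imported):

// BackupEnvelope is a hypothetical wrapper around the exported table data.
type BackupEnvelope struct {
	AppVersion string                              `json:"app_version"` // e.g. a build-time version string
	SchemaHash string                              `json:"schema_hash"` // e.g. a hash of the embedded migration files
	CreatedAt  time.Time                           `json:"created_at"`
	RowCounts  map[string]int                      `json:"row_counts"`
	Tables     map[string][]map[string]interface{} `json:"tables"`
}

On restore, the app can then refuse, or at least warn, when AppVersion or SchemaHash doesn’t match the running binary.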

To summarize my thoughts on serializing the state of the app: it is a bit more code, but I would argue it replaces the code you would have needed to write to ingest the config-style files used by what some may consider simpler alternatives. Furthermore, you can provide a rich experience for content creation through the app’s interface instead of having users muck around with generic file formats in their IDE. Most importantly of all, deploying a cloud-hosted version is now as simple as pointing the postgres client connection at a cloud-hosted postgres instance, which is probably the main reason anyone makes plans to eventually use postgres in the first place.
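
That last point can be as small as a single branch at startup. The DATABASE_URL name and the local defaults here are conventions I’m assuming, not something the library requires:

// Use a managed postgres instance when DATABASE_URL is set; otherwise fall
// back to the embedded instance for local use.
postgresURI := os.Getenv("DATABASE_URL")
if postgresURI == "" {
	if err := postgres.Start(); err != nil { // the embedded instance from earlier
		log.Fatalf("Failed to start embedded Postgres: %v", err)
	}
	// Default embedded-postgres credentials on the locally configured port.
	postgresURI = "postgres://postgres:postgres@localhost:5433/postgres?sslmode=disable"
}

ctx := context.Background()
conn, err := pgx.Connect(ctx, postgresURI)
if err != nil {
	log.Fatalf("Failed to connect to Postgres: %v", err)
}
defer conn.Close(ctx)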

What About SQLite or DuckDB?

Sure, you could use SQLite or DuckDB. They’re great tools! But here’s what it comes down to. These databases are excellent only if you’re certain they’re what you’ll use in production. If you eventually need the power of postgres, you’ve just created a mountain of work for yourself. Worse, you’ve probably written a bunch of features around your chosen database that won’t translate well when you inevitably make the switch.

The beauty of embedding postgres is that you start with the end in mind. You get all the production-grade features—transactions, full-text search, geospatial queries—without pretending your JSON file is anything more than a glorified notepad.

Closing Thoughts

Prototyping is all about speed, but shortcuts can turn into dead ends. I’ve come to feel strongly that embedding postgres into your application allows you to get the best of both worlds: a lightweight, developer-friendly setup that scales effortlessly into production.

Sure, not every app needs a database. But when you do, don’t settle for less. Postgres might feel like overkill at first, but trust me, your future self will thank you.

-Sam
