Building a relational database: File structure

For around a year I've thought about writing a toy database, but not been sure where to start. Recently a couple of things aligned and I dipped my toes into the water.

The database that I use most is PostgreSQL and Postgres is well documented, so I'm using it as a rough model. The number of potential features is huge and overwhelming, so I'm thinking in very small steps. However, the database will need at least these things:

I've started with the storage engine because it took my interest and the first steps seemed surprisingly tractable.

Folders and files

Postgres running on 1 machine can have many databases. The databases are stored in 1 directory, conventionally called PGDATA. Each database has its own subdirectory and within a database's subdirectory, each table has its own file.


PGDATA/
├── db1/
└── db2/
    ├── tableA
    └── tableB

A table's file is carefully structured. The file is subdivided into 8kb sections, most often referred to as "pages". 8kb is just the norm for Postgres. Other databases have a different, fixed page size or let pages vary in size.

I don't understand the details yet, but page size relates to the performance of the underlying storage, e.g. your HDD or SSD. The main sticking point seems to be that the physical storage doesn't read 1 bit at a time. Instead, each device has a minimum amount of data that it can read and write. If your HDD reads in 4kb blocks, asking for 1 bit is wasteful because the HDD still has to read 4kb. At this point I don't understand what happens to the excess data, but presumably it gets discarded somewhere between the HDD read and returning the requested data to the user?

Anyway, to recap to the file format: a table's data is 1 file and each file has many 8kb chunks of data called pages. Each page has a specific structure, with Postgres using a format called a "slotted page". Each page has

The slotted file format seems odd at first, but it has a big advantage: records can vary in size. If I have a table with a string column, 1 row might contain just 1 letter in that column and another row might contain 100s of letters. That would be awkward to store if the database record had a fixed width. With a slotted page, however, it's fine because each pointer stores the start location of a record, without making assumptions about record length.

The records (your table rows) start at the end of the 8kb page and grow towards the start of the page, eating into the free space. A new pointer is added to the start of the free space (the opposite end to the record) and stores the location of the new record. Over time, the records and pointers grow towards each other. That's convenient because it avoids the need to guess how many pointers might be needed or to keep moving records around to make room for new pointers, as would be necessary if pointers and records were stored next to each other.

Because a page is always 8kb, on creation a page might be mostly free space between the pointers and the records. Over time the free space fills, there isn't enough space to fit another record between the existing records and the pointers, so another 8kb page is added to the file.

That's pretty much where I am. The more I read, the more I realise that database design involves a mind boggling number of small details, some of which matter a great deal to performance. I'm defaulting to "what Postgres does" and have so far avoided decision paralysis. And for now I'm not worrying about performance -- I'm just taking tiny steps towards getting something working.

Roughly how I broke down the work

This isn't a code-along "Build your own database" kind of post. But I thought I'd include some first steps for the benefit of people who, like me, are curious to try but don't know where to start.

As described in the post, the basic structure of the database will be a series of directories and files, so I wrote functions to:

Each step is very small, but together they're a good start.

Useful things

Shell commands

Resources