jp armstrong

Virtual File System for KDB/Q

qfuse is a virtual file system for KDB/Q that unifies multiple historical databases (HDBs) into a single virtual database (VDB). It actively maintains a mapping of all files in the source HDBs and collates their content into a single mounted directory that simulates a VDB. It is built on FUSE, which lets non-root applications implement a file system and serve POSIX file operations on a mounted directory.

Source code on GitHub

How did we get here?

A common problem with large kdb infrastructure, especially when the HDBs are written by different teams for the same end users, is that users generally want to see all HDBs through one API. There are several patterns for tackling this, with varying degrees of technical complexity and maintenance.

One option leverages par.txt, which was meant for loading one segmented HDB across multiple volumes, to load various HDBs instead. This works if there are no collisions in table names, a BIG if. Here's a previous article on how to implement it.

Another option is to use symlinks: scan all source HDBs and collate them into one VDB with symlinks. Name collisions can be handled by renaming the symlinks. The only things to consider are schema prototypes and the date ranges for each of the tables; otherwise you risk creating dead links to non-existent HDBs.
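
As a rough illustration of the symlink approach, here is a minimal Python sketch. The function names (link_hdb, _link) are hypothetical, and it assumes daily partitions named YYYY.MM.DD and that collisions are avoided by prefixing table names with a namespace. It does not deal with schema prototypes or date ranges.

```python
import os
import re

# Assumption: partitions are daily and named like 2025.08.30.
DATE_DIR = re.compile(r"^\d{4}\.\d{2}\.\d{2}$")


def _link(src, dst):
    """Create a symlink dst -> src unless something already exists at dst."""
    if not os.path.lexists(dst):
        os.symlink(src, dst)


def link_hdb(namespace, source, target):
    """Symlink one source HDB into target, renaming tables to namespace.table."""
    os.makedirs(target, exist_ok=True)
    for entry in sorted(os.listdir(source)):
        src = os.path.abspath(os.path.join(source, entry))
        if DATE_DIR.match(entry) and os.path.isdir(src):
            # Date partition: link each table under the shared date folder.
            date_dir = os.path.join(target, entry)
            os.makedirs(date_dir, exist_ok=True)
            for table in sorted(os.listdir(src)):
                _link(os.path.join(src, table),
                      os.path.join(date_dir, f"{namespace}.{table}"))
        elif os.path.isdir(src):
            # Root-level directory, e.g. a splayed table.
            _link(src, os.path.join(target, f"{namespace}.{entry}"))
        else:
            # Root-level file, e.g. a sym file; name kept as-is.
            _link(src, os.path.join(target, entry))
```

Calling link_hdb once per source HDB with the same target builds up the combined directory. The links are static, so they have to be rebuilt whenever a source HDB gains or loses a partition.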

The third option is to create your own virtual file system. qfuse uses the FUSE (Filesystem in Userspace) library to register custom handlers for POSIX file system calls (e.g. open, read, write, stat) on a specific mounted directory. When a user runs ls on the mount, the readdir call is routed to qfuse, which can return a custom directory tree structure. Partitioned tables are collated into dated folders, while splays and sym files sit at the root. qfuse looks for directories containing a .d file (aka splayed tables) and prefixes them with the namespace.

Source HDBs:

bars/
├── 2025.08.30/
│   ├── daily/
│   └── minute/
└── sym_daily
ref/
├── sec/
└── sym_sec

Target VDB generated by qfuse:

vdb/
├── 2025.08.30/
│   ├── bars.daily/
│   └── bars.minute/
├── ref.sec/
├── sym_daily
└── sym_sec
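
The renaming shown in the layout above can be sketched as a small path-mapping function. This is a minimal Python sketch, not qfuse's actual code: virtual_path and table_dirs are hypothetical names, and table_dirs is assumed to be the set of directories (relative to the source HDB root) that were found to contain a .d file.

```python
def virtual_path(namespace, rel_path, table_dirs):
    """Map a path relative to a source HDB root to its path inside the VDB.

    table_dirs is assumed to hold every directory (relative to the HDB root)
    that contains a .d file. The first path component that names such a
    directory is prefixed with the namespace; everything else is unchanged.
    """
    parts = rel_path.split("/")
    for i in range(len(parts)):
        if "/".join(parts[: i + 1]) in table_dirs:
            parts[i] = f"{namespace}.{parts[i]}"
            break
    return "/".join(parts)
```

With this mapping, 2025.08.30/daily in the bars HDB becomes 2025.08.30/bars.daily, sec in the ref HDB becomes ref.sec, and sym files such as sym_daily keep their names at the root.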

Internals

On start up, qfuse reads a config that lists all the source paths and their desired namespace names.

namespace,source
bars,hdb/bars
ref,hdb/ref
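
Reading a config in this shape is straightforward. The following Python sketch is only an illustration (load_config is a hypothetical name, and qfuse's own parser may differ). It assumes the file is plain CSV with the header row shown above.

```python
import csv
import io


def load_config(text):
    """Parse 'namespace,source' CSV text into a {namespace: source_path} dict."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["namespace"].strip(): row["source"].strip() for row in reader}
```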

It iterates through each path, scans the contents, and inserts every sub-directory path and its files into a directory tree. A tree was chosen because there are two operations qfuse needs to be good at: listing the contents of a sub-directory and finding the original path of a specific file.
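
To make the two operations concrete, here is a minimal Python sketch of such a tree. The class and method names (Node, Tree, insert, listdir, resolve) are hypothetical and not taken from the qfuse source. Children are kept in sorted order and looked up with a binary search via the standard bisect module.

```python
import bisect


class Node:
    """One entry in the virtual directory tree (hypothetical structure)."""

    def __init__(self, name, source=None):
        self.name = name
        self.source = source   # original path on disk; None for virtual dirs
        self.names = []        # child names, kept sorted
        self.children = []     # child nodes, parallel to self.names

    def child(self, name):
        """Binary-search the sorted child names; return the node or None."""
        i = bisect.bisect_left(self.names, name)
        if i < len(self.names) and self.names[i] == name:
            return self.children[i]
        return None

    def add_child(self, name, source=None):
        """Return the existing child, or insert a new one in sorted position."""
        i = bisect.bisect_left(self.names, name)
        if i < len(self.names) and self.names[i] == name:
            if source is not None:
                self.children[i].source = source
            return self.children[i]
        node = Node(name, source)
        self.names.insert(i, name)
        self.children.insert(i, node)
        return node


class Tree:
    def __init__(self):
        self.root = Node("")

    def _walk(self, vpath):
        node = self.root
        for part in (p for p in vpath.split("/") if p):
            node = node.child(part)
            if node is None:
                return None
        return node

    def insert(self, vpath, source):
        """Register a virtual path and the source path it maps back to."""
        parts = [p for p in vpath.split("/") if p]
        if not parts:
            return
        node = self.root
        for part in parts[:-1]:
            node = node.add_child(part)
        node.add_child(parts[-1], source)

    def listdir(self, vpath):
        """Operation 1: list the contents of a (virtual) sub-directory."""
        node = self._walk(vpath)
        return None if node is None else list(node.names)

    def resolve(self, vpath):
        """Operation 2: find the original path behind a virtual path."""
        node = self._walk(vpath)
        return None if node is None else node.source
```

In a FUSE setting, listdir would back a readdir handler and resolve would back calls such as open or stat, but that wiring is left out here.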

A timer thread periodically re-scans the source directories for new files and removes any files that no longer exist. The child nodes are kept sorted so that lookups can use a binary search.
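
The periodic re-scan can be sketched as a set difference between two scans, driven by a background thread. This is an assumption-laden Python illustration rather than qfuse's implementation: scan, diff, start_rescan, the on_change callback and the 60-second default interval are all hypothetical.

```python
import os
import threading


def scan(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def diff(known, current):
    """Return (added, removed) between the previous and the current scan."""
    return current - known, known - current


def start_rescan(root, known, on_change, interval=60.0):
    """Re-scan root every `interval` seconds on a daemon thread.

    on_change(added, removed) is called whenever the file set changes.
    Returns a threading.Event; set it to stop the loop.
    """
    stop = threading.Event()

    def loop():
        nonlocal known
        # Event.wait returns False on timeout, True once stop is set.
        while not stop.wait(interval):
            current = scan(root)
            added, removed = diff(known, current)
            if added or removed:
                on_change(added, removed)
            known = current

    threading.Thread(target=loop, daemon=True).start()
    return stop
```

The callback is where new paths would be inserted into the tree and stale ones removed. Any real implementation also needs locking around the tree, since file system calls can arrive while a re-scan is in progress.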

Future Work

The first iteration was about getting the file system operational with decent performance. There are a few performance improvements I'll investigate further, such as:

As for feature enhancements, adding support for:

#kdb #qfuse