How to Build a Custom SQLite Extension: Step-by-Step Guide

Extending SQLite with C and Rust: Best Practices and ExamplesSQLite is a compact, reliable, serverless SQL database engine embedded into countless applications. One of its strengths is extensibility: you can add custom functions, virtual tables, collations, and modules to tailor SQLite to specific needs. This article explores best practices and practical examples for extending SQLite using C and Rust. It compares approaches, shows code samples, explains safety and performance trade-offs, and offers recommendations for packaging and testing extensions.


Why extend SQLite?

  • Flexibility: Add domain-specific functions (e.g., geospatial calculations, custom aggregations).
  • Performance: Implement compute-heavy logic in native code rather than in SQL or application logic.
  • Integration: Expose existing native libraries to SQL queries.
  • Feature gaps: Provide features not bundled into core SQLite or optional extensions (e.g., specialized text processing).

Extension types you can add

  • User-defined scalar functions (sqlite3_create_function_v2)
  • Aggregate functions
  • Virtual tables and modules (sqlite3_create_module)
  • Collations
  • Loadable extensions (shared libraries loaded at runtime)
  • Built-in extensions (compiled into the SQLite core)

C: The canonical way

SQLite is written in C, and the C API is the most direct and widely supported route for extensions.

Pros

  • Direct access to the full SQLite C API.
  • Minimal runtime overhead.
  • Easy to compile into a loadable .so/.dll/.dylib or statically link into an application.

Cons

  • Manual memory management increases risk of bugs.
  • Security issues if careless with inputs (buffer overflows).
  • More boilerplate for safety/error handling.

Build basics

  1. Include sqlite3.h in your C source.
  2. Implement function callbacks with signatures SQLite expects.
  3. Register functions using sqlite3_create_function_v2 or modules with sqlite3_create_module.
  4. Build a shared library and load with SELECT load_extension(…) or sqlite3_enable_load_extension + sqlite3_load_extension.

Example: simple scalar function “reverse_text”

// reverse_text.c #include <sqlite3.h> #include <string.h> #include <stdlib.h> static void reverse_text(sqlite3_context *ctx, int argc, sqlite3_value **argv){     if(argc < 1 || sqlite3_value_type(argv[0]) == SQLITE_NULL){         sqlite3_result_null(ctx);         return;     }     const unsigned char *s = sqlite3_value_text(argv[0]);     int n = (int)strlen((const char*)s);     char *r = (char*)sqlite3_malloc(n + 1);     if(!r){         sqlite3_result_error_nomem(ctx);         return;     }     for(int i = 0; i < n; ++i) r[i] = s[n - 1 - i];     r[n] = '';     sqlite3_result_text(ctx, r, n, sqlite3_free); } #ifdef _WIN32 __declspec(dllexport) #endif int sqlite3_extension_init(sqlite3 *db, char **pzErrMsg, const sqlite3_api_routines *pApi){     SQLITE_EXTENSION_INIT2(pApi);     int rc = sqlite3_create_function_v2(db, "reverse_text", 1, SQLITE_UTF8 | SQLITE_DETERMINISTIC, NULL, reverse_text, NULL, NULL, NULL);     return rc; } 

Build (Unix-like):

gcc -fPIC -shared -o reverse_text.so reverse_text.c -I/path/to/sqlite 

Load:

SELECT load_extension('./reverse_text.so'); SELECT reverse_text('hello'); -- 'olleh' 

Rust: memory safety and ergonomics

Rust offers memory safety, modern tooling, and good FFI capabilities. There are two common approaches to extend SQLite from Rust:

  1. Write a loadable extension in Rust exposing a C ABI.
  2. Use Rust to implement logic and call it from a small C shim (or via cbindgen).

Pros

  • Memory safety reduces many classes of bugs.
  • Modern tooling (cargo) and package ecosystem.
  • Easier to write complex logic and tests.

Cons

  • FFI boundary adds complexity; you must still follow SQLite’s threading and lifetime rules.
  • Slightly more build/tooling setup to produce dynamic libraries compatible with SQLite.
  • Must be careful with panic handling across FFI (avoid unwinding into C).

Rust crates and tooling

  • rusqlite: safe high-level bindings to SQLite for embedding in Rust applications (not for loadable extensions, but useful when SQLite is embedded in Rust apps).
  • sqlite-loadable: crate for writing SQLite loadable extensions in Rust with helpers for common patterns.
  • libsqlite3-sys: low-level raw bindings to the SQLite C API.

Example: a Rust scalar function “is_palindrome”

This example uses raw FFI to export the required sqlite3_extension_init. Use cargo to build a cdylib.

Cargo.toml (relevant parts):

[package] name = "sqlite_ext" version = "0.1.0" edition = "2021" [lib] crate-type = ["cdylib"] [dependencies] libsqlite3-sys = "0.29" 

src/lib.rs:

use std::ffi::{CStr, CString}; use std::os::raw::{c_char, c_int}; use libsqlite3_sys::{     sqlite3, sqlite3_api_routines, sqlite3_context, sqlite3_value, sqlite3_malloc,     sqlite3_result_null, sqlite3_result_text, sqlite3_result_error_nomem, SQLITE_UTF8,     SQLITE_DETERMINISTIC, SQLITE_OK, SQLITE_EXTENSION_INIT1, SQLITE_EXTENSION_INIT2, }; unsafe extern "C" fn is_palindrome(ctx: *mut sqlite3_context, argc: c_int, argv: *mut *mut sqlite3_value) {     if argc < 1 || (*argv).is_null() {         sqlite3_result_null(ctx);         return;     }     let val = *argv;     let text_ptr = libsqlite3_sys::sqlite3_value_text(val);     if text_ptr.is_null() {         sqlite3_result_null(ctx);         return;     }     let cstr = CStr::from_ptr(text_ptr as *const c_char);     let s = match cstr.to_str() {         Ok(v) => v,         Err(_) => { sqlite3_result_null(ctx); return; }     };     let rev: String = s.chars().rev().collect();     let out = CString::new((s == rev).to_string()).unwrap();     sqlite3_result_text(ctx, out.into_raw(), out.as_bytes().len() as c_int, Some(libsqlite3_sys::sqlite3_free)); } #[no_mangle] pub unsafe extern "C" fn sqlite3_extension_init(db: *mut sqlite3, pzErrMsg: *mut *mut c_char, pApi: *const sqlite3_api_routines) -> c_int {     if pApi.is_null() { return 1; }     SQLITE_EXTENSION_INIT1;     SQLITE_EXTENSION_INIT2(pApi);     let rc = libsqlite3_sys::sqlite3_create_function_v2(         db,         CString::new("is_palindrome").unwrap().as_ptr(),         1,         SQLITE_UTF8 | SQLITE_DETERMINISTIC,         std::ptr::null_mut(),         Some(is_palindrome),         None,         None,         None,     );     rc } 

Build:

cargo build --release # resulting in target/release/libsqlite_ext.so (Unix) 

Load and use:

SELECT load_extension('./target/release/libsqlite_ext.so'); SELECT is_palindrome('level'); -- 'true' 

Notes:

  • Ensure you properly manage CString lifetimes and ffi-freeing. Above uses sqlite3_free; link against the right allocator if needed.
  • Avoid Rust panics crossing FFI boundaries: mark extern “C” functions with std::panic::catch_unwind if they could panic.

Best practices (applies to C and Rust)

  • Threading and contexts:

    • SQLite extension functions run in the same thread/context as the database connection. Do not spawn threads that access SQLite objects without proper connection usage.
    • Respect SQLite’s threading mode (single-thread, multi-thread, serialized) and ensure your extension is safe in the chosen mode.
  • Error reporting:

    • Use sqlite3_result_error or sqlite3_result_error_nomem to signal errors from functions.
    • For modules, return appropriate SQLite error codes from xConnect/xCreate/xBestIndex etc.
  • Memory management:

    • Use sqlite3_malloc/sqlite3_free for allocations returned to SQLite so SQLite can manage them.
    • In Rust, ensure pointers given to SQLite remain valid until SQLite is done with them; often you must transfer ownership and provide a destructor callback.
  • Determinism and sqlite flags:

    • Mark deterministic functions with SQLITE_DETERMINISTIC when appropriate; enables better query planning and caching.
    • Use SQLITE_UTF8 or SQLITE_UTF16 flags depending on expected encoding.
  • Panic/exception safety:

    • Never allow Rust panics or C++ exceptions to unwind into SQLite C code. Catch and convert to SQLite errors.
  • Testing:

    • Write unit tests for logic in Rust/C and integration tests that load extension into SQLite and run queries.
    • Test under different SQLite threading modes and with concurrent access patterns.
  • Packaging:

    • Provide prebuilt binaries for common platforms or a simple build script for users.
    • For Rust, use cargo features to produce cdylibs and include a small C shim if needed to ensure broad compatibility.
  • Security:

    • Sanitize inputs if exposing file or system access.
    • Minimize privileges and avoid executing arbitrary code or shell commands.

Virtual tables and modules

Virtual tables let you expose external data sources (files, network, in-memory structures) as SQL tables. They require more boilerplate but are powerful.

Key callbacks:

  • xCreate/xConnect: initialize module instance
  • xBestIndex: inform SQLite about indexing and constraints
  • xFilter/xNext/xEof/xColumn/xRowid: iterate result rows
  • xUpdate/xBegin/xSync/xCommit/xRollback: for writable modules

C example resources:

  • SQLite docs include sample modules (e.g., series, csv virtual table). Rust approach:
  • Use a C shim that forwards callbacks to Rust, or use crates that simplify module creation (sqlite-loadable aims to help).

Performance considerations

  • Keep hot-path code in native layer; avoid unnecessary allocations or copying.
  • For large binary blobs, use sqlite3_result_blob with SQLITE_TRANSIENT or pass ownership carefully to avoid copying.
  • Mark functions deterministic where applicable to allow SQLite optimizations.
  • Profile with representative queries; use EXPLAIN QUERY PLAN to understand how your function/module affects query plans.

Comparison: C vs Rust

Aspect C Rust
API access Direct, native Via FFI (full access through bindings)
Safety Manual memory management (unsafe) Memory-safe by default; must handle FFI boundaries
Tooling Standard C toolchain Cargo, crates, modern testing
Ease of writing complex logic Lower-level, more boilerplate Higher-level abstractions, fewer bugs
Binary size Smaller Possibly larger due to runtime/static linking unless optimized
Panic/UB risk Higher (buffer overflows, UB) Lower for Rust code; still must prevent panics across FFI

Example: Registering an aggregate ©

Aggregate functions need step and final callbacks. Example: a simple variance aggregator.

// variance.c (sketch) #include <sqlite3.h> #include <stdlib.h> typedef struct { double sum; double sumsq; int n; } Variance; static void variance_step(sqlite3_context *ctx, int argc, sqlite3_value **argv){     if(argc < 1) return;     if(sqlite3_value_type(argv[0]) == SQLITE_NULL) return;     Variance *v = sqlite3_aggregate_context(ctx, sizeof(*v));     if(!v) { sqlite3_result_error_nomem(ctx); return; }     double x = sqlite3_value_double(argv[0]);     v->n += 1;     v->sum += x;     v->sumsq += x*x; } static void variance_final(sqlite3_context *ctx){     Variance *v = sqlite3_aggregate_context(ctx, 0);     if(!v || v->n < 2) { sqlite3_result_null(ctx); return; }     double var = (v->sumsq - (v->sum * v->sum)/v->n) / (v->n - 1);     sqlite3_result_double(ctx, var); } 

Register with sqlite3_create_function_v2 specifying step and final callbacks.


Testing and CI

  • Unit test native code where possible.
  • Integration tests: run SQLite and load the compiled extension, execute queries, assert results.
  • CI tips:
    • Build artifacts for target platforms.
    • Use GitHub Actions or similar to test Linux, macOS, Windows.
    • On macOS and Windows ensure correct library naming (.dylib/.dll) and export symbols.

Distribution and versioning

  • Semantic version your extension.
  • Document compatibility with SQLite versions.
  • Consider bundling with the application vs providing as a loadable module; bundling reduces runtime loading complexity and ensures API compatibility.

Common pitfalls

  • Returning pointers to stack-allocated memory — always allocate with sqlite3_malloc or ensure lifetime.
  • Panics/unwinding in Rust across FFI.
  • Mismatched calling conventions or missing exported symbol names.
  • Not marking functions deterministic when they are — reduces optimization.
  • Not testing under different SQLite threading modes.

Resources and further reading

  • SQLite Loadable Extensions documentation (official docs)
  • SQLite C API reference (sqlite3.h)
  • rusqlite and libsqlite3-sys crates for Rust integration
  • Example virtual table implementations in SQLite source distribution
  • sqlite-loadable crate for Rust loadable extensions

Extending SQLite with native code unlocks powerful capabilities. Use C for direct, minimal-overhead access and Rust when you want memory safety and modern language ergonomics. Follow the best practices above to avoid common errors, ensure safe memory handling, and keep extensions robust and maintainable.

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *