Searching Certificate Transparency Logs (Part 3)
In this post we’ll build a ClickHouse database schema to store billions of Certificate Transparency log entries.
Jordan Griffin is a software engineer at CertKit who spends his days convincing databases to store unreasonable amounts of data in reasonable amounts of space. His current obsession is Certificate Transparency logs, where he’s built systems to index billions of certificates and query them faster than you can say “ReplacingMergeTree.”
Before discovering that 8 billion rows could fit in 600 GB if you squint hard enough at your schema, Jordan worked on distributed systems and data pipelines. Now he writes about ClickHouse optimizations, CT log parsing, and the dark arts of columnar storage.
He believes the best database schema is the one that makes your queries fast and your storage bills small. He also believes that if your table scan takes 32 seconds, you should probably make another table.
When he’s not reversing strings for performance gains or partitioning by expiry date, Jordan is building the infrastructure that powers CertKit’s certificate monitoring and alerting.