How we scaled file processing to support 100,000+ files


As Dynbox has grown, the number of files managed by our system has skyrocketed from a few thousand to over 100,000. It's an incredible milestone, and it's all thanks to you, our early users. This growth brought with it a new, exciting challenge: how do we keep up?
I want to take you behind the scenes to share how we rebuilt our file processing engine to handle this scale and what it means for the future of Dynbox.
What Is File Indexing?
When you connect your files to Dynbox, we don't just list them in the explorer. To make them truly useful for our AI, we need to "index" them.
Indexing is the process where we create a summary and an "embedding"—a mathematical representation of the file's content. This allows the AI to quickly grasp the meaning of your documents without having to read them from scratch every single time. It's the magic that powers fast, accurate search and analysis.
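To make that concrete, here's a heavily simplified sketch of what an indexing step can look like. We're using the OpenAI SDK and model names purely for illustration here; they aren't necessarily what runs in our pipeline:

```ts
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Illustrative: produce a short summary of the file's content.
async function summarize(text: string): Promise<string> {
  const chat = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: `Summarize this file:\n\n${text}` }],
  });
  return chat.choices[0].message.content ?? "";
}

async function indexFile(fileId: string, content: string) {
  // A short summary lets the AI skim the file without re-reading it.
  const summary = await summarize(content);

  // An embedding is a vector of numbers that captures the file's meaning,
  // so semantically similar documents land close together in vector space.
  const { data } = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: content,
  });
  const embedding = data[0].embedding; // 1,536 numbers for this model

  // In production, both would be persisted alongside the file record.
  return { fileId, summary, embedding };
}
```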
This all happens in the background, but as you can imagine, it's a resource-intensive process that takes time and money.
How It Was Done Before
Our initial system was straightforward: a scheduled job that would run every 30 minutes and process up to 200 files from a queue. This worked fine in the early days.
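For the curious, that pattern looks roughly like a cron-triggered Cloudflare Worker. This is a sketch rather than our exact code; the D1 table is a stand-in for wherever the pending-file backlog actually lived:

```ts
// wrangler.toml would declare: [triggers] crons = ["*/30 * * * *"]

// Stand-in for the indexing step (summary + embedding) sketched above.
declare function indexFile(fileId: string): Promise<void>;

interface Env {
  DB: D1Database; // hypothetical: wherever the pending-file backlog lived
}

export default {
  async scheduled(_event: ScheduledEvent, env: Env, _ctx: ExecutionContext) {
    // Grab up to 200 files that are still waiting to be indexed.
    const { results } = await env.DB
      .prepare("SELECT id FROM files WHERE indexed = 0 LIMIT 200")
      .all<{ id: string }>();

    // Anything beyond 200 waits for the next run, 30 minutes later.
    for (const file of results) {
      await indexFile(file.id);
    }
  },
};
```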
However, as usage grew, this approach showed its limits. Running the job more frequently risked empty runs that wasted resources, while the 200-file cap meant we could index at most 400 files an hour, roughly 9,600 a day. During a large upload, the queue grew faster than we could drain it, and delays piled up. We were hitting a scaling wall.
A New, Scalable Engine
To solve this, we moved to a new architecture built on Cloudflare Queues and Workers.
Now, when a new file is added, it's immediately pushed to a queue. That queue automatically triggers a fleet of Workers that scales with demand: if there's a sudden spike of 5,000 new files, the system dynamically scales up and processes them all in parallel.
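In code, the new flow is refreshingly small. Here's the shape of it (binding names and batch settings are illustrative):

```ts
// wrangler.toml (illustrative):
//   [[queues.producers]]
//   binding = "INDEX_QUEUE"
//   queue = "file-indexing"
//
//   [[queues.consumers]]
//   queue = "file-indexing"
//   max_batch_size = 25
//   max_concurrency = 20

// Stand-in for the indexing step described above.
declare function indexFile(fileId: string): Promise<void>;

interface Env {
  INDEX_QUEUE: Queue<{ fileId: string }>;
}

export default {
  // Producer: runs whenever a file is added and enqueues it immediately.
  async fetch(request: Request, env: Env): Promise<Response> {
    const { fileId } = await request.json<{ fileId: string }>();
    await env.INDEX_QUEUE.send({ fileId });
    return new Response("queued", { status: 202 });
  },

  // Consumer: Cloudflare delivers messages in batches and raises the number
  // of concurrent invocations as the backlog grows, up to max_concurrency.
  async queue(batch: MessageBatch<{ fileId: string }>, env: Env) {
    for (const msg of batch.messages) {
      await indexFile(msg.body.fileId);
      msg.ack(); // acknowledge so the message isn't redelivered
    }
  },
};
```

That autoscaling consumer is what lets a 5,000-file spike drain in parallel instead of waiting in line behind a fixed half-hourly batch.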
The result? We can now index thousands of files in minutes, ensuring the AI has access to your latest information almost instantly.
Limitations and Rethinking Pricing
This new power comes with its own set of challenges. Indexing, especially with powerful AI models, is computationally expensive. To manage memory and cost, files larger than 20 MB are currently indexed from their metadata rather than their full content.
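Concretely, that's a simple size gate before the indexing step (a condensed sketch; names are illustrative):

```ts
const MAX_FULL_INDEX_BYTES = 20 * 1024 * 1024; // 20 MB

// Decide how much of a file we can afford to index.
function indexingStrategy(file: { size: number }): "full" | "metadata-only" {
  // Loading and embedding very large files would exceed Worker memory
  // limits and drive up model costs, so we fall back to metadata alone.
  return file.size > MAX_FULL_INDEX_BYTES ? "metadata-only" : "full";
}
```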
These constraints led us to an important realization: our pricing model needs to evolve. Indexing a vault with thousands of files is a significant cost, and we don't want users to burn through their monthly AI credits before they've even had a chance to chat with the AI.
We believe a fairer approach is to separate the cost of indexing from general AI usage. Here's what we're planning:
- Dedicated Indexing Quota: Your plan will include a generous quota for monthly file indexing, separate from your other AI credits. This means you can organize large vaults without worrying about running out of credits.
- Simplified AI Credits: We want to make AI usage clearer. Instead of abstract "tokens," we're moving towards a simple message-based system. A query with a fast model might cost 1 message, while a more complex one with a smart model might cost 2.
We're still finalizing the details, but our goal is to make our pricing more transparent, predictable, and fair for everyone.
Stay tuned for another blog post soon with a full breakdown of the new pricing structure. Thank you for being on this journey with us!