I was a bit worried about how we were doing partial pulls, even after I got our “meta-file filters” working. It turns out that, at present, we have no way of knowing when to pull bare files (files without meta-files) at all.
To fix that, we need to track the dependencies of each file. Luckily, we can store them as a list of URIs in each meta-file. There is no need to resolve dependencies recursively across meta-files, because each meta-file must list all of its direct and indirect dependencies, and meta-files may not be dependency targets.
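Here is a rough sketch of what that flat dependency list could look like. Everything named here is an assumption made for illustration: the `MetaFile` class, the `.meta` suffix used to recognise meta-files, and the example URIs are not taken from our actual code.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical convention: meta-files are recognised by this URI suffix.
META_SUFFIX = ".meta"


@dataclass(frozen=True)
class MetaFile:
    uri: str
    # Flat list of ALL direct and indirect dependencies, as URIs.
    dependencies: tuple[str, ...] = ()


def is_meta_uri(uri: str) -> bool:
    return uri.endswith(META_SUFFIX)


def invalid_dependencies(meta: MetaFile) -> list[str]:
    """Declared dependencies that are themselves meta-files (not allowed)."""
    return [dep for dep in meta.dependencies if is_meta_uri(dep)]


def files_to_pull(meta: MetaFile) -> list[str]:
    """Everything a partial pull needs for this meta-file, dependencies first.

    No recursion: the list is already the full closure, and none of the
    entries can be a meta-file with dependencies of its own.
    """
    return [*meta.dependencies, meta.uri]
```

Because the list is already complete, a partial pull only has to read one meta-file to know which bare files to fetch.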
Then we have to enhance our pull system to guarantee that dependencies are stored before the meta-file that declares them. That gets a little complicated if we’re still shooting for high throughput with lots of queuing, but it shouldn’t be too bad.
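One way the ordering guarantee might be kept without giving up queuing is to queue every dependency up front and only block right before each meta-file is stored. This is a sketch under assumptions, not our pull system: `pull_in_dependency_order` and the `fetch_and_store` callback are hypothetical names, and a thread pool stands in for whatever queuing we actually use.

```python
from concurrent.futures import ThreadPoolExecutor


def pull_in_dependency_order(meta_files, fetch_and_store, max_workers=4):
    """Store each meta-file only after all of its dependencies are stored.

    meta_files: list of (meta_uri, [dependency_uri, ...]) in repository order.
    fetch_and_store: hypothetical callable that pulls and stores one URI.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Phase 1: queue every bare file once; these may finish in any order.
        pending = {}
        for _meta_uri, deps in meta_files:
            for dep in deps:
                if dep not in pending:
                    pending[dep] = pool.submit(fetch_and_store, dep)

        # Phase 2: walk meta-files in their linear order, waiting only for
        # the dependencies that this particular meta-file declares.
        for meta_uri, deps in meta_files:
            for dep in deps:
                pending[dep].result()  # re-raises if storing the dep failed
            fetch_and_store(meta_uri)
```

Meta-files are stored one at a time here, which keeps their order well-defined; bare files finish in whatever order the pool gets to them.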
The end result is that meta-files have a well-defined order within a repository, but bare files do not. Seems reasonable.
I don’t like that we’re slowly slipping from linear order to a dependency graph, but partial pulls are too important to give up on (just like every other feature).
We’re probably going to blow our “end of September” self-imposed deadline, but that’s alright I guess.
One minor problem is that if a file has dependencies it doesn’t declare, everything can appear to work until the file is pulled, and only then does it break. But declared dependencies can be updated after the fact, and broken dependencies can be detected and fixed automatically (if you have a checker for the file format).
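As a sketch of what that automatic fix might look like: assume a format-specific checker is available as a callable, here called `references_of`, that returns the URIs a given file refers to. That callable, the function names, and the idea of starting from the file itself are all assumptions for illustration.

```python
def referenced_closure(start_uri, references_of):
    """All URIs reachable from start_uri through the checker (direct and indirect)."""
    seen = []
    seen_set = {start_uri}  # never report the file as its own dependency
    stack = list(references_of(start_uri))
    while stack:
        uri = stack.pop()
        if uri in seen_set:
            continue
        seen_set.add(uri)
        seen.append(uri)
        stack.extend(references_of(uri))
    return seen


def repair_dependencies(declared, start_uri, references_of):
    """Return (fixed_list, missing): declared dependencies plus any undeclared ones."""
    declared_set = set(declared)
    missing = [
        uri
        for uri in referenced_closure(start_uri, references_of)
        if uri not in declared_set
    ]
    return list(declared) + missing, missing
```

Running this before a file is published would catch undeclared dependencies early, rather than after someone's partial pull has already broken.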