EarthFS 2014-09-17



Many modern filesystems are littered with the reeking remains of attempts at supporting metadata (for example, NTFS), most of which nobody cares about and which just add implementation complexity.

I read this… and went into a cold sweat.

Our Firefox extension is in pretty sorry shape. I figured I’d look up the source code for nsIWebBrowserPersist, which comes so close to what we want to do, but even with that in hand the situation is worse than I thought.

I realized that if we just stored plain WARC files in EarthFS, all of the dependencies would be in one file and we wouldn’t need dependency tracking at all. We could do that for everything: people already use archives for manga, so there’s no reason to convert them to URI lists or anything. And with our system of “invisible dependencies,” having content links is no use anyway, because those files aren’t visible.

I was even thinking we might need a way to garbage collect invisible files that aren’t depended on by anything else, which goes to show we’re on the wrong path.
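To make the cost concrete, here is a rough sketch of what that garbage collection would amount to: a mark-and-sweep over the dependency graph, treating visible files as roots. This is not EarthFS code. The function name `collect_garbage`, the `files` map (address to visibility flag), and the `deps` map (address to the addresses it depends on) are all made up for illustration.

```python
from collections import deque

def collect_garbage(files, deps):
    """Return the addresses of invisible files that nothing visible depends on.

    files: dict mapping address -> True if the file is visible (hypothetical layout)
    deps:  dict mapping address -> set of addresses it depends on (hypothetical layout)
    """
    reachable = set()
    # Mark phase: every visible file is a root; follow dependencies transitively.
    queue = deque(addr for addr, visible in files.items() if visible)
    while queue:
        addr = queue.popleft()
        if addr in reachable:
            continue
        reachable.add(addr)
        queue.extend(deps.get(addr, ()))
    # Sweep phase: anything stored but never reached is garbage.
    return {addr for addr in files if addr not in reachable}

files = {"page": True, "img": False, "css": False, "orphan": False, "orphan_dep": False}
deps = {"page": {"img", "css"}, "orphan": {"orphan_dep"}}
print(sorted(collect_garbage(files, deps)))
```

Note that an invisible file depended on only by other unreachable invisible files gets collected too. Even in this toy form it needs a complete, trustworthy dependency graph, which is exactly the machinery a single self-contained archive would make unnecessary.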

You could say breaking these files out into their own addresses gives the opportunity for de-duplication, but as we’ve learned over and over, de-duplication is a red herring.

And our archive extraction system was about to get even worse because we couldn’t use simple URI lists. The meta-data has to be part of the archive, not part of the individual targets, meaning it’d become some JSON monstrosity.

Having files scoped by purpose is absolutely critical. One of the criticisms of Evernote (?) is that when you start saving a lot of web pages with it, they get all mixed together with your notes. But the solution doesn’t involve hiding web pages or notes.

Likewise, if applications want to store non-user-data in EarthFS, the solution isn’t hidden files, because without meta-files you can’t add any meta-data and even the application can’t find them again.

So I’d like to thank angersock for this series of comments, even though an earlier one was a bit harshly worded.

If you want to pitch a more useful and more abstract version of what you’re describing (“how can we present searching and accessing a metadata forest backed by traditional hierarchical file stores”) then by all means I’ll be friendlier but right now you’re coming across as a crank ignorant of the history of the ideas you’re decrying.

Well, this one is pretty harsh too, but it’s completely reasonable. Library Transfer Protocol guy sounds like a crank, and so do I. :(


https://github.com/priestc/LibraryDSS thats an old repo from when I tried to write code for this about a year ago. Most likely if I ever get around to working on it again, I’m going to make a new repo for it.

Oh, too bad, it really is dead.

New plan:

  1. Get rid of dependencies
  2. Figure out our meta-data system, paying due consideration to past failures and the fact that there’s probably a very good reason for them
  3. Hopefully change as little as possible; we still wanna ship in two weeks (lol)

BTW after the video on Phil Fish[#] I’m really reconsidering publishing even a portion of my notes. And without them I know EarthFS is DOA. The hero you deserve.

Working at a big company is sounding better than ever.