notes.public

EarthFS 2014-09-23

hash://sha256/d42ed2d16c95156176238699799504bf52a37c0637dbbd54878d58064da63827[#]

So I started writing the header file last night, which some might argue was a violation of my rule of sleeping on new ideas[#]. But the goal of the rule is to spend time thinking about them before digging in, and laying out the header file is a way of doing that.

Anyway, I started working on the implementation this morning and the first 80% came together really fast. I have all the basic (non-cursor) ops written, a good start on the cursor ops, and manual compaction done.

The remaining cursor ops are hung up on the API for jumping to the first/last item and initializing the cursor if necessary when doing next/previous.

The other big thing left is auto-compaction. We have to track meta-data on each level of the tree (mainly just the number of items, not sure what else), and we have to come up with a way of deciding how much to compact on each commit.

http://www.odbms.org/blog/2012/10/scaling-mysql-and-mariadb-to-tbs-interview-with-martin-farach-colton/
Scaling MySQL and MariaDB to TBs: Interview with Martín Farach-Colton.

TokuDB has much higher insertion rates, and in particular, very high single-threaded insertion rates. Thus, the slave can keep up with a much more demanding workload.

Oh, I get it. “Slave lag” is the same problem we’re having with doing fast syncs in EarthFS, even though it isn’t a master/slave system.

The Log-Structured Merge Tree (LSM) also achieves very high indexing rates, but at the cost of so-called read amplification, which means that indexing is much faster than for B-trees,

Oh, I get it. When you walk down the fractal tree, you pick up any intermediate messages along the way, since they’re in the same place.

However, that’s why LSM-trees usually support bloom filters for lookup. So the complaint isn’t actually valid, and in fact bloom filters might actually be faster. Fractal trees have to pay for their message system with smaller branches.

Q9. How is TokuDB different with respect to SchoonerSQL, NuoDB and VoltDB?

You have to be pretty badass to handle questions like this in an interview. I’ve never even heard of SchoonerSQL.

https://leveldb.googlecode.com/svn/trunk/doc/impl.html
LevelDB file layout and compactions

Reading over this again. AFAICT, it’s just so needlessly complicated. Their 2MB files are basically pages in a regular b-tree.

Compactions for a particular level rotate through the key space. In more detail, for each level L, we remember the ending key of the last compaction at level L. The next compaction for level L will pick the first file that starts after this key (wrapping around to the beginning of the key space if there is no such file).

They have to do this because files never get rewritten once they’re stored. However, we’re using a b-tree, so it automatically handles dividing things up so branches stay relatively balanced. That means we don’t have to rotate, and in fact maybe we could see better performance by not rotating.

By always compacting from the head or tail of each level, we can intentionally skew data across the levels so that perhaps cursors will be more efficient. Instead of hitting the cursor for each level randomly, we’d mostly hit the cursor for one level over and over, then switch to the next level, etc.

Worth trying at least.

Also, I think that if we compact the last level from the tail, and/or the first level from the head, we can save the underlying b-tree a lot of work because the items don’t actually have to move at all. I’m not sure which one we should choose, or if we should do both.

Right now we always fully compact level zero at the end of each transaction. You can even do compaction in the middle of a transaction and open cursors should (hopefully) keep working.

Oh whoops, no they won’t. When we move an item from one level to another, the cursor at the new level has to pick up on it. Fixing that would take a lot of extra work too.

Well, the cursors can lose their positions, but at least they won’t completely break.

Figuring out how to do auto-compaction intelligently seems tough. If you have a “full” tree, one write at the top needs an additional write at every level. Perhaps the rule is number of writes * number of levels * overall fullness. A tree that’s mostly empty should be allowed to fill up, whereas a tree that’s extremely full has to be pruned back.

We have to be careful that if one level is too full and another level is brand new and mostly empty, they don’t average out. Or actually, that might be okay, as long as we prune the full level whenever we prune anything.

We should also make sure that we don’t do compactions below a certain size, because they just aren’t efficient.

It’s actually probably better not to take the current write size into account at all. We don’t want to make big transactions even slower, we want to shave the peaks.
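
Putting those constraints together, the level-picking side might look something like this. Just a sketch in C; the struct, names, and threshold are made up, not actual code:

    /* Per-level bookkeeping, as discussed above: mainly just item counts. */
    typedef struct {
        size_t count;   /* items currently in this level */
        size_t target;  /* how many items this level "should" hold */
    } level_info;

    #define MIN_COMPACTION_ITEMS 64 /* skip compactions too small to be worth it */

    /* Pick the level most in need of compaction, or -1 if nothing qualifies.
       Checking fullness per level keeps a full level and an empty one from
       averaging out, and the current write size is deliberately ignored so
       big transactions don't get punished further -- shave the peaks. */
    static int choose_level(level_info const *levels, int nlevels) {
        int best = -1;
        double worst = 1.0; /* only consider levels that are over their target */
        for (int i = 0; i < nlevels; i++) {
            if (levels[i].count < MIN_COMPACTION_ITEMS) continue;
            double fullness = (double)levels[i].count / (double)levels[i].target;
            if (fullness > worst) { worst = fullness; best = i; }
        }
        return best;
    }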

With any luck I should be able to finish this up tomorrow. Pretty excited to test it out.

EarthFS 2014-09-22

https://news.ycombinator.com/item?id=8275970
https://moderncrypto.org/mail-archive/messaging/2014/000780.html
Modern anti-spam and E2E crypto (moderncrypto.org)

It took me ages to get around to reading this post.

I think the solution is obvious, which is never a good start for anyone talking about email spam. But bear with me.

http://inessential.com/2014/09/17/spam
Spam

You know what great technology doesn’t have a spam problem at all? RSS. […] What you don’t get with RSS is blog posts from some entirely other blog than what you asked for. If you subscribe to a podcast, you don’t get episodes from some other scammy podcast.

There was a sort-of spam problem many years ago. Back then there were blog search engines (which have all shut down, as far as I know), and those search engines would index spam blogs, and so if you had a feed that was a search you could end up with posts from spam blogs.

Even if your favorite blog gets hacked or sold to a spammer, the scope of your problem is bounded by the number of subscriptions you have.

Now, the key advantage of email is that anyone can send you messages. An email that only worked between people who already had established relationships wouldn’t be nearly as useful.

But once a connection between two people is established, they can move off email to a secure medium instead. The only communication that needs to be sent in the clear is the initial contact.

How does this fit with EarthFS? Well, EarthFS follows the RSS/Twitter subscription/follower model. So it isn’t a replacement for email. But once someone emails you with a pointer to their encrypted repo, you can browse it securely and subscribe if you want to.

So basically, everything is perfect. Email can’t be secured but it doesn’t need to be. Just don’t use it for secure communications.

Botnets appeared as a way to get around RBLs, and in response spam fighters mapped out the internet to create a “policy block list” - ranges of IPs that were assigned to residential connections and thus should not be sending any email at all.

This bothers me, as does the reverse (blocking server IPs from doing things that only residential connections are “supposed” to do).

EarthFS 2014-09-21

hash://sha256/1aa20aed3c7ca4b5862bc185ee6582c9a4ed11426a71334d3cc9483f9f19b303[#]

I just wanted to ask: if LevelDB has better write performance than MDB, how can we possibly build an abstraction on MDB that has write performance anywhere near LevelDB?

Well, uh… When you put it like that, good question.

People talk about write amplification in LSM-trees, but b-trees have a form of write amplification too. When you touch a page just to change one tiny thing, you have to rewrite the whole page. When a tree is small and you write two random rows, there’s a decent chance they’ll be on the same page and the writes can be combined. As the tree grows, the writes spread out and the overhead increases.

I checked the source and mdb_cursor_put calls mdb_cursor_set, which starts by checking if the cursor is already on the right page. I think that’s just about all it should take for our merges to be efficient.
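
If that holds, a merge can mostly push its writes through a single cursor. A quick sketch of what I mean (the kv struct and put_batch are made up, not real code):

    #include <stddef.h>
    #include <lmdb.h>

    /* Push a batch of already-sorted, nearby keys through one cursor. Going by
       the source reading above, mdb_cursor_put checks the cursor's current page
       first, so most of these puts shouldn't have to re-descend from the root. */
    typedef struct { MDB_val key; MDB_val val; } kv;

    static int put_batch(MDB_txn *txn, MDB_dbi dbi, kv *items, size_t n) {
        MDB_cursor *cur = NULL;
        int rc = mdb_cursor_open(txn, dbi, &cur);
        for (size_t i = 0; i < n && MDB_SUCCESS == rc; i++) {
            rc = mdb_cursor_put(cur, &items[i].key, &items[i].val, 0);
        }
        if (cur) mdb_cursor_close(cur);
        return rc;
    }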

Oh, the other thing is, how does it handle multiple writes to the same page within a transaction?

Hmm, it looks like dupsort doesn’t carelessly waste pages after all. Then I have no explanation for what was happening.

/* Was a single item before, must convert now */

Uh… Maybe the overhead was from keys with two items?

Probably.

It took some figuring, but I think multiple writes to the same page within a transaction are indeed perfectly efficient. Without MDB_WRITEMAP, the page gets allocated on the heap and is tracked by the transaction, not by the cursor. The cursor basically keeps a weak reference. When you copy a cursor, they both point to the same page buffers.

That’s all extremely reasonable for a production-quality database. My question is, why are its writes slow then?

Writes to a cursor that’s on the right page should be very fast. If the cursor is on the wrong page, it seems to start over from the root, which would be slow. It would be nice if it searched on its way up from the current page to the root, in case the target page was close to the current page (somewhere on the same branch).

But when our depth is only 3 or 4, it would only save a few comparisons and a few pages in the cache. And it would be slower if you’re trying to jump the cursor from one position to a completely different one. Maybe MDB could have a flag for it.

At any rate, if our merge factor is two, then each source page should only hit two or three target pages during a merge. And even then we’re hopefully not searching at all, because we’re carefully stepping through during the merge. If there is a right page, the cursor should already be on it when we do an insertion.

The merge factor isn’t about searching, it’s about stepping. With a high merge factor, you have to scan through a lot of junk to find the right places to insert at. In that case a merge factor of ten seems perfectly reasonable.
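
So the merge loop itself could be as dumb as walking the source level in order and letting the target cursor ride along. A sketch, assuming the level-tagged key layout described in the LSMDB notes below; deleting the merged items back out of the source level is left out, and the helper name is made up:

    #include <string.h>
    #include <lmdb.h>

    /* Walk the source level with one cursor and insert each item into the
       target level through another. Because both levels sort the same way,
       the target cursor mostly stays on or near the same page between puts. */
    static int merge_step(MDB_cursor *src, MDB_cursor *dst,
                          unsigned char src_level, unsigned char dst_level) {
        unsigned char kbuf[512];
        kbuf[0] = src_level;
        MDB_val skey = { 1, kbuf }, sval;
        int rc = mdb_cursor_get(src, &skey, &sval, MDB_SET_RANGE);
        while (MDB_SUCCESS == rc) {
            if (src_level != ((unsigned char *)skey.mv_data)[0]) break; /* end of level */
            if (skey.mv_size > sizeof(kbuf)) return MDB_BAD_VALSIZE;
            memcpy(kbuf, skey.mv_data, skey.mv_size);
            kbuf[0] = dst_level; /* re-tag the key for the target level */
            MDB_val dkey = { skey.mv_size, kbuf };
            rc = mdb_cursor_put(dst, &dkey, &sval, 0);
            if (MDB_SUCCESS != rc) return rc;
            rc = mdb_cursor_get(src, &skey, &sval, MDB_NEXT);
        }
        return MDB_NOTFOUND == rc ? MDB_SUCCESS : rc;
    }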

Okay… I think we’re basically set. This is an awesome idea and it’ll be super fast. I don’t see any limitations in MDB that’ll bite us.

Without nested b-trees, MDB_APPEND is a lot less useful. You can only append the last value in the last table, rather than the last value in any table. Maybe we should design our LSM-tree to work with MDB’s multiple table support.

Many tables are append-only and don’t need LSM-trees either. Although we could handle that either way (with a special level for non-LSM mode).

We could support our own approximation of MDB_APPEND using a cursor and either scanning backward or just letting it find the appropriate slot on the last page.

I guess the LSM logic would be a lot easier without having to worry about compacting each separate table. And potentially have less overhead, since compactions below a certain size are just more trouble than they’re worth.

If we’re using an LSM-tree, we’d actually want append mode pretty much always. Basically just use mdb_cursor_put for everything and assume it’ll be on the last page 90% of the time.

Okay? Okay.

EarthFS 2014-09-21

hash://sha256/ecbaf4a41347c3bf2430254eec9c9d33c910f1591673d84a0b965058786d9623[#]

I increasingly suspect that the dark lord MDB_DUPSORT, after failing with rings, has moved on to disk overhead.

Basically, I think that, since dupsort involves nested b-trees, it has a considerable overhead when there aren’t actually any duplicates. It seems like MDB allocates at least one page for every dupsort value.

Switching off dupsort on the same workload:

To be clear, this comparison is unfair because with dupsort off, the duplicates obviously aren’t being stored. However, I don’t think that accounts for the difference at all. 18MB is comparable to the equivalent database in SQLite. 44MB is like four times larger than the files we are indexing (the full-text index is the biggest chunk of the rows).

From that point this morning, here’s my thought process throughout the day:

  1. Let’s rewrite the code that uses dupsort to stuff everything in the keys instead.
  2. Dupsort was a big selling point of MDB over Sophia. Without dupsort, and if we’re still concerned about write performance, how about we just switch?
  3. Or how about writing a shim, based on the combined APIs of MDB and Sophia?
  4. Uh, looks like Sophia is still missing a lot of critical features and has some serious bugs. Plus I’m just hesitant to go with such a small/low profile project.
  5. Going by the old MDB benchmark without MDB_WRITEMAP (which seems to be slow on many versions of Linux, including mine), LevelDB write performance kicks some serious ass, and its read performance is still no slouch.

Another note is this:

http://www.openldap.org/lists/openldap-technical/201405/msg00109.html
Re: [LMDB] Single writer question
Howard Chu

I basically got the same result (no performance improvement), and decided it wasn’t worthwhile to merge the changes.

At this point, all of the pieces are in place. Let’s write “LSMDB.”

Wait, why aren’t we just using LevelDB? Well, because I’m tired of trying to save work and having it totally not pay off. All that time we tried to get filters based on dynamic SQL working? Wasted. Time spent using SQLite (including writing a custom async VFS)? Wasted. Fighting dupsort, switching to Sophia, switching to LevelDB? Potentially wasted.

I’m pretty much convinced that MDB is the world’s best b-tree database. The problem is, we have too many writes for a b-tree to work. The solution is, LSM-trees are based on b-trees so we can build the best LSM-tree by using the best b-tree.

https://leveldb.googlecode.com/svn/trunk/doc/impl.html
LevelDB file layout and compactions

I believe this is needlessly complex. We’ll find out if I’m right.

LSMDB is to be an interface layer on top of LMDB. It should be possible to implement without touching any of the MDB source or using any of its private functions.

LSMDB uses MDB in “flat” mode, without named databases. Likewise, it doesn’t expose named databases itself (not to mention dupsort). It will hopefully have a convenient interface for iterating over portions of the keyspace, which should suffice (and be more efficient in general).

LSMDB reserves the first byte of every key. With this byte, it arranges user keys into discrete levels. New data is always added at level 0. Over time it is compacted and merged into higher levels. Each level spans the entire key-space.
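
For concreteness, the key encoding is about as trivial as it gets. A sketch (the helper name is made up):

    #include <string.h>

    /* Prefix the user's key with one reserved level byte. Level 0 is where
       new writes land; compaction moves items toward higher levels. */
    static size_t lsmdb_key(unsigned char *out, size_t outlen,
                            unsigned char level,
                            void const *ukey, size_t ulen) {
        if (1 + ulen > outlen) return 0;
        out[0] = level;              /* reserved first byte: the level */
        memcpy(out + 1, ukey, ulen); /* user key follows unchanged */
        return 1 + ulen;
    }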

MDB doesn’t support concurrent writes, so compactions are done on the foreground thread during transaction commit. Each compaction runs on a single level and merges items into the next higher level. The number of items merged in a single compaction is based on a floating average, and the level chosen is whichever level needs compaction most (actual size divided by target size). The size of each level is also stored in the database.

Lookups are done by creating a cursor and checking each level from lowest to highest. User cursors require one internal cursor for every level. EarthFS doesn’t need fast deletions, so they would be done by explicitly deleting the key at every level (they could also be done by reserving the initial byte of every value).
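
A point lookup is then just a loop over the levels. Sketch below; it uses plain mdb_get for brevity where the real thing would keep one cursor per level as described, and the 512-byte key cap is an arbitrary placeholder:

    #include <string.h>
    #include <lmdb.h>

    /* Check each level from lowest (newest) to highest, return the first hit. */
    static int lsmdb_get(MDB_txn *txn, MDB_dbi dbi, unsigned nlevels,
                         void const *ukey, size_t ulen, MDB_val *out) {
        unsigned char kbuf[512];
        if (1 + ulen > sizeof(kbuf)) return MDB_BAD_VALSIZE;
        memcpy(kbuf + 1, ukey, ulen);
        for (unsigned level = 0; level < nlevels; level++) {
            kbuf[0] = (unsigned char)level;
            MDB_val key = { 1 + ulen, kbuf };
            int rc = mdb_get(txn, dbi, &key, out);
            if (MDB_NOTFOUND != rc) return rc; /* found it, or hit a real error */
        }
        return MDB_NOTFOUND;
    }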

No special work is done to keep level 0 small during a transaction, so large transactions would potentially slow down as they go along. Bloom filters would not be used, hopefully offset by MDB’s extremely high read performance. Custom comparison functions would be unsupported, at least initially (I think they’re generally a bad idea). Level 255 might be reserved for any data the user wished to store outside of the LSM-tree.

For optimal write performance, it seems to me like each level should be no more than twice as large as the previous. However, that might lead to an absurd number of levels and cursors. It looks like in LevelDB, each level is ten times as large as the previous. Obviously this parameter would be configurable.
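
Rough arithmetic with made-up numbers: if level 0 targets 1,000 items and the whole tree holds a billion, a growth factor of 2 needs about 21 levels (1,000 × 2^20 ≈ 10^9), and that many internal cursors per lookup, while a factor of 10 needs about 7 (1,000 × 10^6 = 10^9). That’s the whole trade-off.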

Oh… If we reverse the level numbers, we can support MDB_APPEND. Conceptually, it makes some sense to have the newest rows at the end of the key-space.

How does that sound?

It sounds like a fair amount of work and totally unrelated to our core goal of EarthFS, but if you want to break an omelet, you have to make a few eggs.

http://www.openldap.org/lists/openldap-technical/201407/msg00099.html
Re: Will lmdb suit my needs ?
Howard Chu

Answering the subject first: probably not. Someone else with a similar application, maybe, but that depends a lot on one’s ability to read and research.

lol.

Keywords: tech support, creative process, burnout

The mailing list was certainly full of people asking stupid questions, and Howard telling them to “RTFM”. On one hand he’s kind of an asshole, but on the other what else is there to be done?

To be fair, MDB is very tersely documented and I had to dig into the source to answer a lot of more obscure questions. And it’s even worse for people using bridges from high level languages, because 1. they can’t read the source since they probably don’t know C, and 2. the bridge probably sucks.

Maybe my escapades in Firefox extension development would’ve been a lot less painful if I had just dug out the Firefox source. I did eventually download it, but I’ll probably never look at it at this rate.

So… LSMDB?

EarthFS 2014-09-20

hash://sha256/dbad021392865720cd2ad488f3398d4e411e889e404440852fdb6af1d8b7e7fa[#]

Today I mostly got exiting working. You hit control-C and it unhooks the event listeners and the process terminates gracefully. “Mostly” because if a pull is blocking while reading a URI list, we currently have no way to stop it. In theory it should wake up on its own every 30 seconds, but even if that were working, it wouldn’t be good enough.

The two basic approaches are some sort of async select(2) or just having a way to cancel reads in our HTTP library. Doing our own select seems like a ton of work and I’d rather avoid it if possible, but the other way is kind of a hack.

After getting the process to exit cleanly, I was able to verify that Valgrind says we have basically no leaks. It thinks a couple Objective-C runtime things are leaked, and our async worker pool, but those all have the lifespan of the process. There’s probably an API for telling Valgrind to ignore them.

What I thought was an enormous memory leak was actually just MDB working as intended. I just didn’t realize it would count towards the resident memory count. Virtual memory is around 400MB for an MDB mmap of 256MB. Resident memory starts at like 2MB and climbs to 100MB as we do a large pull. Shared memory climbs to around 70-80MB, which is about the size of the database file. So that would be like 20MB of application overhead.

I’m a bit worried about how that would work on iOS or a shared host with low memory limits.

Cutting our pull queue size from 512 to 256 files seems to cause throughput to decay faster than normal. If so, it seems like an argument in favor of switching to an LSM-tree database. I might end up writing a shim layer if we’re serious about switching again.

I ended up realizing that dependency tracking is not just a minor feature. It’s critical for just about everything beyond simple notetaking. Even just embedding photos in a blog post requires it. So we might end up adding that. Now that I have a pretty good grasp of what it does and why it’s necessary, it doesn’t actually seem that complex.

Without dependency tracking, our Firefox extension isn’t worth doing. Even with dependency tracking, I’m not 100% sure it is.

The only sane way to do it is to track the request hierarchy as resources are being loaded. How we treat a particular resource depends on how it was requested, not what its stated MIME type is or anything else. Maybe it’s possible to do what we want in Firefox with nsIDocumentLoader, but they don’t make it easy.

For doing the actual loading, conversion, and storage, there is a convenient lack of overlap between files that need to be converted and files that are likely to be large. We can just keep the text-based files in memory, and media like images can be streamed directly to the repo.

Doing the import first and then having the browser always load the converted version of the page seems like a bad idea for dynamic sites and general usability.

I haven’t dared to begin changing our meta-file system to add meta-files for all files by default. After flip-flopping back and forth on it so many times, I’m just hesitant to change it, even though I think this is the right way to do it.

Likewise, I still haven’t worked out the details for our intersection filter caching idea. I think it could work with MDB if we used tiny map sizes by default, like 2MB, and supported growing them. But I don’t really like it. LSM-trees are a bad fit too, unless we could use a single one for all the caches, but that’s limited to a single writer in every DB I know of.

It’s starting to get cold. I want to have this done before winter starts.

EarthFS 2014-09-20

hash://sha256/c5f7f4d70d464b03aabc6fd5cda2b7ad3e13efaacd2ce9117b9ffba2a5285dcd[#]

So today was nearly wasted. I tried to optimize our page loads before realizing Chrome was lying to us. Its inspector was saying ~120ms for our main page, and even ~15ms for tiny, static resources. I tried mmap, I tried caching buffers, I tried everything before realizing that Firefox reports ~30ms and <1ms, respectively.

Then I tried to use weighttp to benchmark, but it seems to have some serious bugs AFAICT. I’m pretty sure their chunked encoding support is broken.

Then I tried writing my own benchmark tool in Node.js, which I’m aware is a pretty terrible idea but I was desperate. I had some bugs in my code that took forever to figure out because Node was just silently accepting the empty request object I was passing it. Some of these empty requests were succeeding, and some were failing! It varied based on the maxSockets parameter! WTF!

Anyway, we’re able to get 9MB/s on our dynamic query page, with Node sucking half our anemic CPU, so I think that’s pretty good. That’s only 50 requests per second, but on queries with fewer results we’re able to get close to 1000 requests per second. Obviously I need a better benchmarking setup, but this seems pretty good for now.

Then I got online and made two discoveries.

https://news.ycombinator.com/item?id=8342131
https://github.com/kazuho/h2o
H2O – an optimized HTTP server and library implementation (github.com)

It supports HTTPS and even the latest draft of HTTP/2. I also looked at the code and I don’t like it. It has its own system of generators and filters and dynamic vectors and memory pools. Our system is more like direct mode in graphics programming: you just call the function you want and it does it. It means you have to keep track of whether a particular connection is (say) chunked or not, but it’s also a lot simpler to use.

I’m curious about their performance claims since they appear to use printf all over the place, when we were just trying to replace it because it was our biggest CPU bottleneck. Maybe on faster computers it doesn’t matter, but then when you have 10,000 connections it seems like it would start to matter again…

https://news.ycombinator.com/item?id=8342877
SwellJoe

The database is still gonna be a bottleneck. The application is still gonna be a bottleneck. The disks are still gonna be a bottleneck. The network is still gonna be a bottleneck (Apache can saturate a Gbit NIC, so can nginx, so can H2O).

This is close to not being true for us. And static sites are super popular these days, meaning the server itself really is the bottleneck.

Keywords: dismissal

https://news.ycombinator.com/item?id=8342982
SwellJoe

In the “Internet of things”, small embeddable web servers will be important. If H2O uses less memory than nginx, that’d be interesting (I think more interesting than performance).

Hmm.

http_parser, which we use, talks about having 40 bytes per connection of overhead. Of course, that goes down the drain since we use a fiber per connection, each with 48KB of stack. Even Golang’s 8KB stacks[#] dwarf it.

And stud requires 200KB per HTTPS connection.

https://news.ycombinator.com/item?id=8342471
marktangotango

I’ve often wondered at the lack of embeddable http servers in the c/c++ world. Are there any other libraries that do the same thing?

Given the replies: just the ones we already knew of.

The other thing I found was Memento:

http://www.mementoweb.org/
Memento: Adding Time to the Web

If you know the URI of a Web resource, the technical framework proposed by Memento allows you to see a version of that resource as it existed at some date in the past, by entering that URI in your browser like you always do and by specifying the desired date in a browser plug-in. Or you can actually browse the Web of the past by selecting a date and clicking away.

Hey, it’s Rewind Proxy!

Except it’s not. It doesn’t do its own recording, and it relies on external “timegates” in addition to archives like the Wayback Machine.

https://www.youtube.com/watch?v=xYVxREPvLS0
CNI - Memento - Giant Leaps Towards Seamless Navigation of the Past Web

This presentation was pretty painful even on 1.5x speed. The guy made sure to say everything at least twice. And I’ve gotta make sure I avoid that tone of voice if/when I ever speak again.

I was surprised they got a million dollars from the Library of Congress when I had never heard of this project before. I checked and it was apparently only mentioned once on Hacker News. According to their news page it’s still going, so I might submit it…

Anyway, reading a bit about this made me realize my ideas for a Firefox extension are completely premature. This is actually a whole field, and if I want to make a contribution I’ll have to figure out what has actually been done.

That said, Memento doesn’t actually seem to solve any of the problems I have. I want to have access to pages I’ve previously viewed, even if I’m offline, and I want them to be searchable. I’d also like to be able to queue up pages to be saved the next time I get online.

I found a project called WAIL (unfortunate name) that is somehow related to WARCreate and integrates with a local copy of the Wayback server to let you store and retrieve local copies.

http://matkelly.com/wail/
Web Archiving Integration Layer - One-Click User Instigated Preservation

One interesting thing is that it uses a background process to do the actual page archiving, instead of doing it through the browser. But, AFAICT, it uses several background servers to do just about everything. The Git repository was like 150MB.

So today seems like the day of negative results. Nothing we can use directly, but a lot we can learn from.

EarthFS 2014-09-18

hash://sha256/c3a02cc2e0f9676e97c4c4b8a5accbef44b659aa2b0ee150dbfcce1b17146432[#]

Intersection filter caching:

  1. In order to have deterministic cache names, we have to sort the filters lexicographically.
  2. In each cache, we just need to store the sort ID and file ID, both of which would be part of the key (sketched after this list). None of the rows will have any data, except possibly the one that tracks the update info.
  3. Hopefully the storage system is efficient for that use case, because MDB requires us to map a fixed chunk of memory for every cache. If every cache is 50MB and we want to keep 50 caches open… That doesn’t work.
  4. The other thing to worry about is the maximum number of open file descriptors. On a lot of systems, I think the default is like 250. Obviously on a high profile server it would be raised.
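
As mentioned in point 2, the cache rows would be keys only. A sketch of the layout (the field widths are guesses):

    #include <stdint.h>

    /* A cache row is just the key: big-endian sort ID then file ID, so rows
       come back ordered by sort ID. The value stays empty; the b-tree itself
       is the index. */
    typedef struct { unsigned char bytes[8 + 8]; } cache_key;

    static cache_key cache_key_make(uint64_t sort_id, uint64_t file_id) {
        cache_key k;
        for (int i = 0; i < 8; i++) k.bytes[i]     = (unsigned char)(sort_id >> (56 - 8 * i));
        for (int i = 0; i < 8; i++) k.bytes[8 + i] = (unsigned char)(file_id >> (56 - 8 * i));
        return k;
    }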

I also want to speed up our blog page rendering. It seems like there’s a 2ms lower bound for every file we open, read, and close. We can’t use MDB because the files can be huge. Being limited to a single writer is unfortunate too.

I’d like to concatenate them all into a small number of files, but that’s a pain to manage.

What if we opened all the files at once in parallel, memory mapped them (using the MAP_POPULATE flag), and then closed the files so we didn’t use too many file descriptors per connection? In fact, we could then use writev extremely efficiently.

Well, there’s the risk of running out of memory doing that. So we could have a cap on the amount of memory per fiber, and open files until we reached that amount.

Another concern is the portability of MAP_POPULATE. But there are some workarounds if it’s a problem.
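
Putting those pieces together, roughly this (a sketch: no error recovery, no per-fiber memory cap, and MAP_POPULATE is Linux-specific, hence the concern above):

    #define _GNU_SOURCE /* for MAP_POPULATE */
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <sys/uio.h>
    #include <unistd.h>

    struct mapped_file { void *buf; size_t len; };

    /* Map the file up front and close the descriptor immediately, so we don't
       hold one fd per file per connection. The mapping stays valid. */
    static int map_and_close(char const *path, struct mapped_file *out) {
        int fd = open(path, O_RDONLY);
        if (fd < 0) return -1;
        struct stat st;
        if (fstat(fd, &st) < 0) { close(fd); return -1; }
        void *buf = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE | MAP_POPULATE, fd, 0);
        close(fd);
        if (MAP_FAILED == buf) return -1;
        out->buf = buf;
        out->len = st.st_size;
        return 0;
    }

    /* Then the whole set can go out in one writev() call (chunking past 64
       files is omitted here). */
    static ssize_t send_all(int sock, struct mapped_file const *files, int count) {
        enum { MAXIOV = 64 };
        struct iovec iov[MAXIOV];
        int n = count < MAXIOV ? count : MAXIOV;
        for (int i = 0; i < n; i++) {
            iov[i].iov_base = files[i].buf;
            iov[i].iov_len = files[i].len;
        }
        return writev(sock, iov, n);
    }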

In order to actually do everything in parallel we won’t be able to use our async workers as-is.

What a pain.

EarthFS 2014-09-18

hash://sha256/3b4bf58cd2f28b1c0a3af4476ca9f646388e58240dd3537b873fb3075e844c6f[#]

I realized the reason behind a lot of the trouble we had with the filter system, and why we thought we had to switch to the current design in the first place.

The problem is that unions are fast, but intersections are necessarily slow.

Intersections basically have to union first, and then weed out the bad matches afterward. The reason is that our ordering is extremely demanding.

I came up with several potential solutions, along with the downsides of each:

The idea is that every intersection filter (whether it’s a pathological case or not) will order its sub-filters by cost, then generate its results (which will be in random order), and then write them to an MDB index in its own file. The file will also have a special key that says when it was last updated, so we’ll know if we need to update it during future queries or not.

We’ll keep a table of these caches in memory so we don’t have to open and close them all the time, and so that the same cache can be used by more than one query at a time (MDB relies on POSIX locking which means you can’t open the same file more than once…).

To use a cache, first we begin a read-only transaction and check if it’s up to date. If not, we switch to a write transaction and update it (if necessary; it could’ve been updated already in the meanwhile). Then we switch back to a read transaction and actually use it.
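
The read → maybe-write → read dance in code, as a sketch (cache_is_current and cache_update are placeholders for the last-updated check and the actual refresh, not real functions):

    #include <lmdb.h>

    /* Stubs standing in for "does the cache cover the newest sort ID?" and
       "pull the new matches in." The real versions read and write the special
       last-updated key mentioned above. */
    static int cache_is_current(MDB_txn *txn, MDB_dbi dbi) { (void)txn; (void)dbi; return 0; }
    static int cache_update(MDB_txn *txn, MDB_dbi dbi) { (void)txn; (void)dbi; return MDB_SUCCESS; }

    /* Hand back a read-only transaction on an up-to-date cache. */
    static int cache_open_current(MDB_env *env, MDB_dbi dbi, MDB_txn **out) {
        MDB_txn *txn = NULL;
        int rc = mdb_txn_begin(env, NULL, MDB_RDONLY, &txn);
        if (MDB_SUCCESS != rc) return rc;
        if (cache_is_current(txn, dbi)) { *out = txn; return MDB_SUCCESS; }
        mdb_txn_abort(txn);

        rc = mdb_txn_begin(env, NULL, 0, &txn); /* write txn; blocks other writers */
        if (MDB_SUCCESS != rc) return rc;
        if (!cache_is_current(txn, dbi)) { /* someone may have refreshed it already */
            rc = cache_update(txn, dbi);
            if (MDB_SUCCESS != rc) { mdb_txn_abort(txn); return rc; }
        }
        rc = mdb_txn_commit(txn);
        if (MDB_SUCCESS != rc) return rc;

        rc = mdb_txn_begin(env, NULL, MDB_RDONLY, &txn); /* back to read-only */
        if (MDB_SUCCESS != rc) return rc;
        *out = txn;
        return MDB_SUCCESS;
    }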

These caches don’t need durability, so we can use MDB_NOMETASYNC if we want.

Oh I forgot: how do we identify which cache to use? Well, we hash an unambiguous string representation of that branch of the filter tree. We can even do it without buffering.
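
Something along these lines, streaming the serialization straight into the hash. OpenSSL’s SHA256 is used here just for illustration, the filter struct is a stand-in for whatever the real filter tree looks like, and it assumes filter types and arguments can’t contain the delimiters:

    #include <stdio.h>
    #include <string.h>
    #include <openssl/sha.h>

    /* A stand-in filter node: type, optional argument, children kept in a
       deterministic sorted order so the resulting name is deterministic too. */
    typedef struct filter filter;
    struct filter {
        char const *type;   /* e.g. "intersection", "fulltext" */
        char const *arg;    /* normalized argument, if any */
        filter **children;
        size_t count;
    };

    /* Stream an unambiguous, parenthesized serialization of the subtree into
       the hash -- no buffering of the whole string. */
    static void filter_hash(filter const *f, SHA256_CTX *ctx) {
        SHA256_Update(ctx, "(", 1);
        SHA256_Update(ctx, f->type, strlen(f->type));
        if (f->arg) {
            SHA256_Update(ctx, ":", 1);
            SHA256_Update(ctx, f->arg, strlen(f->arg));
        }
        for (size_t i = 0; i < f->count; i++) filter_hash(f->children[i], ctx);
        SHA256_Update(ctx, ")", 1);
    }

    static void cache_name(filter const *f, char out[SHA256_DIGEST_LENGTH * 2 + 1]) {
        unsigned char md[SHA256_DIGEST_LENGTH];
        SHA256_CTX ctx;
        SHA256_Init(&ctx);
        filter_hash(f, &ctx);
        SHA256_Final(md, &ctx);
        for (size_t i = 0; i < sizeof(md); i++) sprintf(out + 2 * i, "%02x", md[i]);
    }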

This is a considerable pain in the ass, but the alternative, having queries return results in the wrong order, is a non-starter. None of the other solutions (that I can think of) are acceptable either.

For now we probably won’t bother and we’ll just stick with slow intersections, since we don’t have that many files yet.

If you’re wondering why EarthFS hasn’t been built already, by me or someone else, here’s your answer.

2014-09-18

hash://sha256/3b4bf58cd2f28b1c0a3af4476ca9f646388e58240dd3537b873fb3075e844c6f[#]

BTW after the video on Phil Fish I’m really reconsidering publishing even a portion of my notes. And without them I know EarthFS is DOA. The hero you deserve.

Here’s a stab at understanding the problem: talking about people.

Instead of talking about people, just do stuff. If you can learn from people, and apply it to what you’re doing, then great. But we don’t need the personal equivalent of football fandom.

BTW, I was at the grocery store recently (like usual) and they had an advertisement over the intercom (which now that I think about it is really obnoxious). It was like, “football season is starting up, get ready for it by stocking up on snacks!” I was like WTF? What is wrong with the world that those two thoughts are connected?

Keywords: spectator sports, cheerleading, bias, learning, social pressure, corruption

Back when I still posted random comments on Hacker News, I commented on the story where Steve Yegge clarified that he wasn’t actually quitting Google. People were disappointed that he wasn’t going to be tackling sufficiently hard problems, as they had inferred from his presentation. I wrote a comment saying, “it shouldn’t matter what Steve Yegge decides to do, you can work on hard problems if you want.” The reply I got could easily be predicted based on my choice of words: “it shouldn’t matter, but it does.”

In fact, that’s why I said “shouldn’t” instead of “doesn’t.” I wanted to see if the reply would be as shitty and lazy as I predicted.

http://www.reddit.com/r/programming/comments/2gn4it/the_shutting_down_of_mozilla_labs/
http://www.ianbicking.org/blog/2014/09/professional-transitions.html
The shutting down of Mozilla Labs (ianbicking.org)

When I first started writing this I felt that a leadership misdirection was at the root of Labs’ missteps. But leadership fetish feels a lot like a Great Man theory, an expectation that we are doomed or blessed only by the wisdom and fortitude of our leaders.

Seems relevant.

Note that Mozilla Labs is distinct from Mozilla Research – Research is the home of projects like Rust and ASM.js. Mozilla Research is still going strong.

It definitely seems like there are two communities, and one of them is pretty hopeless. When people make fun of MongoDB versus real engineering[#], this is what they’re attacking, and I agree[#]. Of course, it’s also possible to throw the baby out with the bathwater.

I’m also doing what I can to position EarthFS on the real engineering side. One part of this you can control just by having a well-written readme on GitHub rather than an over-designed landing page with big images and bullet-point features.

https://news.ycombinator.com/item?id=8329636
http://structr.org/
Show HN: Structr - Next-Generation Data CMS Based on Neo4j (structr.org)

I don’t know what this project is or does, but I’ll point out that a CMS and a FS also seem to reflect this divide. And they’re getting blasted in the comments, as usual.

Of course, named data networking is way more cool engineering than EarthFS is… Beyond practicality, IMHO.

Kids who think they’re going to write a new OS (of which I was one) have to eventually learn that we’ve got operating systems covered, thanks. This is why oldies are still played on the radio, too.

Of course, I don’t think Van Jacobson is doing it that way because he wants to be cool. He’s doing it that way because that’s his background. My background is applications because I’m younger and this is where the opportunities are today.

That’s actually a scary but probably very true lesson… To keep with the music analogy, you gotta change with the times, like Weird Al.

2014-09-18

hash://sha256/7402d6d7e55ca779952612220dd4827bbcaa46b0a2999766b51c447dc24f224b[#]

So yesterday I wrote that and today I read this:

http://www.reddit.com/r/programming/comments/2gn8da/tales_of_a_former_disassembly_addict/
http://prog21.dadgum.com/50.html?print
Tales of a Former Disassembly Addict (prog21.dadgum.com)

In fact, generated code can be so ridiculous and verbose that I finally came up with an across-the-board solution which works for all compilers on all systems: I don’t look at the disassembled output.

I wonder if C on -O0 is faster than JavaScript on V8. Probably not on microbenchmarks but maybe on real-world code.

I saw a comparison recently that ASM.js is faster on Android than the “native” Java environment. That made me wonder if C code compiled to ASM.js is faster than Swift for the same safety guarantees. I bet it is.

http://www.reddit.com/r/programming/comments/2gn8da/tales_of_a_former_disassembly_addict/ckkwxxv
jerf

The languages that can compile to fast code are also extremely likely to be the languages that let you manipulate memory layout in some useful way (even if you don’t get total control), and the “slow” languages that can’t emit fast code (probably due to excessive dynamicness) are also the languages that won’t even let you see memory layout, let alone do anything useful with it. In practice the “slow” languages both run slower, and consume more memory for everything, since not only do they box absolutely everything, they probably have to access several memory slots to do so much as add two numbers together to make sure the class hasn’t changed, and nobody’s changed the add code, and so on and so on.

Keywords: optimization, complexity, general intelligence, abstraction inversion
