Tips for C libraries on GNU/Linux (kernel.org)
All of these rules seem to have the same hidden statement at their core: "We tried this, and it didn't scale."
Secure File Descriptor Handling
There’s a general rule in writing, one that probably should go without saying but is violated regularly. It’s simply this: everything you write should advance the plot and build your characters. When I criticize my own writing on this metric (which I should do more often), I chunk things by paragraph. Did that paragraph move the plot AND develop character? No? That paragraph’s not working hard enough then.
I had no intention of writing a procedural, but it turns out I’m writing a lot of scenes that fall into that category. In a procedural, the author liberally sprinkles long passages through the book that are merely lists of things people did. In my case, the procedures are usually medical. First the doctor did this thing, then that thing, then another thing. It’s all very technical and makes the author sound like an expert, but it doesn’t move the story. It’s filler.
(I’m pretty sure my long scene in which the doctor inserts a tube into the lung of a pneumonia sufferer is ludicrously inaccurate.)
That last bit is something I would be embarrassed by. If it's supposed to be factual and it's not, how can you put it in public?
The only reason I finally managed to start taking these notes is because they are explicitly imperfect, just notes, and I can go back and fix things at any time (but I have to append it as a new note instead of fixing/removing the original).
- Time-ordered list plus tree
- Time-ordered list plus alphabetical list (painful for appending to though)
- Time-ordered list only (load chunks into an in-memory hashmap)
That's all I can think of right now.
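The third option (a single time-ordered list, with lookups served from an in-memory hashmap) could look roughly like this. A minimal Python sketch; the log line format and names are assumptions, not the real storage layout:

```python
# Sketch: one time-ordered log on disk, lookups served from a hashmap
# rebuilt by scanning the log in order. "hash body" line format is assumed.
def load_index(log_lines):
    index = {}
    for line in log_lines:          # time order == file order
        entry_hash, body = line.split(" ", 1)
        index[entry_hash] = body    # later appends win, matching append-only fixes
    return index
```

The appeal is that appending stays trivial; the cost is rescanning the log (or chunks of it) to rebuild the map.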
- Use a global monotonically incrementing "primary key" instead of a hash
- Use a per-tag primary key plus hash (wait... I don't think that would work)
A per-tag primary key would be a nice compromise, but I don't think it would work because we wouldn't know the key until we found the entry by hash.
I seem to be doing a pretty good mix of "real" work and other stuff. My notes are bouncing back and forth.
This is probably one of the few places you'll see "procrastination" mentioned without any self-flagellation. I'm proud to find value in almost everything I do (and what I don't find value in, I don't do).
I also avoid setting deadlines. Giving up my alarm clock was the best idea I ever had.
I'm obsessed with the idea of NoDB.
> NoDB: Putting the Computer Science back in Computer Science
Not saying everyone should write their own persistent data structures from scratch, but having direct access to a toolkit of data structures would probably give people a much more direct understanding of what is going on.
It's like the DIY ethos applied to software.
So I was working on our "real" index class. I haven't tested it yet, but I think this `intersect()` algorithm might actually work.
The problem is it only performs an intersection on two indices at a time. So for N indices we need not just an output index, but N-1 outputs, all but one of which are intermediate.
The other problem with this algorithm is that finding a pivot is O(M*N). As we refine the search with more and more tags, the likelihood of a match gets lower and lower, so the algorithm will get slower and slower.
The only way I can possibly think of making this work is to come up with a real unique index for each entry.
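The pairwise shape described above can be sketched like so. This is a hedged Python sketch, not the actual `intersect()` implementation, but it shows both the N-1 intermediate outputs and where the O(M*N) pivot scan comes from:

```python
def intersect(a, b):
    """Naive pairwise intersection: for each candidate in a, scan b."""
    out = []
    for value in a:          # M iterations
        if value in b:       # O(N) scan each time, hence O(M*N)
            out.append(value)
    return out

def intersect_all(indices):
    """Fold pairwise intersection over N indices; every result but the
    last is an intermediate index."""
    result = indices[0]
    for idx in indices[1:]:
        result = intersect(result, idx)
    return result
```

As matches get rarer with each added tag, more of those inner scans run to completion without finding anything, which is the slowdown complained about above.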
We should be able to reuse our old OMeta parser. The performance problems we were having before should not be an issue now that we are only parsing one entry at a time.
We want to linkify our hash tags.
And we want to linkify our entry hash tags too.
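A regex-based sketch of what that linkification might look like, in Python for illustration. The URL scheme (tags and entry hashes sharing one lookup path) and the 40-hex-digit hash format are assumptions:

```python
import re

TAG_RE = re.compile(r'#(\w+)')                # hashtags like #todo
HASH_RE = re.compile(r'\b([0-9a-f]{40})\b')   # assumed: 40-hex entry hashes

def linkify(text):
    # Both kinds of link point at the same lookup path, which is what
    # the shared URL structure below requires.
    text = TAG_RE.sub(r'<a href="/\1">#\1</a>', text)
    text = HASH_RE.sub(r'<a href="/\1">\1</a>', text)
    return text
```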
And we want to generate our HTML statically.
That means that entries must have the same URL structure as tag lookups.
/ <- Just show entry field, or redirect to index?
Thus, the logic for displaying our lookup pages should be:
1. If there is just one search term, and it matches an entry, show the entry first, its replies, and a new entry field at the bottom
2. If there are multiple terms, or a single term that does not match an entry, perform a search and show the new entry field at the top, then the results in reverse order
This seems a bit confusing/messy...
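Messy or not, the two cases are easy enough to pin down in code. A Python sketch of that display logic; the entry shape, the `search` helper, and the section names are all hypothetical:

```python
def search(terms, entries):
    # Hypothetical matcher: an entry matches if it carries every term as a tag.
    return [h for h, e in entries.items()
            if all(t in e.get("tags", []) for t in terms)]

def render_lookup(terms, entries):
    """Return page sections top-to-bottom for a lookup request.

    Case 1: a single term that names an entry -> entry, replies, field at bottom.
    Case 2: anything else -> field at top, then results newest-first.
    """
    if len(terms) == 1 and terms[0] in entries:
        entry = entries[terms[0]]
        return [("entry", entry),
                ("replies", entry.get("replies", [])),
                ("entry_field", None)]
    results = search(terms, entries)
    return [("entry_field", None), ("results", list(reversed(results)))]
```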
1. We need a method for iterating through the index that automatically reads manageable chunks at a time for us
2. We need to be able to enqueue a function that relies on other queued functions without creating a deadlock
The best way to achieve 1 is with 2. The solution to 2 is to not enqueue the top-level function, although the problem with that is that other functions (writes) can interfere with whatever operation we're performing.
We probably need the queue equivalent of a read-write lock. It should be a separate class.
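One way that class could work, sketched in Python. This is single-threaded and just models the scheduling policy (consecutive reads may be batched together, writes run alone, in queue order), not real concurrency; all names are assumptions:

```python
from collections import deque

class ReadWriteQueue:
    """Queue equivalent of a read-write lock: reads don't conflict and
    may run back-to-back, but a write runs exclusively, after every
    earlier job has finished."""

    def __init__(self):
        self._jobs = deque()

    def read(self, fn):
        self._jobs.append(("read", fn))

    def write(self, fn):
        self._jobs.append(("write", fn))

    def drain(self):
        results = []
        while self._jobs:
            kind, fn = self._jobs.popleft()
            if kind == "read":
                batch = [fn]
                while self._jobs and self._jobs[0][0] == "read":
                    batch.append(self._jobs.popleft()[1])
                results.extend(f() for f in batch)  # these could overlap safely
            else:
                results.append(fn())                # writes run exclusively
        return results
```

The point is that a long iteration can enqueue itself as a read and be sure no write slips in between its chunks.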
1. Read a set of values X from the middle of the index A
2. Try to find the first value from X in B
3. While the value isn't found and there are still values in X, repeat 2
4. If none of the values from X were in B, repeat 1-4 with a new X
5. Once a value is found, repeat 1-5 with the first half of A and B
6. Write the values found by 5 to target index T
7. Write the value found, if any, by 4 to T
8. Repeat 1-6 with the second half of A and B
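The steps above can be sketched as a divide-and-conquer intersection of two sorted indices. This Python version simplifies things, taking a single pivot from the middle of A and locating it in B with a binary search rather than reading chunked sets X, so it's an illustration of the recursion shape, not the chunked algorithm itself:

```python
from bisect import bisect_left

def intersect_sorted(a, b):
    """Intersect two sorted indices by recursing on matched halves."""
    if not a or not b:
        return []
    mid = len(a) // 2
    pivot = a[mid]                      # steps 1-2: pivot from the middle of A
    pos = bisect_left(b, pivot)         # binary search instead of a linear scan
    found = pos < len(b) and b[pos] == pivot
    left = intersect_sorted(a[:mid], b[:pos])                 # step 5: first halves
    right = intersect_sorted(a[mid + 1:], b[pos + (1 if found else 0):])  # step 8
    # steps 6-7: write left results, the pivot (if matched), then right results to T
    return left + ([pivot] if found else []) + right
```

Because both inputs are sorted, anything in the first half of A can only match the part of B before the pivot's position, and likewise for the second halves, so no matches are lost by splitting.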