arbtt: use of database like sqlite3?

Gwern Branwen gwern at gwern.net
Mon Dec 15 17:19:23 CET 2014


On Mon, Dec 15, 2014 at 3:35 AM, Joachim Breitner
<mail at joachim-breitner.de> wrote:
> But now the insertion is even more expensive: Upon every sample, for
> every open window, sqlite will have to traverse an index of over a
> million¹ entries to see if this particular window title has occurred
> before.

No; come on, these are *databases* - their raison d'être is looking
stuff up. The ideas behind them are now at least 44 years old - give
them a little credit: they can do better than a linear scan.
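
Concretely, here is a rough sketch (in Python with the stdlib sqlite3
module) of the sort of normalized schema being discussed; the table and
column names are invented for illustration and are not arbtt's actual
format. With a UNIQUE constraint on the title text, the "has this title
occurred before?" check is an index lookup, not a scan:

    import sqlite3

    # Hypothetical schema, purely for illustration -- not arbtt's real format.
    conn = sqlite3.connect("sketch.db")
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS titles (
            id    INTEGER PRIMARY KEY,
            title TEXT NOT NULL UNIQUE      -- UNIQUE gives a B-tree index
        );
        CREATE TABLE IF NOT EXISTS samples (
            sample_time INTEGER NOT NULL,
            title_id    INTEGER NOT NULL REFERENCES titles(id)
        );
    """)

    def record_sample(ts, title):
        # Both statements go through the UNIQUE index on titles.title:
        # a handful of B-tree page reads, not a scan of a million rows.
        conn.execute("INSERT OR IGNORE INTO titles(title) VALUES (?)", (title,))
        (title_id,) = conn.execute(
            "SELECT id FROM titles WHERE title = ?", (title,)).fetchone()
        conn.execute("INSERT INTO samples(sample_time, title_id) VALUES (?, ?)",
                     (ts, title_id))
        conn.commit()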

Suppose inserts were as dumb as a plain binary tree - then the
traversal only involves ~20 lookups (log_2(1 million) ≈ 19.9 levels of
the tree). I would be surprised if sqlite3 couldn't do thousands of
such indexed inserts per second, and I'd expect it to take
microbenchmarking to show the difference between a regular insert and
an index-linked insert like the one proposed.
It's really not an issue.
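
As a rough sanity check (a sketch only - the numbers depend on machine,
disk, and transaction batching, and the schema is again invented),
something along these lines measures how many index-checked inserts
sqlite3 manages per second against a million-row titles table:

    import sqlite3, time

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE titles  (id INTEGER PRIMARY KEY, title TEXT NOT NULL UNIQUE);
        CREATE TABLE samples (sample_time INTEGER NOT NULL, title_id INTEGER NOT NULL);
    """)

    # Pre-populate ~1 million distinct titles so the UNIQUE index is already deep.
    with conn:
        conn.executemany("INSERT INTO titles(title) VALUES (?)",
                         (("window title %d" % i,) for i in range(1000000)))

    n = 10000
    start = time.time()
    with conn:                       # one transaction for the whole batch
        for i in range(n):
            title = "window title %d" % (i * 97 % 1000000)
            conn.execute("INSERT OR IGNORE INTO titles(title) VALUES (?)", (title,))
            (tid,) = conn.execute("SELECT id FROM titles WHERE title = ?",
                                  (title,)).fetchone()
            conn.execute("INSERT INTO samples VALUES (?, ?)", (i, tid))
    print("%.0f index-checked inserts/sec" % (n / (time.time() - start)))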

The real question is whether anyone wants the faster queries enough to
justify rewriting the backend and making everyone convert their old
logs over. (The existing append-only logs may be terrible for queries,
but all the code already exists and is presumably debugged.)

-- 
gwern
http://www.gwern.net



