[Bugs] [Bug 1529883] glusterfind is extremely slow if there are lots of changes

bugzilla at redhat.com bugzilla at redhat.com
Sat Dec 30 20:40:12 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1529883



--- Comment #2 from Worker Ant <bugzilla-bot at gluster.org> ---
COMMIT: https://review.gluster.org/19114 committed in master by  with a commit
message- glusterfind: Speed up gfid lookup 100x by using an SQL index

Fixes #1529883.

This fixes some bits of `glusterfind`'s horrible performance,
making it 100x faster.

Until now, glusterfind was, for each line in each CHANGELOG.* file,
linearly reading the entire contents of the sqlite database in
4096-bytes-sized pread64() syscalls when executing the

  SELECT COUNT(1) FROM %s WHERE 1=1 AND gfid = ?

query through the code path:

  get_changes()
    parse_changelog_to_db()
      when_data_meta()
        gfidpath_exists()
          _exists()

In a quick benchmark on my laptop, doing one such `SELECT` query
took ~75ms on a 10MB-sized sqlite DB, while doing the same query
with an index took < 1ms.

Change-Id: I8e7fe60f1f45a06c102f56b54d2ead9e0377794e
BUG: 1529883
Signed-off-by: Niklas Hambüchen <mail at nh2.me>

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are on the CC list for the bug.
You are the assignee for the bug.


More information about the Bugs mailing list