Running out of memory during reindexing (allow reindex of parts of the mailbox) #1

@petterreinholdtsen

Hi.

I noticed the statement "Initial reindexing is memory-hungry and it could fail on some humongous mailbox (but I doubt anybody has such a mailbox)" in the README, and can confirm that this is a real problem. When I tried it, I ran out of memory on my mailbox of 1.4 million emails.

To work around it, I am trying to reindex only parts of the mailbox, using a patch like this one, which lets the reindex command accept a tag argument:

diff --git a/notmuch-response b/notmuch-response
index 9d0f759..c7fbef7 100755
--- a/notmuch-response
+++ b/notmuch-response
@@ -69,25 +69,30 @@ def index():
         print('IDs to untag ({}): {}'.format(len(replied), ', '.join(replied)))
         untag_ids(replied)

-def reindex():
+def reindex(tags = None):
     # reindexes EVERYTHING, it may take some time
     #1. tag everything with unreplied
-    cmd_run([ 'notmuch', 'tag', '+noresponse', '-response', '*' ])
+    if tags is None:
+        tags = '*'
+    cmd_run([ 'notmuch', 'tag', '+noresponse', '-response', tags ])
     #3. get all reply ids
-    replied = get_replied_ids('*')
+    replied = get_replied_ids(tags)
     print('Replies to untag: {}'.format(len(replied)))
     #4. untag these ids
     untag_ids(replied)

 if __name__ == '__main__':
-    if len(sys.argv) != 2:
-        print('Usage: {} <index|reindex>'.format(sys.argv[0]))
+    if len(sys.argv) != 2 and len(sys.argv) != 3:
+        print('Usage: {} <index|reindex [tags]>'.format(sys.argv[0]))
         sys.exit(1)
     cmd = sys.argv[1]
     if cmd == 'index':
         index()
     elif cmd == 'reindex':
-        reindex()
+        tags = None
+        if 2 < len(sys.argv):
+            tags = sys.argv[2]
+        reindex(tags)
     else:
         print('Unknown command: {}'.format(cmd))
         sys.exit(1)
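
With a patch like the one above applied, the mailbox could be reindexed one slice at a time by a small driver script. The following is only a sketch under some assumptions: the helper names `yearly_queries` and `reindex_in_chunks`, the script path `./notmuch-response` and the year range are made up for illustration, and it assumes that restricting `get_replied_ids` to one slice still gives the result you want. The `date:<since>..<until>` form is notmuch's date range search term.

```python
import subprocess


def yearly_queries(first_year, last_year):
    """Build one notmuch date-range search term per year (inclusive)."""
    return ['date:{0}-01-01..{0}-12-31'.format(year)
            for year in range(first_year, last_year + 1)]


def reindex_in_chunks(queries, script='./notmuch-response',
                      run=subprocess.check_call):
    """Run the patched 'reindex' command once per search term.

    'script' is a hypothetical path to the patched notmuch-response.
    'run' is injectable so the loop can be exercised without notmuch.
    """
    for query in queries:
        print('Reindexing {}'.format(query))
        # Each call is a separate process, so memory is released
        # between slices.
        run([script, 'reindex', query])
```

For example, `reindex_in_chunks(yearly_queries(2005, 2024))` would invoke the reindex command twenty times, once per year. Other search terms, such as `tag:` or `folder:` queries, could be passed in the same way.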
