17 Commits

Author SHA1 Message Date
18092c7ff9 indexer: minor tweaking 2025-02-23 11:39:21 +02:00
cdb619e4f5 Improve performance of index cleanup: use readdir(3), not access(2)
This change makes index cleanup ~4x faster by changing how we
determine whether a file mentioned by the database still exists on
disk.  Previously, we'd call access(2) for each file the database
mentioned.  Doing so produced a lot of system call overhead.  Now, we
read the directory entries of the directories containing the files
whose existence we're checking, build a hash table from what we find,
then do the existence check against this hash table instead of
entering the kernel.

The semantics of the cleanup check do change subtly, however.
Previously, we checked whether the mentioned file was *readable*.
Now we check merely that it exists.  Extant but unreadable files in
maildirs should be rare.

BEFORE:

$ time mu index --lazy-check
lazily indexing maildir /home/dancol/Mail -> store /home/dancol/.cache/mu/xapian
/ indexing messages; checked: 0; updated/new: 0; cleaned-up: 0

real    0m19.310s
user    0m1.803s
sys     0m12.999s

AFTER:

$ time mu --debug index --lazy-check
lazily indexing maildir /home/dancol/Mail -> store /home/dancol/.cache/mu/xapian
- indexing messages; checked: 0; updated/new: 0; cleaned-up: 0

real    0m4.584s
user    0m2.433s
sys     0m2.133s
2025-02-23 11:39:17 +02:00
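
The approach described in the commit above (read each directory once, then answer existence checks from a hash table) can be sketched roughly as follows. This is a minimal illustration, not mu's actual code: the DirCache class and its members are hypothetical names, and only opendir(3), readdir(3) and closedir(3) are real API calls. As in the commit message, the check tests existence only, not readability.

```cpp
#include <dirent.h>

#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical helper (not taken from mu): remembers the entry names of each
// directory it has listed, so repeated existence checks in the same directory
// need no further system calls.
class DirCache {
public:
    // Return true if `path` appears as an entry of its parent directory.
    bool exists(const std::string& path) {
        std::string dir{"."};
        std::string name{path};
        const auto slash = path.rfind('/');
        if (slash != std::string::npos) {
            dir  = (slash == 0) ? std::string{"/"} : path.substr(0, slash);
            name = path.substr(slash + 1);
        }

        auto it = cache_.find(dir);
        if (it == cache_.end()) // first query for this directory: list it once
            it = cache_.emplace(dir, read_dir(dir)).first;

        return it->second.count(name) != 0;
    }

    // Number of directories listed (or attempted) so far.
    std::size_t dirs_read() const { return cache_.size(); }

private:
    // Collect all entry names of `dir`; an unreadable or missing directory
    // yields an empty set.
    static std::unordered_set<std::string> read_dir(const std::string& dir) {
        std::unordered_set<std::string> names;
        if (DIR* d = ::opendir(dir.c_str())) {
            while (const struct dirent* entry = ::readdir(d))
                names.emplace(entry->d_name);
            ::closedir(d);
        }
        return names;
    }

    std::unordered_map<std::string, std::unordered_set<std::string>> cache_;
};
```

With such a cache, checking many message paths that share a maildir subdirectory costs one directory listing rather than one access(2) call per file, which is where the drop in sys time reported above would come from.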
d5d57b4327 remove non-single-threaded option
Single-threaded is the build-default, and seems to work well enough for
1.12.7, so remove the option to turn it off.

This is because build-options that influence such low-level/core
behavior are a pain to maintain.
2024-11-26 10:27:52 +02:00
b0d8d42dd2 indexer: make lazy check even lazier
In lazy mode, we were skipping directories that did not change; however,
this didn't help in the case where users received new messages in big
maildirs.

So, add another check where we compare the ctime of each message file
with the time of the last indexing operation. If the ctime is earlier,
ignore the message file. This is faster than having to consult the
Xapian database for each message.

Note that this requires in mu4e:
      (setq mu4e-index-lazy-check t)
or
   --lazy-check
as a parameter for 'mu index'.
2024-11-10 13:47:54 +02:00
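
The ctime comparison described in the commit above might look roughly like the sketch below. The function names can_skip and can_skip_file are hypothetical and not taken from mu; only stat(2) and its st_ctime field are real. Treating a file that cannot be stat'ed as "do not skip" is an assumption of this sketch.

```cpp
#include <sys/stat.h>

#include <ctime>
#include <string>

// Hypothetical decision function: a message file whose ctime is earlier than
// the last indexing run can be ignored without consulting the database.
inline bool can_skip(std::time_t file_ctime, std::time_t last_index) {
    return file_ctime < last_index;
}

// Hypothetical wrapper: fetch the ctime with stat(2) and apply the rule above.
// If the file cannot be stat'ed, do not skip it, so that the regular
// (slower) check gets a chance to deal with it.
inline bool can_skip_file(const std::string& path, std::time_t last_index) {
    struct stat st {};
    if (::stat(path.c_str(), &st) != 0)
        return false;
    return can_skip(st.st_ctime, last_index);
}
```

A single stat(2) per file is cheap compared with a database lookup per message, which matches the motivation given in the commit message.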
d2343c6d62 mu-server: try avoiding xapian multi-threaded access
Try to avoid multi-threaded operations with Xapian.

This removes the thread workers during indexing, and avoids the indexing
background thread. So, mu4e has to wait once again during indexing.

We can improve upon that, but first we need to know if it avoids the
problem of issue #2756.
2024-10-08 11:23:04 +03:00
29dc1cea0c Fix typos. 2024-09-22 17:27:18 +00:00
f01360ae9f lib: commit to disk after indexing 2024-08-04 22:28:13 +03:00
5bd439271d store-worker: temporarily revert
Of course, after merging some problems come up.
Let's fix those first.

This reverts commit f2f01595a5.
2024-06-05 12:21:24 +03:00
f2f01595a5 indexer: use store-worker
Use the store worker (-thread) to do all database modification.

Currently, the "removed" field of Progress is always 0.
2024-06-03 21:01:17 +03:00
c05b28e761 xapian-db: remove locks, transaction levels
Simplify xapian-db: locks should go elsewhere; transaction levels add
too much complication.
2024-06-03 21:01:07 +03:00
aeb6d44172 mu-store/indexer: consume messages from workers
Add store::consume_message, which is like add_message but uses std::move
from the caller, so that the message no longer has copies (with
Xapian::Document) on the caller side; this is to avoid threading issues.
2024-05-08 19:11:40 +03:00
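
The ownership transfer described in the commit above can be illustrated with a small sketch. Everything here is a stand-in: Document, Message and Store are hypothetical toy types, not mu's Store or Xapian::Document, and the real signature of store::consume_message is not shown in this log. The sketch assumes the document is held through a std::unique_ptr, so that the caller is guaranteed to be left without it after the move.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a database document (hypothetical).
struct Document {
    std::string text;
};

// Toy stand-in for a parsed message that owns its document (hypothetical).
struct Message {
    std::string               path;
    std::unique_ptr<Document> doc;
};

// Toy stand-in for the store (hypothetical).
class Store {
public:
    // Take the message by rvalue reference and move its document out, so the
    // caller's thread keeps no handle on it afterwards.
    void consume_message(Message&& msg) {
        docs_.push_back(std::move(msg.doc));
    }

    std::size_t size() const { return docs_.size(); }

private:
    std::vector<std::unique_ptr<Document>> docs_;
};
```

A worker thread would call store.consume_message(std::move(msg)); afterwards msg.doc is null on the worker's side, so only the store can touch the document.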
4938d98f76 mu-indexer: re-enable database lock
Seeing some db corruption; re-enabling this (old) lock to see if it
helps. It _does_ slow down indexing significantly.
2024-04-10 21:47:04 +03:00
146b80113f lib: move transaction handling to mu-xapian
Instead of handling transactions in the store, handle them in xapian-db.
This makes the code a bit more natural and cleaner.

Handle transactions automatically (with a batch-size) and add an RAII
Transaction object, which makes all database interaction transactable
for the duration. So, there is no more need for explicit parameters to
add_message while indexing.
2023-12-22 21:24:41 +02:00
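
The combination of automatic batching and an RAII Transaction object described in the commit above could be sketched as follows. This is a toy model under stated assumptions: Db, Transaction, add, flush and the counters are hypothetical names, no Xapian calls are involved, and a "flush" merely stands in for committing a transaction.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy database (hypothetical): inside a transaction, changes are flushed in
// batches of `batch_size`; outside a transaction, every change is flushed
// immediately.
class Db {
public:
    explicit Db(std::size_t batch_size) : batch_size_{batch_size} {}

    void add(std::string item) {
        pending_.push_back(std::move(item));
        if (!in_transaction_ || pending_.size() >= batch_size_)
            flush();
    }

    std::size_t stored() const { return stored_.size(); }
    std::size_t flushes() const { return flushes_; }

private:
    friend class Transaction;

    // Move all pending changes into the "committed" storage.
    void flush() {
        if (pending_.empty())
            return;
        for (auto& item : pending_)
            stored_.push_back(std::move(item));
        pending_.clear();
        ++flushes_;
    }

    std::size_t              batch_size_;
    bool                     in_transaction_{false};
    std::size_t              flushes_{0};
    std::vector<std::string> pending_;
    std::vector<std::string> stored_;
};

// RAII guard (hypothetical): batching is active for the guard's lifetime and
// whatever is still pending is flushed when the guard goes out of scope.
class Transaction {
public:
    explicit Transaction(Db& db) : db_{db} { db_.in_transaction_ = true; }
    ~Transaction() {
        db_.flush();
        db_.in_transaction_ = false;
    }
    Transaction(const Transaction&)            = delete;
    Transaction& operator=(const Transaction&) = delete;

private:
    Db& db_;
};
```

With a guard in scope, code that adds messages needs no transaction-related parameters: it simply calls add, and the batching and the final flush happen behind the scenes.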
a2046dc2b1 mu-index: add blocking start()
Useful for unit tests
2023-09-16 11:12:16 +03:00
9dcbe1d96c lib: unit tests: improve / better coverage 2023-09-13 23:02:53 +03:00
2f5602b938 unit tests: improve
and add a new one for the indexer
2023-09-12 21:38:57 +03:00
53c7381929 lib: move index/ into main lib/
simplify things a bit
2023-09-10 08:55:25 +03:00