Some messages can have an _empty_ message-id, e.g. with:
In-Reply-To: <>
which we weren't filtering out.
This would yield an _empty_ Thread-Id in mu-message.cc.
And this would make mu-query believe it had no matches in the first
query, in Query::Private::run_related, and effectively throw away the
results. (Xapian using an empty string both for a "not found" result and
for "found an empty string" doesn't help either.)
So, avoid having an empty reference. Also add a unit-test.
Fixes #2812.
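A minimal sketch of the idea, not the actual code in mu-message.cc: the helper names below are hypothetical, and the assumption is that the thread-id is taken from the first reference, falling back to the message's own id.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (not mu's real API): drop empty message-ids, such
// as the one produced by "In-Reply-To: <>", from a list of references.
std::vector<std::string>
drop_empty_references(std::vector<std::string> refs)
{
    refs.erase(std::remove_if(refs.begin(), refs.end(),
                              [](const std::string& ref) { return ref.empty(); }),
               refs.end());
    return refs;
}

// Assumption: the thread-id is the first non-empty reference, or the
// message's own id when no usable reference remains.
std::string
thread_id(const std::vector<std::string>& refs, const std::string& msgid)
{
    const auto clean{drop_empty_references(refs)};
    return clean.empty() ? msgid : clean.front();
}
```

With the empty entry filtered out first, a message carrying only "In-Reply-To: <>" gets its own message-id as thread-id instead of an empty string.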
Only include xapian.h in one place, so we can have consistent options.
With that in place, we can enable C++ move semantics.
We don't do anything with that yet, but we check in the meson.build file
to see if we have the required xapian version.
Flag messages that merely have a List-Unsubscribe header with
Flags::MailingList too (some marketing messages have this header, yet
lack "List-Id").
Add a test as well.
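Roughly, the check becomes "either header is enough". The sketch below uses a plain std::map as a stand-in for the message headers and assumes the header names are already normalized to the spelling shown; mu's real types and flag handling differ.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a message's headers (name -> value).
using Headers = std::map<std::string, std::string>;

// True if the message should get the mailing-list flag: it has either
// a List-Id or a List-Unsubscribe header.
bool
is_mailing_list(const Headers& headers)
{
    return headers.count("List-Id") != 0 ||
           headers.count("List-Unsubscribe") != 0;
}
```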
Xapian supports an "ngrams" option to help with languages/scripts
without explicit wordbreaks, such as Chinese / Japanese / Korean.
Add some plumbing for supporting this in mu as well. Experimental for
now.
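To illustrate what n-grams buy us for text without word breaks, here is a stand-alone sketch that splits a run of code points into overlapping bigrams. This is not Xapian's implementation; the function name and the choice of n = 2 are assumptions for illustration only.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Illustrative only: produce the overlapping bigrams of a string of
// code points; e.g. three characters "ABC" give "AB" and "BC".
std::vector<std::u32string>
bigrams(const std::u32string& text)
{
    std::vector<std::u32string> grams;
    for (std::size_t i = 0; i + 1 < text.size(); ++i)
        grams.emplace_back(text.substr(i, 2));
    return grams;
}
```

Indexing such overlapping pairs lets a query match a substring of a longer unbroken run of characters, which plain word-splitting cannot do.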
When passing messages to mu, we often got a message-sexp (parsed from a
string) from the message document, then appended some more properties
("build_message_sexp").
Instead, we can do it in terms of the strings; this is _a little_
inelegant, but also much faster; compare:
(base)
[mu4e] Found 500 matching messages; 0 hidden; search: 1298.0 ms (2.60 ms/msg); render: 642.1 ms (1.28 ms/msg)
(with temp-file optimization (earlier commit))
[mu4e] Found 500 matching messages; 0 hidden; search: 1152.7 ms (2.31 ms/msg); render: 270.1 ms (0.54 ms/msg)
(with temp-file optimization _and_ the string optimization (this commit))
[mu4e] Found 500 matching messages; 0 hidden; search: 266.0 ms (0.53 ms/msg); render: 199.7 ms (0.40 ms/msg)
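The string-based approach can be sketched as follows: rather than parsing the stored sexp and rebuilding it, splice the extra properties in before the closing paren. The helper name and the exact property syntax are hypothetical; this is not the code in this commit.

```cpp
#include <string>

// Hypothetical helper: append already-serialized properties to a
// serialized plist-style s-expression, without parsing it.
// Input that does not end in ')' is returned unchanged.
std::string
append_props(std::string sexp, const std::string& props)
{
    if (sexp.empty() || sexp.back() != ')' || props.empty())
        return sexp;

    sexp.pop_back();     // drop the closing paren
    if (sexp.size() > 1) // separator, unless the list was "()"
        sexp += ' ';
    sexp += props;
    sexp += ')';         // and close the list again
    return sexp;
}
```

For instance, appending ":meta (:level 1)" to "(:docid 42)" gives "(:docid 42 :meta (:level 1))". The cost is that nothing validates the result, which is the inelegant part.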
This makes queries where we don't need the sexp much faster; e.g.
before:
mu find "a" --include-related 47,51s user 2,68s system 99% cpu 50,651 total
after:
mu find "a" --include-related 7,12s user 1,97s system 87% cpu 10,363 total
We were dumping the HTML-parts as-is in the Xapian indexer; however,
it's better to remove the html decoration first, and just pass the text.
We use the new built-in html->text scraper for that.
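As a rough picture of what "remove the html decoration" means, here is a naive tag-stripper. It is only an illustration with a hypothetical name; the built-in scraper does considerably more than this sketch (which ignores entities, comments, scripts and styles).

```cpp
#include <string>

// Naive illustration: drop everything between '<' and '>', replacing
// each tag with a single space so adjacent words do not run together.
std::string
strip_tags(const std::string& html)
{
    std::string text;
    bool in_tag{false};

    for (const char c : html) {
        if (c == '<') {
            in_tag = true;
        } else if (c == '>' && in_tag) {
            in_tag = false;
            text += ' ';
        } else if (!in_tag) {
            text += c;
        }
    }
    return text;
}
```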
This is a bit of a hack to include html text in results.
Of course, html text is not really plain text, so this will have to do
until we introduce some html parsing step.