support xapian ngrams

Xapian supports an "ngrams" option to help with languages/scripts
without explicit wordbreaks, such as Chinese / Japanese / Korean.

Add some plumbing for supporting this in mu as well. Experimental for
now.
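
To illustrate why n-grams help with text that has no explicit word breaks, here is a minimal sketch that splits a UTF-8 string into overlapping two-character terms (bigrams). It is not Xapian's tokenizer and not code from mu. The helpers `utf8_chars` and `bigrams` are hypothetical names, and the input is assumed to be valid UTF-8.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split a UTF-8 string into code points, each kept
// as its own UTF-8 byte sequence. Assumes the input is valid UTF-8.
static std::vector<std::string> utf8_chars(const std::string& s)
{
	std::vector<std::string> out;
	for (std::size_t i = 0; i < s.size();) {
		const auto c = static_cast<unsigned char>(s[i]);
		std::size_t len = 1;          // ASCII (or stray continuation byte)
		if (c >= 0xF0)      len = 4;  // 4-byte sequence lead
		else if (c >= 0xE0) len = 3;  // 3-byte sequence lead (most CJK)
		else if (c >= 0xC0) len = 2;  // 2-byte sequence lead
		out.emplace_back(s.substr(i, len));
		i += len;
	}
	return out;
}

// Hypothetical helper: overlapping bigrams over code points. A string of
// N code points yields N-1 terms; fewer than two code points yields none.
static std::vector<std::string> bigrams(const std::string& s)
{
	const auto chars = utf8_chars(s);
	std::vector<std::string> out;
	for (std::size_t i = 0; i + 1 < chars.size(); ++i)
		out.emplace_back(chars[i] + chars[i + 1]);
	return out;
}
```

With terms like these in the index, a query for a two-character word can match inside a longer run of characters even though no whitespace separates the words.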
Dirk-Jan C. Binnema
2023-09-09 11:57:05 +03:00
parent f6122ecc9e
commit 264bb092f0
20 changed files with 207 additions and 81 deletions

@@ -1,5 +1,5 @@
/*
-** Copyright (C) 2022 Dirk-Jan C. Binnema <djcb@djcbsoftware.nl>
+** Copyright (C) 2022-2023 Dirk-Jan C. Binnema <djcb@djcbsoftware.nl>
**
** This program is free software; you can redistribute it and/or modify it
** under the terms of the GNU General Public License as published by the
@@ -49,8 +49,10 @@ public:
Decrypt = 1 << 0, /**< Attempt to decrypt */
RetrieveKeys = 1 << 1, /**< Auto-retrieve crypto keys (implies network
* access) */
-AllowRelativePath = 1 << 2, /**< Allow relateive paths for filename
+AllowRelativePath = 1 << 2, /**< Allow relative paths for filename
* in make_from_path */
+SupportNgrams = 1 << 3, /**< Support ngrams, as used in
+* CJK and other languages. */
};
/**
@@ -60,7 +62,6 @@ public:
*/
Message(Message&& other) noexcept;
/**
* operator=
*
@@ -147,6 +148,14 @@ public:
const Document& document() const;
/**
* The message options for this message
*
* @return message options
*/
Options options() const;
/**
* Get the document-id, or 0 if non-existent.
*