man: generate manpages from .org files

Generate the manpages from org-documents which makes it a bit easier to keep them update to date since I find org-syntax easier than troff, and we can use include files.
2022-12-18 00:21:52 +02:00
parent ba09b8fc2c
commit a259ae4162
45 changed files with 1667 additions and 1972 deletions
--- a/man/mu-index.1.org
+++ b/man/mu-index.1.org
@ -0,0 +1,182 @@
+#+title:  MU-INDEX
+
+* NAME
+
+*mu index* -- index e-mail messages stored in Maildirs
+
+* SYNOPSIS
+
+*mu [common-options] index*
+
+* DESCRIPTION
+
+*mu index* is the *mu* command for scanning the contents of Maildir directories and
+storing the results in a Xapian database. The data can then be queried using
+*mu-find(1)*.
+
+Before the first time you run *mu index*, you must run *mu init* to initialize the
+database.
+
+*index* understands Maildirs as defined by Daniel Bernstein for *qmail(7)*. In
+addition, it understands recursive Maildirs (Maildirs within Maildirs),
+Maildir++. It also supports VFAT-based Maildirs which use '!' or ';' as the
+separators instead of ':'.
+
+E-mail messages which are not stored in something resembling a maildir
+leaf-directory (=cur= and =new=) are ignored, as are the cache directories for
+=notmuch= and =gnus=, and any dot-directory.
+
+Starting with mu 1.5.x, symlinks are followed, and can be spread over multiple
+filesystems; however note that moving files around is much faster when multiple
+filesystems are not involved.
+
+If there is a file called =.noindex= in a directory, the contents of that
+directory and all of its subdirectories will be ignored. This can be useful to
+exclude certain directories from the indexing process, for example directories
+with spam-messages.
+
+If there is a file called =.noupdate= in a directory, the contents of that
+directory and all of its subdirectories will be ignored, unless we do a full
+rebuild (with *mu init*). This can be useful to speed up things you have some
+maildirs that never change. Note that you can still search for these messages,
+this only affects updating the database. =.noupdate= is ignored when you start
+indexing with an empty database (such as directly after =mu init=.
+
+There also the *--lazy-check* which can greatly speed up indexing; see below for
+details.
+
+The first run of *mu index* may take a few minutes if you have a lot of mail (tens
+of thousands of messages). Fortunately, such a full scan needs to be done only
+once; after that it suffices to index the changes, which goes much faster. See
+the 'Note on performance (i,ii,iii)' below for more information.
+
+The optional 'phase two' of the indexing-process is the removal of messages from
+the database for which there is no longer a corresponding file in the Maildir.
+If you do not want this, you can use ~-n~, ~--nocleanup~.
+
+When *mu index* catches one of the signals *SIGINT*, *SIGHUP* or *SIGTERM* (e.g., when
+you press Ctrl-C during the indexing process), it attempts to shutdown
+gracefully; it tries to save and commit data, and close the database etc. If it
+receives another signal (e.g., when pressing Ctrl-C once more), *mu index* will
+terminate immediately.
+
+* INDEX OPTIONS
+
+** --lazy-check
+in lazy-check mode, *mu* does not consider messages for which the time-stamp
+(ctime) of the directory they reside in has not changed since the previous
+indexing run. This is much faster than the non-lazy check, but won't update
+messages that have change (rather than having been added or removed), since
+merely editing a message does not update the directory time-stamp. Of course,
+you can run *mu-index* occasionally without ~--lazy-check~, to pick up such
+messages.
+
+** --nocleanup
+disable the database cleanup that *mu* does by default after indexing.
+
+#+include: "muhome.inc" :minlevel 2
+
+#+include: "common-options.inc" :minlevel 1
+
+* PERFORMANCE
+
+** indexing in ancient times (2009?)
+
+As a non-scientific benchmark, a simple test on the author's machine (a Thinkpad
+X61s laptop using Linux 2.6.35 and an ext3 file system) with no existing
+database, and a maildir with 27273 messages:
+
+#+begin_example
+$ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
+$ time mu index --quiet
+66,65s user 6,05s system 27% cpu 4:24,20 total
+#+end_example
+(about 103 messages per second)
+
+A second run, which is the more typical use case when there is a database
+already, goes much faster:
+
+#+begin_example
+$ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
+$ time mu index --quiet
+0,48s user 0,76s system 10% cpu 11,796 total
+#+end_example
+(more than 56818 messages per second)
+
+Note that each test flushes the caches first; a more common use case might be to
+run *mu index* when new mail has arrived; the cache may stay quite 'warm' in that
+case:
+
+#+begin_example
+ $ time mu index --quiet
+ 0,33s user 0,40s system 80% cpu 0,905 total
+#+end_example
+which is more than 30000 messages per second.
+
+** indexing in 2012
+
+As per June 2012, we did the same non-scientific benchmark, this time with an
+Intel i5-2500 CPU @ 3.30GHz, an ext4 file system and a maildir with 22589
+messages. We start without an existing database.
+
+#+begin_example
+ $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
+ $ time mu index --quiet
+ 27,79s user 2,17s system 48% cpu 1:01,47 total
+#+end_example
+(about 813 messages per second)
+
+A second run, which is the more typical use case when there is a database
+already, goes much faster:
+
+#+begin_example
+$ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
+$ time mu index --quiet
+0,13s user 0,30s system 19% cpu 2,162 total
+#+end_example
+(more than 173000 messages per second)
+
+** indexing in 2016
+
+As per July 2016, we did the same non-scientific benchmark, again with the Intel
+i5-2500 CPU @ 3.30GHz, an ext4 file system. This time, the maildir contains
+72525 messages.
+
+#+begin_example
+$ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
+$ time mu index --quiet
+40,34s user 2,56s system 64% cpu 1:06,17 total
+#+end_example
+(about 1099 messages per second).
+
+** indexing in 2022
+
+A few years later and it is June 2022. There's a lot more happening during
+indexing, but indexing became multi-threaded and machines are faster; e.g. this
+is with an AMD Ryzen Threadripper 1950X (16 cores) @ 3.399GHz.
+
+The instructions are a little different since we have a proper repeatable
+benchmark now. After building,
+
+#+begin_example
+ $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
+% THREAD_NUM=4 build/lib/tests/bench-indexer -m perf
+# random seed: R02Sf5c50e4851ec51adaf301e0e054bd52b
+1..1
+# Start of bench tests
+# Start of indexer tests
+indexed 5000 messages in 20 maildirs in 3763ms; 752 μs/message; 1328 messages/s (4 thread(s))
+ok 1 /bench/indexer/4-cores
+# End of indexer tests
+# End of bench tests
+#+end_example
+
+Things are again a little faster, even though the index does a lot more now
+(text-normalizatian, and pre-generating message-sexps). A faster machine helps,
+too!
+
+#+include: "prefooter.inc"
+
+* SEE ALSO
+
+*maildir(5)*, *mu(1)*, *mu-init(1)*, *mu-find(1)*, *mu-cfind(1)*