diff --git a/README.md b/README.md index 52ee7d5..2b7c5eb 100644 --- a/README.md +++ b/README.md @@ -25,6 +25,47 @@ This wiki uses the standard [html/template](https://pkg.go.dev/html/template) library to generate HTML. +## Documentation + +This project uses man(1) pages. They are generated from text files +using [scdoc](https://git.sr.ht/~sircmpwn/scdoc). These are the files +available: + +[oddmu(1)](blob/main/man/oddmu.1.txt): This man page has a short +introduction to Oddmu, its configuration via templates and environment +variables, plus points to the other man pages. + +[oddmu(5)](blob/main/man/oddmu.5.txt): This man page talks about the +Markdown and includes some examples for the non-standard features such +as table markup. It also talks about the Oddmu extensions to Markdown: +wiki links, hashtags and fediverse account links. Local links must use +percent encoding for page names so there is a section about percent +encoding. The man page also explains how feeds are generated. + +[oddmu-search(1)](blob/main/man/oddmu-search.1.txt): This man page +documents the "search" subcommand which you can use to build indexes – +lists of page links. These are important for feeds. + +[oddmu-replace(1)](blob/main/man/oddmu-replace.1.txt): This man page +documents the "replace" subcommand to make mass changes to the files +much like find(1), grep(1) and sed(1) or perl (1). + +[oddmu-html(1)](blob/main/man/oddmu-html.1.txt): This man page +documents the "html" subcommand to generate HTML from Markdown pages +from the command line. + +[oddmu-templates(5)](blob/main/man/oddmu-templates.5.txt): This man +page documents how the templates can be changed (how they *must* be +changed) and lists the attributes available for the various templates. + +[oddmu-apache(5)](blob/main/man/oddmu-apache.5.txt): This man page +documents how to set up the web server for various common tasks such +as using logins to limit what visitors can edit. 
+ +[oddmu.service(5)](blob/main/man/oddmu.service.5.txt): This man page +documents how to set up a systemd unit and have it manage Oddmu. “Great +configurability brings great burdens.” + ## Building ```sh diff --git a/man/oddmu-apache.5 b/man/oddmu-apache.5 index e409f19..132996d 100644 --- a/man/oddmu-apache.5 +++ b/man/oddmu-apache.5 @@ -5,7 +5,7 @@ .nh .ad l .\" Begin generated content: -.TH "ODDMU-APACHE" "5" "2023-09-18" +.TH "ODDMU-APACHE" "5" "2023-10-03" .PP .SH NAME .PP @@ -46,7 +46,7 @@ MDCertificateAgreement accepted ServerAdmin alex@alexschroeder\&.ch ServerName transjovian\&.org SSLEngine on - ProxyPassMatch ^/(search|(view|edit|save|add|append|upload|drop)/(\&.*))?$ http://localhost:8080/$1 + ProxyPassMatch "^/((view|edit|save|add|append|upload|drop|search)/(\&.*))?$" "http://localhost:8080/$1" .fi .RE @@ -142,7 +142,7 @@ DocumentRoot /home/oddmu .PP Make sure that none of the subdirectories look like the wiki paths "/view/", "/edit/", "/save/", "/add/", "/append/", "/upload/", -"/drop/" or "/search".\& For example, create a file called "robots.\&txt" +"/drop/" or "/search/".\& For example, create a file called "robots.\&txt" containing the following, telling all robots that they'\&re not welcome.\& .PP .nf diff --git a/man/oddmu-search.1 b/man/oddmu-search.1 index 9da8dab..ee46672 100644 --- a/man/oddmu-search.1 +++ b/man/oddmu-search.1 @@ -5,7 +5,7 @@ .nh .ad l .\" Begin generated content: -.TH "ODDMU-SEARCH" "1" "2023-09-19" +.TH "ODDMU-SEARCH" "1" "2023-10-03" .PP .SH NAME .PP @@ -21,9 +21,8 @@ The "search" subcommand searches the Markdown files in the current directory (!\&), returning the search result as a Markdown-formatted list.\& .PP -The use of a trigram index makes it possible to find substrings and -for the word order not to matter, but it also makes the search results -a bit harder to understand.\& See \fIoddmu-search\fR(7) for more.\& +See \fIoddmu-search\fR(7) for more information of how pages are searched, +sorted and scored.\& 
.PP .SH OPTIONS .PP diff --git a/man/oddmu-search.7 b/man/oddmu-search.7 index 4e9aece..53b8c46 100644 --- a/man/oddmu-search.7 +++ b/man/oddmu-search.7 @@ -5,7 +5,7 @@ .nh .ad l .\" Begin generated content: -.TH "ODDMU-SEARCH" "7" "2023-09-18" +.TH "ODDMU-SEARCH" "7" "2023-10-03" .PP .SH NAME .PP @@ -17,23 +17,50 @@ oddmu-search - understanding the Oddmu search engine .PP .SH DESCRIPTION .PP -The index indexes trigrams.\& Each group of three characters is a -trigram.\& A document with content "This is a test" is turned to lower -case and indexed under the trigrams "thi", "his", "is ", "s i", " is", -"is ", "s a", " a ", "a t", " te", "tes", "est".\& +The wiki keeps an index of all the hash tags and page titles in +memory.\& Using hashtags and predicates in your queries speeds them up +because fewer files are opened.\& .PP -Each query is split into words and then processed the same way.\& A -query with the words "this test" is turned to lower case and produces -the trigrams "thi", "his", "tes", "est".\& This means that the word -order is not considered when searching for documents.\& +A hashtag starts with a number sign ("#") and contains numbers, +letters, and the underscore ("_").\& .PP -This also means that there is no stemming.\& Searching for "testing" -won'\&t find "This is a test" because there are no matches for the -trigrams "sti", "tin", "ing".\& +Example: #old_school random encounter .PP -These trigrams are looked up in the index, resulting in the list of -documents.\& Each document found is then scored.\& Each of the following -increases the score by one point: +The title predicate filters for pages where the term is contained in +the page title.\& +.PP +Example: title:geo title:cache zürich +.PP +The blog predicate filters for pages where the page name begins with +an ISO date like "2023-09-26" if true, or doesn'\&t begin with an ISO +date if false.\& +.PP +Example: blog:false fountain +.PP +The sorting of all the pages does not depend on the number of 
matches +or any kind of score because computing the score is expensive as this +requires the page to be loaded from disk.\& Therefore, results are +sorted by title: +.PP +.PD 0 +.IP \(bu 4 +If the page title contains the query string, it gets sorted first.\& +.IP \(bu 4 +If the page name starts with a number, it is sorted descending.\& +.IP \(bu 4 +All other pages follow, sorted ascending.\& +.PD +.PP +The effect is that first, the pages with matches in the page title are +shown, and then all the others.\& Within these two groups, the most +recent blog posts are shown first.\& This assumes that blog pages start +with an ISO date like "2023-09-16".\& +.PP +The score and highlighting of snippets is used to help visitors decide +which links to click.\& +.PP +Each document found is scored.\& Each of the following increases the +score by one point: .PP .PD 0 .IP \(bu 4 @@ -52,34 +79,6 @@ A document with content "This is a test" when searched with the phrase "this test" therefore gets a score of 8: the entire phrase does not match but each word gets four points.\& .PP -Trigrams are sometimes strange: In a text containing the words "main" -and "rail", a search for "mail" returns a match because the trigrams -"mai" and "ail" are found.\& In this situation, the result has a score -of 0.\& -.PP -The sorting of all the pages, however, does not depend on scoring!\& -Computing the score is expensive because the page must be loaded from -disk.\& Therefore, results are sorted by title: -.PP -.PD 0 -.IP \(bu 4 -If the page title contains the query string, it gets sorted first.\& -.IP \(bu 4 -If the page name (the filename!\&) begins with a number, it is sorted -descending.\& -.IP \(bu 4 -All other pages follow, sorted ascending.\& -.PD -.PP -The effect is that first, the pages with matches in the page title are -shown, and then all the others.\& Within these two groups, the most -recent blog posts are shown first.\& This assumes that blog pages start -with an ISO date like 
"2023-09-16".\& -.PP -The score and highlighting of snippets is used to help visitors decide -which links to click.\& A score of 0 indicates that all the trigrams -were found but \fIno exact matches\fR for any of the terms.\& -.PP .SH SEE ALSO .PP \fIoddmu\fR(1), \fIoddmu-search\fR(1) diff --git a/man/oddmu-search.7.txt b/man/oddmu-search.7.txt index 6ba4bda..5fd0700 100644 --- a/man/oddmu-search.7.txt +++ b/man/oddmu-search.7.txt @@ -15,7 +15,7 @@ memory. Using hashtags and predicates in your queries speeds them up because fewer files are opened. A hashtag starts with a number sign ("#") and contains numbers, -letters, and the underscore ("_"). +letters, and the underscore ("\_"). Example: #old_school random encounter diff --git a/man/oddmu-templates.5 b/man/oddmu-templates.5 index b6fb93c..b05f365 100644 --- a/man/oddmu-templates.5 +++ b/man/oddmu-templates.5 @@ -5,7 +5,7 @@ .nh .ad l .\" Begin generated content: -.TH "ODDMU-TEMPLATES" "5" "2023-09-22" "File Formats Manual" +.TH "ODDMU-TEMPLATES" "5" "2023-10-03" "File Formats Manual" .PP .SH NAME .PP @@ -35,9 +35,10 @@ is a byte array and that'\&s why we need to call \fIprintf\fR).\& .PP For the \fIsearch.\&html\fR template only: .PP -\fI{{.\&Previous}}\fR, \fI{{.\&Page}}\fR, \fI{{.\&Next}}\fR and \fI{{.\&Last}}\fR are the -previous, current, next and last page number in the results since -doing arithmetics in templates is hard.\& The first page number is 1.\& +\fI{{.\&Previous}}\fR, \fI{{.\&Page}}\fR and \fI{{.\&Next}}\fR are the previous, current +and next page number in the results since doing arithmetics in +templates is hard.\& The first page number is 1.\& The last page is +expensive to dermine and so that isn'\&t done.\& .PP \fI{{.\&More}}\fR indicates if there are any more search results.\& .PP @@ -71,7 +72,7 @@ For items in the feed: .PP \fI{{.\&Title}}\fR is the title of the page.\& .PP -\fI{{.\&Html}}\fR is the rendered Markdown, as HTML.\& +\fI{{.\&Html}}\fR is the rendered Markdown, as escaped (!\&) 
HTML.\& .PP \fI{{.\&Hashtags}}\fR is an array of strings.\& .PP @@ -97,7 +98,7 @@ parameter \fIq\fR.\& .PP .nf .RS 4 -curl http://localhost:8080/search?q=towel +curl http://localhost:8080/search/?q=towel .fi .RE .PP diff --git a/man/oddmu.1 b/man/oddmu.1 index 0397982..20254f5 100644 --- a/man/oddmu.1 +++ b/man/oddmu.1 @@ -5,7 +5,7 @@ .nh .ad l .\" Begin generated content: -.TH "ODDMU" "1" "2023-09-22" +.TH "ODDMU" "1" "2023-10-03" .PP .SH NAME .PP @@ -31,6 +31,10 @@ is "index.\&md".\& If no such file exists, oddmu offers you to create the page.\ If your files don'\&t provide their own title ("# title"), the file name (without ".\&md") is used for the page title.\& .PP +Every file can be viewed as feed by using the extension ".\&rss".\& The +feed items are based on links in bullet lists using the asterix +("*").\& +.PP Subdirectories are created as necessary.\& .PP See \fIoddmu\fR(5) for details about the page formatting.\& @@ -110,10 +114,6 @@ discussions.\& The wiki lists no recent changes.\& The expectation is that the people that care were involved in the discussions beforehand.\& .PP -The wiki also produces no feed.\& The assumption is that announcements are made on -social media: blogs, news aggregators, discussion forums, the fediverse, but -humans.\& -.PP The idea is that the webserver handles as many tasks as possible.\& It logs requests, does rate limiting, handles encryption, gets the certificates, and so on.\& The web server acts as a reverse proxy and the wiki ends up being a content @@ -140,7 +140,7 @@ pages by saving an empty file.\& .SH SEE ALSO .PP \fIoddmu\fR(5), \fIoddmu.\&service\fR(5), oddmu-apache_(5), \fIoddmu-html\fR(1), -\fIoddmu-replace\fR(1), \fIoddmu-search\fR(1), \fIoddmu-search\fR(7), \fIoddmu-feed\fR(1) +\fIoddmu-replace\fR(1), \fIoddmu-search\fR(1), \fIoddmu-search\fR(7) .PP .SH AUTHORS .PP diff --git a/man/oddmu.1.txt b/man/oddmu.1.txt index e2fd457..81cb6b3 100644 --- a/man/oddmu.1.txt +++ b/man/oddmu.1.txt @@ -25,7 +25,8 @@ If 
your files don't provide their own title ("# title"), the file name (without ".md") is used for the page title. Every file can be viewed as feed by using the extension ".rss". The -feed items are based on links in bullet lists using the asterix ("*"). +feed items are based on links in bullet lists using the asterix +("\*"). Subdirectories are created as necessary. diff --git a/man/oddmu.5 b/man/oddmu.5 index 817e064..78c1504 100644 --- a/man/oddmu.5 +++ b/man/oddmu.5 @@ -5,7 +5,7 @@ .nh .ad l .\" Begin generated content: -.TH "ODDMU" "5" "2023-09-21" "File Formats Manual" +.TH "ODDMU" "5" "2023-10-03" "File Formats Manual" .PP .SH NAME .PP @@ -97,6 +97,37 @@ The Markdown processor comes with a few extensions: \fB trailing backslashes turn into line breaks \fR MathJax is supported (but needs a separte setup) .PP +.SH FEEDS +.PP +Every file can be viewed as feed by using the extension ".\&rss".\& The +feed items are based on links in bullet lists using the asterix +("*").\& The items must point to local pages.\& This is why the link may +not contain two forward slashes ("//").\& +.PP +Assume this is the index page.\& The feed would be "/view/index.\&rss".\& It +would contain the pages "Arianism", "Donatism" and "Monophysitism" but +it would not contain the pages "Feed" and "About" since the list items +don'\&t start with an asterix.\& +.PP +.nf +.RS 4 +# Main Page + +Hello and welcome! 
Here are some important links: + +- [Feed](index\&.rss) +- [About](about) + +Recent posts: + +* [Arianism](arianism) +* [Donatism](donatism) +* [Monophysitism](monophysitism) +.fi +.RE +.PP +The feed contains at most 10 items, starting at the top.\& +.PP .SH PERCENT ENCODING .PP If you use Markdown links to local pages, you must percent-encode the diff --git a/man/oddmu.5.txt b/man/oddmu.5.txt index 36970db..5dc9ee2 100644 --- a/man/oddmu.5.txt +++ b/man/oddmu.5.txt @@ -87,9 +87,9 @@ The Markdown processor comes with a few extensions: # FEEDS Every file can be viewed as feed by using the extension ".rss". The -feed items are based on links in bullet lists using the asterix ("*"). -The items must point to local pages. This is why the link may not -contain two forward slashes ("//"). +feed items are based on links in bullet lists using the asterix +("\*"). The items must point to local pages. This is why the link may +not contain two forward slashes ("//"). Assume this is the index page. The feed would be "/view/index.rss". It would contain the pages "Arianism", "Donatism" and "Monophysitism" but