11 Commits
v0.2 ... v0.3

Author SHA1 Message Date
Alex Schroeder 6f61dde12a Test search and fix bugs 2023-08-24 14:14:33 +02:00
Alex Schroeder 5b29e6433a Add page tests 2023-08-24 13:06:02 +02:00
Alex Schroeder 2bd20432e2 More snippet and highlight testing and fixing 2023-08-24 12:42:50 +02:00
Alex Schroeder 49d62a7979 Split highlight.go from snippets.go 2023-08-24 10:33:32 +02:00
Alex Schroeder 071e807886 Add snippets test 2023-08-24 10:32:07 +02:00
Alex Schroeder df1fdf4373 Add indexing limitation 2023-08-24 10:12:54 +02:00
Alex Schroeder 08b63ae84b Add scoring 2023-08-24 10:00:39 +02:00
Alex Schroeder 645a87e5c8 Split page.go 2023-08-24 08:57:36 +02:00
Alex Schroeder c54e41da28 Comments. handleTitle takes arg. 2023-08-24 08:51:51 +02:00
  handleTitle now takes an argument so that the edit page can show the
  page title and still have an untouched Page.Body for editing.
Alex Schroeder 6a4e014d1c Split search.go and snippets.go 2023-08-24 08:29:53 +02:00
Alex Schroeder 2142144b0c Add search 2023-08-23 23:27:34 +02:00
15 changed files with 701 additions and 101 deletions

.gitignore (vendored): 1 line changed

@@ -1 +1,2 @@
/oddmu
test.md


@@ -35,6 +35,15 @@ extension.
`{{printf "%s" .Body}}` is the Markdown, as a string (the data itself
is a byte array and that's why we need to call `printf`).
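As a minimal, self-contained sketch of this point (the template text and struct here are illustrative, not Oddmu's actual templates), `printf "%s"` is what turns the byte slice into text:

```go
package main

import (
	"bytes"
	"fmt"
	"html/template"
)

// render executes a tiny template against a struct whose Body field is
// a byte slice, mirroring how the page templates format Page.Body.
func render(body []byte) string {
	t := template.Must(template.New("demo").Parse(`{{printf "%s" .Body}}`))
	var buf bytes.Buffer
	if err := t.Execute(&buf, struct{ Body []byte }{Body: body}); err != nil {
		panic(err)
	}
	return buf.String()
}

func main() {
	fmt.Println(render([]byte("Did you bring a towel?")))
}
```

Without the `printf`, the byte slice would be formatted with the default `%v` verb as a list of byte values rather than as text.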
When calling the `save` action, the page name is taken from the URL and
the page content is taken from the `body` form parameter. To
illustrate, here's how to edit a page using `curl`:
```sh
curl --form body="Did you bring a towel?" \
http://localhost:8080/save/welcome
```
## Building
```sh
@@ -126,15 +135,15 @@ MDCertificateAgreement accepted
RewriteEngine on
RewriteRule ^/$ http://%{HTTP_HOST}:8080/view/index [redirect]
RewriteRule ^/(view|edit|save)/(.*) http://%{HTTP_HOST}:8080/$1/$2 [proxy]
RewriteRule ^/(view|edit|save|search)/(.*) http://%{HTTP_HOST}:8080/$1/$2 [proxy]
</VirtualHost>
```
First, it manages the domain, getting the necessary certificates. It
redirects regular HTTP traffic from port 80 to port 443. It turns on
the SSL engine for port 443. It redirects `/` to `/view/index` and any
path that starts with `/view/`, `/edit/` or `/save/` is proxied to
port 8080 where the Oddmu program can handle it.
path that starts with `/view/`, `/edit/`, `/save/` or `/search/` is
proxied to port 8080 where the Oddmu program can handle it.
Thus, this is what happens:
@@ -208,9 +217,13 @@ DocumentRoot /home/oddmu/static
Create this directory, making sure to give it a permission that your
webserver can read (world readable file, world readable and executable
directory). Populate it with files. For example, create a file called
`robots.txt` containing the following, telling all robots that they're
not welcome.
directory). Populate it with files.
Make sure that none of the static files look like the wiki paths
`/view/`, `/edit/`, `/save/` or `/search/`.
For example, create a file called `robots.txt` containing the
following, telling all robots that they're not welcome.
```text
User-agent: *
@@ -223,9 +236,6 @@ and without needing a wiki page.
[Wikipedia](https://en.wikipedia.org/wiki/Robot_exclusion_standard)
has more information.
All you have to make sure is that none of the static files look like the
wiki paths `/view/`, `/edit/` or `/save/`.
## Customization (with recompilation)
The Markdown parser can be customized and
@@ -242,61 +252,33 @@ rocket links (`=>`). Here's how to modify the `loadPage` so that a
translated into Markdown:
```go
func loadPage(title string) (*Page, error) {
filename := title + ".md"
func loadPage(name string) (*Page, error) {
filename := name + ".md"
body, err := os.ReadFile(filename)
if err == nil {
return &Page{Title: title, Name: title, Body: body}, nil
return &Page{Title: name, Name: name, Body: body}, nil
}
filename = title + ".gmi"
filename = name + ".gmi"
body, err = os.ReadFile(filename)
if err == nil {
return &Page{Title: title, Name: title, Body: body}, nil
return &Page{Title: name, Name: name, Body: body}, nil
}
return nil, err
}
```
There is a small problem, however: By default, Markdown expects an
empty line before a list begins. The following change to `viewHandler`
empty line before a list begins. The following change to `renderHtml`
uses the `NoEmptyLineBeforeBlock` extension for the parser:
```go
func viewHandler(w http.ResponseWriter, r *http.Request, title string) {
// Short cut for text files
if (strings.HasSuffix(title, ".txt")) {
body, err := os.ReadFile(title)
if err == nil {
w.Write(body)
return
}
}
// Attempt to load Markdown or Gemini page; edit it if this fails
p, err := loadPage(title)
if err != nil {
http.Redirect(w, r, "/edit/"+title, http.StatusFound)
return
}
// Render the Markdown to HTML, extracting a title and
// possibly sanitizing it
s := string(p.Body)
m := titleRegexp.FindStringSubmatch(s)
if m != nil {
p.Title = m[1]
p.Body = []byte(strings.Replace(s, m[0], "", 1))
}
func (p* Page) renderHtml() {
// Here is where a new extension is added!
extensions := parser.CommonExtensions | parser.NoEmptyLineBeforeBlock
markdownParser := parser.NewWithExtensions(extensions)
flags := html.CommonFlags
opts := html.RendererOptions{
Flags: flags,
}
htmlRenderer := html.NewRenderer(opts)
maybeUnsafeHTML := markdown.ToHTML(p.Body, markdownParser, htmlRenderer)
maybeUnsafeHTML := markdown.ToHTML(p.Body, markdownParser, nil)
html := bluemonday.UGCPolicy().SanitizeBytes(maybeUnsafeHTML)
p.Html = template.HTML(html);
renderTemplate(w, "view", p)
}
```
@@ -306,6 +288,10 @@ Page titles are filenames with `.md` appended. If your filesystem
cannot handle it, it can't be a page title. Specifically, *no slashes*
in filenames.
The pages are indexed as the server starts and the index is kept in
memory. If you have a ton of pages, this surely wastes a lot of
memory.
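To make the memory concern concrete, here is a rough sketch of why an in-memory trigram index grows with the corpus; the `trigrams` helper below is a simplified stand-in, not the actual go-trigram implementation:

```go
package main

import "fmt"

// trigrams returns the set of unique three-byte substrings of s. A
// trigram index must keep an entry for every such substring it has
// seen, which is why the index grows with the amount of page text.
func trigrams(s string) map[string]bool {
	set := make(map[string]bool)
	for i := 0; i+3 <= len(s); i++ {
		set[s[i:i+3]] = true
	}
	return set
}

func main() {
	body := "The pages are indexed as the server starts."
	fmt.Printf("%d bytes, %d unique trigrams\n", len(body), len(trigrams(body)))
}
```

Every distinct three-byte window of every page ends up as a key, so memory use is roughly proportional to the amount of unique page text.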
## References
[Writing Web Applications](https://golang.org/doc/articles/wiki/)

go.mod: 1 line changed

@@ -3,6 +3,7 @@ module alexschroeder.ch/cgit/oddmu
go 1.21.0
require (
github.com/dgryski/go-trigram v0.0.0-20160407183937-79ec494e1ad0
github.com/gomarkdown/markdown v0.0.0-20230716120725-531d2d74bc12
github.com/microcosm-cc/bluemonday v1.0.25
)

go.sum: 2 lines changed

@@ -1,5 +1,7 @@
github.com/aymerick/douceur v0.2.0 h1:Mv+mAeH1Q+n9Fr+oyamOlAkUNPWPlA8PPGR0QAaYuPk=
github.com/aymerick/douceur v0.2.0/go.mod h1:wlT5vV2O3h55X9m7iVYN0TBM0NH/MmbLnd30/FjWUq4=
github.com/dgryski/go-trigram v0.0.0-20160407183937-79ec494e1ad0 h1:b+7JSiBM+hnLQjP/lXztks5hnLt1PS46hktG9VOJgzo=
github.com/dgryski/go-trigram v0.0.0-20160407183937-79ec494e1ad0/go.mod h1:qzKC/DpcxK67zaSHdCmIv3L9WJViHVinYXN2S7l3RM8=
github.com/gomarkdown/markdown v0.0.0-20230716120725-531d2d74bc12 h1:uK3X/2mt4tbSGoHvbLBHUny7CKiuwUip3MArtukol4E=
github.com/gomarkdown/markdown v0.0.0-20230716120725-531d2d74bc12/go.mod h1:JDGcbDT52eL4fju3sZ4TeHGsQwhG9nbDV21aMyhwPoA=
github.com/gorilla/css v1.0.0 h1:BQqNyPTi50JCFMTw/b67hByjMVXZRwGha6wxVGkeihY=

highlight.go (new file): 45 lines

@@ -0,0 +1,45 @@
package main
import (
"strings"
"regexp"
)
// highlight splits the query string q into terms and highlights them
// using the bold tag. It returns the highlighted string and a score.
func highlight (q string, s string) (string, int) {
c := 0
re, err := regexp.Compile("(?i)" + q)
if err == nil {
m := re.FindAllString(s, -1)
if m != nil {
// Score increases for each full match of q.
c += len(m)
}
}
for _, v := range strings.Split(q, " ") {
if len(v) == 0 {
continue
}
re, err := regexp.Compile(`(?is)(\pL?)(` + v + `)(\pL?)`)
if err != nil {
continue
}
r := make(map[string]string)
for _, m := range re.FindAllStringSubmatch(s, -1) {
// Term matched increases the score.
c++
// Terms matching at the beginning and
// end of words and matching entire
// words increase the score further.
if len(m[1]) == 0 { c++ }
if len(m[3]) == 0 { c++ }
if len(m[1]) == 0 && len(m[3]) == 0 { c++ }
r[m[2]] = "<b>" + m[2] + "</b>"
}
for old, new := range r {
s = strings.ReplaceAll(s, old, new)
}
}
return s, c
}

highlight_test.go (new file): 63 lines

@@ -0,0 +1,63 @@
package main
import (
"testing"
)
func TestHighlight(t *testing.T) {
s := `The windows opens
A wave of car noise hits me
No birds to be heard.`
h := `The <b>window</b>s opens
A wave of car noise hits me
No birds to be heard.`
q := "window"
r, c := highlight(q, s)
if r != h {
t.Logf("The highlighting is wrong in 「%s」", r)
t.Fail()
}
// Score:
// - q itself
// - the single token
// - the beginning of a word
if c != 3 {
t.Logf("%s score is %d", q, c)
t.Fail()
}
q = "windows"
_, c = highlight(q, s)
// Score:
// - q itself
// - the single token
// - the beginning of a word
// - the end of a word
// - the whole word
if c != 5 {
t.Logf("%s score is %d", q, c)
t.Fail()
}
q = "car noise"
_, c = highlight(q, s)
// Score:
// - car noise (+1)
// - car, with beginning, end, whole word (+4)
// - noise, with beginning, end, whole word (+4)
if c != 9 {
t.Logf("%s score is %d", q, c)
t.Fail()
}
q = "noise car"
_, c = highlight(q, s)
// Score:
// - the car token
// - the noise token
// - each with beginning, end and whole token (3 each)
if c != 8 {
t.Logf("%s score is %d", q, c)
t.Fail()
}
}

page.go (new file): 111 lines

@@ -0,0 +1,111 @@
package main
import (
"github.com/gomarkdown/markdown"
"github.com/gomarkdown/markdown/ast"
"github.com/gomarkdown/markdown/parser"
"github.com/microcosm-cc/bluemonday"
"html/template"
"strings"
"bytes"
"os"
)
// Page is a struct containing information about a single page. Title
// is the title extracted from the page content using titleRegexp.
// Name is the filename without extension (so a filename of "foo.md"
// results in the Name "foo"). Body is the Markdown content of the
// page and Html is the rendered HTML for that Markdown. Score is a
// number indicating how well the page matched for a search query.
type Page struct {
Title string
Name string
Body []byte
Html template.HTML
Score int
}
// save saves a Page. The filename is based on the Page.Name and gets
// the ".md" extension. Page.Body is saved, without any carriage
// return characters ("\r"). The file permissions used are readable
// and writeable for the current user, i.e. u+rw or 0600. Page.Title
and Page.Html are not saved. There is no caching.
func (p *Page) save() error {
filename := p.Name + ".md"
s := bytes.ReplaceAll(p.Body, []byte{'\r'}, []byte{})
p.Body = s
p.updateIndex()
return os.WriteFile(filename, s, 0600)
}
// loadPage loads a Page given a name. The filename loaded is that
// Page.Name with the ".md" extension. The Page.Title is set to the
// Page.Name (and possibly changed, later). The Page.Body is set to
// the file content. The Page.Html remains undefined (there is no
// caching).
func loadPage(name string) (*Page, error) {
filename := name + ".md"
body, err := os.ReadFile(filename)
if err != nil {
return nil, err
}
return &Page{Title: name, Name: name, Body: body}, nil
}
// handleTitle extracts the title from a Page and sets Page.Title, if
// any. If replace is true, the page title is also removed from
// Page.Body. Make sure not to save this! This is only for rendering.
func (p* Page) handleTitle(replace bool) {
s := string(p.Body)
m := titleRegexp.FindStringSubmatch(s)
if m != nil {
p.Title = m[1]
if replace {
p.Body = []byte(strings.Replace(s, m[0], "", 1))
}
}
}
// renderHtml renders the Page.Body to HTML and sets Page.Html.
func (p* Page) renderHtml() {
maybeUnsafeHTML := markdown.ToHTML(p.Body, nil, nil)
html := bluemonday.UGCPolicy().SanitizeBytes(maybeUnsafeHTML)
p.Html = template.HTML(html);
}
// plainText renders the Page.Body to plain text and returns it,
// ignoring all the Markdown and all the newlines. The result is one
// long single line of text.
func (p* Page) plainText() string {
parser := parser.New()
doc := markdown.Parse(p.Body, parser)
text := []byte("")
ast.WalkFunc(doc, func(node ast.Node, entering bool) ast.WalkStatus {
if entering && node.AsLeaf() != nil {
text = append(text, node.AsLeaf().Literal...)
text = append(text, []byte(" ")...)
}
return ast.GoToNext
})
// Some Markdown still contains newlines
for i, c := range text {
if c == '\n' {
text[i] = ' '
}
}
// Remove trailing space
for text[len(text)-1] == ' ' {
text = text[0:len(text)-1]
}
return string(text)
}
// summarize for query string q sets Page.Html to an extract.
func (p* Page) summarize(q string) {
p.handleTitle(true)
s, c := snippets(q, p.plainText())
p.Score = c
extract := []byte(s)
html := bluemonday.UGCPolicy().SanitizeBytes(extract)
p.Html = template.HTML(html)
}

page_test.go (new file): 59 lines

@@ -0,0 +1,59 @@
package main
import (
"strings"
"testing"
)
func TestPageTitle (t *testing.T) {
p := &Page{Body: []byte(`# Ache
My back aches for you
I sit, stare and type for hours
But yearn for blue sky`)}
p.handleTitle(false)
if p.Title != "Ache" {
t.Logf("The page title was not extracted correctly: %s", p.Title)
t.Fail()
}
if !strings.HasPrefix(string(p.Body), "# Ache") {
t.Logf("The page title was removed: %s", p.Body)
t.Fail()
}
p.handleTitle(true)
if !strings.HasPrefix(string(p.Body), "My back") {
t.Logf("The page title was not removed: %s", p.Body)
t.Fail()
}
}
func TestPagePlainText (t *testing.T) {
p := &Page{Body: []byte(`# Water
The air will not come
To inhale is an effort
The summer heat kills`)}
s := p.plainText()
r := "Water The air will not come To inhale is an effort The summer heat kills"
if s != r {
t.Logf("The plain text version is wrong: %s", s)
t.Fail()
}
}
func TestPageHtml (t *testing.T) {
p := &Page{Body: []byte(`# Sun
Silver leaves shine bright
They droop, boneless, weak and sad
A cruel sun stares down`)}
p.renderHtml()
s := string(p.Html)
r := `<h1>Sun</h1>
<p>Silver leaves shine bright
They droop, boneless, weak and sad
A cruel sun stares down</p>
`
if s != r {
t.Logf("The HTML is wrong: %s", s)
t.Fail()
}
}

search.go (new file): 110 lines

@@ -0,0 +1,110 @@
package main
import (
trigram "github.com/dgryski/go-trigram"
"path/filepath"
"strings"
"slices"
"io/fs"
"fmt"
)
// Search is a struct containing the result of a search. Query is the
// query string and Items is the array of pages with the result.
// Currently there is no pagination of results! When a page is part of
// a search result, Body and Html are simple extracts.
type Search struct {
Query string
Items []Page
Results bool
}
// index is a struct containing the trigram index for search. It is
// generated at startup and updated after every page edit.
var index trigram.Index
// documents is a map, mapping document ids of the index to page
// names.
var documents map[trigram.DocID]string
func indexAdd(path string, info fs.FileInfo, err error) error {
if err != nil {
return err
}
filename := path
if info.IsDir() || strings.HasPrefix(filename, ".") || !strings.HasSuffix(filename, ".md") {
return nil
}
name := strings.TrimSuffix(filename, ".md")
p, err := loadPage(name)
if err != nil {
return err
}
id := index.Add(string(p.Body))
documents[id] = p.Name
return nil
}
func loadIndex() error {
index = make(trigram.Index)
documents = make(map[trigram.DocID]string)
err := filepath.Walk(".", indexAdd)
if err != nil {
fmt.Println("Indexing failed")
index = nil
documents = nil
}
return err
}
func (p *Page) updateIndex() {
var id trigram.DocID
for docId, name := range documents {
if name == p.Name {
id = docId
break
}
}
if id == 0 {
id = index.Add(string(p.Body))
documents[id] = p.Name
} else {
o, err := loadPage(p.Name)
if err == nil {
index.Delete(string(o.Body), id)
}
index.Insert(string(p.Body), id)
}
}
// search returns a sorted []Page where each page contains an extract
// of the actual Page.Body in its Page.Html.
func search(q string) []Page {
ids := index.Query(q)
items := make([]Page, len(ids))
for i, id := range ids {
name := documents[id]
p, err := loadPage(name)
if err != nil {
fmt.Printf("Error loading %s\n", name)
} else {
p.summarize(q)
items[i] = *p
}
}
fn := func(a, b Page) int {
if a.Score < b.Score {
return 1
} else if a.Score > b.Score {
return -1
} else if a.Title < b.Title {
return -1
} else if a.Title > b.Title {
return 1
} else {
return 0
}
}
slices.SortFunc(items, fn)
return items
}

search.html (new file): 28 lines

@@ -0,0 +1,28 @@
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="format-detection" content="telephone=no">
<meta name="viewport" content="width=device-width">
<title>Search for {{.Query}}</title>
<style>
html { max-width: 70ch; padding: 2ch; margin: auto; color: #111; background: #ffe; }
img { max-width: 20%; }
.result { font-size: larger }
.score { font-size: smaller; opacity: 0.8; }
</style>
</head>
<body>
<h1>Search for {{.Query}}</h1>
<div>
{{if .Results}}
{{range .Items}}
<p><a class="result" href="/view/{{.Name}}">{{.Title}}</a> <span class="score">{{.Score}}</span></p>
<blockquote>{{.Html}}</blockquote>
{{end}}
{{else}}
<p>No results.</p>
{{end}}
</div>
</body>
</html>

search_test.go (new file): 71 lines

@@ -0,0 +1,71 @@
package main
import (
"testing"
"strings"
"os"
)
var name string = "test"
// TestIndex relies on README.md being indexed
func TestIndex (t *testing.T) {
_ = os.Remove(name + ".md")
loadIndex()
q := "Oddµ"
pages := search(q)
if len(pages) == 0 {
t.Log("Search found no result")
t.Fail()
}
for _, p := range pages {
if !strings.Contains(string(p.Body), q) {
t.Logf("Page %s does not contain %s", p.Name, q)
t.Fail()
}
if p.Score == 0 {
t.Logf("Page %s has no score", p.Name)
t.Fail()
}
}
p := &Page{Name: name, Body: []byte("This is a test.")}
p.save()
pages = search("This is a test")
found := false
for _, p := range pages {
if p.Name == name {
found = true
break
}
}
if !found {
t.Logf("Page '%s' was not found", name)
t.Fail()
}
p = &Page{Name: name, Body: []byte("Guvf vf n grfg.")}
p.save()
pages = search("This is a test")
found = false
for _, p := range pages {
if p.Name == name {
found = true
break
}
}
if found {
t.Logf("Page '%s' was still found using the old content: %s", name, p.Body)
t.Fail()
}
pages = search("Guvf")
found = false
for _, p := range pages {
if p.Name == name {
found = true
break
}
}
if !found {
t.Logf("Page '%s' not found using the new content: %s", name, p.Body)
t.Fail()
}
}

snippets.go (new file): 79 lines

@@ -0,0 +1,79 @@
package main
import (
"strings"
"regexp"
)
func snippets (q string, s string) (string, int) {
// Look for Snippets
snippetlen := 100
maxsnippets := 4
// Compile the query as a regular expression
re, err := regexp.Compile("(?i)(" + strings.Join(strings.Split(q, " "), "|") + ")")
// If the compilation didn't work, truncate
if err != nil || len(s) <= snippetlen {
if len(s) > 400 {
s = s[0:400]
}
return highlight(q, s)
}
// show a snippet from the beginning of the document
j := strings.LastIndex(s[:snippetlen], " ")
if j == -1 {
// OK, look for a longer word
j = strings.Index(s, " ")
if j == -1 {
// Or just truncate the body.
if len(s) > 400 {
s = s[0:400]
}
return highlight(q, s)
}
}
t := s[0:j]
res := t + " …"
s = s[j:] // avoid rematching
jsnippet := 0
for jsnippet < maxsnippets {
m := re.FindStringSubmatch(s)
if m == nil {
break
}
jsnippet++
j = strings.Index(s, m[1])
if j > -1 {
// get the substring containing the start of
// the match, ending on word boundaries
from := j - snippetlen / 2
if from < 0 {
from = 0
}
start := strings.Index(s[from:], " ")
if start == -1 {
start = 0
} else {
start += from
}
to := j + snippetlen / 2
if to > len(s) {
to = len(s)
}
end := strings.LastIndex(s[:to], " ")
if end == -1 {
// OK, look for a longer word
end = strings.Index(s[to:], " ")
if end == -1 {
end = len(s)
} else {
end += to
}
}
t = s[start : end];
res = res + t + " …";
// truncate text to avoid rematching the same string.
s = s[end:]
}
}
return highlight(q, res)
}

snippets_test.go (new file): 27 lines

@@ -0,0 +1,27 @@
package main
import (
"testing"
)
func TestSnippets(t *testing.T) {
s := `We are immersed in a sea of dead people. All the dead that have gone before us, silent now, just staring, gaping. As we move and talk and fret, never once stopping to ask ourselves or them! what it was all about. Instead we drown ourselves in noise. Incessantly we babble, surrounded by false friends claiming that all is well. And look at us! Yes, we are well. Patting our backs and expecting a pat and we do! we smugly do enjoy.`
h := `We are immersed in a sea of dead people. <b>All</b> the dead that have gone before us, silent now, just … to ask ourselves or them! what it was <b>all</b> about. Instead we drown ourselves in no<b>is</b>e. … surrounded by false friends claiming that <b>all</b> <b>is</b> <b>well</b>. And look at us! Yes, we are <b>well</b>. …`
q := "all is well"
r, c := snippets(q, s)
if r != h {
t.Logf("The snippets are wrong in 「%s」", r)
t.Fail()
}
// Score 26:
// - all is well (1)
// - all, beginning, end, whole word (+4 × 3 = 12)
// - is, beginning, end, whole word (+4 × 1 = 4), and as a substring (1)
// - well, beginning, end, whole word (+4 × 2 = 8)
if c != 26 {
t.Logf("%s score is %d", q, c)
t.Fail()
}
}


@@ -7,12 +7,19 @@
<title>{{.Title}}</title>
<style>
html { max-width: 70ch; padding: 2ch; margin: auto; color: #111; background: #ffe; }
form { display: inline-block; padding-left: 1em; }
img { max-width: 100%; }
</style>
</head>
<body>
<h1>{{.Title}}</h1>
<p><a href="/edit/{{.Name}}">Edit this page</a></p>
<div>
<a href="/edit/{{.Name}}">Edit this page</a>
<form role="search" action="/search" method="GET">
<input type="text" spellcheck="false" name="q" required>
<button>Search</button>
</form>
</div>
<div>
{{.Html}}
</div>

wiki.go: 120 lines changed

@@ -1,113 +1,122 @@
package main
import (
"github.com/microcosm-cc/bluemonday"
"github.com/gomarkdown/markdown"
"html/template"
"net/http"
"strings"
"regexp"
"bytes"
"fmt"
"os"
)
var templates = template.Must(template.ParseFiles("edit.html", "view.html"))
// Templates are parsed at startup.
var templates = template.Must(template.ParseFiles("edit.html", "view.html", "search.html"))
var validPath = regexp.MustCompile("^/(edit|save|view)/(([a-z]+/)?[^/]+)$")
// validPath is a regular expression where the second group matches a
// page, so when the handler for "/edit/" is called, a URL path of
// "/edit/foo" results in the editHandler being called with title
// "foo". The regular expression doesn't define the handlers (this
// happens in the main function).
var validPath = regexp.MustCompile("^/([^/]+)/(.+)$")
// titleRegexp is a regular expression matching a level 1 header line
// in a Markdown document. The first group matches the actual text and
is used to provide a title for pages. If no title exists in the
// document, the page name is used instead.
var titleRegexp = regexp.MustCompile("(?m)^#\\s*(.*)\n+")
type Page struct {
Title string
Name string
Body []byte
Html template.HTML
}
func (p *Page) save() error {
filename := p.Name + ".md"
return os.WriteFile(filename, bytes.ReplaceAll(p.Body, []byte{'\r'}, []byte{}), 0600)
}
func loadPage(title string) (*Page, error) {
filename := title + ".md"
body, err := os.ReadFile(filename)
if err != nil {
return nil, err
}
return &Page{Title: title, Name: title, Body: body}, nil
}
func renderTemplate(w http.ResponseWriter, tmpl string, p *Page) {
err := templates.ExecuteTemplate(w, tmpl+".html", p)
// renderTemplate is the helper that is used to render the templates with
// data.
func renderTemplate(w http.ResponseWriter, tmpl string, data any) {
err := templates.ExecuteTemplate(w, tmpl+".html", data)
if err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
}
}
// rootHandler just redirects to /view/index.
func rootHandler(w http.ResponseWriter, r *http.Request) {
http.Redirect(w, r, "/view/index", http.StatusFound)
}
func viewHandler(w http.ResponseWriter, r *http.Request, title string) {
// viewHandler renders a text file, if the name ends in ".txt" and
// such a file exists. Otherwise, it loads the page. If this didn't
// work, the browser is redirected to an edit page. Otherwise, the
// "view.html" template is used to show the rendered HTML.
func viewHandler(w http.ResponseWriter, r *http.Request, name string) {
// Short cut for text files
if (strings.HasSuffix(title, ".txt")) {
body, err := os.ReadFile(title)
if (strings.HasSuffix(name, ".txt")) {
body, err := os.ReadFile(name)
if err == nil {
w.Write(body)
return
}
}
// Attempt to load Markdown page; edit it if this fails
p, err := loadPage(title)
p, err := loadPage(name)
if err != nil {
http.Redirect(w, r, "/edit/"+title, http.StatusFound)
http.Redirect(w, r, "/edit/"+name, http.StatusFound)
return
}
// Render the Markdown to HTML, extracting a title and
// possibly sanitizing it
s := string(p.Body)
m := titleRegexp.FindStringSubmatch(s)
if m != nil {
p.Title = m[1]
p.Body = []byte(strings.Replace(s, m[0], "", 1))
}
maybeUnsafeHTML := markdown.ToHTML(p.Body, nil, nil)
html := bluemonday.UGCPolicy().SanitizeBytes(maybeUnsafeHTML)
p.Html = template.HTML(html);
p.handleTitle(true)
p.renderHtml()
renderTemplate(w, "view", p)
}
func editHandler(w http.ResponseWriter, r *http.Request, title string) {
p, err := loadPage(title)
// editHandler uses the "edit.html" template to present an edit page.
// When editing, the page title is not overridden by a title in the
// text. Instead, the page name is used.
func editHandler(w http.ResponseWriter, r *http.Request, name string) {
p, err := loadPage(name)
if err != nil {
p = &Page{Title: title, Name: title}
p = &Page{Title: name, Name: name}
} else {
p.handleTitle(false)
}
renderTemplate(w, "edit", p)
}
func saveHandler(w http.ResponseWriter, r *http.Request, title string) {
// saveHandler takes the "body" form parameter and saves it. The
// browser is redirected to the page view.
func saveHandler(w http.ResponseWriter, r *http.Request, name string) {
body := r.FormValue("body")
p := &Page{Title: title, Name: title, Body: []byte(body)}
p := &Page{Name: name, Body: []byte(body)}
err := p.save()
if err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
http.Redirect(w, r, "/view/"+title, http.StatusFound)
http.Redirect(w, r, "/view/"+name, http.StatusFound)
}
// makeHandler returns a handler that uses the URL path without the
// first path element as its argument, e.g. if the URL path is
// /edit/foo/bar, the editHandler is called with "foo/bar" as its
// argument. This uses the second group from the validPath regular
// expression.
func makeHandler(fn func (http.ResponseWriter, *http.Request, string)) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
m := validPath.FindStringSubmatch(r.URL.Path)
if m == nil {
if m != nil {
fn(w, r, m[2])
} else {
http.NotFound(w, r)
return
}
fn(w, r, m[2])
}
}
// searchHandler presents a search result. It uses the query string in
// the form parameter "q" and the template "search.html". For each
// page found, the HTML is just an extract of the actual body.
func searchHandler(w http.ResponseWriter, r *http.Request) {
q := r.FormValue("q")
items := search(q)
s := &Search{Query: q, Items: items, Results: len(items) > 0}
renderTemplate(w, "search", s)
}
// getPort returns the environment variable ODDMU_PORT or the default
// port, "8080".
func getPort() string {
port := os.Getenv("ODDMU_PORT")
if port == "" {
@@ -121,8 +130,9 @@ func main() {
http.HandleFunc("/view/", makeHandler(viewHandler))
http.HandleFunc("/edit/", makeHandler(editHandler))
http.HandleFunc("/save/", makeHandler(saveHandler))
http.HandleFunc("/search", searchHandler)
loadIndex()
port := getPort()
fmt.Println("Serving a wiki on port " + port)
fmt.Printf("Serving a wiki on port %s\n", port)
http.ListenAndServe(":" + port, nil)
}