For 8.21.11.127 we have two issues: the key in the result set is
netrange and not inetnum, and the result for netrange is a list of
ranges and not just a single range. The new code no longer presumes to
know the keys. It just goes through all of them, trying to find
something that looks like a range. When it finds an array reference,
it goes through each entry, looking for a range. The first key where
at least one range is found is returned, with all the ranges for that
key. In our case that would be 8.0.0.0 - 8.127.255.255 and 8.21.11.0 -
8.21.11.255.
The ban-contributors code then presents two forms, one for each match.
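The lookup described above can be sketched roughly as follows. This is
a Python illustration, not the Perl code from the commit: the name
find_ranges, the regular expression and the sample result are all
assumptions made for the example.

```python
import re

# A dotted-quad range such as "8.0.0.0 - 8.127.255.255".
RANGE = re.compile(r"(\d+\.\d+\.\d+\.\d+)\s*-\s*(\d+\.\d+\.\d+\.\d+)")


def find_ranges(result):
    """Return all ranges stored under the first key that has at least one."""
    for key, value in result.items():
        # A value is either a single entry or a list of entries.
        candidates = value if isinstance(value, list) else [value]
        ranges = []
        for candidate in candidates:
            match = RANGE.search(str(candidate))
            if match:
                ranges.append((match.group(1), match.group(2)))
        if ranges:
            return ranges
    return []


# Hypothetical result set shaped like the 8.21.11.127 case.
result = {
    "orgname": "Example Org",
    "netrange": ["8.0.0.0 - 8.127.255.255", "8.21.11.0 - 8.21.11.255"],
}
print(find_ranges(result))
```

Python dictionaries keep insertion order while Perl hashes do not, so
"first key" in the real code depends on whatever order the keys are
visited in.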
Without this commit, what used to happen is that if ban-contributors
banned a contributor in a namespace, the pageidx of the main space got
overwritten with the pageidx of the namespace, since the values of
@IndexList and %IndexHash remained unchanged.
I did not want to use Number::Range::Regex because those regular
expressions are somewhat hard to read, so instead some test cases from
actual spammers were added and the code rewritten to be easier to
understand. It should now also be obvious when it breaks.
New approach: save the original value of $DataDir in
$NamespacesRootDataDir. When reading the value of $BannedHosts or
$BannedContent via GetPageContent in UserIsBanned or BannedContent,
and in DoBanHosts for ban-contributors.pl, use the root data dir; when
saving $BannedHosts or $BannedContent via DoPost, use the root data
dir.
To facilitate spamfighting, the namespace is not set when the current
action refers to one of the page ids in the @NamespaceIgnored list.
The default value for these is $BannedContent and $BannedHosts, in
other words, the pages 'BannedContent' and 'BannedHosts'.
The current code always resulted in an empty list of files for
TRANSLATIONS; they did not end up in the build directory; and they did
not get installed elsewhere.
If we want to match borked spam like <a href=http://example> then it's
counterproductive if we remove the URLs because our pattern will have
to be "href=" instead of "href=http". Also it's hard to remember that
URLs are removed.
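A small Python illustration of the problem, assuming a hypothetical
URL-stripping step that stands in for the old behaviour (the real
stripping code is not shown here):

```python
import re

spam = "<a href=http://example>"
pattern = re.compile(r"href=http")

# Hypothetical URL stripping, standing in for the old behaviour.
stripped = re.sub(r"https?://[^\s>]*", "", spam)

matches_raw = bool(pattern.search(spam))           # pattern sees the URL
matches_stripped = bool(pattern.search(stripped))  # URL is gone, no match
print(matches_raw, matches_stripped)
```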
In OpenHtmlEnvironment we simplify the regular expression that is
supposed to detect whether this is a class assignment to a simple
check whether the attribute contains an equal sign.
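As a rough Python sketch of that check: the function name open_tag and
the exact output are assumptions, and the reading that a value without
an equal sign is treated as a bare class name is mine, not something
the commit states.

```python
def open_tag(element, attr=""):
    """Build an opening tag from an element name and an attribute string."""
    if not attr:
        return f"<{element}>"
    if "=" in attr:
        # Looks like one or more full attribute assignments: use as-is.
        return f"<{element} {attr}>"
    # No equal sign: assume it is a bare class name.
    return f'<{element} class="{attr}">'


print(open_tag("div", "sidebar"))
print(open_tag("div", 'lang="en" class="sidebar"'))
```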
Trying to get more HTML5 elements used.
PrintAllPages:
Use the article element instead of a div with class "page". The new
article element still has the "h-entry" class that the old div had.
The h1 element for these pages used to have the class "entry-title"
which is apparently deprecated. The new code now uses the "p-name"
class.
The page content is no longer surrounded by a div with the
"entry-content" class and the appropriate lang attribute. We rely on
PrintPageHtml to do the right thing, now.
PrintPageHtml:
Surround the page being printed with a div containing the "e-content"
class and an appropriate lang attribute.
PageHtml:
This also uses PrintPageHtml and therefore doesn't need to surround
the page content with a div containing the "page" class and the lang
attribute.
As PageHtml is used in RSS feed generation, that means that the feed
entries now don't have a div containing the "page" class but a div
containing the "e-content" class.
GetHeaderDiv:
Instead of using a div with the "header" class, use the header
element.
Instead of using a div with the "menu" class, use the nav element.
PrintPageContent:
No changes! We're not changing the div here because the content that
is being printed here does not belong in an article element. It is
not "a self-contained composition in a … page … intended to be
independently distributable or reusable" – it *is* the page
itself (without the h1 header).
PrintFooter:
Use an additional footer element.
DefaultFooter:
Remove the div with the "footer" class.
References:
* http://microformats.org/wiki/h-entry
* https://developer.mozilla.org/en-US/docs/Web/HTML/Element/header
* https://developer.mozilla.org/en-US/docs/Web/HTML/Element/nav
* https://developer.mozilla.org/en-US/docs/Web/HTML/Element/article
* https://developer.mozilla.org/en-US/docs/Web/HTML/Element/footer
The problem is that by default the test-data/config file contains
$ScriptName = 'http://localhost/wiki.pl' but morbo serves the site at
http://127.0.0.1:8080. We therefore append a new $ScriptName
assignment if the correct one doesn't exist. The alternative is
tricky because of the /wiki.pl prefix; fixing that would require a lot
more code, I suspect.
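The append-if-missing step might look like this in Python. It is only
a sketch: the helper name ensure_script_name and the exact form of the
appended line are assumptions, and the real change may well be done
differently.

```python
import tempfile
from pathlib import Path


def ensure_script_name(config_path, url):
    """Append a $ScriptName assignment unless the wanted one is present."""
    wanted = f"$ScriptName = '{url}';"
    path = Path(config_path)
    text = path.read_text() if path.exists() else ""
    if wanted in text:
        return False  # nothing to do
    if text and not text.endswith("\n"):
        text += "\n"
    # The later assignment overrides the earlier one when the config runs.
    path.write_text(text + wanted + "\n")
    return True


# Demo on a throwaway copy of a config file.
with tempfile.TemporaryDirectory() as tmp:
    config = Path(tmp) / "config"
    config.write_text("$ScriptName = 'http://localhost/wiki.pl';\n")
    print(ensure_script_name(config, "http://127.0.0.1:8080"))
    print(ensure_script_name(config, "http://127.0.0.1:8080"))
```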
DuckDuckGo search doesn't use the www subdomain anymore.
The raw recent changes return the bogus hash (four octal digits)
instead of Anonymous before maintenance anonymises the entry.
When serving recent changes, we know the username and host of the
person making the edit. We use GetAuthorLink to show either the name
linked to the username, or "Anonymous", or a colour coded bogus hash
of their host (that's the four octal digits, hopefully colourized by
your CSS).
When serving raw changes, we used to serve just the username or
"Anonymous". In order to help use cases such as the Gemini wiki
running on gemini://alexschroeder.ch:1965 which consumes raw changes
to present a view that is compatible with Gemini Wiki, we'd like those
bogus hashes as well. This commit does that by splitting ColorCode into
Code and ColorCode such that we can use Code when serving raw changes.
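The split can be pictured with this Python sketch. The hash function,
the markup and the helper raw_author are assumptions for illustration;
only the idea of a plain code wrapped by a coloured variant comes from
the commit.

```python
import hashlib


def code(host):
    """Four octal digits derived from the host (hypothetical hash)."""
    digest = hashlib.sha256(host.encode("utf-8")).digest()
    # Keep 12 bits, which is exactly four octal digits.
    value = int.from_bytes(digest[:2], "big") & 0o7777
    return f"{value:04o}"


def color_code(host):
    """Wrap each digit of code() in a span so CSS can colour it."""
    return "".join(f'<span class="c{d}">{d}</span>' for d in code(host))


def raw_author(username, host):
    """Author field for raw changes: name, else code, else Anonymous."""
    if username:
        return username
    if host:
        return code(host)
    return "Anonymous"


print(raw_author("", "example.org"))
```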
Up to now it was assumed that the raw wiki text would not be written
as Gemtext, but increasingly that is not the case. This commit adds
handling of Gemtext links.
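For reference, a Gemtext link line starts with "=>", followed by a
URL and an optional label. A minimal Python sketch of parsing such a
line follows; parse_gemtext_link is a hypothetical helper and not the
gemini_link function from the code base.

```python
import re

# "=>", optional whitespace, a URL, then an optional label.
LINK = re.compile(r"^=>[ \t]*(\S+)(?:[ \t]+(.*\S))?[ \t]*$")


def parse_gemtext_link(line):
    """Return (url, label) for a Gemtext link line, or None otherwise."""
    match = LINK.match(line)
    if not match:
        return None
    url, label = match.group(1), match.group(2)
    # Without a label, fall back to showing the URL itself.
    return url, label or url


print(parse_gemtext_link("=> gemini://example.org/ Example site"))
```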
gemini_link now handles URLs and is used for all links in
serve_gemini_page.
Paragraph splits now happen at the beginning of list items and when
line breaks are requested. It's not great but what else are you going
to do?
Handle image links.
Handle HTML tags (by ignoring them).
Raw pages are served as text/plain instead of text/markdown.