This reverts commit 1ee235c949.
Sadly, I discovered that my websites no longer correctly displayed pages
containing characters such as the EN DASH. Reverting this commit seems
to solve the problem.
On a new server with Debian Jessie (8), Apache 2.4, Perl 5.25.1 and
CGI 4.28 I'm getting double-decoded page names. An Umlaut will thus
turn into an undisplayable character (a questionmark in a black
diamond). Decoding of path_info was necessary on my old server with
Debian Wheezy (7), Apache 2.2, Perl 5.14.2 and CGI 3.52.
If you're still in the unfortunate situation, you can copy the old
implementation of GetId into your config file:
sub GetId {
my $id = UnquoteHtml(GetParam('id', GetParam('title', ''))); # id=x or title=x -> x
if (not $id and $q->keywords) {
$id = decode_utf8(join('_', $q->keywords)); # script?p+q -> p_q
}
if ($UsePathInfo and $q->path_info) {
my @path = map { decode_utf8($_) } split(/\//, $q->path_info);
$id ||= pop(@path); # script/p/q -> q
foreach my $p (@path) {
SetParam($p, 1); # script/p/q -> p=1
}
}
return $id;
}
As we derive a lot of filenames from strings in UTF-8 encoded files, we
need to make sure that any filename that might might be set by a user –
including all the filenames containing a directory deriving from
$DataDir – are passed through utf8::encode. That is, every character
gets replaced with a sequence of one or more characters that represent
the individual bytes of the character and the UTF8 flag is turned off.
In other words, -d $DataDir might not work if $DataDir contains a UTF-8
encoded string. The solution is to use the following replacements:
-f $name IsFile($name)
-e $name IsFile($name)
-d $name IsDir($name)
(stat($name))[9] Modified($name)
-M $name $Now - Modified($name)
-z $name ZeroSize($name)
unlink $name Unlink($name)
mkdir $name CreateDir($name)
rmdir $name RemoveDir($name)
(Using IsFile for -e is probably not ideal?)
If you don’t, and Oddmuse gets used with Mojolicious, and you use the
Namespaces Extension, and a namespace contains non-ASCII characters such
as ä, ö, or ü, these characters will end up as part of $DataDir and
trigger the problem.
I also wonder whether we should be using some other Perl library.
All occurence of tuns into
-f $name IsFile($name)
-e $name IsFile($name)
-d $name IsDir($name)
(stat($name))[9] Modified($name)
-M $name $Now - Modified($name)
unlink $name Unlink($name)
mkdir $name CreateDir($name)
rmdir $name RemoveDir($name)
This change is incomplete. All the modules also need to be changed.
The benefit of this change is that t/mojolicious-namespaces.t passes.
The old code would only call the the function provided
if pagination indicated that it was OK to do so. Thus,
only the first ten pages got replaced. This has been
fixed and tests have been added.
Everything on oddmuse.org now redirects to https, which means that every wiki
that is using default style sheet requires two requests to get the css file.
It seems like $ShowEdits feature was half broken (not all occurances were
actually defaulted to its value). In the last commit I did the same mistake
with $ShowAll. This is now fixed.
Also, for completeness, I decided to add $ShowRollbacks as well.
Sometimes you might want to “List all changes” and “Include minor
changes” by default. We can already change the default value of the
latter by using $ShowEdits variable, but another one was unsettable from
the config file. Now we have $ShowAll variable.
This used to be string with a trailing whitespace, but actual use was
wrong: T('Return to ' . NormalToFree($id)) – this concatenates the page
name and then attempts to translate the result, which never works. The
correct usage is Ts('Return to %s', NormalToFree($id)).
search.t: Using braces without escaping them in regular expressions
trigges a warning. wiki.pl will now quotemeta the replacement string
when highlighting changes.
upgrade-files.t: Only remove the old UseMod directory if it actually
exists in order to fix some warnings.
wiki.pl: only reading the log file when open actually succeeded in order
to fix some warnings.
Currently the config file and modules are supposed to be in $DataDir,
which does make any sense from security point of view. Files with code
should not be in directories that are writable by www-data.
Previously you had to use a wrapper script to work around that. Now we
provide special variables.
Please note that oddmuse will sometimes cache data by using Storable.
Such cache is saved to the disk and then read back when required. This,
however, is an insecure operation given that there is a risk that the
file will be manipulated from www-data in a malicious way.
Previously if some user cancelled his request (simply by pressing Stop button
in his browser), then the script will receive a TERM signal or the like.
This means that some locks could be left behind, which required someone
to unlock the wiki manually (by using Unlock Wiki action).
Now we remove these locks automatically!
However, some tasks might want to handle such situations gracefully. That's why
%LockCleaners hash was added. Use lock name as the key and put a coderef as a
value. If SIGTERM (or a bunch of other signals) is received, then it will run
this code, which, supposedly, cleans all of the stuff after it. Private Wiki
Extension was changed according to that, so you can see it in action.
Also, tests added!
RollbackPossible needs to handle the situation where $KeepDays == 0.
DoRollback used to examine all the pages where rollback was
possible (using $KeepDays) but in order to avoid the special case where
$KeepDays == 0, we can also examine all the pages changed after the
target timestamp $to.
Once again, bitten by Perl. We used to print a cache "if ($Page{blocks}
and $Page{flags} and GetParam('cache', $UseCache) > 0)" -- but if we
have exactly one clean block, then flags will be "0" which is false. So
now we're testing for defined $Page{flags}.
There was a huge discussion with a lot of tension:
https://oddmuse.org/wiki/Revolutionary_Changes#ForgiveAndForget
And also the comments:
https://oddmuse.org/wiki/Comments_on_Revolutionary_Changes#ForgiveAndForget
But in the end, it is safer to have a history which is not broken.
Don't get it wrong, ForgiveAndForget is still a good thing, it's just not what
we should do *by default*.
If your wiki does benefit from ForgiveAndForget, then add this to your config:
$KeepDays = 14;
Although this change solves a couple of important problems, it does not address
new ones that arise because of no ForgiveAndForget. Namely it does not
resolve the problem of deleting stuff when you *really* have to do it. For
example, [[DMCA Extension]] (or similarly named extension with the same
purpose) should be developed. These problems existed for a long time, because
people were using “$KeepDays = 0” a lot. It is just that we started to accept
wikis with no ForgiveAndForget more thoroughly.
In other words, this commit is just part of the bigger change.
Why don't we set it to 5 years? Because then it will be a time bomb that will
be triggered unexpectedly. We should have a more predictable default value.
Remember the problem with toc.pl when the whole page was *sometimes* not
utf8-decoded? There were some thoughts that it might be associated with
memory files, and it is correct. Although I was not able to narrow it down
last time, now I did (simply because this problem appeared elsewhere).
If you look at $output variable after utf8::decode with Devel::Peek, you
will see two variants of flags. This one looks good:
FLAGS = (PADMY,POK,pPOK,UTF8)
And this one is wrong:
FLAGS = (PADMY,POK,pPOK)
This problem is weird because it works inconsistently. Most of the time
you will get correct output, but sometimes it will be broken.
Someone has to golf it down to something short in order to submit perl
bug report. This, however, does not look like a simple task.
Current workaround is as stupid as it looks like, but it works.
Somehow assigning it to another variable solves the problem (which, by the
way, is similar to solving other perl string-related problems).
ReplaceAndDiff calls Replace, which loops over all pages. That's why we
don't need to call it from SearchTitleAndBody -- that makes our code
runs way too often.
“There are no comments, yet. Be the first to leave a comment!” – that's what
you will see when you preview your comment on an empty page.
Since the user is already doing it, there is no need to tell that. Also, it
may look like it is part of the preview.
We no longer do that (after this commit). In other words, the preview should
look exactly like the resulting page.
The original issue was that looking at all changes (action=rc all=1) the
resulting diff didn't always make sense if you clicked on the diff link.
It showed the difference between that revision and the current revision.
The PrintHtmlDiff sub was changed significantly to make it easier to
understand and to help fix this issue.
The drawback is that it now requires a new key in page and keep files:
lastmajorsummary. It goes with lastmajor and diff-major and records the
summary for that particular edit. As new changes will start recording
this new key, the change will slowly propagate in existing wikis.
Whenever you look at minor diffs, however, the existing summary key is
chosen. Plus, whenever you want to look at differences between
particular revisions, this is equivalent to looking at minor diffs. So
the only situation that is problematic is an edit history like the
following:
A - major change
B - major change (major diff, major summary, last major revision)
C - minor change
When looking at this page with diff=2, we want to show major diff, major
summary, last major revision. If B happened before this commit was
installed, the summary will be missing.
Changing everything to return objects is a worthy goal, but for now we have
taken enough destructive steps towards it. Therefore, this commit fixes the
problem in backwards compatible way (by adding one more parameter to the
signatures).
Note that this additional parameter is NOT a timestamp, it is a whole page
object. Which means that we are still moving towards our goal of using page
objects everywhere, this commit is just doing it in a backwards-compatible
way.
sub ParseData is fully backwards compatible. If some module runs it in list
context, then it will get listified hash like previously. New code should
always run it in scalar context though (everything in our code base
was changed according to that).
sub GetTextRevision is not backwards compatible (don't let “wantarray” usage
to confuse you). Most modules do not touch that subroutine, so we are probably
fine (modules from our git repo that do use were changed accordingly).
“EncodePage(%$page)” looks wrong. It seems like we should change it to accept
hash ref.