As we derive a lot of filenames from strings in UTF-8 encoded files, we
need to make sure that any filename that might might be set by a user –
including all the filenames containing a directory deriving from
$DataDir – are passed through utf8::encode. That is, every character
gets replaced with a sequence of one or more characters that represent
the individual bytes of the character and the UTF8 flag is turned off.
In other words, -d $DataDir might not work if $DataDir contains a UTF-8
encoded string. The solution is to use the following replacements:
-f $name IsFile($name)
-e $name IsFile($name)
-d $name IsDir($name)
(stat($name))[9] Modified($name)
-M $name $Now - Modified($name)
-z $name ZeroSize($name)
unlink $name Unlink($name)
mkdir $name CreateDir($name)
rmdir $name RemoveDir($name)
(Using IsFile for -e is probably not ideal?)
If you don’t, and Oddmuse gets used with Mojolicious, and you use the
Namespaces Extension, and a namespace contains non-ASCII characters such
as ä, ö, or ü, these characters will end up as part of $DataDir and
trigger the problem.
I also wonder whether we should be using some other Perl library.
sub ParseData is fully backwards compatible. If some module runs it in list
context, then it will get listified hash like previously. New code should
always run it in scalar context though (everything in our code base
was changed according to that).
sub GetTextRevision is not backwards compatible (don't let “wantarray” usage
to confuse you). Most modules do not touch that subroutine, so we are probably
fine (modules from our git repo that do use were changed accordingly).
“EncodePage(%$page)” looks wrong. It seems like we should change it to accept
hash ref.
Recently, uploaded files don't just contain #FILE and a MIME type --
the MIME type is followed by a space and optionally more information.
I replaced the hand-coded parsing with a call to TextIsFile and added
better error checking and fixed the error messages (they used $s
instead of %s).