Compare commits

..

16 Commits
2.0.1 ... 2.1

Author SHA1 Message Date
Alex Schroeder
741601489f Fix cookie encoding issues
If the username was set to Schröder, the wiki would keep reporting the
username as changed. This was caused by encoding errors if cookie
values. This change also allows non-ASCII parameters to be stored in
the cookie.
2012-06-23 01:08:46 +02:00
Alex Schroeder
217055fab2 Pod fix.
Changed =head to =head2 as suggested by anonymous visitor to the wiki.
2012-06-22 18:23:51 +02:00
Alex Schroeder
8f68442db1 Fix UTF-8 decoding issue when using toc.pl
The output of ApplyRules is no explicitly decoded as UTF8.
2012-06-22 02:54:29 +02:00
Alex Schroeder
b9d0c60080 Added copyright year. 2012-06-14 10:21:40 +02:00
Alex Schroeder
2b2e45b952 Merge branch 'master' of git.sv.gnu.org:/srv/git/oddmuse 2012-06-14 10:20:17 +02:00
Alex Schroeder
dfc3555184 Merge branch 'add-thumbs' 2012-06-14 10:19:33 +02:00
Alex Schroeder
474798c5cd Handle the new file format for uploaded files
Recently, uploaded files don't just contain #FILE and a MIME type --
the MIME type is followed by a space and optionally more information.
I replaced the hand-coded parsing with a call to TextIsFile and added
better error checking and fixed the error messages (they used $s
instead of %s).
2012-06-14 10:04:54 +02:00
Alex Schroeder
0a54f14a6f From the wiki
Taken the version from the wiki and added the standard
$ModulesDescription.
2012-06-14 08:39:25 +02:00
Alex Schroeder
04cdf0be24 Fixed cookie decoding
If the username contained a non-ASCII character, eg. Schröder, then
the script would keep printing the cookie message, telling the user
that the cookie had changed when in fact it had not.
2012-05-25 17:41:01 +02:00
Alex Schroeder
e531f9d569 Encoding the cookie and fixing drafts.pl
Drafts are saved using the username as filename. This must also be
encoded and decoded correctly. Because of NFC and NFD issues on Mac
HFS, an appropriate normalization was added to mac.pl.

As the username is also part of the cookie, this showed that the
Cookie content wasn't being encoded correctly, so that was fixed, too.
2012-05-25 11:56:46 +02:00
Alex Schroeder
4f6407fd38 Resolved conflict. 2012-05-25 08:29:34 +02:00
Alex Schroeder
3174e184f9 Fix an encoding error in namespaces.pl on Debian
The Debian installation uses ext3 and therefore raw bytes for
filenames unlike the HFS filesystem of Mac OSX.

Copyright years were updated. The maintenance output of for drafts was
cleaned up.
2012-05-25 01:00:10 +02:00
Alex Schroeder
8d94a0a50f Merge branch 'master' of git.sv.gnu.org:/srv/git/oddmuse
Conflicts:
	modules/mac.pl
	t/encoding.t
	wiki.pl
2012-05-24 18:31:44 +02:00
Alex Schroeder
67650e3dc8 More UTF-8 fixes
All the source files containing non-ASCII characters needed to have
utf8 added. This will be necessary for user config files as well! The
regular expressions identifying page names had to be changed.
UrlEncode translates the string back to bytes before encoding it.
Cached RSS files are saved with UTF-8 encoding and therefore need
their meta-data changed (using the XML::RSS module to do this
correctly didn't work for some of the test files). The CGI object's
parameters, keywords and info_path are decoded correctly. File access
uses the UTF-8 layer (reading, writing, appending, access to the log
of recent changes, running sub processes with grep and diff).

The mac compatibility extension will also disable the use of grep if
non-ASCII characters are searched for because of an unexplained
problem with grep.
2012-05-24 18:08:42 +02:00
Alex Schroeder
700d412a01 More UTF-8 fixes
All the source files containing non-ASCII characters needed to have
utf8 added. This will be necessary for user config files as well! The
regular expressions identifying page names had to be changed.
UrlEncode translates the string back to bytes before encoding it.
Cached RSS files are saved with UTF-8 encoding and therefore need
their meta-data changed (using the XML::RSS module to do this
correctly didn't work for some of the test files). The CGI object's
parameters, keywords and info_path are decoded correctly. File access
uses the UTF-8 layer (reading, writing, appending, access to the log
of recent changes, running sub processes with grep and diff).

The mac compatibility extension will also disable the use of grep,
because I could not get it to work (and I don't think there will be
large Oddmuse installations running on Apple's HFS).
2012-05-24 16:58:10 +02:00
Alex Schroeder
cd2b4d624e Remove $HttpCharset and moved everything to UTF-8
This also required some changes to the tests where explicit UTF-8
encoding had been used in the past.
2012-05-22 11:50:23 +02:00
70 changed files with 722 additions and 273 deletions

View File

@@ -1,8 +0,0 @@
.DS_Store
oddmuse-1.*
oddmuse-inkscape-1.*
*.patch
*.patch.gz
*.diff
*.diff.gz
test-data

View File

@@ -1 +0,0 @@
pkg

View File

@@ -1,2 +0,0 @@
install
var

View File

@@ -1,5 +0,0 @@
current.pl
FDL
GPL
*.tar.gz
*.tar.gz.sig

View File

@@ -1 +0,0 @@
.DS_Store

View File

@@ -55,7 +55,7 @@ sub DoAtomIntrospection {
push(@types, @UploadTypes) if $UploadAllowed;
my $upload = '<accept>' . join(', ', @types) . '</accept>';
print <<EOT;
<?xml version="1.0" encoding='$HttpCharset'?>
<?xml version="1.0" encoding='UTF-8'?>
<service xmlns="http://purl.org/atom/app#">
<workspace title="Wiki" >
<collection title="$SiteName" href="$ScriptName/atom/wiki">
@@ -79,7 +79,7 @@ sub GetRcAtom {
my $historyPrefix = QuoteHtml($ScriptName) . "?action=history;id=";
my $limit = GetParam("rsslimit", 15); # Only take the first 15 entries
my $count = 0;
my $feed = qq{<?xml version="1.0" encoding="$HttpCharset"?>\n};
my $feed = qq{<?xml version="1.0" encoding="UTF-8"?>\n};
if ($RssStyleSheet =~ /\.(xslt?|xml)$/) {
$feed .= qq{<?xml-stylesheet type="text/xml" href="$RssStyleSheet" ?>\n};
} elsif ($RssStyleSheet) {

View File

@@ -29,25 +29,21 @@ sub DoConfig {
\$EditPass = "";
};
my $source = GetRaw('http://www.emacswiki.org/scripts/current');
foreach my $var qw($HomePage $MaxPost $HttpCharset $StyleSheet
$StyleSheetPage $NotFoundPg $NewText $NewComment
$EditAllowed $BannedHosts $BannedCanRead
$BannedContent $WikiLinks $FreeLinks $BracketText
$BracketWiki $NetworkFile $AllNetworkFiles
$PermanentAnchors $InterMap $NearMap
$RssInterwikiTranslate $SurgeProtection
$SurgeProtectionTime $SurgeProtectionViews
foreach my $var qw($HomePage $MaxPost $StyleSheet $StyleSheetPage $NotFoundPg
$NewText $NewComment $EditAllowed $BannedHosts
$BannedCanRead $BannedContent $WikiLinks $FreeLinks
$BracketText $BracketWiki $NetworkFile $AllNetworkFiles
$PermanentAnchors $InterMap $NearMap $RssInterwikiTranslate
$SurgeProtection $SurgeProtectionTime $SurgeProtectionViews
$DeletedPage $RCName @RcDays $RcDefault $KeepDays
$KeepMajor $SummaryHours $SummaryDefaultLength
$ShowEdits $UseLookup $RecentTop $RecentLink
$PageCluster $InterWikiMoniker $SiteDescription
$RssImageUrl $RssRights $RssExclude
$RssCacheHours $RssStyleSheet $UploadAllowed
@UploadTypes $EmbedWiki $FooterNote $EditNote
$TopLinkBar @UserGotoBarPages $UserGotoBar
$ValidatorLink $CommentsPrefix $HtmlHeaders
$IndentLimit $LanguageLimit $JournalLimit
$SisterSiteLogoUrl %SpecialDays %Smilies
$KeepMajor $SummaryHours $SummaryDefaultLength $ShowEdits
$UseLookup $RecentTop $RecentLink $PageCluster
$InterWikiMoniker $SiteDescription $RssImageUrl $RssRights
$RssExclude $RssCacheHours $RssStyleSheet $UploadAllowed
@UploadTypes $EmbedWiki $FooterNote $EditNote $TopLinkBar
@UserGotoBarPages $UserGotoBar $ValidatorLink
$CommentsPrefix $HtmlHeaders $IndentLimit $LanguageLimit
$JournalLimit $SisterSiteLogoUrl %SpecialDays %Smilies
%Languages) {
my $default = undef;
my $re = quotemeta($var);

View File

@@ -156,7 +156,7 @@ sub CreoleInit {
# $FullUrlPattern = "((?:$UrlProtocols:|/)$UrlChars+)";
# Permit page authors to link to other pages having semicolons in their names.
# my $LinkCharsSansZero = "-;,.()' _1-9A-Za-z\x80-\xff";
# my $LinkCharsSansZero = "-;,.()' _1-9A-Za-z\x{0080}-\x{ffff}";
# my $LinkChars = $LinkCharsSansZero.'0';
# $FreeLinkPattern = "([$LinkCharsSansZero]|[$LinkChars][$LinkChars]+)";
}
@@ -623,8 +623,8 @@ sub CreoleRuleRecursive {
elsif (m/\G\s+/cg) {
$html .= ' ';
}
elsif ( m/\G([A-Za-z\x80-\xff]+([ \t]+[a-z\x80-\xff]+)*[ \t]+)/cg
or m/\G([A-Za-z\x80-\xff]+)/cg
elsif ( m/\G([A-Za-z\x{0080}-\x{ffff}]+([ \t]+[a-z\x{0080}-\x{ffff}]+)*[ \t]+)/cg
or m/\G([A-Za-z\x{0080}-\x{ffff}]+)/cg
or m/\G(\S)/cg) {
$html .= $1; # multiple words but do not match http://foo
}

View File

@@ -1,20 +1,16 @@
# Copyright (C) 2006 Alex Schroeder <alex@emacswiki.org>
# Copyright (C) 2006, 2012 Alex Schroeder <alex@gnu.org>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
# This program is free software; you can redistribute it and/or modify it under
# the terms of the GNU General Public License as published by the Free Software
# Foundation; either version 3 of the License, or (at your option) any later
# version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# This program is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the
# Free Software Foundation, Inc.
# 59 Temple Place, Suite 330
# Boston, MA 02111-1307 USA
# You should have received a copy of the GNU General Public License along with
# this program. If not, see <http://www.gnu.org/licenses/>.
use vars qw($DraftDir);
@@ -36,6 +32,7 @@ $Action{draft} = \&DoDraft;
sub DoDraft {
my $id = shift;
my $draft = $DraftDir . '/' . GetParam('username', GetRemoteHost());
utf8::encode($draft);
if ($id) {
my $text = GetParam('text', '');
ReportError(T('No text to save'), '400 BAD REQUEST') unless $text;
@@ -70,15 +67,23 @@ sub DraftNewGetEditForm {
push(@MyMaintenance, \&DraftCleanup);
sub DraftFiles {
return map {
$_ = substr($_, length($DraftDir) + 1);
utf8::decode($_);
$_;
} glob("$DraftDir/* $DraftDir/.*");
}
sub DraftCleanup {
print '<p>' . T('Draft Cleanup');
foreach my $draft (glob("$DraftDir/* $DraftDir/.*")) {
next if $draft =~ m!/\.\.?$!;
my $ts = (stat($draft))[9];
foreach my $draft (DraftFiles()) {
next if $draft eq '.' or $draft eq '..';
my $ts = (stat("$DraftDir/$draft"))[9];
if ($Now - $ts < 1209600) { # 14*24*60*60
print $q->br(), Tss("%1 was last modified %2 and was kept",
$draft, CalcTimeSince($Now - $ts));
} elsif (unlink($draft)) {
} elsif (unlink("$DraftDir/$draft")) {
print $q->br(), Tss("%1 was last modified %2 and was deleted",
$draft, CalcTimeSince($Now - $ts));
} else {

View File

@@ -36,29 +36,7 @@ function togglecomments (id) {
} unless $HtmlHeaders =~ /commenthidden/; # mod_perl?
}
sub SafeId {
my $id = shift;
my $regexp = "";
$regexp = "|\xc3[\x80-\x96\x98-\xb6\xb8-\xff]|[\xc4-\xca].|\xcb[\x00-\xbf]"
. "|\xcd[\xb0-\xbd\xbf-\xff]|[\xce-\xDF].|\xe0..|\xe1[\x00-\xbe]."
. "|\xe1\xbf[\x00-\xbf]|\xe2\x80[\x8c\x8d]"
if $HttpCharset eq 'UTF-8';
# Unicode Codepoint UTF-8 encoding
# [#xC0-#xD6] c3 80 - c3 96
# [#xD8-#xF6] c3 98 - c3 b6
# [#xF8-#x2FF] c3 b8 - cb bf
# [#x370-#x37D] cd b0 - cd bd
# [#x37F-#x1FFF] cd bf - e1 bf bf
# [#x200C-#x200D] e2 80 8c - e2 80 8d
# [#x2070-#x218F] e2 81 b0 - e2 86 8f -- FIXME \
# [#x2C00-#x2FEF] e2 b0 80 - e2 bf af -- FIXME |
# [#x3001-#xD7FF] e3 80 81 - ed 9f bf -- FIXME | these are missing
# [#xF900-#xFDCF] ef a4 80 - ef b7 8f -- FIXME | in the regexp above
# [#xFDF0-#xFFFD] ef b7 b0 - ef bf bd -- FIXME |
# [#x10000-#xEFFFF] f0 90 80 80 - f3 af bf bf -- FIXME /
$id = ":$id" unless $id =~ /^[:_A-Za-z]$regexp/;
return join('', $id =~ m/([-.:_A-Za-z0-9]$regexp)/g);
}
my $num = 0;
*DynamicCommentsOldGetPageLink = *GetPageLink;
*GetPageLink = *DynamicCommentsNewGetPageLink;
@@ -70,10 +48,10 @@ sub DynamicCommentsNewGetPageLink {
$title =~ s/_/ /g;
my $page = PageHtml($id);
if ($page) {
my $safe = SafeId($id);
return qq{<a href="javascript:togglecomments('$safe')">$title</a>}
my $anchor = "id" . $num++;
return qq{<a href="javascript:togglecomments('$anchor')">$title</a>}
. '</p>' # close p before opening div
. $q->div({-class=>commenthidden, -id=>$safe},
. $q->div({-class=>commenthidden, -id=>$anchor},
$page,
$q->p(DynamicCommentsOldGetPageLink($id, T('Add Comment'))))
. '<p>'; # open an empty p that will be closed in PrintAllPages

View File

@@ -22,7 +22,7 @@ push(@MyRules, \&LinkAllRule);
$RuleOrder{\&LinkAllRule} = 1000;
sub LinkAllRule {
if (/\G([A-Za-z\x80-\xff]+)/gc) {
if (/\G([A-Za-z\x{0080}-\x{ffff}]+)/gc) {
my $oldpos = pos;
Dirty($1);
# print the word, or the link to the word

View File

@@ -18,24 +18,68 @@
$ModulesDescription .= '<p><a href="http://git.savannah.gnu.org/cgit/oddmuse.git/tree/modules/mac.pl">mac.pl</a>, see <a href="http://www.oddmuse.org/cgi-bin/oddmuse/Mac">Mac</a></p>';
use Encode;
use Unicode::Normalize;
*OldAllPagesList = *AllPagesList;
*AllPagesList = *NewAllPagesList;
*OldMacAllPagesList = *AllPagesList;
*AllPagesList = *NewMacAllPagesList;
sub NewAllPagesList {
sub NewMacAllPagesList {
$refresh = GetParam('refresh', 0);
if ($IndexInit && !$refresh) {
return @IndexList;
}
OldAllPagesList(@_);
OldMacAllPagesList(@_);
my @new = ();
%IndexHash = ();
foreach my $id (@IndexList) {
$id = encode_utf8(NFC(decode_utf8($id)));
$id = NFC($id);
push(@new, $id);
$IndexHash{$id} = 1;
}
@IndexList = @new;
return @new;
}
push(@MyInitVariables, \&MacFixEncoding);
sub MacFixEncoding {
# disable grep if searching for non-ascii stuff:
# $ mkdir /tmp/dir
# $ echo schroeder > /tmp/dir/schroeder
# $ echo schröder > /tmp/dir/schröder
# $ echo SCHRÖDER > /tmp/dir/SCHRÖDER-UP # don't use SCHRÖDER because of HFS
# $ grep -rli schröder /tmp/dir
# /tmp/dir/schröder
# $ grep -rli SCHRÖDER /tmp/dir
# /tmp/dir/schröder
#
# Why is grep not finding the upper case variant in the SCHRÖDER-UP
# file?
$UseGrep = 0 if GetParam('search', '') =~ /[x{0080}-\x{ffff}]/;
# the rest is only necessary if using namespaces.pl
return unless defined %Namespaces;
while (my ($key, $value) = each %Namespaces) {
delete $Namespaces{$key};
utf8::decode($key);
$key = NFC($key);
$Namespaces{$key} = $NamespaceRoot . '/' . $key . '/';
}
while (my ($key, $value) = each %InterSite) {
delete $InterSite{$key};
utf8::decode($key);
$key = NFC($key);
$InterSite{$key} = $Namespaces{$key} if $Namespaces{$key};
}
}
# for drafts.pl
*OldMacDraftFiles = *DraftFiles;
*DraftFiles = *NewMacDraftFiles;
sub NewMacDraftFiles {
return map { NFC($_) } OldMacDraftFiles(@_);
}

View File

@@ -198,8 +198,8 @@ sub MarkdownNearInit {
sub DoWikiWords {
my $text = shift;
my $WikiWord = '[A-Z]+[a-z\x80-\xff]+[A-Z][A-Za-z\x80-\xff]*';
my $FreeLinkPattern = "([-,.()' _0-9A-Za-z\x80-\xff]+)";
my $WikiWord = '[A-Z]+[a-z\x{0080}-\x{ffff}]+[A-Z][A-Za-z\x{0080}-\x{ffff}]*';
my $FreeLinkPattern = "([-,.()' _0-9A-Za-z\x{0080}-\x{ffff}]+)";
if ($FreeLinks) {
# FreeLinks
@@ -299,7 +299,7 @@ sub CreateWikiLink {
sub UnescapeWikiWords {
my $text = shift;
my $WikiWord = '[A-Z]+[a-z\x80-\xff]+[A-Z][A-Za-z\x80-\xff]*';
my $WikiWord = '[A-Z]+[a-z\x{0080}-\x{ffff}]+[A-Z][A-Za-z\x{0080}-\x{ffff}]*';
# Unescape escaped WikiWords
$text =~ s/\\($WikiWord)/$1/g;
@@ -379,7 +379,7 @@ sub NewEncodeCode {
$text = OldEncodeCode($text);
# Protect Wiki Words
my $WikiWord = '[A-Z]+[a-z\x80-\xff]+[A-Z][A-Za-z\x80-\xff]*';
my $WikiWord = '[A-Z]+[a-z\x{0080}-\x{ffff}]+[A-Z][A-Za-z\x{0080}-\x{ffff}]*';
$text =~ s!($WikiWord)!\\$1!gx;
return $text;
@@ -471,7 +471,7 @@ sub MarkdownAddComment {
sub NewDoAnchors {
my $text = shift;
my $WikiWord = '[A-Z]+[a-z\x80-\xff]+[A-Z][A-Za-z\x80-\xff]*';
my $WikiWord = '[A-Z]+[a-z\x{0080}-\x{ffff}]+[A-Z][A-Za-z\x{0080}-\x{ffff}]*';
return Markdown::_DoAnchors($text);
}

View File

@@ -80,9 +80,9 @@ $RuleOrder{\&MarkupRule} = 150;
%MarkupLines = ('>' => 'pre',
);
my $words = '([A-Za-z\x80-\xff][-%.,:;\'"!?0-9 A-Za-z\x80-\xff]*?)';
my $words = '([A-Za-z\x{0080}-\x{ffff}][-%.,:;\'"!?0-9 A-Za-z\x{0080}-\x{ffff}]*?)';
# zero-width look-ahead assertion to prevent km/h from counting
my $noword = '(?=[^-0-9A-Za-z\x80-\xff]|$)';
my $noword = '(?=[^-0-9A-Za-z\x{0080}-\x{ffff}]|$)';
my $markup_pairs_re = '';
my $markup_forced_pairs_re = '';
@@ -174,7 +174,7 @@ sub MarkupRule {
return $MarkupSingles{UnquoteHtml($1)};
} elsif ($MarkupPairs{'/'} and m|\G~/|gc) {
return '~/'; # fix ~/elisp/ example
} elsif ($MarkupPairs{'/'} and m|\G(/[-A-Za-z0-9\x80-\xff/]+/$words/)|gc) {
} elsif ($MarkupPairs{'/'} and m|\G(/[-A-Za-z0-9\x{0080}-\x{ffff}/]+/$words/)|gc) {
return $1; # fix /usr/share/lib/! example
}
# "foo

View File

@@ -1,17 +1,16 @@
# Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009 Alex Schroeder <alex@gnu.org>
# Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2012 Alex Schroeder <alex@gnu.org>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.
# This program is free software; you can redistribute it and/or modify it under
# the terms of the GNU General Public License as published by the Free Software
# Foundation; either version 3 of the License, or (at your option) any later
# version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# This program is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# You should have received a copy of the GNU General Public License along with
# this program. If not, see <http://www.gnu.org/licenses/>.
=head1 Namespaces Extension
@@ -81,6 +80,7 @@ sub NamespacesInitVariables {
if (!$Monolithic and $UsePathInfo) {
$Namespaces{$NamespacesMain} = $ScriptName . '/';
foreach my $name (glob("$DataDir/*")) {
utf8::decode($name);
if (-d $name
and $name =~ m|/($InterSitePattern)$|
and $name ne $NamespacesMain
@@ -249,7 +249,7 @@ sub NewNamespaceGetRcLines { # starttime, hash of seen pages to use as a second
return LatestChanges(@result);
}
=head RSS feed
=head2 RSS feed
When retrieving the RSS feed with the parameter full=1, one would
expect the various items to contain the fully rendered HTML.

View File

@@ -77,7 +77,7 @@ sub NewQuestionaskerDoPost {
or QuestionaskerException($id)) {
print GetHeader('', T('Edit Denied'), undef, undef, '403 FORBIDDEN');
print $q->p(T('You did not answer correctly.'));
print $q->start_form, QuestionaskerGetQuestion(1),
print GetFormStart(), QuestionaskerGetQuestion(1),
(map { $q->hidden($_, '') }
qw(title text oldtime summary recent_edit aftertext)), $q->end_form;
PrintFooter();

View File

@@ -230,7 +230,7 @@ sub NewReCaptchaDoPost {
print GetHeader('', T('Edit Denied'), undef, undef, '403 FORBIDDEN');
print $q->start_div({-class=>'error'});
print $q->p(T('You did not answer correctly.'));
print $q->start_form, ReCaptchaGetQuestion(1),
print GetFormStart(), ReCaptchaGetQuestion(1),
(map { $q->hidden($_, '') }
qw(title text oldtime summary recent_edit aftertext)), $q->end_form;
print $q->end_div();

View File

@@ -33,7 +33,7 @@ package OddMuse::Tokenize;
use vars qw($regexp);
$regexp = qr'[A-Za-z0-9_\x80-\xff]+';
$regexp = qr'[A-Za-z0-9_\x{0080}-\x{ffff}]+';
sub new {
my ($classname, @args) = @_;

View File

@@ -33,7 +33,7 @@ push(@MyRules, \&SeTextRule);
# If the length does not match, pos is reset and zero is returned so
# that the remaining rules will be tested instead.
my $word = '([-A-Za-z\x80-\xff]+)';
my $word = '([-A-Za-z\x{0080}-\x{ffff}]+)';
sub SeTextRule {
my $oldpos = pos;
if ($bol && ((m/\G((.+?)[ \t]*\n(-+|=+)[ \t]*\n)/gc

View File

@@ -5,7 +5,7 @@ use vars qw($FS $FreeLinkPattern $UrlProtocols $UrlChars $EndChars
$UrlPattern $q);
$FS = "\x1e";
$FreeLinkPattern = "([-,.()' _0-9A-Za-z\x80-\xff]+)";
$FreeLinkPattern = "([-,.()' _0-9A-Za-z\x{0080}-\x{ffff}]+)";
$UrlProtocols = 'http|https|ftp|afs|news|nntp|mid|cid|mailto|wais|prospero|telnet|gopher|irc';
$UrlChars = '[-a-zA-Z0-9/@=+$_~*.,;:?!\'"()&#%]'; # see RFC 2396
$EndChars = '[-a-zA-Z0-9/@=+$_~*]'; # no punctuation at the end of the url.

View File

@@ -158,7 +158,7 @@ sub StaticHtml {
<title>$SiteName: $id</title>
<link type="text/css" rel="stylesheet" href="static.css" />
<meta http-equiv="refresh" content="0; url=$target">
<meta http-equiv="content-type" content="text/html; charset=$HttpCharset">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Redirected to <a href="$target">$1</a>.</p>
@@ -173,7 +173,7 @@ EOT
<head>
<title>$SiteName: $id</title>
<link type="text/css" rel="stylesheet" href="static.css" />
<meta http-equiv="content-type" content="text/html; charset=$HttpCharset">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body>
EOT

View File

@@ -173,7 +173,7 @@ sub StaticHtml {
# Process the page
local $Message = "";
# encoding is left off, so fix it:
print qq!<?xml version="1.0" encoding="$HttpCharset" ?>!;
print qq!<?xml version="1.0" encoding="UTF-8" ?>!;
print GetHeader($id, QuoteHtml($id), undef, "");
print $q->start_div({-class=> 'content browse'});
print PageHtml($id);

View File

@@ -33,7 +33,7 @@ sub TablesLongRule {
# a new row is started when a cell is repeated
# if cells are missing, column spans are created (the first row
# could use row spans...)
if ($bol && m|\G\s*\n*\&lt;table(/[A-Za-z\x80-\xff/]+)? +([A-Za-z\x80-\xff,;\/ ]+)\&gt; *\n|cg) {
if ($bol && m|\G\s*\n*\&lt;table(/[A-Za-z\x{0080}-\x{ffff}/]+)? +([A-Za-z\x{0080}-\x{ffff},;\/ ]+)\&gt; *\n|cg) {
my $class = join(' ', split(m|/|, $1)); # leading / in $1 will make sure we have leading space
Clean(CloseHtmlEnvironments() . "<table class=\"user long$class\">");
# labels and their default class
@@ -60,7 +60,7 @@ sub TablesLongRule {
my $rowspan = '';
my $first = 1;
for my $line (@lines) {
if ($line =~ m|^($regexp)/?([0-9]+)?/?([A-Za-z\x80-\xff/]+)?[:=] *(.*)|) { # regexp changes for other tables
if ($line =~ m|^($regexp)/?([0-9]+)?/?([A-Za-z\x{0080}-\x{ffff}/]+)?[:=] *(.*)|) { # regexp changes for other tables
$label = $1;
$rowspan = $2;
$class = join(' ', split(m|/|, $3)); # no leading / therefore no leading space

View File

@@ -19,6 +19,7 @@
$ModulesDescription .= '<p><a href="http://git.savannah.gnu.org/cgit/oddmuse.git/tree/modules/tex.pl">tex.pl</a></p>';
use vars qw($TeXInit);
use utf8;
my %h = qw( !` ¡ {\pounds} £ \pounds £ {\S} § \S § \"{} ¨ {\copyright} ©
\copyright © $^a$ ª \={} ¯ $\pm$ ± \pm ± $^2$ ² $^3$ ³ \'{} ´ {\P} ¶

256
modules/thumbs.pl Normal file
View File

@@ -0,0 +1,256 @@
# Copyright (C) 2004, 2012 Alex Schroeder <alex@gnu.org>
# Copyright (C) 2005 Rob Neild
#
# This program is free software: you can redistribute it and/or modify it under
# the terms of the GNU General Public License as published by the Free Software
# Foundation, either version 3 of the License, or (at your option) any later
# version.
#
# This program is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along with
# this program. If not, see <http://www.gnu.org/licenses/>.
# Thumbnail (and improved image handling) module for OddMuse wiki
# Conflicts with the "Image extension module"
require MIME::Base64;
use File::Path;
$ModulesDescription .= '<p><a href="http://git.savannah.gnu.org/cgit/oddmuse.git/tree/modules/thumbs.pl">thumbs.pl</a>, see <a href="http://www.oddmuse.org/cgi-bin/oddmuse/Image_Thumbnails">Image Thumbnails</a></p>';
# Tempoary directory to create thumbnails in
$ThumbnailTempDir = '/tmp';
# Path and name of external program to use to create thumbnails. Only
# ImageMagick 'convert' can be used. You may have to set the MAGICK_HOME
# environment variable in your config file if you set it to
# /usr/local/bin/convert and get the following error:
# convert: no decode delegate for this image format
# For your config file:
# $ENV{MAGICK_HOME} = '/usr/local';
$ThumbnailConvert = '/usr/bin/convert';
# Max size for a thumbnail. If larger size is specified just shows
# regular image
$ThumbnailMaxSize = 500;
# Default thumbnail size if non is specified
$ThumbnailDefaultSize = 100;
# MIME types to create thumbnail for, all allowed if empty list
@ThumbnailTypes = @UploadTypes;
# As well as using ALT, use TITLE. This enables comments to popup when
# hovering mouse over thumbnail
$ThumbnailImageUseTitle = 0;
$ThumbnailCacheDir = "oddmuse_thumbnail_cache";
$ThumbnailCacheUrl = "/oddmuse_thumbnail_cache";
# Define new formatting rule "thumb" that inserts an auto generated thumbnail
# Syntax is [[thumb:page name | etc. ]]
push(@MyRules, \&ThumbNailSupportRule);
sub ThumbNailSupportRule {
my $result;
my $RawMatch;
if (m!\G(\[\[thumb:$FreeLinkPattern(\|.*?)?\]\])!gc)
{
$RawMatch = $1;
# Try and extract out all the options. They can be in any order, apart from comment at end
my $name = $2;
my $size="$ThumbnailDefaultSize"; # default size for thumbnail
my $frame;
my $comment; # default alignment for a non framed picture
my $alignment_framed = 'tright'; # default alignment for a framed picture
my $alignment;
my $params = $3 . '|';
if($params =~ s/\|([0-9]+)px\|/\|/) { $size = $1; }
if($params =~ s/\|thumb\|/\|/) { $frame = 'yes' ;}
if($params =~ s/\|frame\|/\|/) { $frame = 'yes'; }
if ($params =~ s/\|none\|/\|/) { $alignment_framed= 'tnone'; }
if ($params =~ s/\|right\|/\|/) { $alignment_framed= 'tright'; $alignment='floatright';}
if ($params =~ s/\|left\|/\|/) { $alignment_framed= 'tleft'; $alignment='floatleft'; }
if ($params =~ m/\|(.+)\|$/) { $comment = $1; }
my $id = FreeToNormal($name);
AllPagesList();
# if the page does exists
if ($IndexHash{$id})
{
if (! -e "$ThumbnailCacheDir/$id/$size")
{
GenerateThumbNail ($id, $size);
}
my %img_attribs;
my $action = "$ThumbnailCacheUrl/" . UrlEncode($id) . "/$size";
$img_attribs{'-src'} = $action;
if (defined $comment) {
$img_attribs{'-alt'} ="$comment";
$img_attribs{'-title'} = "$comment" if $ThumbnailImageUseTitle==1;
}
else { $img_attribs{'-alt'} = "$name"; }
$img_attribs{'-class'} = 'upload';
$result = $q->img(\%img_attribs);
$result = ScriptLink(UrlEncode($id) , $result, 'image');
if (defined $frame) {
if (defined $comment) { $result = $result . $q->div({-class=>'thumbcaption'}, "$comment"); }
if ($size>0) {
$result = $q->div({-style=>"width:" . ($size+2) . "px"}, $result);
$result = $q->div({-class=>"thumb " . $alignment_framed}, $result);
}
}
else
{
if (defined $alignment) { $result = $q->div({-class=>"$alignment" }, $result); }
}
}
else
{
# if the image does not exist
$result = '[' . T('thumb') . ':' . $name . GetEditLink($id, '?', 1) . ']';
}
}
if (defined $result)
{
Dirty($RawMatch);
print $result;
$result = '';
}
return $result;
}
# define new action "thumbnail" that actually does the on fly generation of the image
# thumbnails are put into the file so they only need be generated once
# we also store the size of thumbnail so that can be used in the markup
# if we get passed a size of zero then all we need to do is check whether we have the image size stored in thumbnail_0
# this enbles markup for non-thumbnail images better
sub GenerateThumbNail {
my ($id, $size) = (@_);
ValidIdOrDie($id);
AllPagesList();
if (not $IndexHash{$id}) { ReportError(Ts('Error creating thumbnail from non existant page %s.' , $id), '500 INTERNAL SERVER ERROR'); } # Page Doesn't exist,
my $openpage = $OpenPageName; # remember the current page we are on
RequestLockOrError();
OpenPage($id);
# Parse out some data
# Check MIME type supported
# Check is a file
my ($text, $revision) = GetTextRevision(GetParam('revision', '')); # maybe revision reset!
my ($type) = TextIsFile($text); # MIME type if an uploaded file
my $data = substr($text, index($text, "\n") + 1);
if ($type)
{
my $regexp = quotemeta($type);
if (@ThumbnailTypes and not grep(/^$regexp$/, @ThumbnailTypes)) {
ReportError(Ts('Can not create thumbnail for file type %s.' , $type), '415 UNSUPPORTED MEDIA TYPE');
}
}
else
{
ReportError(T('Can not create thumbnail for a text document'), '500 INTERNAL SERVER ERROR');
}
my $filename = $ThumbnailTempDir . "/odd" . $id . "_" . $size;
# Decode the original image to a temp file
open(FD, "> $filename") or ReportError(Ts("Could not open %s for writing whilst trying to save image before creating thumbnail. Check write permissions.",$filename), '500 INTERNAL SERVER ERROR');
binmode(FD);
print FD MIME::Base64::decode($data);
close(FD);
eval { mkpath("$ThumbnailCacheDir/$id") };
if ($@) {
ReportError(Ts('Can not create path for thumbnail - %s', $@), '500 INTERNAL SERVER ERROR');
}
# create the thumbnail
my $command = "$ThumbnailConvert '$filename' -verbose -resize ${size}x '$ThumbnailCacheDir/$id/$size' 2>&1";
open (MESSAGE, '-|', $command)
or ReportError(Tss("Failed to run %1 to create thumbnail: %2", $ThumbnailConvert, $!),
'500 INTERNAL SERVER ERROR');
my $convert = <MESSAGE>;
close(MESSAGE);
my $scaled_size_x;
my $scaled_size_y;
my $thumbnail_data= '';
if($?) {
ReportError(Ts("%s ran into an error", $ThumbnailConvert), '500 INTERNAL SERVER ERROR', undef,
$q->pre($command . "\n" . $convert));
} elsif($convert =~ m/=>(\d+)x(\d+)/) {
$scaled_size_x = $1;
$scaled_size_y = $2;
} elsif (!$convert) {
ReportError(Ts("%s produced no output", $ThumbnailConvert), '500 INTERNAL SERVER ERROR');
} else {
ReportError(Ts("Failed to parse %s.", $convert), '500 INTERNAL SERVER ERROR');
}
unlink($filename);
# save tag to page
#$Page{'thumbnail_' . $size} = '#FILE ' . $type . ' created=' . $Now . ' revision=' . $Page{'revision'} . ' size=' . $scaled_size_x . 'x' . $scaled_size_y . "\n" . $thumbnail_data;
#SavePage();
ReleaseLock();
OpenPage($openpage); # restore original open page
}

View File

@@ -140,7 +140,7 @@ that table. This is optional. If not specified, it defaults to "toc".
sub TocRule {
# <toc...> markup. This explicitly displays a table of contents at this point.
if ($bol and
m~\G&lt;toc(/([A-Za-z\x80-\xff/]+))? # $1
m~\G&lt;toc(/([A-Za-z\x{0080}-\x{ffff}/]+))? # $1
(\s+(?:header_text\s*=\s*)?"(.+?)")? # $3
(\s+(?:class\s*=\s*)?"(.+?)")? # $5
&gt;[ \t]*(\n|$)~cgx) { # $7
@@ -231,6 +231,7 @@ sub NewTocApplyRules {
open( STDOUT, '>', \$html) or die "Can't open memory file: $!";
($blocks, $flags) = OldTocApplyRules(@_);
close STDOUT;
utf8::decode($html);
}
# If there are at least two HTML headers on this page, insert a table of
# contents.

View File

@@ -1 +0,0 @@
*.wiki

View File

@@ -18,6 +18,7 @@
#
# This translation was last checked for Oddmuse version 1.753.
#
use utf8;
$ModulesDescription .= '<p>$Id: brazilian-portuguese-utf8.pl,v 1.14 2009/06/07 19:30:37 as Exp $</p>';
%Translate = split(/\n/,<<END_OF_TRANSLATION);
Include normal pages

View File

@@ -18,6 +18,7 @@
#
# This translation was updated for Oddmuse 1.354.
#
use utf8;
$ModulesDescription .= '<p>$Id: bulgarian-utf8.pl,v 1.11 2009/06/07 19:30:37 as Exp $</p>';
%Translate = split(/\n/,<<END_OF_TRANSLATION);
Include normal pages

View File

@@ -18,6 +18,7 @@
#
# This translation was last checked for Oddmuse version 1.504.
#
use utf8;
$ModulesDescription .= '<p>$Id: chinese-utf8.pl,v 1.12 2009/06/07 19:30:37 as Exp $</p>';
%Translate = split(/\n/,<<END_OF_TRANSLATION);
Include normal pages

View File

@@ -24,6 +24,7 @@
#by wctang <wctang@csie.nctu.edu.tw> and using the tool cnmap
#(http://search.cpan.org/~qjzhou/Encode-CNMap-0.32/bin/cnmap) by Qing-Jie Zhou <qjzhou@hotmail.com>.
#
use utf8;
$ModulesDescription .= '<p>$Id: chinese_cn-utf8.pl,v 1.11 2009/06/07 19:30:37 as Exp $</p>';
##############################################################
# for those who want to use Chinese even for special pages

View File

@@ -24,6 +24,7 @@
#
# This translation was last checked for Oddmuse version 1.215.
#
use utf8;
$ModulesDescription .= '<p>$Id: dutch-utf8.pl,v 1.12 2011/05/17 13:24:13 as Exp $</p>';
%Translate = split(/\n/,<<END_OF_TRANSLATION);
Include normal pages

View File

@@ -16,6 +16,7 @@
# Create a modules subdirectory in your data directory, and put the
# file in there. It will be loaded automatically.
#
use utf8;
$ModulesDescription .= '<p>$Id: finnish-utf8.pl,v 1.11 2009/06/07 19:30:37 as Exp $</p>';
%Translate = split(/\n/,<<END_OF_TRANSLATION);
Include normal pages

View File

@@ -19,6 +19,7 @@
#
# This translation was last checked for Oddmuse version 1.296.
#
use utf8;
$ModulesDescription .= '<p>$Id: french-utf8.pl,v 1.20 2011/02/05 12:29:07 as Exp $</p>';
%Translate = split(/\n/,<<END_OF_TRANSLATION);
Include normal pages

View File

@@ -15,6 +15,8 @@
# Create a modules subdirectory in your data directory, and put the file in
# there. It will be loaded automatically.
#
use utf8;
use utf8;
$ModulesDescription .= '<p>$Id: german-utf8.pl,v 1.30 2011/02/05 12:40:38 as Exp $</p>';
%Translate = split(/\n/,<<END_OF_TRANSLATION);
Include normal pages

View File

@@ -15,6 +15,7 @@
# Create a modules subdirectory in your data directory, and put the
# file in there. It will be loaded automatically.
#
use utf8;
$ModulesDescription .= '<p>$Id: greek-utf8.pl,v 1.2 2009/06/07 19:30:37 as Exp $</p>';
%Translate = split(/\n/,<<END_OF_TRANSLATION);
Include normal pages

View File

@@ -17,6 +17,7 @@
#
# This translation was last checked for Oddmuse version 1.195.
#
use utf8;
$ModulesDescription .= '<p>$Id: hebrew-utf8.pl,v 1.11 2009/06/07 19:30:37 as Exp $</p>';
%Translate = split(/\n/,<<END_OF_TRANSLATION);
Include normal pages

View File

@@ -15,6 +15,7 @@
# Create a modules subdirectory in your data directory, and put the
# file in there. It will be loaded automatically.
#
use utf8;
$ModulesDescription .= '<p>$Id: italian-utf8.pl,v 1.11 2009/06/07 19:30:37 as Exp $</p>';
%Translate = split(/\n/,<<END_OF_TRANSLATION);
Include normal pages

View File

@@ -17,6 +17,7 @@
#
# This translation was last checked for Oddmuse version 1.215.
#
use utf8;
$ModulesDescription .= '<p>$Id: japanese-utf8.pl,v 1.11 2009/06/07 19:30:37 as Exp $</p>';
%Translate = split(/\n/,<<END_OF_TRANSLATION);
Include normal pages

View File

@@ -15,6 +15,7 @@
# Create a modules subdirectory in your data directory, and put the
# file in there. It will be loaded automatically.
#
use utf8;
$ModulesDescription .= '<p>$Id: korean-utf8.pl,v 1.5 2009/06/07 19:30:37 as Exp $</p>';
%Translate = split(/\n/,<<END_OF_TRANSLATION);
Include normal pages

View File

@@ -15,6 +15,7 @@
# Create a modules subdirectory in your data directory, and put the
# file in there. It will be loaded automatically.
#
use utf8;
$ModulesDescription .= '<p>$Id: new-utf8.pl,v 1.12 2009/06/07 19:30:37 as Exp $</p>';
%Translate = split(/\n/,<<END_OF_TRANSLATION);
Include normal pages

View File

@@ -17,6 +17,7 @@
# Create a modules subdirectory in your data directory, and put the
# file in there. It will be loaded automatically.
#
use utf8;
$ModulesDescription .= '<p>$Id: polish-utf8.pl,v 1.7 2009/06/07 19:30:37 as Exp $</p>';
%Translate = split(/\n/,<<END_OF_TRANSLATION);
Include normal pages

View File

@@ -20,6 +20,7 @@
#
# This translation was last checked for Oddmuse version 1.195.
#
use utf8;
$ModulesDescription .= '<p>$Id: portuguese-utf8.pl,v 1.15 2009/06/07 19:30:37 as Exp $</p>';
%Translate = split(/\n/,<<END_OF_TRANSLATION);
Include normal pages

View File

@@ -15,6 +15,7 @@
# Create a modules subdirectory in your data directory, and put the
# file in there. It will be loaded automatically.
#
use utf8;
$ModulesDescription .= '<p>$Id: romanian-utf8.pl,v 1.9 2009/06/07 19:30:37 as Exp $</p>';
%Translate = split(/\n/,<<END_OF_TRANSLATION);
Include normal pages

View File

@@ -20,6 +20,7 @@
#
# This script was last checked for Oddmuse version 1.658.
#
use utf8;
$ModulesDescription .= '<p>$Id: russian-utf8.pl,v 1.13 2007/08/19 11:42:08 as Exp $</p>';
%Translate = split(/\n/,<<END_OF_TRANSLATION);
Reading not allowed: user, ip, or network is blocked.

View File

@@ -17,6 +17,7 @@
#
# This translation was last checked for Oddmuse version 1.195.
#
use utf8;
$ModulesDescription .= '<p>$Id: serbian-utf8.pl,v 1.11 2009/06/07 19:30:37 as Exp $</p>';
%Translate = split(/\n/,<<END_OF_TRANSLATION);
Include normal pages

View File

@@ -18,6 +18,7 @@
#
# This translation was last checked for Oddmuse version 1.195.
#
use utf8;
$ModulesDescription .= '<p>$Id: spanish-utf8.pl,v 1.13 2011/07/05 00:30:18 as Exp $</p>';
%Translate = split(/\n/,<<END_OF_TRANSLATION);
Include normal pages

View File

@@ -20,6 +20,7 @@
# Create a modules subdirectory in your data directory, and put the
# file in there. It will be loaded automatically.
#
use utf8;
$ModulesDescription .= '<p>$Id: swedish-utf8.pl,v 1.18 2009/06/07 19:30:38 as Exp $</p>';
%Translate = split(/\n/,<<END_OF_TRANSLATION);
Include normal pages

View File

@@ -15,6 +15,7 @@
# Create a modules subdirectory in your data directory, and put the
# file in there. It will be loaded automatically.
#
use utf8;
$ModulesDescription .= '<p>$Id: ukrainian-utf8.pl,v 1.6 2009/06/07 19:30:38 as Exp $</p>';
%Translate = split(/\n/,<<END_OF_TRANSLATION);
Include normal pages

View File

@@ -109,7 +109,7 @@ sub get {
} else {
print $q->header( -cache_control => 'max-age=10',
-etag => $OddMuse::Page{ts},
-type => "text/plain; charset=$OddMuse::HttpCharset",
-type => "text/plain; charset=UTF-8",
-status => "200 OK",);
print $OddMuse::Page{text} unless $head;
}
@@ -373,11 +373,6 @@ sub propfind {
$resp->addChild($propstat);
}
}
# The XML Parser handles UTF-8 correctly, but Perl will
# automatically convert it to Latin-1 upon printing to STDOUT unless
# we use binmode.
eval { local $SIG{__DIE__}; binmode(STDOUT, ":utf8"); }
if $OddMuse::HttpCharset eq 'UTF-8';
warn "RESPONSE: 207\n" . $doc->toString(1) . "\n" if $verbose;
print $doc->toString(1);
}

View File

@@ -15,6 +15,7 @@
require 't/test.pl';
package OddMuse;
use Test::More tests => 39;
use utf8; # tests contain UTF-8 characters and it matters
clear_pages();

View File

@@ -1,24 +1,21 @@
# Copyright (C) 2006 Alex Schroeder <alex@emacswiki.org>
# Copyright (C) 2006, 2012 Alex Schroeder <alex@gnu.org>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
# This program is free software; you can redistribute it and/or modify it under
# the terms of the GNU General Public License as published by the Free Software
# Foundation; either version 3 of the License, or (at your option) any later
# version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# This program is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the
# Free Software Foundation, Inc.
# 59 Temple Place, Suite 330
# Boston, MA 02111-1307 USA
# You should have received a copy of the GNU General Public License along with
# this program. If not, see <http://www.gnu.org/licenses/>.
require 't/test.pl';
package OddMuse;
use Test::More tests => 23;
use Test::More tests => 26;
use utf8;
clear_pages();
@@ -77,3 +74,10 @@ utime($Now-1300000, $Now-1300000, "$DraftDir/Alex");
# Second maintenance requires admin password and deletes one draft
test_page(get_page('action=maintain pwd=foo'), 'Alex was last modified [^<>]* ago and was deleted');
ok(! -f "$DraftDir/.berta", "$DraftDir/Alex is gone");
# Testing UTF-8
# Saving draft uses 204 No Content status
test_page(get_page('title=HomePage text=foo username=Schröder Draft=1'),
'Status: 204', 'username%251eSchr%C3%B6der');
test_page(get_page('action=maintain pwd=foo'),
'Schröder was last modified [^<>]* and was kept');

View File

@@ -14,7 +14,8 @@
require 't/test.pl';
package OddMuse;
use Test::More tests => 9;
use Test::More tests => 11;
use utf8; # tests contain UTF-8 characters and it matters
clear_pages();
@@ -35,12 +36,16 @@ test_page(update_page('Comments_on_2011-07-06', 'Yo'),
'Yo');
xpath_test(get_page('Hi'),
'//div[@class="journal"]/div[@class="page"]/p[@class="comment"]/a[@href="javascript:togglecomments(\'Comments_on_2011-07-06\')"][text()="Comments on 2011-07-06"]');
'//div[@class="journal"]/div[@class="page"]/p[@class="comment"]/a[@href="javascript:togglecomments(\'id0\')"][text()="Comments on 2011-07-06"]');
# encoding basics
$page = update_page('2011-07-06_(…)_Dü', 'Hallo Dü');
test_page($page, 'Hallo Dü');
xpath_test($page, '//p[contains(text(), "Dü")]');
update_page('2011-07-06_(…)_Dü', 'Hallo');
update_page('Comments_on_2011-07-06_(…)_Dü', 'Yo');
xpath_test(update_page('Hi', '<journal>'),
'//h1/a[text()="2011-07-06 (…) Dü"]',
'//div[@class="journal"]/div[@class="page"]/p[@class="comment"]/a[text()="Comments on 2011-07-06 (…) Dü"]',
'//div[@class="journal"]/div[@class="page"]/p[@class="comment"]/a[@href="javascript:togglecomments(\'Comments_on_2011-07-06__Dü\')"]');
'//div[@class="journal"]/div[@class="page"]/p[@class="comment"]/a[@href="javascript:togglecomments(\'id0\')"]');

94
t/encoding.t Normal file
View File

@@ -0,0 +1,94 @@
# Copyright (C) 2012 Alex Schroeder <alex@gnu.org>
#
# This program is free software; you can redistribute it and/or modify it under
# the terms of the GNU General Public License as published by the Free Software
# Foundation; either version 3 of the License, or (at your option) any later
# version.
#
# This program is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along with
# this program. If not, see <http://www.gnu.org/licenses/>.
require 't/test.pl';
package OddMuse;
use Test::More tests => 35;
use utf8; # tests contain UTF-8 characters and it matters
clear_pages();
# ASCII basics
$page = update_page('Aal', 'aal');
test_page($page, '<h1><a .*>Aal</a></h1>', '<p>aal</p>');
xpath_test($page, '//h1/a[text()="Aal"]', '//p[text()="aal"]');
$page = get_page('Aal');
test_page($page, '<h1><a .*>Aal</a></h1>', '<p>aal</p>');
xpath_test($page, '//h1/a[text()="Aal"]', '//p[text()="aal"]');
# non-ASCII
$page = update_page('Öl', 'öl');
test_page($page, '<h1><a .*>Öl</a></h1>', '<p>öl</p>');
xpath_test($page, '//h1/a[text()="Öl"]', '//p[text()="öl"]');
$page = get_page('Öl');
test_page($page, '<h1><a .*>Öl</a></h1>', '<p>öl</p>');
xpath_test($page, '//h1/a[text()="Öl"]', '//p[text()="öl"]');
$page = get_page('action=index raw=1');
test_page($page, 'Aal', 'Öl');
test_page(get_page('Aal'), 'aal');
test_page(get_page('Öl'), 'öl');
# rc
test_page(get_page('action=rc raw=1'),
'title: Öl', 'description: öl');
# diff
update_page('Öl', 'Ähren');
xpath_test(get_page('action=browse id=Öl diff=1'),
'//div[@class="old"]/p/strong[@class="changes"][text()="öl"]',
'//div[@class="new"]/p/strong[@class="changes"][text()="Ähren"]');
# search
# testing with non-ASCII is important on a Mac
# ASCII
$page = get_page('search=aal raw=1');
test_page($page, 'title: Search for: aal', 'title: Aal');
# matching page name does not involve grep working
$page = get_page('search=öl raw=1');
test_page($page, 'title: Search for: öl', 'title: Öl');
# this fails with grep on a Mac, thus testing if mac.pl
# managed to switch of the use of grep
test_page(get_page('search=ähren raw=1'),
'title: Search for: ähren', 'title: Öl');
# the username keeps getting reported as changed
test_page(get_page('action=browse id=Möglich username=Schr%C3%B6der'),
'Set-Cookie: Wiki=username%251eSchr%C3%B6der',
'username=Schröder');
# verify that non-ASCII parameters work as intended
AppendStringToFile($ConfigFile, "use utf8;\n\$CookieParameters{ärger} = 1;\n");
test_page(get_page('action=browse id=Test %C3%A4rger=hallo'),
'Set-Cookie: Wiki=%C3%A4rger%251ehallo');
# this causes wide character in print somehow? otherwise harmless
test_page(update_page("Russian", "Русский Hello"),
"Русский");
# with toc.pl, however, a problem: Русский is corrupted
add_module('toc.pl');
test_page(update_page("Russian", "Русский Hello again"),
"Русский");

View File

@@ -15,6 +15,7 @@
require 't/test.pl';
package OddMuse;
use Test::More tests => 33;
use utf8; # tests contain UTF-8 characters and it matters
clear_pages();

View File

@@ -16,6 +16,7 @@
require 't/test.pl';
package OddMuse;
use Test::More tests => 61;
use utf8; # tests contain UTF-8 characters and it matters
SKIP: {
eval { require Search::FreeText };

View File

@@ -19,6 +19,7 @@
require 't/test.pl';
package OddMuse;
use Test::More tests => 5;
use utf8; # tests contain UTF-8 characters and it matters
%Languages = ('de' => '\b(der|die|das|und|oder)\b',
'fr' => '\b(et|le|la|pas)\b', );

View File

@@ -122,12 +122,10 @@ i"m tired
i"m tired
He said, "[w]hen I voice complaints..."
He said, &#x201c;[w]hen I voice complaints&#x2026;&#x201d;
[foo]'s problem
[foo]&#x2019;s problem
EOT
xpath_run_tests("[http://foo.org/ foo]'s problem",
'//a[@class="url http outside"][@href="http://foo.org/"][text()="foo"]'
. '/following-sibling::text()[string()="s problem"]');
$MarkupQuotes = 0;
run_tests(q{"Get lost!", they say, and I answer: "I'm not 'lost'!"},
q{"Get lost!", they say, and I answer: "I'm not 'lost'!"});

View File

@@ -1,21 +1,22 @@
# Copyright (C) 2006, 2007, 2008, 2009 Alex Schroeder <alex@gnu.org>
# Copyright (C) 2006, 2007, 2008, 2009, 2012 Alex Schroeder <alex@gnu.org>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.
# This program is free software; you can redistribute it and/or modify it under
# the terms of the GNU General Public License as published by the Free Software
# Foundation; either version 3 of the License, or (at your option) any later
# version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# This program is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# You should have received a copy of the GNU General Public License along with
# this program. If not, see <http://www.gnu.org/licenses/>.
require 't/test.pl';
package OddMuse;
use Test::More tests => 71;
use utf8; # tests contain UTF-8 characters and it matters
clear_pages();
add_module('namespaces.pl');
@@ -118,10 +119,7 @@ test_page(update_page('Umlaute', 'namespace mit umlaut',
'ns=Zürich'), 'namespace mit umlaut');
xpath_test(get_page('action=rc'),
# the exact result depends on filesystem encoding!
'//a[@class="local"][@href="http://localhost/wiki.pl/Zu%cc%88rich/Umlaute"'
. ' or @href="http://localhost/wiki.pl/Zu%fcrich/Umlaute"'
. ' or @href="http://localhost/wiki.pl/Z%c3%bcrich/Umlaute"]');
'//a[@class="local"][@href="http://localhost/wiki.pl/Z%c3%bcrich/Umlaute"]');
# Test rollbacks
test_page(get_page('action=browse ns=Muu id=Test'),
'Another Mooo!');

View File

@@ -16,6 +16,8 @@
require 't/test.pl';
package OddMuse;
use Test::More tests => 25;
use utf8; # tests contain UTF-8 characters and it matters
clear_pages();
add_module('permanent-anchors.pl');
@@ -57,7 +59,7 @@ $page = update_page('Keith_Jarret', 'plays unlike [::Thelonius Mönk]');
like($page, qr(the page (.*?) also exists), 'the page ... also exists');
$page =~ qr(the page (.*?) also exists);
$link = $1;
xpath_test($link, Encode::encode_utf8('//a[@class="local"][@href="http://localhost/wiki.pl?action=browse;anchor=0;id=Thelonius_M%c3%b6nk"][text()="Thelonius Mönk"]'));
xpath_test($link, '//a[@class="local"][@href="http://localhost/wiki.pl?action=browse;anchor=0;id=Thelonius_M%c3%b6nk"][text()="Thelonius Mönk"]');
# verify that the redirection works
test_page(get_page('action=browse id=Thelonius_Mönk'),
'Status: 302',

View File

@@ -16,6 +16,7 @@
require 't/test.pl';
package OddMuse;
use Test::More tests => 9;
use utf8; # tests contain UTF-8 characters and it matters
clear_pages();

View File

@@ -19,6 +19,7 @@ require 't/test.pl';
package OddMuse;
use Test::More tests => 10; # update two numbers below!
use utf8; # tests contain UTF-8 characters and it matters
SKIP: {

View File

@@ -16,6 +16,7 @@
require 't/test.pl';
package OddMuse;
use Test::More tests => 63;
use utf8; # tests contain UTF-8 characters and it matters
clear_pages();
WriteStringToFile($RcFile, "1FirstPage1\n");

View File

@@ -16,6 +16,7 @@
require 't/test.pl';
package OddMuse;
use Test::More tests => 100;
use utf8; # tests contain UTF-8 characters and it matters
clear_pages();
AppendStringToFile($ConfigFile, "\$CommentsPrefix = 'Comments on ';\n");

View File

@@ -16,6 +16,7 @@
require 't/test.pl';
package OddMuse;
use Test::More tests => 38;
use utf8; # tests contain UTF-8 characters and it matters
clear_pages();

View File

@@ -1,22 +1,21 @@
# Copyright (C) 2004, 2005, 2006, 2008 Alex Schroeder <alex@gnu.org>
# Copyright (C) 2004, 2005, 2006, 2008, 2012 Alex Schroeder <alex@gnu.org>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.
# This program is free software; you can redistribute it and/or modify it under
# the terms of the GNU General Public License as published by the Free Software
# Foundation; either version 3 of the License, or (at your option) any later
# version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# This program is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# You should have received a copy of the GNU General Public License along with
# this program. If not, see <http://www.gnu.org/licenses/>.
package OddMuse;
use lib '.';
use XML::LibXML;
use Encode;
use utf8;
# Import the functions
@@ -25,8 +24,8 @@ $UseConfig = 0; # don't read module files
$DataDir = 'test-data';
$ENV{WikiDataDir} = $DataDir;
require 'wiki.pl';
$ENV{PATH} = '/usr/local/bin:' . $ENV{PATH}; # location of perl?
Init();
use vars qw($redirect);
undef $/;
@@ -35,6 +34,7 @@ $| = 1; # no output buffering
sub url_encode {
my $str = shift;
return '' unless $str;
utf8::encode($str); # turn to byte string
my @letters = split(//, $str);
my @safe = ('a' .. 'z', 'A' .. 'Z', '0' .. '9', '-', '_', '.'); # shell metachars are unsafe
foreach my $letter (@letters) {
@@ -46,6 +46,15 @@ sub url_encode {
return join('', @letters);
}
# run perl in a subprocess and make sure it prints UTF-8 and not Latin-1
sub capture {
my $command = shift;
open (CL, '-|:encoding(utf-8)', $command) or die "Can't run $command: $!";
my $result = <CL>;
close CL;
return $result;
}
sub update_page {
my ($id, $text, $summary, $minor, $admin, @rest) = @_;
my $pwd = $admin ? 'foo' : 'wrong';
@@ -54,8 +63,8 @@ sub update_page {
$summary = url_encode($summary);
$minor = $minor ? 'on' : 'off';
my $rest = join(' ', @rest);
$redirect = `perl wiki.pl 'Save=1' 'title=$page' 'summary=$summary' 'recent_edit=$minor' 'text=$text' 'pwd=$pwd' $rest`;
$output = `perl wiki.pl action=browse id=$page $rest`;
$redirect = capture("perl wiki.pl 'Save=1' 'title=$page' 'summary=$summary' 'recent_edit=$minor' 'text=$text' 'pwd=$pwd' $rest");
$output = capture("perl wiki.pl action=browse id=$page $rest");
if ($redirect =~ /^Status: 302 /) {
# just in case a new page got created or NearMap or InterMap
$IndexHash{$id} = 1;
@@ -66,10 +75,7 @@ sub update_page {
}
sub get_page {
open(F, "perl wiki.pl @_ |");
my $output = <F>;
close F;
return $output;
return capture("perl wiki.pl @_");
}
sub name {
@@ -120,8 +126,8 @@ sub run_macro_tests {
# one string, many tests
sub test_page {
my $page = shift;
foreach my $str (@_) {
like($page, qr($str), name($str));
foreach my $test (@_) {
like($page, qr($test), name($test));
}
}
@@ -147,10 +153,19 @@ sub xpath_do {
skip("Cannot parse ".name($page).": $@", $#tests + 1) if $@;
foreach my $test (@tests) {
my $nodelist;
eval { $nodelist = $doc->findnodes($test) };
my $bytes = $test;
# utf8::encode: Converts in-place the character sequence to the
# corresponding octet sequence in *UTF-X*. The UTF8 flag is
# turned off, so that after this operation, the string is a byte
# string. (I have no idea why this is necessary, but there you
# go. See encoding.t tests and make sure the page file is
# encoded correctly.)
utf8::encode($bytes);
eval { $nodelist = $doc->findnodes($bytes) };
if ($@) {
fail(&$check(1) ? "$test: $@" : "not $test: $@");
} elsif (ok(&$check($nodelist->size()), name(&$check(1) ? $test : "not $test"))) {
} elsif (ok(&$check($nodelist->size()),
name(&$check(1) ? $test : "not $test"))) {
$result .= $nodelist->string_value();
} else {
$page =~ s/^.*?<html/<html/s;
@@ -201,16 +216,18 @@ sub remove_rule {
}
sub add_module {
my $mod = shift;
my ($mod, $subdir) = @_;
$subdir .= '/' if $subdir and substr($subdir, -1) ne '/';
my $filename =
mkdir $ModuleDir unless -d $ModuleDir;
my $dir = `/bin/pwd`;
chop($dir);
if (-l "$ModuleDir/$mod") {
# do nothing
} elsif (eval{ symlink("$dir/modules/$mod", "$ModuleDir/$mod"); 1; }) {
} elsif (eval{ symlink("$dir/modules/$subdir$mod", "$ModuleDir/$mod"); 1; }) {
# do nothing
} else {
system("copy '$dir/modules/$mod' '$ModuleDir/$mod'");
system("copy '$dir/modules/$subdir$mod' '$ModuleDir/$mod'");
}
die "Cannot symlink $mod: $!" unless -e "$ModuleDir/$mod";
do "$ModuleDir/$mod";
@@ -231,7 +248,8 @@ sub clear_pages {
}
die "Cannot remove $DataDir!\n" if -e $DataDir;
mkdir $DataDir;
open(F,">$DataDir/config");
add_module('mac.pl') if $^O eq 'darwin'; # guessing HFS filesystem
open(F, '>:encoding(utf-8)', "$DataDir/config");
print F "\$AdminPass = 'foo';\n";
# this used to be the default in earlier CGI.pm versions
print F "\$ScriptName = 'http://localhost/wiki.pl';\n";

View File

@@ -19,6 +19,8 @@
require 't/test.pl';
package OddMuse;
use Test::More tests => 3;
use utf8; # tests contain UTF-8 characters and it matters
clear_pages();
add_module('tex.pl');

View File

@@ -16,11 +16,14 @@
require 't/test.pl';
package OddMuse;
use Test::More tests => 20;
use utf8; # tests contain UTF-8 characters and it matters
clear_pages();
add_module('translation-links.pl');
AppendStringToFile($ConfigFile, q{
use utf8;
%Languages = ('de' => '\b(der|die|das|und|oder)\b',
'en' => '\b(the|he|she|that|this)\b');
$Translate{de} = 'Deutsch';

28
t/translations.t Normal file
View File

@@ -0,0 +1,28 @@
# Copyright (C) 2012 Alex Schroeder <alex@gnu.org>
#
# This program is free software; you can redistribute it and/or modify it under
# the terms of the GNU General Public License as published by the Free Software
# Foundation; either version 3 of the License, or (at your option) any later
# version.
#
# This program is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along with
# this program. If not, see <http://www.gnu.org/licenses/>.
require 't/test.pl';
package OddMuse;
use Test::More tests => 6;
use utf8; # tests contain UTF-8 characters and it matters
clear_pages();
test_page(update_page('HomePage', 'tätärätää!'),
'Edit this page', 'Last edited', 'tätärätää!');
add_module('german-utf8.pl', 'translations');
test_page(get_page('HomePage'),
'Diese Seite bearbeiten', 'Zuletzt geändert', 'tätärätää!');

163
wiki.pl
View File

@@ -47,20 +47,19 @@ $EmbedWiki $BracketText $UseConfig $UseLookup $AdminPass $EditPass $NetworkFile
$BracketWiki $FreeLinks $WikiLinks $SummaryHours $FreeLinkPattern $RCName
$RunCGI $ShowEdits $LinkPattern $RssExclude $InterLinkPattern $MaxPost $UseGrep
$UrlPattern $UrlProtocols $ImageExtensions $InterSitePattern $FS $CookieName
$SiteBase $StyleSheet $NotFoundPg $FooterNote $NewText $EditNote $HttpCharset
$UserGotoBar $VisitorFile $RcFile %Smilies %SpecialDays $InterWikiMoniker
$SiteDescription $RssImageUrl $ReadMe $RssRights $BannedCanRead $SurgeProtection
$TopLinkBar $LanguageLimit $SurgeProtectionTime $SurgeProtectionViews
$DeletedPage %Languages $InterMap $ValidatorLink %LockOnCreation
$RssStyleSheet %CookieParameters @UserGotoBarPages $NewComment $HtmlHeaders
$StyleSheetPage $ConfigPage $ScriptName $CommentsPrefix @UploadTypes
$AllNetworkFiles $UsePathInfo $UploadAllowed $LastUpdate $PageCluster
%PlainTextPages $RssInterwikiTranslate $UseCache $Counter $ModuleDir
$FullUrlPattern $SummaryDefaultLength $FreeInterLinkPattern
%InvisibleCookieParameters %AdminPages $UseQuestionmark $JournalLimit
$LockExpiration $RssStrip %LockExpires @IndexOptions @Debugging $DocumentHeader
%HtmlEnvironmentContainers @MyAdminCode @MyFooters @MyInitVariables @MyMacros
@MyMaintenance @MyRules);
$SiteBase $StyleSheet $NotFoundPg $FooterNote $NewText $EditNote $UserGotoBar
$VisitorFile $RcFile %Smilies %SpecialDays $InterWikiMoniker $SiteDescription
$RssImageUrl $ReadMe $RssRights $BannedCanRead $SurgeProtection $TopLinkBar
$LanguageLimit $SurgeProtectionTime $SurgeProtectionViews $DeletedPage
%Languages $InterMap $ValidatorLink %LockOnCreation $RssStyleSheet
%CookieParameters @UserGotoBarPages $NewComment $HtmlHeaders $StyleSheetPage
$ConfigPage $ScriptName $CommentsPrefix @UploadTypes $AllNetworkFiles
$UsePathInfo $UploadAllowed $LastUpdate $PageCluster %PlainTextPages
$RssInterwikiTranslate $UseCache $Counter $ModuleDir $FullUrlPattern
$SummaryDefaultLength $FreeInterLinkPattern %InvisibleCookieParameters
%AdminPages $UseQuestionmark $JournalLimit $LockExpiration $RssStrip
%LockExpires @IndexOptions @Debugging $DocumentHeader %HtmlEnvironmentContainers
@MyAdminCode @MyFooters @MyInitVariables @MyMacros @MyMaintenance @MyRules);
# Internal variables:
use vars qw(%Page %InterSite %IndexHash %Translate %OldCookie $FootnoteNumber
@@ -95,7 +94,6 @@ $CookieName = 'Wiki'; # Name for this wiki (for multi-wiki sites)
$SiteBase = ''; # Full URL for <BASE> header
$MaxPost = 1024 * 210; # Maximum 210K posts (about 200K for pages)
$HttpCharset = 'UTF-8'; # You are on your own if you change this!
$StyleSheet = ''; # URL for CSS stylesheet (like '/wiki.css')
$StyleSheetPage = 'css'; # Page for CSS sheet
$LogoUrl = ''; # URL for site logo ('' for no logo)
@@ -207,7 +205,7 @@ sub DoWikiRequest {
sub ReportError { # fatal!
my ($errmsg, $status, $log, @html) = @_;
$q = new CGI unless $q; # make sure we can report errors before InitRequest
InitRequest(); # make sure we can report errors before InitRequest
print GetHttpHeader('text/html', 'nocache', $status), GetHtmlHeader(T('Error')),
$q->start_div({class=>"error"}), $q->h1(QuoteHtml($errmsg)), @html, $q->end_div,
$q->end_html, "\n\n"; # newlines for FCGI because of exit()
@@ -217,13 +215,14 @@ sub ReportError { # fatal!
}
sub Init {
binmode(STDOUT, ':utf8');
InitDirConfig();
$FS = "\x1e"; # The FS character is the RECORD SEPARATOR control char in ASCII
$Message = ''; # Warnings and non-fatal errors.
InitLinkPatterns(); # Link pattern can be changed in config files
InitModules(); # Modules come first so that users can change module variables in config
InitConfig(); # Config comes as early as possible; remember $q is not available here
InitRequest(); # get $q with $MaxPost and $HttpCharset; set these in the config file
InitRequest(); # get $q with $MaxPost; set these in the config file
InitCookie(); # After InitRequest, because $q is used
InitVariables(); # After config, to change variables, after InitCookie for GetParam
}
@@ -231,8 +230,6 @@ sub Init {
sub InitModules {
if ($UseConfig and $ModuleDir and -d $ModuleDir) {
foreach my $lib (glob("$ModuleDir/*.pm $ModuleDir/*.pl")) {
next unless ($lib =~ /^($ModuleDir\/[-\w.]+\.p[lm])$/o);
$lib = $1; # untaint
do $lib unless $MyInc{$lib};
$MyInc{$lib} = 1; # Cannot use %INC in mod_perl settings
$Message .= CGI::p("$lib: $@") if $@; # no $q exists, yet
@@ -245,7 +242,7 @@ sub InitConfig {
do $ConfigFile; # these options must be set in a wrapper script or via the environment
$Message .= CGI::p("$ConfigFile: $@") if $@; # remember, no $q exists, yet
}
if ($ConfigPage) { # $FS, $HttpCharset, $MaxPost must be set in config file!
if ($ConfigPage) { # $FS and $MaxPost must be set in config file!
my ($status, $data) = ReadFile(GetPageFile(FreeToNormal($ConfigPage)));
my %data = ParseData($data); # before InitVariables so GetPageContent won't work
eval $data{text} if $data{text};
@@ -271,11 +268,10 @@ sub InitDirConfig {
$ModuleDir = "$DataDir/modules" unless $ModuleDir;
}
sub InitRequest {
sub InitRequest { # set up $q
$CGI::POST_MAX = $MaxPost;
$q = new CGI unless $q;
$q->charset($HttpCharset) if $HttpCharset;
eval { local $SIG{__DIE__}; binmode(STDOUT, ":raw"); }; # we treat input and output as bytes
$q->charset('UTF-8');
}
sub InitVariables { # Init global session variables for mod_perl!
@@ -329,12 +325,10 @@ sub ReInit { # init everything we need if we want to link to stuff
sub InitCookie {
undef $q->{'.cookies'}; # Clear cache if it exists (for SpeedyCGI)
if ($q->cookie($CookieName)) {
%OldCookie = split(/$FS/o, UrlDecode($q->cookie($CookieName)));
} else {
%OldCookie = ();
}
my %provided = map { $_ => 1 } $q->param;
my $cookie = $q->cookie($CookieName);
utf8::decode($cookie); # make sure it's decoded as UTF-8
%OldCookie = split(/$FS/o, UrlDecode($cookie));
my %provided = map { utf8::decode($_); $_ => 1 } $q->param;
for my $key (keys %OldCookie) {
SetParam($key, $OldCookie{$key}) unless $provided{$key};
}
@@ -370,8 +364,10 @@ sub CookieRollbackFix {
sub GetParam {
my ($name, $default) = @_;
utf8::encode($name); # may fail
my $result = $q->param($name);
$result = $default unless defined($result);
utf8::decode($result); # may fail
return QuoteHtml($result); # you need to unquote anything that can have <tags>
}
@@ -383,13 +379,13 @@ sub SetParam {
sub InitLinkPatterns {
my ($WikiWord, $QDelim);
$QDelim = '(?:"")?'; # Optional quote delimiter (removed from the output)
$WikiWord = '[A-Z]+[a-z\x80-\xff]+[A-Z][A-Za-z\x80-\xff]*';
$WikiWord = '[A-Z]+[a-z\x{0080}-\x{ffff}]+[A-Z][A-Za-z\x{0080}-\x{ffff}]*';
$LinkPattern = "($WikiWord)$QDelim";
$FreeLinkPattern = "([-,.()'%&?;<> _1-9A-Za-z\x80-\xff]|[-,.()'%&?;<> _0-9A-Za-z\x80-\xff][-,.()'%&?;<> _0-9A-Za-z\x80-\xff]+)"; # disallow "0" and must match HTML and plain text (ie. > and &gt;)
$FreeLinkPattern = "([-,.()'%&?;<> _1-9A-Za-z\x{0080}-\x{ffff}]|[-,.()'%&?;<> _0-9A-Za-z\x{0080}-\x{ffff}][-,.()'%&?;<> _0-9A-Za-z\x{0080}-\x{ffff}]+)"; # disallow "0" and must match HTML and plain text (ie. > and &gt;)
# Intersites must start with uppercase letter to avoid confusion with URLs.
$InterSitePattern = '[A-Z\x80-\xff]+[A-Za-z\x80-\xff]+';
$InterLinkPattern = "($InterSitePattern:[-a-zA-Z0-9\x80-\xff_=!?#\$\@~`\%&*+\\/:;.,]*[-a-zA-Z0-9\x80-\xff_=#\$\@~`\%&*+\\/])$QDelim";
$FreeInterLinkPattern = "($InterSitePattern:[-a-zA-Z0-9\x80-\xff_=!?#\$\@~`\%&*+\\/:;.,()' ]+)"; # plus space and other characters, and no restrictions on the end of the pattern
$InterSitePattern = '[A-Z\x{0080}-\x{ffff}]+[A-Za-z\x{0080}-\x{ffff}]+';
$InterLinkPattern = "($InterSitePattern:[-a-zA-Z0-9\x{0080}-\x{ffff}_=!?#\$\@~`\%&*+\\/:;.,]*[-a-zA-Z0-9\x{0080}-\x{ffff}_=#\$\@~`\%&*+\\/])$QDelim";
$FreeInterLinkPattern = "($InterSitePattern:[-a-zA-Z0-9\x{0080}-\x{ffff}_=!?#\$\@~`\%&*+\\/:;.,()' ]+)"; # plus space and other characters, and no restrictions on the end of the pattern
$UrlProtocols = 'http|https|ftp|afs|news|nntp|mid|cid|mailto|wais|prospero|telnet|gopher|irc|feed';
$UrlProtocols .= '|file' if $NetworkFile;
my $UrlChars = '[-a-zA-Z0-9/@=+$_~*.,;:?!\'"()&#%]'; # see RFC 2396
@@ -485,9 +481,7 @@ sub ApplyRules {
Clean(CloseHtmlEnvironments());
Dirty($1);
my ($oldpos, $old_) = (pos, $_); # remember these because of the call to RSS()
eval { local $SIG{__DIE__}; binmode(STDOUT, ":utf8"); } if $HttpCharset eq 'UTF-8';
print RSS($3 ? $3 : 15, split(/\s+/, UnquoteHtml($4)));
eval { local $SIG{__DIE__}; binmode(STDOUT, ":raw"); };
Clean(AddHtmlEnvironment('p')); # if dirty block is looked at later, this will disappear
($_, pos) = ($old_, $oldpos); # restore \G (assignment order matters!)
} elsif (/\G(&lt;search (.*?)&gt;)/cgis) {
@@ -522,8 +516,8 @@ sub ApplyRules {
Clean("&$1;");
} elsif (m/\G\s+/cg) {
Clean(' ');
} elsif (m/\G([A-Za-z\x80-\xff]+([ \t]+[a-z\x80-\xff]+)*[ \t]+)/cg
or m/\G([A-Za-z\x80-\xff]+)/cg or m/\G(\S)/cg) {
} elsif (m/\G([A-Za-z\x{0080}-\x{ffff}]+([ \t]+[a-z\x{0080}-\x{ffff}]+)*[ \t]+)/cg
or m/\G([A-Za-z\x{0080}-\x{ffff}]+)/cg or m/\G(\S)/cg) {
Clean($1); # multiple words but do not match http://foo
} else {
last;
@@ -794,6 +788,7 @@ sub UnquoteHtml {
sub UrlEncode {
my $str = shift;
return '' unless $str;
utf8::encode($str); # turn to byte string
my @letters = split(//, $str);
my %safe = map {$_ => 1} ('a' .. 'z', 'A' .. 'Z', '0' .. '9', '-', '_', '.', '!', '~', '*', "'", '(', ')', '#');
foreach my $letter (@letters) {
@@ -912,13 +907,6 @@ sub RSS {
# translations will be double encoded when printing the result.
my $tDiff = T('diff');
my $tHistory = T('history');
if ($HttpCharset eq 'UTF-8' and ($tDiff ne 'diff' or $tHistory ne 'history')) {
eval { local $SIG{__DIE__};
require Encode;
$tDiff = Encode::decode_utf8($tDiff);
$tHistory = Encode::decode_utf8($tHistory);
}
}
my $wikins = 'http://purl.org/rss/1.0/modules/wiki/';
my $rdfns = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#';
@uris = map { s/^"?(.*?)"?$/$1/; $_; } @uris; # strip quotes of uris
@@ -1030,8 +1018,8 @@ sub GetRss {
if (GetParam('cache', $UseCache) > 0) {
foreach my $uri (keys %todo) { # read cached rss files if possible
if ($Now - (stat($todo{$uri}))[9] < $RssCacheHours * 3600) {
$data{$uri} = ReadFile($todo{$uri});
delete($todo{$uri}); # no need to fetch them below
$data{$uri} = ReadFile($todo{$uri});
delete($todo{$uri}); # no need to fetch them below
}
}
}
@@ -1048,8 +1036,8 @@ sub GetRss {
%todo = (); # because the uris in the response may have changed due to redirects
my $entries = $pua->wait();
foreach (keys %$entries) {
my $uri = $entries->{$_}->request->uri;
$data{$uri} = $entries->{$_}->response->content;
my $uri = $entries->{$_}->request->uri;
$data{$uri} = $entries->{$_}->response->content;
}
}
}
@@ -1059,7 +1047,10 @@ sub GetRss {
if (GetParam('cache', $UseCache) > 0) {
CreateDir($RssDir);
foreach my $uri (@need_cache) {
WriteStringToFile(GetRssFile($uri), $data{$uri});
my $data = $data{$uri};
# possibly a Latin-1 file without encoding attribute will cause a problem?
$data =~ s/encoding="[^"]*"/encoding="UTF-8"/; # content was converted
WriteStringToFile(GetRssFile($uri), $data) if $data;
}
}
return $str, %data;
@@ -1313,9 +1304,17 @@ sub Tss {
sub GetId {
my $id = UnquoteHtml(GetParam('id', GetParam('title', ''))); # id=x or title=x -> x
$id = join('_', $q->keywords) unless $id; # script?p+q -> p_q
if (not $id) {
my @keywords = $q->keywords;
foreach my $keyword (@keywords) {
utf8::decode($keyword);
}
$id = join('_', @keywords) unless $id; # script?p+q -> p_q
}
if ($UsePathInfo) {
my @path = split(/\//, $q->path_info);
my $path = $q->path_info;
utf8::decode($path);
my @path = split(/\//, $path);
$id = pop(@path) unless $id; # script/p/q -> q
foreach my $p (@path) {
SetParam($p, 1); # script/p/q -> p=1
@@ -1500,7 +1499,7 @@ sub GetRcLines { # starttime, hash of seen pages to use as a second return value
my %following = ();
my @result = ();
# check the first timestamp in the default file, maybe read old log file
open(F, $RcFile);
open(F, '<:encoding(UTF-8)', $RcFile);
my $line = <F>;
my ($ts) = split(/$FS/o, $line); # the first timestamp in the regular rc file
if (not $ts or $ts > $starttime) { # we need to read the old rc file, too
@@ -1586,7 +1585,7 @@ sub GetRcLinesFor {
rcclusteronly rcfilteronly match lang followup);
# parsing and filtering
my @result = ();
open(F,$file) or return ();
open(F, '<:encoding(UTF-8)', $file) or return ();
while (my $line = <F>) {
chomp($line);
my ($ts, $id, $minor, $summary, $host, $username, $revision,
@@ -1872,7 +1871,7 @@ sub GetRcRss {
}
}
}
my $rss = qq{<?xml version="1.0" encoding="$HttpCharset"?>\n};
my $rss = qq{<?xml version="1.0" encoding="UTF-8"?>\n};
if ($RssStyleSheet =~ /\.(xslt?|xml)$/) {
$rss .= qq{<?xml-stylesheet type="text/xml" href="$RssStyleSheet" ?>\n};
} elsif ($RssStyleSheet) {
@@ -2218,15 +2217,16 @@ sub GetHeader {
$result .= $q->start_div({-class=>'header'});
if (not $embed and $LogoUrl) {
my $url = $IndexHash{$LogoUrl} ? GetDownloadLink($LogoUrl, 2) : $LogoUrl;
$result .= ScriptLink(UrlEncode($HomePage), $q->img({-src=>$url, -alt=>$alt, -class=>'logo'}), 'logo');
$result .= ScriptLink(UrlEncode($HomePage),
$q->img({-src=>$url, -alt=>$alt, -class=>'logo'}), 'logo');
}
if (GetParam('toplinkbar', $TopLinkBar)) {
$result .= GetGotoBar($id);
if (%SpecialDays) {
my ($sec, $min, $hour, $mday, $mon, $year) = gmtime($Now);
if ($SpecialDays{($mon + 1) . '-' . $mday}) {
$result .= $q->br() . $q->span({-class=>'specialdays'},
$SpecialDays{($mon + 1) . '-' . $mday});
$result .= $q->br() . $q->span({-class=>'specialdays'},
$SpecialDays{($mon + 1) . '-' . $mday});
}
}
}
@@ -2246,8 +2246,8 @@ sub GetHttpHeader {
my %headers = (-cache_control=>($UseCache < 0 ? 'no-cache' : 'max-age=10'));
$headers{-etag} = $ts || PageEtag() if GetParam('cache', $UseCache) >= 2;
$headers{'-last-modified'} = TimeToRFC822($ts) if $ts and $ts ne 'nocache'; # RFC 2616 section 13.3.4
$headers{-charset} = 'UTF-8';
$headers{-type} = GetParam('mime-type', $type);
$headers{-type} .= "; charset=$HttpCharset" if $HttpCharset;
$headers{-status} = $status if $status;
$headers{-Content_Encoding} = $encoding if $encoding;
my $cookie = Cookie();
@@ -2263,7 +2263,7 @@ sub CookieData {
my ($changed, $visible, %params);
foreach my $key (keys %CookieParameters) {
my $default = $CookieParameters{$key};
my $value = GetParam($key, $default); # values are URL encoded
my $value = GetParam($key, $default);
$params{$key} = $value if $value ne $default;
# The cookie is considered to have changed under the following
# condition: If the value was already set, and the new value is
@@ -2280,6 +2280,7 @@ sub Cookie {
my ($changed, $visible, %params) = CookieData(); # params are URL encoded
if ($changed) {
my $cookie = join(UrlEncode($FS), %params); # no CTL in field values
utf8::encode($cookie); # prevent casting to Latin 1
my $result = $q->cookie(-name=>$CookieName, -value=>$cookie,
-expires=>'+2y');
$Message .= $q->p(T('Cookie: ') . $CookieName . ', '
@@ -2297,10 +2298,9 @@ sub GetHtmlHeader { # always HTML!
. T('Edit this page') . '" href="'
. ScriptUrl('action=edit;id=' . UrlEncode(GetId())) . '" />' if $id;
return $DocumentHeader
. $q->head($q->title($title) . $base
. GetCss() . GetRobots() . GetFeeds() . $HtmlHeaders
. '<meta http-equiv="Content-Type" content="text/html; charset='
. $HttpCharset . '"/>')
. $q->head($q->title($title) . $base
. GetCss() . GetRobots() . GetFeeds() . $HtmlHeaders
. '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>')
. '<body class="' . GetParam('theme', $ScriptName) . '">';
}
@@ -2319,7 +2319,7 @@ sub GetFeeds { # default for $HtmlHeaders
my $id = GetId(); # runs during Init, not during DoBrowseRequest
$html .= '<link rel="alternate" type="application/rss+xml" title="'
. QuoteHtml("$SiteName: $id") . '" href="' . $ScriptName
. '?action=rss;rcidonly=' . UrlEncode(FreeToNormal($id)) . '" />' if $id;
. '?action=rss;rcidonly=' . UrlEncode(FreeToNormal($id)) . '" />' if $id;
my $username = GetParam('username', '');
$html .= '<link rel="alternate" type="application/rss+xml" '
. 'title="Follow-ups for ' . NormalToFree($username) . '" '
@@ -2452,8 +2452,9 @@ sub GetCommentForm {
sub GetFormStart {
my ($ignore, $method, $class) = @_;
$method ||= 'post';
$class ||= 'form';
return $q->start_multipart_form(-method=>$method, -action=>$FullUrl,
-class=>$class||'form');
-accept_charset=>'UTF-8', -class=>$class);
}
sub GetSearchForm {
@@ -2537,6 +2538,7 @@ sub DoDiff { # Actualy call the diff program
WriteStringToFile($oldName, $_[0]);
WriteStringToFile($newName, $_[1]);
my $diff_out = `diff $oldName $newName`;
utf8::decode($diff_out); # needs decoding
$diff_out =~ s/\\ No newline.*\n//g; # Get rid of common complaint.
ReleaseLockDir('diff');
# No need to unlink temp files--next diff will just overwrite.
@@ -2663,7 +2665,9 @@ sub OpenPage { # Sets global variables
%Page = ();
$Page{ts} = $Now;
$Page{revision} = 0;
if ($id eq $HomePage and (open(F, $ReadMe) or open(F, 'README'))) {
if ($id eq $HomePage
and (open(F, '<:encoding(UTF-8)', $ReadMe)
or open(F, '<:encoding(UTF-8)', 'README'))) {
local $/ = undef;
$Page{text} = <F>;
close F;
@@ -2791,8 +2795,8 @@ sub ExpireKeepFiles { # call with opened page
}
sub ReadFile {
my $fileName = shift;
if (open(IN, "<$fileName")) {
my $file = shift;
if (open(IN, '<:encoding(UTF-8)', $file)) {
local $/ = undef; # Read complete files
my $data=<IN>;
close IN;
@@ -2802,18 +2806,18 @@ sub ReadFile {
}
sub ReadFileOrDie {
my ($fileName) = @_;
my ($file) = @_;
my ($status, $data);
($status, $data) = ReadFile($fileName);
($status, $data) = ReadFile($file);
if (!$status) {
ReportError(Ts('Cannot open %s', $fileName) . ": $!", '500 INTERNAL SERVER ERROR');
ReportError(Ts('Cannot open %s', $file) . ": $!", '500 INTERNAL SERVER ERROR');
}
return $data;
}
sub WriteStringToFile {
my ($file, $string) = @_;
open(OUT, ">$file")
open(OUT, '>:encoding(UTF-8)', $file)
or ReportError(Ts('Cannot write %s', $file) . ": $!", '500 INTERNAL SERVER ERROR');
print OUT $string;
close(OUT);
@@ -2821,7 +2825,7 @@ sub WriteStringToFile {
sub AppendStringToFile {
my ($file, $string) = @_;
open(OUT, ">>$file")
open(OUT, '>>:encoding(UTF-8)', $file)
or ReportError(Ts('Cannot write %s', $file) . ": $!", '500 INTERNAL SERVER ERROR');
print OUT $string;
close(OUT);
@@ -3258,8 +3262,8 @@ sub DoIndex {
push(@menu, $q->b(Ts('(for %s)', GetParam('lang', '')))) if GetParam('lang', '');
print $q->start_div({-class=>'content index'}),
GetFormStart(undef, 'get', 'index'), GetHiddenValue('action', 'index'),
$q->p(join($q->br(), @menu)), $q->end_form(),
$q->h2(Ts('%s pages found.', ($#pages + 1))), $q->start_p();
$q->p(join($q->br(), @menu)), $q->end_form(),
$q->h2(Ts('%s pages found.', ($#pages + 1))), $q->start_p();
}
foreach (@pages) {
PrintPage($_);
@@ -3290,6 +3294,7 @@ sub PrintPage {
sub AllPagesList {
my $refresh = GetParam('refresh', 0);
return @IndexList if @IndexList and not $refresh;
SetParam('refresh', 0) if $refresh;
if (not $refresh and -f $IndexFile) {
my ($status, $rawIndex) = ReadFile($IndexFile); # not fatal
if ($status) {
@@ -3306,6 +3311,7 @@ sub AllPagesList {
foreach (glob("$PageDir/*/*.pg $PageDir/*/.*.pg")) { # find .dotfiles, too
next unless m|/.*/(.+)\.pg$|;
my $id = $1;
utf8::decode($id);
push(@IndexList, $id);
$IndexHash{$id} = 1;
}
@@ -3359,7 +3365,8 @@ sub PageIsUploadedFile {
return undef if $OpenPageName eq $id;
if ($IndexHash{$id}) {
my $file = GetPageFile($id);
open(FILE, "<$file") or ReportError(Ts('Cannot open %s', $file) . ": $!", '500 INTERNAL SERVER ERROR');
open(FILE, '<:encoding(UTF-8)', $file)
or ReportError(Ts('Cannot open %s', $file) . ": $!", '500 INTERNAL SERVER ERROR');
while (defined($_ = <FILE>) and $_ !~ /^text: /) {
} # read lines until we get to the text key
close FILE;
@@ -3403,7 +3410,7 @@ sub GrepFiltered { # grep is so much faster!!
# if we know of any remaining grep incompatibilities we should
# return @pages here!
$regexp = quotemeta($regexp);
open(F,"grep -rli $regexp '$PageDir' 2>/dev/null |");
open(F, '-|:encoding(UTF-8)', "grep -rli $regexp '$PageDir' 2>/dev/null");
while (<F>) {
push(@result, $1) if m/.*\/(.*)\.pg/ and not $found{$1};
}