Wide character at /opt/pkg/lib/perl5/vendor_perl/5.38.0/IkiWiki/Plugin/git.pm line 395
using Ikiwiki setup file /Users/gt/website.setup ...
rebuilding Ikiwiki instance...
generating wrappers..
rebuilding wiki..
Wide character at /opt/pkg/lib/perl5/vendor_perl/5.38.0/IkiWiki/Plugin/git.pm line 395.
rebuilding calendar for diary/*
$ ikiwiki --version
ikiwiki version 3.20200202.3
At line 395 of the git.pm file, there is the following function:
    sub decode_git_file ($$) {
        my $dir=shift;
        my $file=shift;
        # git does not output utf-8 filenames, but instead
        # double-quotes them with the utf-8 characters
        # escaped as \nnn\nnn.
        if ($file =~ m/^"(.*)"$/) {
            ($file=$1) =~ s/\\([0-7]{1,3})/chr(oct($1))/eg;
        }
        # strip prefix if in a subdir
        if (! defined $prefix_cache{$dir}) {
            ($prefix_cache{$dir}) = run_or_die_in($dir, 'git', 'rev-parse', '--show-prefix');
            if (! defined $prefix_cache{$dir}) {
                $prefix_cache{$dir}="";
            }
        }
        $file =~ s/^\Q$prefix_cache{$dir}\E//;
        return decode("utf8", $file);
    }
    } # closes an enclosing block whose opening is not part of this excerpt
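To make the function above easier to experiment with, here is a minimal stand-alone sketch of its unquoting and decoding steps. It is not a confirmed diagnosis or an upstream fix. It rests on one assumption: that Encode's decode expects a byte string and may fail with a "Wide character" message if the string it is given already contains characters above 0xFF. The names unquote_git_path and safe_decode_utf8 are hypothetical and do not exist in IkiWiki.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-alone copy of the unquoting step from
# decode_git_file, without the prefix handling, so that it can be
# run outside IkiWiki.
sub unquote_git_path {
    my $file = shift;
    if ($file =~ m/^"(.*)"$/) {
        # Turn each \nnn octal escape back into a single byte.
        ($file = $1) =~ s/\\([0-7]{1,3})/chr(oct($1))/eg;
    }
    return $file;
}

# Hypothetical guard (an assumption, not IkiWiki code): only decode
# when the string still looks like raw bytes.
sub safe_decode_utf8 {
    my $file = shift;
    return $file if $file =~ /[^\x00-\xff]/;   # already holds wide characters
    return decode("utf8", $file);
}

# A two-character CJK name (U+4E2D U+6587) plus ".mdwn", written the
# way git quotes it: one octal escape per UTF-8 byte.
my $quoted = '"\344\270\255\346\226\207.mdwn"';
my $name   = safe_decode_utf8(unquote_git_path($quoted));

printf "%d characters\n", length $name;   # prints "7 characters"
```

If the guard changes the behaviour on a real wiki, that would suggest some filenames reach decode_git_file already decoded. If it does not, the cause lies elsewhere.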
Any idea how to fix the issue? My site uses CJK characters in file names and Git commit messages. Would it be OK?
Are the CJK characters you are using Unicode characters? Would you be able to share some of the recent git commit messages, or the filenames of recently added or modified files (aiming to select the ones that are most likely to have triggered this bug)? That would be a big help in debugging this. Thanks — Jon, 2024-03-04
I found the following unique characters in the file names:
()_-. 、《》「」一三上不与丛东中临之乐九习书事二于五些交京人仁介代令件价似体何作侄修债值兒入全八公六其具典养再写农出分划刘別剧劣励勇化北十千华南博卿历原参及变叠古句台史号名吳吴周命和品唐商問器回図国图國圖在地培基堯塔士复夏多大天太女如姪媒子字孟学孫宋宗宝实客家寫寶小少局居屋属屿岛峽州工布帖常干年幽序底庵廣建异式录心忠惯感成所打扫找技把投拳换支政效散数文料方日时明曲書月有期本术杂李杜杨松构枚果架标校梅梦棠棣極楷楼歙止武殳毓比毛氏气水永汉江河治法波注泽洛流测济海測游湾滄漢漫潘澍激点牆片版牝猫献玄王现理用甫电画登白百的目盲真石研硯碑礼社神秦程稱究章童笑笔第筆简算篆籍类紙素經網編红约纪细经结络网罗美翻老考聲臣臨自舟般艺节苔若茅荒著藏蘭虎虚蜜蟹行衛表袁複要觀视言記訪試詩誌語记论试诠语说谷賦质资赋趙路車轼辭还送通過邺部里重野量钓關间附陈院陳陵陸隔雅集音頫题颜風首馬魯鱼鸣鸰鹡黑默齐(),:0125aBcdeFgGhHiklmnorstWyz
and the following in the file content (tried to add some linebreaks):
Does anything look bad?
I'm not sure where/how to add the debug print you request, just yet, sorry.
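One possible shape for such a debug print is sketched below, assuming the aim is to see what kind of string a function like decode_git_file receives. describe_string is a hypothetical helper, not part of IkiWiki. It reports code points instead of printing the raw string, so the debug output itself cannot trigger a wide character warning.

```perl
use strict;
use warnings;

# Hypothetical helper: summarise a string as its internal flag state,
# whether it holds characters above 0xFF, and its code points.
sub describe_string {
    my $s = shift;
    my $flag  = utf8::is_utf8($s) ? "utf8-flagged" : "bytes";
    my $wide  = ($s =~ /[^\x00-\xff]/) ? "has wide chars" : "no wide chars";
    my $codes = join " ", map { sprintf("U+%04X", ord($_)) } split(//, $s);
    return "$flag, $wide: $codes";
}

# Example call; inside git.pm this could be applied to $file just
# before the decode call (an assumption about where it is useful).
print STDERR describe_string("\x{4e2d}.mdwn"), "\n";
```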
As a first step to try and reproduce it I've tried to create the following filenames in a test wiki:
The first and last of those are rejected by Ikiwiki by default because of characters not in the range of wiki_file_chars. From your setup file, can you share the value of wiki_file_chars?
I found
wiki_file_chars => '-[:alnum:]+/.:_《》'
Did not even realize that is configurable. I probably should add a lot more characters?
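To see what a value like the one above accepts, the following sketch checks a few names against it. It assumes that Ikiwiki places the setting inside a single bracketed character class, roughly qr/^[...]+$/. That is an assumption about the internals, not something shown in this thread. filename_ok is a hypothetical helper.

```perl
use strict;
use warnings;

# The value from the setup file; \x{300A} and \x{300B} are the two
# double angle brackets at its end.
my $wiki_file_chars = "-[:alnum:]+/.:_\x{300A}\x{300B}";

# Assumption: the setting is wrapped in one bracketed class.
my $allowed = qr/^[$wiki_file_chars]+$/;

# Hypothetical helper: 1 if every character of the name is allowed.
sub filename_ok { return $_[0] =~ $allowed ? 1 : 0 }

my @samples = (
    ["CJK letters",     "\x{4e2d}\x{6587}.mdwn"],    # letters count as alnum
    ["Emacs backup",    "notes.mdwn~"],              # "~" is not in the set
    ["corner brackets", "\x{300C}a\x{300D}.mdwn"],   # U+300C/U+300D are not in the set
);
for my $s (@samples) {
    printf "%-16s %s\n", $s->[0], filename_ok($s->[1]) ? "accepted" : "rejected";
}
```

Under that assumption, CJK letters already pass through [:alnum:], so only punctuation such as the corner brackets, the ideographic comma, parentheses and spaces would need to be added.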
If you want to use those characters in filenames, yes. But please note that if you don't, you get a different error message to the one you are experiencing (skipping bad filename <problematic filename>), so I don't think adjusting wiki_file_chars will fix your problem.
Most of the "skipping bad filename" lines are for files ending in .mdwn~. These files are automatically created as backups when .mdwn files are edited in Emacs, and I do like to use that feature. However, seeing them printed at Ikiwiki build time is not useful, as they're expected and their sheer number makes the other, truly problematic ones less visible. Is it possible to whitelist that?
If you didn't get the warnings about the Emacs backup files before adding/changing wiki_file_chars, and since adding/changing wiki_file_chars has not fixed the main problem here (has it?), why not just revert to your old setting (unset, if that's what it was)?
So you don't want to publish them, but you want to suppress the warning about not publishing them? If that's right, please file a todo item for that tagged wishlist.
Including more characters in wiki_file_chars did fix something, but I'm still checking whether everything is fixed. That's why I wanted the reporting on *.mdwn~ files, which is the majority of the "skipping bad filename XXX" lines, out of the way.
I don't want *.mdwn~ files to be built into the files deployed to the web site, but I do want to keep them in the source files so they are available as backups, i.e. their designed purpose for Emacs.
I've found this setting that seems to be able to exclude file types:
exclude => '^(.*\\.bak|.*\\.[:alnum:]+~|#.*#)$',
Although it seems it requires single square brackets in [:alnum:], while wiki_file_chars requires double square brackets. Isn't that odd?
Does exclude quieten the warnings?
The point of wiki_file_chars => and exclude => was to include files that should be included and exclude the ones that should NOT be; however, it's only a useful digression from the original question asked -- the error in the title.
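On the single versus double square brackets raised above: in Perl regular expressions, [:alnum:] is only a POSIX class when it sits inside another pair of brackets. On its own it is an ordinary class containing the characters ":", "a", "l", "n", "u" and "m". The sketch below compares the exclude value as written in the thread with a variant that nests the class. How Ikiwiki applies the setting to paths is not shown here, and matches is a hypothetical helper.

```perl
use strict;
use warnings;

# The value from the thread, as Perl sees it after single-quote
# processing ('\\.' in the setup file becomes \. in the pattern).
my $as_written = '^(.*\.bak|.*\.[:alnum:]+~|#.*#)$';

# The same pattern with the POSIX class nested in its own brackets.
my $nested     = '^(.*\.bak|.*\.[[:alnum:]]+~|#.*#)$';

# Hypothetical helper: 1 if the name matches the pattern, else 0.
sub matches {
    my ($pattern, $name) = @_;
    no warnings 'regexp';   # Perl warns about [:alnum:] outside a class
    return $name =~ m/$pattern/ ? 1 : 0;
}

for my $name ("page.mdwn~", "page.bak", "#page.mdwn#") {
    printf "%-12s as written: %d   nested: %d\n",
        $name, matches($as_written, $name), matches($nested, $name);
}
```

With the value as written, page.mdwn~ is not matched, because "d" and "w" are not among the characters ":alnum". The nested form matches it. The other two names match under both patterns.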