Table directive can not deal with Chinese, when format csv

[[!table  format=csv data="""
a,b,c
1,2,你好
"""
]]

But the below example works well.

[[!table  format=csv data="""
a,b,c
1,2,3
"""
]]

The below example works well too

[[!table  format=dsv delimiter=, data="""
a,b,c
1,2,你好
"""
]]

You don't say what actually happens when you try this, but I hit something similar trying unicode symbols in a CSV-based table. (I wasn't aware of the DSV work-around. Thanks!) The specific error I get trying is

[\[!table Error: Cannot decode string with wide characters at /usr/lib/x86_64-linux-gnu/perl/5.24/Encode.pm line 243.]]

That file is owned by the libperl5 package, but I think I've seen an error mentioning Text::CSV i.e. libtext-csv-perl when I've encountered this before. -- Jon

A related problem, also fixed by using DSV, is messing up the encoding of non-ASCII, non-wide characters, e.g. £ (workaround was to use £ instead) -- Jon

Sorry, I have faced the same error: [[!table Error: Cannot decode string with wide characters at /usr/lib/x86_64-linux-gnu/perl/5.24/Encode.pm line 243.]] -- ?tumashu1


The below patch seem to deal with this problem:

From d6ed90331b31e4669222c6831f7a0f40f0677fe1 Mon Sep 17 00:00:00 2001
From: Feng Shu <tumashu@163.com>
Date: Sun, 2 Dec 2018 08:41:39 +0800
Subject: [PATCH 2/2] Fix table plugin can handle UTF-8 csv format

---
 IkiWiki/Plugin/table.pm | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

From ad1a92c796d907ad293e572a168b6e9a8219623f Mon Sep 17 00:00:00 2001
From: Feng Shu <tumashu@163.com>
Date: Sun, 2 Dec 2018 08:41:39 +0800
Subject: [PATCH 2/2] Fix table plugin can handle UTF-8 csv format

---
 IkiWiki/Plugin/table.pm | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/IkiWiki/Plugin/table.pm b/IkiWiki/Plugin/table.pm
index f3c425a37..7fea8ab1c 100644
--- a/IkiWiki/Plugin/table.pm
+++ b/IkiWiki/Plugin/table.pm
@@ -135,6 +135,7 @@ sub split_csv ($$) {
    my $csv = Text::CSV->new({ 
        sep_char    => $delimiter,
        binary      => 1,
+       decode_utf8 => 1,
        allow_loose_quotes => 1,
    }) || error("could not create a Text::CSV object");

@@ -143,7 +144,7 @@ sub split_csv ($$) {
    foreach my $line (@text_lines) {
        $l++;
        if ($csv->parse($line)) {
-           push(@data, [ map { decode_utf8 $_ } $csv->fields() ]);
+           push(@data, [ $csv->fields() ]);
        }
        else {
            debug(sprintf(gettext('parse fail at line %d: %s'), 
-- 
2.19.0

Thanks, I've applied that patch and added test coverage. done --smcv


I can confirm that the above patch fixes the issue for me. Thanks! I'm not an ikiwiki committer, but I would encourage them to consider the above. Whilst I'm at it, I would be really grateful for some input on support multi-row table headers which relates to the same plugin. Jon


I've hit this bug with an inline-table and 3.20190228-1 (so: patch applied), with the following definition

[[\!table class=fullwidth_table delimiter="      " data="""    

Number  Title   Own?    Read?    
I (HB1), 70 (PB1), 5 (PB50)     Dune    O       ✓"""]]

I'm going to attempt to work around it by moving to an external CSV. ­— Jon

What version of Text::CSV (Debian: libtext-csv-perl) are you using? What version of Text::CSV::XS (Debian: libtext-csv-xs-perl) are you using, if any?

I could't reproduce this with libtext-csv-perl_2.00-1 and libtext-csv-xs-perl_1.39-1, assuming that the whitespace in delimiter="..." was meant to be a literal tab character, and that the data row has literal tabs before Dune, before O and before ✓.

It would be great if you could modify t/table.t to include a failing test-case, and push it to your github fork or something, so I can apply it without having to guess precisely what the whitespace should be. --smcv

Sorry, I appreciate as bug reports go my last post was not that useful. It's serving as a sort-of personal placeholder to investigate further. The issue can be seen live here, the source is here. The web servers versions are ikiwiki 3.20190228-1, libtext-csv-perl 1.33-2 and libtext-csv-xs-perl is not installed. I'll do some futher diagnosis and poking around. ­— Jon

OK: issue exists with oldstable/Stretch, and is seemingly fixed in stable/Buster. libcsv-text-xs-perl doesn't seem to matter (presence or absence doesn't change the bug). Upgrading just libtext-csv-perl on a Stretch host with Buster's 1.99-1 is not sufficent to fix it. As per the error, I think libperl5 might be relevant, i.e. bug present in 5.24.1-3+deb9u5 and fixed by 5.28.1-6.

EDIT: yes indeed merely upgrading to libperl5.28=5.28.1-6 in stretch fixes the issue. — Jon

Post Buster-upgrade, and it's still broken on my webhost, which shows [\[!table Error: Wide character at /usr/lib/x86_64-linux-gnu/perl/5.28/Encode.pm line 296.]] with libperl5.28:amd64 5.28.1-6, libtext-csv-perl 1.99-1 and libtext-csv-xs-perl 1.38-1. Further fiddling will commence. (removing libtext-csv-xs-perl does not help.) — Jon

Still hitting this table bug, using ikiwiki=3.20200202.3, buster system, no libtext-csv-perl installed, libperl5.28:amd64 5.28.1-6, no libtext-csv-xs-perl. Workaround is to set format=tsv, even though I'm overriding the delimiter to \t. This reproduces easily in the quay.io/jdowland/opinionated-ikiwiki container FWIW. Jon, 2020-06-26