X-Git-Url: https://scripts.mit.edu/gitweb/www/ikiwiki.git/blobdiff_plain/de9104d4e2571dcdf28659b3fd244c3c77f02740..2210c65083465bf8e515418cf2528e18f7326bc0:/doc/todo/should_use_a_standard_encoding_for_utf_chars_in_filenames.mdwn diff --git a/doc/todo/should_use_a_standard_encoding_for_utf_chars_in_filenames.mdwn b/doc/todo/should_use_a_standard_encoding_for_utf_chars_in_filenames.mdwn new file mode 100644 index 000000000..a454d7da5 --- /dev/null +++ b/doc/todo/should_use_a_standard_encoding_for_utf_chars_in_filenames.mdwn @@ -0,0 +1,61 @@ +It seems that I can't use Polish characters in post title. +When I try to do it, then I can see error message: "Błąd: bad page name". + +I hope it's a bug, not a feature and you fix it soon :) --[[Paweł|ptecza]] + +> ikiwiki only allows a very limited set of characters raw in page names, +> this is done as a deny-by-default security thing. All other characters +> need to be encoded in __code__ format, where "code" is the character +> number. This is normally done for you, but if you're adding a page +> manually, you need to handle it yourself. --[[Joey]] + +>> Assume I have my own blog and I want to send a new post with Polish +>> characters in a title. I think it's totally normal and common thing +>> in our times. Do you want to tell me I shouldn't use my native +>> characters in the title? It can't be true ;) + +>> In my opinion encoding of title is a job for the wiki engine, +>> not for me. Joey, please try to look at a problem from my point +>> of view. I'm only user and I don't have to understand +>> what the character number is. I only want to blog :) + +>> BTW, why don't you use the modified-UTF7 coding for page names +>> as used in IMAP folder names with non-Latin letters? --[[Paweł|ptecza]] + +>>> Joey, do you intend to fix that bug or it's a feature +>>> for you? ;) --[[Paweł|ptecza]] + +>>>> Of course you can put Polish characters in the title. but the page +>>>> title and filename are not identical. Ikiwiki has to place some limits +>>>> on what filenames are legal to prevent abuse. Since +>>>> the safest thing to do in a security context is to deny by default and +>>>> only allow a few well-defined safe things, that's what it does, so +>>>> filenames are limited to basic alphanumeric characters. +>>>> +>>>> It's not especially hard to transform your title into get a legal +>>>> ikiwiki filename: + + joey@kodama:~>perl -MIkiWiki -le 'print IkiWiki::titlepage(shift).".mdwn"' "Błąd" + B__197____130____196____133__d.mdwn + +>>>>> Thanks for the hint! It's good for me, but rather not for common users :) + +>>>>>> Interesting... I have another result: +>>>>>> +>>>>>> perl -MIkiWiki -le 'print IkiWiki::titlepage(shift).".mdwn"' "Błąd" +>>>>>> B__179____177__d.mdwn +>>>>>> +>>>>>> What's your locale? I have both pl\_PL (ISO-8859-2) and pl\_PL.UTF-8, +>>>>>> but I use pl\_PL. Is it wrong? --[[Paweł|ptecza]] + +>>>> Now, as to UTF7, in retrospect, using a standard encoding might be a +>>>> better idea than coming up with my own encoding for filenames. Can +>>>> you provide a pointer to a description to modified-UTF7? --[[Joey]] + +>>>>> The modified form of UTF7 is defined in [RFC 2060](http://www.ietf.org/rfc/rfc2060.txt) +>>>>> for IMAP4 protocol (please see section 5.1.3 for details). + +>>>>> There is a Perl [Unicode::IMAPUtf7](http://search.cpan.org/~fabpot/Unicode-IMAPUtf7-2.01/lib/Unicode/IMAPUtf7.pm) +>>>>> module at the CPAN, but probably it hasn't been debianized yet :( --[[Paweł|ptecza]] + +[[wishlist]]