Charset problem [migrating] ArticleStorage to FileSystem

Moderator: crythias

gdavid
Znuny newbie
Posts: 7
Joined: 05 Nov 2012, 19:58
Znuny Version: 3.1.10


Post by gdavid »

OTRS 3.1
Recently we migrated our article storage backend from the database (DB) to the filesystem (FS) and ran into a strange problem:
attachments with accented characters in the filename are created on the filesystem with wrong names by the ArticleStorageSwitch.pl script.
I suspect that the accented characters in the on-disk names use a different charset or encoding than OTRS expects. The names display correctly in a bash shell, but OTRS reports that the attachment is not found, even though the filename it searches for is (or appears to be) the same as the one present on disk.
We are using the UTF-8 charset both in the DB and in the bash environment.
Does anyone have suggestions on how to solve this issue? If more information is needed to investigate, please ask.
Thanks
Last edited by gdavid on 15 Nov 2013, 18:15, edited 1 time in total.
gdavid

Re: Charset problem [migrating] ArticleStorage to FileSystem

Post by gdavid »

Some more clues: it turns out that the problem is not related to ArticleStorageSwitch.pl, but to the filesystem article storage backend as a whole (in a Linux environment).
That is, on an OTRS instance configured with ArticleStorage on the filesystem, even newly created e-mail tickets are affected. If a mail contains an attachment whose filename includes some (not all) accented characters (e.g. "é", "ó"), the corresponding file created in ./var/article/{year}/{month}/{day}/{article_id}/ gets a name that OTRS cannot find afterwards (logged-in agents see no attachment on that ticket/mail).
I get this error in otrs.log when agents open the ticket:
"[Error][Kernel::System::Main::FileRead][320] File '/opt/otrs/var/article/2013/11/10/1668324/Documentación.pdf' doesn't exist!"

The file does exist at that path; I can see it in bash:
# ls -l
-rw-rw-r-- 1 otrs www-data 22131 Nov 10 23:26 Documentación.pdf

but its name contains some unexpected bytes, as shown by:
# LANG=C find . -regex '.*[^a-zA-Z./-].*'
./Documentacio??n.pdf
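
One possible reading of that output, offered as a guess rather than a diagnosis: two `?` printed right after a plain `o` are consistent with a two-byte combining accent following the letter (a decomposed, NFD-style name) rather than a single precomposed "ó". A minimal Python sketch for checking a directory for such names follows. `non_nfc_names` and `scan_directory` are hypothetical helper names and not part of OTRS.

```python
import os
import unicodedata


def non_nfc_names(names):
    """Return (name, nfc_name) pairs for every name that is not already in NFC form."""
    pairs = []
    for name in names:
        nfc = unicodedata.normalize("NFC", name)
        if nfc != name:
            pairs.append((name, nfc))
    return pairs


def scan_directory(path):
    """Print the escaped form of each directory entry whose name is not NFC."""
    for name, nfc in non_nfc_names(sorted(os.listdir(path))):
        # ascii() makes combining marks visible as \uXXXX escapes
        print(ascii(name), "->", ascii(nfc))
```

Calling `scan_directory("/opt/otrs/var/article/2013/11/10/1668324")` would list any entry whose stored name differs from its NFC form. An empty result would rule this explanation out.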

These are the attachment-related lines in the raw source of the incoming e-mail:

<--snips-->
Content-Type: application/pdf; name="=?UTF-8?Q?Documentacio=CC=81n=2Epdf?="
Content-Disposition: attachment; filename="=?UTF-8?Q?Documentacio=CC=81n=2Epdf?="
Content-Transfer-Encoding: base64
<--snips-->

I think the issue could be related to multibyte UTF-8 characters and to how the OTRS code handles them.
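
The `=CC=81` sequence in the headers quoted above can be checked directly. The sketch below (Python, standard library only) decodes the encoded-word taken from the `filename` parameter and compares it with its NFC form. `decode_filename` is a hypothetical helper, and whether OTRS normalizes the name at any point is an assumption that this does not prove; it only shows that the name as sent by the mail client is in decomposed form.

```python
import unicodedata
from email.header import decode_header

# Encoded-word copied from the Content-Disposition header of the mail
RAW = "=?UTF-8?Q?Documentacio=CC=81n=2Epdf?="


def decode_filename(raw):
    """Decode an RFC 2047 encoded-word header value into a str."""
    parts = []
    for data, charset in decode_header(raw):
        if isinstance(data, bytes):
            parts.append(data.decode(charset or "ascii"))
        else:
            parts.append(data)
    return "".join(parts)


name = decode_filename(RAW)
nfc = unicodedata.normalize("NFC", name)

# As sent: "o" followed by U+0301 COMBINING ACUTE ACCENT (bytes CC 81)
print(name.encode("utf-8"))   # b'Documentacio\xcc\x81n.pdf'
# After NFC normalization: precomposed U+00F3 (bytes C3 B3)
print(nfc.encode("utf-8"))    # b'Documentaci\xc3\xb3n.pdf'
print(name == nfc)            # False
```

Both strings render identically in a terminal, yet they are different byte sequences, so a lookup using one form would not match a file stored under the other.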
Has anyone (French or Spanish speakers in particular) solved this issue? Or did I find a new bug (maybe related to http://bugs.otrs.org/show_bug.cgi?id=9418) that should be reported to the developers?
Many thanks
g