non-utf-8, GPG encrypted mails cut off after umlauts

sec-mss · Post by **sec-mss** » 28 Oct 2011, 15:57

Hello,

we have a very annoying problem with our OTRS (Version 3.0.10) with gpg-encrypted mails.

Environment:
- Our OTRS is version 3.0.10, encoding is UTF-8, DB charset is UTF-8, server charset is UTF-8.
- I can reproduce this error on our live system (Debian lenny), our test system (Debian lenny) and with a fresh install from tar.gz under Ubuntu 10.04.3 LTS (with OTRS 3.0.10 & 3.0.11)!

Scenario:
If we receive a gpg-encrypted mail in our OTRS, the mail encoding is either iso-8859-1 or iso-8859-15, and the encrypted text contains (German) umlauts, OTRS somehow fails to decrypt the mail correctly and the mail is cut off after the first umlaut.

The same scenario works correctly if the encrypted mail has utf-8 encoding.

Can anyone reproduce this or give me a hint on how to solve it, or should I file a bug report?

Edit: I'm pretty sure the problem did not exist in otrs-2.4.x with DB charset latin1, if this helps somehow.

Best regards

sec-mss · Post by **sec-mss** » 04 Nov 2011, 11:15

So, as no one has shown up here, I'll file a bug report.

Edit: I've hit on this existing bug report, which describes the same problem, but misses some facts, so I'll just append my results to it, hoping someone at otrs will take care of it!
http://bugs.otrs.org/show_bug.cgi?id=5705

Regards

tto · Post by **tto** » 15 Nov 2011, 14:04

sec-mss wrote: So, as no one has shown up here, I'll file a bug report.

Edit: I've hit on this existing bug report, which describes the same problem, but misses some facts, so I'll just append my results to it, hoping someone at otrs will take care of it!
http://bugs.otrs.org/show_bug.cgi?id=5705

Regards

Hi S.

I can confirm the bug as you described it. You may apply the following patch to Kernel/Output/HTML/ArticleCheckPGP.pm. It's not perfect, but worked in our case and should be enough as a hotfix.

regards, T.

Code:

### Eclipse Workspace Patch 1.0
#P otrs
Index: Kernel/Output/HTML/ArticleCheckPGP.pm
===================================================================
RCS file: /home/cvs/otrs/Kernel/Output/HTML/ArticleCheckPGP.pm,v
retrieving revision 1.25
diff -u -r1.25 ArticleCheckPGP.pm
--- Kernel/Output/HTML/ArticleCheckPGP.pm	2 Dec 2010 19:16:52 -0000	1.25
+++ Kernel/Output/HTML/ArticleCheckPGP.pm	15 Nov 2011 11:56:55 -0000
@@ -73,6 +73,18 @@
         my %Decrypt = $Self->{CryptObject}->Decrypt( Message => $Param{Article}->{Body} );
         if ( $Decrypt{Successful} ) {
 
+            # workaround for non UTF-8 encoded crypt messages - OTRS bug 5705
+            if( $Param{Article}->{Body} =~ /Charset\:\s*(.+)\s*/m) {
+                my $DecryptCharset = $1 || '';
+                if ( $DecryptCharset ) {
+                    $Decrypt{Data} = $Self->{EncodeObject}->Convert2CharsetInternal(
+                        Text => $Decrypt{Data},
+                        From => $DecryptCharset,
+                    );
+                }
+            }
+            # EO workaround for non UTF-8 encoded crypt messages - OTRS bug 5705
+
             # remember to result
             $Self->{Result} = \%Decrypt;
             $Param{Article}->{Body} = $Decrypt{Data};

sec-mss · Post by **sec-mss** » 22 Nov 2011, 16:12

Thanks for your patch, Thorsten!

As the patch extracts the charset from the PGP header, it obviously fails if there is no "Charset" line in the PGP header (which is the case for one of our customers), or if there is no PGP header at all (inline PGP).

So my first thought was to use $Param{Article}->{Charset}, $Param{Article}->{ContentType} or $Param{Article}->{ContentCharset} as a fallback to determine the charset. This fails, as these are always set to utf-8, I guess in Kernel/System/EmailParser.pm, sub GetReturnContentType(), where the charset of the ContentType is changed to $InternalCharset.
I'm stuck here right now. Maybe someone understands why the ContentType charset is changed and can shed some light on this?

Best regards,
Sebastian

Post by **crythias** » 22 Nov 2011, 18:26

If you know what the encoding *might* be, you might try to change the line:
my $DecryptCharset = $1 || '';

and add the encoding in the single quotes.

Edit: of course, this makes the fall-back to be the "bad" character set, instead of UTF-8, which may be an issue. On the other hand, it'll only happen for people who don't tell you what their character set is, which hopefully is the bad one.
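
A minimal sketch of this fallback idea as stand-alone Perl, outside of OTRS. The helper names `guess_pgp_charset` and `decode_decrypted` are made up for illustration, and the core `Encode` module stands in here for OTRS's own EncodeObject. It reads a Charset line only from the armor header block (the lines between the BEGIN line and the first empty line) and uses the caller's fallback when none is found:

```perl
use strict;
use warnings;
use Encode qw(decode find_encoding);

# Hypothetical helper, not part of OTRS: pick the charset for decrypted data.
sub guess_pgp_charset {
    my ( $armored, $fallback ) = @_;

    # Armor headers are the non-empty lines directly after the BEGIN line,
    # terminated by the first empty line.
    if ( $armored =~ /^-----BEGIN PGP MESSAGE-----\r?\n((?:[^\r\n]+\r?\n)*)\r?\n/m ) {
        my $headers = $1;
        if ( $headers =~ /^Charset:[ \t]*([\w.:-]+)/mi ) {
            my $charset = $1;

            # Only trust names that Encode actually knows.
            return $charset if find_encoding($charset);
        }
    }

    # No usable Charset line: fall back to the caller's guess.
    return $fallback;
}

# Hypothetical helper: turn the decrypted bytes into a Perl character string.
sub decode_decrypted {
    my ( $bytes, $armored, $fallback ) = @_;
    return decode( guess_pgp_charset( $armored, $fallback ), $bytes );
}

# Usage example with a shortened, made-up armored message.
my $armored = "-----BEGIN PGP MESSAGE-----\n"
    . "Charset: ISO-8859-15\n"
    . "\n"
    . "(base64 data)\n"
    . "-----END PGP MESSAGE-----\n";
print guess_pgp_charset( $armored, 'utf-8' ), "\n";
```

Which fallback to pass in remains a guess, as noted above: a sender who names no charset may be using any of them.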

Post by **crythias** » 22 Nov 2011, 18:40

After further thought,
Since Decrypt does give a result, you may want to also test if the Decrypt fails before you try to change the encoding.

Decrypt()
Decrypts a message and returns a hash (Successful, Message, Data).

    my %Result = $CryptObject->Decrypt(
        Message => $CryptedMessage,
    );

The returned hash %Result has the following keys:

    Successful => '1',        # could the given data be decrypted at all (0 or 1)
    Data       => '...',      # the decrypted data
    KeyID      => 'FA23FB24'  # hex ID of PGP-(secret-)key that was used for decryption
    Message    => '...'       # descriptive text containing the result status

sec-mss · Post by **sec-mss** » 24 Nov 2011, 10:42

Hi,

As you can see in tto's patch, his conversion is indeed only done when Decrypt() reports that it has decrypted successfully:

Code:

if ( $Decrypt{Successful} ) {

So this does not help here.

Setting a "default" DecryptCharset isn't really what I'm looking for.
For now we are getting by with another quick and very dirty hack: we use procmail and a tiny Perl script to add a Charset line to the PGP header (if there is none), extracted from the mail header. This works, but I don't have a good feeling about it.
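
To illustrate the idea (this is not the script we actually run, just a rough sketch under simplifying assumptions): the hypothetical function `add_pgp_charset` below takes a complete raw mail, reads the charset from the top-level Content-Type header and inserts a Charset line right after the BEGIN line if the armor header has none. It only covers inline-PGP mails whose top-level header carries a charset parameter.

```perl
use strict;
use warnings;

# Hypothetical filter function: copy the charset from the mail header into
# the PGP armor header, unless the armor header already has a Charset line.
sub add_pgp_charset {
    my ($mail) = @_;

    # Split the mail header from the body at the first empty line,
    # remembering the separator so the mail is reassembled unchanged.
    my ( $head, $sep, $body ) = $mail =~ /\A(.*?)(\r?\n\r?\n)(.*)\z/s;
    return $mail if !defined $body;

    # Unfold continuation lines, then look for the charset parameter
    # of the Content-Type header, e.g. charset="iso-8859-1".
    ( my $unfolded = $head ) =~ s/\r?\n[ \t]+/ /g;
    my ($charset) = $unfolded =~ /^Content-Type:[^\n]*?charset="?([\w.:-]+)/mi;
    return $mail if !$charset;

    # Find the armor header block: the BEGIN line, any non-empty header
    # lines, then the empty line that ends the block.
    if ( $body =~ /^(-----BEGIN PGP MESSAGE-----\r?\n)((?:[^\r\n]+\r?\n)*)\r?\n/m ) {
        my ( $armor, $insert_at ) = ( $2, $+[1] );

        # Insert directly after the BEGIN line if no charset is given yet.
        if ( $armor !~ /^Charset:/mi ) {
            substr( $body, $insert_at, 0, "Charset: $charset\n" );
        }
    }

    return $head . $sep . $body;
}

# Usage example with a made-up mail.
my $mail = "Content-Type: text/plain; charset=\"iso-8859-1\"\n"
    . "Subject: test\n"
    . "\n"
    . "-----BEGIN PGP MESSAGE-----\n"
    . "Version: GnuPG v1\n"
    . "\n"
    . "(base64 data)\n"
    . "-----END PGP MESSAGE-----\n";
print add_pgp_charset($mail);
```

Used as a procmail filter, such a script would read the mail from STDIN and print the result to STDOUT.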

Returning to my earlier thoughts, I guess the reason for having UTF-8/InternalCharset as Charset/ContentCharset in $Param{Article} is that EmailParser.pm does a charset conversion before the PGP content is decrypted, which only happens when you access the article via the web UI. The conversion obviously has no effect on the encrypted content, but it "destroys" the mail header information about the original charset.

Any ideas?

Post by **crythias** » 24 Nov 2011, 15:51

It appears that OTRS stores the article in plain text as soon as it is first viewed and decrypted:

From ArticleCheckPGP.pm

Code:

        my %Decrypt = $Self->{CryptObject}->Decrypt( Message => $Param{Article}->{Body} );
        if ( $Decrypt{Successful} ) {

            # remember to result
            $Self->{Result} = \%Decrypt;
            $Param{Article}->{Body} = $Decrypt{Data};

            # updated article body
            $Self->{TicketObject}->ArticleUpdate(
                TicketID  => $Param{Article}->{TicketID},
                ArticleID => $Self->{ArticleID},
                Key       => 'Body',
                Value     => $Decrypt{Data},
                UserID    => $Self->{UserID},
            );
        }

This suggests that the conversion to UTF-8 has to happen after decryption and before storage (ArticleUpdate). From the code as presented, it appears that once the article is decrypted, it will not need to be decrypted again. The biggest issue is determining what the character set was at the time of encryption, which, I tend to agree, isn't stored with the data. But that appears to be why the workaround ("I'll tell you my character set after decryption") makes sense.

sec-mss · Post by **sec-mss** » 06 Dec 2011, 11:07

Your summary is right, crythias. That's exactly what we investigated before and what T.'s patch is trying to take advantage of.
But the main problem still remains: if there is no information about the original charset, the workaround will fail, and we can only guess the charset. This is not a proper & stable solution in my eyes.

Btw, I got a notification about a new post here from Sunday, 21:26 CET, but I can't see any new post here?

Post by **crythias** » 06 Dec 2011, 11:43

sec-mss wrote: This is not a proper & stable solution in my eyes.

This is a problem with the encryption, though... it's a problem that happens on the sender's side, not on the OTRS side. If the encryption doesn't tell you the charset, that's a problem during encryption, not decryption. If the clear text specifies a charset, then the decryption algorithm is to blame. If the clear text does not, it's the fault of the encryption (the client), and the client encryption is being lazy in assuming that the clear text will have the same charset as the message.

sec-mss wrote:Btw. I got a notification about a new post here from sunday, 21:26 CET, but I can't see any new post here?

It was likely spam and deleted before you noticed it.

sec-mss · Post by **sec-mss** » 06 Dec 2011, 16:49

crythias wrote: This is a problem with the encryption, though... it's a problem that happens on the sender's side, not on the OTRS side. If the encryption doesn't tell you the charset, that's a problem during encryption, not decryption. If the clear text specifies a charset, then the decryption algorithm is to blame. If the clear text does not, it's the fault of the encryption (the client), and the client encryption is being lazy in assuming that the clear text will have the same charset as the message.

Maybe you're right in your view that clients are responsible for giving proper information about the charset used. But that would be a "perfect world" scenario, and it also misses the MIME-PGP encrypted mails, where charset information is only available in the mail header!

So we have MIME-PGP mails and "improper" inline-PGP mails, where we need to obtain the charset information from the mail header. As this information is destroyed/overwritten by OTRS before the decryption takes place, we need a way to keep it available.
From my point of view, this needs to be done inside OTRS for working PGP support.
From my point of view, this needs to be done inside otrs for a working pgp support.
