cPanel filter for non-English characters

Discussion in 'General Discussion' started by dandanfireman, May 1, 2006.

  1. dandanfireman

    dandanfireman Well-Known Member

    I have a customer who wants to filter all incoming email on an account to exclude any messages containing non-English characters. Since the customer can't read any language other than English, this seems fairly logical.

    I understand that some languages might be difficult to detect. What about just excluding Eastern languages that use a completely different character set? The customer is specifically getting a lot of messages in Korean and would like them to go away.

    To anyone reading this who might believe this is in some way discriminatory, please don't bother replying. It is simply a technical question about helping a customer avoid unwanted email.
  2. chirpy

    chirpy Well-Known Member

    Have a look at the email headers. For those languages you might see the Content-Transfer-Encoding header change from the value used for plain ASCII mail:

    Content-Transfer-Encoding: 7bit

    to something else. If so, you could filter on that header.
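A rough sketch of that header check, written in Python purely for illustration (it is not a cPanel or Exim filter). One caveat: English mail is also often sent as quoted-printable or base64, so Content-Transfer-Encoding on its own may be a noisy signal. This sketch therefore looks at the charset declared in Content-Type instead. The `EASTERN_CHARSETS` list and the `looks_foreign` name are assumptions made up for this example; adjust the list to whatever actually shows up in the customer's headers.

```python
from email import message_from_string

# Hypothetical blocklist of charsets commonly declared by Korean, Japanese
# and Chinese mail. This list is an assumption; extend it as needed.
EASTERN_CHARSETS = {
    "euc-kr", "ks_c_5601-1987", "iso-2022-kr",
    "iso-2022-jp", "shift_jis", "euc-jp",
    "gb2312", "gbk", "big5",
}


def looks_foreign(raw_message: str) -> bool:
    """Return True if any text part declares a charset on the blocklist."""
    msg = message_from_string(raw_message)
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        # get_content_charset() returns the charset lowercased, or None
        # when the Content-Type header does not declare one.
        charset = part.get_content_charset()
        if charset in EASTERN_CHARSETS:
            return True
    return False
```

Messages sent as UTF-8 will not be caught this way, because the declared charset says nothing about the language of the body.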
  3. casey

    casey Well-Known Member

    Look into the SpamAssassin config as well:

    # Speakers of Asian languages, like Chinese, Japanese and Korean, will almost
    # definitely want to uncomment the following lines. They will switch off some
    # rules that detect 8-bit characters, which commonly trigger on mails using CJK
    # character sets, or that assume a western-style charset is in use.
    # score HTML_COMMENT_8BITS 0
    # score UPPERCASE_25_50 0
    # score UPPERCASE_50_75 0
    # score UPPERCASE_75_100 0
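
The rules quoted above key on 8-bit characters in the message. A similar idea can be applied to the decoded text directly: measure how much of it falls in Hangul, kana or CJK ideograph ranges. The following is only a sketch in Python, not a SpamAssassin rule; the `cjk_ratio` name and any cut-off you compare it against are assumptions for illustration.

```python
def cjk_ratio(text: str) -> float:
    """Fraction of non-whitespace characters in Hangul, kana or CJK ranges."""
    ranges = (
        (0x1100, 0x11FF),  # Hangul Jamo
        (0x3040, 0x30FF),  # Hiragana and Katakana
        (0x4E00, 0x9FFF),  # CJK Unified Ideographs
        (0xAC00, 0xD7A3),  # Hangul Syllables
    )
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    hits = sum(
        1 for c in chars
        if any(lo <= ord(c) <= hi for lo, hi in ranges)
    )
    return hits / len(chars)
```

A message whose subject or body scores above some threshold (say 0.3, an arbitrary value) could then be discarded or filed away. This also covers mail sent as UTF-8, where the declared charset gives no hint about the language.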
