[Python-projects] Pylint : Non-ASCII characters count double

Aurélien Campéas aurelien.campeas at logilab.fr
Thu Jan 31 11:08:16 CET 2008


On Wed, Jan 30, 2008 at 06:43:32PM +0100, Lubin Fayolle wrote:
> Hello,
> 
> I have just understood why some of my lines were deemed too long by
> pylint. For instance, when checking the below script:
> 
> <BEGINNING OF SCRIPT>
> # -*- coding: utf-8 -*-
> 
> """A little script to demonstrate a little bug."""
> 
> print "------------------------------------------------------------------------"
> print "-----------------------------------------------------------------------é"
> 
> <END OF SCRIPT>
> 
> Pylint returns the following warning:
> 
>      myscript.py:6: [C] Line too long (81/80)
> 
> where line 6 corresponds to the second call of print. The two lines are
> the same length though... It seems that 'é' counts double, like many (any?)
> other non-ASCII characters.
> 

>>> u'é'.encode('utf-8')
'\xc3\xa9'
>>> len(u'é'.encode('utf-8'))
2

UTF-8 is a variable-length encoding. Everything outside pure ASCII (7
bits) takes more than one byte.
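
To make the 81/80 figure concrete, here is a minimal sketch. It is written
for Python 3 (the session above is Python 2) and it assumes that the length
check is applied to the encoded bytes of the line rather than to its
characters. The helper name char_and_byte_len is made up for this
illustration and is not part of pylint; the two lines are rebuilt to be 80
characters long, as the report implies.

```python
def char_and_byte_len(line, encoding="utf-8"):
    """Return (number of characters, number of encoded bytes) for a line."""
    return len(line), len(line.encode(encoding))


# Two 80-character lines modelled on the script in the report:
# one pure ASCII, one ending with a single non-ASCII character.
ascii_line = 'print "' + "-" * 72 + '"'
accented_line = 'print "' + "-" * 71 + "é" + '"'

for line in (ascii_line, accented_line):
    chars, nbytes = char_and_byte_len(line)
    print("%d characters, %d bytes" % (chars, nbytes))
```

Both lines have the same number of characters, but the second one is one
byte longer once encoded as UTF-8, which matches the count in the warning.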

> Thought it was worth mentioning even if it is only an annoyance...

By changing the encoding of your file to ISO-8859-1 (for instance) you
would avoid this annoyance (I think).
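
The reasoning behind that suggestion, as a small Python 3 sketch: ISO-8859-1
is a single-byte encoding, so 'é' is stored as one byte instead of two. This
only shows the byte counts; whether the warning actually goes away is still
an assumption, and the coding line at the top of the file would have to be
changed to match. The helper name encoded_len is hypothetical.

```python
def encoded_len(text, encoding):
    """Number of bytes needed to store text in the given encoding."""
    return len(text.encode(encoding))


print(encoded_len("é", "utf-8"))       # two bytes in UTF-8
print(encoded_len("é", "iso-8859-1"))  # one byte in ISO-8859-1

# The catch: ISO-8859-1 only covers 256 characters, so anything
# outside that range (the euro sign, for example) cannot be encoded.
try:
    encoded_len("€", "iso-8859-1")
except UnicodeEncodeError:
    print("not representable in ISO-8859-1")
```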

> 
> Best regards,
> 
> Lubin Fayolle
> _______________________________________________
> Python-Projects mailing list
> Python-Projects at lists.logilab.org
> http://lists.logilab.org/mailman/listinfo/python-projects
> 

