[Python-projects] Pylint : Non-ASCII characters count double
Sylvain Thénault
sylvain.thenault at logilab.fr
Thu Jan 31 11:10:51 CET 2008
On Thu, Jan 31, 2008 at 11:08:16AM +0100, Aurélien Campéas wrote:
> On Wed, Jan 30, 2008 at 06:43:32PM +0100, Lubin Fayolle wrote:
> > Hello,
> >
> > I have just understood why some of my lines were deemed too long by
> > pylint. For instance, when checking the below script:
> >
> > <BEGINNING OF SCRIPT>
> > # -*- coding: utf-8 -*-
> >
> > """A little script to demonstrate a little bug."""
> >
> > print "------------------------------------------------------------------------"
> > print "-----------------------------------------------------------------------é"
> >
> > <END OF SCRIPT>
> >
> > Pylint returns the following warning:
> >
> > myscript.py:6: [C] Line too long (81/80)
> >
> > where line 6 corresponds to the second call of print. The two lines are
> > the same length though... It seems that 'é' counts double, like many (any?)
> > other non-ASCII characters.
> >
>
> >>> u'é'.encode('utf-8')
> '\xc3\xa9'
> >>> len(u'é'.encode('utf-8'))
> 2
>
> UTF-8 is a variable-length encoding. Every character outside pure ASCII
> (7 bits) takes more than one byte.
>
> > Thought it was worth mentioning even if it is only an annoyance...
>
> By changing the encoding of your file to ISO-8859-1 (for instance) you
> would avoid this annoyance (I think).
True. Though this is still a pylint bug as well...
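
To make the difference concrete, here is a minimal sketch (not pylint's
actual code; `line_lengths` and `MAX_LINE_LENGTH` are names made up for this
example) that measures the offending line both in bytes and in characters.
It assumes the line is decoded with the encoding declared in the file
before its length is taken:

```python
# -*- coding: utf-8 -*-
"""Sketch: count characters, not bytes, when checking line length."""

MAX_LINE_LENGTH = 80  # limit taken from the "(81/80)" message quoted above


def line_lengths(raw_line, encoding="utf-8"):
    """Return (byte_length, character_length) for one encoded source line."""
    text = raw_line.decode(encoding)  # bytes -> unicode text
    return len(raw_line), len(text)


# Rebuild the second print line of the script: 71 dashes followed by 'é'.
source_line = u'print "' + u"-" * 71 + u'\u00e9"'

# Same line stored as UTF-8 and as ISO-8859-1.
utf8_bytes, utf8_chars = line_lengths(source_line.encode("utf-8"))
latin1_bytes, latin1_chars = line_lengths(
    source_line.encode("iso-8859-1"), "iso-8859-1"
)

# Only the UTF-8 byte count exceeds the limit; the character count does not.
too_long_by_bytes = utf8_bytes > MAX_LINE_LENGTH
too_long_by_chars = utf8_chars > MAX_LINE_LENGTH
```

Counting the decoded characters gives 80 for both encodings, while the raw
UTF-8 byte count is 81, which matches the warning reported above and
explains why switching to ISO-8859-1 hides the problem.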
--
Sylvain Thénault LOGILAB, Paris (France)
Python, Zope, Plone, Debian training: http://www.logilab.fr/formations
Custom software development: http://www.logilab.fr/services
Python and scientific computing: http://www.logilab.fr/science