[Webkit-unassigned] [Bug 37765] REGRESSION(57531): the commit-queue still hates Tor Arne Vestbø

Sun Apr 18 22:20:33 PDT 2010

https://bugs.webkit.org/show_bug.cgi?id=37765

--- Comment #23 from Eric Seidel <eric at webkit.org>  2010-04-18 22:20:31 PST ---
(In reply to comment #21)
> (From update of attachment 53626 [details])
> In response to comment #2, I think it would work if you use decode instead of
> encode (the string you passed in is already utf8 encoded, you want to convert
> it to a unicode object, right?).

My example was slightly confusing, and I was myself slightly confused.  Now
that I see "str" as an array of bytes and not actually a string type the world
makes a lot more sense.  I was attempting to unconditionally encode any "string
like object" to a utf-8 bytestream before passing off to the Popen call.  This
meant that if I was passing around utf8-encoded str objects, and tried to call
"encode" on them, it would scream at me.

> FWIW, I would probably have tried to keep everything as a utf8 encoded str, but
> using unicode everywhere seems fine too (you get some better type checking this
> way).  In either case, thanks for taking this on.

Yes, it's true, that could work.  However there are already many APIs which
feed us unicode() objects and so we'd have to be careful to be sure to convert
those back to utf-8 byte stream (str) objects before passing them around or
comparisons would fail.

I decided to go with the unicode() solution, mostly because it seemed the most
consistent in my head.  Turns out it's also the recommended way to prepare for
Python 3.0 (which we'll eventually get to in a year or more). :)

> > -        first_line = changelog_file.readline()
> > +        first_line = unicode(changelog_file.readline(), "utf-8")
> 
> Nit: I find using .decode('utf-8') easier to read than calling unicode(), but
> either is fine with me.  There's another instance of this in the same file.

Yes.  The only difference is handling of None.  However I agree in this case
decode is definitely nicer.

> >      def _parse_attachment_element(self, element, bug_id):
> > +
> 
> Nit: Did you mean to add this blank line?

Nope.  Will fix.

> > -            subprocess.call(('taskkill.exe', '/f', '/pid', str(pid)),
> > +            subprocess.call(('taskkill.exe', '/f', '/pid', unicode(pid)),
> 
> Nit: This can probably stay as str().  It's a bit weird to have the other
> strings be str and the pid be unicode.

It certainly could.  I was trying to move away from ever using the str()
constructor.  str() is basically always bad news, even when we use it to mean
"make this number a string".

-- 
Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.