UnicodeWarning in Python 2.7: "Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal"


quora.py:237: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal

 for document in raw_corpus]

('\xe8\xaf\x8d\xe5\x85\xb8\xe9\x95\xbf\xe5\xba\xa6---->%d', 79)

Dictionary(79 unique tokens: [u'pvp', u'600', u'211', u'10', u'YY']...)

Unicode warning: during a Unicode equality comparison, converting both arguments to Unicode failed, so Python treats the two values as unequal.

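In Python 2 (on Windows as elsewhere), the default encoding used to implicitly convert a str to unicode is ASCII, whereas the byte strings in this script are UTF-8 encoded, so the implicit conversion fails as soon as a str contains non-ASCII bytes.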
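The failure behind the warning can be shown directly. This is a minimal sketch, not code from quora.py: `decodes_as_ascii` is a hypothetical helper, and the sample bytes are the first two characters of the UTF-8 string printed in the output above. When Python 2 compares a str with a unicode object, it first tries to decode the str with the default encoding (ASCII), which is the same decode attempted here.

```python
from __future__ import print_function

# UTF-8 bytes taken from the start of the string printed in the output above
raw = b'\xe8\xaf\x8d\xe5\x85\xb8'


def decodes_as_ascii(data):
    """Return True if the byte string can be decoded with the ASCII codec."""
    try:
        data.decode('ascii')
        return True
    except UnicodeDecodeError:
        return False


print(decodes_as_ascii(b'pvp'))  # pure ASCII bytes
print(decodes_as_ascii(raw))     # contains bytes >= 0x80
```

Tokens such as u'pvp' compare without trouble because their bytes are pure ASCII. Only str values containing non-ASCII bytes trigger the warning.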

Fix: add the following lines of code

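 import sys

 # reload() restores sys.setdefaultencoding, which site.py removes at startup
 reload(sys)
 # make implicit str <-> unicode conversions use UTF-8 instead of ASCII
 sys.setdefaultencoding('utf8')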
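Another option is to leave the process-wide default encoding alone and decode byte strings explicitly before comparing them. The sketch below assumes the data is UTF-8 encoded. `to_unicode` and `same_text` are hypothetical helper names, not part of quora.py or any library.

```python
from __future__ import print_function


def to_unicode(value, encoding='utf-8'):
    """Return value as text, decoding byte strings with the given encoding."""
    if isinstance(value, bytes):  # bytes is an alias of str on Python 2
        return value.decode(encoding)
    return value


def same_text(left, right, encoding='utf-8'):
    """Compare two values as text after decoding any byte strings."""
    return to_unicode(left, encoding) == to_unicode(right, encoding)


raw = b'\xe8\xaf\x8d\xe5\x85\xb8'  # UTF-8 encoded byte string
text = u'\u8bcd\u5178'             # the same two characters as unicode

print(same_text(raw, text))
```

Decoding at the point where data enters the program (reading files, building `raw_corpus`) keeps every later comparison unicode-to-unicode, so the implicit ASCII conversion never runs.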