I have looked through several threads about groupby(), but I cannot find what is wrong with my example:
import itertools

students = [{'name': 'Paul', 'mail': '@gmail.com'},
            {'name': 'Tom', 'mail': '@yahoo.com'},
            {'name': 'Jim', 'mail': 'gmail.com'},
            {'name': 'Jules', 'mail': '@something.com'},
            {'name': 'Gregory', 'mail': '@gmail.com'},
            {'name': 'Kathrin', 'mail': '@something.com'}]
key_func = lambda student: student['mail']
for key, group in itertools.groupby(students, key=key_func):
    print(key)
    print(list(group))
This prints each student separately. Why don't I get just 3 groups: @gmail.com, @yahoo.com and @something.com?
Answer:
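For starters, one of the mails is gmail.com (Jim's) while the others are @gmail.com, which is why they are treated as separate groups.
groupby also expects the data to be pre-sorted by the same key function. It starts a new group every time the key changes, which explains why you get @gmail.com and @something.com more than once.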
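To make the sorting requirement concrete, here is a minimal sketch on a hypothetical list of plain mail strings (not the original student data). It only counts how many items land in each group, before and after sorting:

```python
import itertools

# Hypothetical, unsorted list of mail strings used only for illustration.
mails = ['@gmail.com', '@yahoo.com', '@gmail.com', '@gmail.com']

# groupby only merges *adjacent* equal keys, so '@gmail.com' appears twice.
unsorted_runs = [(key, len(list(group)))
                 for key, group in itertools.groupby(mails)]
print(unsorted_runs)
# [('@gmail.com', 1), ('@yahoo.com', 1), ('@gmail.com', 2)]

# Once sorted, equal keys sit next to each other and form a single group.
sorted_runs = [(key, len(list(group)))
               for key, group in itertools.groupby(sorted(mails))]
print(sorted_runs)
# [('@gmail.com', 3), ('@yahoo.com', 1)]
```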
From the docs:
… Generally, the iterable needs to already be sorted on the same key function. …
import itertools

students = [{'name': 'Paul', 'mail': '@gmail.com'}, {'name': 'Tom', 'mail': '@yahoo.com'},
            {'name': 'Jim', 'mail': 'gmail.com'}, {'name': 'Jules', 'mail': '@something.com'},
            {'name': 'Gregory', 'mail': '@gmail.com'}, {'name': 'Kathrin', 'mail': '@something.com'}]
key_func = lambda student: student['mail']
# sort by the same key function that groupby uses below
students.sort(key=key_func)
for key, group in itertools.groupby(students, key=key_func):
    print(key)
    print(list(group))
# @gmail.com
# [{'name': 'Paul', 'mail': '@gmail.com'}, {'name': 'Gregory', 'mail': '@gmail.com'}]
# @something.com
# [{'name': 'Jules', 'mail': '@something.com'}, {'name': 'Kathrin', 'mail': '@something.com'}]
# @yahoo.com
# [{'name': 'Tom', 'mail': '@yahoo.com'}]
# gmail.com
# [{'name': 'Jim', 'mail': 'gmail.com'}]
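The leftover gmail.com group comes from the missing '@' in the data. If correcting the data itself is not an option, one possible workaround is to normalize the value inside the key function. The sketch below assumes a hypothetical helper, normalized_mail, and a shortened student list; it is an illustration rather than the only way to do this:

```python
import itertools

# Shortened, hypothetical sample with one inconsistent address.
students = [{'name': 'Paul', 'mail': '@gmail.com'},
            {'name': 'Jim', 'mail': 'gmail.com'},
            {'name': 'Tom', 'mail': '@yahoo.com'}]

def normalized_mail(student):
    # Hypothetical helper: drop any leading '@' and add exactly one back,
    # so 'gmail.com' and '@gmail.com' produce the same key.
    return '@' + student['mail'].lstrip('@')

# Sort and group with the same normalizing key function.
students.sort(key=normalized_mail)
grouped = {key: [s['name'] for s in group]
           for key, group in itertools.groupby(students, key=normalized_mail)}
print(grouped)
# {'@gmail.com': ['Paul', 'Jim'], '@yahoo.com': ['Tom']}
```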
After fixing both the sort and the gmail.com/@gmail.com mismatch, we get the expected output:
import itertools

students = [{'name': 'Paul', 'mail': '@gmail.com'}, {'name': 'Tom', 'mail': '@yahoo.com'},
            {'name': 'Jim', 'mail': '@gmail.com'}, {'name': 'Jules', 'mail': '@something.com'},
            {'name': 'Gregory', 'mail': '@gmail.com'}, {'name': 'Kathrin', 'mail': '@something.com'}]
key_func = lambda student: student['mail']
students.sort(key=key_func)
for key, group in itertools.groupby(students, key=key_func):
    print(key)
    print(list(group))
# Output (each list wrapped here for readability):
# @gmail.com
# [{'name': 'Paul', 'mail': '@gmail.com'},
#  {'name': 'Jim', 'mail': '@gmail.com'},
#  {'name': 'Gregory', 'mail': '@gmail.com'}]
# @something.com
# [{'name': 'Jules', 'mail': '@something.com'},
#  {'name': 'Kathrin', 'mail': '@something.com'}]
# @yahoo.com
# [{'name': 'Tom', 'mail': '@yahoo.com'}]
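If the sorted order is not needed, a different technique that avoids groupby altogether is to collect the students in a dictionary keyed by mail. This is a minimal sketch using collections.defaultdict, shown as an alternative rather than as part of the groupby fix; the variable name groups is just illustrative:

```python
from collections import defaultdict

students = [{'name': 'Paul', 'mail': '@gmail.com'}, {'name': 'Tom', 'mail': '@yahoo.com'},
            {'name': 'Jim', 'mail': '@gmail.com'}, {'name': 'Jules', 'mail': '@something.com'},
            {'name': 'Gregory', 'mail': '@gmail.com'}, {'name': 'Kathrin', 'mail': '@something.com'}]

# Each new key starts with an empty list; no sorting is required.
groups = defaultdict(list)
for student in students:
    groups[student['mail']].append(student['name'])

print(dict(groups))
# {'@gmail.com': ['Paul', 'Jim', 'Gregory'], '@yahoo.com': ['Tom'], '@something.com': ['Jules', 'Kathrin']}
```

Here the groups appear in order of first occurrence instead of sorted order, and the whole input is held in memory, whereas groupby on sorted data yields one group at a time.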