python - Remove elements that appear more than once from a numpy array

2024-03-20 21:48:22

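The question is how to completely remove elements that occur more than once in an array. Below is an approach that is very slow for large arrays.
Any ideas for doing this better? Thanks in advance.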

import numpy as np

count = 0
result = []
input = np.array([[1,1], [1,1], [2,3], [4,5], [1,1]]) # array with points [x, y]

# count appearance of elements with same x and y coordinate
# append to result if element appears just once

for i in input:
    for j in input:
        if (j[0] == i[0]) and (j[1] == i[1]):
            count += 1
    if count == 1:
        result.append(i)
    count = 0

print(np.array(result))

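Update: because I oversimplified

To be clear once more: how do I remove elements from an array/list that occur more than once with respect to a certain attribute? Here: a list whose elements have length 6; if the first and second entries of an element together appear more than once in the list, all the elements concerned should be removed from the list. I hope I'm not being confusing. Eumiro helped me a lot with this, but I didn't manage to flatten the output list the way it should be :(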

import collections.abc

import numpy as np

input = [[1,1,3,5,6,6],[1,1,4,4,5,6],[1,3,4,5,6,7],[3,4,6,7,7,6],[1,1,4,6,88,7],[3,3,3,3,3,3],[456,6,5,343,435,5]]

# input[0], input[1] and input[4] should be removed from input, because their
# first and second entries appear more than once in the list, got it? :)

d = {}

# group the remaining entries of each element by its first two entries
for a in input:
    d.setdefault(tuple(a[:2]), []).append(a[2:])

# keep only the keys that were seen exactly once
outputDict = [list(k) + list(v) for k, v in d.items() if len(v) == 1]

result = []

def flatten(x):
    if isinstance(x, collections.abc.Iterable):
        return [a for i in x for a in flatten(i)]
    else:
        return [x]

# I took flatten(x) from https://*.com/a/2158522/1132378
# And I need it, because output is a nested list :(

for i in outputDict:
    result.append(flatten(i))

print(np.array(result))

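So this works, but it is impractical for large lists.
First I got
    RuntimeError: maximum recursion depth exceeded in cmp
After applying
    sys.setrecursionlimit(10000)
I got
    Segmentation fault
How can I implement Eumiro's solution for large lists with > 100000 elements?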

Solution:

np.array(list(set(map(tuple, input))))

returns

array([[4, 5],
       [2, 3],
       [1, 1]])

Update 1: If you also want to remove [1,1] (because it appears more than once), you can do the following:

from collections import Counter

np.array([k for k, v in Counter(map(tuple, input)).items() if v == 1])

returns

array([[4, 5],
       [2, 3]])
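
As a possible alternative to the Counter approach in Update 1, the same filtering can be sketched with NumPy alone. This is only an illustration, not part of the original answer: it assumes a NumPy version whose np.unique accepts the axis argument, and the names points, unique_rows, counts and once are made up for the example. Note that the result comes back in sorted row order rather than in input order.

```python
import numpy as np

# same sample points as in the question
points = np.array([[1, 1], [1, 1], [2, 3], [4, 5], [1, 1]])

# unique rows, plus how many times each of them occurs in points
unique_rows, counts = np.unique(points, axis=0, return_counts=True)

# keep only the rows that occur exactly once
once = unique_rows[counts == 1]
print(once)
```

Because the counting happens inside NumPy, this avoids the Python-level double loop from the question.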

Update 2: with the input [[1,1,2],[1,1,3],[2,3,4],[4,5,5],[1,1,7]]:

input=[[1,1,2], [1,1,3], [2,3,4], [4,5,5], [1,1,7]]

d = {}
for a in input:
    d.setdefault(tuple(a[:2]), []).append(a[2])

d is now:

{(1, 1): [2, 3, 7],
 (2, 3): [4],
 (4, 5): [5]}

So we want to take all key-value pairs that have a single value and re-create the array:

np.array([k+tuple(v) for k,v in d.items() if len(v) == 1])

array([[4, 5, 5],
       [2, 3, 4]])
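
The grouping on the first two entries from Update 2 could also be done without a dictionary. The following is a rough sketch under the same assumption that np.unique supports axis; rows, inverse, counts and kept are hypothetical names chosen for the example.

```python
import numpy as np

rows = np.array([[1, 1, 2], [1, 1, 3], [2, 3, 4], [4, 5, 5], [1, 1, 7]])

# treat the first two columns as the key of each row
_, inverse, counts = np.unique(
    rows[:, :2], axis=0, return_inverse=True, return_counts=True
)

# flatten defensively: the shape of the inverse has varied between NumPy releases
inverse = inverse.reshape(-1)

# counts[inverse] is, for every row, how often its key occurs overall
kept = rows[counts[inverse] == 1]
print(kept)
```

Unlike the dictionary version, the boolean mask keeps the surviving rows in their original input order.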

Update 3: For larger arrays, you can adapt my previous solution to:

import numpy as np
input = [[1,1,3,5,6,6],[1,1,4,4,5,6],[1,3,4,5,6,7],[3,4,6,7,7,6],[1,1,4,6,88,7],[3,3,3,3,3,3],[456,6,5,343,435,5]]
d = {}
for a in input:
    d.setdefault(tuple(a[:2]), []).append(a)
np.array([v for v in d.values() if len(v) == 1])

array([[[456,   6,   5, 343, 435,   5]],
       [[  1,   3,   4,   5,   6,   7]],
       [[  3,   4,   6,   7,   7,   6]],
       [[  3,   3,   3,   3,   3,   3]]])
