How do I pass a list of lists through a for loop in Python?

I have a list of lists:

sample = [['TTTT', 'CCCZ'], ['ATTA', 'CZZC']]
count = [[4,3],[4,2]]
correctionfactor  = [[1.33, 1.5],[1.33,2]]

I calculate the frequency of each character (pi), square it, and sum the squares (then I calculate het = 1 - sum).
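For a single sequence, the calculation described above can be sketched roughly as follows. This is only a minimal illustration: the function name heterozygosity is made up for the example, and it assumes the frequencies are taken over the full sequence length (so it does not deal with 'Z' characters).

```python
def heterozygosity(seq):
    """Return 1 - sum(p_i ** 2) over the characters of seq."""
    counts = {}
    for base in seq:
        counts[base] = counts.get(base, 0) + 1
    # Assumption: frequencies are relative to the full sequence length.
    # float() keeps the division from truncating on Python 2.
    total = float(len(seq))
    return 1 - sum((n / total) ** 2 for n in counts.values())

print(heterozygosity('ATTA'))  # A and T each have frequency 0.5
```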

The desired output is [[1,2],[1,2]]. NOTE: these are NOT the real expected values. I just need the real values to be in this format.

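Problem: I don't know how to pass the lists of lists (sample, count) through this loop to extract the values I need. So far I have only used this code with a single flat list (e.g. ['TACT', 'TTTT', ..]).

I suspect I need to add a larger for loop that indexes over each element of sample (i.e. sample[0] = ['TTTT', 'CCCZ'] and sample[1] = ['ATTA', 'CZZC']). I don't know how to incorporate that into the code.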


list_of_hets = []
for idx, element in enumerate(sample):
    count_dict = {}
    square_dict = {}
    for base in list(element):
        if base in count_dict:
            count_dict[base] += 1
        else:
            count_dict[base] = 1
    for allele in count_dict: #Calculate frequency of every character
        square_freq = (count_dict[allele] / count[idx])**2 #Square the frequencies
        square_dict[allele] = square_freq        
    pf = 0.0
    for i in square_dict:
        pf += square_dict[i]   # pf --> pi^2 + pj^2...pn^2 #Sum the frequencies
    het = 1-pf                    
    list_of_hets.append(het)
print list_of_hets

"Failed" OUTPUT:
line 70, in <module>
square_freq = (count_dict[allele] / count[idx])**2
TypeError: unsupported operand type(s) for /: 'int' and 'list'

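Solution:

I'm not entirely clear on how you want to handle the 'Z' items in your data, but this code replicates the output for the sample data in https://eval.in/658468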

from __future__ import division

bases = set('ACGT')
#sample = [['TTTT', 'CCCZ'], ['ATTA', 'CZZC']]
sample = [['ATTA', 'TTGA'], ['TTCA', 'TTTA']]

list_of_hets = []
for element in sample:
    hets = []
    for seq in element:
        count_dict = {}
        for base in seq:
            if base in count_dict:
                count_dict[base] += 1
            else:
                count_dict[base] = 1
        print count_dict

        #Calculate frequency of every character
        count = sum(1 for u in seq if u in bases)
        pf = sum((base / count) ** 2 for base in count_dict.values())
        hets.append(1 - pf)
    list_of_hets.append(hets)

print list_of_hets

Output

{'A': 2, 'T': 2}
{'A': 1, 'T': 2, 'G': 1}
{'A': 1, 'C': 1, 'T': 2}
{'A': 1, 'T': 3}
[[0.5, 0.625], [0.625, 0.375]]

This code can be simplified further by using collections.Counter instead of count_dict.
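A possible version of that simplification is sketched below. The wrapper function hets_for and the constant name BASES are only illustrative, and float() is used in place of the __future__ import so the snippet behaves the same on Python 2 and 3. Otherwise it follows the same logic as the loop above.

```python
from collections import Counter

BASES = set('ACGT')

def hets_for(sample):
    """Return one list of heterozygosity values per inner list of sample."""
    result = []
    for element in sample:
        hets = []
        for seq in element:
            # Counter replaces the manual count_dict bookkeeping.
            counts = Counter(seq)
            # Only characters in BASES contribute to the denominator.
            total = float(sum(1 for u in seq if u in BASES))
            pf = sum((n / total) ** 2 for n in counts.values())
            hets.append(1 - pf)
        result.append(hets)
    return result

print(hets_for([['ATTA', 'TTGA'], ['TTCA', 'TTTA']]))
```

For the sample data above this gives the same nested list, [[0.5, 0.625], [0.625, 0.375]].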

By the way, if the symbol that isn't in 'ACGT' is always 'Z', we can speed up the count calculation. Get rid of bases = set('ACGT') and change

count = sum(1 for u in seq if u in bases)

to

count = sum(1 for u in seq if u != 'Z')
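
As for the TypeError in the question: count[idx] is itself a list (for example [4, 3]), so it cannot be used as a divisor directly. If you would rather keep the precomputed count list than recompute the totals, one option is to walk both nested lists in parallel with zip. The sketch below is only a guess at the intended behaviour: it assumes 'Z' should be left out of the tallies, and the function name hets_with_counts is made up for the example.

```python
from collections import Counter

sample = [['TTTT', 'CCCZ'], ['ATTA', 'CZZC']]
count = [[4, 3], [4, 2]]

def hets_with_counts(sample, count):
    """Pair every sequence with its precomputed total from count."""
    result = []
    # The outer zip pairs each inner list of sequences with its inner list of totals.
    for element, element_counts in zip(sample, count):
        hets = []
        # The inner zip pairs each sequence with a single integer total.
        for seq, total in zip(element, element_counts):
            # Assumption: 'Z' is ignored when tallying characters.
            tallies = Counter(u for u in seq if u != 'Z')
            pf = sum((n / float(total)) ** 2 for n in tallies.values())
            hets.append(1 - pf)
        result.append(hets)
    return result

print(hets_with_counts(sample, count))
```

The result has the nested shape asked for in the question, one inner list per inner list of sample.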