all permutations from 2 lists but with a condition on the amount of elements(来自2个列表的所有排列，但以元素数量为条件)

本文介绍了来自2个列表的所有排列，但以元素数量为条件的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

在Python中我有：

a = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
b = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

我正在尝试构建从列表a或b中选择的10个字符的所有可能排列，其中将包括任何排列，其中至少3个元素来自a，至少3个元素来自b。这些元素可以重复，因为它们是从列表中选择的。

例如，有效排列：wa85ff3d56、333aaaaaaa、a33333aa33

做到这一点最有效的方法是什么？ PS：是的，我知道，答案将是非常大的。这是意料之中的。它不会存储在内存中，而是会在实时计算中进行流式处理和使用。

目前，我生成所有可能的组合，然后对它们进行置换。然而，计算几乎肯定需要几个月的时间。我有这样一段代码，它确实有效，但显然不是最有效的方式。例如，对于列表a中的3个和列表b中的7个，它生成2.864429568e+14排列。考虑到我必须这样做5次(3,7), (4,6), (5,5), (6,4), (7,3)，我总共得到了1.432214784e+15个排列。相比之下，如果没有限制，就会有36^10 = 3.65615844e+15。因此，此方法将删除几乎一半的可能排列。

import string
import itertools


a = list(string.digits)
b = list(string.ascii_lowercase)

a_combinaisons_3 = list(itertools.combinations(a, 3))
b_combinaisons_7 = list(itertools.combinations(b, 7))

a3b7_combinaisons = []

for a_element in a_combinaisons_3:
    for b_element in b_combinaisons_7:
        a3b7_combinaisons.append(a_element + b_element)

a3b7_permutations = list(itertools.permutations(a3b7_combinaisons))
print(len(a3b7_combinaisons))  # 78936000
print(len(a3b7_permutations))  # 1.432214784e+15

使用`itertools`

结合使用Python中的基本知识

重新排序和替换(重复)

了解itertools提供的各种combinatoric iterators如何工作以及它们之间的区别非常重要。

让我们考虑一个简单的候选集合A = [1, 2, 3, 4]，我们要从其中抽取包含三个元素的组合(如提问者通常所说)。

>>> A = [1,2,3,4]

itertools.combinations选择时不重新排序(即，每个输出值将以与输入中相同的顺序显示)，并且不重复结果值。因此，这只产生4个结果：

>>> list(itertools.combinations(A, 3))
[(1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4)]

itertools.permutations表示结果可以以任何顺序出现，即输入的给定子集将以不同的顺序在输出中多次出现。

>>> list(itertools.permutations(A, 3)) # 24 results omitted for brevity

这四个combinations以六个顺序显示，总共有24个结果。

itertools.combinations_with_replacement选择而不重新排序，但允许多次选择输入中的元素：

>>> list(itertools.combinations_with_replacement(A, 3)) # 20 results omitted for brevity

有四个结果，其中每个元素被选择三次，六个双精度后面跟一个单精度(必须更高)，六个单精度后跟一个双精度，加上所有单精度的四个combinations。在一般情况下，计算结果并非易事。

如果要在输出结果中允许重复和重新排序，可以使用itertools.product：

>>> list(itertools.product(A, repeat=3)) # 64 results omitted for brevity

简单地说，我们每三次从四个元素中自由选择一次，得到4*4*4=64个结果。

itertools.product实现了所谓的Cartesian product。通常，它从多个指定的序列中各提取一个元素。但是，使用替换生成&置换等同于多次从同一序列中提取一个元素，因此提供了repeat关键字作为此操作的便捷快捷方式-因此您不必多次指定序列。我的意思是，你可以写itertools.product(*[A]*3)，但那很难看。

候选集中的重复项怎么办？

这与OP提出的问题无关；但为了完整性，重要的是要了解这些函数都不关心候选集合的元素是否相等，甚至相同：

>>> x = object()
>>> candidates = [x, x, x, x]
>>> results = list(itertools.combinations(candidates, 3))
>>> len(results)
4
>>> results[0] == results[1] == results[2] == results[3]
True

如何实现约束？

最简单的方法是生成一个包含性结果(在OP的例子中，通过将a和b候选连接在一起，并生成其中10个候选的product)，并过滤掉不符合要求的内容。然而，这是低效的，而且可能很难看(我们需要分析输出元组以确定其元素是否满足条件-在OP的例子中，它们是从a还是从b提取的。如果您确实想采用这样的方法，我通常建议您编写一个生成器函数：

def OP_problem():
    for result in itertools.product(a+b, repeat=10):
        a_count = len(x for x in result if x in a)
        # the trick is that every element was either from a or b;
        # so "at least 3 a's and at least 3 b's" is equivalent to
        # "at least 3 a's and at most 7 a's".
        if 3 <= a_count <= 7:
            yield result

或者在足够简单的情况下，生成器表达式：

(
    # maybe you don't think this is simple enough :)
    result for result in itertools.product(a+b, repeat=10)
    if 3 <= len(x for x in result if x in a) <= 7
)

通常最好尝试将问题分解成小块，然后将结果放在一起。例如，文档中有一个计算power sets的配方，它简单地将0元素、1元素等组合的结果链接到所有元素。(另一种方法是找到N个布尔值的笛卡尔乘积，每个布尔值表示是否包括给定的元素，然后将它们转换为子集。)

在我们的例子中，我们可以分别找到每个a元素计数的结果。让我们考虑一下每个列表中有5个元素的情况；应该清楚如何泛化并组合结果(提示：使用itertools.chain.from_iterable，如文档中的powerset配方所示)。

疑难情况：分区和位置选择

我想在这里展示一种高级技术，以解决从a中选择5个元素和从b中选择5个元素并将它们混合的问题。这个想法很简单，我们从所有可能的位置(即索引到10个元素的序列)中选择a将去的位置，并为每个位置生成相应的输出结果。这些职位是没有替代的组合(你知道为什么吗？)可能的索引值。

因此，如下所示：
def make_combination(letter_positions, chosen_letters, chosen_digits): result = [None] * 10 for letter, position in zip(chosen_letters, letter_positions): result[position] = letter # Figure out where the digits go, using set arithmetic to find the # remaining positions, then putting them back in ascending order. digit_positions = sorted(set(range(10)) - set(chosen_letters)) for digit, position in zip(chosen_digits, digit_positions): result[position] = digit assert None not in result return tuple(result) def five_letters_and_five_digits(): letters = 'abcdefghijklmnopqrstuvwxyz' digits = '0123456789' # It's not *easy*, but it's fairly elegant. # We separately generate the letter positions, letter selection # and digit selection, using `product` to get the cartesian product # of *those* possibilities; then for each of those, we translate # into a desired output - using `starmap` to iterate. return itertools.starmap( make_combination, itertools.product( itertools.combinations(range(10), 5), itertools.product(letters, repeat=5), itertools.product(digits, repeat=5) ) )
选择位置的技巧对于解决partitioning问题也很有用。这个想法很简单，你选择分区的位置(对于N个元素，通常有N-1个位置来放置它们)，作为组合-或者有(如果允许零大小的分区)，或者没有(否则)替换。

这篇关于来自2个列表的所有排列，但以元素数量为条件的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持编程学习网！