How to understand the following C# linq code that implements an algorithm to return all combinations of k elements from n
Can anyone elaborate on this code, or even provide a non-Linq version of this algorithm:
public static IEnumerable<IEnumerable<T>> Combinations<T>
(this IEnumerable<T> elements, int k)
{
return k == 0 ? new[] { new T[0] }
: elements.SelectMany(
(e, i) =>
elements
.Skip(i + 1)
.Combinations(k - 1)
.Select(c => (new[] {e}).Concat(c)));
}
The best way to understand this code is to read Eric Lippert's excellent serial:
- Production portfolio, part one
- Production Portfolio, Part II
- Production Portfolio, Part 3
- Production Portfolio, Part IV
- Production Portfolio, Part V
Basically, if we have IEnumerable
5 items, and we want to make all combinations of size 3, we need to produce the following:
{
// 50, 60, 70, 80, 90
{50, 60, 70}, // T T T F F
{50, 60, 80}, // T T F T F
{50, 60, 90}, // T T F F T
{50, 70, 80}, // T F T T F
{50, 70, 90}, // T F T F T
{50, 80, 90}, // T F F T T
{60, 70, 80}, // F T T T F
{60, 70, 90}, // F T T F T
{60, 80, 90}, // F T F T T
{70, 80, 90} // F F T T T
}
Eric's recursive implementation:
// Takes integers n and k, both non-negative.
// Produces all sets of exactly k elements consisting only of
// integers from 0 through n - 1.
private static IEnumerable<TinySet> Combinations(int n, int k)
{
// Base case: if k is zero then there can be only one set
// regardless of the value of n: the empty set is the only set
// with zero elements.
if (k == 0)
{
yield return TinySet.Empty;
yield break;
}
// Base case: if n < k then there can be no set of exactly
// k elements containing values from 0 to n - 1, because sets
// do not contain repeated elements.
if (n < k)
yield break;
// A set containing k elements where each is an integer from
// 0 to n - 2 is also a set of k elements where each is an
// integer from 0 to n - 1, so yield all of those.
foreach(var r in Combinations(n-1, k))
yield return r;
// If we add n - 1 to all the sets of k - 1 elements where each
// is an integer from 0 to n - 2, then we get a set of k elements
// where each is an integer from 0 to n - 1.
foreach(var r in Combinations(n-1, k-1))
yield return r.Add(n-1);
}
In your case, the code works like this:
return k == 0
// if we are done, return empty array
? new[] {new T[0]}
// for each element and each number from 0 to enumerable size
: elements.SelectMany((e, i) =>
elements
//skip first i elements, as we already produced combination with them
.Skip(i + 1)
//get all the combinations with size k - 1
.Combinations(k - 1)
//add current element to all produced combinations
.Select(c => (new[] {e}).Concat(c)));
Such non-recursive form of code will be huge and unreadable, try to understand recursion:
Suppose we have 5 elements IEnumerable
: { 16, 13, 2, 4, 100 }
, and all combinations of size 2 are required (the total of the result set equals the binomial coefficients from 5 to 2= 5! / (2! * 3!) = 10
)
Your code will produce:
- For
16
us, we need all combinations of sizes1
, starting from the second position: - For elements
13
we need all size combinations0
starting from the third position - First result:
{ 16, 13 }
- skip
13
, for elements2
we need all size combinations0
starting from the fourth position - Second result:
{ 16, 2 }
- skip
13, 2
, for elements4
we need all size combinations0
starting from the fifth position - Third result:
{ 16, 4 }
- skip
13, 2, 4
, for elements100
we need all size combinations0
starting from the sixth position - Fourth result:
{ 16, 100 }
- ...repeat all from above
13
,2
,4
:
{ 13, 2 }
,{ 13, 4 }
,{ 13, 100 }
,{ 2, 4 }
,{ 2, 100 }
,{ 4, 100 }
We got all 10 combinations we needed. The overload the code author is using is this :Enumerable.SelectMany<TSource, TResult> Method (IEnumerable<TSource>, Func<TSource, Int32, IEnumerable<TResult>>)
Selector
type:System.Func<TSource, Int32, IEnumerable<TResult>>
The transformation function applied to each source element;
the second parameter of the function represents the index of the source element.