# Collecting Items : The set

A set is a collection of items without a defined order. We don’t refer to items by their position in the sequence, since there’s no order. We can test for presence or absence, however.

We’ll look at what Python means by a Set in What does Python mean by “Set”?. We’ll show how to create a Set in How Do We Create a Set?. We’ll look at the various operations we can perform on a Set in Operations We Can Perform On A Set.

We’ll look at the comparison operators for sets in Comparing Sets: Subset and Superset. There are a large number of method functions, which we’ll look at in Method Functions of Sets. Some of the statements we’ve already seen interact with sets; we’ll look at this in Statements and Sets. We’ll look at some built-in functions in Built-in Functions For Sets. We’ll present a moderately complex example in Example of Using Sets.

## What does Python mean by “Set”?

A set is a collection of values. In a way, a set is the simplest possible collection. Recall that a list or tuple keeps the values in sequential order; items are identified by their position in the sequence. A set, on the other hand, doesn’t keep the values in any particular order, and you don’t identify the values by their position. A set, since it doesn’t have sequential positions, can only have one instance of each distinct object.

This definition of set parallels the mathematical definition used in set theory. This means that we have operators available to determine the common elements (“intersection”) between two sets, the union of two sets, and the differences between two sets.

A set is sometimes handy because we have to work with a collection of things where order doesn’t matter, but we want to be sure to avoid duplicates. For example, if we are sending letters to families of children in school, each child contributes one family to the set. If we have siblings in the school, we don’t want to include their family twice. This is the central idea behind accumulating a set: some elements may be mentioned more than once, but once is enough to be a member of a set.
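The mailing-list idea can be sketched in a few lines. The child and family names here are made up for illustration, and the sketch uses the add() method that is described later in this chapter.

```python
# Hypothetical enrollment data: (child, family) pairs.
enrollment = [
    ("Ann", "Baker"),
    ("Bob", "Baker"),
    ("Cal", "Diaz"),
    ("Dee", "Evans"),
    ("Eli", "Diaz"),
]

# Each child contributes one family; repeated families collapse automatically.
families = set()
for child, family in enrollment:
    families.add(family)

print(len(enrollment))   # 5 children
print(len(families))     # 3 families
print(sorted(families))  # sorted() gives a repeatable display order
```

Siblings mention the same family more than once, but the set holds each family only once.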

Mutability. We have to look at two aspects of mutability. The items in a set must be immutable: strings, numbers and tuples. We can’t easily create a set which contains a bunch of list objects.

Why not?

Let’s say we had two lists:

```list_one = [1, 2]
list_two = [1]
```

Assume, for the moment, that we could somehow create a set from these two lists.

What happens when we do this?

```list_two.append( 2 )
```

Oops. Now we have two lists in the set which appear identical.

This – clearly – must be forbidden. That’s one aspect of mutability: all items in a set must be immutable.
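A quick experiment shows how Python enforces this rule. This is only a sketch; the names rejected and ok are chosen for illustration.

```python
list_one = [1, 2]
list_two = [1]

try:
    broken = set([list_one, list_two])
except TypeError as error:
    # Python refuses: a list is mutable, so it can't be a set element.
    print("rejected:", error)
    rejected = True
else:
    rejected = False

# Tuples are immutable, so the equivalent tuples are accepted.
ok = set([tuple(list_one), tuple(list_two)])
print(len(ok))  # 2
```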

The second aspect of mutability concerns sets themselves. There are two flavors of sets: set and frozenset. The ordinary set is mutable, in the same way that a list is mutable. A frozenset, on the other hand, is immutable, more like a tuple.

As with tuples, we can create a new, larger frozenset from the union of two other frozensets. The original sets don’t change, but we can use them to create a new set.
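Here is a small sketch of that behavior. The variable names are illustrative; the | operator used here is covered in detail later in the chapter.

```python
small = frozenset([1, 2])
other = frozenset([2, 3])

bigger = small | other          # a new frozenset
print(sorted(bigger))           # [1, 2, 3]
print(sorted(small))            # [1, 2]; the original is unchanged
print(type(bigger).__name__)    # frozenset

# A frozenset has no add() method, so it can't be changed in place.
print(hasattr(small, "add"))    # False
```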

## How Do We Create a Set?

Sets are created using the set() or frozenset() factory functions, and that is the notation used throughout this chapter. (Python 2.7 and Python 3 also accept a set literal such as {1, 2, 3}; older versions have no literal notation for sets.) We can make sets out of lists or tuples using the set() factory function.

set(sequence) → set

Creates a set from the items in sequence. If the sequence is omitted, an empty set is created.

Duplicates will be removed, for example, set([1,1,2,3,5]) == set([1, 2, 3, 5]).

Also, the original order may not be preserved.

frozenset(sequence) → frozenset

Creates an immutable set from the items in sequence. Once created, this set can no longer be updated; it can be used like a tuple.

Here are some examples of creating sets.

```fib=set( [1,1,2,3,5,8,13] )
prime=set( [2,3,5,7,11,13] )
_= "now is the time for all good men to come to the aid of their party"
words=set( _.split() )
craps=set( [(1,1), (1,2), (2,1), (6,6)] )
```
• fib: This is a set of Fibonacci numbers. The value 1 is duplicated on the input sequence. The set can’t have duplicates, so the resulting set value will be set([1,2,3,5,8,13]).
• prime: This is a set of prime numbers. There are no duplicates in the input sequence, so the set has the same number of elements.
• words: This is a set of distinct words extracted from the phrase. The len(_.split()) is 16. Then len(words) is 14. If you check carefully, you’ll see that the strings 'to' and 'the' are duplicated in the input sequence.
• craps: This is a set of pairs of dice. On the first roll of a Craps game, if the shooter rolls any of these combinations, totalling 2, 3 or 12, the game is over, and the shooter has lost. Each element in the set is a 2-tuple made up of the two individual dice.

Tip

Debugging set()

A common mistake is to do something like set( 1, 2, 3 ), which passes three values to the set() function. If you get a TypeError: set expected at most 1 arguments, got n, you didn’t provide a single sequence, such as a list or tuple, to the set factory function.

Another interesting problem is the difference between set( ("word",) ) and set( "word" ).

• The first example provides a 1-element sequence, ("word",), to set(), which becomes a 1-element set.
• The second example passes a 4-character string, "word", which becomes a 4-element set.

In the case of creating sets from strings, there’s no error message. The question is really “what did you mean?” Did you intend to put the entire string into the set? Or did you intend to break the string down to individual characters, and put each character into the set?
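The difference is easy to see by checking the size and contents of each result. This is a minimal sketch with illustrative variable names.

```python
one_word = set(("word",))   # a 1-tuple containing the string
letters = set("word")       # the string itself, taken character by character

print(len(one_word))        # 1
print(len(letters))         # 4
print("word" in one_word)   # True
print("w" in letters)       # True
```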

## Operations We Can Perform On A Set

Sets have a large number of operators. Sets are widely-studied mathematical objects, and a number of those mathematical operations are defined in Python. There are four operations we can perform on sets: union (|), intersection (&), difference (-) and symmetric difference (^).

Important

But Wait!

You may recognize these operators (|, &, - and ^) from Special Ops : Binary Data and Operators. These symbols also stand for operators that apply to individual bits in an integer value.

Remember that Python examines the objects on either side of the operator to see what type of data object they are. When you write an expression that involves two sets, Python will do the set operations. When presented two integers, Python will do the special binary operations.
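The following sketch applies the same three symbols first to integers and then to sets. The sets a and b are small made-up examples.

```python
# With integers, the operators work on individual bits.
print(6 | 3)   # 7: bitwise OR of 0b110 and 0b011
print(6 & 3)   # 2: bitwise AND
print(6 ^ 3)   # 5: bitwise exclusive OR

# With sets, the same symbols perform set operations.
a = set([2, 4])
b = set([1, 2])
print(sorted(a | b))   # [1, 2, 4]
print(sorted(a & b))   # [2]
print(sorted(a ^ b))   # [1, 4]
```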

The | operator. The | operator computes the union of two sets; it computes a new set which has all the elements from the two sets which are being unioned. In essence, an element is a member of s1 | s2 if it is a member of s1 or a member of s2.

Here’s the Venn diagram that uses shading to show the elements which are in the union of two sets.

Here are some examples.

```>>> fib | prime
set([1, 2, 3, 5, 7, 8, 11, 13])
>>> fib | words
set([1, 2, 3, 5, 8, 'is', 'men', 13, 'good', 'aid', 'now', 'come', 'to', 'for', 'all', 'of', 'their', 'time', 'party', 'the'])
```

In the first example, we created a union of the fib set and the prime set. In the second example, we computed a fairly silly union that includes the fib set and the words set; since one set has numbers and the other set has strings, it’s not clear what we would do with this strange collection of unrelated things.

The union operator can also be written using method function notation.

```>>> fib.union( prime )
set([1, 2, 3, 5, 7, 8, 11, 13])
>>> words.union( fib )
set([1, 'all', 'good', 5, 'for', 'to', 8, 'of', 'is', 'men', 2, 13, 'their', 3, 'time', 'party', 'the', 'now', 'come', 'aid'])
```

Note that the two results of fib | words and words.union(fib) have the same elements in a different order. We can confirm that this is true with something like the following:

```>>> fib | words == words.union(fib)
True
>>> fib | words == words | fib
True
```

The above two expressions show us that the essential mathematical rules are true, even if the order of the elements is sometimes different.

The & operator. The & operator computes the intersection of two sets; it computes a new set which has only the elements which are common to the two sets which are being intersected. In essence, an element is a member of s1 & s2 if it is a member of s1 and a member of s2.

Here’s the Venn diagram that uses shading to show the elements which are in the intersection of two sets.

Here are some examples.

```>>> fib & prime
set([2, 3, 5, 13])
>>> fib & words
set([])
```

In the first example, we created an intersection of the fib set and the prime set. In the second example, we computed a fairly silly intersection that shows that there are no common elements between the fib set and the words set.

The intersection operator can also be written using method function notation.

```>>> prime.intersection( fib )
set([2, 3, 5, 13])
>>> words.intersection( fib )
set([])
```

The - operator. The - operator computes the difference between two sets; it computes a new set which starts with elements from the left-hand set and then removes all the matching elements from the right-hand set. It fits well with the usual sense of subtraction. In essence, an element is a member of s1 - s2 if it is a member of s1 and not a member of s2.

Here’s the Venn diagram that uses shading to show the elements which are in the difference, s1-s2.

Here are some examples.

```>>> fib-prime
set([8, 1])
>>> prime-fib
set([11, 7])
>>> fib-words
set([1, 2, 3, 5, 8, 13])
```

In the first example, we found the elements which are in the fib set, but not in the prime set. We can think of this as starting with the fib set and removing all the values that are in the prime set. In the second example, we found the elements which are in the prime set, but not in the fib set.

The third example shows the fib set with the words set removed. In this case, it’s still the same fib set. We can prove this by evaluating fib-words == fib.

The difference operator can also be written using method function notation.

```>>> prime.difference( fib )
set([11, 7])
>>> fib.difference( prime ) == fib-prime
True
```

The ^ operator. The ^ operator computes the “symmetric difference” between two sets; it computes a new set which has the elements that are in one set or the other, but not both. Since a union is elements which are in one set or the other, and an intersection is elements which are in both, the symmetric difference of two sets is (s1|s2)-(s1&s2). Rather than have to write this out, we have a pleasant short-hand operator.

Here’s the Venn diagram that uses shading to show the elements which are in the symmetric difference of two sets.

Here are some examples.

```>>> fib^prime
set([1, 7, 8, 11])
>>> fib^words
set([1, 'all', 'good', 5, 'for', 'to', 8, 'of', 'is', 'men', 2, 13, 'their', 3, 'time', 'party', 'the', 'now', 'come', 'aid'])
```

In the first example, we found the elements which are in the fib set or the prime set, but not both. In effect, a union is computed and the common elements removed from that union. In the second example, we found the elements which are in the fib set or the words set, but not both. In this case, there are no common elements, so the symmetric difference is the same as the union.

The symmetric difference operator can also be written using method function notation.

```>>> prime.symmetric_difference( fib )
set([1, 7, 8, 11])
>>> prime.symmetric_difference( fib ) == prime ^ fib
True
>>> prime ^ fib == (prime|fib)-(prime&fib)
True
```

## Comparing Sets: Subset and Superset

Some of the standard comparisons (<=, >=, ==, !=, in and not in) work with sets, but some of these operators have a meaning that’s appropriate to sets. For tuples and strings, where the order of the elements matters, the collections are compared element by element. For sets, the order of the elements doesn’t matter, so the comparisons have slightly different semantics.

The in and not in operators are the same as for other collections. They check to see if a given element is in the set or not in the set.

The following Venn diagram illustrates s2 being a subset of s1.

The set comparisons are equality and subset comparisons. Therefore, s1 <= s2 asks if set s1 is a subset of s2. The == and != operations do what you’d expect, comparing to see if the two sets have the same collection of elements.

```>>> diff = prime & fib
>>> diff <= prime
True
>>> diff <= fib
True
```

In this example, we computed the intersection of prime and fib, which was the small set of numbers common to both sets, set([2, 3, 5, 13]). This set, by definition, has to be a subset of both of the original sets.

As with other set operators, we also have method function notation for these operations.

```>>> "to" in words
True
>>> diff.issubset( prime )
True
>>> prime.issuperset( diff )
True
>>> (1,2) in craps
True
```
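One consequence of the subset meaning is worth seeing directly: unlike two numbers, two sets can be unequal while neither is a subset of the other. The sketch below rebuilds the fib and prime sets so it can run on its own; the name common is illustrative.

```python
fib = set([1, 1, 2, 3, 5, 8, 13])
prime = set([2, 3, 5, 7, 11, 13])

print(fib <= prime)   # False: 1 and 8 are not prime
print(prime <= fib)   # False: 7 and 11 are not Fibonacci numbers
print(fib == prime)   # False

common = fib & prime
print(common <= fib and common <= prime)   # True
print(common <= common)                    # True: every set is a subset of itself
```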

## Method Functions of Sets

We’ve already seen a large number of method functions that apply to sets. These method functions all compute new sets from two existing sets. In addition to these, there are method functions to change the elements in a set. Finally, there are also method functions for updating a set based on another set.

We’ll review the set operators first, since we’ve already seen them. We’ll presume that we have two sets, s1 and s2, for each of these functions.

Operators. These method functions are the same as the various set operators. They apply an operation between two sets and create a new set.

class set
set.union(s2) → set

Returns a new set which is the union of the distinct elements of s1 and s2. This can also be written as s1|s2.

```>>> set( [ "now", "is" ] ).union( set( [ "is", "the" ] ) )
set(['is', 'now', 'the'])
```
set.intersection(s2) → set

Returns a new set which is the intersection of the elements of s1 and s2. This is only the common elements to both sets. This can also be written s1&s2.

```>>> set( [ "now", "is" ] ).intersection( set( [ "is", "the" ] ) )
set(['is'])
```
set.difference(s2) → set

Returns a new set which has only the elements from s1 that are not also elements of s2. The new set is effectively a copy of s1 with elements from s2 removed. This can also be written s1-s2.

```>>> set( [ "now", "is" ] ).difference( set( [ "is", "the" ] ) )
set(['now'])
```
set.symmetric_difference(s2) → set

Returns a new set which has the elements that are in s1 or in s2, but not in both. The new set is effectively the union of s1 and s2 with the intersection elements removed. This can also be written as s1^s2.

```>>> set( [ "now", "is" ] ).symmetric_difference( set( [ "is", "the" ] ) )
set(['now', 'the'])
```

Accessors. These method functions parallel the comparison operators. They apply a comparison between two sets and create a boolean value.

class set
set.issubset(s2) → boolean

Returns True if s1 is a subset of s2. To be a subset, all elements of s1 must be present in s2. This can also be written as s1 <= s2.

```>>> set( [ "now", "is" ] ).issubset( set( [ "is", "now", "the" ] ) )
True
```
set.issuperset(s2) → boolean

Returns True if s1 is a superset of s2. To be a superset, all elements of s2 must be present in s1. This can also be written as s1 >= s2.

```>>> set( [ "now", "is" ] ).issuperset( set( [ "is", "now", "the" ] ) )
False
```

Manipulators. This next group of methods manipulate a set by adding or removing individual elements. These operations do not apply to a frozenset.

class set
set.add(object)

Adds the given object to set s1. If the object did not previously exist in the set, it is added. If the object was already present in the set, s1 doesn’t change.

```>>> craps=set()
>>> craps.add( (1,1) )
>>> craps.add( (6,6) )
>>> craps.add( (1,2) )
>>> craps.add( (2,1) )
>>> craps
set([(1, 2), (1, 1), (2, 1), (6, 6)])
```
set.remove(object)

Removes the given object from the set s1. If the object did not exist in the set, a KeyError exception is raised.

```>>> colors= set( [ "red", "black", "green" ] )
>>> colors.remove( "green" )
>>> colors
set(['black', 'red'])
```
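The KeyError case can be handled in either of two ways, sketched here: catch the exception, or test for membership before removing. The color "purple" is just an example of a value that isn't in the set.

```python
colors = set(["red", "black", "green"])

# Removing a missing element raises KeyError, which we can catch.
try:
    colors.remove("purple")
except KeyError as missing:
    print("not in the set:", missing)

# Testing first with 'in' avoids the exception.
if "green" in colors:
    colors.remove("green")

print(sorted(colors))   # ['black', 'red']
```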
set.pop() → object

Removes an object from set s1, and returns it. Since there is no defined ordering to a set, any object is eligible to be removed. If the set is already empty, a KeyError is raised.

```>>> colors= set( [ "red", "black", "green" ] )
>>> while len(colors):
...     print(colors.pop())
...
green
black
red
>>> colors
set([])
```
set.clear()

Removes all objects from the set. After this method, the set is empty.

```>>> colors= set( [ "red", "black", "green" ] )
>>> colors
set(['green', 'black', 'red'])
>>> colors.clear()
>>> colors
set([])
```

Updates. The following group of methods update a set using another set of elements. Each of these method functions parallels the operator method functions, shown above.

There is a significant difference, however. These methods actually mutate the set object to which they are attached. Each of these functions is available as an augmented assignment operator, which emphasizes the change to a set.

class set
set.update(s2)

Adds all the elements of set s2 to set s1. This can also be written as s1 |= s2.

```>>> two=set( [ (1,1) ] )
>>> three=set( [ (2,1), (1,2)] )
>>> twelve=set( [ (6,6) ] )
>>> craps=set()
>>> craps.update( two )
>>> craps.update( three )
>>> craps.update( twelve )
>>> craps
set([(1, 2), (1, 1), (2, 1), (6, 6)])
```
set.intersection_update(s2)

Updates s1 so that it is the intersection of s1&s2. In effect, this removes elements from s1 which are not also found in s2. This can also be written as s1 &= s2.

```>>> ph1="now is the time for all good men to come to the aid of their party"
>>> words=set( ph1.split() )
>>> words
set(['party', 'all', 'good', 'for', 'their', 'of', 'is', 'men', 'to', 'time', 'aid', 'the', 'now', 'come'])
>>> ph2="the quick brown fox jumped over the lazy dog"
>>> words2=set( ph2.split() )
>>> words2
set(['brown', 'lazy', 'jumped', 'over', 'fox', 'dog', 'quick', 'the'])
>>> words.intersection_update(words2)
>>> words
set(['the'])
```
set.difference_update(s2)

Updates s1 by removing all elements which are found in s2. This can also be written as s1 -= s2.

```>>> ph1="now is the time for all good men to come to the aid of their party"
>>> words=set( ph1.split() )
>>> words
set(['party', 'all', 'good', 'for', 'their', 'of', 'is', 'men', 'to', 'time', 'aid', 'the', 'now', 'come'])
>>> ph2="to do good to men unthankful is to cast water into the sea"
>>> words2=set( ph2.split() )
>>> words2
set(['do', 'good', 'cast', 'is', 'men', 'the', 'water', 'to', 'sea', 'unthankful', 'into'])
>>> words.difference_update(words2)
>>> words
set(['party', 'all', 'for', 'their', 'of', 'time', 'aid', 'now', 'come'])
```
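The remaining update in this family is symmetric_difference_update(), which can also be written as s1 ^= s2. A minimal sketch, using copies of the fib set so the original is left alone (the names expected, changed and also_changed are illustrative):

```python
fib = set([1, 1, 2, 3, 5, 8, 13])
prime = set([2, 3, 5, 7, 11, 13])

expected = fib ^ prime          # a new set; fib is untouched

changed = set(fib)              # a copy, so fib itself stays intact
changed.symmetric_difference_update(prime)
print(changed == expected)      # True

also_changed = set(fib)
also_changed ^= prime           # same update, written as an augmented assignment
print(sorted(also_changed))     # [1, 7, 8, 11]
```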

## Statements and Sets

There are two statements that are associated with sets: the various kinds of assignment statements, and – because a set has an iterator – the for statement.

The Assignment Statements. We’ve seen the basic assignment statement, and how it applies to sets. In the method functions section, we saw four augmented assignment statements, |=, &=, -= and ^=. These parallel the augmented assignment statements we saw in Assignment Combo Package. These augmented assignment statements are used to modify a set by adding or removing elements.

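Note that the augmented assignment statements update only a set in place. A frozenset can’t be updated after it’s created; applying an augmented assignment to a variable that holds a frozenset builds a new frozenset and assigns it to the variable.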
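A small experiment shows what "can’t be updated" means in practice. In this sketch, a second name (an alias) is attached to each object before the augmented assignment; the set changes in place, while the frozenset object is never modified.

```python
s = set([1, 2])
s_alias = s               # a second name for the same set object
s |= set([3])             # updates the set in place
print(sorted(s_alias))    # [1, 2, 3]: the alias sees the change

f = frozenset([1, 2])
f_alias = f
f |= frozenset([3])       # builds a new frozenset and rebinds the name f
print(sorted(f))          # [1, 2, 3]
print(sorted(f_alias))    # [1, 2]: the original frozenset is unchanged
```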

The for Statement. As with other collections, the for statement will step through each element of a set.

```>>> fib=set( [1,1,2,3,5,8,13] )
>>> prime=set( [2,3,5,7,11,13] )
>>> for n in fib & prime:
...     print(n)
...
2
3
5
13
```

In this example, we’ve created a set, fib, of the first seven Fibonacci numbers. We also created a set, prime, of the first six prime numbers. Our for statement first computes the intersection of these two sets, then sets n to each value in that intersection.

## Built-in Functions For Sets

A number of built-in functions create or deal with sets. The following functions apply to all collections, including sets.

len(iterable) → integer

Return the number of items of a set, sequence or mapping.

```>>> craps= set([(1, 2), (1, 1), (2, 1), (6, 6)])
>>> len(craps)
4
```
max(iterable) → value

Returns the largest value in sequence.

```>>> craps= set([(1, 2), (1, 1), (2, 1), (6, 6)])
>>> max(craps)
(6, 6)
```

Recall that tuples are compared element-by-element. The tuple (6, 6) has a first element that is greater than all others.

min(sequence) → value

Returns the smallest value in sequence.

```>>> craps= set([(1, 2), (1, 1), (2, 1), (6, 6)])
>>> min(craps)
(1, 1)
```

Recall that tuples are compared element-by-element. The tuple (1, 1) has a first element that is less than all but one other tuple, (1, 2). If the first elements are the same, then the second element is compared.

Iteration Functions. These functions are most commonly used with a for statement to process set items.

enumerate(iterable) → iterator

Enumerate the elements of a set, sequence or mapping. This yields a sequence of tuples based on the original set. Each of the result tuples has two elements: a sequence number and the item from the original set.

Note that sets do not have a defined ordering, so this can, in principle, yield the elements of the set in different orders. As a practical matter, the ordering doesn’t spontaneously change. However, insertion or removal of an element may appear to change the enumerated set.

This is generally used with a for statement. Here’s an example:

```>>> craps= set([(1, 2), (1, 1), (2, 1), (6, 6)])
>>> for position, roll in enumerate( craps ):
...     print( position, roll, sum(roll) )
...
0 (1, 2) 3
1 (1, 1) 2
2 (2, 1) 3
3 (6, 6) 12
```
sorted( iterable [,key] [,reverse] ) → list

Returns a new list containing the items of an iterable object, like a set, in ascending or descending sorted order. Unlike the sort() method function of a list, this does not update the original collection, but leaves it alone.

This is often used with a for statement. It can also be used to create an ordered list from a set.

Here’s an example:

```>>> craps= set([(1, 2), (1, 1), (2, 1), (6, 6)])
>>> descending= list( sorted( craps, reverse=True ) )
>>> descending
[(6, 6), (2, 1), (1, 2), (1, 1)]
>>> craps
set([(1, 2), (1, 1), (2, 1), (6, 6)])
```

We’ve created an ordered list from the original set: craps is a set; descending is a list in descending order. Sets have no defined ordering, so creating a list from a set is the only way to impose a specific order on the elements.
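The key parameter lets us choose what "sorted order" means. As a sketch, the following orders the craps rolls by the second die; the first die is included in the key only to break ties, so the result doesn't depend on the set’s internal order. The name by_second is illustrative.

```python
craps = set([(1, 2), (1, 1), (2, 1), (6, 6)])

# Sort by the second die, then by the first die.
by_second = sorted(craps, key=lambda roll: (roll[1], roll[0]))
print(by_second)   # [(1, 1), (2, 1), (1, 2), (6, 6)]
```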

Aggregation Functions. The following functions create an aggregate value from a set.

sum(iterable) → number

Sum the values in the iterable (set, sequence, mapping). All of the values must be numeric.

```>>> odd_8 = set( range(1,8*2,2) )
>>> sum(odd_8)
64
>>> odd_8
set([1, 3, 5, 7, 9, 11, 13, 15])
```
all(iterable) → boolean

Return True if all values in the iterable (set, sequence, mapping) are equivalent to True.

The all() function is often used with Generator Expression, which is covered in List Construction Shortcuts.

```>>> craps= set([(1, 2), (1, 1), (2, 1), (6, 6)])
>>> hardways = set( (d1,d1) for d1 in range(1,7) )
>>> horn = hardways - craps
>>> horn
set([(3, 3), (4, 4), (5, 5), (2, 2)])
>>> all( 4 <= (d1+d2) <= 10 for d1,d2 in horn )
True
```
1. We created the set of craps rolls.
2. We created the set of “hardways” rolls with a generator expression. The hard way is to roll a number with both dice equal.
3. The “horn” bets include the hardways which are not craps.
4. We evaluate 4 <= (d1+d2) <= 10 for each roll using a generator expression. All the horn bets are between 4 and 10. [This isn’t surprising, really. It’s hard to find simple examples of all() or any().]
any(iterable) → boolean

Return True if any value in the iterable (set, sequence, mapping) is equivalent to True.

The any() function is often used with Generator Expression, which is covered in List Construction Shortcuts.

```>>> craps= set([(1, 2), (1, 1), (2, 1), (6, 6)])
>>> hardways = [ d1==d2 for d1,d2 in craps ]
>>> any(hardways)
True
>>> all(hardways)
False
```
1. We created the set of craps rolls.
2. We evaluated d1==d2 in a list comprehension to see if the rolls are made “the hard way”, that is, have both dice equal.
3. The any() function tells us that at least one element is True.
4. The all() function tells us that not all elements are True.

## Example of Using Sets

Sets are all about membership and deciding if some value is in the set or out of the set. This, it turns out, is the essence of many of the basic rules for casino games. The random device (dice, wheel or cards) picks a value. Some set of bets are winners. If your bet is in that set, you’ll get paid.

We’ll break this example into two parts. The first part will show how to build some sets. Then we’ll move on to use those sets.

set_example.py, Part 1

```
from __future__ import print_function
dice=set()
r1_win=set()
r1_lose=set()
point=set()
hardways=set()

r1_win.add( (5,6) )
r1_win.add( (6,5) )
r1_lose= set( [(1,1),(6,6),(2,1),(1,2)] )
for d1 in range(1,7):
    r1_win.add( (d1,7-d1) )
    for d2 in range(1,7):
        dice.add( (d1,d2) )
hardways= set( [(2,2),(3,3),(4,4),(5,5)] )
point= dice-r1_win-r1_lose

print("winners", r1_win)
print("losers ", r1_lose)
print("points ", point)

assert hardways <= point
assert r1_win | r1_lose | point == dice
```
1. First, we create a number of empty sets that we’ll use to examine throws of the dice in a Craps game. The dice set will contain the complete set of all 36 possible outcomes. The r1_win set will contain the different ways we can win on the first throw; it will have the various ways we can throw 7 or 11. The r1_lose set will contain the different ways we can lose on the first throw; it will have the various ways we can throw 2, 3 or 12. The point set is all of the remaining throws, which establish a point. Finally, the hardways set contains the various points on which the two dice are equal, rolling a value “the hard way”.
2. We insert the two ways of rolling 11 into the r1_win set.
3. We insert the ways of rolling 2, 12, and 3 into the r1_lose set.
4. We’ve set d1 to all values from 1 to-one-before 7. Therefore, the value of (d1,7-d1) will be one of the six ways to roll a 7. We add this to the r1_win set.
5. We’ve set d1 to all the values from 1 to-one-before 7; independently, we’ve set d2 to all values from 1 to 6. We put every combination of dice rolls into dice.
6. We create a set containing the four point rolls where the two dice are equal and assign this set to the variable hardways.
7. Finally, we take the complete set of dice, remove the roll 1 wins, remove the roll 1 losers, and assign this set to the variable point.

Note the two assertions that we make as part of our initialization:

• We assert that the dice rolls in hardways are a subset of the dice rolls in point. This is a matter of definition in Craps, and we need to be sure that the preceding statements actually accomplish this.
• We assert that the union of r1_win, r1_lose and point is the entire set of possible dice rolls. This, also, is a matter of definition, and we need to be sure that our initialization procedure has established the proper conditions.
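Because the three sets cover all 36 outcomes without overlapping, their sizes give the first-roll odds directly. The following sketch rebuilds the sets with generator expressions so it can run on its own; it assumes two fair dice, so that every one of the 36 rolls is equally likely.

```python
from fractions import Fraction

dice = set((d1, d2) for d1 in range(1, 7) for d2 in range(1, 7))
r1_win = set(roll for roll in dice if sum(roll) in (7, 11))
r1_lose = set(roll for roll in dice if sum(roll) in (2, 3, 12))
point = dice - r1_win - r1_lose

# The three sets don't overlap and together cover all 36 rolls.
assert r1_win | r1_lose | point == dice
assert len(r1_win) + len(r1_lose) + len(point) == len(dice)

for name, rolls in [("win", r1_win), ("lose", r1_lose), ("point", point)]:
    print(name, Fraction(len(rolls), len(dice)))
# win 2/9
# lose 1/9
# point 2/3
```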

Once we’ve built some sets, we can now use the sets to evaluate some dice rolls. We can use this kind of dice-rolling experiment to evaluate a betting strategy.

set_example.py, Part 2

```
import random
for i in range(10):
    d1=random.randrange(1,7)
    d2=random.randrange(1,7)
    roll= (d1,d2)
    if roll in r1_win:
        print(roll, "winner")
    elif roll in r1_lose:
        print(roll, "loser")
    else:
        if roll in hardways:
            print(roll, "hard point")
        else:
            print(roll, "point")
```
1. We import the random module so that we can use the randrange() function to generate random die rolls.
2. After picking two numbers in the range of 1 to-one-before 7, we assemble the variable roll as the dice roll.
3. If roll is in the r1_win set, we have a winner on the first roll.
4. If roll is in the r1_lose set, we have a loser on the first roll.
5. Otherwise, we have a roll that has established a point. We can check for membership in the hardways set to see if it was one of the special ways to roll a 4, 6, 8 or 10.

## Set Exercises

1. Unique Words.

You can use Python’s triple-quoted string to create a larger passage of text. You can split this into words, make the words lower-case, and then accumulate a set of distinct words in the text.

Perhaps the hardest part of this is removing the punctuation. However, the list of punctuation marks is rather short, and you can generally replace all punctuation marks with spaces when doing simple kinds of analysis of English text.

```text="""The next day being Sunday, the hands were turned up to divisions, and
the weather not being favourable, instead of the service the articles
of war were read with all due respect shown to the same, the captain,
officers, and crew, with their hats off in a mizzling rain. Jack, who
had been told by the captain that these articles of war were the rules
and regulations of the service, by which the captain, officers, and
men, were equally bound, listened to them as they were read by the
clerk with the greatest attention.  He little thought that there were
about five hundred orders from the Admiralty tacked on to them, which,
like the numerous codicils of some wills, contained the most important
matter, and to a certain degree make the will nugatory."""

clean=text.replace('.',' ').replace(',',' ').lower()
```

The rest of the program can split the text into individual words, create a set from those words and then display the unique words which occur in the paragraph.

Once you have that working, you can create a set of common English words, including “the”, “a”, “to”, “of”, “in”, “on”, “by”, “as”, “and”, “or”, “not”, “be”, “make”, “do”, etc. The difference between your complete set of words and this set of common English words will be the unique or unusual words in the paragraph.

2. Dice Rolls.

The game of Craps is defined around a large number of sets. The game has two parts: the first roll (usually called the “come out” roll, or “point off” roll), and the remaining rolls (or “point on” rolls) of the game.

• On the point-off roll. There are first-roll winners (all the ways of rolling 7 or 11), first-roll losers (all the ways of rolling 2, 3 or 12). All remaining first-roll dice establish a point.
• On the point-on rolls. There are losers (all the ways of rolling 7), winners (all the ways of rolling the point). All remaining rolls do not resolve the game.

It’s very handy to have a list of sets. Each set in the list contains all the ways of rolling that number. We can create the empty list of sets as follows. This will give you a list, named rolls, that has empty sets in positions 2 through 12. It also has two empty sets in positions 0 and 1, but these won’t be used for anything.

```
rolls= []
for n in range(13):
    rolls.append( set() )
```

Once you have the list named rolls, you can then enumerate all 36 dice combinations with a pair of nested loops like the following:

```
for d1 in range(1,7):
    for d2 in range(1,7):
        make a two-tuple (d1,d2)
        compute the sum, d1+d2
        add to the appropriate set in the rolls list
```

Once you have the list of sets, you can compute sets which contain all the rolls for a win on the first roll and all the rolls which would lose on the first roll. These are simple union operations, using elements in the rolls list. Specifically, you’ll have to union rolls[2], rolls[3] and rolls[12] for the first roll losers.

## Set FAQ’s

Sets are too mathematical and abstract; why are they in here?

That’s more of a complaint than a question. However, the point is that sets are useful and can simplify certain types of programs.

Also, and more importantly, it’s important to see all of the various kinds of collections that Python offers. Most programming is about a collection of data. The more collections you’ve seen, the more you can exploit the various kinds of collections to build the program you need to write.

Many introductory books on programming will focus on a particular collection (often the list). This can leave the newbie to founder when it comes to doing things that don’t fit well with the strengths of the list collection.

When would you ever need a frozenset?

It isn’t obvious at this point, but in the next chapter (Mappings : The dict), we’ll uncover some reasons why Python has to have a frozenset. Looking forward a bit, the problem centers around mutability. A mutable object (like a list or a set) can’t be used as a key for a dictionary.

Consider this: the dictionary key has to be a fixed label or tag for the element in the dictionary. Think of a word in the big old dictionary sitting on the corner of your desk. Words don’t change their spelling. If we change the value of a set, it’s now a new value; that’s like changing the spelling of a word. Where is the new word in the old dictionary?

A frozenset can’t be changed, and can be used as the key to a dictionary.
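As a brief look ahead, here is a sketch of the difference. Dictionaries are covered in the next chapter; the labels dictionary and its contents are made up for illustration.

```python
# An ordinary set is mutable, so Python rejects it as a dictionary key.
try:
    payouts = {set([(1, 1), (6, 6)]): "lose"}
except TypeError as error:
    print("a set can't be a key:", error)

# Hypothetical labels keyed by frozensets of dice rolls.
labels = {
    frozenset([(1, 1), (6, 6)]): "two or twelve",
    frozenset([(1, 2), (2, 1)]): "three",
}
print(labels[frozenset([(6, 6), (1, 1)])])   # two or twelve: order doesn't matter
```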

I can do all the set operations using just lists; why have the complexity of a set?
Agreed, you can implement each set operation on a list. You’ll note, however, that they’re wordier than using the basic set operations.
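To make the comparison concrete, here is one possible way to write an intersection with lists, next to the set version. This is only a sketch; there are other ways to write the list version.

```python
fib_list = [1, 1, 2, 3, 5, 8, 13]
prime_list = [2, 3, 5, 7, 11, 13]

# Intersection with lists: loop, test, and guard against duplicates by hand.
common_list = []
for n in fib_list:
    if n in prime_list and n not in common_list:
        common_list.append(n)

# Intersection with sets: one operator.
common_set = set(fib_list) & set(prime_list)

print(common_list)                     # [2, 3, 5, 13]
print(set(common_list) == common_set)  # True
```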