Set theory for data science
1. Set theory for data science
2. Notation
A set is an unordered collection of objects without repetition.
The set notation
{ red, white, blue }
is read as "the set whose members are red, white, and blue".
3. Set
Is the following a set?
{ red, red, white }
This is not a set because there are duplicates. That is, red appears twice.
4. The same sets
Consider the following sets.
{ red, white, blue }
{ white, red, blue }
{ blue, white, red }
Are these the same sets?
These are the same sets because the ordering of the set members does not matter.
5. Bag
A bag, or multiset, is like a set except that objects can appear more than once.
Why are bags important?
Many word analysis algorithms for text comparison work on a bag of words model.
6. Python sets
Python supports sets and set operations.
Here is the Python code [#1]
Here is the output of the Python code.
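As a minimal sketch (not the listing referenced above), the following shows how Python sets drop duplicates and ignore ordering, and how a bag of words can be modeled. Using collections.Counter for the bag is an assumption made for illustration, as are the variable names and the sample sentence.

```python
from collections import Counter

# A set keeps each member once, so the duplicate "red" collapses.
colors = {"red", "red", "white"}

# Ordering does not matter when two sets are compared.
same = {"red", "white", "blue"} == {"blue", "white", "red"}

# A bag (multiset) keeps a count for each member.
# Counter is one way to model a bag in Python.
words = "the cat and the hat".split()
bag = Counter(words)

print(colors == {"red", "white"})  # True
print(same)                        # True
print(bag["the"])                  # 2
```

Sets are compared with == here rather than printed, because the display order of set members is not guaranteed.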
7. Empty set
The empty set is written as follows.
{}
However, in Python, empty curly braces represent an empty dictionary.
In Python, use set() to get an empty set.
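A short check of this distinction, with variable names chosen only for illustration:

```python
empty_dict = {}     # empty curly braces create a dictionary
empty_set = set()   # set() creates an empty set

print(type(empty_dict).__name__)  # dict
print(type(empty_set).__name__)   # set
print(len(empty_set))             # 0
```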
8. Python set operations
Python supports the following set operations.
add()
update()
discard()
remove()
pop()
clear()
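The operations listed above can be sketched as follows. The set contents and variable names are illustrative assumptions. Note that the method that empties a set is named clear().

```python
s = {"red"}
s.add("white")               # add a single member
s.update(["blue", "green"])  # add every member of an iterable
s.discard("green")           # remove the member if it is present
s.discard("purple")          # absent member: silently ignored
s.remove("blue")             # remove a member that must be present

# remove() raises KeyError when the member is absent.
raised = False
try:
    s.remove("purple")
except KeyError:
    raised = True

snapshot = set(s)            # copy of the set at this point
popped = s.pop()             # remove and return an arbitrary member
s.clear()                    # remove all remaining members

print(sorted(snapshot))      # ['red', 'white']
print(raised)                # True
print(popped in snapshot)    # True
print(len(s))                # 0
```

Use discard() when a missing member is acceptable and remove() when a missing member should be treated as an error.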
9. Named sets
A set can be named.
A = { red, white, blue }
Set A is the set whose members are red, white, and blue.
Note: In mathematics, an upper case letter is often used to name a set. In programming, an upper case name usually denotes a constant by convention.
10. Set membership
The notation "white ∈ A" is read as "white is a member of the set A".
11. Universe
The universe is the set of all possible elements under consideration.
12. Venn diagrams
A Venn diagram, named after John Venn, is a pictorial representation of sets and set relationships.
The simplest Venn diagram represents the universe U as a rectangle.
What is the universe for the roll of a six-sided die?
The universe for the roll of a six-sided die is the following set.
{ 1, 2, 3, 4, 5, 6 }
This universe is the sample space of a six-sided die: all the possible values.
13. Meaning
What is the difference in meaning of the following two sentences?
The members of the basketball team compliment each other very well.
The members of the basketball team complement each other very well.
Note: Both compliment and complement come from the Latin word "complere", meaning to complete.
The term compliment means a nice thing said about something or someone.
One meaning of the term compliment is "nice words", as in an "act of praise".
The members of the basketball team compliment each other very well.
Thank you very much.
The term complement means one or more things that together form a unit.
One meaning of the term complement is "one of two things" that, together, form a unit.
The members of the basketball team complement each other very well.
14. Complementary colors
Some complementary colors are as follows.
red and green (Christmas)
purple and yellow (Easter)
orange and blue (Savannah State University)
15. Color models
16. Set complement
The complement of set A consists of all those members of the universe that are not in set A.
The logical operation NOT, math symbol "¬", is related to the complement set operation.
What is { 2, 4, 6 }ᶜ for the roll of a six-sided die?
{ 2, 4, 6 }ᶜ = { 1, 3, 5 }
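Python sets have no complement operator, because a set does not know its universe. One way to sketch the complement, assuming the universe is stored explicitly in a variable U, is to subtract the set from the universe. The name A_complement is an illustrative choice.

```python
U = set(range(1, 7))   # universe for a six-sided die: 1 through 6
A = {2, 4, 6}

# Complement of A relative to U: members of U that are not in A.
A_complement = U - A

print(sorted(A_complement))  # [1, 3, 5]
```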
17. Line intersection
An intersection of two things is the part that both items have in common.
Two distinct lines in a plane either intersect at a point or are parallel.
18. Traffic intersection
A traffic intersection is the part that is in common between two roads.
19. Set intersection
The intersection of sets A and B consists of all those members that are in both set A and set B.
The logical operation AND, math symbol "∧", is related to the intersection set operation.
What is { 2, 4, 6 } ∩ { 1, 2, 3 } for the roll of a six-sided die?
{ 2, 4, 6 } ∩ { 1, 2, 3 } = { 2 }
20. Python set intersection
Here is the Python code [#2]
Here is the output of the Python code.
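As a minimal sketch (not the listing referenced above), the die example can be checked with the & operator or the intersection() method. The names A and B are illustrative.

```python
A = {2, 4, 6}
B = {1, 2, 3}

both = A & B                      # members in both A and B
same_result = A.intersection(B)   # method form of the same operation

print(both)                 # {2}
print(both == same_result)  # True
```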
21. Union
22. Soviet Union
A union includes everything from the joining together of both parts.
The Soviet Union consisted of a union of a number of states (e.g., Russia, Ukraine, etc.).
23. State of the Union
Each year, the President gives a "State of the Union" speech.
Before the Civil War: The United States are a country.
After the Civil War: The United States is a country.
24. Set union
The union of sets A and B consists of all those members that are in either set A or set B or both.
The logical operation OR, math symbol "∨", is related to the union set operation.
What is { 2, 4, 6 } ∪ { 1, 2, 3 } for the roll of a six-sided die?
{ 2, 4, 6 } ∪ { 1, 2, 3 } = { 1, 2, 3, 4, 6 }
25. Python set union
Here is the Python code [#3]
Here is the output of the Python code.
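As a minimal sketch (not the listing referenced above), the union from the die example can be computed with the | operator or the union() method. The names A and B are illustrative.

```python
A = {2, 4, 6}
B = {1, 2, 3}

either = A | B             # members in A or B or both
same_result = A.union(B)   # method form of the same operation

print(sorted(either))         # [1, 2, 3, 4, 6]
print(either == same_result)  # True
```

The shared member 2 appears once in the result, since a set holds no duplicates.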
26. Mutual exclusion
Sets A and B are said to be mutually exclusive if they have no members in common.
{ 2, 4, 6 } ∩ { 1, 3, 5 } = { }
27. Collective exhaustion
Sets A and B are said to be collectively exhaustive if together they include all members in the universe.
{ 2, 4, 6 } ∪ { 1, 3, 5 } = U
28. Mutually exclusive and collectively exhaustive
It is possible for A and B to be both mutually exclusive and collectively exhaustive.
{ 2, 4, 6 } ∩ { 1, 3, 5 } = { }
{ 2, 4, 6 } ∪ { 1, 3, 5 } = U
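Both properties can be checked in Python. This is a sketch that assumes the universe is the six-sided die; the variable names are illustrative.

```python
U = {1, 2, 3, 4, 5, 6}   # universe for a six-sided die
evens = {2, 4, 6}
odds = {1, 3, 5}

# Mutually exclusive: no members in common.
mutually_exclusive = evens.isdisjoint(odds)

# Collectively exhaustive: together they cover the universe.
collectively_exhaustive = (evens | odds) == U

print(mutually_exclusive)        # True
print(collectively_exhaustive)   # True
```

Two sets with both properties split the universe into two non-overlapping parts, so each is the complement of the other.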
29. Difference
The Python set difference operator is "-" (minus sign, example omitted).
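A minimal sketch of set difference, reusing the die sets from the earlier sections as illustrative values:

```python
A = {2, 4, 6}
B = {1, 2, 3}

only_in_a = A - B   # members of A that are not in B
only_in_b = B - A   # difference is not symmetric

print(sorted(only_in_a))  # [4, 6]
print(sorted(only_in_b))  # [1, 3]
```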
30. Symmetric difference
The Python symmetric set difference operator is "^" (caret, example omitted).
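A minimal sketch of symmetric difference, which keeps the members that are in exactly one of the two sets. The die sets are reused as illustrative values.

```python
A = {2, 4, 6}
B = {1, 2, 3}

in_exactly_one = A ^ B   # in A or in B, but not in both

print(sorted(in_exactly_one))               # [1, 3, 4, 6]
print(in_exactly_one == (A | B) - (A & B))  # True
```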
31. Membership
The Python membership operator is "in" (example omitted).
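A minimal sketch of the membership test, using the named set A from the earlier section with its members written as strings (an assumption for illustration):

```python
A = {"red", "white", "blue"}

has_white = "white" in A       # True: white is a member of A
lacks_green = "green" not in A # True: green is not a member of A

print(has_white)    # True
print(lacks_green)  # True
```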