Data clustering
by RS  admin@robinsnyder.com : 1024 x 640


1. Data clustering
Clustering is an important technique for grouping data. Some examples include the following.

2. Exact and approximate clustering
Clustering is based on some idea of what it means for two entities to be "equal" or, in most cases, "almost equal" in some sense.
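
The difference between "equal" and "almost equal" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the sample points, and the tolerance of 0.5 are hypothetical and not part of the original notes, and straight-line distance is just one of many possible notions of closeness.

```python
import math

def exactly_equal(a, b):
    """Exact notion: two points are equal only if every coordinate matches."""
    return a == b

def almost_equal(a, b, tolerance=0.5):
    """Approximate notion: two points are 'almost equal' if they lie
    within `tolerance` of each other (straight-line distance).
    The tolerance value is an arbitrary choice for illustration."""
    return math.dist(a, b) <= tolerance

p = (1.0, 2.0)
q = (1.2, 2.1)

print(exactly_equal(p, q))  # False: the coordinates differ
print(almost_equal(p, q))   # True: the points are about 0.22 apart
```

Under the exact notion the two points would fall into different groups; under the approximate notion they could be placed in the same group.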

We will look at the following.

3. Human visualization
Humans have a unique ability to abstract and recognize patterns and make abstract inferences from those recognized patterns.

4. Abstraction
To abstract is to take away the inessential, keeping the essentials, and thereby to ignore certain differences.
The similarity is what is the same. The difference is what is different.

Human brains are built for complex abstraction.

The Latin word "abstractus" means "take away from". In abstract art, something is taken away and something remains; one then needs to interpret what is meant or intended.

Define abstraction and give a specific example.

5. Higher level intelligence
Abstraction is the key to higher level intelligence. That is why so many questions are of the form, "What is the primary similarity and difference between ...".

Much of the work with programming languages in computer science involves looking at patterns in text and making abstractions from them.

6. Triangles: Seeing and thinking
Kanizsa triangle: How many triangles do you see? There are no triangles! Your brain makes the triangles using abstraction (built into the brain).

Programming a computer involves a lot of abstraction over the text of the code, without having to think like a computer.


7. Abstraction
In simple terms, abstraction is looking at similarities and ignoring differences.
"Abstraction arises from a recognition of similarities between certain objects, situations, or processes in the real world, and the decision to concentrate on these similarities, and to ignore for the time being the differences." Tony Hoare (British computer scientist)

Dahl, O., Dijkstra, E., & Hoare, C. (1972). Structured programming. New York: Academic Press, p. 83.


8. Programming abstractions
In programming terms, to abstract is to replace one or more parts of a program with a name that refers to the replaced parts (thus hiding the details). Here are some programming constructs that are used for abstraction.
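
As one illustration of replacing parts of a program with a name, the following minimal Python sketch uses a function. The rectangle example and the name `rectangle_area` are hypothetical, chosen only to show the idea; a function is just one of several constructs that can play this role.

```python
# Without abstraction: the same computation is written out twice.
width_a, height_a = 3, 4
area_a = width_a * height_a

width_b, height_b = 5, 6
area_b = width_b * height_b

# With abstraction: the repeated part is replaced by a name.
# Callers use the name and can ignore how the result is computed.
def rectangle_area(width, height):
    """Return the area of a rectangle with the given side lengths."""
    return width * height

print(rectangle_area(3, 4) == area_a)  # True
print(rectangle_area(5, 6) == area_b)  # True
```

The two written-out computations are similar in form and differ only in their values. The function keeps the similarity (multiply width by height) and turns the differences into parameters.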

9. Dimensions
Humans can easily visualize 2D or 3D in graphics but higher dimensions are harder to visualize.

In data science, one often learns concepts using examples in 2D or 3D and then generalizes via abstraction to many more dimensions.

Working in 2D or 3D can thus help one understand the method that then generalizes to higher dimensions.
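
A small Python sketch can make this concrete using straight-line (Euclidean) distance, a measure commonly used to judge how close two points are. The function name `euclidean_distance` and the sample points are assumptions made for illustration. The same code that is easy to check by hand in 2D works unchanged in any number of dimensions.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two points of the same dimension."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    # Sum the squared differences coordinate by coordinate, then take the root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

print(euclidean_distance((0, 0), (3, 4)))              # 2D: 5.0
print(euclidean_distance((0, 0, 0), (1, 2, 2)))        # 3D: 3.0
print(euclidean_distance((0, 0, 0, 0), (1, 1, 1, 1)))  # 4D: 2.0
```

The 2D case is the familiar 3-4-5 right triangle, which can be drawn and checked visually. The 4D case cannot be drawn, but the method is the same.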
