Data science overview
1. Data science overview
Problems in data science arise in many areas.
This page is an overview of some of these areas.
2. Viewpoints
There are two primary viewpoints of data science.
business analytics - business human view to make decisions
machine learning - machine view to make decisions
There is a lot of overlap between the two areas.
3. Artificial intelligence
Artificial intelligence is the search for ways to make machines more intelligent.
To be more precise,
AI (Artificial Intelligence) (i.e., machine intelligence) is the search for ways to make machines (i.e., computers) more intelligent, whatever that means (no one seems to agree on what it exactly means).
4. Comparison
Compare and contrast:
artificial intelligence
natural stupidity
5. Paradox
The paradox of artificial intelligence is that once something (that appears intelligent) is understood, it is no longer considered intelligent.
6. Failures
After a number of hyped failures, the term AI is not so popular today. Instead, many areas of AI are now studied under the terms intelligent system, data mining, decision support systems, etc.
A more modern term for AI is ML (Machine Learning).
7. Intelligence
8. Intelligence
Intelligence can be hard to define.
Artificial intelligence might be better called machine intelligence.
When is a computer intelligent?
9. Alan Turing
Alan Turing (1912-1954) defined a test to determine if a computer is intelligent.
10. Enigma machine
Alan Turing was one of many mathematicians who helped break the German Enigma cipher. Breaking it helped defeat Nazi Germany during World War II.
11. Turing test
The Turing test to determine if a computer is intelligent goes as follows.
Put a person in one room with a teletype terminal connected to another room, so that the person can type questions and get answers.
12. Thinking computers
If that person cannot tell if there is a person or a computer at the other end, and it is a computer, then that computer is considered intelligent.
13. Instant messaging
14. Data and semantic overload
Whatever data/information is, the amount of new data/information is growing at a continually increasing rate.
15. Data growth over time
The problem is not that there is too little data. The problem is that there is too much data. Examples:
Internet search results
Tax code
programming systems, such as C++
What is semantic overload? Give a specific example.
16. Missing data
Not only is there too much data, there is missing data.
Whenever an advertisement for hardware or software is read, what the advertisement doesn't tell you is often more important than what it tells you.
17. Semantic overload
We are often overloaded with data/information. And sometimes disinformation.
Having too much data/information is sometimes called semantic overload, sometimes referred to as TMI (Too Much Information).
18. Coleridge
Water, water, everywhere, nor any drop to drink. The Rime of the Ancient Mariner, Samuel Taylor Coleridge.
Note: In the ocean, there is water all around, but if you drink the salty seawater, you will die from dehydration.
19. Time is valuable
Time is valuable. In business, time is money (or can be reduced or made equivalent to money).
Example: sorting through email (SPAM is junk email).
Opportunity cost: Is your time valuable?
20. Data everywhere
There is so much data around, it can be hard to find the information.
Data, data, everywhere, and not a thought to think. Computer Scientist Jesse Shera
We are drowning in a sea of information and we can easily lose sight of our goals.
21. Insight
A primary purpose of data science is insight, not always specific instances on which to base a decision.
Example: Moneyball and the Oakland Athletics baseball team.
22. Roadkill
Sometimes the phrase "roadkill on the information superhighway" is used to refer to being inundated with too much information. Or, "not my job".
Why was the term "information superhighway" once used? Why is it not used very much today?
23. Knowledge discovery
Data mining involves the use of many and varied techniques to find useful patterns in data.
The term data mining is often used in conjunction with Knowledge Discovery to describe intelligent and automated methods that are used to extract useful information from databases.
24. Knowledge discovery
Knowledge discovery and/or data mining goes through many stages.
raw data
target data
preprocessed data
transformed data
pattern recognition
knowledge
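The stages above can be sketched in a few lines of Python. This is only an illustration: the purchase records, the field names, and the choice of "count item pairs" as the pattern step are all assumptions made up for the example.

```python
from collections import Counter
from itertools import combinations

# Raw data (hypothetical): (customer, item, quantity) purchase records.
raw_data = [
    ("ann", "Diapers", 1),
    ("ann", "beer ", 2),
    ("bob", "beer", 1),
    ("bob", None, 1),          # a record with a missing item
    ("cat", "diapers", 1),
    ("cat", "Beer", 1),
]

# Target data: keep only the fields of interest.
target = [(customer, item) for customer, item, _ in raw_data]

# Preprocessed data: drop records with missing values.
preprocessed = [(c, i) for c, i in target if i is not None]

# Transformed data: normalize item names and group them by customer.
baskets = {}
for customer, item in preprocessed:
    baskets.setdefault(customer, set()).add(item.strip().lower())

# Pattern recognition: count how often each pair of items occurs together.
pair_counts = Counter()
for items in baskets.values():
    for pair in combinations(sorted(items), 2):
        pair_counts[pair] += 1

# Knowledge: the most frequent pair of items.
best_pair, count = pair_counts.most_common(1)[0]
print(best_pair, count)
```

Each stage feeds the next, so a mistake early on (for example, not cleaning up item names) shows up as a missed pattern at the end.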
25. Correlations
Data mining has found that people who buy baby diapers are likely to buy beer (some dispute this claim). Likely scenario:
Mother sends the husband out to get baby diapers.
The husband gets some beer while he is there. If you put the two together, you sell more beer.
It was also found that greeting cards and perfume/cosmetics went well together.
Most results of data mining are considered a competitive advantage and will be kept secret.
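One common way to put numbers on a claim such as "diapers go with beer" is to compute support, confidence, and lift over a set of shopping baskets. The sketch below assumes a tiny, invented set of baskets; the function names are chosen for the example and are not from any particular library.

```python
# Hypothetical baskets, invented for illustration (not real sales data).
baskets = [
    {"diapers", "beer", "milk"},
    {"diapers", "beer"},
    {"diapers", "bread"},
    {"beer", "chips"},
    {"milk", "bread"},
]

def support(items, baskets):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Of the baskets with the antecedent, the fraction that also have the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)

def lift(antecedent, consequent, baskets):
    """Confidence relative to how common the consequent is overall (above 1 suggests association)."""
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)

print(support({"diapers", "beer"}, baskets))
print(confidence({"diapers"}, {"beer"}, baskets))
print(lift({"diapers"}, {"beer"}, baskets))
```

A lift above 1 only says the items appear together more often than chance would suggest in this data. It does not explain why.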
26. Grocery cards
One reason grocery stores give discounts for using their special bar-coded card is that they can then track what you are buying and look for trends in your purchases.
The store can then analyze the data using data mining techniques and use the results for marketing decisions.
On the other hand, some people feel that this invades their privacy.
27. Decision support systems
One useful business application of AI techniques is that of a DSS (Decision Support System).
A decision support system is a system designed to filter data and information in order to allow a human to make a better decision.
The United States tax code is large, complex, and changing on a regular basis.
If you go to the local bookstore to find a book on Microsoft Excel, you may find 10 or more. Which one do you pick?
28. Magic
Arthur C. Clarke (science fiction author) has said that "Any sufficiently advanced technology is indistinguishable from magic."
Arthur C. Clarke was the author of 2001: A Space Odyssey (1968)
Sometimes artificial intelligence, machine learning, etc. seem like magic.
... more to be added ...
29. Left and right brain
In general, the left and right side of the brain perform different functions.
30. Left brain
The left brain is logical, analytical, quantitative, and objective.
31. Right brain
The right brain is artistic, emotional, and recognizes patterns.
32. Splitting the brain
The left and right brain can each operate independently of the other. In some patients with severe epileptic seizures, the connection between the two sides (the corpus callosum) has been surgically severed.
Functionality is lost, however, in that the right and left brains can no longer communicate.
33. Seeing
So, if an object is put in the left hand of a patient, that feeling is sensed by the right side of the brain.
Note: The brain sees, the eye facilitates gathering the information.
34. Brain communication
Since the language center is in the left side of the brain (in most people), the person will be able to feel the object and know what it is, but not be able to verbalize the name of the object.
Many researchers believe that two sides of the brain, each able to operate independently, means that everyone has two different personalities that are, somehow, merged together (at least in most people).
35. Left and right brain
Read the colors of the letters. Do it fast.
36. Foot circles
1. Lift your right foot about 6 inches off the ground.
2. Start moving it in circles, clockwise. Keep doing this.
3. Using your right hand, draw the number 6 in the air. Do not let your right foot change direction.
What happened?
Now try it by just thinking about drawing the number 6 in the air.
What happened?
Now try it by using the number 8.
What happened?
The body is designed so that moving the legs, as in walking, can become somewhat automatic, so that we do not think about every part of every step.
We can abstract and think about more important things.
37. Counting words
Here is a famous quote.
Ask not what your country can do for you.
Ask what you can do for your country.
Who said this?
38. Country
President John F. Kennedy made the statement.
My fellow Americans,
ask not what your country can do for you,
ask what you can do for your country.
Without looking, answer the following. How many words are in it? Do not use your fingers.
What is the problem?
The problem is that the same part of your brain is trying to solve two different problems (i.e., counting and remembering) at the same time.
39. Expert systems
Expert systems are a left-brained approach to AI that:
make use of facts and rules in a systematic logical manner
arrive at conclusions that mimic experts in a field
allow novices to perform more like experts in handling discipline-specific problems
do not replace the experts (experts are still needed)
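A minimal sketch of the "facts and rules" idea, assuming a simple forward-chaining approach: each rule fires when all of its conditions are known facts, and firing adds a new fact. The help desk rules below are hypothetical and made up for the example.

```python
# Hypothetical help desk rules: (facts required, fact concluded).
RULES = [
    ({"disk full", "temp files exist"}, "delete temp files"),
    ({"disk full", "no temp files"}, "archive old files"),
    ({"delete temp files"}, "recheck disk space"),
]

def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new facts can be concluded."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

start = {"disk full", "temp files exist"}
advice = forward_chain(start, RULES) - start
print(sorted(advice))
```

If a needed rule is missing, the system simply concludes nothing (or something unhelpful), which is why the rule set matters so much.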
40. Practical applications of expert systems
Some practical examples of expert systems are the following.
tax preparation
grammar checkers
legal advice
help desks
medical diagnosis
41. Medications
Many pharmacies will use an expert system to help determine which drug combinations are not to be used.
For example, if there are 1000 available drugs, then there are 1000*999/2 = 999,000/2 = 499,500 combinations of any 2 drugs.
And there may be additional considerations that make the problem harder.
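The count of drug pairs can be checked with Python's standard library. The figure for three drugs at a time is an added illustration (not from the text above) of how fast the number of combinations grows.

```python
import math

drugs = 1000

# Number of ways to choose 2 drugs out of 1000 (order does not matter).
pairs = math.comb(drugs, 2)
print(pairs)    # 499500, the same as 1000*999/2

# Added illustration: combinations of any 3 drugs.
triples = math.comb(drugs, 3)
print(triples)  # 166167000
```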
42. Rules
Expert systems require a precise set of rules.
Forget a rule, and the expert system may bomb.
How did one early expert system answer the following question?
Question: How do I get more disk space?
Answer: Erase the files on your disk.
In practice, expert systems are developed until they prove sufficiently useful.
43. Neural networks
Neural networks are a right-brained approach to artificial intelligence that is used to recognize patterns based on previous training.
By contrast, a left-brained approach would be a rules-based approach.
44. Neural networks
A neural network has:
input layer
one or more hidden layers
output layer
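The three kinds of layers can be sketched as a tiny network with two inputs, one hidden layer of two neurons, and one output. The weights below are hand-picked assumptions so that the network computes "exclusive or"; a real network would learn its weights from training data.

```python
import math

def sigmoid(x):
    """Squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One fully connected layer: weighted sum per neuron, then sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]

# Hypothetical, hand-picked weights (not the result of training).
HIDDEN_WEIGHTS = [[20.0, 20.0], [-20.0, -20.0]]  # roughly "or" and "not both"
HIDDEN_BIASES = [-10.0, 30.0]
OUTPUT_WEIGHTS = [[20.0, 20.0]]                  # roughly "and" of the hidden neurons
OUTPUT_BIASES = [-30.0]

def predict(inputs):
    """Input layer -> hidden layer -> output layer."""
    hidden = layer(inputs, HIDDEN_WEIGHTS, HIDDEN_BIASES)
    return layer(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES)[0]

for a in (0, 1):
    for b in (0, 1):
        print(a, b, round(predict([a, b])))
```

Neither hidden neuron alone answers the question. The output layer combines them, which is what the hidden layer makes possible.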
45. Neural networks
Practical applications of neural networks include adaptive noise canceling, mortgage risk evaluation, bomb sniffing, word recognition, forecasting, and handwriting analysis.
46. Facial recognition
Facial recognition attempts to match faces with identities.
47. Face recognition
An early application of facial recognition is the following.
During the 2001 Super Bowl in Tampa (Baltimore Ravens vs. New York Giants), cameras focused on the crowd were used to search for faces with criminal records. They found some. But many people were very upset about the use of this technology in this context.
A second use of the technology was at the 2001 Super Bowl in Tampa, where pictures were taken of every attendee as they entered the stadium through the turnstiles and compared against a database of some undisclosed kind. The authorities would not say who was in that database, but the software did flag 19 individuals. The police indicated that some of those were false alarms, and no one flagged by the system was anything more than a petty criminal such as a ticket scalper. Press reports indicate that New Orleans authorities are considering using it again at the 2002 Super Bowl. http://www.findbiometrics.com/Pages/face_articles/face_2.html (as of 2006-09-05)
There were many false positives - people were identified as felons who were not actually felons.
A false negative is where a felon is not identified as a felon.
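False positives and false negatives can be counted by comparing what the system flagged with what was actually true. The observations below are invented for illustration and are not the Super Bowl data.

```python
# Hypothetical observations: (flagged by the system, actually a felon).
observations = [
    (True, False),
    (True, True),
    (False, False),
    (False, True),
    (True, False),
    (False, False),
]

def confusion_counts(observations):
    """Count the four possible outcomes of a yes/no identification."""
    counts = {"true_positive": 0, "false_positive": 0,
              "false_negative": 0, "true_negative": 0}
    for flagged, is_felon in observations:
        if flagged and is_felon:
            counts["true_positive"] += 1
        elif flagged and not is_felon:
            counts["false_positive"] += 1   # identified as a felon, but is not
        elif not flagged and is_felon:
            counts["false_negative"] += 1   # a felon who was not identified
        else:
            counts["true_negative"] += 1
    return counts

print(confusion_counts(observations))
```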
Some cities, particularly in Great Britain, use cameras to look for faces walking down the street.
Great Britain uses cameras in many areas for surveillance purposes (e.g., to deter crime).
Most convenience stores, hotels, etc., have cameras.
Some of this is for after-the-fact analysis.
More recent applications include use in China, etc.
48. Fuzzy systems
The primary usefulness of fuzzy systems is for process control applications.
49. Water faucet
Think of maintaining the temperature of water running from a water faucet.
Fuzzy rules can be used.
50. Fuzzy rules
If the water is very much colder than desired, turn the control to very warm.
If the water is a little colder than desired, turn the control to a little warmer.
If the water is the right temperature, do not turn the control.
If the water is a little warmer than desired, turn the control to a little colder.
If the water is very much warmer than desired, turn the control to very cold.
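A minimal sketch of how faucet rules like these might be turned into numbers, with water that is a little too warm mapped to a slightly colder setting. The temperature ranges (5 and 10 degrees), the adjustment values from -1 (very cold) to +1 (very warm), and the weighted-average step are all assumptions made for the example.

```python
def falling(x, start, end):
    """Membership that is 1 at or below start and falls to 0 at end."""
    if x <= start:
        return 1.0
    if x >= end:
        return 0.0
    return (end - x) / (end - start)

def rising(x, start, end):
    """Membership that is 0 at or below start and rises to 1 at end."""
    if x <= start:
        return 0.0
    if x >= end:
        return 1.0
    return (x - start) / (end - start)

def triangle(x, left, peak, right):
    """Membership that is 1 at peak and 0 at or beyond left and right."""
    return min(rising(x, left, peak), falling(x, peak, right))

def control_adjustment(error):
    """error = actual minus desired temperature (degrees).

    Returns a value from -1.0 (turn to very cold) to +1.0 (turn to very warm).
    """
    rules = [
        (falling(error, -10, -5), +1.0),      # very much colder -> very warm
        (triangle(error, -10, -5, 0), +0.5),  # a little colder  -> a little warmer
        (triangle(error, -5, 0, 5), 0.0),     # right temperature -> no change
        (triangle(error, 0, 5, 10), -0.5),    # a little warmer  -> a little colder
        (rising(error, 5, 10), -1.0),         # very much warmer -> very cold
    ]
    total = sum(strength for strength, _ in rules)
    # Weighted average of the rule outputs, weighted by how true each rule is.
    return sum(strength * adjust for strength, adjust in rules) / total

for error in (-20, -7.5, 0, 2.5, 20):
    print(error, control_adjustment(error))
```

Because a reading can be partly "a little cold" and partly "very cold" at the same time, the control changes smoothly rather than jumping between settings.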
51. Multi-valued logic
Fuzzy systems use a multi-valued logic system. Some fuzzy products are:
Fuzzy cameras can easily focus automatically in a variety of conditions.
Fuzzy systems can drive high-speed subway trains better than human operators.
Fuzzy washing machines can detect and set many conditions on how to wash clothes.
Fuzzy vacuuming machines can detect and set many conditions needed to vacuum various types of carpet.
52. Backing up a truck
Fuzzy-neural systems have been used to back up 18-wheeler trucks (at least via simulation).
53. Genetic algorithms
Genetic algorithms are used to create systems that learn in that the systems organize themselves and adapt to their environment via positive and negative feedback.
54. Natural selection
These systems are motivated by the natural selection (survival of the fittest) theory of evolution. As such, genetic algorithms provide a class of solution techniques that can be used to find good, but not necessarily optimum, solutions to hard (i.e., intractable) problems.
55. Genetic algorithms
One can think of genetic algorithms as approximating a function by swapping bits randomly to find solutions that are better than other solutions (i.e., hill climbing, but jumping around from time to time).
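A small sketch of the idea, assuming a toy problem where a solution is a list of bits and the "fitness" is simply the number of 1 bits. The population size, mutation rate, and other parameters are arbitrary choices for the example.

```python
import random

def fitness(bits):
    """Toy fitness (an assumption for this example): the number of 1 bits."""
    return sum(bits)

def mutate(bits, rate, rng):
    """Randomly flip each bit with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]

def crossover(a, b, rng):
    """Combine two parents: the start of one, the rest of the other."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def evolve(length=20, population_size=30, generations=50, rate=0.02, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(population_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        # Survival of the fittest: keep the better half unchanged.
        survivors = population[: population_size // 2]
        # Refill the population with mutated children of random survivors.
        children = [
            mutate(crossover(rng.choice(survivors), rng.choice(survivors), rng),
                   rate, rng)
            for _ in range(population_size - len(survivors))
        ]
        population = survivors + children
    return max(population, key=fitness), history

best, history = evolve()
print(fitness(best), history[0])
```

The result is a good solution, not a guaranteed best one. Running with a different seed may give a different answer.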
56. Virtual reality
One way to create a VR (Virtual Reality) is to use 3-D computer graphics to simulate a virtual world - an artificial reality.
Note: Reading a book (e.g., fiction) is a simple form of virtual reality.
Virtual reality can be used to make a virtual world look real or to see things that cannot be real.
One often wears headphones and goggles (with a color LCD display) in order to immerse oneself in the virtual reality.
57. Image guided surgery
Virtual reality is being developed and used for image guided surgery.
58. Flight simulation
Virtual reality has been used for many years for flight simulation and pilot training.
You can try your hand at flying by using the Microsoft Flight Simulator software.
59. Molecular docking
60. End of page