Data categories
1. Variations in data values
Data values can be categorized as varying in either a discrete or a continuous way.
The following are all related: discrete, count, digital, integer number
The following are all related: continuous, measure, analog, real number
2. Analog signals
An analog signal is a continuous signal: it varies smoothly over a range of values and is measured using a real number approximation (i.e., a number with a decimal point).
3. Digital signals
Digital signals vary discretely. That is, digital information has discrete levels and can be measured using an integer count (i.e., a number without a decimal point).
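The relationship between the two kinds of signal can be sketched in code. The following is a minimal illustration, not a real signal-processing routine: the sine wave, the function names, and the choice of 8 levels are all assumptions made for the example. It samples a continuous value and maps each sample to one of a fixed number of integer levels.

```python
import math

def sample_analog(t):
    """Hypothetical 'analog' signal: a sine wave with real values in [-1.0, 1.0]."""
    return math.sin(2 * math.pi * t)

def quantize(value, levels=8):
    """Map a real value in [-1.0, 1.0] to an integer level in 0 .. levels-1."""
    scaled = (value + 1.0) / 2.0 * (levels - 1)
    return round(scaled)

# Take 8 samples across one cycle, then convert each to a discrete level.
samples = [sample_analog(i / 8) for i in range(8)]
digital = [quantize(s) for s in samples]

print(samples)  # real number approximations (measures)
print(digital)  # integer levels (counts)
```

Each analog sample is a float, while each digital value is an integer drawn from a small, fixed set of levels.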
4. Counts and measures
5. Counts and measures
A count is an exact (nonnegative integer) number.
A measure is an approximation.
Example: 6 pack of Coke
You can count the number of bottles of Coke (i.e., six).
You measure how much is in each bottle (e.g., 12 ounces).
What is the difference between a count and a measure? Give a specific example that illustrates the difference.
6. Integers
Integers are the positive and negative whole numbers, together with zero, and can be used for counts.
Math integers:
-∞, ..., -2, -1, 0, 1, 2, 3, ..., ∞
Computer integers:
minInt, ..., -2, -1, 0, 1, 2, 3, ..., maxInt
For example, the minInt value for a signed byte is -128 while the maxInt value is 127.
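The limits of a computer integer follow from the number of bits used to store it. As a small sketch, assuming the common two's complement representation (the function name `signed_range` is made up for this example), the range of an n-bit signed integer can be computed as:

```python
def signed_range(bits):
    """Return (minInt, maxInt) for a two's complement signed integer of the given width."""
    min_int = -(2 ** (bits - 1))
    max_int = 2 ** (bits - 1) - 1
    return min_int, max_int

print(signed_range(8))   # signed byte
print(signed_range(16))  # 16-bit signed integer
```

There is one more negative value than positive value because zero takes up one of the nonnegative bit patterns.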
7. Real number approximations
Real number approximations are approximations of measures.
Approximations usually include a certain number of significant digits.
0.3333...
0.6666...
1.0000...
3.1415...
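As an illustration, the values listed above can be approximated to four decimal places in two ways: by truncating (dropping the extra digits) or by rounding. The helper names `truncate` and `approx` are assumptions for this sketch.

```python
import math

def truncate(value, digits):
    """Drop all digits after the given number of decimal places."""
    factor = 10 ** digits
    return math.trunc(value * factor) / factor

def approx(value, digits):
    """Round to the given number of decimal places."""
    return round(value, digits)

for name, value in [("1/3", 1 / 3), ("2/3", 2 / 3), ("1", 1.0), ("pi", math.pi)]:
    print(name, truncate(value, 4), approx(value, 4))
```

Truncating 2/3 gives 0.6666 while rounding gives 0.6667. Either way, the result is an approximation of the underlying real number.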
8. Baseball counts and measures
When a baseball player gets 2 hits in 6 at bats, what is the batting average?
The number of hits, 2, and the number of at bats, 6, are counts.
The batting average is 2/6 ≈ 0.333 = 33.3%, which is a measure.
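The same calculation can be written as a short sketch in code, where the counts are integers and the average is a real number approximation (the variable names are chosen for this example):

```python
hits = 2      # a count (integer)
at_bats = 6   # a count (integer)

average = hits / at_bats  # a measure (real number approximation)

print(f"{average:.3f}")  # three decimal places
print(f"{average:.1%}")  # as a percentage
```

Dividing one count by another produces a value that generally cannot be stored exactly, so it is displayed with a chosen number of digits.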
9. Football
Is the number of yards gained by a running back or receiver in a play of American football a count or a measure?
10. American football
Both. A measure can be converted into a count.
For example, we measure distances. Now, the playing area of an American football field is 100 yards from goal-line to goal-line.
What is the record for the longest possible run from scrimmage (that is, from goal-line to goal-line)?
It is not 100 yards.
And it is not 99.9 yards.
Why?
Even though the distance is measured by the officials on the field, football conventions dictate that, for record purposes, each ball placement is rounded off appropriately to the nearest yard.
So the longest possible run is 99 yards. And for the players who have done such a run, it is a record that cannot be broken (why?).
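The conversion from a measured distance to a recorded count can be sketched as follows. This is a simplified model, not official scoring rules: it assumes each ball placement is rounded to the nearest yard and that a ball in play is never recorded as sitting on a goal line, so the recorded spot is always between 1 and 99. The function names are hypothetical.

```python
def recorded_spot(measured_yards):
    """Convert a measured ball position (yards from the offense's own goal line)
    to a whole yard line, assumed here to lie between 1 and 99."""
    return min(99, max(1, round(measured_yards)))

def recorded_run(start_measured, field_length=100):
    """Recorded length of a run from the given spot to the opposite goal line."""
    return field_length - recorded_spot(start_measured)

print(recorded_run(0.1))   # ball measured just outside the goal line
print(recorded_run(20.4))  # ball measured near the 20 yard line
```

Under these assumptions, no starting position can produce a recorded run longer than 99 yards.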
11. Analogy
Counts relate to digital signals. Nothing is lost, unless you decide to lose some of the signal.
Measures relate to analog signals. There is always a loss with analog signals.
12. Forms of information
digital, discrete, count, integers
analog, continuous, measure, real number
13. Data categories
Four possible ways to measure data are as
nominal level data,
ordinal level data,
interval level data, and
ratio level data.
The type of data determines how one processes this data and how one chooses ways to visualize this data.
14. Nominal level data
Nominal level data is data that can be classified and counted, but that otherwise has no meaningful order.
Is there an order to the colors in a package of M&M's?
We can pick an order, but the order is no more meaningful than any other order.
Nominal level data is often presented by arranging the values in alphabetical order. Example.
Company
-------
Apple
Google
IBM
Microsoft
Oracle
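As a small sketch, arranging nominal values alphabetically is a simple sort. The ordering comes from the spelling of the names, not from any property of the companies themselves.

```python
companies = ["Oracle", "IBM", "Google", "Apple", "Microsoft"]

# Sort ignoring upper/lower case so that only the letters decide the order.
alphabetical = sorted(companies, key=str.casefold)

for name in alphabetical:
    print(name)
```

Any other order (for example, reverse alphabetical) would be just as valid for nominal data.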
Can this create problems?
What was one factor in Apple Computer choosing "Apple" as the name of the company in 1976?
15. Apple computer
Since ratings of computer companies (i.e., nominal level data) are usually published in alphabetical order, Steve Jobs and Steve Wozniak, founders of Apple Computer, wanted to appear near the top of the list.
Note that Google later reorganized under a parent company named Alphabet.
16. Phone books
Look in any (historical) phone book. Many companies would start their company name with "AAA" in order to be easy to find in the phone book.
17. Political candidates
What about listing candidates for election in alphabetical order?
Note: The same could happen in high school elections, etc.
The names of the states are nominal data.
The ranking of the states by population or size is ordinal data.
Is the United States a democracy? Do the people elect the President based on popular vote?
18. Ordinal level data
Ordinal level data is data that can be classified, counted, and rank ordered, but the order is more qualitative than quantitative.
What is the temperature like in this room?
Temperature
-----------
very hot
hot
warm
nice
cool
cold
very cold
Notice that the categories are qualitative in that the idea of warm may vary from person to person.
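One way to represent ordinal data in code is an ordered list of labels, where the position in the list gives the rank. This is a minimal sketch; the names `TEMPERATURE_SCALE` and `rank` and the sample readings are assumptions for the example.

```python
# Labels listed from coldest to hottest; the index is the rank.
TEMPERATURE_SCALE = ["very cold", "cold", "cool", "nice", "warm", "hot", "very hot"]

def rank(label):
    """Position of a label on the scale (0 = coldest)."""
    return TEMPERATURE_SCALE.index(label)

readings = ["warm", "cold", "very hot", "nice"]
print(sorted(readings, key=rank))  # ordered from coldest to hottest
print(rank("hot") > rank("warm"))  # comparisons are meaningful
```

The ranks can be compared, but the difference between two ranks has no fixed meaning: the step from "cool" to "nice" need not be the same size as the step from "hot" to "very hot".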
I tend to have a wide range of temperatures that I consider comfortable (i.e., not uncomfortable).
I say that I am temperature tolerant and that my wife is temperature intolerant.
My wife says that I am temperature insensitive and that she is temperature sensitive.
Who is right?
We are both right. Saying the same thing in different ways can make it sound better or worse.
19. Temperature colors
How might we assign colors to temperatures?
20. Interval level data
Interval level data is data that can be classified, counted, rank ordered, and where differences between data items have meaningful significance.
Note: Some people consider temperature or year to be interval level data.
Temperature and year values can have meaning as ordinal level data.
Temperature and year intervals can have meaning as interval level data.
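A short sketch can show why temperature differences are meaningful while temperature ratios are not. The example assumes two readings of 10 and 20 degrees Celsius (values chosen only for illustration) and converts them to Fahrenheit.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

a_c, b_c = 10.0, 20.0
a_f, b_f = c_to_f(a_c), c_to_f(b_c)

# Differences agree once the unit size is accounted for.
diff_c = b_c - a_c
diff_f = b_f - a_f
print(diff_c, diff_f * 5 / 9)

# Ratios do not agree, because the zero point of each scale is arbitrary.
ratio_c = b_c / a_c
ratio_f = b_f / a_f
print(ratio_c, ratio_f)
```

The 10 degree Celsius interval is the same interval in either scale, but "twice as hot" in Celsius is not "twice as hot" in Fahrenheit.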
21. Ratio level data
Ratio level data is data that can be classified, counted, rank ordered, and where differences and ratios between data items have meaningful significance.
22. Data sets
A data set in the form of a table is a collection of data organized into named columns where each row, called a record, contains related information.
Data sets in the form of tables are ideally represented using spreadsheets.
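As a sketch of this idea, a table can be held in code as a list of records that share the same named columns, then written out as comma-separated values (CSV), a format that spreadsheet programs can open. The player names and numbers below are hypothetical.

```python
import csv
import io

columns = ["player", "hits", "at_bats"]

# Each record is one row of related information.
records = [
    {"player": "Player A", "hits": 2, "at_bats": 6},
    {"player": "Player B", "hits": 3, "at_bats": 4},
]

# Write the table as CSV text (to an in-memory buffer for this example).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=columns)
writer.writeheader()
writer.writerows(records)
csv_text = buffer.getvalue()

print(csv_text)
```

The first line of the output holds the column names and each following line holds one record.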