Data science steps
1. Data science steps
Here are the basic steps in a process involving data, metrics, models, etc. Note that each step depends on all of the previous steps.
2. Step 1
Step 1: Save all relevant data (logs, etc.), which can amount to a large volume. This is often unstructured text data collected from all web sessions and similar sources.
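As a minimal sketch of Step 1, the following appends each raw event to a JSON Lines file, one record per line, so that nothing is thrown away before it is known what will matter later. The function names, the `saved_at` field, and the file layout are assumptions made for illustration, not part of any particular logging system.

```python
import json
import time
from pathlib import Path


def save_event(log_path, event):
    """Append one raw event (a dict) to a JSON Lines file.

    A 'saved_at' timestamp is added; the field name is hypothetical.
    """
    record = {"saved_at": time.time(), **event}
    with Path(log_path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record


def load_events(log_path):
    """Read back every saved event, skipping blank lines."""
    with Path(log_path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Calling `save_event("events.jsonl", {"user": "alice", "page": "/home"})` for every request would build up the raw record that the later steps depend on. An append-only file is chosen here only because it is the simplest thing that keeps everything.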
3. Step 2
Step 2: Be able to process this data quickly when needed. This is where big data processing models (e.g., Hadoop clusters running MapReduce) come in. Whether Hadoop is needed depends on the volume, velocity, and variety of the data to be processed.
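The MapReduce idea mentioned in Step 2 can be sketched on a single machine in plain Python. This is not Hadoop and uses no Hadoop API; it only shows the map, shuffle, and reduce phases that a cluster would distribute across many machines. The log line format (`user page`) and all names are assumptions for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Emit (page, 1) for one log line of the assumed form 'user page'."""
    parts = line.split()
    if len(parts) >= 2:  # silently skip malformed lines
        yield parts[1], 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def map_reduce(lines):
    """Run map, shuffle, and reduce in sequence on one machine."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


LOG_LINES = ["alice /home", "bob /cart", "alice /cart", "malformed"]
page_hits = map_reduce(LOG_LINES)  # {'/home': 1, '/cart': 2}
```

If a single-machine version like this handles the data fast enough, a cluster may not be needed, which is the volume, velocity, and variety question raised in Step 2.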
4. Step 3
Step 3: Rather than relying on a human, have the computer look for correlations, possible causations, etc. This is where machine learning techniques come in: rather than having a human guess the models, have the machine guess them.
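A very small instance of letting the computer look for correlations in Step 3 is to compute the Pearson correlation for every pair of metrics and rank the pairs by strength. This is only a sketch of the idea, far simpler than a real machine learning method, and the metric names and values below are made up for illustration. A strong correlation is a candidate for further study, not evidence of causation.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    denom = sqrt(var_x * var_y)
    return cov / denom if denom else 0.0  # constant column: report 0


def rank_correlations(columns):
    """Return ((name_a, name_b), r) for every pair, strongest |r| first."""
    scores = [
        ((a, b), pearson(columns[a], columns[b]))
        for a, b in combinations(columns, 2)
    ]
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical metrics, one value per day.
METRICS = {
    "page_load_seconds": [1.0, 2.0, 3.0, 4.0, 5.0],
    "purchases": [10, 8, 6, 4, 2],
    "support_tickets": [3, 1, 4, 1, 5],
}
ranked = rank_correlations(METRICS)
strongest_pair, strongest_r = ranked[0]
```

With this made-up data, the strongest pair is page load time and purchases, which move in exactly opposite directions.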
5. Step 4
Step 4: Have some interface to present the results so that humans can make the preliminary decisions on what is happening and what to do about it. This is where data visualization comes in.
6. Step 5
Step 5: Present the results to management for final decisions on the next steps. This is where information visualization comes in.
A decision is the ultimate goal of data science, whether it is made by a human or by a computer.
7. Notes
Note that in some cases, the alternatives and their consequences may be clear. In other cases, the web system can be automated to use random sampling techniques to determine the best course of action.
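One way to read the random sampling idea in the note above is a randomized experiment: each visitor is randomly shown one of the alternatives, and the outcomes are compared afterwards. The sketch below assumes that reading. The `run_experiment` function, the variant names, and the `convert` callback are all hypothetical, and a real system would also need a significance test before acting on the result.

```python
import random


def run_experiment(visitors, variants, convert, seed=0):
    """Randomly assign each visitor a variant and return conversion rates.

    convert(visitor, variant) is a hypothetical callback that returns True
    when the visitor took the desired action after seeing that variant.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    shown = {v: 0 for v in variants}
    converted = {v: 0 for v in variants}
    for visitor in visitors:
        variant = rng.choice(variants)
        shown[variant] += 1
        if convert(visitor, variant):
            converted[variant] += 1
    return {
        v: (converted[v] / shown[v] if shown[v] else 0.0) for v in variants
    }


def best_variant(rates):
    """Pick the variant with the highest observed conversion rate."""
    return max(rates, key=rates.get)
```

Because assignment is random, a difference in conversion rates is less likely to be explained by who happened to see which alternative.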
8. Automation
It is best to automate steps 1 and 2 before making changes. Otherwise, it is hard to be certain that the changes made were what caused any changes in the bottom line.