Commit 7377f41d authored by Bert Balcaen

Started making notes on 'intro to datascience' course.

parent a8f8269b
# Missing data
approaches:
- partial deletion
- imputation
## Partial deletion
- **listwise deletion**: *entire record* is excluded from analysis if any single value is missing
- **pairwise deletion**: *only the specific missing values* are excluded; the rest of the record is still used in any analysis that does not need the missing value
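
The difference between the two deletion strategies can be sketched in plain Python. This is only an illustration: the `records` data, the field names, and the helper functions `listwise_delete` and `pairwise_values` are made up for this note, and `None` is assumed to mark a missing value.

```python
from statistics import mean

# Hypothetical records; None marks a missing value.
records = [
    {"age": 25, "income": 3000, "score": 7},
    {"age": 32, "income": None, "score": 8},
    {"age": None, "income": 4100, "score": 6},
    {"age": 41, "income": 5200, "score": 9},
]


def listwise_delete(rows):
    """Keep only the records in which no value is missing."""
    return [r for r in rows if all(v is not None for v in r.values())]


def pairwise_values(rows, *fields):
    """Keep every record that has values for the requested fields.

    Missing values in other fields do not cause the record to be dropped.
    """
    return [
        tuple(r[f] for f in fields)
        for r in rows
        if all(r[f] is not None for f in fields)
    ]


# Listwise: only the two fully complete records contribute to the mean score.
complete = listwise_delete(records)
listwise_mean_score = mean(r["score"] for r in complete)

# Pairwise: all four records have a score, so all four contribute.
pairwise_mean_score = mean(s for (s,) in pairwise_values(records, "score"))

# Pairwise for an age/income analysis: only records with both values are used.
age_income_pairs = pairwise_values(records, "age", "income")
```

In this sketch listwise deletion throws away half of the scores even though none of them is missing, while pairwise deletion keeps them but ends up using a different subset of records for each analysis.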
## Imputation
**imputation**
- = making an educated guess at missing data
- = approximating missing values
- different techniques; hard to get right
## Imputation using linear regression
= fit a regression equation on the available data and use it to predict the missing values
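
A minimal sketch of the idea, assuming a single predictor `x`, a target `y` with some missing entries (marked `None`), and an ordinary least-squares line. The data and the helper names `fit_line` and `impute_with_regression` are hypothetical, chosen only for this illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def impute_with_regression(pairs):
    """Fill in missing y values (None) with predictions from the fitted line."""
    # Fit the line on the observed (x, y) pairs only.
    observed = [(x, y) for x, y in pairs if y is not None]
    slope, intercept = fit_line(
        [x for x, _ in observed], [y for _, y in observed]
    )
    # Keep observed values as they are; predict the missing ones.
    return [
        (x, y if y is not None else slope * x + intercept)
        for x, y in pairs
    ]


# Hypothetical data: y is missing for x = 3 and x = 5.
data = [(1, 2.0), (2, 4.0), (3, None), (4, 8.0), (5, None)]
imputed = impute_with_regression(data)
```

Every imputed value lies exactly on the fitted line, with none of the scatter that real observations would show.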
disadvantages:
- overemphasizes existing trends
- produces exact values for the missing entries, which suggests greater certainty about those values than we actually have
- http://hyperpolyglot.org/scripting
- http://www.marinamele.com/install-and-configure-atom-editor-for-python