Outlier
<gallery>
File:Michelsonmorley-boxplot.svg|Michelson-Morley Experiment Boxplot
File:Standard_deviation_diagram_micro.svg|Standard Deviation Diagram
File:Wiki_q_inter_def.jpg|Outlier
</gallery>
Latest revision as of 01:51, 18 February 2025
An outlier is a term used in statistics for an observation that lies an abnormal distance from other values in a random sample from a population. Because "abnormal" has no fixed threshold, deciding what counts as an outlier is partly a matter of the analyst's judgment. Barnett and Lewis, in Outliers in Statistical Data, define an outlier as "an observation (or subset of observations) which appears to be inconsistent with the remainder of that set of data."
Identification of Outliers
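Identifying outliers in a data set depends on the context and on the threshold chosen. A common graphical method is the box plot, which summarizes quantitative data by its quartiles and conventionally draws any value lying more than 1.5 times the interquartile range (IQR) beyond the first or third quartile as an individual point. The same rule applied numerically is known as the IQR method. Another common approach is the Z-score method, which flags values that lie more than a chosen number of standard deviations (often 3) from the mean.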
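The two numerical rules can be sketched in a few lines of Python. This is only an illustration: the function names, the sample data, and the cutoffs (1.5 for the IQR rule, 3 for the Z-score rule) are assumptions chosen for the example. Quartiles are computed with the standard library's "inclusive" method, and other quartile conventions can give slightly different fences.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def zscore_outliers(values, threshold=3.0):
    """Return the values more than `threshold` sample standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return []  # all values identical, nothing can be flagged
    return [v for v in values if abs(v - mean) / sd > threshold]


# Hypothetical sample: eleven similar readings and one extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 95]

print(iqr_outliers(data))
print(zscore_outliers(data))
```

In this sample both rules flag only the value 95. The Z-score rule is less reliable on small samples, because the extreme value itself inflates the mean and the standard deviation that it is measured against.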
Impact of Outliers
Outliers can strongly affect the results of data analysis and statistical modeling. Statistics such as the mean and the standard deviation are pulled toward extreme values, so even a few outliers can give a distorted picture of the underlying data. Outliers may also indicate errors in data collection or preparation.
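A small numerical illustration of this effect, using made-up data and a hypothetical helper named summarize: adding one extreme value shifts the mean and inflates the standard deviation, while the median stays where it was.

```python
import statistics


def summarize(sample):
    """Return the mean, median, and sample standard deviation of a sample."""
    return {
        "mean": statistics.fmean(sample),
        "median": statistics.median(sample),
        "stdev": statistics.stdev(sample),
    }


# Hypothetical data: the same readings with and without one extreme value.
clean = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12]
with_outlier = clean + [95]

print("without outlier:", summarize(clean))
print("with outlier:   ", summarize(with_outlier))
```

Here the median is 12 in both cases, while the mean rises from roughly 11.5 to 18.5, a value larger than every observation except the outlier itself.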
Handling Outliers
There are several ways to handle outliers:
- Deletion: Outliers can be deleted if it is determined that they are due to errors in data collection or preparation.
- Transformation: Applying a mathematical transformation, such as taking the logarithm, can reduce the impact of outliers.
- Imputation: Outliers can be replaced with a substitute value, such as the median of the remaining data.
- Separate analysis: Outliers can be set aside and analyzed on their own.
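The four options above can be sketched as follows. This is a minimal example under stated assumptions: outliers are identified with the 1.5 × IQR rule, the transformation is a natural logarithm (which requires positive values), and the imputed value is the median of the remaining data. The helper split_outliers and the sample data are hypothetical, and other choices are equally valid.

```python
import math
import statistics


def split_outliers(values, k=1.5):
    """Split values into (inliers, outliers) using the k * IQR rule."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    inliers = [v for v in values if low <= v <= high]
    outliers = [v for v in values if v < low or v > high]
    return inliers, outliers


# Hypothetical sample with one extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 95]
inliers, outliers = split_outliers(data)

# Deletion: keep only the values inside the fences.
deleted = inliers

# Transformation: a logarithm compresses large values (positive data only).
transformed = [math.log(v) for v in data]

# Imputation: replace each outlier with the median of the remaining data.
fill = statistics.median(inliers)
imputed = [fill if v in outliers else v for v in data]

# Separate analysis: the outliers are kept in their own list for inspection.
print("deleted:    ", deleted)
print("transformed:", [round(t, 2) for t in transformed])
print("imputed:    ", imputed)
print("outliers:   ", outliers)
```

Deletion shortens the data set, whereas imputation keeps its length, which matters when observations must stay aligned with other variables.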



