Exploratory data analysis

In statistics, exploratory data analysis (EDA) is an approach to analyzing data sets that summarizes their main characteristics, often using statistical graphics and other data visualization methods. A statistical model may or may not be used, but the primary purpose of EDA is to see what the data can tell us beyond formal modeling, which contrasts with traditional hypothesis testing. Exploratory data analysis has been promoted by John Tukey since 1970 to encourage statisticians to explore the data and possibly formulate hypotheses that could lead to new data collection and experiments. EDA differs from initial data analysis (IDA), which focuses more narrowly on checking the assumptions required for model fitting and hypothesis testing, handling missing values, and transforming variables as needed. EDA encompasses IDA.
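
As a minimal sketch of the kind of numerical summary that EDA typically starts from, the Python snippet below computes a five-number summary (minimum, lower quartile, median, upper quartile, maximum) and flags possible outliers. The sample values, the function names `five_number_summary` and `flag_outliers`, and the choice of the "inclusive" quartile method are assumptions made for illustration, not part of the source. The 1.5 × IQR cutoff is the conventional box-plot rule, used here as one reasonable default rather than a definitive criterion.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a non-empty sequence of numbers."""
    data = sorted(values)
    # statistics.quantiles with n=4 returns the three quartile cut points.
    # method="inclusive" treats the data as the whole population (an assumption).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (data[0], q1, median, q3, data[-1])


def flag_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical sample, invented purely for illustration.
sample = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 5.0, 10.0]

summary = five_number_summary(sample)
outliers = flag_outliers(sample)
print("five-number summary:", summary)
print("possible outliers:", outliers)
```

A summary like this is a starting point for exploration rather than a conclusion: a flagged value is a prompt to look at the data more closely, for example with a box plot or histogram, not evidence that the value is wrong.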

Depiction
Data visualization process v1.png
Tips-hist1.png
Tips-hist2.png
Tips-scat1.png
Tips-scat2.png
Has abstract
In statistics, exploratory data analysis (EDA) is an approach to analyzing data sets that summarizes their main characteristics, often using statistical graphics and other data visualization methods. A statistical model may or may not be used, but the primary purpose of EDA is to see what the data can tell us beyond formal modeling, which contrasts with traditional hypothesis testing. Exploratory data analysis has been promoted by John Tukey since 1970 to encourage statisticians to explore the data and possibly formulate hypotheses that could lead to new data collection and experiments. EDA differs from initial data analysis (IDA), which focuses more narrowly on checking the assumptions required for model fitting and hypothesis testing, handling missing values, and transforming variables as needed. EDA encompasses IDA.
Hypernym
Approach
Is primary topic of
Exploratory data analysis
Label
Exploratory data analysis
Link from a Wikipage to an external page
journals.sagepub.com/doi/pdf/10.3102/0091732X008001085
link.springer.com/book/10.1007%2F978-1-4612-4950-4
www.sciencedirect.com/science/book/9780123800909
archive.org/details/applicationsbasi00vell
www.uv.es/visualstats/Book
www.itl.nist.gov/div898/handbook/eda/eda.htm
archive.org/details/exploratorydataa00tuke_0
archive.org/details/exploringdatatab0000unse
www.unc.edu/~rcm/book/factornew.htm
oli.cmu.edu/courses/free-open/statistics-course-details/
Link from a Wikipage to another Wikipage
Analytic function
Andrew S. C. Ehrenberg
Anscombe's quartet
Arthur Lyon Bowley
Bar chart
Bell Labs
Bootstrapping (statistics)
Box plot
Bradley Efron
Category:Exploratory data analysis
Causality
Chernoff face
Computational statistics
Configural frequency analysis
Data
Data analysis
Data dredging
Data mining
Data reduction
Data set
Data visualization
Deborah F. Swayne
Decile
Descriptive statistics
Design of experiments
Dianne Cook (statistician)
Dimensionality reduction
Empirical distribution function
Exponential family
Extreme value
File:Data visualization process v1.png
Five number summary
Five-number summary
Francis Galton
Gottfried Noether
Heat map
Heavy-tailed distribution
Heteroscedasticity
Histogram
Iconography of correlations
JMP (statistical software)
John Tukey
John W. Tukey
KNIME
Machine learning
Maximum
Mean value
Median
Median polish
Median test
Minimum
Minitab
Multidimensional scaling
Multilinear principal component analysis
Multi-vari chart
Nonlinear dimensionality reduction
Nonparametric statistics
Odds ratio
Open-source software
Open University
Orange (software)
Order statistic
Ordination (statistics)
Outlier
Parallel coordinates
Pareto chart
Pattern recognition
Phenomenon
Pierre-Simon Laplace
Predictive analytics
Principal component analysis
Python (programming language)
Quantile
Quantity
Quartile
R (programming language)
Resampling (statistics)
Robust statistics
Run chart
S (programming language)
SAS Institute
Scatter plot
Seven-number summary
Skewness
S-PLUS
Standard deviation
Statistical graphics
Statistical hypothesis testing
Statistical inference
Statistical model
Statistical theory
Statistics
Stemplot
Structured data analysis (statistics)
Survey sampling
Systematic error
Targeted projection pursuit
Testing hypotheses suggested by the data
TinkerPlots
Tip rate
Trend estimation
Trimean
Weka (machine learning)
SameAs
4128896-8
Análise exploratória de dados
Análisis exploratorio de datos
Badania eksploracyjne
Datuen azterketa esploratzaile
Explorační analýza dat
Explorative Datenanalyse
Keşifsel veri analizi
LM15
m.025t2d
Q1322871
Разведочный анализ данных
Розвідувальний аналіз
تحلیل کاوشگرانه داده‌ها
探索的データ解析
탐색적 자료 분석
Subject
Category:Exploratory data analysis
Thumbnail
Data visualization process v1.png?width=300
WasDerivedFrom
Exploratory data analysis?oldid=1111359043&ns=0
WikiPageLength
17345
Wikipage page ID
416589
Wikipage revision ID
1111359043
WikiPageUsesTemplate
Template:'s
Template:Authority control
Template:Cite book
Template:Data Visualization
Template:ISBN
Template:Reflist
Template:Short description
Template:Social surveys