
Exploratory data analysis
In statistics, exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often using statistical graphics and other data visualization methods. A statistical model may or may not be used, but EDA is primarily for seeing what the data can tell us beyond formal modeling, and in this it contrasts with traditional hypothesis testing. John Tukey promoted exploratory data analysis from 1970 onward to encourage statisticians to explore their data and possibly formulate hypotheses that could lead to new data collection and experiments. EDA differs from initial data analysis (IDA), which focuses more narrowly on checking the assumptions required for model fitting and hypothesis testing, handling missing values, and transforming variables as needed. EDA encompasses IDA.
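As a rough illustration of what "summarizing the main characteristics" of a data set can mean in practice, the sketch below computes a five-number summary and counts missing values, the latter being the kind of check usually attributed to initial data analysis. It is a minimal example using only the Python standard library, not a prescribed EDA procedure. The function names `five_number_summary` and `count_missing`, the sample values, and the choice of the "inclusive" quartile convention are all assumptions made for illustration.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a sequence of at least two numbers."""
    data = sorted(values)
    # method="inclusive" treats the data as the whole population;
    # other quartile conventions give slightly different Q1 and Q3.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


def count_missing(values):
    """Count entries recorded as None (a simple missing-value check)."""
    return sum(1 for v in values if v is None)


# Hypothetical sample, invented for illustration only.
raw = [12.0, 15.5, None, 14.0, 13.5, 48.0, 16.0, None, 15.0]

missing = count_missing(raw)
observed = [v for v in raw if v is not None]
summary = five_number_summary(observed)

print("missing values:", missing)
print("min, Q1, median, Q3, max:", summary)
```

In this sample the maximum lies far from the upper quartile, which is the sort of feature a box plot or histogram would make visible and that an analyst might then investigate as a possible outlier.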
- Has abstract
- (en) In statistics, exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often using statistical graphics and other data visualization methods. A statistical model may or may not be used, but EDA is primarily for seeing what the data can tell us beyond formal modeling, and in this it contrasts with traditional hypothesis testing. John Tukey promoted exploratory data analysis from 1970 onward to encourage statisticians to explore their data and possibly formulate hypotheses that could lead to new data collection and experiments. EDA differs from initial data analysis (IDA), which focuses more narrowly on checking the assumptions required for model fitting and hypothesis testing, handling missing values, and transforming variables as needed. EDA encompasses IDA.
- Hypernym
- Approach
- Is primary topic of
- Exploratory data analysis
- Label
- (en) Exploratory data analysis
- Link from a Wikipage to an external page
- journals.sagepub.com/doi/pdf/10.3102/0091732X008001085
- link.springer.com/book/10.1007%2F978-1-4612-4950-4
- www.sciencedirect.com/science/book/9780123800909
- archive.org/details/applicationsbasi00vell
- www.uv.es/visualstats/Book
- www.itl.nist.gov/div898/handbook/eda/eda.htm
- archive.org/details/exploratorydataa00tuke_0
- archive.org/details/exploringdatatab0000unse
- www.unc.edu/~rcm/book/factornew.htm
- oli.cmu.edu/courses/free-open/statistics-course-details/
- Link from a Wikipage to another Wikipage
- Analytic function
- Andrew S. C. Ehrenberg
- Anscombe's quartet
- Arthur Lyon Bowley
- Bar chart
- Bell Labs
- Bootstrapping (statistics)
- Box plot
- Bradley Efron
- Category:Exploratory data analysis
- Causality
- Chernoff face
- Computational statistics
- Configural frequency analysis
- Data
- Data analysis
- Data dredging
- Data mining
- Data reduction
- Data set
- Data visualization
- Deborah F. Swayne
- Decile
- Descriptive statistics
- Design of experiments
- Dianne Cook (statistician)
- Dimensionality reduction
- Empirical distribution function
- Exponential family
- Extreme value
- File:Data visualization process v1.png
- Five number summary
- Five-number summary
- Francis Galton
- Gottfried Noether
- Heat map
- Heavy-tailed distribution
- Heteroscedasticity
- Histogram
- Iconography of correlations
- JMP (statistical software)
- John Tukey
- John W. Tukey
- KNIME
- Machine learning
- Maximum
- Mean value
- Median
- Median polish
- Median test
- Minimum
- Minitab
- Multidimensional scaling
- Multilinear principal component analysis
- Multi-vari chart
- Nonlinear dimensionality reduction
- Nonparametric statistics
- Odds ratio
- Open-source software
- Open University
- Orange (software)
- Order statistic
- Ordination (statistics)
- Outlier
- Parallel coordinates
- Pareto chart
- Pattern recognition
- Phenomenon
- Pierre-Simon Laplace
- Predictive analytics
- Principal component analysis
- Python (programming language)
- Quantile
- Quantity
- Quartile
- R (programming language)
- Resampling (statistics)
- Robust statistics
- Run chart
- S (programming language)
- SAS Institute
- Scatter plot
- Seven-number summary
- Skewness
- S-PLUS
- Standard deviation
- Statistical graphics
- Statistical hypothesis testing
- Statistical inference
- Statistical model
- Statistical theory
- Statistics
- Stemplot
- Structured data analysis (statistics)
- Survey sampling
- Systematic error
- Targeted projection pursuit
- Testing hypotheses suggested by the data
- TinkerPlots
- Tip rate
- Trend estimation
- Trimean
- Weka (machine learning)
- SameAs
- 4128896-8
- Análise exploratória de dados
- Análisis exploratorio de datos
- Badania eksploracyjne
- Datuen azterketa esploratzaile
- Explorační analýza dat
- Explorative Datenanalyse
- Keşifsel veri analizi
- LM15
- m.025t2d
- Q1322871
- Разведочный анализ данных
- Розвідувальний аналіз
- تحلیل کاوشگرانه دادهها
- 探索的データ解析
- 탐색적 자료 분석
- Subject
- Category:Exploratory data analysis
- WasDerivedFrom
- Exploratory data analysis?oldid=1111359043&ns=0
- WikiPageLength
- 17345
- Wikipage page ID
- 416589
- Wikipage revision ID
- 1111359043
- WikiPageUsesTemplate
- Template:'s
- Template:Authority control
- Template:Cite book
- Template:Data Visualization
- Template:ISBN
- Template:Reflist
- Template:Short description
- Template:Social surveys