Why would a statistical model overfit if given a huge data set?
My current project may require me to build a model to predict the behavior of a certain group of people. The training data set contains only 6 variables (id is for identification purposes only): id, age, income, gender, job category, monthly spend in which...
modeling
large-data
overfitting