
Statistics board - Logistic regression: binary response: rare event


S******y (posts: 1123) #1
I am using logistic regression to model a rare event:
y=0 98.5%
y=1 1.5%
N = 11 million
I am thinking of over-sampling the "y=1" observations to increase their percentage from 1.5% to 10%, and then fitting the logistic regression. Is this method valid? Will my estimates be biased? Thanks.

A*******s (posts: 3942) #2
(in reply to S******y) See the discussion at http://www.mitbbs.com/article_t/Statistics/31211743.html

j*****e (posts: 182) #3
You can certainly oversample the rare event. This is known as a case-control design. However, only the slope estimates are meaningful; the intercept estimate is not. Unless you know the marginal probability of the rare event, you will not be able to predict the binary outcome. There is a little discussion of this in Agresti's book. You can also read Hosmer and Lemeshow's Applied Logistic Regression for more.

j*m (posts: 190) #4
In my limited experience, the predictions might not be valid (that is, they may be biased), since the event is so rare.
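To make j*****e's point about the intercept concrete: when the sample is selected on y, the fitted intercept is shifted by the log of the ratio of sampling odds, so it can be adjusted if the population event rate is known. The sketch below is only an illustration of that standard adjustment, not something posted in the thread. The function names (`corrected_intercept`, `predict_prob`) are hypothetical, and it assumes the population rate (1.5% here) and the post-sampling rate (10%) are both known and that sampling depended on y alone.

```python
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def sigmoid(z):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-z))


def corrected_intercept(b0_sample, pop_rate, sample_rate):
    """Shift an intercept fitted on a y-selected sample back to the population scale.

    b0_sample   : intercept estimated on the over-sampled data
    pop_rate    : event rate in the full population (assumed known, e.g. 0.015)
    sample_rate : event rate in the over-sampled data (e.g. 0.10)
    """
    # Selecting on y inflates the log-odds by this constant; slopes are unaffected.
    offset = logit(sample_rate) - logit(pop_rate)
    return b0_sample - offset


def predict_prob(b0, slopes, x):
    """Predicted probability from a logistic model with intercept b0."""
    z = b0 + sum(b * xi for b, xi in zip(slopes, x))
    return sigmoid(z)


# Sanity check with an intercept-only model: on the 10% sample the fitted
# intercept is logit(0.10); after correction it should map back to 1.5%.
b0_sample = logit(0.10)
b0_pop = corrected_intercept(b0_sample, pop_rate=0.015, sample_rate=0.10)
print(sigmoid(b0_pop))
```

The slope estimates are used unchanged; only the intercept is adjusted before computing predicted probabilities. If the population rate is not known, the correction cannot be applied, which is the limitation noted in post #3.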
