Does model sample size matter? - Statistics board - 未名存档



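c****s (posts: 395)	1 This reminds me of an interview a while back where I was asked how many records I sample when building a model. I said I had taken 10,000 out of 2.5 million, and the interviewer clearly looked down on that answer. Afterwards I was a bit puzzled: how many records to sample hardly seems like a technically demanding question, and the sample size should not affect your model results at all. Or have I been away from academia for too long, and this question is actually tricky?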
l***a (posts: 12410)	2 (in reply to c****s, post 1) Use as much data as you have, and set some aside for validation.
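The advice in post 2 can be sketched as a shuffled holdout split. This is a minimal sketch in Python, which the thread itself does not use; the function name `train_validation_split`, the 30% validation fraction, and the seed are assumptions chosen for illustration only.

```python
import random

def train_validation_split(records, validation_fraction=0.3, seed=0):
    """Shuffle the records and hold out a fraction for validation.

    validation_fraction and seed are illustrative defaults, not values
    taken from the thread.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_valid = round(len(shuffled) * validation_fraction)
    # First n_valid shuffled records are held out; the rest are used for fitting.
    return shuffled[n_valid:], shuffled[:n_valid]

# Record IDs stand in for real rows here.
train, valid = train_validation_split(range(1000))
print(len(train), len(valid))
```

Every record not held out goes into the fit, so the model is built on as much data as is available while the validation set stays untouched.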
c****s (posts: 395)	3 (in reply to l***a, post 2) It is not feasible to build the model on the whole dataset, or even on half of it; it is huge.
s*r (posts: 2757)	4 It depends on how many predictor variables you have and how complex the models you are talking about are.
A*******s (posts: 3942)	5 (in reply to c****s, post 3) What kind of model did you use? For regression, millions of records are practical in SAS.
c****s (posts: 395)	6 (in reply to A*******s, post 5) Logistic regression. I tried using both the whole dataset and a sample. The interesting thing is that with the whole dataset, some levels that are not significant in the small sample become highly significant.
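The pattern described in post 6 is what standard errors would predict: for a fixed effect size, the standard error of a coefficient shrinks roughly in proportion to 1/sqrt(n), so the Wald z statistic grows with sqrt(n). The sketch below illustrates this for a logistic regression with a single binary level, where the fitted coefficient equals the log odds ratio of the 2x2 table and its Wald standard error is sqrt(1/a + 1/b + 1/c + 1/d). The level share (5%) and event rates (10% vs. 11%) are hypothetical numbers, not taken from the thread, and expected cell counts are used instead of sampled data.

```python
import math

def wald_z_for_level(n, level_share, base_rate, level_rate):
    """Log odds ratio, its Wald SE, and z for one binary level.

    Uses expected 2x2 cell counts for a dataset of size n; all rates
    are hypothetical inputs chosen for illustration.
    """
    n_level = n * level_share
    n_other = n - n_level
    a = n_level * level_rate          # events within the level
    b = n_level * (1 - level_rate)    # non-events within the level
    c = n_other * base_rate           # events outside the level
    d = n_other * (1 - base_rate)     # non-events outside the level
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se, log_or / se

small = wald_z_for_level(10_000, 0.05, 0.10, 0.11)
full = wald_z_for_level(2_500_000, 0.05, 0.10, 0.11)

for label, (log_or, se, z) in (("n = 10,000", small), ("n = 2,500,000", full)):
    print(f"{label}: log OR = {log_or:.4f}, SE = {se:.4f}, z = {z:.2f}")
```

Under these assumed rates the coefficient is identical at both sizes, but z falls below 1.96 at 10,000 records and far above it at 2.5 million, because the standard error is sqrt(250) times smaller.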
A*******s (posts: 3942)	7 (in reply to c****s, post 6) The most intuitive reason is that you happened to sample a subset in which those variables are significant. My further guess is that the smaller sample size introduces collinearity, which then changes the coefficient estimates.
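The first explanation in post 7, that significance depends on which subset happens to be drawn, can be explored with a small simulation. This is a sketch under stated assumptions: 100 independent samples of 10,000 records are generated from hypothetical event rates (10% baseline, 11% in a level covering 500 records), rather than subsampled from the poster's data, and the seed is arbitrary. It does not address the collinearity guess.

```python
import math
import random

def wald_z(events_level, n_level, events_other, n_other):
    """Wald z for the log odds ratio of a 2x2 table."""
    a, b = events_level, n_level - events_level
    c, d = events_other, n_other - events_other
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or / se

def draw_events(rng, n, rate):
    """Number of events among n records, each an event with probability rate."""
    return sum(rng.random() < rate for _ in range(n))

rng = random.Random(2024)  # arbitrary seed so the run is repeatable
z_values = []
for _ in range(100):
    events_level = draw_events(rng, 500, 0.11)    # hypothetical rate in the level
    events_other = draw_events(rng, 9_500, 0.10)  # hypothetical baseline rate
    z_values.append(wald_z(events_level, 500, events_other, 9_500))

share_significant = sum(abs(z) > 1.96 for z in z_values) / len(z_values)
print(f"z ranges from {min(z_values):.2f} to {max(z_values):.2f}")
print(f"share of samples with |z| > 1.96: {share_significant:.2f}")
```

The spread of z across repeated samples shows how the same underlying effect can look significant in one sample of 10,000 and not in another.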
