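l*g Posts: 46 | 1 I have several equations of this kind, each with different variables; the outcome in all of them is death.
I plugged the required X values from my own dataset into each equation to compute Y, the individual death rate, and now I need to find a cutoff point that decides how to classify each person (death or not).
My questions: 1) How do I find the cutoff point? The hint says to use a sensitivity/specificity test, but I don't understand how. I tried eyeballing the distribution of deaths in the dataset against the computed death rates to estimate a cutoff point, but that doesn't meet the requirement.
2) After classifying, I use the result as the outcome (0/1) and build a model with the same variables as before. The idea is to reproduce the known model so that I can run diagnostics, but I can't reproduce it...
Sorry, I'm a beginner and what I've done is rather messy. Any advice is appreciated. Thanks! |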
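s*****n Posts: 2174 | 2 Your question is essentially about model diagnostics for a logistic regression model, which is a hard problem in itself.
What you should think about is not grouping Y but grouping X.
There are two cases.
(1) Logistic model with replication
Under the same X there are multiple observations of Y (0/1). This case is relatively easy.
You can compare
Y hat = Model(X)
Y empirical = #1/(#1+#0) under X
and then compute something like their correlation.
(2) Logistic model without replication
Under the same X there is only one observation of Y, either 0 or 1. In this case you have to rely on an extra assumption. The usual one is continuity of the model, i.e. similar X implies similar P(Y=1).
You then need to cluster all the observations according to X, treat each cluster as a set of replications, and compute the empirical rate and the predicted rate as in the first case.
But if the dimension of X is fairly high, high... |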
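The grouping idea in case (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: instead of clustering on X directly, it groups observations by their predicted probability (a simplification in the spirit of the Hosmer-Lemeshow approach, not the clustering the post describes), and the function name `calibration_by_group` and the toy data are made up for the example.

```python
def calibration_by_group(p_hat, y, n_groups=10):
    """Sort observations by predicted probability, split them into
    n_groups of near-equal size, and return one tuple per group:
    (group size, mean predicted rate, observed rate)."""
    pairs = sorted(zip(p_hat, y))
    n = len(pairs)
    rows = []
    for g in range(n_groups):
        lo = g * n // n_groups
        hi = (g + 1) * n // n_groups
        chunk = pairs[lo:hi]
        if not chunk:
            continue  # more groups than observations
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_rate = sum(d for _, d in chunk) / len(chunk)
        rows.append((len(chunk), mean_pred, obs_rate))
    return rows


# Made-up data: predicted death rates and observed deaths (1 = died).
p_hat = [0.05, 0.10, 0.20, 0.30, 0.60, 0.70, 0.80, 0.90]
death = [0, 0, 0, 1, 0, 1, 1, 1]

for size, mean_pred, obs_rate in calibration_by_group(p_hat, death, n_groups=4):
    print(size, round(mean_pred, 3), round(obs_rate, 3))
```

If the model fits the data, the mean predicted rate and the observed rate should be close within each group.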
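l*g Posts: 46 | 3 Thanks to the poster above!
My data is without replication. One requirement of this project is to find a cutoff point, so I'm afraid the idea is not to treat the prediction as continuous and look at the correlation, but to find a cutoff point, obtain a 0/1 variable, and then compare it with the actual death/not in the data using a chi-square test.
I'm using Stata and don't know how to reproduce the model. In my earlier attempts some models ran and gave coefficients, but they differ from the known ones, and others didn't produce a model at all... I'm very confused... |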
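The comparison described here (dichotomize the computed death rate at a cutoff, then cross-tabulate against the actual outcome) can be sketched as follows. This is a minimal illustration, not the project's prescribed method: the helper names, the cutoff of 0.5, and the counts are all made up, and the statistic is the plain Pearson chi-square for a 2x2 table without continuity correction.

```python
import math


def dichotomize(rates, cutoff):
    """Predicted death (1) if the computed rate is at or above the cutoff."""
    return [1 if r >= cutoff else 0 for r in rates]


def cross_tab(pred, actual):
    """Return (a, b, c, d): a = pred 1 & actual 1, b = pred 1 & actual 0,
    c = pred 0 & actual 1, d = pred 0 & actual 0."""
    a = sum(1 for p, y in zip(pred, actual) if p == 1 and y == 1)
    b = sum(1 for p, y in zip(pred, actual) if p == 1 and y == 0)
    c = sum(1 for p, y in zip(pred, actual) if p == 0 and y == 1)
    d = sum(1 for p, y in zip(pred, actual) if p == 0 and y == 0)
    return a, b, c, d


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) and its p-value."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 df: P(chi2 > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up counts for illustration only.
chi2, p_value = chi_square_2x2(10, 5, 5, 10)
print(round(chi2, 3), round(p_value, 4))
```

Sensitivity is a / (a + c) and specificity is d / (b + d) from the same table, which is what a sensitivity/specificity search over cutoffs evaluates.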
j*****e Posts: 182 | 4 Read some material on ROC curves. If you use SAS, you can easily get what
you want in proc logistic. |
l*g Posts: 46 | 5 OK, thanks to the poster above; I'll go look into how to do it in SAS. |
l*******l Posts: 204 | 6 There are two methods to find the "best" cut point using the ROC curve:
1) the point on the ROC curve that minimizes the distance to the point (0,1)
2) the point on the ROC curve that maximizes the vertical distance to the diagonal line y=x (the Youden index)
Cheers
[Quoting l*g's post] : OK, thanks to the poster above; I'll go look into how to do it in SAS
|
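Two commonly used ROC cut-point criteria can be sketched in plain Python: the point closest to the corner (0,1), and the Youden index, which maximizes sensitivity + specificity - 1 (the vertical distance above the diagonal). The function names and the toy data are assumptions made for this illustration; observations with a score at or above the threshold are classified as deaths.

```python
import math


def roc_points(scores, labels):
    """Return (threshold, false positive rate, true positive rate) for each
    distinct score, classifying score >= threshold as positive."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative")
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((t, fp / neg, tp / pos))
    return points


def closest_to_corner(points):
    """Cut point minimizing the Euclidean distance to (FPR, TPR) = (0, 1)."""
    return min(points, key=lambda r: math.hypot(r[1], 1 - r[2]))


def max_youden(points):
    """Cut point maximizing TPR - FPR (the Youden index)."""
    return max(points, key=lambda r: r[2] - r[1])


# Made-up predicted death rates and observed deaths (1 = died).
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 1, 0, 1, 1]

points = roc_points(scores, labels)
print(closest_to_corner(points))
print(max_youden(points))
```

The two criteria often agree, but they can pick different thresholds, so it is worth reporting which one was used.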
D******n Posts: 2836 | 7 Plotting the ROC curve in proc logistic doesn't seem very direct; as I recall, you have to use ods html.
But that's a typical SAS quirk.
[Quoting j*****e's post] : Read some material on ROC curves. If you use SAS, you can easily get what : you want in proc logistic.
|
s*******9 Posts: 35 | 8 I think Stata has very simple commands to help you get the ROC curve and the
cutoff point.
After running your logistic regression model, simply run 'lroc' and you will
get a nice ROC curve in Stata.
After that, run 'predict p' to save the predicted probabilities, then
'roctab youroutcome p, detail' to get sensitivity and specificity at a series
of cutoff points. |
l*g Posts: 46 | 9 Thank you to the poster above! The problem is that I cannot reproduce the
models with the known coefficients...
I need to see whether the original known models predict my dataset well. How
can I put those models into Stata? |
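Checking whether a published model predicts a new dataset does not require refitting it: the known coefficients can be applied directly to each row to get a predicted death rate, which is then compared with the observed outcome. A minimal Python sketch, assuming a logistic model; the coefficient values and the variable names `age` and `diabetes` are hypothetical placeholders, not the original equations.

```python
import math

# Hypothetical published coefficients, for illustration only.
KNOWN_MODEL = {"intercept": -5.0, "age": 0.05, "diabetes": 0.8}


def predicted_risk(row, model):
    """Apply fixed logistic-regression coefficients to one observation:
    p = 1 / (1 + exp(-(b0 + b1*x1 + ...)))."""
    xb = model["intercept"] + sum(
        coef * row[name] for name, coef in model.items() if name != "intercept"
    )
    return 1.0 / (1.0 + math.exp(-xb))


# Made-up patients from "my dataset".
patients = [
    {"age": 60, "diabetes": 1},
    {"age": 45, "diabetes": 0},
]

for patient in patients:
    print(round(predicted_risk(patient, KNOWN_MODEL), 4))
```

In Stata the same calculation would be a `generate` statement that builds the linear predictor from the known coefficients and passes it through `invlogit()`; the resulting variable can then be used as the classification variable in `roctab`.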
l*g Posts: 46 | 10 I tried using the Y that I calculated as the outcome and ran the models in
SAS, but SAS reported "Validity of the model fit is questionable"...
and I still cannot get coefficients similar to the original ones. |