Page 10 - Summary of discussions about "variable" - 话题女王

p******r
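Posts: 1279

From thread: Statistics board - If the dependent variable is severely skewed, how do I do ordinal regression?

I don't quite follow. Isn't ordinal regression based on MLE?
Or are you suggesting simply treating the response variable as continuous and fitting an OLS
regression?
If people are unwilling to admit that they are highly depressed, which model would work better? Thanks!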

p******r
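Posts: 1279

From thread: Statistics board - If the dependent variable is severely skewed, how do I do ordinal regression?

Then may I ask: if an x variable is severely right-skewed, what else can I do besides taking a log?
Add x + x^2?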

A*******s
Posts: 3942

From thread: Statistics board - If the dependent variable is severely skewed, how do I do ordinal regression?

My understanding is that transforming X is for introducing nonlinearity. Does it
matter that the X variables are skewed?

p******r
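Posts: 1279

From thread: Statistics board - If the dependent variable is severely skewed, how do I do ordinal regression?

Well, I don't think the x variables need to be normally distributed; it is enough for the
residuals to be normally distributed.
But my y and one of my x's are both very skewed. I tried various log transformations and the
residuals still don't look normal, so people may question the significance of my coefficient
estimates, because the t-test is not valid...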

m*****a
Posts: 658

From thread: Statistics board - how to write a series of variables Q29B4B-Q29B30B in array ?

Array abc(*) Q29B:;
This selects all variables whose names begin with Q29B.
Hope it helps.

s********8
Posts: 50

From thread: Statistics board - how to write a series of variables Q29B4B-Q29B30B in array ?

Thanks for thinking about it; it was a good try.
But after I tried the code below, I don't think it worked.
data Farray;
length Q29B4B A Q29B5B B Q29B6B C Q29B7B D Q29B8B E Q29B9B F Q29B10B G
Q29B11B H Q29B12B I Q29B13B G Q29B14B K Q29B15B $8;
%let lastletter = B;
Array Ftesta(*) Q29B4&lastletter--Q29B15&lastletter;
Array Ftestb(*) Q29B4&lastletter-Q29B15&lastletter;
K=DIM(Ftesta);
P=DIM(Ftestb);
put K= P=;
run;
In the log, K=22 instead of 12,
and P cannot get any value because of
ERROR: Missing nu...

m****r
Posts: 202

From thread: Statistics board - sas date variable exchange

Thank you both.
1. When I use input, my problem is the length of the variable (Date = "03/03/2010"),
since it was extracted with substr from another database. I checked and found
that its length is $92. That is probably why input(Date, mmddyy10.)
doesn't work here.
2. The mdy() function works well here.
Thanks again

t***r
Posts: 157

From thread: Statistics board - how to find the distribution of the sum of discrete and continuous uniform variable

How do I find the distribution of the sum of a discrete and a continuous uniform variable?
x ~ uniform(0,1)
y is discrete: 0 with probability p, 1 with probability 1-p
How do I find the distribution of x+y?
Thanks

r***n
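Posts: 6

From thread: Statistics board - how to find the distribution of the sum of discrete and continuous uniform variable

Assuming x and y are independent, condition on y:
cdf(s) = P(x+y <= s) = p*P(x <= s) + (1-p)*P(x <= s-1)
s < 0:       cdf(s) = 0
0 <= s < 1:  cdf(s) = p*s
1 <= s < 2:  cdf(s) = p + (1-p)*(s-1)
s >= 2:      cdf(s) = 1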
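A rough numerical check of this kind of answer, written in Python. It assumes x ~ uniform(0,1), y independent of x with P(y=0) = p and P(y=1) = 1-p as stated in the question, and uses the conditioning identity P(x+y <= s) = p*P(x <= s) + (1-p)*P(x <= s-1). The names uniform_cdf, cdf_sum and simulate_cdf are made up for this sketch.

```python
import random


def uniform_cdf(x):
    """CDF of Uniform(0, 1), clipped to [0, 1]."""
    return min(max(x, 0.0), 1.0)


def cdf_sum(s, p):
    """P(x + y <= s), conditioning on y = 0 (prob. p) or y = 1 (prob. 1 - p)."""
    return p * uniform_cdf(s) + (1.0 - p) * uniform_cdf(s - 1.0)


def simulate_cdf(s, p, n=100_000, seed=0):
    """Monte Carlo estimate of P(x + y <= s) for comparison with cdf_sum."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        y = 0 if rng.random() < p else 1  # y = 0 with probability p
        x = rng.random()                  # x ~ Uniform(0, 1)
        if x + y <= s:
            hits += 1
    return hits / n


if __name__ == "__main__":
    for s in (0.5, 1.5):
        print(s, cdf_sum(s, 0.3), simulate_cdf(s, 0.3))
```

The simulated values should land close to the closed-form ones; a large gap would suggest a mistake in the piecewise formula.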

h******3
Posts: 190

From thread: Statistics board - SAS question: assigning dummy variable

I am new to SAS... not sure if the following answers your question.
Declare the variable in a CLASS statement and set a reference group.

d******g
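Posts: 130

From thread: Statistics board - Do 230 variables and 4400 observations count as high-dimensional data?

I know that much of the research on high-dimensional data concerns the p >> n case. My
understanding is that when a data set has many variables, too few observations cause estimation
problems. Does the data described in the title count as high-dimensional?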

m****r
Posts: 202

From thread: Statistics board - SAS how to change variables' name

Thanks. Could you please give me more detail?
My variable names are AS01d,AS01m,AS01y,AS02a,AS02b,AS02c,AS02_age.
I transposed them to observations, but still don't know how to add "_30" to
them.
Thanks a lot

R*********i
Posts: 7643

From thread: Statistics board - SAS how to change variables' name

If you only have that limited number of variables as listed above you may
want to use "rename" directly.

m****r
Posts: 202

From thread: Statistics board - SAS how to change variables' name

proc contents data=old
out=try(keep=varnum name)
noprint;
run;
data new (keep=newname varnum);
set try;
do i=1 to 1500;
if varnum=(i)
then newname=compress(name)||"_30";
end;
run;
proc transpose data=new
out=aa30d(drop=i _name_);
id newname;
run;
The problem here is that some variables that were originally character have been changed to
numeric ones. This is not exactly what I expected.

A*******s
Posts: 3942

From thread: Statistics board - SAS how to change variables' name

You need to understand why and how to use the substr function to truncate the
variable names, so that their length stays within 32 characters.

m****t
Posts: 754

From thread: Statistics board - How do I use a sampling weight variable in linear regression?

Could an expert please help answer this? Thanks.

s*********r
Posts: 909

From thread: Statistics board - [sas] how to recode these variables

I am confused. We usually run frequencies on categorical variables. Why do you
need to change char to num?

b*******l
Posts: 4

From thread: Statistics board - regression continuous dependent variable

If the dependent variable is continuous, what other kinds of regression can be used to build a
model besides linear regression?

l*****8
Posts: 483

From thread: Statistics board - SAS help : The scope of macro variables

1. In a macro, if you use SYMPUT in a data step, the macro variable you
create is NOT always global.
Try the following 2 examples...
%macro prtrost(num=1);
data _null_;
call symput('today',trim(left(put(today(),mmddyy10.))));
run;
%mend prtrost;
%prtrost(num=8);
%put _all_;

A*******s
Posts: 3942

From thread: Statistics board - one question about variable selection in SAS

Is there any common SAS option available to "bond" two or more variables
together in forward/backward/stepwise selection? Say we have many
predictors and we would like two of them to stay in the model or be dropped
together. Like Rose said to Jack, you jump I jump. :)
Thanks in advance.

s*******e
Posts: 226

From thread: Statistics board - Question: How to interpret the effect of two lagged independent variable.

The model is
Y_it = a*X_(t-1) + b*X_(t-2) + control variables_it
a and b are both significant. How can I effectively explain the effects of X_(t-1)
and X_(t-2) (the same variable with different lags) together?
Thank you very much in advance.

d***2
Posts: 341

From thread: Statistics board - Question: How to interpret the effect of two lagged independent variable.

So which X is more significant? And how strong is the correlation between the Xs?
It sounds like X is a highly significant predictor of Y, but X itself
varies a lot across time.

A*******s
Posts: 3942

From thread: Statistics board - A naive question about categorical variables

continuous-
pros: simple, only 1 df
cons: the relationship may not be linear; sometimes it doesn't make sense
categorical-
pros: fits the data better, makes sense
cons: an m-level variable uses m-1 df, which could lead to overfitting. When m is
too large, the estimates of the different intercepts cannot be stable. Hard to
implement in production.
something in between-
treat it as ordinal/categorical and do binning, grouping or clustering of
levels using the bivariate relationship (between Y and X). CART, CHAID,
Greenac...
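To make the df count concrete, here is a small Python sketch (not SAS) of the two codings: one numeric column for the continuous treatment versus m-1 indicator columns for the categorical treatment. The function names and the size levels are invented for illustration, and the reference level defaults to the first level by assumption.

```python
def as_continuous(values, order):
    """Code each level by its position in `order`: one column, 1 df."""
    rank = {level: i for i, level in enumerate(order)}
    return [[rank[v]] for v in values]


def as_dummies(values, order, reference=None):
    """Code an m-level variable as m-1 indicator columns.

    The reference level (first level unless given) gets all zeros.
    """
    ref = order[0] if reference is None else reference
    columns = [level for level in order if level != ref]
    rows = [[1 if v == level else 0 for level in columns] for v in values]
    return columns, rows


if __name__ == "__main__":
    sizes = ["S", "M", "L", "M"]
    levels = ["S", "M", "L"]
    print(as_continuous(sizes, levels))  # one column per observation
    print(as_dummies(sizes, levels))     # m - 1 = 2 columns per observation
```

With many levels the dummy matrix grows by one column per extra level, which is where the overfitting and unstable-estimate concerns come from.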

t********1
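Posts: 799

From thread: Statistics board - How do I compute the difference between rows of the same variable in SAS?

How do I compute the difference between rows of the same variable in SAS?
Thanks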

a********s
Posts: 188

From thread: Statistics board - Another naive question, about recoding genotype/SNP variables in SAS

You can write a SAS macro to assign 0,1,2 to each SNP, one by one. If you
want, you can refer to the following steps:
(1) Use PROC CONTENTS and PROC SQL to output all SNP names into a macro
variable, separated by " "
(2) Use DO WHILE ... (statement) ... END to assign 0,1,2 to each SNP based
on the alleles' frequencies.
(2.1) Inside the (statement), use PROC FREQ and data MERGE steps to
calculate frequencies, assign 0,1,2, and merge the datasets
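The recoding step itself can be sketched outside SAS. The Python below assumes each genotype is a two-letter string such as "AG", that missing values are None, and that "0,1,2" means the number of copies of the less frequent (minor) allele; the post does not spell out that convention, so treat it as an assumption. The function name recode_snp is made up.

```python
from collections import Counter


def recode_snp(genotypes):
    """Recode two-letter genotypes as minor-allele counts (0, 1 or 2).

    Missing genotypes (None or empty string) stay None. Ties in allele
    frequency are broken alphabetically, an arbitrary choice for this sketch.
    """
    counts = Counter(allele for g in genotypes if g for allele in g)
    if len(counts) < 2:
        # Only one allele observed: there is no minor allele to count.
        return [None if not g else 0 for g in genotypes]
    minor = min(counts, key=lambda allele: (counts[allele], allele))
    return [None if not g else g.count(minor) for g in genotypes]


if __name__ == "__main__":
    print(recode_snp(["AA", "AG", "GG", "AA", None]))
```

Applying the function to each SNP column in turn mirrors the loop over SNP names described in step (2).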

d*******y
Posts: 1154

From thread: Statistics board - Question: the difference between backward, forward and stepwise in variable selection

Backward selection > forward selection, because it lets you take a global look at all the
variables.
Stepwise should be similar to backward.

c*******o
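Posts: 8869

From thread: Statistics board - A dumb question: how do I compute the correlation among three variables

With a pair of variables it is easy to do a correlation: make a scatter plot and compute r square.
With three variables I can make a 3D scatter plot, but how do I compute r square? Or can I only
compute r square separately for the three pairwise correlations?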

n****o
Posts: 1167

From thread: Statistics board - The txt data file is too large; how do I extract the variable list?

I am using infile, but I don't know which variables to input. Please advise.

C**********o
Posts: 658

From thread: Statistics board - How to change variable names in SAS or SQL

Case:
I have 107 variables to be renamed. Can I do it in this way (SAS) --
proc select xxx,xxx,xxx, ...., xxx
as xxxnew1,xxxnew2,xxxnew3,....,xxxnew107
?
Or do I have to rename them one by one, such as
Proc select xxx as xxxnew1, xxx as xxxnew2,... ??

C**********o
Posts: 658

From thread: Statistics board - How to change variable names in SAS or SQL

Thank you. I ended up copying the old and new variable names into Excel, adding an
'=' between them, copying the result into Notepad, and then pasting it into SAS.
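The same old=new pairs can be generated with a few lines of script instead of Excel and Notepad. This Python sketch only builds the text to paste into a SAS RENAME statement; the function name rename_pairs and the example variable names are hypothetical.

```python
def rename_pairs(old_names, new_names):
    """Return 'old1=new1 old2=new2 ...' for pasting into a RENAME statement."""
    if len(old_names) != len(new_names):
        raise ValueError("old and new name lists must have the same length")
    return " ".join(f"{old}={new}" for old, new in zip(old_names, new_names))


if __name__ == "__main__":
    old = ["var1", "var2", "var3"]       # hypothetical variable names
    new = [name + "_new" for name in old]
    print("rename " + rename_pairs(old, new) + ";")
```

Reading the two name lists from a text file would scale this to all 107 variables without manual copying.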

s*****p
Posts: 299

From thread: Statistics board - A question about creating a new variable in SAS

The data contains each student's math score. How do I create a new variable, rank?
Thanks in advance!

S********a
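Posts: 359

From thread: Statistics board - [Baozi] A naive question about dummy variables

The reference group can be chosen arbitrarily; it does not have to be the SAS default. My mistake
was overlooking a missing value when creating the dummy variables, which caused slight differences
in the estimates. After correcting it, they are all the same. Thank you for your time. Baozi
sent.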

z*******n
Posts: 15481

From thread: Statistics board - correlation between the explanatory variables

Effects of multicollinearity
- fitted values are probably OK
- parameter estimates have high standard errors
- parameter estimates are difficult to interpret
- great sensitivity to minor changes in the model or data
(e.g., if we remove a variable or case)
Please send some baozi, thanks!
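The "high standard errors" point can be quantified with the variance inflation factor. In the special case of exactly two predictors, VIF = 1 / (1 - r^2), where r is the correlation between them. The Python sketch below covers only that two-predictor case; the function names and the toy data are illustrative.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor linear model."""
    r = pearson_r(x1, x2)
    denom = 1.0 - r * r
    return math.inf if denom <= 0 else 1.0 / denom


if __name__ == "__main__":
    x1 = [1, 2, 3, 4, 5]
    x2 = [2, 1, 4, 3, 5]
    print(pearson_r(x1, x2), vif_two_predictors(x1, x2))
```

The variance of each coefficient is multiplied by the VIF relative to the uncorrelated case, so its standard error grows by the square root of the VIF.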

l*******s
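Posts: 437

From thread: Statistics board - How do I encrypt a variable in SAS?

I have a variable holding email subjects, which may contain names, addresses, phone numbers and
other personal information. I have been asked to replace all of this sensitive information with
abc while everything else is displayed as usual. How should I do this?
Many thanks!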

g****8
Posts: 2828

From thread: Statistics board - How do I encrypt a variable in SAS?

You may write a macro to do so. Try googling it; I think there are some
existing macros out there.
More efficiently, you can use a cross-reference table that stores the real
subjects with random numbers as an identifier variable, and use the identifier
in your original table.

o****o
Posts: 8077

From thread: Statistics board - How do I encrypt a variable in SAS?

hash those variables and drop the original ones?

g****8
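Posts: 2828

From thread: Statistics board - How do I encrypt a variable in SAS?

It seems I misread your question.
Your problem really does call for regular expressions, but I don't know the specifics.
Ideally the sensitive information would also exist as separate variables. Otherwise, would you
have to detect the sensitive information automatically?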
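As a rough idea of what the regular-expression route looks like, here is a Python sketch that replaces email addresses and US-style phone numbers with abc. The two patterns are simplified assumptions and will miss many formats; names and street addresses cannot be caught reliably by patterns like these and would need a lookup list or some other detection step.

```python
import re

# Simplified patterns, assumed for illustration only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"(?:\(\d{3}\)\s*|\b\d{3}[-.\s])\d{3}[-.\s]\d{4}\b")


def mask_subject(text, token="abc"):
    """Replace email addresses and phone numbers in `text` with `token`."""
    text = EMAIL.sub(token, text)
    return PHONE.sub(token, text)


if __name__ == "__main__":
    print(mask_subject("Call 555-123-4567 or mail bob.lee@example.com today"))
```

Text that matches neither pattern passes through unchanged, so the rest of the subject line is still displayed as usual.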

m********g
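Posts: 65

From thread: Statistics board - Question: what does the second dashed line in the PLS Variable Importance in Projection (VIP) plot mean?

Asking for help:
What does the second dashed line in the PLS Variable Importance in Projection (VIP) plot mean?
The model quality measure Q^2 is negative. How should I interpret that?
Thanks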

l******e
Posts: 162

From thread: Statistics board - create index with correlated variables

Could anyone help? Many thanks!
My question is how to create an index to measure a manager's earnings forecast
ability. I use earnings forecast accuracy (dollar difference) and the
earnings forecast horizon (difference in days) to capture the ability. However,
those two variables depend on each other. How should I weight them to
create the index? A smaller dollar difference and a longer horizon
reflect higher forecast ability.
Thanks...

r*****g
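Posts: 99

From thread: Statistics board - SAS code help: how do I log-transform 90 variables at once?

I have more than 90 nutritional variables that need a log transformation. Each new variable name
is the old name prefixed with log. How can I transform all of these variables at once?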

p***r
Posts: 920

From thread: Statistics board - SAS code help: how do I log-transform 90 variables at once?

Or you can do it another, brute-force way:
* WIDE TO LONG;
PROC TRANSPOSE DATA=data1 OUT=data2;
BY var_id;
VAR _ALL_;
RUN;
data data3;
set data2;
log_var=log(col1);
run;
* LONG TO WIDE, _NAME_ holds the original variable names and PREFIX adds log in front;
PROC TRANSPOSE DATA=data3 OUT=data4 PREFIX=log;
BY var_id;
ID _NAME_;
VAR log_var;
RUN;

p****e
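Posts: 165

From thread: Statistics board - How do I use ifelse for variable recoding in R?

The data set has a categorical variable called "dimension" that holds the product sizes
("XL", "LL", "M", "S", "SP"). I want to merge the two sizes "XL" and "LL" into "L" and
leave the other sizes unchanged, so I tried ifelse like this:
d$dimension_new <- ifelse(d$dimension == 'XL' | d$dimension == 'LL', 'L', d$dimension)
But when I list the new variable, the sizes have become "1", "2", "3", as follows:
> levels(dimension_new)
"L" "1" "2" "3"
Checking with head(d) confirms that dimension_new has become "L" "1" "2" "3".
Is there any way to get dimension_new as L, M, S, SP right in the ifelse step?
I am going to use dimension_new in a regression and don't want "L" "1" "2" "3", because then
I would have to check one by one which real...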

L*****2
Posts: 66

From thread: Statistics board - correlated variables in the model

The model has three variables X, Y and Z, where Z=Y/X. The linear
correlations among X, Y and Z are low. Is this model OK? Any comments are
highly appreciated!

k*******a
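Posts: 772

From thread: Statistics board - A question about SAS macro variables

Put all the CATs into a single macro variable when passing them in:
CAT = CAT1 CAT2 CAT3..., DOG = DOG1 DOG2 ...
Inside the macro, use scan to read this information in automatically and turn it into the individual variables.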

b*****o
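Posts: 482

From thread: Statistics board - A question about SAS macro variables

SAS has a wildcard, the colon (:).
If you write CAT: it automatically matches all variables whose names begin with CAT.
I don't know whether it can be used this way when passing macro variables, though. You can try it.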

l******1
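Posts: 292

From thread: Statistics board - How do I compute the mean of the first 5 values of a variable

I have a variable with 100 values and I only want the mean of the first five. How should I write
the code? Thanks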

l******o
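Posts: 3764

From thread: Statistics board - Question about variable names after Stata imports data from CSV

After Stata imports data from a CSV file, the spaces in the CSV's variable names are removed
automatically; for example, date of birth becomes dateofbirth.
But in the data left by my predecessor, the spaces became underscores: date of birth => date_of_birth
I need to append data, so the names have to match.
Can anyone tell me how to get underscores instead?
Many thanks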

D*G
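Posts: 471

From thread: Statistics board - Model and variables are both significant, but no individual category is

What I mean is to keep only the variables for which some category's exp(b) is fairly large. If
the ORs of all categories are small, the variable can be dropped.
For example, if the categories of the "age" variable in your model differ hugely (OR > 2, for
example), you can keep the variable. If the OR is small, say 1.001, the variable need not be
kept even if the result is significant. The case you described at first, where no group is
significant and only the overall test is, can also be dropped.
How large an OR you are willing to keep depends on the situation. In some data an OR of 1.2 is
already large, and in other data it may take 5 to count as large; it is all relative. It is like
some people picking big grapes from a pile of grapes and others picking big watermelons from a
pile of watermelons. Without knowing your specific data and field, it is hard to give you a
specific value.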

k******u
Posts: 250

From thread: Statistics board - SAS help request -- create new variable

I create a new variable z with the following code:
data bank;
infile 'C:\bankdata.txt' firstobs =2;
input Name $ 1-15
Acct $ 16-20
x 21-26
y 27-30;
z = x * y;
run;
proc print data = bank;
run;
This program works fine.
But when I remove infile and enter the x and y values with a datalines; statement while computing
z=x*y in the same data step, I am told that the statement is not valid.
Why is that?

p*****V
Posts: 43

From thread: Statistics board - What's the SAS V5 variable?

Thank you.
73. Which name is a valid SAS V5 variable name?
A. _AESTDTC
B. AESTARTDTC
C. AE-STDTC
D. AE_START_DTC
Which one is the right answer?
