All topics - Topic: princomp
1 (1 page in total)
h******a
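Posts: 198
1
From thread: Statistics board - Question about PCA
The R command for PCA is princomp(data, cor=TRUE); cor=TRUE means the PCA is
done on the correlation matrix. The problem is that the principal components I
get from princomp(data, cor=TRUE)$scores are quite far from the ones I compute
myself. I can't tell what went wrong. The data is attached.
Here is how I computed them myself:
data is a 1256 x 8 matrix, one observation per row. I use eigen() to get the
orthonormal eigenvectors of the correlation matrix of data, one eigenvector per
column, sorted by decreasing eigenvalue. Finally I compute data * eigen (the
matrix product), which in theory should also give the components.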
code:
score<-princomp(data,cor=TRUE)$scores
cor_pca<-cor(data)
score1<-data%*%eigen(cor_pca)$vectors
s*****n
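Posts: 2174
2
From thread: Statistics board - Question about PCA
PCA is essentially a rotation of the high-dimensional variable space, and that
involves how the different dimensions are weighted. Depending on the situation
you can normalize or not, but you have to be consistent. Doing an eigenvalue
decomposition of the correlation matrix means you treat all dimensions as
equally weighted (normalized), i.e. the computation happens in a rescaled
space. But the data you multiply by has not been normalized and is still in the
original space. The two cannot be multiplied together.
If you use eigen(cov(data)) directly, you can multiply it with data (after
centering the columns), because both are in the original variable space. This
is equivalent to princomp(..., cor = F)
That said, for PCA I recommend prcomp(). It is based on the SVD and is more
accurate than princomp(), which is based on the EVD. I also find the output of
prcomp() clearer than that of princomp.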
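The consistency point above can be checked with a minimal sketch. The matrix here is synthetic (the attachment from the question is not available), and the sqrt(n / (n - 1)) factor reflects the fact that princomp() standardizes with divisor n while scale() uses n - 1. Eigenvector signs are arbitrary, so absolute values are compared.

```r
set.seed(1)
# Synthetic stand-in for the 1256 x 8 data matrix from the question
data <- matrix(rnorm(1256 * 8), nrow = 1256, ncol = 8) %*% diag(1:8)
n <- nrow(data)

# Correlation-based PCA: standardize the data first, then rotate
z <- scale(data)                       # center, divide by sd (divisor n - 1)
v <- eigen(cor(data))$vectors
manual <- z %*% v
ref <- princomp(data, cor = TRUE)$scores
# princomp() uses divisor n for the standard deviations, so rescale
max_diff_cor <- max(abs(abs(manual) * sqrt(n / (n - 1)) - abs(ref)))

# Covariance-based PCA: center only, then rotate
centered <- scale(data, center = TRUE, scale = FALSE)
manual_cov <- centered %*% eigen(cov(data))$vectors
ref_cov <- princomp(data, cor = FALSE)$scores
max_diff_cov <- max(abs(abs(manual_cov) - abs(ref_cov)))
```

Both differences come out at the level of floating-point noise, whereas multiplying the raw, unstandardized data by the eigenvectors of the correlation matrix does not reproduce the scores.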
s***r
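Posts: 1121
3
From thread: Statistics board - SAS question (baozi reward) - use PCA to create an index
I want to build an index from the following three variables: age, gender, and
nationality. Below is the SAS output
(using PROC PRINCOMP).
My question: are the eigenvectors the coefficient weights? Can I build my index
like this?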
index = age * (0.460490) + gender * (0.689249)+nationality *(0.559361)
================================
Principal Component Analysis 77176
The PRINCOMP Procedure
Observations 15553
Variables 3...
l****k
Posts: 16
4
It returns a COEFF matrix and I can't make sense of it... From this matrix, how
do I tell which ones are the principal components?
j**u
Posts: 6059
r****y
Posts: 1437
6
This depends on your input matrix. As mentioned in the manual,
each row of X is an observation and each column of X is a variable.
Then coeff(:, 1) is the first PC (in order of decreasing component variance),
and so on.
The scores are arranged by row again: scores(1, :) is the score vector for
the first observation, and so on.
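The same layout can be checked in R. This is a sketch using prcomp() on made-up data rather than MATLAB's princomp; the matrix x and its dimensions are assumptions for illustration.

```r
set.seed(2)
# Made-up data: 50 observations (rows) of 4 variables (columns)
x <- matrix(rnorm(50 * 4), nrow = 50) %*% diag(c(4, 3, 2, 1))

p <- prcomp(x)          # centers the columns by default
coeff  <- p$rotation    # column j holds the coefficients of the j-th PC
scores <- p$x           # row i holds the scores of observation i

# Scores of observation 1 = its centered row times the coefficient matrix
obs1 <- (x[1, ] - colMeans(x)) %*% coeff

# The component variances come out in decreasing order
score_var <- apply(scores, 2, var)
```

Here coeff[, 1] plays the role of coeff(:, 1) and scores[1, ] the role of scores(1, :).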
r****y
Posts: 1437
7
From thread: Mathematics board - question about Principal Component Analysis
Why? Just solve for the eigenvalues and eigenvectors of the
covariance matrix iteratively. There are numerical routines ready for this;
the first one is always the 1st PC, etc.
MATLAB has a built-in PCA command; try princomp.
n*********3
Posts: 21
8
From thread: Quant board - PCA and how to estimate sigmas
Thanks for the detailed explanation.
I am using principal component analysis.
My approach is to use the princomp function in MATLAB to derive the eigenvalues
and eigenvectors, then find the most important eigenvalues, which represent
the covariance matrix. Hence, r = 2.
Yes, with 2 known eigenvectors (corresponding to the selected
eigenvalues).
Then, what is next? How do I derive the parameters given r = 2?
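The eigenvalue-selection step described here can be sketched as follows. It does not answer how to derive the model parameters. The data are made up, the 95% threshold is an assumed cutoff, and the post itself used MATLAB rather than R.

```r
set.seed(3)
# Made-up data: 500 observations of 6 variables driven by 2 latent factors
factors  <- matrix(rnorm(500 * 2), nrow = 500)
loadings <- matrix(c(1.0, 0.8, 0.6, 0.4, 0.2, 0.1,
                     0.1, 0.3, 0.5, 0.7, 0.9, 1.0), nrow = 2, byrow = TRUE)
x <- factors %*% loadings + 0.1 * matrix(rnorm(500 * 6), nrow = 500)

e <- eigen(cov(x), symmetric = TRUE)    # eigenvalues in decreasing order
explained <- cumsum(e$values) / sum(e$values)

# Smallest r whose leading eigenvalues explain at least 95% of the variance
# (the 95% cutoff is an assumption, not something stated in the post)
r <- which(explained >= 0.95)[1]

# Rank-r approximation of the covariance matrix from the kept eigenpairs
v_r <- e$vectors[, 1:r, drop = FALSE]
cov_r <- v_r %*% diag(e$values[1:r], nrow = r) %*% t(v_r)
```

With two dominant factors in the simulated data, two eigenvalues carry almost all of the variance, and cov_r stays close to cov(x).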
p********a
Posts: 5352
9
From thread: Statistics board - [Collection] Multivariable regression
☆─────────────────────────────────────☆
waterpuma (waterpuma) wrote on (Wed Jan 2 14:15:08 2008):
This is a problem from an onsite interview.
There are more than 100 variables. Based on those variables, how do you fit a
regression model for one dependent variable? Do we need to use all of the
variables or pick some of them? If we need to choose, what is the best way to
do it: let the stepwise option do it, or use factor and princomp?
Thanks for any response!
☆───────────────
A*****s
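Posts: 13748
10
From thread: Statistics board - Anyone here familiar with proc princomp?
I have a data set with 4 variables and 10 records. After running the PCA, the
"out=" option gave me a 10*4 matrix, combined with the original 10*4 data into
a new data set. What exactly is that 10*4 matrix from "out="? I have stared at
it for a long time and still don't get it...
Its variables are: prin1 prin2 prin3 prin4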
l********s
Posts: 430
11
From thread: Statistics board - Anyone here familiar with proc princomp?
It's the eigenvectors, right?
A*****s
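Posts: 13748
12
From thread: Statistics board - Anyone here familiar with proc princomp?
No. The eigenvectors are four 4-dimensional vectors, so at most a 4*4 matrix.
That matrix is 10*4.
I ignored it and used the outstat= option directly, which does output the
eigenvectors for me.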
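For reference, the out= data set of PROC PRINCOMP holds the principal component scores (one row per record), which is why it is 10*4 rather than 4*4. An R analogue is sketched below on made-up data, assuming the default correlation-based analysis; the names dat, x1 to x4 and out are hypothetical.

```r
set.seed(4)
# Made-up stand-in for the 4-variable, 10-record data set
dat <- as.data.frame(matrix(rnorm(10 * 4), nrow = 10))
names(dat) <- c("x1", "x2", "x3", "x4")

p <- prcomp(dat, scale. = TRUE)    # correlation-based PCA

scores <- p$x                      # 10 x 4: one row of scores per record
colnames(scores) <- paste0("prin", 1:4)
eigvec <- p$rotation               # 4 x 4: one eigenvector per column

# Analogue of the out= data set: original columns plus the score columns
out <- cbind(dat, scores)
```

The scores are the standardized data multiplied by the eigenvector matrix, so the 10*4 block and the 4*4 block are two different objects.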
h******a
Posts: 198
13
Quite a few things in R are also written with LINPACK, for example princomp,
which does PCA.
s*****n
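Posts: 2174
14
From thread: Statistics board - Question about PCA
Before doing PCA, you need to normalize the matrix.
princomp (with cor=TRUE) does this automatically.
When you compute it yourself from the eigenvalues, you have to do it yourself.
The scale() function will do.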
s*****n
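Posts: 2174
15
From thread: Statistics board - Question about PCA
Right, princomp() does it for you automatically.
But when you compute it yourself via the eigendecomposition, you need to
normalize accordingly as well.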
l********s
Posts: 430
16
From thread: Statistics board - Question about PCA
I recall there is also prcomp in R, which differs somewhat from princomp.
o****o
Posts: 8077
17
From thread: Statistics board - Question about principal component analysis
proc princomp data=X out=out;
var X1-Xn;
run;
proc plot data=out;
plot Prin1*Prin2="*";
plot Prin1*Prin3="+";
run;
quit;
o****o
Posts: 8077
18
To obtain the eigendecomposition from PCA, note the relationship below:
the eigendecomposition of a square matrix yields the same eigenvector matrix as PCA (the V matrix),
and the eigenvalues are those that satisfy: AV=A[v_1, v_2...v_k]=[\lambda_1, \lambda_2...\lambda_k].*V
so you can first use PROC PRINCOMP NOINT COV outstat=_V(where=(_TYPE_='USCORE'))
then carry out the matrix multiplication A%*%V=\Omega
load \Omega and V into a data set, divide each element of \Omega by corresponding element i
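The relationship AV = [\lambda_1 v_1, ..., \lambda_k v_k] can be sketched in R. This assumes the truncated last step divides each element of \Omega by the corresponding element of V; the matrix a is made up for illustration.

```r
# A small symmetric matrix standing in for the covariance matrix A
a <- matrix(c(4.0, 2.0, 0.6,
              2.0, 2.0, 0.5,
              0.6, 0.5, 1.0), nrow = 3, byrow = TRUE)

e <- eigen(a, symmetric = TRUE)
v <- e$vectors            # eigenvector matrix V, one eigenvector per column

omega <- a %*% v          # Omega = A V
ratio <- omega / v        # element-wise division: column k is constant at lambda_k
lambda <- colMeans(ratio)

# Caveat: the division is unstable wherever V has entries close to zero;
# e$values returns the eigenvalues directly.
```

Every entry in column k of ratio equals the k-th eigenvalue, which is what the element-wise division recovers.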
o****o
Posts: 8077
19
From thread: Statistics board - Where can I find SAS code for PCA?
what do u want to do with PCA?
PROC PRINCOMP for PCA
PROC PLS for PCA REGRESSION
z**********i
Posts: 12276
20
I just use PROC REG with an option added, which gives the VIF.
VARCLUS is new to me; I learned something new again. I don't know how it
relates to proc princomp.
d*********k
Posts: 1239
21
When using PCA, what do I do if P is larger than N? That is, the large P,
small N problem?
Using R's princomp() directly just throws an error.
Thanks.
d*********k
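Posts: 1239
22
prcomp()
It looks like this one works~
Perhaps prcomp() and princomp() use different algorithms: one computes the SVD
directly, the other works from the variance-covariance matrix. Is my
understanding right?
Thanks, everyone~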
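The behaviour described above can be reproduced with a small sketch on made-up data with N = 5 observations and P = 10 variables.

```r
set.seed(5)
# Made-up "large P, small N" data: 5 observations of 10 variables
x <- matrix(rnorm(5 * 10), nrow = 5, ncol = 10)

# princomp() refuses data with fewer observations than variables
res <- tryCatch(princomp(x), error = function(e) e)
princomp_failed <- inherits(res, "error")

# prcomp() works from the SVD of the centered data and returns
# at most min(N, P) components
p <- prcomp(x)
n_components <- length(p$sdev)
```

Because the columns are centered, the last of the five components has a standard deviation of essentially zero.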
f******y
Posts: 2971
23
From thread: Statistics board - PCA and linear regression
Suppose there are two random variables, X and Y, whose means are very small.
I can get the slope by linear regression, lm(Y~X);
I can also do PCA,
data = data.frame(X=X, Y=Y);
princomp(data);
I expected the slope of the first PC vector to be very close to the slope
given by linear regression. I tried it in R, and the results are very different.
Can anyone explain?
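One likely explanation is that the two methods minimize different things. Regression of Y on X minimizes vertical distances, while the first PC minimizes orthogonal distances to the line, so the slopes generally differ when the data are noisy. Below is a sketch on simulated data; the true slope 0.5 and the noise level are assumptions.

```r
set.seed(6)
# Made-up data: Y depends linearly on X with sizeable noise
X <- rnorm(2000)
Y <- 0.5 * X + rnorm(2000)

# Regression slope: minimizes vertical distances, equals cov(X, Y) / var(X)
ols_slope <- unname(coef(lm(Y ~ X))[2])

# First PC direction: minimizes orthogonal distances to the line
v1 <- eigen(cov(cbind(X, Y)), symmetric = TRUE)$vectors[, 1]
pc_slope <- v1[2] / v1[1]

slopes <- c(ols = ols_slope, pc1 = pc_slope)
```

The first column of princomp(data)$loadings gives the same direction as v1 up to sign, so the same comparison applies to the code in the question.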
t*****w
Posts: 254
24
From thread: Statistics board - How should I prepare for an R interview?
When I had my job interviews, they always tested my SAS skills. However, I use R
all the time. To help your preparation, read my R code to see how much of it you
can understand.
%in%
?keyword
a<-matrix(0,nrow=3,ncol=3,byrow=T)
a1 <- a1/(t(a1)%*%spooled%*%a1)^.5 #standardization in discrim
a1<- a>=2; a[a1]
abline(h = -1:5, v = -2:3, col = "lightgray", lty=3)
abline(h=0, v=0, col = "gray60")
abs(r2[i])>r0
aggregate(iris[,1:4], list(iris$Species), mean)
AND: &; OR: |; NOT: !
anova(lm(data1[,3]~data1[,1...
s****b
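Posts: 2039
25
From thread: Statistics board - Plotting PCA
Is it that with 4 or more PCs the PCA lives in a high-dimensional space, so no
plot can be made?
Are you using PROC PRINCOMP or PROC FACTOR? Why must there be 2 PCs?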
R******d
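Posts: 1436
26
From thread: Statistics board - Plotting PCA
I used PROC PRINCOMP. There are as many PCs as there are variables, and when
plotting you can pick two or three of them. With 4 or more, a plot probably
can't be made. I don't know how to do the 3-PC plot in SAS either.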