由买买提看人间百态

boards

本页内容为未名空间相应帖子的节选和存档,一周内的贴子最多显示50字,超过一周显示500字 访问原贴
Statistics版 - Help for freq subset
相关主题
data set problem (SAS)--first two will have two(baozi) Happy holiday每个ID出现一次,missing去掉,请问高手用SAS怎么做?
问一个data subset的问题sas问题
sas问题how to output cumulative percent to a dataset from Proc Freq?
ask a sum function求教proc sql 问题
新人报道,兼问SAS data set的问题请教这种freq 该用什么code算(sas)?Thanks!
SAS问题请教Need help! 如何用sas做一个n*n的count tabulate
SAS code help神奇的proc means
SAS code求教SAS怎么把所要的frequency都display在一个表中?
相关话题的讨论汇总
话题: personid话题: subset话题: count话题: freq话题: xxxx5
进入Statistics版参与讨论
1 (共1页)
m**********u
发帖数: 2
1
I am using SAS to deal with a huge data file with over 10 millions of
observations. “personid” is a variable. The structure of “personid” is
like, for example, xxxx2, xxxx2, xxxx2, xxxx3, xxx3, xxxx5, xxxx5, xxxx5,
xxxx6, xxxx7……., for a unique “personid”, there are several observations
as shown in the example above. I am trying to get subset that the frequency
of the unique “personid” has certain frequency, say, frequency=3, in the
case of example above, that means I want to obtain a subset:
g*******y
发帖数: 380
2
proc sql;
CREATE TABLE a AS
SELECT *, count(*) as count_id from b
group by personid
where calculated count_id=3;
QUIT;
Not sure it works, but just represent a idea.
o****o
发帖数: 8077
3
might be this way:
proc sql;
create table new as
select a.*
from yourdata as a
left join (select personid, count(*) as count
from yourdata(keep=personid)
group by personid
) as b
on a.personid=b.personid
where b.count>=3
;
quit;
or a SAS way
proc freq data=yourdata noprint;
table personid/out=_freq_(where=(count>=3)
keep=personid count


【在 g*******y 的大作中提到】
: proc sql;
: CREATE TABLE a AS
: SELECT *, count(*) as count_id from b
: group by personid
: where calculated count_id=3;
: QUIT;
: Not sure it works, but just represent a idea.

s*******2
发帖数: 791
4
I use first. and last. in data step. It works well.
I will try sql later.
proc sort data=raw;
by personid;
run;
data subset (drop=count);
set raw;
by personid;
if first.personid then count=1;
else count+1;
if count<=3 then output;
proc print data=subset;
title 'Subset size is 3';
run;
s*******2
发帖数: 791
5
oloolo可不可以再看看,SQL的结果还是原来的数据集合
count 是计算每个group里的number of non-missing values,我想是不能用count限制
新的数据集合里每个group里只有3个相同的value。
正在学习SQL中,也不是很明白。请指点.... 谢谢
SAS step我是完全看不懂(从declear statement开始)爆汗

【在 o****o 的大作中提到】
: might be this way:
: proc sql;
: create table new as
: select a.*
: from yourdata as a
: left join (select personid, count(*) as count
: from yourdata(keep=personid)
: group by personid
: ) as b
: on a.personid=b.personid

s*r
发帖数: 2757
6
use 'having'

【在 g*******y 的大作中提到】
: proc sql;
: CREATE TABLE a AS
: SELECT *, count(*) as count_id from b
: group by personid
: where calculated count_id=3;
: QUIT;
: Not sure it works, but just represent a idea.

o****o
发帖数: 8077
7
sorry , use join, not left join
"sir" is right, you should use having statement in your code in place of 'where'
proc sql;
create table new as
select a.*
from yourdata as a
join (select personid, count(*) as count
from yourdata(keep=personid)
group by personid
) as b
on a.personid=b.personid
where b.count>=3
;
quit;
m**********u
发帖数: 2
8
Thanks a lot for all your help!
1 (共1页)
进入Statistics版参与讨论
相关主题
SAS怎么把所要的frequency都display在一个表中?新人报道,兼问SAS data set的问题
排序的问题,请问高手用SAS怎么做?SAS问题请教
Standardize city names in SASSAS code help
SAS 求助, 一个小问题, 包子答谢SAS code求教
data set problem (SAS)--first two will have two(baozi) Happy holiday每个ID出现一次,missing去掉,请问高手用SAS怎么做?
问一个data subset的问题sas问题
sas问题how to output cumulative percent to a dataset from Proc Freq?
ask a sum function求教proc sql 问题
相关话题的讨论汇总
话题: personid话题: subset话题: count话题: freq话题: xxxx5