s********l 发帖数: 245 | 1 我在做一个Project, 遇到一点问题不知道怎么解决,想请教一下这里的朋友:
我有一组数据,包括两个变量, date and id such as:
year id
1991 1
1991 2
1991 3
1991 8
1992 1
1992 2
1992 5
1992 10
1992 34
1993 2
1993 4
1993 5
1993 10
1993 19
1993 23
1994 1
1994 2
1994 3
1994 4
1994 9
1994 19
1994 29
......
我现在算连续两年有多少个不同的id, 所以需要定义个新的变量group, 如果year =
1991 or 1992 则 group =1, year =1992 or 1993 则,group =2, year =1993 or
1994 则group =3...
我该怎么解决这个问题呀?谢谢! | s********l 发帖数: 245 | | s*****n 发帖数: 2174 | 3 你把问题说清楚一些, 比如你期望的输出是什么样子的?
如果只是要计算两年里面unique id的话, 根本没必要设定什么
group variable啊. 对于给定的group
length(unique(data$id[data$year >= group + 1990 &
data$year <= group + 1991]))
就行了 | c*******o 发帖数: 8869 | 4 如果用SAS的话, 试试这个MACRO:
%macro a;
proc sql;
select min(year) into: min from yourdata;
select count(distinct year) into: n from yourdata;
%do i=1 %to &n;
select count(distinct id) as count_distinct_id,
min(year) as start_year, max(year) as end_year
from yourdata where &min+&i-1<=year<=&min+&i;
%end;
quit;
%mend;
%a;
【在 s********l 的大作中提到】 : 我在做一个Project, 遇到一点问题不知道怎么解决,想请教一下这里的朋友: : 我有一组数据,包括两个变量, date and id such as: : year id : 1991 1 : 1991 2 : 1991 3 : 1991 8 : 1992 1 : 1992 2 : 1992 5
| s********l 发帖数: 245 | 5 非常感谢!that works the way I expected! | s*********e 发帖数: 1051 | 6 try this one and it is simpler.
proc sql;
create table
two as
select
b.year as min_year,
b.year + 1 as max_year,
count(distinct a.id) as count
from
yourdata as a, yourdata as b
where
a.year <= b.year + 1 and a.year >= b.year
group by
b.year;
quit; | c*******o 发帖数: 8869 | 7 很强大.
这个问题看起来简单, 但是颇动了一番脑筋之后, 我最后还是不得不用MACRO搞定
【在 s*********e 的大作中提到】 : try this one and it is simpler. : proc sql; : create table : two as : select : b.year as min_year, : b.year + 1 as max_year, : count(distinct a.id) as count : from : yourdata as a, yourdata as b
|
|