This page is an excerpt and archive of the corresponding thread on 未名空间 (MITBBS).
Statistics board - [R] How to sample all possible continuous subsets from ordered data
j***3
Posts: 142
1
If I have a table like this:
1 1.234
2 1.657
3 1.564
4 2.343
..
,,
The first column is the order information and the second column is the value.
Is there a way to sample all possible continuous subsets of x rows from the table?
Thanks
r********0
Posts: 65
2
How do you define "continuous subset" here?
l***a
Posts: 12410
3
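If there are no duplicates and the order variable has no missing values, try this:
%macro subset;
  /* count the rows; assumes order runs 1..n with no gaps */
  proc sql noprint;
    select count(order) into :n
    from data0;
  quit;
  /* i = subset size, j = first row of the subset */
  %do i=1 %to &n.;
    %do j=1 %to &n.-&i.+1;
      data data&i._&j.;
        set data0;
        /* keep rows j .. j+i-1 */
        if order>=&j. and order<=&j.+&i.-1;
      run;
    %end;
  %end;
%mend subset;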

j***3
Posts: 142
4
Thanks for the reply.
A continuous subset just means one block of consecutive rows from the table.
I was wondering whether there is a way to avoid loops (or use fewer of them), because the
dataset is huge and R is not efficient at handling loops.
r********0
Posts: 65
5
row <- 1:100
row <- sample(row) ## randomly permutes the 100 row numbers
Then you can pick the first n numbers (n depends on the size of each subset
you need) as your subset index:
idx <- row[1:n]
data[idx, ]
If you want all the possible subset sizes, I'm afraid you still need to
use a loop.
Hope this helps.
D******n
Posts: 2836
6
That's n(n+1)/2 subsets in total.
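The count comes from summing over subset sizes: a block of m rows can start at any of n-m+1 positions, so the total is the sum of (n-m+1) for m = 1..n, which equals n(n+1)/2. A minimal R sketch that enumerates every block for a small, made-up n; the names `n` and `windows` are illustrative and not from the thread:

```r
# Enumerate every contiguous block of rows for a small illustrative n.
n <- 4

windows <- do.call(rbind, lapply(seq_len(n), function(m) {
  start <- seq_len(n - m + 1)          # valid starting rows for size m
  data.frame(size = m, start = start, end = start + m - 1)
}))

# The number of blocks matches n(n+1)/2.
stopifnot(nrow(windows) == n * (n + 1) / 2)
```

Each row of `windows` gives a (size, start, end) triple that could be used to index the table, so the blocks themselves never need to be stored all at once.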

j***3
Posts: 142
7
Maybe I did not express myself clearly enough.
What I need are subsets with consecutive row numbers; the solution
rabbit1860 gave uses random row numbers.
Is the solution libra gave in SAS? The syntax looks strange to me.
DaShagen, I think it is basically a sliding window of size x and step 1.
g********r
Posts: 8017
8
How large is your N? If it is on the order of tens of thousands, the speed of a loop is acceptable.
for(i in 1:(n-m+1))
{
do.something(a[i:(i+m-1),])
}
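A runnable version of the loop above might look like the following sketch. The data frame `a`, the window size `m`, and the use of the window mean as the `do.something` step are all assumptions made for illustration. Preallocating the result vector avoids growing an object on every pass.

```r
# Toy data: first column is the order, second is the value (from the question).
a <- data.frame(order = 1:4, value = c(1.234, 1.657, 1.564, 2.343))
n <- nrow(a)
m <- 2                                   # window size (assumed)

window_means <- numeric(n - m + 1)       # preallocate one slot per window
for (i in seq_len(n - m + 1)) {
  block <- a[i:(i + m - 1), ]            # rows i .. i+m-1, a contiguous block
  window_means[i] <- mean(block$value)   # stand-in for do.something()
}
```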

D******n
Posts: 2836
9
result <- lapply(seq_len(nrow(data) - x + 1), function(t) data[t:(t + x - 1), , drop = FALSE])

j***3
Posts: 142
10
Thanks goldmember, it is on the order of billions, so I'd best avoid loops.
Thanks DaShagen, I think it might be the best solution.
An alternative is to use rollapply in the zoo package, but lapply is more
straightforward.
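If pulling in zoo is not an option, base R's `embed()` can build all windows of a fixed size at once without an explicit loop. This is an alternative not mentioned in the thread, sketched here on the four values from the question; `x`, `m`, and `w` are illustrative names. It materializes an (n-m+1) by m matrix, so for billions of rows it would probably have to be applied in chunks.

```r
x <- c(1.234, 1.657, 1.564, 2.343)   # values from the question
m <- 2                               # window size (assumed)

# embed() returns one window per row, with columns in reverse order,
# so the columns are flipped to restore the original row order.
w <- embed(x, m)[, m:1, drop = FALSE]

window_means <- rowMeans(w)          # one summary per sliding window
```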